From: (Anonymous)
1. Once again you have failed to notice that TODAY THE MEDIA ROUTINELY REWRITE NEWS ITEMS OVER THE "OLD" VERSIONS (that is, versions a few hours to a few days old).

2. Here is one mechanism by which the content of print editions ends up directly at odds with the editions held in electronic databases (from an explanation written in 2006):
    http://searchenginewatch.com/showPage.html?page=3613561
    The Archives: Free vs. Fee

    Like many online news sites, the New York Times offers some content free to anyone with no registration required, some free content to users who have registered with a user name and password, and some content only to paying subscribers. But unlike other newspapers, which often remove all content to subscriber-only archives after a period of a week to a month, the vast majority of the content on the Times is available to anyone without a subscription. Let's look at how this works.

    The Times classifies content in three ways: Seven day content, "open" content, and archived/Times Select content. The Times produces about 500 articles per day, or about 3,500 articles per week which are freely available to anyone without registration for seven days from the publication date. After seven days, these articles are moved to either the open area or the archived/Times Select area, depending on the type of article.

    But there's a catch: Seven day content is free to anyone without subscription, though readers who continue to read other articles within the Times are asked to login or register for a free subscription once they've clicked more than five links. The Times plans to increase this threshold to eight clicks soon.

    When content changes from seven day to open status, articles keep the same URL but are physically moved to another server. This content remains accessible to both search engines and users alike. Marshall says open content consists of more than 20 million documents from the papers, including general news, theater and movie reviews, sports news, classifieds and so on.

    In all, 97% of the overall site ends up classified as open content, freely accessible to anyone who hasn't used up their 5 link quota.

    The remaining 3% is classified as archived content, also called "Times Select" materials. This content consists of daily columns, op-ed editorials, special features and so on. To access this content, you must be a subscriber to the print edition of the Times, or pay a $49.95 annual subscription fee.

    Despite this restriction for human users, Marshall says that both Google and Yahoo have been allowed to fully index the premium content, and will display results for matching queries (for example, a search for popular Times Op-Ed columnist Thomas Friedman in Google and Yahoo returns hundreds of results).

    Click through on many of these links from both engines, however, and you won't see the content indexed by the search engines. Rather, the Times web site detects that the user agent is a browser, and serves up a shorter abstract page with a login form to access the premium content.

    Isn't this cloaking—serving different pages to a search engine and an individual web browser? Yes, it is.

    Although both Google and Yahoo warn against cloaking, Marshall says both companies are aware of what the Times is doing, and apparently condone the practice.

    "They want the content, and they're very interested in displaying it," says Marshall.

    Google has allowed cloaked content from other sources before, and we've seen other instances where search engines are apparently looking the other way when cloaking is used by a web site. We plan to do a follow-up on cloaking policies at all of the search engines in the near future.

    [Note: A discussion over at our Search Engine Watch Forums since this article was written has Danny deciding that the NY Times isn't cloaking, since humans who either register with or have paid subscriptions with the New York Times do ultimately see the same content as was indexed. Marshall also emphatically stated in a follow-up conversation that the New York Times does not cloak.]
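
The access rules described in the quoted article can be put into a rough sketch. This is only an illustration of the description above, not the Times's actual code: the names `Article`, `tier` and `serve` are hypothetical, and treating every article as free during its first seven days is my assumption from the text.

```python
from dataclasses import dataclass

FREE_CLICK_QUOTA = 5    # "5 link quota" from the quoted article
SEVEN_DAY_WINDOW = 7    # days an article stays in the seven-day area


@dataclass
class Article:
    title: str
    age_days: int
    premium: bool  # columns, op-eds, special features ("Times Select")


def tier(article):
    """Classify an article the way the quoted text describes (assumed model)."""
    if article.age_days <= SEVEN_DAY_WINDOW:
        return "seven_day"
    return "times_select" if article.premium else "open"


def serve(article, is_crawler=False, is_subscriber=False,
          is_registered=False, clicks_so_far=0):
    """Return what a given visitor would get to see."""
    if is_crawler:
        # Search engines are allowed to index the full text.
        return "full"
    if tier(article) == "times_select":
        # Browsers without a subscription get an abstract and a login form.
        return "full" if is_subscriber else "abstract_with_login"
    # Seven-day and open content: free until the click quota is used up.
    if clicks_so_far >= FREE_CLICK_QUOTA and not (is_registered or is_subscriber):
        return "register_prompt"
    return "full"


column = Article("Op-ed column", age_days=30, premium=True)
print(serve(column, is_crawler=True))   # what the search engine indexes
print(serve(column))                    # what an anonymous reader gets
```

The point of the sketch: the same URL yields different content depending on who asks, so what a search engine has indexed and what a reader can open are not the same set.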



More broadly, it is the companies' own archivists who decide what exactly is available, what is available for a fee, and what is not available at all.
A search of the open site that returns nothing (all the more so a search for a single word) does not guarantee, and does not prove, that the material never appeared in the print edition.
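
A toy illustration of why an empty search result proves nothing. The records below are invented for the example, and `search_open_site` and `search_print` are hypothetical names; the only thing it models is a search that sees just the openly published subset.

```python
# Invented sample records: "open" marks what the public site exposes.
ARCHIVE = [
    {"title": "City news item", "text": "Council votes on budget", "open": True},
    {"title": "Op-ed column", "text": "The tariff debate continues", "open": False},
]


def search_open_site(archive, word):
    """Search only what an anonymous visitor can reach."""
    word = word.lower()
    return [a["title"] for a in archive
            if a["open"] and word in a["text"].lower()]


def search_print(archive, word):
    """Search everything that was actually printed."""
    word = word.lower()
    return [a["title"] for a in archive if word in a["text"].lower()]


print(search_open_site(ARCHIVE, "tariff"))  # nothing found on the open site
print(search_print(ARCHIVE, "tariff"))      # yet the item was printed
```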

