This might be of interest in relation to your earlier post about the National Newspapers of Ireland taking issue with search engines ‘pirating’ their content.
All the major search engine crawlers obey a file called robots.txt, in which you can set rules about what they are allowed to index.
All of the major newspapers can make use of this file, and you may be surprised to know that all the major Irish newspapers explicitly allow search engines to index their content.
If they don’t want links to their content showing up in search results, all they have to do is add the following line to the file and all will be sorted:
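The exact line is not reproduced here, so as a rough illustration only: the usual way to turn every crawler away from a whole site is a `User-agent: *` rule followed by `Disallow: /`. The sketch below uses Python's standard `urllib.robotparser` to show how a well-behaved crawler would read such a file. The two sample files, the `is_allowed` helper and the example.com URL are made up for the example and are not taken from any newspaper's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that turns every crawler away from the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that lets every crawler in (an empty Disallow).
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://example.com/news/story.html"  # placeholder URL
    print(is_allowed(BLOCK_ALL, "Googlebot", story))
    print(is_allowed(ALLOW_ALL, "Googlebot", story))
```

Running the same check against a site's real robots.txt is one quick way to see whether it welcomes or refuses crawlers.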
But why would they do so, when most of their traffic comes from search engines?
Earlier: The Dead Tree Trolls