Google has a history of scraping content that they want; their business is built on the back of scraping other people's content. The story I read recently about what happened to Celebrity Net Worth was interesting: Google asked for an API, they refused, and Google just scraped the content anyway. There was no lawsuit, but CNW put up fake content and, sure enough, it made its way to Google.
It is all ironic given how aggressive Google are in blocking any attempts to scrape its content.
I’d say most of Genius’ visitors come from “song x lyrics” searches, so hiding those pages with robots.txt would ultimately make them lose almost all of their traffic.
To be fair to TC, if you disable JavaScript you get a pretty good experience - just the full article, legible. Not like those sites that require JS to load the text and/or images.
robots.txt is designed to keep garbage off search results. It has absolutely no power to prevent a bot from doing anything. Also, if the site added robots.txt they might as well shut down, because their entire userbase comes from people searching for lyrics on Google.
The problem is that Google is stealing content and placing it on search so the user never goes to the source. By blocking it with robots.txt they block themselves from Google results AND Google may still keep scraping the content.
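To make the "no power" point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` user agent, the example.com URLs, and the helper name `polite_can_fetch` are all made up for illustration. The check only tells a crawler what the site is asking for; nothing in it stops a bot that never runs the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every crawler to stay out of /lyrics/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /lyrics/
"""


def polite_can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return what a well-behaved crawler would decide for this URL.

    This is purely advisory: it reads the rules and answers yes or no.
    A scraper that skips this call is not blocked by anything here.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Illustrative URLs only.
lyrics_allowed = polite_can_fetch(
    ROBOTS_TXT, "ExampleBot", "https://example.com/lyrics/some-song"
)
about_allowed = polite_can_fetch(
    ROBOTS_TXT, "ExampleBot", "https://example.com/about"
)
print(lyrics_allowed, about_allowed)
```

The design point is that enforcement lives entirely in the crawler: the site publishes a request, and compliance is voluntary.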
> The Unicorn tier is for large companies or companies that would like to have a reciprocal relationship with our foundation. If you need special guarantees, indemnities or require us to sign your contract for a data license, please select this tier. If you have another creative idea you would like to propose, please also select the unicorn tier.
> For any of these cases, please detail your request in the company information field and we will work with you to fit your company's mythical situation. We will also find an appropriate monthly support amount to our non-profit foundation of $1500 or more per month. Please always consider enabling the growth of our non-profit foundation and the continuous growth of our metadata!
We live in a society of laws. Even soldiers. Google have shown they have no respect for the law nor for equality before it, and will cheat while using the law as a cudgel. Recall that law exists so that the strongest might not always get their way. "Ironic" is the polite way of pointing this out.
Without law, Google cease to exist immediately. They are incapable of enforcing property rights without it.
Pardons aside, soldiers go to jail for taking an attitude like Google's.
Just like Genius, Google licensed the lyrics. If they didn't, the publishers definitely would have sued.
Ironically, it is Genius that seems to have no respect for copyright law. Genius ended up having to settle a case years ago because they were using lyrics without the appropriate licensing [1].
Which law did Google break? Scraping in and of itself isn't illegal last time I checked, and the USA doesn't have database copyrights, unlike some jurisdictions.
> somewhat, per things like the Americans with Disabilities Act
This is just not right at all. There is nothing in the Americans with Disabilities Act that makes blocking scrapers illegal.
I think you mean you don't like the power imbalance of the large company taking away from smaller companies while using technological means to stop the same thing happening to them.
I don't like it either, but that doesn't magically make it illegal. I'm not even sure it should be.
> There is nothing in the Americans with Disabilities Act that makes blocking scrapers illegal.
Retrieving, processing, and displaying information in a manner contrary to the wishes of the provider of that information is necessary for accessibility to disabled users. As a specific example, any attempt to block use of wget for scraping also blocks use of wget as part of a `wget | filter | text-to-speech` pipeline[0], and is thus a discrimination against blind or otherwise visually impaired users. The ADA is, as mentioned, only somewhat effective in prohibiting such things, though.
> it's not illegal
> that doesn't magically make it illegal.
I don't think anyone is claiming that scraping itself actually is legally protected - I interpreted DigitalSea and harry8 as implying that it should be.
0: in either the shell sense or the workflow sense
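As a rough sketch of what the `filter` stage in a `wget | filter | text-to-speech` pipeline might look like, the following uses only Python's standard `html.parser`. The class and function names (`TextExtractor`, `html_to_text`) are hypothetical, and a real accessibility tool would handle far more markup; this only strips tags and drops script and style content so a speech engine gets plain text.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside script/style
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    """Reduce an HTML document to a single line of readable text."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(extractor.chunks)


# Illustrative input only.
sample = (
    "<html><head><style>p{}</style><script>var x=1;</script></head>"
    "<body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
)
print(html_to_text(sample))
```

Wired to standard input and output, a function like this would sit between the fetch step and the speech step, which is why blocking the fetch step also breaks everything downstream of it.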
> Retrieving, processing, and displaying information in a manner contrary to the wishes of the provider of that information is necessary for accessibility to disabled users. As a specific example, any attempt to block use of wget for scraping also blocks use of wget as part of a `wget | filter | text-to-speech` pipeline[0], and is thus a discrimination against blind or otherwise visually impaired users. The ADA is, as mentioned, only somewhat effective in prohibiting such things, though.
This is not the case. Unfortunately (?) the ADA doesn't allow the disabled person to specify their own technology. If Google can reasonably say that text-to-speech works via a standard screen reader (which it does) then they are ok.
> The ADA is, as mentioned, only somewhat effective in prohibiting such things, though
Well that's not the intent of the ADA, so not really surprising.
> It is all ironic given how aggressive Google are in blocking any attempts to scrape its content.