> People tend to overlook the decay of the modern web, when in fact these numbers are extraordinary—they represent a comprehensive breakdown in the chain of custody for facts.
This is a particularly good quote to sum up the article. The internet is not a repository of facts; it is a repository of facts, spam, junk, and everything in between. Moreover, it is not the only repository of these.
Link rot happens. Content stays up only as long as the publisher is willing to spend the time and money to keep hosting it.
Depending on links to work eternally is a mistake. The problem is not the link rot, it is the bad assumption.
I know somebody who started a business that was successful for a while and then failed. Spammers got control of the domain and now it is full of ads for a dangerous diet drug.
What makes my blood boil is that it impugns the integrity of the founder, a decent person who has nothing to do with that scam.
We shut down the business and abandoned the domain. Someone registered the domain, created a similar-looking website by hand (recycling a lot of the text and images), and added spam. It even has my old company address.
This is a .st domain, about $35/yr. The web design work cost something too. More than I would have expected the link juice from a single website to be worth.
What we really need is some sort of DNS record or meta content we can add that tells search engines "this domain is being abandoned, destroy all link juice".
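To be concrete, no such mechanism exists today; the closest real tools (a 410 Gone response, robots.txt) only work while you still control the hosting. A hypothetical version of this signal, as a DNS TXT record and a meta tag, might look like:

```
; hypothetical TXT record -- no search engine currently honors anything like this
example.st.  3600  IN  TXT  "search-policy=abandoned; discard-inbound-link-equity"
```

```html
<!-- equally hypothetical meta tag; "abandoned" is not a recognized robots value -->
<meta name="robots" content="abandoned">
```

The point of putting it in DNS is that it would survive as long as the original owner holds the registration, and a new registrant starting fresh DNS records would not inherit it.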
The traffic you get from people following links is measurable (and real!). The traffic you get from "link juice" is imagined.
The original PageRank paper modeled a "random surfer": PageRank approximates the distribution of views across web pages if people followed links at random.
If Google wanted to know what people are viewing today, it wouldn't need to collect a link graph and do matrix math. It could measure it directly with Chrome, Google Analytics, and data exhaust from its advertising platform.
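For anyone who hasn't seen it, the random-surfer model above is just a power iteration over the link graph. A minimal sketch (the three-page graph and the standard 0.85 damping factor are illustrative, not from the paper's examples):

```python
# Random-surfer PageRank: with probability d the surfer follows a random
# outgoing link, with probability 1-d they jump to a random page.
# The rank vector converges to the surfer's stationary distribution.

def pagerank(links, d=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - d) / n for p in pages}
        for p, outgoing in links.items():
            if outgoing:
                # a page splits its rank evenly among its outgoing links
                share = d * rank[p] / len(outgoing)
                for q in outgoing:
                    new_rank[q] += share
            else:
                # dangling page: its rank is spread over all pages
                for q in pages:
                    new_rank[q] += d * rank[p] / n
        rank = new_rank
    return rank

# "b" is linked from both "a" and "c", so it ends up ranked highest.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "b"]}
ranks = pagerank(graph)
```

Which is exactly the point: this estimates what people *would* view under a random-walk assumption, something Google can now just observe.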
Nodes in keywordspace don’t die with the businesses that created them. The popularity of domains and links and words and phrases is permanently altered by the existence of the business. It’s a digital footprint, like how a real-world business leaves a physical footprint. Some footprints are harmless: just a memory of activity that once happened. Other footprints cause lasting harm, like contaminated soil.
Abandoned formerly-popular domains create a kind of long-tail info-environmental impact, just like an abandoned warehouse can become a real-world hazard.
Or the FDA could put alternative-health scammers in jail.
Back in the 1950s they put Wilhelm Reich in jail, where he died. L. Ron Hubbard got the hint and left the country; when no country was safe, he went to sea.
Today people like Dr. Oz run alt-health scams continuously and nobody seems to go to jail or even get a fine.
Exactly. The Atlantic author seems to be laboring under the misguided assumption that the web is somehow the same sort of thing as a library of books. Even libraries often have some degree of garbage information in them, and represent a survival story: the vast majority of books ever written are no longer in print, or even discoverable anywhere.
Good stuff should be preserved, but it's not the Internet's job to somehow magically do it. It's OUR job, and the nature of digital information (DRM notwithstanding) makes this easier than ever.