
It's Wikidata, not Wikipedia; they are two disjoint datasets.


Basically every Wikipedia page (across languages) is linked to a Wikidata item, and some infoboxes are generated directly from Wikidata, so they're separate but overlapping, and increasingly so.

https://en.wikipedia.org/wiki/Category:Articles_with_infobox...

edit: a slightly wider-scope category, pointing to pages that use Wikidata in different ways:

https://en.wikipedia.org/wiki/Category:Wikipedia_categories_...
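
To make the page-to-item link concrete, here is a rough sketch of how one might look up the Wikidata item ID for a Wikipedia page through the MediaWiki Action API, using `prop=pageprops` with `ppprop=wikibase_item`. The helper names `wikidata_lookup_url` and `extract_qid` are made up for illustration, and the sample response is hand-written in the shape the API is expected to return (the page ID in it is a placeholder), so treat it as an assumption rather than captured output.

```python
from typing import Optional
from urllib.parse import urlencode

# English Wikipedia's Action API endpoint; other language editions use their own host.
API = "https://en.wikipedia.org/w/api.php"


def wikidata_lookup_url(title: str) -> str:
    """Build a query URL asking for the page's linked Wikidata item (hypothetical helper)."""
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "wikibase_item",
        "titles": title,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def extract_qid(response: dict) -> Optional[str]:
    """Return the first Wikidata item ID found in a parsed response, or None (hypothetical helper)."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        qid = page.get("pageprops", {}).get("wikibase_item")
        if qid:
            return qid
    return None


# Hand-written sample in the assumed response shape; "12345" is a placeholder page ID.
sample_response = {
    "query": {
        "pages": {
            "12345": {
                "pageid": 12345,
                "ns": 0,
                "title": "Douglas Adams",
                "pageprops": {"wikibase_item": "Q42"},
            }
        }
    }
}

url = wikidata_lookup_url("Douglas Adams")
qid = extract_qid(sample_response)
```

Fetching `url` with any HTTP client and passing the decoded JSON to `extract_qid` should give the item ID for a linked page; a page with no Wikidata item simply has no `wikibase_item` entry, and the helper returns `None`.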


I agree there is strong overlap in entities, and also in infobox values, but both Wikidata and Wikipedia have many more disjoint datapoints: many tables and factual statements in Wikipedia are not in Wikidata, and many statements in Wikidata are not in Wikipedia.



