Wikidata problems to be solved

Wikidata is a collaboratively edited multilingual knowledge graph hosted by the Wikimedia Foundation. It was launched in late 2012, 8 years ago. At the time of writing, Wikidata contains 93,647,881 items, and 1,422,596,352 edits have been made since the project launched.


Interest in Wikidata has been growing recently, with more dedicated events such as the Wikidata Workshop @ISWC and the Wiki Workshop @WWW. A recent systematic literature review by Marçal Mora-Cantallops et al., "A systematic literature review on Wikidata", also nicely summarizes the research efforts around Wikidata.


Keynote by Lydia Pintscher at the Wikidata Workshop 2020, co-located with ISWC 2020


In her keynote at the Wikidata Workshop 2020, co-located with ISWC 2020, Lydia Pintscher, a product manager of Wikidata, pointed out critical challenges that remain despite promising progress over the last 10 years, including the following points:

Reliability and growth

Q: How can we make Wikidata's data openly available for querying at scale?

Dream: Free and open graph DB with fast updates and response times for a lot of data


Data quality

Q: How can we measure the quality of our data more accurately?

Looking at factors such as accuracy, objectivity, completeness

Dream: Reliable automated quality assessment at scale


Q: How can we reliably find and address issues in the data we have?

Vandalism and misinformation, outdated data, ontology problems

Dream: Automated identification and verification tools; 

Tools for finding and suggesting fixes for ontology issues


Language and Culture coverage

Q: Where do we stand?

Dream: Privacy-respecting way of understanding the make-up of the editor community; 

Clarity on conflicts and tool usage around knowledge discovery

Good understanding and communication around the gaps and biases in the data
