Wiki Workshop 2016 at ICWSM

From Brede Wiki
Location: Cologne, Germany

Wiki Workshop


Wiki Workshop 2016 at ICWSM was a workshop in the Wiki Workshop series at the ICWSM in 2016. Research on Wikipedia and Wikidata was presented.


Invited talks

Quick notes not necessarily completely correct.

Martin Potthast

Wikipedia Text Mining — Uncovering Quality and Reuse

Key objectives:

  1. Improve low-quality content
  2. Maintain high-quality content
  • There is quite a lot of research on classifying articles as featured or not. Although academically interesting, it has little practical importance.

Anderka 2012 counts 445 types of quality flaws in the English Wikipedia. Verifiability is the most frequent quality flaw. Potthast talked about a one-class estimation of quality flaws, with the operating point on the precision-recall curve based on F_0.2.
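
The F_0.2 measure mentioned above is the F_beta score with beta = 0.2, which weights precision much more heavily than recall. The following is a minimal sketch of how an operating point could be picked with it. The curve points and the function names are hypothetical illustrations, not data or code from the talk.

```python
def f_beta(precision, recall, beta=0.2):
    """Weighted harmonic mean of precision and recall; beta < 1 favors precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def best_operating_point(points, beta=0.2):
    """Return the (threshold, precision, recall) triple with the highest F_beta."""
    return max(points, key=lambda p: f_beta(p[1], p[2], beta))


# Hypothetical points on a precision-recall curve: (threshold, precision, recall).
curve = [(0.9, 0.95, 0.10), (0.7, 0.80, 0.40), (0.5, 0.60, 0.70), (0.3, 0.40, 0.90)]
chosen = best_operating_point(curve, beta=0.2)
```

On these made-up points, F_0.2 selects a higher-precision threshold than the balanced F_1 would, which fits a setting where false flaw reports are costly.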

Wikipedia text reuse detection (sometimes a euphemism for plagiarism).

Text reuse: Quotation, boilerplate, translation (metaphrase, paraphrase), summarization. All of these could be plagiarism. See also Potthast 2011.

Algorithm: Keyword extraction, source retrieval, 4-gram match, knowledge-based post-processing.
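
The 4-gram match step can be sketched as below. This covers only that one step; keyword extraction, source retrieval and post-processing are not shown. It assumes word-level 4-grams (the notes do not say whether words or characters were used), and the example texts and function names are hypothetical.

```python
import re


def word_ngrams(text, n=4):
    """Return the set of word n-grams (as tuples) in a lowercased text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_ngrams(suspicious, source, n=4):
    """Word n-grams that occur in both texts; candidates for reused passages."""
    return word_ngrams(suspicious, n) & word_ngrams(source, n)


# Hypothetical example texts.
article = "The quick brown fox jumps over the lazy dog near the river."
webpage = "Yesterday the quick brown fox jumps over a fence."
matches = shared_ngrams(article, webpage)
```

Overlapping 4-grams only point to candidate passages; whether a match is a legitimate quotation or plagiarism still has to be decided afterwards.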

Picapica online search engine for text reuse: Potthast 2014.

Question: Why is the quality of Wikipedia so high? Potthast attempted an answer: Errors can be fixed and there are more people that correct errors than make them.

Claudia Wagner

Gender Inequalities in Wikipedia

  • How are notable men and women presented in Wikipedia?
  • How are professions described on Wikipedia?

Notable men/women:

  • "Human accomplishment"
  • "Freebase"
  • Pantheon (11000 individuals, 13% women)

How are wo/men depicted in Wikipedia?

  • Linguistic bias, abstract positive and abstract negative. Men are described with more positive words.
  • Structural issues.
    • How do articles about wo/men link to wo/men?
    • Men are more well-connected.

Pictures in Wikipedia on pages about occupations, e.g., journalist.

Ofer Arazy

Emergent Work in Wikipedia

"Emergent roles": all-round contributors, quick and dirty editors, copy editors, layout shapers, vandals, ...

  • Wiki work manual annotation (remove vandalism, insert vandalism, ...)
    • 90 articles.
    • 13'592 reliable annotations
  • Trained a model to predict the manual annotation

Jürgen Pfeffer

Applying Social Network Analysis Metrics to Large-Scale Hyperlinked Data

Reference: Linton C. Freeman, "Centrality in social networks: conceptual clarification", which distinguishes different network centrality measures.
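
Two of the centrality measures discussed in Freeman's paper, degree and closeness, can be sketched as follows (betweenness is omitted for brevity). This is a minimal illustration assuming a small connected, undirected graph; the toy network and function names are hypothetical and not taken from the talk.

```python
from collections import deque


def degree_centrality(graph):
    """Number of neighbors divided by the maximum possible (n - 1)."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """(n - 1) divided by the sum of shortest-path distances to all other nodes.

    Assumes a connected, undirected graph.
    """
    n = len(graph)
    result = {}
    for start in graph:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        result[start] = (n - 1) / sum(dist.values())
    return result


# Hypothetical toy network: hub "A" linked to "B", "C", "D", plus an edge B-C.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
```

On large hyperlink graphs, the breadth-first search from every node becomes expensive, so this direct approach is only suitable for small examples.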

"Graph" and "network": "networks" are "graphs" with meaning.

Fabian M. Suchanek

A Hitchhiker's Guide to Ontology

Spoke about YAGO:

  • Extraction from Wikipedia
  • Extraction from GeoNames and WordNet.
  • Intermediate extractor
    • Clean facts
    • Deduplication
    • ...
  • Ensuring "high quality" (95%)
  • 10 languages
  • 100 relations
  • 100 million facts
  • 10 million entities
  • Used by DBpedia

Extension with products based on product ID.

Rule finding:

  • AMIE "is based on an efficient in-memory database implementation"
  • For instance, type(x, pope) => diedIn(x, Rome)
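
To illustrate how a rule such as type(x, pope) => diedIn(x, Rome) can be scored, the sketch below computes its support and standard confidence over a handful of triples. It does not reproduce AMIE's rule search or its in-memory database; the facts are invented toy data (not real YAGO content) and `rule_stats` is a hypothetical helper.

```python
# Hypothetical toy facts as (subject, relation, object) triples; not real YAGO data.
facts = {
    ("p1", "type", "pope"), ("p1", "diedIn", "Rome"),
    ("p2", "type", "pope"), ("p2", "diedIn", "Rome"),
    ("p3", "type", "pope"), ("p3", "diedIn", "Avignon"),
    ("s1", "type", "singer"), ("s1", "diedIn", "Rome"),
}


def rule_stats(facts, body, head):
    """Support and standard confidence of the rule body(x) => head(x).

    `body` and `head` are (relation, object) pairs applied to the same subject x.
    """
    body_subjects = {s for s, r, o in facts if (r, o) == body}
    head_subjects = {s for s, r, o in facts if (r, o) == head}
    support = len(body_subjects & head_subjects)  # subjects satisfying both sides
    confidence = support / len(body_subjects) if body_subjects else 0.0
    return support, confidence


support, confidence = rule_stats(facts, ("type", "pope"), ("diedIn", "Rome"))
```

Here support counts the subjects for which both body and head hold, and confidence is that count divided by the number of subjects matching the body.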

Ontology plagiarism:

  • Adding fake facts
  • Removal of facts (ISWC 2011)

Mining Le Monde

  • Enriching Le Monde from YAGO.
  • Statistics on people mentioned in Le Monde

His slides were made with Powerline using his own font.

Posters

  1. Semi-Supervised Automatic Generation of Wikipedia Articles for Named Entities
  2. ENRICH: A Query Expansion Service Powered by Wikipedia Graph Structure
  3. Similar Gaps, Different Origins? Women Readers and Editors at Greek Wikipedia
  4. Wiki Editors' Acceptance of Additional Guidance on Talk Pages
  5. What Can Wikipedia Tell Us about the Global or Local Character of Burstiness?
  6. State of the Union: A Data Consumer's Perspective on Wikidata and Its Properties for the Classification and Resolution of Entities
  7. Literature, geolocation and Wikidata
  8. Graph-based breaking news detection on Wikipedia
  9. A proposed solution for discovery of reusable technology pictures using textmining of surrounding article text, based on the infrastructure of Wikidata, Wikisource and Wikimedia Commons
  10. Extracting Semantics from Random Walks on Wikipedia: Comparing Learning and Counting Methods
  11. In Wikipedia We Trust: A Case Study
  12. Wikipedia Knowledge Graph with DeepDive
  13. Hidden Gems in the Wikipedia Discussions: The Wikipedians' Rationales
  14. Topical Interest and Degree of Involvement of Bilingual Editors in Wikipedia
  15. On the Reliability of Information and Trustworthiness of Web Sources in Wikipedia
  16. Collective Remembering in Wikipedia: The Case of Aircraft Crashes
  17. The Political Salience Dynamics and Users' Interaction Using the Example of Wikipedia within the Authoritarian Regime Context
  18. WikiLayers – A Visual Platform for Analyzing Content Evolution and Editing Dynamics in Wikipedia
  19. Cultural Relation Mining on Wikipedia: Beyond Culinary Analysis
  20. Discovery and efficient reuse of technology pictures using Wikimedia infrastructures. A proposal