Modeling user reputation in wikis

Authors: Sara Javanmardi, Cristina Lopes, Pierre Baldi
Citation: Statistical Analysis and Data Mining 3 (2): 126-139. 2010 April
DOI: 10.1002/sam.10070.

Modeling user reputation in wikis describes a model for user reputation on Wikipedia.

The researchers wanted to estimate the reputation of user i at a specific time t, R_i(t), as a value scaled between 0 and 1.


Method

They regard "tokens" as either good-quality or poor-quality. A good-quality token is a token that "is present after the intervention of the admin" (page 4). They then consider the number of tokens inserted by author i until time t, N_i(t), and the number of good-quality tokens inserted by author i until time t, n_i(t).

They consider three different models:

  • The fraction of good-quality tokens among all tokens inserted by a user.
  • The first model extended so that tokens deleted quickly weigh more heavily as poor-quality tokens.
  • The second model extended with the reputation of the user deleting the token.
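The first model can be sketched as a plain ratio of the two counts. The snippet below is a minimal illustration and not the authors' code: the function name `reputation_ratio` and the decision to return 0.0 for a user without any inserted tokens are assumptions made here.

```python
def reputation_ratio(good_tokens: int, total_tokens: int) -> float:
    """Sketch of model 1: n_i(t) / N_i(t), the share of a user's
    inserted tokens that count as good-quality."""
    if total_tokens < 0 or not 0 <= good_tokens <= total_tokens:
        raise ValueError("need 0 <= good_tokens <= total_tokens")
    if total_tokens == 0:
        # Assumption: a user with no insertions gets the lowest score.
        return 0.0
    return good_tokens / total_tokens


if __name__ == "__main__":
    # A hypothetical user who inserted 40 tokens, 30 of them good-quality.
    print(reputation_ratio(30, 40))
```

Because the good-quality count can never exceed the total count, the result always falls between 0 and 1, which matches the scaling of R_i(t).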

The model has only one parameter: the decay constant of the exponential function that determines how "quick" a deletion is.
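To make the role of the decay parameter concrete, the sketch below weights each deleted token by an exponential of its lifetime, so that a token removed shortly after insertion counts almost fully against its author. This page does not reproduce the paper's formula, so the way the penalty enters the ratio, the clamping to [0, 1], and the names `deletion_penalty`, `decayed_reputation`, `alpha` and `deleted_lifetimes` are all assumptions for illustration.

```python
import math
from typing import Iterable


def deletion_penalty(lifetime: float, alpha: float) -> float:
    """Weight of one deleted token: 1 when removed immediately,
    approaching 0 the longer the token survived."""
    if lifetime < 0 or alpha <= 0:
        raise ValueError("lifetime must be >= 0 and alpha must be > 0")
    return math.exp(-alpha * lifetime)


def decayed_reputation(good_tokens: int, total_tokens: int,
                       deleted_lifetimes: Iterable[float],
                       alpha: float) -> float:
    """Assumed form: (good tokens - summed penalties) / all tokens,
    clamped to the interval [0, 1]."""
    if total_tokens == 0:
        return 0.0  # assumption: no insertions means lowest score
    penalty = sum(deletion_penalty(t, alpha) for t in deleted_lifetimes)
    score = (good_tokens - penalty) / total_tokens
    return max(0.0, min(1.0, score))


if __name__ == "__main__":
    # Hypothetical user: 10 tokens inserted, 8 good, 2 deleted
    # after 0.1 and 5.0 time units respectively.
    print(decayed_reputation(8, 10, [0.1, 5.0], alpha=1.0))
```

A larger `alpha` narrows the time window in which a deletion is treated as "quick"; a smaller one penalizes late deletions almost as much as early ones.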

They use MD5 signatures to compare revisions (page 11) and an algorithm by P. Heckel for diffs: A technique for isolating differences between files. Their tool is called Wikipedia Event Extractor and was (possibly still is) publicly available, although the link no longer appears to work.
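Comparing revisions by MD5 signature can be sketched with Python's standard `hashlib` module. This is an illustration only and not the Wikipedia Event Extractor: the helper names are hypothetical, and using matching signatures to spot a revision that restores earlier text is an assumption about what the comparison is for.

```python
import hashlib


def revision_signature(text: str) -> str:
    """MD5 hex digest of a revision's text (UTF-8 encoded)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def find_identical_revisions(revisions: list[str]) -> list[tuple[int, int]]:
    """Return (earlier_index, later_index) pairs where a later revision
    has exactly the same text as the first revision with that signature."""
    first_seen: dict[str, int] = {}
    pairs: list[tuple[int, int]] = []
    for index, text in enumerate(revisions):
        signature = revision_signature(text)
        if signature in first_seen:
            pairs.append((first_seen[signature], index))
        else:
            first_seen[signature] = index
    return pairs


if __name__ == "__main__":
    history = ["Original text.", "Vandalized text!", "Original text."]
    print(find_identical_revisions(history))
```

Hashing lets identical revisions be found in one pass over the history without comparing full texts pairwise; a token-level diff is then only needed for revisions whose signatures differ.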

Data

crawler4j was used to crawl the English Wikipedia and download 1.9 million articles and their revisions in the summer of 2009.

Properties of the dataset:

  • 124 million revisions.
    • 83 million by anonymous users
    • 41 million by registered users
  • 12.8 million users.
    • 1.7 million registered users
    • 11 million anonymous users

Results

  • On average, admins submit 11% of the revisions of a page.

Related papers

  1. A content-driven reputation system for the Wikipedia
  2. A utility for estimating the relative contributions of wiki authors
  3. Computing trust from revision history
  4. Evaluating authoritative sources in collaborative editing environments
  5. Investigations into trust for collaborative information repositories: a Wikipedia case study
  6. Measuring article quality in Wikipedia: models and evaluation
  7. Mining revision history to assess trustworthiness of article fragments
  8. Modeling trust in collaborative information systems
  9. Structuring wiki revision history
  10. Wikirep: digital reputation in virtual communities

Critique

  1. What is a "token" precisely?
  2. It is unclear why there is a minus subscript in equation 1.