From Brede Wiki


Category: Twitter

Social web site
Microblogging site


Twitter sentiment analysis



Twitter is a social web site with microblogging.

A retweet is a tweet that is copied from another tweet. Retweets may be copied from another message and prepended with "RT @user", "RT" or "RT:", or appended with "via @user". "RT please" is used by users who want to be retweeted. Retweets may also be sent by pressing the retweet button. Button retweets are not available from the user timeline.

Mentions are tweets containing "@user" where it does not indicate a retweet.
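The textual conventions for retweets and mentions can be turned into a rough classifier. The following is a minimal sketch: the regular expressions and the function name `classify_tweet` are illustrative assumptions, not rules published by Twitter, and they only look at the tweet text.

```python
import re

# Assumed patterns for the markers described above: a leading "RT @user",
# "RT" or "RT:" (but not the request "RT please"), or a trailing "via @user".
RETWEET_PREFIX = re.compile(r"^\s*RT\b(?!\s+please\b):?", re.IGNORECASE)
RETWEET_VIA = re.compile(r"\bvia\s+@\w+\s*$", re.IGNORECASE)
# "@user" not preceded by a word character, so e-mail addresses are skipped.
MENTION = re.compile(r"(?<!\w)@\w+")


def classify_tweet(text):
    """Return 'retweet', 'mention' or 'plain' for a tweet text."""
    if RETWEET_PREFIX.search(text) or RETWEET_VIA.search(text):
        return "retweet"
    if MENTION.search(text):
        return "mention"
    return "plain"


for example in ["RT @alice: hello world", "@carol thanks", "Lost dog, RT please"]:
    print(example, "->", classify_tweet(example))
```

Retweet markers are tested before mentions because a text retweet usually also contains "@user".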

Links (i.e., web links) are often shortened with URL shortening services. Twitter automatically constructs links from the "http://" pattern.
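A minimal sketch of finding links in tweet text through the "http://" pattern. The regular expression is an assumption for illustration (it also accepts "https://") and is not the rule Twitter itself applies; the function name `extract_links` is hypothetical.

```python
import re

# Assumed pattern: a scheme followed by any run of non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_links(text):
    """Return the links found in a tweet text, without trailing punctuation."""
    return [url.rstrip(".,;:!?)\"'") for url in URL_PATTERN.findall(text)]


print(extract_links("Read this http://example.com/abc123, great stuff"))
```

Shortened links found this way would still have to be resolved with an HTTP request to reach the original address.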

Some analyses characterize users with respect to "authority", "audience" and "realness" (PeerIndex).[1] This measurement has been applied to British Twitter users.[2]

Twitter receives 26K search queries per second.


Twitter data

Tweets may be obtained, e.g., from the streaming API, from the search API, and from other parts of the API, e.g., individual tweets retrieved by ID.

Data sets

Due to Twitter's new terms of service, several previously public data sets are no longer available.[3]

Edinburgh Twitter Corpus
MPI-SWS: 54,981,152 user accounts, 1,963,263,821 follow links, 1,755,925,520 tweets. No longer publicly available.
Haewoon Kwak's social graph (twitter_rv.tar.gz): Seems to have been unavailable at times.[5] The uncompressed file is approximately 26 GB. There are 41,652,230 profiles and 1,468,365,182 links, giving a density of 8.46e-07.
Observatory on Social Media
[1] [2] Data set for sentiment analysis.
Data set collected from June to December 2009 in the lab of Jure Leskovec.[6] No longer publicly available.
TREC 2011 dataset [3]
Twitter Sentiment Corpus: Collection by Niek Sanders consisting of "5513 hand-classified tweets".
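The density quoted for Haewoon Kwak's social graph can be checked with a short calculation. This sketch assumes that the second figure is the number of directed follow links and that density means the fraction of ordered pairs of distinct profiles that are linked; the source gives only the resulting value, so the definition is an assumption.

```python
def directed_density(num_nodes, num_links):
    """Fraction of possible directed links (ordered pairs of distinct nodes) present."""
    return num_links / (num_nodes * (num_nodes - 1))


profiles = 41_652_230
links = 1_468_365_182
density = directed_density(profiles, links)
print(f"{density:.2e}")  # 8.46e-07
```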

Fields

Each tweet has several fields.

| Field | Description | Example |
| --- | --- | --- |
| user | User structure | |
| favorited | Always empty in the streaming data | |
| retweet_count | Always empty in the streaming data. This presently does not reflect the number of retweets. | |
| contributors | Always empty in the streaming data | |
| truncated | Whether the message was truncated after retweeting | False |
| text | Actual Twitter text | makan malem KFC tapi gw yg ketiban belinya ... capek tau k mall palem -_- |
| created_at | Date of sending the tweet | Thu Sep 09 10:13:12 +0000 2010 |
| retweeted | "represents whether the user you are authenticating as has retweeted this status or not. The field is a boolean and can be true or false." | False |
| coordinates | Usually empty, seems to contain the same as 'geo' | |
| entities | User mentions, hashtags | |
| in_reply_to_status_id | Usually empty | |
| place | Usually empty; if set, contains a structure with country code, bounding box, city | |
| source | Program used to send the tweet, HTML-formatted | <a href="" rel="nofollow">Dabr</a> |
| in_reply_to_screen_name | Usually empty | |
| geo | Usually empty, can contain geographical coordinates | [14.45101058, 120.98492687] |
| id | Identifier for the status | 23996832400 |
| in_reply_to_user_id | Usually empty | |

In the streaming data the usual fields may not be available. This is indicated by a "delete" field, which comes with the user id and the status id of the deleted tweet.
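A minimal sketch of how one line of streaming data might be handled, assuming each line is a single JSON object. The nesting of the ids under "delete" and "status" is an assumption, since the text only says that the ids accompany the "delete" field. The function name `parse_stream_line` and the shape of the returned dictionary are hypothetical.

```python
import json
from datetime import datetime

# Format of "created_at" as in the example "Thu Sep 09 10:13:12 +0000 2010".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_stream_line(line):
    """Parse one JSON line into a small summary dictionary."""
    message = json.loads(line)
    if "delete" in message:
        # Deletion notice: the usual tweet fields are absent (assumed layout).
        status = message["delete"]["status"]
        return {"kind": "delete", "status_id": status["id"], "user_id": status["user_id"]}
    return {
        "kind": "tweet",
        "status_id": message["id"],
        "screen_name": message["user"]["screen_name"],
        "created_at": datetime.strptime(message["created_at"], CREATED_AT_FORMAT),
        "text": message["text"],
    }


print(parse_stream_line('{"delete": {"status": {"id": 1, "user_id": 2}}}'))
```

Checking for "delete" before reading any other field avoids a KeyError on deletion notices.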

The search interface only has the following fields: "profile_image_url", "created_at", "from_user", "metadata", "to_user_id", "text", "id", "from_user_id", "geo", "iso_language_code", "source". The "id" is not the same as the standard id.

User

| Field | Description | Example |
| --- | --- | --- |
| profile_use_background_image | | True |
| id | User id | 56706996 |
| verified | | False |
| followers_count | Integer for the number of followers | 71 |
| location | | Jakarta, Indonesia |
| statuses_count | Integer for the number of messages written | 1650 |
| description | Text description (autobiography) | I'm not perfect |
| friends_count | Integer for the number of friends | 65 |
| notifications | | None |
| screen_name | Twitter user name | sangguinirachel |
| lang | Should indicate language, but is often left at 'en' (English) | en |
| name | Real name | Sangguini Rachel PLS |
| url | Homepage | |
| created_at | | Tue Jul 14 14:26:56 +0000 2009 |
| contributors_enabled | | False |
| time_zone | | Jakarta |
| protected | | False |

Third party web services

PeerIndex: Analyzes a user's profile with respect to "authority", "activity" and "audience" as well as "realness".
Pulse of the Tweeters: Ranking of users with respect to influence on selected topics. The web service also has sentiment analysis for topics. The service is set up by researchers from the Center for Ultra-scale Computing and Information Security at Northwestern University.
Socialmention: A real-time Internet search engine for social media with text sentiment analysis and keyword, user and hashtag statistics across a number of services: Twitter, YouTube, Facebook, etc.
The Tweeted Times: Construction of a personalized news site from tweets.
TweetPsych: Creates a "psychological profile" and displays how users score on dimensions such as "social", "constructive", "sex", "work", etc. It is made by Dan Zarrella.
Trendistric: Plots curves of Twitter message volume as a function of time, based on a query term.
Trusty [4]
TweetFeel: Sentiment analysis.
Twinfluence: Social network analysis.
TwitGraph: Sentiment analysis based on a query. The code is available.
twitrratr: Sentiment analysis based on a query.
Twitter Sentiment

Twitter data providers

Researchers

  1. Aleksander Kołcz

Papers

  1. A new ANEW: evaluation of a word list for sentiment analysis in microblogs
  2. A tweet consumers' look at Twitter trends
  3. Altmetrics in the wild: Using social media to explore scholarly impact
  4. Analyzing factors to increase the influence of a Twitter user
  5. Beyond microblogging: conversation and collaboration in Twitter
  6. Bieber no more: first story detection using Twitter and Wikipedia
  7. Catching fish in the stream: real time analysis of audience behavior in social media
  8. Characterizing microblogs with topic models
  9. Crowd sentiment detection during disasters and crises
  10. Detecting and tracking the spread of astroturf memes in microblog streams
  11. Sarita Yardi, Daniel Romero, Grant Schoenebeck, Danah Boyd (2010). "Detecting spam in a Twitter network". First Monday 15(1). [6].
  12. Emerging topic detection on Twitter based on temporal and social terms evaluation
  13. Everyone's an influencer: quantifying influence on Twitter
  14. Extracting strong sentiment trends from Twitter
  15. Good friends, bad news - affect and virality in Twitter
  16. I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience
  17. Michael van Meeteren, Ate Poorthuis, Elenna Dugundji. "Mapping communities in large virtual social networks: using Twitter data to find the Indie Mac community". [7]
  18. Meeyoung Cha, Hamed Haddadi, Fabrício Benevenuto, Krishna P. Gummadi (2010). "Measuring user influence in Twitter: the million follower fallacy". [8]
  19. Modeling events with cascades of Poisson processes
  20. Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena
  21. Networks and language in the 2010 election
  22. Networked gatekeeping and networked framing on Egypt
  23. Predicting discussions on the social semantic web
  24. Sitaram Asur, Bernardo A. Huberman. "Predicting the future with social media". [9]
  25. Mike Thelwall, Kevan Buckley, Georgios Paltoglou (2010). "Sentiment in Twitter events". Journal of the American Society for Information Science and Technology. [10].
  26. Bernardo A. Huberman, Daniel M. Romero, Fang Wu (2009). "Social networks that matter: Twitter under the microscope". First Monday 14(1). [11].
  27. Structural predictors of tie formation in Twitter: transitivity and mutuality
  28. Temporal patterns of happiness and information in a global social network: hedonometrics and Twitter
  29. The role of multimedia content in determining the virality of social media information
  30. Tweet, tweet, retweet: conversational aspects of retweeting on Twitter
  31. Tweetin' in the rain: exploring societal-scale effects of weather on mood
  32. Tweeting about TV: sharing television viewing experiences via social media message streams
  33. Tweeting the meeting: an in-depth analysis of Twitter activity at Kidney Week 2011
  34. Tweets are forever: a large-scale quantitative analysis of deleted tweets
  35. Stephen Dann (2010). "Twitter content classification". First Monday 15. [12].
  36. Twitter mood predicts the stock market
  37. Bernard J. Jansen, Mimi Zhang, Kate Sobel, Abdur Chowdury (2009). "Twitter power: tweets as electronic word of mouth". Journal of the American Society for Information Science and Technology 60(11): 2169-2188. doi: 10.1002/asi.21149. [13].
  38. Brian P. Blake, Nitin Agarwal, Rolf T. Wigand, Jerry D. Wood (2010). "Twitter quo vadis: is Twitter bitter or are tweets sweet?".
  39. Twitter rank: finding topic-sensitive influential Twitterers
  40. Understanding the demographics of Twitter users
  41. Bongwon Suh, Lichan Hong, Peter Pirolli, Ed H. Chi (2010). "Want to be retweeted? large scale analytics on factors impacting retweet in Twitter network". Second IEEE International Conference on Social Computing (SocialCom). [14]
  42. Haewoon Kwak, Changhyun Lee, Hosung Park, Sue Moon (2010). "What is Twitter, a social network or a news media?".
  43. Whisper: tracing the spatiotemporal process of information diffusion in real time
  44. Who says what to whom on Twitter
  45. Akshay Java, Tim Finin, Xiaodan Song, Belle Tseng (2007). "Why we twitter: understanding microblogging usage and communities". Joint 9th WEBKDD and 1st SNA-KDD Workshop. [15]
  46. Wikipedia on Twitter: analyzing tweets about Wikipedia

Events

  1. Making Sense of Microposts, workshop at the Extended Semantic Web Conference 2011

References

  4. The Edinburgh Twitter Corpus
  5. What is Twitter, a social network or a news media?
  6. J. Yang, Jure Leskovec (2011). "Temporal variation in online media". ACM International Conference on Web Search and Data Mining (WSDM '11).