Pattern (software)

Software
Description: Python package for text mining with web documents
Developer: Tom De Smedt
Language: Python
License: BSD
Feature(s): Sentiment analysis

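Pattern is a text mining package for Python. It bundles modules for web mining (pattern.web), natural language processing (pattern.en), pattern matching over parsed text (pattern.search), and machine learning with vector space models (pattern.vector).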

The package is available from the Python Package Index and can be installed with pip (pip install pattern).

The software is briefly described in the paper Pattern for Python.

Example from paper

from pattern.web import Twitter
from pattern.en import Sentence, parse
from pattern.search import search
from pattern.vector import Document, Corpus, KNN
corpus = Corpus()
for i in range(1,15):
    for tweet in Twitter().search('#win OR #fail', start=i, count=100):
        p = '#win' in tweet.description.lower() and 'WIN' or 'FAIL'
        s = tweet.description.lower()
        s = Sentence(parse(s))
        s = search('JJ', s) # JJ = adjective
        s = [match[0].string for match in s]
        s = ' '.join(s)
        if len(s) > 0:
            corpus.append(Document(s, type=p))
classifier = KNN()
for document in corpus:
    classifier.train(document)
print classifier.classify('sweet') # yields 'WIN'
print classifier.classify('stupid') # yields 'FAIL'
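
The example above trains a k-nearest-neighbour classifier on adjectives taken from tweets and then labels a new word by looking at the most similar training documents. The following is a minimal, self-contained sketch of that idea in plain Python 3, assuming bag-of-words vectors and cosine similarity. The names SimpleKNN, vectorize and cosine_similarity are hypothetical, the training data is made up for illustration, and nothing here should be read as Pattern's own implementation of KNN.

```python
from collections import Counter
from math import sqrt


def vectorize(text):
    """Turn a string into a bag-of-words vector (word -> count)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SimpleKNN:
    """Hypothetical stand-in for a k-NN text classifier (not Pattern's KNN)."""

    def __init__(self, k=3):
        self.k = k
        self.examples = []  # list of (vector, label) pairs

    def train(self, text, label):
        # k-NN has no fitting step: training only stores the labelled example.
        self.examples.append((vectorize(text), label))

    def classify(self, text):
        query = vectorize(text)
        # Rank stored examples by similarity to the query, most similar first.
        ranked = sorted(
            self.examples,
            key=lambda example: cosine_similarity(query, example[0]),
            reverse=True,
        )
        # Majority vote among the k nearest neighbours.
        votes = Counter(label for _, label in ranked[:self.k])
        return votes.most_common(1)[0][0] if votes else None


# Made-up adjective strings standing in for text mined from tweets.
training = [
    ("sweet awesome great", "WIN"),
    ("sweet nice happy", "WIN"),
    ("great good sweet", "WIN"),
    ("stupid bad terrible", "FAIL"),
    ("stupid awful broken", "FAIL"),
    ("bad stupid ugly", "FAIL"),
]

classifier = SimpleKNN(k=3)
for text, label in training:
    classifier.train(text, label)

print(classifier.classify("sweet"))   # WIN
print(classifier.classify("stupid"))  # FAIL
```

With this toy data the three nearest neighbours of each query all carry the same label, so the vote is unanimous. On real tweets the result depends on what was collected and on the choice of k.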

Papers

  1. Creative Web Services with Pattern
  2. Pattern for Python

Related software

