Spark

From Brede Wiki
Software
Spark
Developer: Apache Software Foundation
License: Apache License
Database(s): Wikidata

Examples

Installation

From the command line, run:

# Releases are listed at https://spark.apache.org/downloads.html
wget http://d3kbcqa49mib13.cloudfront.net/spark-2.0.0-bin-hadoop2.7.tgz

# Unpack under /opt and point /opt/spark at the versioned directory
sudo mv spark-2.0.0-bin-hadoop2.7.tgz /opt/
cd /opt
sudo tar xvzf spark-2.0.0-bin-hadoop2.7.tgz
sudo ln -s spark-2.0.0-bin-hadoop2.7 spark

pyspark

/opt/spark/bin/pyspark

The session below follows https://spark.apache.org/docs/0.9.1/python-programming-guide.html. The pyspark shell creates the SparkContext sc automatically at startup.

>>> words = sc.textFile("/usr/share/dict/words")
>>> words
/usr/share/dict/words MapPartitionsRDD[1] at textFile at NativeMethodAccessorImpl.java:-2
>>> words.filter(lambda w: w.startswith("spar")).take(5)
[u'spar', u"spar's", u'spare', u"spare's", u'spared']
>>> words.count()
99171
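
As a rough cross-check of what the filter, take and count calls above compute, the same pipeline can be sketched in plain Python without Spark. The function name prefix_summary and the sample word list are illustrative assumptions and not part of Spark or of the original session.

```python
from itertools import islice


def prefix_summary(words, prefix, n=5):
    """Return (first n words starting with prefix, total number of words)."""
    words = list(words)  # materialize so the input can be scanned twice
    # Lazily keep only matching words, analogous to words.filter(lambda w: ...)
    matches = (w for w in words if w.startswith(prefix))
    # Stop after the first n matches, analogous to .take(n)
    first_n = list(islice(matches, n))
    # Total number of input words, analogous to words.count()
    return first_n, len(words)


# Hypothetical sample standing in for /usr/share/dict/words
sample = ["spa", "spar", "spar's", "spare", "spark", "sparse", "speak"]
first, total = prefix_summary(sample, "spar", n=3)
print(first, total)
```

To compare against the Spark session, the lines of /usr/share/dict/words can be passed in place of the sample list. The word list differs between systems, so the count of 99171 shown above should not be expected everywhere.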