Building
Run the following to download and build PredictionIO from its source code.
$ git clone https://github.com/PredictionIO/PredictionIO.git
$ cd PredictionIO
$ git checkout master
$ ./make-distribution.sh
You should see something like the following when it finishes building successfully.
...
PredictionIO-0.9.6/sbt/sbt
PredictionIO-0.9.6/conf/
PredictionIO-0.9.6/conf/pio-env.sh
PredictionIO binary distribution created at PredictionIO-0.9.6.tar.gz
Extract the binary distribution you have just built.
$ tar zxvf PredictionIO-0.9.6.tar.gz
Installing Dependencies
Let us install dependencies inside a subdirectory of the PredictionIO installation. By following this convention, you can use PredictionIO's default configuration as is.
$ mkdir PredictionIO-0.9.6/vendors
Spark Setup
Apache Spark is the default processing engine for PredictionIO. Download and extract it.
$ wget http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz
$ tar zxvfC spark-1.5.1-bin-hadoop2.6.tgz PredictionIO-0.9.6/vendors
Storage Setup
PostgreSQL Setup
This section sets up PostgreSQL to work with PredictionIO.
Make sure you have PostgreSQL installed. For Mac users, Homebrew is recommended:
$ brew install postgresql
On Ubuntu:
$ apt-get install postgresql-9.4
Now that PostgreSQL is installed, create a database named pio:
$ createdb pio
If you get an error of the form "could not connect to server: No such file or directory", you must first start the server manually:
$ pg_ctl -D /usr/local/var/postgres -l /usr/local/var/postgres/server.log start
Finally, create the pio user:
$ psql -c "create user pio with password 'pio'"
Your configuration in pio-env.sh is now compatible with PostgreSQL.
HBase and Elasticsearch Setup
Elasticsearch Setup
Elasticsearch is the default metadata store for PredictionIO. Download and extract it.
$ wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.4.4.tar.gz
$ tar zxvfC elasticsearch-1.4.4.tar.gz PredictionIO-0.9.6/vendors
If you are not using the default setting at localhost, you may change the following in PredictionIO-0.9.6/conf/pio-env.sh to fit your setup.
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300
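Before starting any services, it can help to confirm which Elasticsearch settings are actually active in pio-env.sh. The following is a minimal sketch; show_es_settings is a hypothetical helper name, not part of PredictionIO. It prints only uncommented lines whose key starts with the prefix shown above.

```shell
# Sketch: list the active (uncommented) Elasticsearch settings in a pio-env.sh file.
# show_es_settings is a hypothetical helper, not a PredictionIO command.
show_es_settings() {
  env_file=$1   # path to pio-env.sh
  # The ^ anchor skips lines that are commented out with a leading '#'.
  grep '^PIO_STORAGE_SOURCES_ELASTICSEARCH_' "$env_file"
}
```

For example, running show_es_settings PredictionIO-0.9.6/conf/pio-env.sh should list the TYPE, HOSTS, and PORTS lines if they are set.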
HBase Setup
HBase is the default event data store for PredictionIO. Download and extract it.
$ wget http://archive.apache.org/dist/hbase/hbase-1.0.0/hbase-1.0.0-bin.tar.gz
$ tar zxvfC hbase-1.0.0-bin.tar.gz PredictionIO-0.9.6/vendors
You will need to add at least a minimal configuration to HBase to start it in standalone mode. Details can be found in the HBase documentation. A sample minimal configuration is shown here.
Edit PredictionIO-0.9.6/vendors/hbase-1.0.0/conf/hbase-site.xml.
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///home/abc/PredictionIO-0.9.6/vendors/hbase-1.0.0/data</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/abc/PredictionIO-0.9.6/vendors/hbase-1.0.0/zookeeper</value>
  </property>
</configuration>
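The sample configuration uses /home/abc as the installation location, which will differ on your machine. As a rough sketch, the same file can be generated from the actual HBase directory. The function name write_hbase_site is hypothetical and not part of HBase or PredictionIO, and the function overwrites any existing hbase-site.xml.

```shell
# Sketch: write a minimal standalone hbase-site.xml for a given HBase directory.
# write_hbase_site is a hypothetical helper; it overwrites conf/hbase-site.xml.
write_hbase_site() {
  hbase_dir=$1   # absolute path to the extracted HBase directory
  mkdir -p "$hbase_dir/conf"
  # An absolute path after file:// yields the file:///... form used in the sample.
  cat > "$hbase_dir/conf/hbase-site.xml" <<EOF
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://$hbase_dir/data</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>$hbase_dir/zookeeper</value>
  </property>
</configuration>
EOF
}
```

Calling write_hbase_site "$PWD/PredictionIO-0.9.6/vendors/hbase-1.0.0" from the directory that contains the extracted distribution would produce a file equivalent to the sample, with your own path substituted.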
Edit PredictionIO-0.9.6/vendors/hbase-1.0.0/conf/hbase-env.sh to set JAVA_HOME for the cluster. For example:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/jre
For Mac users, use this instead (change 1.8 to 1.7 if you have Java 7 installed):
export JAVA_HOME=`/usr/libexec/java_home -v 1.8`
In addition, you must set the JAVA_HOME environment variable for your own shell. For example, add the following line to /home/abc/.bashrc:
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
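A wrong JAVA_HOME is easy to miss until a service fails to start. The following sketch checks that the variable is set and that it contains an executable bin/java. The name check_java_home is hypothetical, and the check assumes the usual JDK or JRE layout with a bin directory under JAVA_HOME.

```shell
# Sketch: verify that JAVA_HOME is set and points at a Java installation.
# check_java_home is a hypothetical helper; it assumes java lives in $JAVA_HOME/bin.
check_java_home() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "no executable java found under $JAVA_HOME/bin" >&2
    return 1
  fi
  echo "JAVA_HOME looks usable: $JAVA_HOME"
}
```

Run check_java_home in a new shell after editing .bashrc to confirm the setting was picked up.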
Start PredictionIO and Dependent Services
Run PredictionIO-0.9.6/bin/pio-start-all and you should see something similar to the following:
$ PredictionIO-0.9.6/bin/pio-start-all
Starting Elasticsearch...
Starting HBase...
starting master, logging to /home/abc/PredictionIO-0.9.6/vendors/hbase-1.0.0/bin/../logs/hbase-abc-master-yourhost.local.out
Waiting 10 seconds for HBase to fully initialize...
Starting PredictionIO Event Server...
$
You may use jps to verify that you have everything started:
$ jps -l
15344 org.apache.hadoop.hbase.master.HMaster
15409 io.prediction.tools.console.Console
15256 org.elasticsearch.bootstrap.Elasticsearch
15469 sun.tools.jps.Jps
$
A working setup will have these processes running:
- io.prediction.tools.console.Console
- org.apache.hadoop.hbase.master.HMaster
- org.elasticsearch.bootstrap.Elasticsearch
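The check against this list can be scripted rather than done by eye. The sketch below reads jps -l output on standard input and prints each expected class name that is missing. The name missing_services is hypothetical, and the three class names are taken from the list above.

```shell
# Sketch: report which expected processes are absent from `jps -l` output.
# missing_services is a hypothetical helper; it reads the jps output from stdin.
missing_services() {
  jps_output=$(cat)
  for cls in \
    io.prediction.tools.console.Console \
    org.apache.hadoop.hbase.master.HMaster \
    org.elasticsearch.bootstrap.Elasticsearch
  do
    # -F treats the class name as a fixed string, so the dots are literal.
    printf '%s\n' "$jps_output" | grep -qF "$cls" || echo "$cls"
  done
}
```

Used as jps -l | missing_services, it prints nothing when all three processes are present.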
At any time, you can run PredictionIO-0.9.6/bin/pio status to check the status of the dependencies.
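The start script waits a fixed 10 seconds for HBase, which may not be enough on a slow machine, so a status check run right after startup can fail even though the services are still coming up. One way to handle this is a small retry wrapper. The sketch below is generic shell; retry is a hypothetical helper, and using it with pio status assumes that the command exits with a non-zero status when a dependency is unavailable.

```shell
# Sketch: retry a command a fixed number of times with a delay between attempts.
# retry is a hypothetical helper. Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0          # command succeeded, stop retrying
    fi
    # Sleep only if another attempt will follow.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1              # every attempt failed
}
```

For example, retry 5 10 PredictionIO-0.9.6/bin/pio status would try up to five times, ten seconds apart.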
Now you have installed everything you need!
You can proceed to Choosing an Engine Template, or continue the QuickStart guide of the Engine template if you have already chosen one.