Some trials, some errors, some success…
I downloaded the latest release (0.5.6 at the time of writing) from here and unzipped it into an incubator-zeppelin folder, then built it with this command:
mvn clean package -DskipTests -Pspark-1.5 -Dspark.version=1.6.1 -Dhadoop.version=2.6.0-cdh5.5.1 -Phadoop-2.6 -Pyarn
caveat coder:
- -Pspark-1.5 selects a profile defined in the pom; it is not my actual Spark version (that is set by -Dspark.version)
- I deliberately used a Spark build compiled against Scala 2.10: running Zeppelin against a Spark built on Scala 2.11 gave me ClassNotFoundExceptions.
- Zeppelin seems to require Java 7 at present
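Since the Java version mattered for me, a quick check before building can save a wasted compile. This is only a rough sketch: java_major is a helper name I made up, and it assumes the first line of `java -version` looks like `java version "1.7.0_79"`.

```shell
# Hypothetical helper: pull the leading "major.minor" out of a
# `java -version` banner line, e.g. 1.7 from: java version "1.7.0_79"
java_major() {
  printf '%s\n' "$1" | sed -n 's/.*version "\([0-9]*\.[0-9]*\).*/\1/p'
}
```

Something like `java_major "$("$JAVA_HOME/bin/java" -version 2>&1 | head -n 1)"` should then print 1.7 for the JDK I used.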
Configured conf/zeppelin-env.sh:
export JAVA_HOME=/Users/lhurley/software/jdk1.7.0_79
export HADOOP_CONF_DIR=/Users/lhurley/software/hadoop/etc/hadoop
export ZEPPELIN_JAVA_OPTS="-Dhdp.version=2.6.0-cdh5.5.1"
export SPARK_HOME=/Users/lhurley/software/spark
export PATH=$PATH:$SPARK_HOME/bin
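A mistyped path in this file tends to show up only later as a confusing startup failure, so it may be worth checking that the directories exist. The sketch below is my own addition, not part of Zeppelin: check_dirs is a hypothetical helper that takes variable names and reports any whose value is not an existing directory.

```shell
# Hypothetical helper: report every named variable whose value is not
# an existing directory. Returns 1 if any are missing, 0 otherwise.
# Usage: check_dirs VAR_NAME...
check_dirs() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ ! -d "$value" ]; then
      echo "$name is not a directory: '$value'"
      missing=1
    fi
  done
  return "$missing"
}
```

After sourcing the env file, `check_dirs JAVA_HOME HADOOP_CONF_DIR SPARK_HOME` should print nothing if all three paths are in place.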
Ran the zeppelin daemon:
./bin/zeppelin-daemon.sh start
caveat coder:
- trying to run the daemon in a tmux shell failed:
"Zeppelin process died [FAILED]"
and you see this in the log file:
"nohup: can't detach from console: No such file or directory"
so: don’t run it in tmux…
- running over VPN against a remote cluster, I got this:
java.net.UnknownHostException: lhurley-mac: nodename nor servname provided, or not known.
I had to add my hostname to /etc/hosts:
127.0.0.1 localhost lhurley-mac
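To see whether you are likely to hit the same UnknownHostException, you can check up front whether your hostname is listed in the hosts file. This is a minimal sketch under my own assumptions: host_in_hosts_file is a made-up helper name, and it only looks at the file itself, not at DNS or any other resolver.

```shell
# Hypothetical helper: succeed if the hostname appears as a whole field
# (after the address column) on a non-comment line of the hosts file.
# Usage: host_in_hosts_file NAME [FILE]   (FILE defaults to /etc/hosts)
host_in_hosts_file() {
  name=$1
  file=${2:-/etc/hosts}
  awk -v h="$name" '
    /^[[:space:]]*#/ { next }    # skip full-line comments
    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break     # ignore trailing comments
        if ($i == h) found = 1
      }
    }
    END { exit found ? 0 : 1 }
  ' "$file"
}
```

For example: `host_in_hosts_file "$(hostname)" || echo "add $(hostname) to /etc/hosts"`.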
Here is my sample notebook based on a music recommender from Advanced Analytics with Spark.