Apache Spark and Apache Zeppelin Visualization Tool

Apache Spark and Apache Zeppelin

Step-1: Download Apache Zeppelin from the official download page.

        https://zeppelin.apache.org/download.html

Step-2: Extract Apache Zeppelin and move it to the /usr/lib directory.

                sudo tar xvf  zeppelin-*-bin-all.tgz

                sudo mv  zeppelin-*-bin-all  /usr/lib/
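The extract-and-move step can be sketched as a small shell function. This is only an illustration: the helper name install_zeppelin and its two arguments are assumptions made here, not part of Zeppelin, and it assumes the archive unpacks into a single top-level directory.

```shell
# Hypothetical helper (not shipped with Zeppelin): unpack an archive and
# move the extracted directory under a target directory.
install_zeppelin() {
  archive="$1"   # path to the downloaded archive, e.g. zeppelin-0.7.3-bin-all.tgz
  target="$2"    # destination directory, e.g. /usr/lib
  workdir="$(mktemp -d)" || return 1
  # Unpack into a scratch directory first so a failed extract leaves the target untouched.
  tar xf "$archive" -C "$workdir" || return 1
  # Assumption: the archive contains exactly one top-level directory.
  extracted="$(ls "$workdir" | head -n 1)"
  mkdir -p "$target" || return 1
  mv "$workdir/$extracted" "$target/" || return 1
  rmdir "$workdir"
  # Print the final installation path.
  echo "$target/$extracted"
}
```

Writing to /usr/lib normally needs root, so on a real machine the function would have to run in a root shell; for a trial run, point the second argument at a directory you own.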

Step-3: Install the Java Development Kit on Ubuntu and set the JAVA_HOME variable.

        echo $JAVA_HOME

                Create zeppelin-env.sh and zeppelin-site.xml in the conf directory from their template files.

                Open zeppelin-env.sh and set:

                export JAVA_HOME=/path/

                export SPARK_HOME=/path/
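As a rough sketch, Step-3 could be scripted like this. The function name configure_zeppelin_env and its arguments are hypothetical; the sketch assumes the conf directory contains zeppelin-env.sh.template and zeppelin-site.xml.template, as in the binary distribution.

```shell
# Hypothetical helper: create the two config files from their templates
# and append the JAVA_HOME / SPARK_HOME exports to zeppelin-env.sh.
configure_zeppelin_env() {
  conf_dir="$1/conf"   # $1 = Zeppelin installation directory
  java_home="$2"       # $2 = JDK location
  spark_home="$3"      # $3 = Spark location
  cp "$conf_dir/zeppelin-env.sh.template" "$conf_dir/zeppelin-env.sh" || return 1
  cp "$conf_dir/zeppelin-site.xml.template" "$conf_dir/zeppelin-site.xml" || return 1
  # Append rather than edit in place, so the template comments stay intact.
  {
    echo "export JAVA_HOME=$java_home"
    echo "export SPARK_HOME=$spark_home"
  } >> "$conf_dir/zeppelin-env.sh"
}
```

Note that each call re-copies the templates, so any manual edits to the two generated files would be overwritten.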

Step-4: Update the port number for the Apache Zeppelin server.

                Open zeppelin-site.xml.

                Set the zeppelin.server.port property to 8082.

                The Zeppelin server will then start at localhost:8082.
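If you would rather not edit the XML by hand, the port change might be scripted along these lines. The function name set_zeppelin_port is an assumption for illustration, and the sketch assumes the <value> element sits on the line directly after the <name>zeppelin.server.port</name> line, which is how the template lays it out.

```shell
# Hypothetical helper: rewrite the value of zeppelin.server.port in zeppelin-site.xml.
set_zeppelin_port() {
  site_xml="$1"   # path to conf/zeppelin-site.xml
  port="$2"       # new port number, e.g. 8082
  tmp_file="$(mktemp)" || return 1
  # Find the property name line, step to the next line (n),
  # and replace the numeric value there. Other properties are left alone.
  sed -e '/<name>zeppelin\.server\.port<\/name>/{' \
      -e 'n' \
      -e "s|<value>[0-9]*</value>|<value>${port}</value>|" \
      -e '}' "$site_xml" > "$tmp_file" || return 1
  # Write back through cat so the original file keeps its permissions.
  cat "$tmp_file" > "$site_xml" && rm -f "$tmp_file"
}
```

A call such as set_zeppelin_port conf/zeppelin-site.xml 8082 would then switch the server to the port used in this post.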

Step-5: Start the Zeppelin server.

[root@quickstart zeppelin-0.7.3-bin-all]# bin/zeppelin-daemon.sh  start

Log dir doesn't exist, create /usr/lib/zeppelin-0.7.3-bin-all/logs

Pid dir doesn't exist, create /usr/lib/zeppelin-0.7.3-bin-all/run

Zeppelin start                                             [  OK  ]
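The daemon reports OK as soon as the process is launched, so the web UI may need a few more seconds before it answers. A minimal sketch for waiting on it, assuming curl is installed; the function name wait_for_zeppelin and its retry parameters are hypothetical:

```shell
# Hypothetical helper: poll a URL until it responds or the attempts run out.
wait_for_zeppelin() {
  url="$1"              # e.g. http://localhost:8082
  attempts="${2:-30}"   # how many times to try
  delay="${3:-2}"       # seconds to sleep between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP errors, -sS: quiet but still show errors, -o: discard the body
    if curl -fsS -o /dev/null "$url"; then
      echo "Zeppelin is reachable at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Zeppelin did not respond at $url" >&2
  return 1
}
```

For the setup in this post the call would be wait_for_zeppelin http://localhost:8082.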

Step-6: Set Spark as the default interpreter.

Step-7: Run Spark SQL code and explore the results with Zeppelin's built-in charts.


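 // Load the comma-separated employee file (columns: eno, ename, location, sal)
 val file=sc.textFile("file:///home/cloudera/emp.txt")

 case class Employee(eno:Int,ename:String,location:String,sal:Int)
 // Parse each line into an Employee and convert the RDD to a DataFrame
 val sal=file.map(_.split(",")).map(e=>Employee(e(0).trim.toInt,e(1).trim,e(2).trim,e(3).trim.toInt)).toDF()
 sal.printSchema()
 // On Spark 2.x, createOrReplaceTempView("emp") replaces the deprecated registerTempTable
 sal.registerTempTable("emp")
 // z.show renders the result as a table that Zeppelin's chart toolbar can plot
 z.show(sqlContext.sql("select location,sum(sal) as total_sal from emp group by location"))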
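To try the query without real data, a small input file in the eno,ename,location,sal layout can be generated first. The rows below are made-up sample values, and both function names are hypothetical; the awk function simply mirrors the "sum(sal) ... group by location" query so you can cross-check what Zeppelin displays.

```shell
# Hypothetical helper: write four made-up employee rows (eno,ename,location,sal).
write_sample_emp() {
  cat > "$1" <<'EOF'
101,Asha,Pune,30000
102,Ravi,Delhi,45000
103,Meena,Pune,52000
104,John,Delhi,38000
EOF
}

# Hypothetical helper: total salary per location, same aggregation as the SQL query.
location_totals() {
  # Field 3 is the location, field 4 the salary; sort makes the output order stable.
  awk -F',' '{ total[$3] += $4 } END { for (loc in total) print loc "," total[loc] }' "$1" | sort
}
```

With the sample rows, location_totals prints Delhi,83000 and Pune,82000. To use the file with the notebook code above, write it to /home/cloudera/emp.txt or adjust the path passed to sc.textFile.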

Thank You!
