
Apache Spark and Apache Zeppelin Visualization Tool


Step-1: Download Apache Zeppelin.
        Download the binary package (zeppelin-*-bin-all.tgz) from:
        https://zeppelin.apache.org/download.html
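For example, the 0.7.3 release used later in this guide can be fetched with wget (the archive URL is an assumption; pick whichever release the download page currently lists):

        wget https://archive.apache.org/dist/zeppelin/zeppelin-0.7.3/zeppelin-0.7.3-bin-all.tgz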

Step-2: Extract Apache Zeppelin and move it to the /usr/lib directory.
        sudo tar xvf zeppelin-*-bin-all.tgz
        sudo mv zeppelin-*-bin-all /usr/lib/

Step-3: Install the Java Development Kit in Ubuntu and set the JAVA_HOME variable.
        echo $JAVA_HOME
        Create zeppelin-env.sh and zeppelin-site.xml from the template files in the conf directory.
        Open zeppelin-env.sh and set:
                export JAVA_HOME=/path/
                export SPARK_HOME=/path/
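A minimal sketch of this step, assuming OpenJDK 8 from the Ubuntu repositories and the /usr/lib/zeppelin-0.7.3-bin-all install path used below (adjust both paths to your machine):

        sudo apt-get install openjdk-8-jdk
        cd /usr/lib/zeppelin-0.7.3-bin-all/conf
        cp zeppelin-env.sh.template zeppelin-env.sh
        cp zeppelin-site.xml.template zeppelin-site.xml
        # then, in zeppelin-env.sh:
        export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
        export SPARK_HOME=/usr/lib/spark    # hypothetical Spark install path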
  
Step-4: Update the port number for the Apache Zeppelin server.
        Open zeppelin-site.xml.
        Check the port number for the Zeppelin server: 8082 here.
        The Zeppelin server will then run at the localhost:8082 port.
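The port is controlled by the zeppelin.server.port property in zeppelin-site.xml (the default is 8080); a minimal sketch of the entry, changed to the 8082 used here:

        <property>
          <name>zeppelin.server.port</name>
          <value>8082</value>
          <description>Server port.</description>
        </property>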
 

Step-5: Start the Zeppelin server.

[root@quickstart zeppelin-0.7.3-bin-all]# bin/zeppelin-daemon.sh  start
Log dir doesn't exist, create /usr/lib/zeppelin-0.7.3-bin-all/logs
Pid dir doesn't exist, create /usr/lib/zeppelin-0.7.3-bin-all/run
Zeppelin start                                             [  OK  ]
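
Once started, open http://localhost:8082 in a browser. The same daemon script also accepts stop, restart, and status, which are useful for checking on the server:

        bin/zeppelin-daemon.sh status
        bin/zeppelin-daemon.sh stop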
 

Step-6: Set spark as the default interpreter for the notebook.

Step-7: Run Spark SQL code and enjoy all the chart types.

import sqlContext.implicits._   // needed for toDF(); Zeppelin's spark interpreter imports this by default

// read the raw file and parse each comma-separated line into an Employee
val file = sc.textFile("file:///home/cloudera/emp.txt")
case class Employee(eno: Int, ename: String, location: String, sal: Int)
val sal = file.map(_.split(","))
              .map(e => Employee(e(0).trim.toInt, e(1), e(2), e(3).trim.toInt))
              .toDF()
sal.printSchema()

// register the DataFrame as a table and query it
sal.registerTempTable("emp")
sqlContext.sql("select location, sum(sal) from emp group by location").show()
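
To get Zeppelin's built-in charts (bar, pie, area, line, scatter), run the same query in a separate %sql paragraph; the chart buttons then appear above the result table:

%sql
select location, sum(sal) from emp group by location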
Thank You!
