Title: Build Big Data Execution Environment (Spark and Zeppelin).
I need a physical setup of a big data platform for my data analysis work.
- Spark (Spark SQL, MLlib, and SparkR) and Zeppelin.
I'm looking for expert(s) to configure Spark and Zeppelin on my Linux server.
I'm not a Java programmer and have no experience in the big data area.
So you need to provide the following:
- Install and configure a big data platform.
- Spark in standalone mode.
- Hadoop, Spark (Spark SQL, ML, and SparkR), and Zeppelin.
- Use the latest versions of the related software.
- You also need to install any related programs or packages on my server.
- I need to use the Zeppelin web UI to explore data on Spark.
- The goal of this project is to perform data analysis on Spark through Zeppelin.
You need to give me the following:
- For sample programs, use Scala.
- All sample programs need to run in the Zeppelin UI or the Scala CLI.
- Pre-load a few sample data sets (for example, [url removed, login to view]).
- Sample programs for loading data into Spark, basically from a text file.
- Sample programs for Spark queries and ML.
- A very basic Scala program for the "collaborative filtering" method.
- A few Spark SQL queries.
- A sample R program that uses SparkR.
- Simple documentation.
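To make the requested Scala deliverables concrete, here is a minimal sketch of what such a sample program might look like. It loads ratings from a text file, runs one Spark SQL query, and fits a basic collaborative filtering model with ALS from Spark ML. The file path /data/ratings.csv, the "userId,itemId,rating" line format, the Rating case class, and the object name are all assumptions made for illustration, and the master is set to local[*] only so the sketch runs on one machine. On a standalone cluster it would point at the cluster's master URL instead.

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.SparkSession

// Hypothetical record type for one line of the (assumed) ratings file.
case class Rating(userId: Int, itemId: Int, rating: Float)

object CollaborativeFilteringSketch {

  // Parse a line of the assumed form "userId,itemId,rating".
  // Malformed lines yield None so they can be skipped during loading.
  def parseRating(line: String): Option[Rating] =
    line.split(",").map(_.trim) match {
      case Array(u, i, r) =>
        scala.util.Try(Rating(u.toInt, i.toInt, r.toFloat)).toOption
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CollaborativeFilteringSketch")
      .master("local[*]") // assumption: single machine; use the cluster URL in standalone mode
      .getOrCreate()
    import spark.implicits._

    // Data loading from a text file (hypothetical path).
    val ratings = spark.read.textFile("/data/ratings.csv")
      .flatMap(line => parseRating(line).toList)
      .toDF()

    // A simple Spark SQL query over the loaded data.
    ratings.createOrReplaceTempView("ratings")
    spark.sql(
      """SELECT itemId, COUNT(*) AS n, AVG(rating) AS avg_rating
        |FROM ratings
        |GROUP BY itemId
        |ORDER BY n DESC
        |LIMIT 10""".stripMargin).show()

    // Collaborative filtering with ALS; the hyperparameters are arbitrary examples.
    val Array(training, test) = ratings.randomSplit(Array(0.8, 0.2), 42L)
    val als = new ALS()
      .setMaxIter(5)
      .setRegParam(0.01)
      .setUserCol("userId")
      .setItemCol("itemId")
      .setRatingCol("rating")
    val model = als.fit(training)
    model.setColdStartStrategy("drop") // drop users/items unseen in training when scoring

    // Evaluate on the held-out split with root-mean-square error.
    val predictions = model.transform(test)
    val rmse = new RegressionEvaluator()
      .setMetricName("rmse")
      .setLabelCol("rating")
      .setPredictionCol("prediction")
      .evaluate(predictions)
    println(s"Test RMSE = $rmse")

    spark.stop()
  }
}
```

In a Zeppelin %spark paragraph the session is normally already available as spark, so the builder call and spark.stop() would likely be dropped and the body of main pasted in directly.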
I will provide the Linux server information once contracted.
This is an initial project, and if it is completed well, I'd like to extend it into a real business application.
I am a data scientist with experience working with big data technologies such as Spark and Hadoop. I understand how to install Spark and the other engines that run behind Zeppelin. I would like to do this project.
I am a Senior Java/Scala Developer with 15 years of experience in design and development and strong problem-solving skills. Code samples: https://github.com/tumakha CV: http://bit.ly/YuriyTumakha-CV
Can you also provide more details about your environment? Is it a Fedora or Ubuntu Linux box, and how much RAM does it have? Also, do you need a single-node or a multi-node cluster, and which Spark and Hadoop versions are required?