Spark Scala

Closed Posted 2 years ago Paid on delivery

Expertise in designing and deploying Hadoop clusters and analytical tools including Pig, Hive, HBase, Sqoop, Kafka, and Spark on the Cloudera distribution.

Working on a live 20-node Hadoop cluster running CDH 4.4.

Working with highly unstructured and semi-structured data, 40 TB in size (120 TB with a replication factor of 3).

Managing external tables in Hive for optimized performance.

Very good understanding of partitions and bucketing in Hive.

Developed Spark scripts in Scala as per requirements, using the Spark 1.5 framework.

Using Spark APIs over Cloudera Hadoop YARN to perform analytics on Hive data stored in HDFS.

Developed Scala scripts and UDFs using both DataFrames/SQL and RDDs in Spark for data aggregation, queries, and writing data back to HDFS.

Exploring Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark DataFrames, pair RDDs, double RDDs, and YARN.

Developed Spark code using Spark SQL and Spark Streaming for faster testing and processing of data.

Experience in loading data from various sources into HDFS and facilitating report building on top of it as per business requirements.

Performed transformation, cleaning, standardization, and filtering of data using Spark with Scala/Python, and loaded the final required data into HDFS.

Loading data into immutable Spark RDDs and performing in-memory computation to generate faster, better responses.

Analyzing how data currently processed by Informatica can be processed effectively using Spark and its APIs.

Spark Scala Hadoop Hive SQL

Project ID: #30636255

About the project

6 proposals Remote project Active 2 years ago

6 freelancers are bidding on average $13/hour for this job

ahmadndiayee

Hi, I am an experienced Data Engineer with a solid background in Spark. I have worked on many Big Data projects with Spark, Scala, Python, Cassandra, Snowflake, AWS, ... Let's have a call for more details about the pr…

$8 USD / hour
(2 Reviews)
2.9
roshanr1993

Hi, I have 6 years of experience, and my entire work experience is in these technologies. I think we can discuss this further.

$15 USD / hour
(1 Review)
0.8
ganeshtheking

I have 10 years of IT experience in total, including 6+ years in Java and 3 years in big data. I have extensive experience in data engineering projects, Hadoop, HBase, Spark, Scala, Hive, Impala, Pig, and big data proje…

$5 USD / hour
(0 Reviews)
0.0
ramireddyvijay55

3 years of experience in the big data domain, including HDFS, Hive, Spark, Scala, PySpark, and Sqoop. Working at Capgemini.

$5 USD / hour
(0 Reviews)
0.0
anuvindmichael

I am a meticulous and goal-driven Hadoop and Spark developer with 2+ years of experience as a Hadoop developer in a big data solutions team. Adept at leveraging Hadoop and Spark for processing large insights. Proven trac…

$5 USD / hour
(0 Reviews)
0.0