About Me
3 years of overall IT experience in Big Data, Hadoop, and Spark with Scala. Exclusive experience in Hadoop and its components such as HDFS, Hive, Sqoop, and Spark. Executed jobs in Spark local mode, pseudo-distributed mode, and Hadoop cluster mode for production...
Skills
Portfolio Projects
Description
· Analyzed source-system data before loading it into HDFS.
· Wrote FTP scripts to bring data from the NFS mount point into the Hadoop local environment.
· Involved in data loading and in validating data fields as part of the load.
· Developed MapReduce jobs to process different customers' personal data across various financial services.
· Developed custom InputFormat classes to work with XML and PDF feeds coming from some source systems.
· Worked on Hive script development, including partitioning (see the sketch after this list).
· Configured MySQL as an external metastore for Hive.
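A minimal Spark-with-Scala sketch of the kind of partitioned Hive load described above. The database, table, columns, and HDFS path are hypothetical, and enableHiveSupport assumes a hive-site.xml that points at the external MySQL-backed metastore mentioned in the last bullet.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object PartitionedHiveLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PartitionedHiveLoad")
      .enableHiveSupport()  // picks up the external (MySQL-backed) Hive metastore from hive-site.xml
      .getOrCreate()

    // Allow dynamic partition inserts into the Hive table.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    // Hive table partitioned by transaction date (illustrative database, table, and schema).
    spark.sql("CREATE DATABASE IF NOT EXISTS finance")
    spark.sql(
      """CREATE TABLE IF NOT EXISTS finance.customer_txns (
        |  customer_id STRING,
        |  amount      DOUBLE
        |) PARTITIONED BY (txn_date STRING)
        |STORED AS PARQUET""".stripMargin)

    // Read raw files landed in HDFS and append them into the partitioned table.
    val raw = spark.read.option("header", "true").csv("hdfs:///landing/customer_txns/")
    raw.select(
        col("customer_id"),
        col("amount").cast("double"),
        col("txn_date"))  // partition column must come last for insertInto
      .write.mode("append")
      .insertInto("finance.customer_txns")

    spark.stop()
  }
}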
Description
Roles and Responsibilities:
· Implemented newer concepts such as Apache Spark and Scala programming
· Managed data coming from 200+ different sources
· Loaded unstructured data into the Hadoop Distributed File System (HDFS)
· Wrote validation and data quality scripts
· Implemented a Cloudera cluster with high-availability and standby solutions
· Designed and supported data ingestion, data migration, and data processing for BI and data analytics
· Developed data pipelines with Sqoop and Pig to extract data from weblogs and store it in HDFS
· Developed Pig scripts for change data capture and delta-record processing between newly arrived data and data already existing in HDFS
· Worked on Hive and Spark SQL integration scripts for performance enhancement
· Worked on DataFrame development as part of Spark SQL (see the sketch after this list)
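A hedged Scala sketch of the delta-record idea above, expressed with Spark SQL DataFrames rather than Pig; the table name, incoming path, and key columns are assumptions for illustration only.

import org.apache.spark.sql.SparkSession

object DeltaRecords {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("DeltaRecords")
      .enableHiveSupport()
      .getOrCreate()

    // Data already loaded into Hive, and a newly arrived batch sitting in HDFS (illustrative names).
    val existing = spark.table("finance.customer_txns")
    val arrived  = spark.read.parquet("hdfs:///incoming/customer_txns/")

    // Delta records: rows in the new batch whose key is not present in the existing data.
    val delta = arrived.join(existing, Seq("customer_id", "txn_date"), "left_anti")

    // Spark SQL over the DataFrame, e.g. a per-partition count of new rows.
    delta.createOrReplaceTempView("delta_txns")
    spark.sql("SELECT txn_date, COUNT(*) AS new_rows FROM delta_txns GROUP BY txn_date").show()

    spark.stop()
  }
}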