ROBIN S.

Senior Data Engineer

Jammu, India

Experience: 2 Years

14701.9 USD / Year

  • Notice Period: Days


About Me

A disciplined, self-motivated, and results-oriented graduate with data handling, processing, and analytics skills. Seeking a position where I can use my natural skills of analyzing and understanding a problem, together with hard work, to bring an effective solution to eve...


Portfolio Projects

Description

Platform: AWS
Tools: Spark, Scala, Python, Hadoop, Hive, MySQL
Description: This project is based on network data streaming, involving data ingestion and data validation. It is a telecommunications project that ingests various network data of users on 4G and 5G networks. Data is streamed from the New Relic source to the destination. My role in the project was to implement the logic for counting the different data, segregated into 4G and 5G, routed from the source to the destination. Worked on Spark-Scala to parse JSONs retrieved from the New Relic app and validate the data. Assisted in implementing modules for the total unique count of the different streams of messages arriving at the source as well as at the destination. Involved in manual testing of network data. Also assisted in onboarding apps to the Hadoop cluster for data ingestion. Monitored the apps daily to maintain good health and manage failures. Also provided support to the production support team regarding any failures or issues.
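The counting logic described above can be sketched in plain Python (the project used Spark-Scala; the record field names `network` and `message_id` are assumptions for illustration):

```python
import json
from collections import defaultdict

def count_unique_by_network(raw_records):
    """Count unique message IDs per network type (4G/5G) in a stream of JSON records."""
    seen = defaultdict(set)
    for raw in raw_records:
        rec = json.loads(raw)
        network = rec.get("network")      # hypothetical field name
        msg_id = rec.get("message_id")    # hypothetical field name
        if network in ("4G", "5G") and msg_id is not None:
            seen[network].add(msg_id)     # sets deduplicate repeated IDs
    return {net: len(ids) for net, ids in seen.items()}

stream = [
    '{"network": "4G", "message_id": "a1"}',
    '{"network": "5G", "message_id": "b1"}',
    '{"network": "4G", "message_id": "a1"}',  # duplicate, counted once
    '{"network": "4G", "message_id": "a2"}',
]
print(count_unique_by_network(stream))  # → {'4G': 2, '5G': 1}
```

In Spark the same idea would be a `groupBy` on the network column followed by a distinct count, computed per micro-batch at both source and destination.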


Description

Platform: AWS
Tools: Spark, Scala, Python, Hadoop, Hive
Description: This project is based on data streaming, involving data ingestion from the source. I worked on this project for 3 months. My role was to validate incoming data at the source and report any issues in it. I was also involved in onboarding different apps to the platform and in creating and managing data in directories. Efficiently deployed and integrated application modules on the cloud and updated integration scripts. I also monitored the health status of jobs and prepared reports based on those observations.
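A minimal sketch of the source-side validation step, assuming each incoming record is a dict and a fixed set of required fields (the field names are illustrative, not from the project):

```python
def validate_record(rec, required_fields=("timestamp", "source", "payload")):
    """Return a list of issues found in one incoming record (empty list = valid)."""
    issues = []
    for field in required_fields:
        if field not in rec:
            issues.append(f"missing field: {field}")
        elif rec[field] in (None, ""):
            issues.append(f"empty field: {field}")
    return issues

# A clean record passes; a partial record yields a report of what to flag.
print(validate_record({"timestamp": 1, "source": "app", "payload": "x"}))  # → []
print(validate_record({"timestamp": 1}))  # → ['missing field: source', 'missing field: payload']
```

Collecting the non-empty issue lists per batch gives the report of problems to raise back to the source team.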


Description

This project is based on the creation of an automated data pipeline from the client source to the data warehouse, from where the BI team fetches data for BI dashboard creation. I worked as a Data Engineer on this project; my role was to design a data pipeline that extracts data from multiple sources, processes and transforms it, and loads it into the data warehouse for the BI team. Used PySpark to process data according to the client requirements. Apart from designing the data pipeline, I was also involved in onsite requirements gathering with the client from scratch, understanding their needs and business use cases, and preparing and explaining the data pipeline architecture to the client. Held daily standups with the client and the internal team regarding progress and implementation. The client's business model is based on retail and includes modules such as Sales, Purchase, Finance, Inventory, and Production, with every user having their own set of requirements. Involved in analyzing data using SQL Server and deriving mappings between multiple tables. Implemented the mappings and logic between different tables in PySpark based on the client requirements, and wrote the logic for implementing stored procedures in PySpark. Validated data with the client for all modules. Also involved in implementing the logic for some use cases in the BI tool, and in testing the PySpark and automation scripts. Implemented Airflow scripts for automating all the scripts on a daily, weekly, and monthly basis. Implemented server SFTP scripts for daily, weekly, monthly, and yearly extraction of data from the server for the Excel data source.
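The extract-transform-load flow described above can be sketched in plain Python (the project used PySpark against SQL Server and SFTP/Excel feeds; here plain lists stand in for sources and warehouse, and the "keep Sales rows, compute a net amount" rule is an assumed example of a client mapping):

```python
def extract(sources):
    """Pull rows from each source (lists standing in for SQL/SFTP/Excel feeds)."""
    rows = []
    for src in sources:
        rows.extend(src)
    return rows

def transform(rows):
    """Apply a client mapping rule: keep Sales rows and compute a net amount."""
    return [
        {"module": r["module"], "net": r["amount"] - r.get("discount", 0)}
        for r in rows
        if r["module"] == "Sales"
    ]

def load(rows, warehouse):
    """Append transformed rows to the warehouse table; return the row count."""
    warehouse.extend(rows)
    return len(rows)

warehouse = []
sources = [
    [{"module": "Sales", "amount": 100, "discount": 10}],
    [{"module": "Finance", "amount": 50}],
]
load(transform(extract(sources)), warehouse)
print(warehouse)  # → [{'module': 'Sales', 'net': 90}]
```

In the real pipeline each stage would be a task in an Airflow DAG, so the same chain runs unattended on the daily/weekly/monthly schedules mentioned above.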


Description

This project is based on network data streaming, involving data ingestion and data validation. It is a telecommunications project that ingests various network data of users on 4G and 5G networks. Data is streamed from the New Relic source to the destination. My role in the project was to implement the logic for counting the different data, segregated into 4G and 5G, routed from the source to the destination. Worked on Spark-Scala to parse JSONs retrieved from the New Relic app and validate the data. Assisted in implementing modules for the total unique count of the different streams of messages arriving at the source as well as at the destination. Involved in manual testing of network data. Also assisted in onboarding apps to the Hadoop cluster for data ingestion. Monitored the apps daily to maintain good health and manage failures. Wrote alert scripts in Python to send email alerts to the production team whenever an error occurs in the pipeline, and wrote validation scripts in Python for streaming data.
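The Python email alert described above might be structured like this (a sketch: the sender address and subject format are placeholders, and the actual send step via `smtplib` is noted but not executed here):

```python
from email.message import EmailMessage

def build_alert_email(app_name, error, recipients):
    """Build the alert email sent to the production team when a pipeline step fails."""
    msg = EmailMessage()
    msg["Subject"] = f"[PIPELINE ALERT] {app_name} failed"
    msg["From"] = "noreply@example.com"   # placeholder sender address
    msg["To"] = ", ".join(recipients)
    msg.set_content(f"Pipeline '{app_name}' reported an error:\n\n{error}")
    return msg

msg = build_alert_email("network-stream", "ingest timeout", ["ops@example.com"])
print(msg["Subject"])  # → [PIPELINE ALERT] network-stream failed
# In production the message would be handed to smtplib.SMTP(host).send_message(msg).
```

Separating message construction from sending keeps the alert content easy to test without a mail server.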
