About Me
- Hands-on experience with Hadoop-ecosystem technologies including HDFS, Hive, Sqoop, Spark, Spark SQL, Spark Streaming, Kafka, Apache NiFi, and Oozie.
- Involved in capability building for developing real-time streaming applications using Apache Spark.
- Performance tuning of Spark jobs.
- Hands-on experience with data modelling in Hive.
- Experience with CDH (Cloudera) and Hortonworks environments.
- Experience building Apache NiFi jobs.
- Sound knowledge of Hadoop shell commands.
- Experience with Amazon services including S3, EMR, and EC2.
- Strong knowledge of the Agile (Scrum) methodology.
- Domain experience in pharmaceuticals, retail, and banking.
- Involved in release and go-live activities.
Skills
Software Engineering
Web Development
Data & Analytics
Programming Language
Database
Operating System
Others
Development Tools
Networking & Security
Positions
Portfolio Projects
Company
EDH
Description
- Developed automation for Sqoop jobs covering multiple tables, each with a configured number of mappers.
- Developed Spark code to move data from the staging layer to the schema, raw, and archive layers, adding audit columns.
- Developed Spark code to land data from a Kafka topic into the staging layer (HDFS).
- Executed all scripts via batch scripting.
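The Sqoop automation above can be sketched as a small command generator. This is a minimal illustration, not the project's actual code: the table list, mapper counts, JDBC URL, and target directory are all hypothetical placeholders.

```python
# Sketch: generate Sqoop import commands for multiple tables, each with a
# configured number of mappers. All names and paths below are hypothetical.

def build_sqoop_command(table, mappers, jdbc_url, target_dir):
    """Return a Sqoop import command line for one table."""
    return (
        f"sqoop import --connect {jdbc_url} "
        f"--table {table} "
        f"--num-mappers {mappers} "
        f"--target-dir {target_dir}/{table}"
    )

# Example config: table name -> mapper count (hypothetical values)
tables = {"orders": 4, "customers": 2}
jdbc_url = "jdbc:mysql://dbhost:3306/sales"

commands = [
    build_sqoop_command(t, m, jdbc_url, "/data/staging")
    for t, m in tables.items()
]
for cmd in commands:
    print(cmd)
```

In practice each generated command would be run by the batch script mentioned above; driving all tables from one config keeps the per-table mapper tuning in a single place.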
Skills
Apache Spark, Apache Kafka, Scala
Company
Security Monitoring
Description
Description: The project delivered a pilot model for a centralised data storage and data management platform, used to store and process data from log files produced by various IT and network applications and fed into ArcSight.
Role & Responsibilities:
- Built the NiFi job to move data from a Kafka topic to HDFS (raw zone).
- Built the streaming flow for the serving layer (stream data).
- Developed DDL to store the data in Hive tables.
- Implemented performance tuning.
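The DDL work above can be illustrated with a small generator for an external Hive table over the raw zone. This is a hedged sketch only: the table name, columns, partition key, and HDFS location are hypothetical, not the project's actual schema.

```python
# Sketch: build CREATE EXTERNAL TABLE DDL for a Hive table over the raw zone
# written by NiFi. Table name, columns, and location are hypothetical.

def raw_zone_ddl(table, columns, location):
    """Build a CREATE EXTERNAL TABLE statement partitioned by load date."""
    cols = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"  {cols}\n"
        f")\n"
        f"PARTITIONED BY (load_date STRING)\n"
        f"STORED AS ORC\n"
        f"LOCATION '{location}'"
    )

ddl = raw_zone_ddl(
    "raw.arcsight_events",
    [("event_time", "TIMESTAMP"), ("source_ip", "STRING"), ("message", "STRING")],
    "/data/raw/arcsight_events",
)
print(ddl)
```

An external table over the NiFi landing path lets Hive query the raw zone in place, so dropping the table never deletes the underlying HDFS data.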
Skills
Hadoop, Apache Kafka, Hive
Tools
Hadoop
Company
ADtech ingestion
Description
Description: Adtech ingestion loads data from multiple source locations into a data lake stored on S3. Data manipulation and cleansing are performed using Spark and Python, and the data is validated in the data warehouse using Hive and Redshift.
Role & Responsibilities:
- Managed data coming from different databases into S3 using Spark with Python.
- Wrote CLI commands for HDFS and S3.
- Created Hive tables and loaded them with data, which runs internally as MapReduce jobs.
- Implemented complex Hive and Redshift queries to validate data.
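One common shape for the validation step above is a per-partition row-count comparison between the Hive source and the Redshift target. The sketch below assumes the counts have already been fetched (in practice via COUNT(*) queries against each store); the partition keys and numbers are hypothetical.

```python
# Sketch: validate a Redshift load against its Hive source by comparing
# row counts per partition. Input counts are hypothetical; in practice
# they would come from COUNT(*) queries against Hive and Redshift.

def validate_counts(hive_counts, redshift_counts):
    """Return the partitions whose row counts differ between the two stores."""
    mismatches = {}
    for partition, hive_n in hive_counts.items():
        redshift_n = redshift_counts.get(partition, 0)
        if hive_n != redshift_n:
            mismatches[partition] = (hive_n, redshift_n)
    return mismatches

hive = {"2024-01-01": 1000, "2024-01-02": 980}
redshift = {"2024-01-01": 1000, "2024-01-02": 975}
bad_partitions = validate_counts(hive, redshift)
print(bad_partitions)  # partitions that would need reloading
```

Comparing counts partition by partition, rather than one grand total, pinpoints exactly which loads to rerun.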
Tools
Python