About Me
Software professional with over 12 years of IT experience across diverse fields, including Hadoop, Spark, and mainframe development. 2+ years of extensive experience in Big Data and Big Data analytics, Hadoop, HDFS, data ingestion pipeline design and advanced data...
Skills
Portfolio Projects
Description
Worked as a Hadoop developer on a project migrating data from the mainframe to a Big Data platform. Created and maintained a PySpark program for both batch and real-time updates and retrieval of data from the online portal to HBase and Hive. Wrote Hive queries to retrieve other applications' data from the central repository and incorporate it into the batch process, applying application logic to compute derived values based on rules. Used Oozie to schedule the batch workflows and enabled multiple executions of the same workflow on the same day while ensuring the runs did not conflict with one another, which helped deliver near-real-time data to downstream applications. Resolved data quality issues identified by the business owners. Reviewed production executions of the process and optimized the workflows by running independent sections in parallel and tuning the Hive queries. Created daily and weekly email alert processes reporting statistics on selected application data. Created a PySpark program to archive or delete inactive data from HBase and Hive tables. Collaborated with developers and performance engineers to enhance supportability and identify performance bottlenecks. Followed the Agile process as a team member, working on assigned user stories and tasks, logging effort, and providing regular status updates.
+1 646 305 2118
+91 9875 492266
