About Me
- 7+ years of experience in data analytics with renowned MNCs
- Experienced Data Engineer with a demonstrated history of working in the Telecom, Supply Chain, HCM, and Advertising domains
- Excellent understanding of the big data stack, with expertise in Spark, Hadoop, and the Hadoop ecosystem components
- Experience with the AWS cloud platform and its services, including EC2, ECS, DynamoDB, SNS, RDS, Secrets Manager, and S3
- Implemented various types of Airflow DAGs for Telecom data
- Preprocessed structured and unstructured data and implemented various database designs
- Identified valuable data sources and automated their collection processes
- Experience with various projects migrating legacy systems to big data systems (Hadoop, Spark, Azure Databricks, S3)
- Proficient in Python and Scala programming, with experience in NoSQL databases (DynamoDB, HBase, MongoDB)
Skills
Web Development
Data & Analytics
Database
Software Engineering
Programming Language
Operating System
Others
Development Tools
Positions
Portfolio Projects
Company
Big Data
Description
- Working in the Data Engineer group, extracting, ingesting, and transforming Ad Marketing, Telecom, and Logistics data for consumption by downstream teams
- Developed a flow to migrate data from 50+ sources to AWS S3 using Python scripts
- Responsible for building and supporting a Big Data-based ecosystem designed for enterprise-wide analysis of structured, semi-structured, and unstructured data.
- Managing a team of 4 people and assigning day-to-day tasks to them.
- Validating migrated data, performing data quality checks, and implementing CI/CD and orchestration using Jenkins, Terraform, Docker, etc.
- Reviewing code and providing feedback on best practices, performance improvements, etc., and working in a pair-programming environment
- Developed an NLP model for sentiment analysis on Twitter data (Logistic Regression).
Tools
PyCharm
Company
Airflow
Description
- Automated batch data transfer and loading into Redshift using Hive and shell scripts
- Established a strong working relationship with the business, teammates, and others within the organization
Skills
Apache Airflow, Python
Tools
PyCharm
Company
ETL
Description
- Developed a framework to migrate data from different legacy sources to HDFS, using SSIS and T-SQL for further processing
- Led a 5-person project for a pan-India implementation of an invoicing tool for Reliance Communication, which improved invoicing efficiency by 70% and saved 2430 man-hours monthly
- Introduced big data analytics to automate the generation of 250 reports per month, catering to audiences from engineers to the CEO
- Assisted application development teams during the design and development of highly complex and critical data projects
- Took part in POCs to adopt new technologies that improve data platform management at large scale and with high throughput
Skills
SQL
Tools
MS SQL Server