About Me
Skills
Positions
Portfolio Projects
Description
· Ingested data from disparate sources to create a data lake on S3.
· Set up access control on AWS using SAML identity providers.
· Used Sqoop to capture data changes in Netezza.
· Optimized the Netezza ingestion process to reduce overall time by 4 hours.
· Used AWS EMR task nodes to run Spark tasks, saving 10% of the cost of on-demand machines.
· Integrated Datadog with AWS services like ECS and EMR.
· Set up an AWS EMR cluster to deploy a Spark cluster.
· Optimized the CI/CD pipeline to run tests in parallel on CircleCI.
· Anonymized PII data to handle CCPA requests.
· Implemented Airflow dependency management using Poetry.
· Set up a Lambda process that is triggered by an AWS SES rule set.
· Automated data governance capability for the ETL jobs.
· Implemented an Airflow root DAG that tracks the status of all DAGs and sends the report by email.
· Set up a process that calculates domain recency metrics and emails them on a daily cadence.
· Implemented data validation functionality that checks the schema of incoming JSON events before transformation.