· 4+ years of technical experience in India across a multitude of data-related technologies: Big Data analytics, AWS cloud, and Business Intelligence and Reporting.
· Passionate about working with data (structured or unstructured), and a consistent performer in the role of Big Data Developer.
· Good understanding of AWS cloud services and related tools such as S3, AWS Glue, Amazon Athena, Qubole, EMR, the ELK stack, Snowflake, and Matillion.
· Good experience and excellent understanding of HDFS, Python, Scala, Spark, Hive, Pig, Sqoop, Oozie, Logstash, Elasticsearch, Filebeat, and MySQL.
Data & Analytics
Project – TUI Future Markets: Content Matching & Price Comparison (GMP)
Working as part of the development team on the development, testing, and delivery of different modules. Used multiple string-matching algorithms to compare data written in natural language.
This is a Big Data project on a cloud platform through which TUI addresses its future markets. We designed a new job that compares competitor data using string comparison and matching ratios produced by different algorithms, and we are designing a dynamic pricing module based on these comparisons.
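The project's exact matching algorithms are not named above; a minimal sketch of the matching-ratio idea, using Python's standard-library `difflib.SequenceMatcher` (the product names, threshold, and helper functions below are illustrative assumptions, not taken from the project):

```python
from difflib import SequenceMatcher

def match_ratio(a: str, b: str) -> float:
    """Similarity ratio between two free-text product descriptions (0.0-1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_match(item: str, candidates: list[str], threshold: float = 0.6):
    """Return the competitor entry most similar to `item`, or None if no
    candidate clears the matching-ratio threshold."""
    scored = [(match_ratio(item, c), c) for c in candidates]
    score, best = max(scored)
    return best if score >= threshold else None

# Illustrative competitor listings for one hotel product
competitors = [
    "Hotel Riu Palace, Tenerife",
    "RIU Palace Tenerife - All Inclusive",
    "Iberostar Anthelia",
]
print(best_match("Riu Palace Tenerife", competitors))
# → Hotel Riu Palace, Tenerife
```

In practice a price-comparison job would run this pairing step first and only then compare prices of the matched pairs; entries below the threshold are treated as having no competitor counterpart.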
Skills: AWS Cloud, PySpark, Apache Spark
AAA Hadoop Data Foundation
Worked with a 6-member enhancement team on the enhancement, testing, code review, and delivery of modules such as Subrogation and RF Home. Created mapping documents for various modules by analyzing the source data.
This is a Big Data project on the Hadoop platform. US consumer insurance data pertaining to Claims, Policies, Agency, and Billings & Payments is loaded from the source database into Hadoop Hive tables. This involves combining data from multiple sources, capturing the business requirements, and making sure the correct data is in place. The data is then normalized and aggregated for Business Intelligence analysis through Tableau visualizations/dashboards. The effort includes creation of the Landing, Staging, and Aggregation layers and the Reports using the iMAP architecture.
Data Flow: Source (SQL Server) → Hadoop (Landing Layer → Staging Layer → Aggregation Layer) → Reports (Tableau)
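The layered Landing → Staging → Aggregation flow can be sketched in a few SQL statements; here SQLite stands in for Hive, and the table and column names are illustrative assumptions, not taken from the actual project:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Landing layer: raw claims records as received from the source (SQL Server),
# kept as untyped text, warts and all
cur.execute("CREATE TABLE landing_claims (claim_id TEXT, state TEXT, amount TEXT)")
cur.executemany(
    "INSERT INTO landing_claims VALUES (?, ?, ?)",
    [("C1", "ca", "100.50"), ("C2", "CA", "200.00"), ("C3", "ny", "50.25")],
)

# Staging layer: validated and normalized (typed amounts, uppercase state codes)
cur.execute("""CREATE TABLE staging_claims AS
               SELECT claim_id, UPPER(state) AS state,
                      CAST(amount AS REAL) AS amount
               FROM landing_claims""")

# Aggregation layer: per-state totals, the shape a Tableau dashboard would read
cur.execute("""CREATE TABLE agg_claims AS
               SELECT state, COUNT(*) AS claim_count, SUM(amount) AS total_amount
               FROM staging_claims GROUP BY state""")

print(cur.execute("SELECT * FROM agg_claims ORDER BY state").fetchall())
# → [('CA', 2, 300.5), ('NY', 1, 50.25)]
```

The same pattern in Hive would use `INSERT OVERWRITE` into partitioned tables per layer, but the separation of concerns (raw, cleansed, aggregated) is identical.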
Tools: PuTTY, SQL Developer
Performed source system analysis; analyzed business requirements and functional specifications. Developed ETL using Python. Replaced HBase with Amazon Redshift to reduce usage of EMR and hence reduce cost.
This is a Big Data project on the Hadoop platform. Employee payroll-processing data pertaining to different production houses is loaded from the source, Amazon S3, into HBase tables. This involves capturing the business requirements and making sure validation is in place. Later, to reduce the cost of running the EMR cluster, we replaced HBase with Amazon Redshift, moving the data after ETL processing into Amazon Redshift instead of HBase. The data is then normalized and aggregated for Business Intelligence analysis through FAE (Fluid Analytics Engine) visualizations/dashboards.
Data Flow: Source (Amazon S3) → Hadoop (HBase Tables) / Amazon Redshift → Reports (FAE)
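The validation step that sits between S3 and the Redshift load can be sketched in plain Python: reject invalid payroll rows, then stage the survivors as CSV, the format a Redshift `COPY` command typically ingests. The field names and validation rules below are illustrative assumptions, not the project's actual schema:

```python
import csv
import io

def stage_for_redshift(rows):
    """Keep only valid payroll rows (non-empty employee id, positive gross pay)
    and emit them as CSV text ready to be staged for a Redshift COPY."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for row in rows:
        if row.get("employee_id") and float(row.get("gross_pay", 0)) > 0:
            writer.writerow([row["employee_id"], row["production_house"], row["gross_pay"]])
    return buf.getvalue()

# Illustrative raw records as they might arrive from S3
raw = [
    {"employee_id": "E100", "production_house": "StudioA", "gross_pay": "5200.00"},
    {"employee_id": "",     "production_house": "StudioB", "gross_pay": "4100.00"},  # rejected: no id
    {"employee_id": "E101", "production_house": "StudioB", "gross_pay": "0"},        # rejected: zero pay
]
print(stage_for_redshift(raw))
# → E100,StudioA,5200.00
```

In the full pipeline the staged CSV would be written back to S3 and loaded with `COPY ... FROM 's3://...'`, which is what makes Redshift cheaper here than keeping an EMR cluster running for HBase.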