Sagar D.

Big Data Engineer

Pune, India

Experience: 4 Years

72000 USD / Year

  • Immediate: Available

About Me

A fusion of experience across a multitude of data-related technologies: Big Data Analytics, AWS Cloud, Business Intelligence and Reporting, with extensive technical experience of almost 3.5 years in India. Currently working with Systems Plus Transformations L...


Portfolio Projects

Content Matching & Price Comparison (GMP)

Contribute

  • Working as part of the development team.
  • Development, testing and delivery of different modules.
  • Used multiple string-matching algorithms to compare data written in natural languages.

Description

It is a Big Data project on a cloud platform. TUI addresses its future markets with this module. We designed a new job that compares our data with competitors' data based on string comparison and matching ratios computed by different algorithms, and we are designing a dynamic pricing module on top of that comparison.
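
The project does not include code, but a minimal sketch of this kind of string-matching comparison might use Python's standard difflib; the actual algorithms, thresholds and catalogue entries below are assumptions for illustration, not the project's real implementation:

from difflib import SequenceMatcher

def match_ratio(ours: str, competitor: str) -> float:
    """Return a similarity ratio in [0, 1] between two product descriptions."""
    return SequenceMatcher(None, ours.lower(), competitor.lower()).ratio()

# Hypothetical catalogue entries; a real job would read these from the data lake.
our_item = "Grand Hotel Palma 4* - Half Board, Sea View"
competitor_item = "Grand Hotel Palma **** half board with sea view"

ratio = match_ratio(our_item, competitor_item)
if ratio > 0.8:  # threshold is illustrative
    print(f"Likely the same product (ratio={ratio:.2f}); compare prices.")

In practice a matching job like this would combine several ratios (token-based, edit-distance-based) before deciding two natural-language descriptions refer to the same product.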


Description

It is a Big Data project on a cloud platform. TUI wants its system upgraded, enhanced and moved to the cloud. We designed a new pipeline for new platforms such as the Global Marketing Platform (GMP) to process their data using cloud-based tools.

Data Flow 1: Source (HDFS) → S3 → Matillion → Snowflake → IBM UNICA
Data Flow 2: Source (HDFS) → S3 → Spark (Qubole) → Hive (Qubole) → IBM UNICA
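
As an illustration of Data Flow 2, a minimal PySpark sketch of the S3-to-Hive hop might look like the following; the bucket, path and table names are hypothetical, and the real jobs ran on Qubole-managed clusters with their own configuration:

from pyspark.sql import SparkSession

# Sketch of Data Flow 2 (S3 -> Spark -> Hive); names are placeholders.
spark = (SparkSession.builder
         .appName("gmp-ingest")
         .enableHiveSupport()
         .getOrCreate())

# Read the raw marketing data landed in S3.
raw = spark.read.parquet("s3://gmp-landing/marketing_events/")

# Basic cleansing before exposing the data downstream.
cleaned = raw.dropDuplicates(["event_id"]).filter("event_ts IS NOT NULL")

# Persist into a Hive table for downstream consumers (e.g. the IBM UNICA export).
cleaned.write.mode("overwrite").saveAsTable("gmp.marketing_events")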


Contribute

  • Source system analysis.
  • Analyzed business requirements and functional specifications.
  • ETL development using Python.
  • Replaced HBase with Amazon Redshift to reduce usage of EMR and hence reduce cost.

Description

It is a Big Data project on the Hadoop platform. Employee payroll data pertaining to different production houses is loaded from the source, Amazon S3, into HBase tables. This involves capturing the business requirements and making sure validation is in place. Later, to reduce the cost of running the EMR cluster, we replaced HBase with Amazon Redshift, moving the data into Redshift after ETL processing instead of HBase. This data is further normalized and aggregated for Business Intelligence analysis through FAE (Fluid Analytics Engine) visualizations/dashboards.

Data Flow: Source (Amazon S3) → Hadoop (HBase tables) / Amazon Redshift → Reports (FAE)
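
A minimal sketch of the S3-to-Redshift load that replaced the HBase tables, assuming psycopg2 and Redshift's COPY command; the cluster endpoint, credentials, table name and IAM role are all placeholders:

import psycopg2

# Sketch of the S3 -> Redshift load; connection details are placeholders.
conn = psycopg2.connect(
    host="payroll-cluster.example.redshift.amazonaws.com",
    port=5439, dbname="payroll", user="etl_user", password="...",
)

copy_sql = """
    COPY payroll.employee_payments
    FROM 's3://payroll-source/2020/07/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy'
    FORMAT AS PARQUET;
"""

with conn, conn.cursor() as cur:
    cur.execute(copy_sql)  # Redshift pulls the files directly from S3

Loading via COPY keeps the heavy lifting inside Redshift, which is what removes the need for a long-running EMR cluster in a design like this.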


Description

It is a Big Data project on the Hadoop platform. The American Automobile Association (AAA) wants its system upgraded and enhanced to Hadoop 2.7, with corresponding platform upgrades and an upgrade to its claim-fraud analytics. We designed a new pipeline for new modules, such as the portfolio module, from scratch.

Data Flow: Source (RDBMS/File) → Sqoop/Shell → Pig/Hive/Spark → Elasticsearch → Web UI Portals
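
A sketch of the final hop of this pipeline (Spark to Elasticsearch), assuming the elasticsearch-hadoop connector is on the classpath; the host, Hive table and index names are placeholders, not the project's actual configuration:

from pyspark.sql import SparkSession

# Sketch of Spark -> Elasticsearch via the elasticsearch-hadoop connector.
spark = (SparkSession.builder
         .appName("claims-to-es")
         .enableHiveSupport()
         .config("spark.es.nodes", "es.internal.example.com")
         .config("spark.es.port", "9200")
         .getOrCreate())

# Fraud-scored claims produced by the Pig/Hive/Spark stages upstream.
claims = spark.table("claims.fraud_scored")

(claims.write
    .format("org.elasticsearch.spark.sql")
    .option("es.resource", "claims-fraud")  # target index for the web portals
    .mode("append")
    .save())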


Contribute

  • Worked with a 6-member enhancement team.
  • Enhancement, testing, code review and delivery of different modules such as Subrogation and RF Home.
  • Created mapping documents for various modules by analyzing...

Description

It is a Big Data project on the Hadoop platform. US consumer-insurance data pertaining to Claims, Policies, Agency, and Billings & Payments is loaded from the source database into Hadoop Hive tables. This involves combining data from multiple sources, capturing the business requirements and making sure the correct data is in place. The data is further normalized and aggregated for Business Intelligence analysis through Tableau visualizations/dashboards. The work includes creation of the Landing, Staging and Aggregation layers and reports, following the iMAP architecture.

Data Flow: Source (SQL Server) → Hadoop (Landing Layer → Staging Layer → Aggregation Layer) → Reports (Tableau)
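
A sketch of how data might be promoted through the layers named above using Spark SQL against Hive; the database, table and column names are illustrative and assume the tables already exist, not the actual iMAP schema:

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("imap-layers")
         .enableHiveSupport()
         .getOrCreate())

# Staging: cleanse and conform the raw landing-layer data.
spark.sql("""
    INSERT OVERWRITE TABLE staging.claims
    SELECT claim_id, policy_id, CAST(paid_amount AS DECIMAL(12, 2)) AS paid_amount
    FROM landing.claims
    WHERE claim_id IS NOT NULL
""")

# Aggregation: roll up the conformed data for the Tableau dashboards.
spark.sql("""
    INSERT OVERWRITE TABLE agg.claims_by_policy
    SELECT policy_id, COUNT(*) AS claim_count, SUM(paid_amount) AS total_paid
    FROM staging.claims
    GROUP BY policy_id
""")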
