Sudheer R.

Senior Hadoop/Spark Developer

Mumbai, India

Experience: 14 Years

Salary: 41304 USD / Year

Start Date / Notice Period end date: 2019-11-11


About Me

14+ years of overall IT experience as a PL/SQL Developer and in Big Data Hadoop, including 5 years of exclusive experience in Hadoop and its components: HDFS, MapReduce, Pig, Hive, Sqoop, HBase, Oozie, Spark and Scala. Good working knowledge of MapReduce...

Portfolio Projects

www.unilever.com

Unilever Business Analytics Project

Contribute

Team Leader

Description

Unilever receives source data from different source systems. As part of the same business, each customer is offered different types of products based on their needs; a customer might hold several retail products. Maintaining such huge volumes and varieties of data in traditional databases is a very tedious process. To meet Unilever's data scaling needs, the current data warehouse system was re-platformed to Hadoop as a cost-effective solution.
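The description above doesn't include code, but a minimal Spark/Scala sketch of this kind of re-platforming flow is shown below; the HDFS path, database and table names, and the product_type column are illustrative assumptions, not details taken from the project.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Minimal sketch of landing one source system's extract in Hive.
// All paths and names below are assumptions for illustration.
object CustomerIngest {
  def main(args: Array[String]): Unit = {
    // Hive-enabled session so DataFrames can be saved as Hive tables.
    val spark = SparkSession.builder()
      .appName("customer-ingest")
      .enableHiveSupport()
      .getOrCreate()

    // Read one source system's extract (hypothetical landing path).
    val customers = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs:///landing/source_a/customers")

    // Write to a partitioned Hive table so downstream queries scan
    // only the partitions they need as data volumes grow.
    customers.write
      .mode(SaveMode.Overwrite)
      .partitionBy("product_type") // assumed column
      .saveAsTable("analytics.customers")

    spark.stop()
  }
}
```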

T-Mobile Business Analytics Project

Contribute

• Expertise in designing and deploying Hadoop clusters and various Big Data analytics tools, including Pig, Hive, HBase, Sqoop, Flume and Apache Spark, on the Cloudera distribution.

Description

T-Mobile receives source data from different source systems. As part of the same business, each customer is offered different types of products based on their needs; a customer might hold several retail products. Maintaining such huge volumes and varieties of data in traditional databases is a very tedious process. To meet T-Mobile's data scaling needs, the current data warehouse system was re-platformed to Hadoop as a cost-effective solution.

www.bestbuy.com

This project is all about the rehosting of Best Buy's current existing project onto the Hadoop platform

Contribute

• Moved all crawl data flat files generated from various retailers to HDFS for further processing.
• Wrote Pig scripts to process the HDFS data (a Spark equivalent is sketched below).
• Created Hive tables to store the processed results.
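The Pig-plus-Hive flow in the bullets above could equally be expressed in Spark/Scala; the following is a minimal sketch, assuming a hypothetical tab-delimited file layout and table name rather than the project's actual scripts.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.functions.lower

// Sketch of the crawl-data pipeline: flat files in HDFS are cleaned
// and stored in a Hive table. Paths, columns and the table name are
// assumptions for illustration.
object CrawlDataLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("crawl-data-load")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Flat files landed in HDFS by the crawlers (assumed tab-delimited).
    val crawl = spark.read
      .option("sep", "\t")
      .csv("hdfs:///data/crawl/incoming")
      .toDF("retailer", "sku", "price", "crawl_date")

    // The kind of cleanup the Pig scripts performed: drop rows with
    // no price and normalise the retailer name.
    val cleaned = crawl
      .filter($"price".isNotNull)
      .withColumn("retailer", lower($"retailer"))

    // Store the processed results in a Hive table for analysts.
    cleaned.write
      .mode(SaveMode.Append)
      .saveAsTable("competitor.crawl_prices")

    spark.stop()
  }
}
```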

Description

This project is all about the rehosting of Best Buy's current existing project onto the Hadoop platform. Previously, Best Buy was using a MySQL database to store its competitor retailers' information [the crawled web data]. Early on there were only four competitor retailers, namely Amazon.com, walmart.com, etc.

But as the number of competitor retailers grew, the data generated from web crawling increased massively and could no longer be accommodated in a MySQL-style data store. For this reason Best Buy wanted to move to Hadoop, which can handle massive amounts of data across its cluster nodes and satisfy the scaling needs of Best Buy's business operations.

www.anz.com.au

ANZ Customer Insight & Retail Analytics Project

Contribute

• Imported all the ANZ customer-specific personal data into Hadoop using the Sqoop component of Hadoop (a Spark JDBC equivalent is sketched below).
• Created two different users (hd user for performing HDFS operations and mapred user for performing MapReduce operations).
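The project used Sqoop for this import; purely as an illustration of the same RDBMS-to-HDFS movement, the sketch below does the equivalent with Spark's JDBC reader. The connection URL, source table, split bounds and target path are all hypothetical.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Sketch of a parallel RDBMS import, analogous to what Sqoop does
// with --split-by. Every connection detail here is an assumption.
object CustomerImport {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("anz-customer-import")
      .getOrCreate()

    // Pull the customer table over JDBC, split into parallel reads
    // on the primary key.
    val customers = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL") // assumed
      .option("dbtable", "CRM.CUSTOMERS")                    // assumed
      .option("user", sys.env("DB_USER"))
      .option("password", sys.env("DB_PASS"))
      .option("partitionColumn", "CUSTOMER_ID")
      .option("lowerBound", "1")
      .option("upperBound", "10000000")
      .option("numPartitions", "8")
      .load()

    // Land the raw import in HDFS as Parquet for downstream Hive work.
    customers.write
      .mode(SaveMode.Overwrite)
      .parquet("hdfs:///raw/anz/customers")

    spark.stop()
  }
}
```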

Description

The Australian division caters for the bank's retail, commercial and wealth management customers in Australia. The Retail businesses are responsible for delivering a range of banking products and services to retail customers, while Commercial serves small to medium enterprises through to smaller corporates. The division also has a dedicated merchant analytics management business designed to meet the needs of high-net-worth individuals. This solution is concerned with the development of a cost-effective data warehouse using Hadoop and Hive for the storage of large amounts of historical and log data. The raw data comes from various sources and is dumped directly into the Hadoop file system through Sqoop (a data extraction tool used to pull data from RDBMSs such as Oracle, DB2 and Teradata). The data is then processed (denormalization, partitioning, bucketing, etc.) using Hive queries. After that, the data is loaded (using customized and optimized queries) into Hive, and ad-hoc queries can be run to retrieve any form of data.
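The processing steps named above (denormalization, partitioning, bucketing) would be plain Hive queries; the sketch below shows what the partitioned variant of that step could look like, issued through a Spark session. Database, table and column names are illustrative assumptions, and bucketing is omitted from the sketch since Spark does not populate Hive-bucketed tables.

```scala
import org.apache.spark.sql.SparkSession

// Sketch of the Hive processing step: create a date-partitioned
// warehouse table and denormalise raw landing tables into it.
// All database, table and column names are assumptions.
object WarehouseBuild {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("warehouse-build")
      .enableHiveSupport()
      .getOrCreate()

    // Let each day's rows flow into its own partition dynamically.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    // Queries filtering on txn_date scan only matching partitions.
    spark.sql("""
      CREATE TABLE IF NOT EXISTS dw.transactions (
        customer_id BIGINT,
        account_id  BIGINT,
        branch_id   BIGINT,
        amount      DECIMAL(18,2)
      )
      PARTITIONED BY (txn_date STRING)
      STORED AS ORC
    """)

    // Denormalise: flatten account attributes onto each transaction,
    // with the partition column last for the dynamic-partition insert.
    spark.sql("""
      INSERT OVERWRITE TABLE dw.transactions PARTITION (txn_date)
      SELECT t.customer_id, t.account_id, a.branch_id, t.amount, t.txn_date
      FROM raw.transactions t
      JOIN raw.accounts a ON t.account_id = a.account_id
    """)

    spark.stop()
  }
}
```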
