Rajendraprasad C.

Hadoop senior developer and architect with Talend

Hyderabad, India

Experience: 11 Years

62019.2 USD / Year

  • Notice Period: Days


About Me

  • 11+ years of IT experience, of which 6+ years are in Hadoop; the remaining experience is in application development, maintenance and application support using different technologies across different domains.
  • Good knowledge on Talend Studio, Hive, Pig, Sqoop, HBase, Oozie, Flume and MapReduce.
  • Strong technical and architectural knowledge in providing Hadoop solutions. Designed and implemented 2 metadata-driven frameworks using Talend.
  • Worked with 2 clients in USA for a total duration of 3 years.
  • Extensive experience in writing the Apache Pig scripts and Hive queries.
  • Extensive experience in Pig UDF and Hive UDF/UDTF creation (see the UDF sketch after this list).
  • Tuned Hive queries to minimize job execution time and improve performance.
  • Involved in migrating data from RDBMS to HDFS and Hive.
  • Designed and optimized Sqoop jobs to migrate data from different RDBMSs to Hadoop environment.
  • Created and scheduled job work flows in Oozie.
  • Worked with the Flume tool to ingest file data into the Hadoop environment.
  • Effective in working independently and collaboratively in teams.
  • Flexible and ready to take on new challenges.
  • Worked as team lead and tech lead on different projects; also helped architects propose solutions to the client.
  • Worked on a few proposals and POCs to help architects choose the right path when designing solutions.
  • Participated in client requirement analysis and business/technical process analysis, and translated business requirements and problems into technical solutions.
  • Team player with good Interpersonal, communication and presentation skills.
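
The profile mentions Hive UDF/UDTF development among the core skills but does not include any code. Below is a minimal sketch of a simple Hive UDF in Java, illustrating the kind of function referred to; the class name and behaviour are hypothetical, not taken from the profile.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Simple Hive UDF: normalises free-text codes to upper case with surrounding spaces trimmed.
    public class NormalizeCode extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;            // Hive passes NULLs through
            }
            return new Text(input.toString().trim().toUpperCase());
        }
    }

Once packaged into a jar, such a function would be registered and used in Hive roughly as: ADD JAR normalize-code.jar; CREATE TEMPORARY FUNCTION normalize_code AS 'NormalizeCode'; SELECT normalize_code(code_col) FROM some_table;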


Portfolio Projects

Description

Polaris C360 is a project in which MyUHC gets data from Mark Logic to provide a complete 360-degree view of a consumer. Mark Logic, in turn, receives its data from the ETL team, which generates the data canonical-wise.

Cirrus is the actual source of truth for this application. The Data Lake team ingests historical (one-time) and incremental data (3 times a day) into the Data Lake. The ETL team then takes this data from the Data Lake, generates XMLs and loads them into Mark Logic. This is called canonical generation.

This canonical generation happens in two separate approaches, Json and Non-Json (Hive & Spark). In the Json approach, a pre-processor map-reduce program identifies the incremental ids using json grouping; the XMLs for these ids have to be regenerated for the day because their data has changed. The processor then takes these ids as input, generates the XMLs and loads them into Mark Logic.
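
The pre-processor itself is not part of this profile; the following is a minimal sketch of how such an id-identification map-reduce step could look, assuming newline-delimited json records with a top-level "id" field (the field name, record layout and class names are assumptions, and the job driver is omitted).

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Mapper: parse each json record and emit its consumer id.
    // A real job would use a proper JSON parser; this naive scan keeps the sketch short.
    public class IncrementalIdMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String json = value.toString();
            int idx = json.indexOf("\"id\":\"");
            if (idx >= 0) {
                int start = idx + 6;
                int end = json.indexOf('"', start);
                if (end > start) {
                    context.write(new Text(json.substring(start, end)), NullWritable.get());
                }
            }
        }
    }

    // Reducer: grouping by id de-duplicates, so the output is the distinct list of changed ids.
    class IncrementalIdReducer extends Reducer<Text, NullWritable, Text, NullWritable> {
        @Override
        protected void reduce(Text id, Iterable<NullWritable> values, Context context)
                throws IOException, InterruptedException {
            context.write(id, NullWritable.get());
        }
    }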

In the Non-Json approach the XML generation happens using Hive queries. These can be executed either in normal Hive or in Spark SQL. Queries written in scripts are executed for incremental id identification and XML data generation. At the end of the day the XMLs are generated and loaded into Mark Logic.
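
The actual HQL scripts are not included here; below is a small, hypothetical sketch of running such a query through Spark SQL with Hive support, as the Non-Json approach describes (application name, database, table and column names are all assumptions).

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class NonJsonIncrementalIds {
        public static void main(String[] args) {
            // Hive-backed Spark session so the same HQL can run on either engine.
            SparkSession spark = SparkSession.builder()
                    .appName("non-json-incremental-ids")
                    .enableHiveSupport()
                    .getOrCreate();

            // Hypothetical table and column names standing in for the real incremental-id query.
            Dataset<Row> changedIds = spark.sql(
                    "SELECT DISTINCT consumer_id " +
                    "FROM datalake.claims_incremental " +
                    "WHERE load_date = current_date()");

            changedIds.createOrReplaceTempView("changed_ids");

            // The XML bodies would then be assembled with further HQL joins against the base tables.
            spark.sql("SELECT c.consumer_id FROM changed_ids c").show(10);

            spark.stop();
        }
    }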


Description

Creating extracts from the data lake. A metadata-driven framework was created to prepare 96 (110) external extracts. There are 3 parts in this project, named Oxford Ingestion, Cirrus Extraction and Merge Extract; the 2nd and 3rd parts were designed and developed by our team.

Oxford ingestion is loading data from the Oxford application, which arrives in the form of files. The Data Fabric team has developed a framework to ingest file data into Hive tables; we simply configure those framework jobs and ingest the data into our tenant Hive.

Cirrus extraction is extracting data from the Data Lake into our tenant Hive, based on the query defined for each extract. Since there is a huge number of extracts, we developed metadata-driven framework jobs in Talend to perform this activity.
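
The framework was built in Talend, so no code is shown in the profile. The sketch below illustrates the metadata-driven idea in plain Java over Hive JDBC, assuming a hypothetical metadata table (metadata.extract_config), connection URL and tenant database; it is an illustration, not the actual framework.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class MetadataDrivenExtract {
        public static void main(String[] args) throws Exception {
            Class.forName("org.apache.hive.jdbc.HiveDriver");

            // HiveServer2 JDBC connection; URL, credentials and table names are assumptions.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://hiveserver:10000/default", "etl_user", "");
                 Statement stmt = conn.createStatement()) {

                // 1. Read the extract definition (its HQL) from the metadata table.
                String extractName = args[0];
                String hql = null;
                try (ResultSet rs = stmt.executeQuery(
                        "SELECT extract_hql FROM metadata.extract_config " +
                        "WHERE extract_name = '" + extractName + "'")) {
                    if (rs.next()) {
                        hql = rs.getString(1);
                    }
                }

                // 2. Run the extract query, writing the result into the tenant database.
                if (hql != null) {
                    stmt.execute("INSERT OVERWRITE TABLE tenant." + extractName + " " + hql);
                }
            }
        }
    }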

The Merge extract then merges the outputs of part 1 and part 2, again using another metadata-driven framework, based on the number of files and the header, footer and crosswalk conditions.
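
As a rough illustration of the merge step, the sketch below concatenates the part files of one extract and wraps them with a header and a trailer carrying the record count. File-count validation and crosswalk handling are omitted, and the header/footer values, paths and class name are assumptions; in the project this logic lived in the Talend framework driven by metadata.

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;
    import java.util.ArrayList;
    import java.util.List;

    public class MergeExtract {
        // Merge all part files for one extract and wrap them with a header and footer line.
        static void merge(Path partDir, Path target, String header, String footer) throws IOException {
            List<String> lines = new ArrayList<>();
            lines.add(header);
            try (DirectoryStream<Path> parts = Files.newDirectoryStream(partDir, "part-*")) {
                for (Path part : parts) {
                    lines.addAll(Files.readAllLines(part, StandardCharsets.UTF_8));
                }
            }
            // The trailer typically carries the data-record count for reconciliation.
            lines.add(footer + (lines.size() - 1));
            Files.write(target, lines, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);
        }

        public static void main(String[] args) throws IOException {
            // Header/footer strings would normally come from the metadata table.
            merge(Paths.get(args[0]), Paths.get(args[1]), "HDR|EXTRACT", "TRL|");
        }
    }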


Description

The OneClaim ("OneClaim") Program is being implemented for the Customer's US domestic and international business. The AIGPC Claims organization seeks to leverage insights from the semi-structured XML data in the OneClaim ODS, which allows for a better understanding of claims-process efficiency and for better analytics models. A process to track and maintain the full history of an OneClaim ODS claim is currently not available, and providing such a capability is cumbersome, time-consuming and not easily scalable. The goal is a better process to maintain the full historical changes to a claim during its life cycle and to support analytics on top of the semi-structured XML data.

The Customer focuses on strengthening data governance across data-enabled projects through the implementation of EDM policies and standards: establishing metadata and data quality management, establishing an integrated target data architecture through the implementation of enterprise information services, and enhancing data and analytics capabilities by establishing a controlled, shared data and analytics environment.

The Project consists of the following scope items:

Operational Data Store (“ODS”) data visualization

  • Capture ODS and claims data warehouse ("CDW") data into the big data platform to support management information initiatives (a small XML field-extraction sketch follows this list).

  • Enhance the big data platform to include all subject areas available in ODS.

  • Enhance the big data platform to search through claims data.

  • Provide web service access to data in the Customer's OneClaim application.
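
Capturing the semi-structured claim XML means pulling individual fields out of each document before it can be analysed. The snippet below is a self-contained, hypothetical example of that kind of extraction using the JDK's XPath API; the element names and class name are assumptions, not taken from the OneClaim schema.

    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import java.io.ByteArrayInputStream;
    import java.nio.charset.StandardCharsets;

    public class ClaimXmlField {
        // Extract one field from a claim XML document; element names here are assumptions.
        static String claimStatus(String claimXml) throws Exception {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(claimXml.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate("/claim/status", doc);
        }

        public static void main(String[] args) throws Exception {
            System.out.println(claimStatus("<claim><status>OPEN</status></claim>")); // prints OPEN
        }
    }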


Description

  • Designed the solution approach for data import, cleansing and storing in warehouse.
  • Implemented the solution for all steps of the problem statement.
  • For importing data from DB2 to HBase, created shell and Sqoop scripts that handle the one-time bulk load and the daily delta load (see the import sketch below).
  • Created Hive scripts to manipulate and load tables.
  • Converted complex DB2 views into Hive queries that run daily as a batch to prepare different reports.

Scheduled jobs in Autosys so that the complete process/scripts run every day automatically.
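
The actual jobs were shell and Sqoop scripts scheduled through Autosys; as a rough sketch, the daily delta import could be driven from Java as below. The connection URL, credentials, table, column-family and check-column names are all assumptions made for illustration.

    import java.util.Arrays;
    import java.util.List;

    public class Db2ToHbaseDelta {
        public static void main(String[] args) throws Exception {
            // Incremental Sqoop import from DB2 into an HBase table; all names are placeholders.
            List<String> cmd = Arrays.asList(
                    "sqoop", "import",
                    "--connect", "jdbc:db2://db2host:50000/CLAIMSDB",
                    "--username", "etl_user", "--password-file", "/user/etl_user/.db2pwd",
                    "--table", "CLAIMS",
                    "--hbase-table", "claims",
                    "--column-family", "cf",
                    "--hbase-row-key", "CLAIM_ID",
                    "--incremental", "lastmodified",
                    "--check-column", "UPDATED_TS",
                    "--last-value", args[0]);          // timestamp of the last successful load

            int exitCode = new ProcessBuilder(cmd)
                    .inheritIO()
                    .start()
                    .waitFor();
            if (exitCode != 0) {
                throw new IllegalStateException("sqoop import failed with exit code " + exitCode);
            }
        }
    }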


Description

  • Worked as Team/Development Lead for an offshore team of 10 members, including myself.
  • Responsible for moving data to HDFS from DB2 databases using Sqoop.
  • Cleansed data coming from different sources using Pig and UDFs, and saved all processed data into HBase.
  • Performed analytics using Hive by integrating Hive and HBase tables.
  • Developed Java web services (CRUD) to interact with data in HBase for different Java web UIs that read, write, insert and update client relationships (see the HBase client sketch below).
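
The web service code itself is not included in the profile. The sketch below shows the kind of HBase client calls such a CRUD service would wrap, using the standard HBase Java client; the table name, column family, qualifier and row key are assumptions.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ClientStore {
        // Table, column family and qualifier names are placeholders for illustration.
        private static final TableName TABLE = TableName.valueOf("clients");
        private static final byte[] CF = Bytes.toBytes("d");

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TABLE)) {

                // Write (create/update): one row per client, keyed by client id.
                Put put = new Put(Bytes.toBytes("client-42"));
                put.addColumn(CF, Bytes.toBytes("relationship"), Bytes.toBytes("PRIMARY"));
                table.put(put);

                // Read: fetch the row back and decode the column.
                Result result = table.get(new Get(Bytes.toBytes("client-42")));
                System.out.println(Bytes.toString(result.getValue(CF, Bytes.toBytes("relationship"))));
            }
        }
    }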


Description

  • Responsible for importing data from different RDBMS and loading into HDFS.
  • Wrote Apache Pig scripts to process the HDFS data (see the PigServer sketch below).
  • Created Hive tables to store the processed results in a tabular format.
  • Completely involved in the requirements analysis phase.
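
The Pig Latin lived in stand-alone scripts; a minimal, hypothetical sketch of the same kind of processing driven through the Pig Java API is shown below. The input path, field layout and output location are assumptions.

    import org.apache.pig.ExecType;
    import org.apache.pig.PigServer;

    public class ProcessHdfsData {
        public static void main(String[] args) throws Exception {
            // MAPREDUCE exec type runs against the cluster; LOCAL is handy for testing.
            PigServer pig = new PigServer(ExecType.MAPREDUCE);

            // Hypothetical input layout: tab-separated (id, amount) records imported from the RDBMS.
            pig.registerQuery("raw = LOAD '/data/imported/orders' AS (id:chararray, amount:double);");
            pig.registerQuery("grouped = GROUP raw BY id;");
            pig.registerQuery("totals = FOREACH grouped GENERATE group AS id, SUM(raw.amount) AS total;");

            // Store the aggregated results where the Hive external table expects them.
            pig.store("totals", "/data/processed/order_totals");

            pig.shutdown();
        }
    }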


Description

Team Lead

  • Moved all crawl data flat files generated from various retailers to HDFS for further processing.
  • Wrote Apache Pig scripts to process the HDFS data.
  • Created Hive tables to store the processed results in a tabular format.
  • Developed Sqoop scripts to enable the interaction between Pig and the MySQL database.
  • Involved in resolving Hadoop-related JIRAs.
  • Developed Unix shell scripts for creating reports from the Hive data.
  • Completely involved in the requirement analysis phase.


Description

  • Participating in all required client meetings to gather requirements and provide status on the ongoing project.
  • Helping the development team with any technical issues and making sure the project continues smoothly without gaps.
  • Coding/development to meet deadlines for enhancements/projects/defects.
  • Translating business requirements into design documents; preparing test cases and review checklists.
  • Created a re-usable class library that can be used for future enhancements.
  • Developed the graphical user interface using C# Windows Forms, and the business and data layers using C#, SQL Server and SQL CE.
  • Providing estimates/revised estimates for tasks/modules.
  • Delegating the assigned coding/development work to the offshore/onshore development team and coordinating with them to get quality work on a priority basis.
  • Identifying any client issues/concerns and coordinating with the offshore/onshore team to resolve them as quickly as possible.
