About Me
4 years of experience in the IT industry as a Hadoop Developer, including 3 years leveraging the Hadoop ecosystem to glean meaningful insights from semi-structured and unstructured data, working on Big Data tools such as HIVE, SQOOP, SQL, SPARK, SPARK SQL, SPARK DATAFRAMES, SPARK RDD, and YARN, with a good understanding of HDFS.
Portfolio Projects
Description
The project has three important Big Data Components:
• Moving data from a traditional RDBMS to HDFS (SQOOP)
• Storing live data from resellers in HDFS (Kafka with Spark Streaming; see the sketch below)
• Querying the data from HDFS using HIVE to provide useful analytical information for the decision makers (HIVE ultimately connected to QlikView)
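For illustration only, the Kafka-to-HDFS leg might look like the minimal PySpark sketch below. It assumes Structured Streaming with the Kafka connector package available on the cluster; the broker address, topic name, and HDFS paths are hypothetical placeholders, not details from the project.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

# Assumes the spark-sql-kafka connector is on the classpath.
spark = (SparkSession.builder
         .appName("reseller-stream-ingest")
         .getOrCreate())

# Read live reseller events from a Kafka topic as a streaming DataFrame.
# Broker and topic names below are placeholders.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "reseller_events")
          .load())

# Kafka delivers key/value as binary; keep the payload as a string column.
payload = events.select(col("value").cast("string").alias("event_json"))

# Persist the stream to HDFS as Parquet, with a checkpoint for recovery.
query = (payload.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/resellers/events")
         .option("checkpointLocation", "hdfs:///checkpoints/resellers")
         .start())

query.awaitTermination()
```

The landing directory written here would then be exposed to HIVE (and, downstream, QlikView) as an external table over the Parquet files.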
Roles and Responsibilities:
• Used SQOOP to import vast data from a traditional RDBMS into HDFS (a rough sketch follows this list).
  - Involved in writing the import query for the incremental data on a scheduled basis.
  - Used Oozie to automate the Sqoop jobs.
• Worked on Hive queries for creating and querying HIVE tables to retrieve useful analytical information.
• Familiar with the full pipeline for capturing live data from resellers.
• Monitored data pipelines to ensure complete transfer of data.
• Developed scripts and batch jobs to schedule various Hadoop programs.
• Wrote documentation describing program development, logic, coding, testing, changes, and corrections.
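As a rough sketch of the incremental import described above, the Sqoop command below is wrapped in Python purely for illustration; in the project the job was scheduled through Oozie, and the connection string, table, column, and path names here are all hypothetical.

```python
import subprocess

# Hypothetical connection string, table, and column names -- placeholders only.
sqoop_cmd = [
    "sqoop", "import",
    "--connect", "jdbc:oracle:thin:@db-host:1521/ORCL",
    "--username", "etl_user",
    "--password-file", "hdfs:///user/etl/.db_password",
    "--table", "SALES_ORDERS",
    "--target-dir", "/data/warehouse/sales_orders",
    # Incremental append: only pull rows whose ORDER_ID exceeds the last value
    # captured by the previous run (a value Oozie would pass in practice).
    "--incremental", "append",
    "--check-column", "ORDER_ID",
    "--last-value", "1500000",
    "--num-mappers", "4",
]

# Run the import and fail loudly if Sqoop returns a non-zero exit code.
subprocess.run(sqoop_cmd, check=True)
```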
Description
Roles and Responsibilities:
• Analyzed the business requirements and functional specifications.
• Extracted data from Oracle databases and spreadsheets, staged it in a single place, and applied business logic to load it into the central Oracle database.
• Used Informatica PowerCenter 10.1.1 for extraction, transformation, and loading (ETL) of data into the data warehouse.
• Extensively used transformations such as Router, Aggregator, Joiner, Expression, Lookup, Update Strategy, Sequence Generator, and Stored Procedure.
• Developed complex mappings in Informatica to load data from various sources.
• Implemented performance-tuning logic on targets, sources, mappings, and sessions to provide maximum efficiency and performance.
• Parameterized the mappings and increased their reusability.
• Used the Informatica PowerCenter Workflow Manager to create sessions, workflows, and batches to run with the logic embedded in the mappings.
• Used PL/SQL procedures with the Informatica mappings for process control in the incremental load (see the sketch after this list).
• Created ETL exception reports and validation reports after the data was loaded into the warehouse database.
• Wrote documentation describing program development, logic, coding, testing, changes, and corrections.
• Followed Informatica recommendations, methodologies, and best practices.
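The incremental-load process control mentioned above typically follows a watermark pattern: read the timestamp of the last successful load from a control table, pull only rows changed since then, and advance the watermark after loading. The Python/cx_Oracle sketch below illustrates that pattern only; the actual logic lived in Informatica mappings and PL/SQL procedures, and every table, column, and connection detail shown is hypothetical.

```python
import cx_Oracle  # assumes the Oracle client driver is installed

# Hypothetical DSN, credentials, and table names -- placeholders only.
conn = cx_Oracle.connect(user="etl_user", password="secret",
                         dsn="db-host:1521/ORCL")
cur = conn.cursor()

# 1. Read the watermark left by the previous run from a control table.
cur.execute("SELECT last_load_ts FROM etl_control WHERE mapping_name = :m",
            m="sales_daily_load")
last_load_ts = cur.fetchone()[0]

# 2. Pull only the rows changed since that watermark (the incremental set).
cur.execute("""
    SELECT order_id, customer_id, amount, updated_at
      FROM sales_orders
     WHERE updated_at > :ts
""", ts=last_load_ts)
rows = cur.fetchall()

# 3. Load the delta into the warehouse target, then advance the watermark so
#    the next run starts where this one stopped.
cur.executemany(
    "INSERT INTO dw_sales_orders (order_id, customer_id, amount, updated_at) "
    "VALUES (:1, :2, :3, :4)", rows)
cur.execute("UPDATE etl_control SET last_load_ts = SYSTIMESTAMP "
            "WHERE mapping_name = :m", m="sales_daily_load")
conn.commit()
```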
25,028.5 USD / year (expected)
Start Date / Notice Period end date: 2020-02-19