About Me
A hard-working, motivated professional with over 11 years of experience, seeking a position as a Data Warehouse consultant. Possesses a wide variety of experience, including managing, developing and implementing successful and challenging projects. W...
Skills
Portfolio Projects
Description
Project : DATALAKE
Software : Informatica 10, Informatica Cloud, AWS, Shell script, Python script, AWS Lambda
Database : Oracle, Redshift, DB2, SQL Server, SAP, Hive
Role : Team Lead/Senior Software Engineer
Project Description: -
Schneider maintains data in several systems such as SAP, Oracle, DB2 and flat files. They want to bring all the data into the AWS cloud, where users can analyse the data by source system. The existing data warehouse is gradually moving into AWS (the Redshift database). From the Redshift database the reporting team builds reports according to user requirements.
Contribution/Role:
Role: Team Lead and Developer
- Working on an agile project and distributing work based on priority
- Acting as scrum master and coordinating with the product manager
- Allocating Jira tasks to developers based on priority
- Designed and implemented workflows using ETL and Unix scripting to perform data ingestion on the AWS platform
- Extracted data from the SAP source system as flat files and loaded it into the Redshift database
- Migrated some of the source-system data using DMS
- Used Informatica Cloud to move data from source systems directly into the AWS S3 file system
- Loaded data into dimensions and facts
- Interacted with source-system teams to gather the information required for tasks
- Created Unix scripts to push data to the S3 bucket
- Interacted with source systems to bring their data into the AWS environment
- Created a Unix script to fetch files from an external SFTP server into our Unix environment
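The S3 push step above can be sketched as a small Python wrapper around the AWS CLI. This is an illustrative sketch only; the bucket, prefix and path names are placeholders, not the project's real values:

```python
import subprocess

def build_s3_push_cmd(local_path: str, bucket: str, prefix: str) -> list:
    """Build the AWS CLI command used to push an extract file to S3.

    Bucket and prefix are hypothetical placeholders for illustration.
    """
    filename = local_path.rsplit("/", 1)[-1]
    return ["aws", "s3", "cp", local_path, f"s3://{bucket}/{prefix}/{filename}"]

def push_to_s3(local_path: str, bucket: str, prefix: str) -> None:
    # check=True raises CalledProcessError on a failed upload,
    # mirroring a shell script's `set -e` behaviour.
    subprocess.run(build_s3_push_cmd(local_path, bucket, prefix), check=True)
```

In practice the same command could live in the Unix script itself; wrapping it in Python just makes the command construction testable.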
AWS Activities:
- Created tables in the Redshift database and loaded them using the COPY command
- Created SNS notifications for load failures using a Lambda Python function
- Created Redshift partitions for the data based on user requirements
- Unloaded data from Redshift tables or Redshift Spectrum based on user requirements
- Created Python scripts to load data into Redshift tables
- Moved data from source systems to the Redshift database using AWS DMS (Data Migration Service)
- Created triggers/events using AWS Lambda (Python) to copy files from the S3 bucket into Redshift tables
- Tuned the performance of the fact-table load covering 8 years of data in Redshift
- Completed a POC on AWS Glue
- Have working knowledge of PySpark
- Configured DMS jobs to load data from source systems into Redshift tables
- Created partitions on Redshift Spectrum tables as required
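The Lambda-triggered copy from S3 into Redshift described above might look roughly like the sketch below. The key-to-table naming convention, IAM role ARN and CSV options are assumptions for illustration, not the project's actual configuration, and the database call itself is left as a stub:

```python
def build_copy_sql(table: str, bucket: str, key: str, iam_role: str) -> str:
    """Compose a Redshift COPY statement for a file landed in S3.

    The IAM role ARN and format options are illustrative placeholders.
    """
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS CSV IGNOREHEADER 1"
    )

def lambda_handler(event, context):
    # Triggered by an S3 ObjectCreated event; one record per landed file.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        table = key.split("/")[0]  # assumed convention: first prefix = target table
        sql = build_copy_sql(
            table, bucket, key, "arn:aws:iam::123456789012:role/redshift-copy"
        )
        # execute_sql(sql)  -- would run via psycopg2 or the Redshift Data API
    return {"status": "ok"}
```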
Python:
- Created Python scripts to load data into Redshift tables
- Created a script to unload data from Redshift Spectrum and capture the rows changed between the current and previous runs
- Created Python scripts to vacuum tables per the business requirements
- Created a script using pandas to add an additional field to the files and load them into Redshift Spectrum
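The changed-data capture between the current and previous unloads could be sketched as a plain full-row diff of the two files. File layouts and names here are hypothetical; the real job unloaded both snapshots from Redshift Spectrum first:

```python
import csv

def changed_rows(previous_file: str, current_file: str) -> list:
    """Return rows present in the current extract but absent from the previous one.

    A simple full-row comparison; key-based change detection would need
    knowledge of the real table keys, which is not shown here.
    """
    with open(previous_file, newline="") as f:
        previous = {tuple(row) for row in csv.reader(f)}
    with open(current_file, newline="") as f:
        return [row for row in csv.reader(f) if tuple(row) not in previous]
```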
IICS: (Informatica Cloud)
- Read files from the Box application and moved them to the AWS S3 bucket using Informatica Cloud mappings, tasks and a Unix script
- Extracted data from the SQL Server database to the Redshift database using Informatica Cloud
- Created simple Informatica Cloud (IICS) jobs to read data from files and load it into the S3 bucket
- Created linear and parallel tasks as per the requirements
- Created dependent jobs between Informatica Cloud and Informatica PowerCenter
- Used parameters in mappings as per the requirements
- Created replication tasks to move data from one source system to another and scheduled the jobs
Description
Project : GD Metrics AND MSI Metrics
Software : Informatica 10
Database : Oracle
Role : Team Lead/Senior Software Engineer
Project Description: -
Schneider maintains data in several source systems such as Remedy, Planview, Trust IPO and flat files. They want to bring all the data into an Oracle database. The reports/dashboards are implemented in the ITSM domain, especially in the areas of Incident, Service Request, Problem and Change Management, Asset Management, etc. These reports/dashboards are used to measure the performance of the support teams at a global level and to implement a charge-back mechanism for vendors based on SLA calculations.
Contribution/Role:
Role: Team Lead and Developer
- Involved in preparing the mapping specifications based on the business rules
- Updated the daily status report
- Built dynamic parameter file generation
- Involved in preparing the Unix scripts
- Created mappings and workflows to generate 882 files for the various tables required
- Worked with multiple databases and flat files, and created mapplets to reuse transformation logic
- Created stored procedures based on the requirements
- Created reusable components wherever the same business logic applied
- Loaded data into dimensions and facts
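Dynamic parameter file generation for PowerCenter typically renders a `[Folder.WF:Workflow]` section per workflow, with one `name=value` line per parameter. A minimal sketch, with illustrative folder, workflow and parameter names (the real values came from the project's own configuration):

```python
def build_param_file(folder: str, workflow: str, params: dict) -> str:
    """Render a PowerCenter parameter file section for one workflow.

    All names below are placeholders, not the project's real objects.
    """
    lines = [f"[{folder}.WF:{workflow}]"]
    lines += [f"{name}={value}" for name, value in params.items()]
    return "\n".join(lines) + "\n"
```

A scheduler script would write this string to the path referenced by the workflow's `$PMMergeSessParamFile`-style setting before starting the run.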
Description
Client : DWP
Project : UCMI
Software : Informatica 9.1, Informatica Data Replication, Informatica DataExplore
Database : Oracle
Role : Team Lead/Senior Software Engineer
Project Description: -
The Universal Credit (UC) Programme has been established to reform the system of benefits and tax credits for people of working age. It aims to improve the incentives to enter work, reduce benefit dependency and simplify administration while continuing to provide appropriate levels of support, especially for people with additional needs.
Management Information (MI) supports longer-term tactical and strategic decision making based on a more holistic view of the business; for this it is necessary to summarise, combine and integrate data, often of quite disparate nature and drawn from different parts of the business, into a coherent whole. This allows the interaction of the different business areas to be quantified, and for decisions to take account of the wider impact.
Contribution/Role:
Role: Team Lead and Developer
- Involved in preparing and reviewing the ETL specification
- Involved in preparing and reviewing the mapping sheets
- Involved in estimation (WBE)
- Interacted with the client for clarifications
- Created the reference data hub
- Delivered a POC to prove the requirements for IDR and IDE
IDR Activities:
- Set up the network-mode environment
- Set up connectivity for the source and target databases
- Mapped tables and columns to move data from the source to the target database
- Assigned the path where the archive log files of the source Oracle database reside
- Created, imported, edited and exported configuration files
- Performed the initial sync, extract and apply using the SCN (System Change Number) concept
- Used SQL mode for the data replication
- Scheduled Extractor, Applier and InitialSync in the IDR scheduler
- Created the recovery tables
- Assigned the permissions required at the source database level to read the data, such as enabling archive log mode, enabling minimal global supplemental logging, and granting the Oracle user privileges like SELECT, ALTER, VIEW, RESOURCE and ALTER SESSION
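The source-side Oracle prerequisites listed above boil down to a handful of SQL statements. A hedged sketch, collected here as plain strings (the user name is a placeholder and the exact privilege set varies by IDR version and site policy):

```python
# Oracle-side prerequisites for log-based replication, as illustrative SQL.
# "idr_user" is a placeholder; ARCHIVELOG mode must be enabled while the
# database is mounted but not open.
SOURCE_SETUP_SQL = [
    "ALTER DATABASE ARCHIVELOG",
    "ALTER DATABASE ADD SUPPLEMENTAL LOG DATA",          # minimal supplemental logging
    "GRANT CREATE SESSION, ALTER SESSION TO idr_user",
    "GRANT SELECT ANY TABLE, SELECT ANY DICTIONARY TO idr_user",
]

def setup_script() -> str:
    """Join the statements into a script a DBA could review before running."""
    return ";\n".join(SOURCE_SETUP_SQL) + ";\n"
```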
IDE Activities:
- Created the folders for the project
- Imported Physical Data Objects (PDOs) such as tables and flat files
- Created profiles for the PDO objects
- Created column-based profiles
- Prepared scorecards for the profiled data
- Assigned thresholds for the scorecards
- Drilled down into the profiled data
- Created rule-based profiles (business rules) and filters
- Created reference data from the profiled data
PowerCenter:
- Built dynamic parameter file generation
- Involved in preparing the Unix scripts
- Created mappings and workflows to generate 882 files for the various tables required
- Worked with multiple databases and flat files, and created mapplets to reuse transformation logic
- Developed mappings using transformations such as Source Qualifier, Filter, Aggregator, Expression, Connected and Unconnected Lookup, Sequence Generator, Router and Update Strategy
Description
Client : GE
Project : GECC
Software : Informatica 9.1
Graphic Tool : Graphviz 2.28 ver (To design the dependency of the workflows)
Database : Teradata 13.10 ver, Teradata SQL Assistant 13
Role : Senior Software Engineer
Project Description: -
GE Fleet Services is one of the largest fleet management companies in the world. GE Fleet provides fleet leasing and management services that meet the needs of companies with all sizes of car and truck fleets. The goal was to understand the EDW and implement a data model that integrates data from multiple existing data sources, including the Fleet mainframe, the Fleet customer database (“CDB”), Telematics (PUnit), Oracle, Siebel, and Collision Experts International (third-party accident data, “CEI”), into the existing GECA WH, following CUSTOMER data model standards.
Created source-to-target mappings and related business logic to support population of the target instantiated physical model (i.e., the updated version of the GECA WH).
Designed, developed, tested and deployed extract, transfer and load (“ETL”) routines and components that schedule, extract, cleanse and transform data as per the source-to-target specifications.
Defined and documented the business, lineage and technical metadata in all pertinent aspects of the project, including, but not limited to, data modelling, ETL, data quality, access layer, history, and audit trail/controls.
Created mapping sheets and related business rules/logic to support population of the target instantiated physical model (i.e., the updated version of the GECA WH).
Contribution/Role:
Role: Team Lead and Developer
- Led the mapper team
- Involved in preparing the mapping specifications based on the business rules
- Reviewed the mapping specifications and reported updates to the PL/PM
- Interacted with the client
- Updated the daily status report
- Worked on development and review activities
- Involved in preparing the ETL design using PowerCenter Mapping Architect for Visio
- Worked with different SOR systems such as COBOL copybooks, Oracle, flat files, etc.
- Involved in data profiling of the mainframe file using Teradata Profiler
- Involved in preparing the control framework for scheduling the jobs
Description
Sony has implemented PeopleSoft HRMS globally through Hewitt. Every day the batch jobs for this client need to be run and monitored on the production servers. These jobs are categorized as security jobs, Application Program Engines and mappings. The offshore batch monitoring team monitors the jobs scheduled through an automated process on the Sony production servers and runs them to success without failures. In case of failure, the required actions are taken immediately, the issue is escalated to the onsite support teams, and remedial action is taken to correct the errors.
Description
Wachovia Corporation is a diversified financial services company that provides a broad range of banking, asset management, wealth management, and corporate and investment banking products and services. It is one of the largest providers of financial services in the United States, operating as Wachovia Bank in 15 states from Connecticut to Florida and west to Texas. It provides global services through more than 40 offices around the world.