About Me
A hard-working, motivated professional with over 11 years of experience, seeking a position as a Data Warehouse consultant. Possesses a wide variety of experience, including managing, developing and implementing successful and challenging projects. W...
Skills
Portfolio Projects
Description
Project : DATALAKE
Software : Informatica 10, Informatica Cloud, AWS, Shell script, Python script, AWS Lambda
Database : Oracle, Redshift, DB2, SQL Server, SAP, Hive
Role : Team Lead/Senior Software Engineer
Project Description: -
Schneider maintains data in several systems such as SAP, Oracle, DB2 and flat files. They want to bring all the data into the AWS cloud, where users can analyse the data by source system. The existing data warehouse is gradually moving into AWS (the Redshift database). From the Redshift database the reporting team builds reports according to user requirements.
Contribution/Role:
Role: Team Lead and Developer
- Working on an agile project and distributing work based on priority
- Acting as scrum master and coordinating with the product manager
- Allocating Jira tasks to developers based on priority
- Designed and implemented workflows using ETL and Unix scripting to perform data ingestion on the AWS platform
- Extracted data from the SAP source system as flat files and loaded it into the Redshift database
- Migrated some of the source-system data using DMS
- Used Informatica Cloud to move data from source systems directly into the AWS S3 file system
- Loaded data into dimensions and facts
- Interacted with source-system teams to gather the information required for tasks
- Created Unix scripts to push data to the S3 bucket
- Interacted with source systems to bring their data into the AWS environment
- Created a Unix script to fetch files from an external SFTP server into our Unix environment
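The S3 push step above can be sketched as a small Python wrapper around the AWS CLI. This is an illustrative sketch only; the bucket, prefix and path names are placeholders, not the project's real values:

```python
import subprocess

def build_s3_push_cmd(local_path: str, bucket: str, prefix: str) -> list:
    """Build the AWS CLI command used to push an extract file to S3.

    Bucket and prefix are hypothetical placeholders for illustration.
    """
    filename = local_path.rsplit("/", 1)[-1]
    return ["aws", "s3", "cp", local_path, f"s3://{bucket}/{prefix}/{filename}"]

def push_to_s3(local_path: str, bucket: str, prefix: str) -> None:
    # check=True raises CalledProcessError on a failed upload,
    # mirroring a shell script's `set -e` behaviour.
    subprocess.run(build_s3_push_cmd(local_path, bucket, prefix), check=True)
```

In practice the same command could live in the Unix script itself; wrapping it in Python just makes the command construction testable.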
AWS Activities:
- Created tables in the Redshift database and loaded them using the COPY command
- Created SNS notifications for load failures using a Lambda Python function
- Created Redshift partitions for the data based on user requirements
- Unloaded data from Redshift tables or Redshift Spectrum based on user requirements
- Created Python scripts to load data into Redshift tables
- Moved data from source systems to the Redshift database using AWS DMS (Data Migration Service)
- Created triggers/events using AWS Lambda (Python) to copy files from the S3 bucket into Redshift tables
- Tuned the performance of the fact-table load covering 8 years of data in Redshift
- Completed a POC on AWS Glue
- Have working knowledge of PySpark
- Configured DMS jobs to load data from source systems into Redshift tables
- Created partitions on Redshift Spectrum tables as required
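The Lambda-triggered copy from S3 into Redshift described above might look roughly like the sketch below. The key-to-table naming convention, IAM role ARN and CSV options are assumptions for illustration, not the project's actual configuration, and the database call itself is left as a stub:

```python
def build_copy_sql(table: str, bucket: str, key: str, iam_role: str) -> str:
    """Compose a Redshift COPY statement for a file landed in S3.

    The IAM role ARN and format options are illustrative placeholders.
    """
    return (
        f"COPY {table} FROM 's3://{bucket}/{key}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS CSV IGNOREHEADER 1"
    )

def lambda_handler(event, context):
    # Triggered by an S3 ObjectCreated event; one record per landed file.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        table = key.split("/")[0]  # assumed convention: first prefix = target table
        sql = build_copy_sql(
            table, bucket, key, "arn:aws:iam::123456789012:role/redshift-copy"
        )
        # execute_sql(sql)  -- would run via psycopg2 or the Redshift Data API
    return {"status": "ok"}
```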
Python:
- Created Python scripts to load data into Redshift tables
- Created a script to unload data from Redshift Spectrum and capture the rows changed between the current and previous runs
- Created Python scripts to vacuum tables per the business requirements
- Created a script using pandas to add an additional field to the files and load them into Redshift Spectrum
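The changed-data capture between the current and previous unloads could be sketched as a plain full-row diff of the two files. File layouts and names here are hypothetical; the real job unloaded both snapshots from Redshift Spectrum first:

```python
import csv

def changed_rows(previous_file: str, current_file: str) -> list:
    """Return rows present in the current extract but absent from the previous one.

    A simple full-row comparison; key-based change detection would need
    knowledge of the real table keys, which is not shown here.
    """
    with open(previous_file, newline="") as f:
        previous = {tuple(row) for row in csv.reader(f)}
    with open(current_file, newline="") as f:
        return [row for row in csv.reader(f) if tuple(row) not in previous]
```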
IICS: (Informatica Cloud)
- Read files from the Box application and moved them to the AWS S3 bucket using Informatica Cloud mappings, tasks and a Unix script
- Extracted data from the SQL Server database to the Redshift database using Informatica Cloud
- Created simple Informatica Cloud (IICS) jobs to read data from files and load it into the S3 bucket
- Created linear and parallel tasks as per the requirements
- Created dependent jobs between Informatica Cloud and Informatica PowerCenter
- Used parameters in mappings as per the requirements
- Created replication tasks to move data from one source system to another and scheduled the jobs
Description
Project : GD Metrics AND MSI Metrics
Software : Informatica 10
Database : Oracle
Role : Team Lead/Senior Software Engineer
Project Description: -
Schneider maintains data in several source systems such as Remedy, Planview, Trust IPO and flat files. They want to bring all the data into an Oracle database. The reports/dashboards are implemented in the ITSM domain, especially in the areas of Incident, Service Request, Problem and Change Management, Asset Management, etc. These reports/dashboards are used to measure the performance of the support teams at a global level and to implement a charge-back mechanism for vendors based on SLA calculations.
Contribution/Role:
Role: Team Lead and Developer
- Involved in preparing the mapping specifications based on the business rules
- Updated the daily status report
- Built dynamic parameter file generation
- Involved in preparing the Unix scripts
- Created mappings and workflows to generate 882 files for the various tables required
- Worked with multiple databases and flat files, and created mapplets to reuse transformation logic
- Created stored procedures based on the requirements
- Created reusable components wherever the same business logic applied
- Loaded data into dimensions and facts
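Dynamic parameter file generation for PowerCenter typically renders a `[Folder.WF:Workflow]` section per workflow, with one `name=value` line per parameter. A minimal sketch, with illustrative folder, workflow and parameter names (the real values came from the project's own configuration):

```python
def build_param_file(folder: str, workflow: str, params: dict) -> str:
    """Render a PowerCenter parameter file section for one workflow.

    All names below are placeholders, not the project's real objects.
    """
    lines = [f"[{folder}.WF:{workflow}]"]
    lines += [f"{name}={value}" for name, value in params.items()]
    return "\n".join(lines) + "\n"
```

A scheduler script would write this string to the path referenced by the workflow's `$PMMergeSessParamFile`-style setting before starting the run.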
Description
Client : DWP
Project : UCMI
Software : Informatica 9.1, Informatica Data Replication, Informatica DataExplore
Database : Oracle
Role : Team Lead/Senior Software Engineer
Project Description: -
The Universal Credit (UC) Programme has been established to reform the system of benefits and tax credits for people of working age. It aims to improve the incentives to enter work, reduce benefit dependency and simplify administration while continuing to provide appropriate levels of support, especially for people with additional needs.
Management Information (MI) supports longer-term tactical and strategic decision making based on a more holistic view of the business; for this it is necessary to summarise, combine and integrate data, often of quite disparate nature and drawn from different parts of the business, into a coherent whole. This allows the interaction of the different business areas to be quantified, and for decisions to take account of the wider impact.
Contribution/Role:
Role: Team Lead and Developer
- Involved in preparing and reviewing the ETL specification
- Involved in preparing and reviewing the mapping sheets
- Involved in estimation (WBE)
- Interacted with the client for clarifications
- Created the reference data hub
- Delivered a POC to prove the requirements for IDR and IDE
IDR Activities:
- Set up the network-mode environment
- Set up connectivity for the source and target databases
- Mapped tables and columns to move data from the source to the target database
- Assigned the path where the archive log files of the source Oracle database reside
- Created, imported, edited and exported configuration files
- Performed the initial sync, extract and apply using the SCN (System Change Number) concept
- Used SQL mode for the data replication
- Scheduled Extractor, Applier and InitialSync in the IDR scheduler
- Created the recovery tables
- Assigned the permissions required at the source database level to read the data, such as enabling archive log mode, enabling minimal global supplemental logging, and granting the Oracle user privileges like SELECT, ALTER, VIEW, RESOURCE and ALTER SESSION
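The source-side Oracle prerequisites listed above boil down to a handful of SQL statements. A hedged sketch, collected here as plain strings (the user name is a placeholder and the exact privilege set varies by IDR version and site policy):

```python
# Oracle-side prerequisites for log-based replication, as illustrative SQL.
# "idr_user" is a placeholder; ARCHIVELOG mode must be enabled while the
# database is mounted but not open.
SOURCE_SETUP_SQL = [
    "ALTER DATABASE ARCHIVELOG",
    "ALTER DATABASE ADD SUPPLEMENTAL LOG DATA",          # minimal supplemental logging
    "GRANT CREATE SESSION, ALTER SESSION TO idr_user",
    "GRANT SELECT ANY TABLE, SELECT ANY DICTIONARY TO idr_user",
]

def setup_script() -> str:
    """Join the statements into a script a DBA could review before running."""
    return ";\n".join(SOURCE_SETUP_SQL) + ";\n"
```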
IDE Activities:
- Created the folders for the project
- Imported Physical Data Objects (PDOs) such as tables and flat files
- Created profiles for the PDO objects
- Created column-based profiles
- Prepared scorecards for the profiled data
- Assigned thresholds for the scorecards
- Drilled down into the profiled data
- Created rule-based profiles (business rules) and filters
- Created reference data from the profiled data
PowerCenter:
- Built dynamic parameter file generation
- Involved in preparing the Unix scripts
- Created mappings and workflows to generate 882 files for the various tables required
- Worked with multiple databases and flat files, and created mapplets to reuse transformation logic
- Developed mappings using transformations such as Source Qualifier, Filter, Aggregator, Expression, Connected and Unconnected Lookup, Sequence Generator, Router and Update Strategy
Description
Client : GE
Project : GECC
Software : Informatica 9.1
Graphic Tool : Graphviz 2.28 ver (To design the dependency of the workflows)
Database : Teradata 13.10 ver, Teradata SQL Assistant 13
Role : Senior Software Engineer
Project Description: -
GE Fleet Services is one of the largest fleet management companies in the world. GE Fleet provides fleet leasing and management services that meet the needs of companies with all sizes of car and truck fleets. The goal was to understand the EDW and implement a data model that integrates data from multiple existing data sources, including the Fleet mainframe, the Fleet customer database (“CDB”), Telematics (PUnit), Oracle, Siebel, and Collision Experts International (third-party accident data, “CEI”), into the existing GECA WH, following CUSTOMER data model standards.
Created source-to-target mappings and related business logic to support population of the target instantiated physical model (i.e., the updated version of the GECA WH).
Designed, developed, tested and deployed extract, transfer and load (“ETL”) routines and components that schedule, extract, cleanse and transform data as per the source-to-target specifications.
Defined and documented the business, lineage and technical metadata in all pertinent aspects of the project, including, but not limited to, data modelling, ETL, data quality, access layer, history, and audit trail/controls.
Created mapping sheets and related business rules/logic to support population of the target instantiated physical model (i.e., the updated version of the GECA WH).
Contribution/Role:
Role: Team Lead and Developer
- Led the mapper team
- Involved in preparing the mapping specifications based on the business rules
- Reviewed the mapping specifications and reported updates to the PL/PM
- Interacted with the client
- Updated the daily status report
- Worked on development and review activities
- Involved in preparing the ETL design using PowerCenter Mapping Architect for Visio
- Worked with different SOR systems such as COBOL copybooks, Oracle, flat files, etc.
- Involved in data profiling of the mainframe file using Teradata Profiler
- Involved in preparing the control framework for scheduling the jobs
Description
Sony has implemented PeopleSoft HRMS globally through Hewitt. Every day the batch jobs for this client need to be run and monitored on the production servers. These jobs are categorized as security jobs, Application Program Engines and mappings. The offshore batch monitoring team monitors the jobs scheduled through an automated process on the Sony production servers and runs them to success without failures. In case of failure, the required actions are taken immediately, the issue is escalated to the onsite support teams, and remedial action is taken to correct the errors.
Description
Wachovia Corporation is a diversified financial services company that provides a broad range of banking, asset management, wealth management, and corporate and investment banking products and services. It is one of the largest providers of financial services in the United States, operating as Wachovia Bank in 15 states from Connecticut to Florida and west to Texas. It provides global services through more than 40 offices around the world.