About Me
Analyst with over 5.9 years of experience in Data Science, including almost 2 years in machine learning and statistical learning using R and Python. Experienced in handling and processing massive data sets. Working in the Retail domain on a voice-commanded product that delivers highly valuable visual and voice-driven metrics to end customers. Finding associations between products using clustering algorithms to provide a better Basket Analyzer for retailers and competitive insight for product management. Implementing hierarchical clustering for CPG users to prepare their natural order, helping CPGs make decisions based on their sales. Implemented linear regression with Ridge and Lasso regularization for one product's sales. Currently exploring text analysis and classification algorithms with image processing to help an insurance client automate accidental-claims handling using logistic regression. Helped decide the data architecture and data flow for the project and recommended an MPP approach, together with Elasticsearch, to resolve major data and query performance issues. Worked on various Microsoft ADF pipelines and SSIS packages to automate ETL processes. Created various automated SQL and PowerShell scripts to support the data flow in production. Worked on MPP- and SMP-architecture databases and data warehouses on previous projects, including Microsoft APS (commonly known as Azure PDW). Extensive experience in performance tuning of SQL queries. Delivered end-to-end MSBI projects within tight timelines. Experienced with tools such as TFS, JIRA and SVN for source control and code maintenance. Trained multiple resources with expertise in 1010Data and SQL. Learning various Deep Learning algorithms with R and Python.
Very eager to learn new technologies and to contribute to the product at the highest level. Microsoft-certified data warehouse and SQL data modelling expert; cleared exams 70-767 and 70-768 to earn the certification.
Skills
(Skill names were not captured in this page export; the listed entries ranged from 1 year / Beginner to 5 years / Expert.)
Portfolio Projects
Description
- Creating various data models for product projects
- Creating various warehouse designs and data processes to avoid future issues
- Working on classification of data marts and using them for advanced analytics; supporting operational teams during the UAT and rollout phases
- Using the most advanced technologies available in the market to support our client
- A mixture of Azure SQL and Hadoop on Azure helped the process to be implemented smoothly with minimum downtime
- Working on creating various analyses of different descriptive derivatives
- Tuning various stored procedures and queries in SQL and Spark SQL to deliver a strong user experience in terms of data rendering and data loading
- Creating various ADF pipelines and shell scripts to automate data flow throughout the system
- Working on various statistical and ad-hoc reports to support client reporting and requirements
- Developed the project using Agile methodologies and Test-Driven Development
- Maintaining builds in different environments
- Involved in performance optimization of automation code
- Participating in scrum ceremonies (grooming, sprint planning, retrospectives, daily stand-ups, etc.)
Description
- Creating various models in R for finding associations between products sold
- Creating hierarchical clustering of products for the Basket Analyzer
- Working on classification of data marts and using them for advanced analytics; supporting operational teams during the UAT and rollout phases
- A few of the heaviest calculations were managed using Spark SQL in Databricks
- A mixture of Azure SQL and Hadoop on Azure helped the process to be implemented smoothly with minimum downtime
- Working on creating various analyses of different descriptive derivatives
- Analyzing and using the Intelligence Module to find the closest match for keywords asked by the end user
- Worked on comparing products in terms of inventory and stock management, finding different patterns using Python as the base language
- Responsible for analyzing the existing application and developing automation code using Python scripts
- Developing various Spark SQL queries that return the lowest-level data in under a minute
- Tuning various stored procedures and queries in SQL and Spark SQL to deliver a strong user experience in terms of data rendering and data loading
- Creating various ADF pipelines and shell scripts to automate data flow throughout the system
- Creating SSIS and SQL packages to automate data loading and one-time activities
- Developed the project using Agile methodologies and Test-Driven Development
- Maintaining builds in different environments
- Involved in performance optimization of automation code
- Participating in scrum ceremonies (grooming, sprint planning, retrospectives, daily stand-ups, etc.)
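As an illustration only (this is not the product's code, and the product names and basket matrix below are made up): hierarchical clustering of products by co-purchase similarity, in the spirit of the Basket Analyzer work described above.

```python
# Illustrative sketch: cluster products that tend to be bought together.
# All data here is simulated; product names are hypothetical.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(42)
products = ["bread", "butter", "jam", "soda", "chips", "salsa"]
# Hypothetical basket matrix: rows = transactions, cols = products (0/1).
baskets = (rng.random((200, len(products))) < 0.4).astype(int)

# Co-occurrence-based distance: products bought together are "close".
co = baskets.T @ baskets                          # co-purchase counts
norm = np.sqrt(np.outer(co.diagonal(), co.diagonal()))
similarity = co / np.maximum(norm, 1)             # cosine-like similarity
distance = 1 - similarity

# Condensed upper-triangle distances feed scipy's average-linkage clustering.
iu = np.triu_indices(len(products), k=1)
Z = linkage(distance[iu], method="average")
labels = fcluster(Z, t=2, criterion="maxclust")   # cut tree into 2 groups
print(dict(zip(products, labels)))
```

A distance built from association-rule metrics such as lift could be substituted for the cosine-style similarity without changing the clustering step.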
Description
- Extraction of data (structured and unstructured) from the Linux servers
- Data cleaning: converting unstructured data into structured data in R, plus data validation
- Sampling based on product users, frequency of usage, age and gender
- Exploratory analysis of the sampled data in R
- Statistical analysis: (a) descriptive statistics on the samples; (b) correlation and PCA analysis to identify key strengths, areas of opportunity, weaknesses, and functional relationships between products and attributes; (c) fitting a model to estimate the maximum-selling SKUs (Stock Keeping Units) from the samples; (d) hypothesis testing (t-tests) on samples to identify behavioural changes before and after product launch
- Preparing storyboards and documentation for the case studies
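A minimal sketch of the sampling-and-testing steps above, using synthetic data and hypothetical column names (the original work was done in R; Python equivalents are shown here):

```python
# Illustrative sketch, not the project's code: descriptive stats, PCA,
# and a paired t-test on hypothetical before/after product-launch sales.
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Hypothetical sample: weekly sales of 8 SKUs, before and after a launch.
df = pd.DataFrame({
    "sku": [f"SKU{i}" for i in range(8)] * 10,
    "before": rng.normal(100, 10, 80),
    "after": rng.normal(105, 10, 80),
})

# (a) Descriptive statistics on the sample.
summary = df[["before", "after"]].describe()

# (b) PCA on standardized attributes to surface functional relationships.
X = StandardScaler().fit_transform(df[["before", "after"]])
explained = PCA(n_components=2).fit(X).explained_variance_ratio_

# (d) Paired t-test: did behaviour change after the launch?
t_stat, p_value = stats.ttest_rel(df["after"], df["before"])
print(f"explained variance: {explained.round(3)}, p-value: {p_value:.4f}")
```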
Description
- Analyzed the feasibility of existing processes for implementing predictive analysis
- ETL was performed through multiple iterations to achieve a usable XML data set, which was converted into tabular data and then supplied as input to our models in CSV format
- Creating models and doing R&D to find the most suitable model for acceptable results; the models were cross-checked through various methods to provide enough proof to use them
- Using large volumes of data to train, test and predict with the models created, and running supporting calculations
- Gathering requirements for a new product, IAN (Item Analyzer), for prescriptive as well as predictive analysis
- Item Analyzer requires a sales representation of all products, with a view to changing prices depending on geography, customer segmentation (created using association rules), time, and product groups (created using various classification algorithms)
- Initially developed in R, it has now moved to Databricks for Python and support for huge data processing through Spark
- Creating SSIS packages and ADF pipelines to move data from source to staging and then reuse it for development purposes
- Working on its data flow and integration with 1010Data
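One way the model cross-checking mentioned above could look, sketched with synthetic data (the actual models, features and validation methods are not described in this profile): k-fold cross-validation comparing the Ridge and Lasso regressions referenced earlier.

```python
# Illustrative sketch only: cross-validated comparison of two
# regularized regression models on synthetic data.
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 5))
# Hypothetical target: a sparse linear signal plus small noise.
y = X @ np.array([2.0, 0.0, -1.0, 0.0, 0.5]) + rng.normal(scale=0.1, size=300)

# Compare models by 5-fold cross-validated R^2.
for model in (Ridge(alpha=1.0), Lasso(alpha=0.01)):
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(type(model).__name__, round(float(scores.mean()), 3))
```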
Description
- Analyzed user specifications for workability, completeness and business flow
- Participated in defining System Design, Architecture and Specifications and performed project and task estimation
- Developed, deployed and monitored SSIS Packages for new ETL Processes and upgraded the existing DTS packages to SSIS for the on-going ETL Processes
- Monitored full/incremental/daily loads and supported all scheduled ETL jobs for batch processing
- Involved in technical decisions for business requirements; interaction with business analysts, the client team and the development team; capacity planning and upgrades of system configuration
- Involved in Installation, Configuration and Deployment of Reports using SSRS
- Scripted OLAP database backup and scheduled a daily backup using SQL Server agent job.
- Used SSIS to extract, clean, transform, integrate and load data into target staging and DWH databases
- Loaded data up to the DWH level from the staging environment
- BI testing for the reports/ETL implemented by team members
Description
MMCS required extensive research on various AI tools and third-party utilities to make it a better product as a whole. We are working on identifying gaps in processes, data modelling, data integration and data analytics to support various client activities with automation.