A dynamic Big Data professional with 5+ years of experience handling structured, semi-structured and unstructured data through big data components such as Hadoop, Hive, Pig, Power BI, Impala and Sqoop.
Data & Analytics
Networking & Security
• Onboarded chatbots and IVR NLP bots onto the Adobe CEX system.
• Wrote a script to identify users interacting with a bot under multiple user IDs and conversation IDs.
• Customised tables to support incremental loads.
• Fixed a production issue where a table was unable to capture NULL values.
• Built Sqoop jobs to fetch incremental data.
• Optimised the execution engine, improving performance by 80%.
• Earned the Adobe Green Belt certification.
• Certified in the Program Manager Software Security Practitioner Suite.
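The user-identification script above could be sketched in plain Python; the field names `user_id` and `conversation_id` are assumptions for illustration, since the original schema is not shown:

```python
from collections import defaultdict

def users_with_multiple_conversations(events):
    """Map each user to the set of conversation ids they appear under,
    then keep only users seen in more than one conversation."""
    by_user = defaultdict(set)
    for event in events:
        by_user[event["user_id"]].add(event["conversation_id"])
    return {user: ids for user, ids in by_user.items() if len(ids) > 1}

events = [
    {"user_id": "u1", "conversation_id": "c1"},
    {"user_id": "u1", "conversation_id": "c2"},
    {"user_id": "u2", "conversation_id": "c3"},
]
print(users_with_multiple_conversations(events))
```

In a real pipeline the grouping would run inside Hive or PySpark rather than in-memory, but the logic is the same: group events by user and flag users whose conversation-id set has more than one element.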
Bata, VijaySales, Carters, FabIndia, SRL.
• Wrote a PySpark script to change the schema of Hive ORC files.
• Fixed a count problem in PySpark in TagManager.
• Added reporting functionality that generates an XLS report and uploads it to an FTP server.
• Handled 5 projects for different vendors.
• Fixed a production issue where the database was not connecting to the application.
• Migrated projects from Java 5 to Java 8.
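A minimal sketch of the reporting step, assuming a CSV stand-in for the XLS file; the connection details and `remote_name` are placeholders, and `ftplib` is the Python standard-library FTP client:

```python
import csv
import io
from ftplib import FTP

def build_report(header, rows):
    """Render the report rows as CSV text (standing in for the XLS file)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

def upload_report(report_text, host, user, password, remote_name):
    """Push the rendered report onto the FTP server."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.storbinary(f"STOR {remote_name}", io.BytesIO(report_text.encode()))

report = build_report(("vendor", "orders"), [("Bata", 120), ("FabIndia", 95)])
# upload_report(report, "ftp.example.com", "user", "secret", "report.csv")
```

The upload call is commented out here because it needs live credentials; in the project it would run at the end of each report generation.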
Skills: Java (all versions), PySpark, Hive
• Analysed Vodafone customer data to offer benefits matched to each customer's product plan, using Hadoop, Hive and MySQL.
• Increased product efficiency by reducing the response time of the execution engine (Delta) by up to 70%, through implementing Impala and tuning Hive on Hadoop.
• Deployed Ziggo products onto the Ziggo server in the Netherlands using SQL.
• Created and applied various workarounds to fix project issues in Hive.
• Resolved tickets raised by the client.
• Analysed the Hive and Hadoop commands used in the project to speed up data retrieval, and wrote Unix shell scripts.
• Resolved production and testing issues in the client environment, such as MAPs not working, and improved data modelling in Hive.
• Worked on the connector queries responsible for fetching data from the server.
• Wrote Unix shell scripts on Linux to execute Hive commands and scheduled them.
• Made the Hive system automatically reclaim empty space so that the storage can be reused.
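The scheduled Hive-command wrapper above might look like this Python sketch. The `hive -e` flag is the standard Hive CLI way to run a single statement; the `binary` parameter is an assumption added here purely so the wrapper can be exercised without a Hive installation:

```python
import subprocess

def run_hql(statement, binary="hive"):
    """Run one HiveQL statement through the Hive CLI and return its stdout.

    `binary` is parameterised only for illustration/testing; in
    production it stays "hive" and the script is triggered by a
    scheduler such as cron.
    """
    result = subprocess.run(
        [binary, "-e", statement],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With `check=True`, a failing Hive statement raises `CalledProcessError`, so a cron-driven run fails loudly instead of silently skipping a load.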
Migration of an Oracle-Based Product to a Big Data Environment
In this project, we migrated an existing Oracle-based product to a big data environment. Duplicate and miscalculated data sent by the operators' hardware was removed before being stored in the data warehouse, so that data consistency is maintained and further operations (queries) can be performed on the data.
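The cleansing step described above can be sketched in plain Python; the field names `msisdn`, `timestamp` and `duration` are illustrative assumptions, as the real record layout is not shown:

```python
def clean_records(records):
    """Drop duplicates and miscalculated readings before warehouse load."""
    seen = set()
    cleaned = []
    for rec in records:
        key = (rec["msisdn"], rec["timestamp"])
        if key in seen:
            continue  # duplicate record re-sent by operator hardware
        if rec["duration"] < 0:
            continue  # miscalculated reading: negative call duration
        seen.add(key)
        cleaned.append(rec)
    return cleaned

raw = [
    {"msisdn": "49170", "timestamp": 1, "duration": 30},
    {"msisdn": "49170", "timestamp": 1, "duration": 30},  # duplicate
    {"msisdn": "49171", "timestamp": 2, "duration": -5},  # miscalculated
]
print(clean_records(raw))
```

At scale the same dedup-then-validate pass would run as a distributed job, but the invariant is identical: only unique, plausible records reach the warehouse.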
Real Time Analysis of Twitter Data
• The goal of the project was graphical analysis of a large volume of Twitter data using Hadoop.
• The project collected data through the Twitter API, sank it into the Hadoop File System, searched it using Elasticsearch and extracted metadata using Logstash.
• Kibana was used as a dashboard for graphical representation of the data.
Skills: Twitter API, Hadoop, Kibana, Logstash
Setup of Clustered and Single Node Hadoop
• The objective of the project was to set up and configure Hadoop on Ubuntu and CentOS.
• Tested the Hadoop setup by comparing the time it took to count the words in a book against a sequential search.
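The sequential baseline in that comparison is essentially a single-process word count, which could look like this sketch (the tokenising regex is an assumption):

```python
import re
from collections import Counter

def word_count(text):
    """Count word frequencies sequentially - the single-machine baseline
    the Hadoop job was timed against."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

counts = word_count("To be, or not to be: that is the question.")
print(counts["to"], counts["be"])  # 2 2
```

Timing this over the full book versus the Hadoop job shows where the cluster's parallelism starts to outweigh its job-startup overhead.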
Setup and Configuration of Mail Server (Red Hat Enterprise Linux based)
• The objective of the project was to set up and configure the mail server.
• The project involved configuring FTP, SMTP, Dovecot, Postfix, HTTP and SquirrelMail, plus setting up advanced user management and quota implementation.