About Me
A dynamic Big Data professional with 5+ years of experience working with structured, semi-structured and unstructured data through big data components such as Hadoop, Hive, Pig, Power BI, Impala and Sqoop.
Skills
Web Development
Software Testing
Data & Analytics
Programming Language
Development Tools
Database
Software Engineering
Others
Networking & Security
Operating System
Graphic Design
Portfolio Projects
Company
CEX
Description
• Onboarded the Chat Bots and IVR NLP Bots to the Adobe CEX system.
• Wrote a script to identify users who interact with the bot under different user IDs and conversation IDs.
• Customised tables to support incremental loads.
• Fixed an issue where a table was unable to capture NULL values in Production.
• Built Sqoop jobs to fetch incremental data.
• Optimised the execution engine, improving performance by 80%.
• Achieved Adobe Green Belt certification.
• Completed the Program Manager Software Security Practitioner Suite certification.
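The user-identification script described above could be sketched, in simplified form, as a plain-Python grouping of conversation records by a stable key; the field names here are hypothetical, not taken from the original script:

```python
from collections import defaultdict

# Hypothetical conversation records; in the real pipeline these would
# come from the chat / IVR bot logs.
records = [
    {"email": "a@example.com", "user_id": "u1", "conversation_id": "c1"},
    {"email": "a@example.com", "user_id": "u2", "conversation_id": "c2"},
    {"email": "b@example.com", "user_id": "u3", "conversation_id": "c3"},
]

def users_with_multiple_ids(records):
    """Group records by a stable key (email here) and report users
    that appear under more than one user_id / conversation_id."""
    by_key = defaultdict(lambda: {"user_ids": set(), "conversation_ids": set()})
    for r in records:
        by_key[r["email"]]["user_ids"].add(r["user_id"])
        by_key[r["email"]]["conversation_ids"].add(r["conversation_id"])
    return {k: v for k, v in by_key.items() if len(v["user_ids"]) > 1}

print(users_with_multiple_ids(records))
```

In a production setting the same grouping would typically run as a distributed job rather than in memory, but the matching logic is the same.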
Skills
Apache Sqoop
Company
Bata, VijaySales, Carters, FabIndia, SRL.
Description
• Wrote a PySpark script to change the schema of Hive ORC files.
• Fixed a count problem in PySpark in TagManager.
• Added reporting functionality to the project, placing the generated xls file on an FTP server.
• Handled 5 projects for different vendors.
• Fixed a production issue where the database was not connecting to the application.
• Migrated projects from Java 5 to Java 8.
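The reporting step could look roughly like the following plain-Python sketch: the report is written locally (as CSV here for simplicity, rather than xls) and then uploaded with the standard-library `ftplib`. The server details, file names and row data are all hypothetical:

```python
import csv
from ftplib import FTP

def write_report(rows, path):
    """Write report rows to a local CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["vendor", "count"])  # hypothetical columns
        writer.writerows(rows)

def upload_report(path, host, user, password):
    """Upload the generated report to an FTP server."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        with open(path, "rb") as f:
            ftp.storbinary(f"STOR {path}", f)

write_report([("Bata", 120), ("FabIndia", 85)], "report.csv")
# upload_report("report.csv", "ftp.example.com", "user", "secret")  # needs a live server
```

The upload call is left commented out because it requires a reachable FTP server and real credentials.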
Skills
Java (All Versions), PySpark, Hive
Company
Hadoop (VZNL:CCR)
Description
• Analysed Vodafone customer data to provide benefits according to their product plans, using Hadoop, Hive and MySQL.
• Increased product efficiency by reducing the response time of the execution engine (Delta) by up to 70%, by implementing Impala and tuning Hive on Hadoop.
• Deployed Ziggo products onto the Ziggo server in the Netherlands using SQL.
• Created and applied various workarounds (WAs) to fix project issues in Hive.
• Resolved tickets raised by the client to address reported issues.
• Analysed the Hive and Hadoop commands used in the project to speed up data retrieval, and performed Unix shell scripting.
• Resolved production testing issues in the client environment (e.g. maps not working) and improved data modelling in Hive.
• Worked on the connector queries responsible for fetching data from the server.
• Wrote Unix shell scripts on Linux to execute Hive commands and scheduled them.
• Configured the Hive system to automatically reclaim empty space so that memory can be reused.
Company
Migration of an existing Oracle-based product to a big data environment
Description
In this project, we migrated an existing product based on an Oracle database to a big data environment. We removed duplicate and miscalculated data sent by the operators' hardware before storing it in the data warehouse, so that data consistency is maintained and other operations (queries) can be performed on the data.
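The deduplication and cleaning step described above can be illustrated with a minimal plain-Python sketch; in the actual project this would run in the big data environment, and the field names and the "negative usage" rule here are hypothetical:

```python
def clean_records(records):
    """Drop exact duplicates and miscalculated records (negative usage,
    as a hypothetical validity rule) before they reach the warehouse."""
    seen = set()
    cleaned = []
    for r in records:
        key = (r["subscriber_id"], r["timestamp"], r["usage"])
        if key in seen:
            continue  # duplicate sent by the operator hardware
        if r["usage"] < 0:
            continue  # miscalculated value
        seen.add(key)
        cleaned.append(r)
    return cleaned

raw = [
    {"subscriber_id": 1, "timestamp": "2019-01-01T00:00", "usage": 10},
    {"subscriber_id": 1, "timestamp": "2019-01-01T00:00", "usage": 10},  # duplicate
    {"subscriber_id": 2, "timestamp": "2019-01-01T00:05", "usage": -3},  # miscalculated
]
print(clean_records(raw))
```

Filtering before the warehouse load keeps downstream queries consistent, since bad rows never enter the store.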
Company
Real Time Analysis of Twitter Data
Description
• The goal of the project was graphical analysis of a large amount of Twitter data using Hadoop.
• The project collected data from Twitter using the Twitter API, sank it into the Hadoop File System, searched the information using Elasticsearch and extracted metadata using Logstash.
• Kibana was used as a dashboard for the graphical representation of the data.
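The Elasticsearch indexing step in a pipeline like this can be sketched as building a bulk-index payload (newline-delimited JSON) from tweet records, using only the standard library; the index name and tweet fields below are hypothetical:

```python
import json

def to_bulk_payload(tweets, index="tweets"):
    """Build an Elasticsearch bulk-index payload (NDJSON) from tweet dicts:
    one action line followed by one document line per tweet."""
    lines = []
    for t in tweets:
        lines.append(json.dumps({"index": {"_index": index, "_id": t["id"]}}))
        lines.append(json.dumps({"text": t["text"], "user": t["user"]}))
    return "\n".join(lines) + "\n"

tweets = [{"id": "1", "text": "hello hadoop", "user": "alice"}]
payload = to_bulk_payload(tweets)
# The payload would then be POSTed to Elasticsearch's /_bulk endpoint.
print(payload)
```

In the original project this role was played by Logstash, which ships events to Elasticsearch without hand-built payloads; the sketch just shows the shape of the data being indexed.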
Skills
Twitter API, Hadoop, Kibana, Logstash
Company
Setup and Configuration of Mail Server (Red Hat Enterprise Linux based)