About Me
A sagacious and dynamic Big Data professional with 6+ years of experience working with structured, semi-structured and unstructured data through big data components such as Hadoop, Hive, Pig, Datadog, Splunk, Incorta, PySpark and Sq...
Skills
Positions
Portfolio Projects
Description
• Onboarded chat bots and IVR NLP bots onto the Adobe CEX system.
• Wrote a script to identify users who interact with the bot under different user IDs and different conversation IDs.
• Customised tables to support incremental loads.
• Fixed an issue where a table was unable to capture NULL values in production.
• Built Sqoop jobs to ingest incremental data.
• Optimised the execution engine, improving performance by 80%.
• Achieved Adobe Green Belt certification.
• Certified Program Manager, Software Security Practitioner Suite.
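The multi-ID detection above amounts to grouping chat events by user and flagging users seen under more than one conversation ID. A minimal plain-Python sketch of that logic (the field names `user_id` and `conversation_id` are assumptions, not the real log schema):

```python
from collections import defaultdict

def users_with_multiple_conversations(events):
    """Group chat-bot events by user and return users that appear
    under more than one conversation id."""
    conversations = defaultdict(set)
    for event in events:
        # Field names are illustrative placeholders.
        conversations[event["user_id"]].add(event["conversation_id"])
    return {user: ids for user, ids in conversations.items() if len(ids) > 1}

events = [
    {"user_id": "u1", "conversation_id": "c1"},
    {"user_id": "u1", "conversation_id": "c2"},
    {"user_id": "u2", "conversation_id": "c3"},
]
# u1 interacted with the bot under two different conversation ids
print(users_with_multiple_conversations(events))
```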
Description
• Wrote a PySpark script to change the schema of Hive ORC files.
• Fixed a count problem in PySpark in Tag Manager.
• Added reporting functionality that generates an XLS file and uploads it to an FTP server.
• Handled 5 projects for different vendors.
• Fixed a production issue where the database was not connecting to the application.
• Migrated projects from Java 5 to Java 8.
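The schema change in the first bullet boils down to renaming and casting columns. A plain-Python sketch of that mapping step (the real job applied it with PySpark over Hive ORC files; the column names here are made up for illustration):

```python
# Target schema: old column name -> (new column name, cast function).
# These names are illustrative, not the actual Hive table's schema.
SCHEMA_MAP = {
    "usr": ("user_id", str),
    "cnt": ("event_count", int),
}

def migrate_record(record):
    """Apply the rename-and-cast mapping to one row; unmapped columns pass through."""
    out = {}
    for col, value in record.items():
        if col in SCHEMA_MAP:
            new_name, cast = SCHEMA_MAP[col]
            out[new_name] = cast(value)
        else:
            out[col] = value
    return out

row = {"usr": 42, "cnt": "7", "ts": "2020-01-01"}
print(migrate_record(row))
```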
Description
• Analysed Vodafone customer data to provide benefits according to their product plans, using Hadoop, Hive and MySQL.
• Increased product efficiency by cutting the response time of the execution engine (Delta) by up to 70%, by implementing Impala and tuning Hive on Hadoop.
• Deployed Ziggo products onto the Ziggo server in the Netherlands using SQL.
• Created and applied workarounds (WAs) to fix project issues in Hive.
• Resolved client tickets raised against the project.
• Analysed the Hive and Hadoop commands used in the project to speed up data retrieval, and wrote Unix shell scripts.
• Resolved production testing issues in the client environment (e.g. MAPs not working) and improved data modelling in Hive.
• Worked on the connector queries responsible for fetching data from the server.
• Wrote and scheduled Unix shell scripts on Linux to execute Hive commands.
• Made the Hive system automatically reclaim empty space so that memory can be reused later.
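The space-reclamation step in the last bullet is typically done with Hive's ORC file concatenation, which merges small files inside a partition. A hedged sketch (the table name, partition key, and tuning settings are placeholders for whatever the project actually used):

```sql
-- Merge small ORC files in a partition so Hive reclaims wasted space.
-- Table and partition names are illustrative.
ALTER TABLE events PARTITION (dt = '2020-01-01') CONCATENATE;

-- Typical tuning knobs used when cutting Hive execution time:
SET hive.exec.parallel = true;
SET hive.vectorized.execution.enabled = true;
```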
Description
In this project, we migrated an existing Oracle-based product to a big data environment. We removed the duplicate and miscalculated data sent by operators' hardware before storing it in the data warehouse, so that data consistency is maintained and other operations (queries) can be performed on the data.
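The cleaning step described above, dropping duplicate and miscalculated rows before they reach the warehouse, can be sketched in plain Python (the row shape and the `value >= 0` sanity check are assumed stand-ins for the real validation rules):

```python
def clean(rows):
    """Drop exact duplicates and rows that fail a basic sanity check
    before they are written to the warehouse."""
    seen = set()
    cleaned = []
    for row in rows:
        key = (row["meter_id"], row["ts"], row["value"])  # assumed row shape
        if key in seen:
            continue          # duplicate sent twice by the operator hardware
        if row["value"] < 0:
            continue          # treated as a miscalculated reading
        seen.add(key)
        cleaned.append(row)
    return cleaned

rows = [
    {"meter_id": "m1", "ts": 1, "value": 10},
    {"meter_id": "m1", "ts": 1, "value": 10},   # duplicate
    {"meter_id": "m2", "ts": 1, "value": -5},   # miscalculated
]
print(clean(rows))
```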
Description
• The goal of the project was graphical analysis of a large amount of Twitter data using Hadoop.
• The project collected data from Twitter using the Twitter API, sank it into the Hadoop File System, searched the information using Elasticsearch, and extracted metadata using Logstash.
• Kibana was used as a dashboard for graphical representation of the data.
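One plausible wiring for the collection side of this pipeline is Logstash's Twitter input feeding Elasticsearch directly; a minimal, hedged config sketch (credentials, keywords, host, and index name are all placeholders, and the actual project also landed the data in HDFS):

```
input {
  twitter {
    consumer_key       => "..."   # placeholder credentials
    consumer_secret    => "..."
    oauth_token        => "..."
    oauth_token_secret => "..."
    keywords           => ["hadoop"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "tweets"             # placeholder index name
  }
}
```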
Description
• Collected the requirements from the provided input dataset.
• Analysed the data to retrieve loan information (name, loan date, etc.) within a few seconds by implementing Hive commands.
• Cut report response time by up to 90%.
• Supported scaling the data up to 100 times.
• Involved in design and development of technical specifications using Hadoop technologies.
• Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured data.
• Developed MapReduce programs and Hive scripts to clean and filter data on the cluster and store it on HDFS.
• Exported the business-required information to an RDBMS using Sqoop, making the data available for the BI team to generate reports.
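The loan lookup described above, retrieving loan details by borrower name and loan date, is essentially a filtered scan. A plain-Python sketch of that query logic (column names are assumptions; the real job ran it as Hive commands on Hadoop):

```python
import datetime

def loans_for(loans, name=None, on_or_after=None):
    """Filter loan records by borrower name and/or earliest loan date.
    Column names are illustrative, not the project's real schema."""
    result = []
    for loan in loans:
        if name is not None and loan["name"] != name:
            continue
        if on_or_after is not None and loan["loan_date"] < on_or_after:
            continue
        result.append(loan)
    return result

loans = [
    {"name": "Asha", "loan_date": datetime.date(2019, 5, 1), "amount": 1000},
    {"name": "Ravi", "loan_date": datetime.date(2018, 1, 15), "amount": 500},
]
print(loans_for(loans, name="Asha"))
```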