About Me
Data engineer with 5+ years of experience in Big Data, working primarily with Spark and Scala. I have worked on a variety of log analytics tools such as the ELK stack, Flume, Sqoop, and Hive, and have cloud experience with AWS Lambda and EC2.
Skills
Web Development
Data & Analytics
Programming Language
Database
Operating System
Others
Software Engineering
Positions
Portfolio Projects
Company
Log analytics and near real time visualization and alerting using Hadoop & ELK stack
Role
Backend Developer
Description
Contributed to enhancing business processes within the TCS organization by providing log analytics using Hadoop and other big data tools, which saved considerable effort and delivered significant value to the organization.
- Configured multiple log collectors (Flume, Logstash, Filebeat, NXLog) to collect logs from various sources.
- Developed Hive scripts to process data stored in HDFS, applying optimization techniques to reduce processing time and deliver results faster.
- Used the Elasticsearch NoSQL datastore with Kibana as the visualization layer on top of it to provide real-time visualization of the data.
- Configured ElastAlert on top of Elasticsearch to notify users whenever a server recorded 3 consecutive failed login attempts.
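The alert condition described above can be sketched in plain Scala. This is a simplified illustration of the rule ElastAlert was configured to evaluate; the event fields (`host`, `success`) are assumptions, not the project's actual log schema.

```scala
// Illustrative event shape; the real logs carried many more fields.
case class LoginEvent(host: String, success: Boolean)

// Return the hosts whose event stream contains `threshold` consecutive
// failed login attempts (the streak resets on any successful login).
def hostsToAlert(events: Seq[LoginEvent], threshold: Int = 3): Set[String] =
  events.groupBy(_.host).collect {
    case (host, evts)
        if evts
          .scanLeft(0)((streak, e) => if (e.success) 0 else streak + 1)
          .exists(_ >= threshold) =>
      host
  }.toSet
```

In production this check ran continuously against Elasticsearch rather than over an in-memory sequence, but the windowed-streak rule is the same.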
Tools
PuTTY, Sqoop, Linux CentOS, Kibana
Company
Predictive Modelling of Server Utilization using R
Role
Backend Developer
Description
Used R to forecast server utilization parameters such as CPU, memory, and disk, helping server administrators anticipate the utilization of their servers in the upcoming month.
- Retrieved server utilization data by integrating R with the MySQL database.
- After fine-tuning the data in R, fed it to an ARIMA model to obtain the forecasts.
Skills
R Language, MySQL
Tools
RStudio, MySQL Workbench
Company
IS Data Warehouse
Role
Backend Developer
Description
Centralized employee-specific data so that all relevant information for an employee is available in a single place. Performed data ingestion into Hadoop from multiple data sources. Facilitated insightful daily analysis by comparing multiple datasets for use cases such as asset tracking and associate allocation.
- Developed Sqoop scripts to import data from databases such as MySQL, Postgres, Oracle, and MS SQL into HDFS. Merged daily incremental data with existing data using sqoop merge.
- Created Hive queries to fine-tune the imported data and join it with other datasets, executing them with Spark-SQL.
- Provided design recommendations and thought leadership to other stakeholders that improved review processes and resolved technical problems.
- Shared responsibility for administration of Hadoop, Hive, Spark.
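The incremental-load step above can be illustrated by the rule that `sqoop merge` applies: for each merge key, keep the most recent version of the record. A minimal sketch in plain Scala, where the `id` merge key and `updatedAt` column are illustrative stand-ins for the real table schema:

```scala
// Illustrative row shape; the real tables had many more columns.
case class Row(id: Int, value: String, updatedAt: Long)

// Combine the existing snapshot with the daily incremental extract,
// keeping the newest version of each row (by merge key `id`) --
// the same semantics as `sqoop merge --merge-key id`.
def mergeIncremental(existing: Seq[Row], incremental: Seq[Row]): Seq[Row] =
  (existing ++ incremental)
    .groupBy(_.id)
    .values
    .map(_.maxBy(_.updatedAt))
    .toSeq
    .sortBy(_.id)
```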
Skills
Apache Spark, Apache Sqoop, Hive, Hadoop
Company
Data Analytics
Role
Backend Developer
Description
This project was in the BFSI domain for a Canadian multinational financial services client, migrating from an Oracle-based architecture to a big data platform.
- Developed Scala code to process data stored in Hive tables using Spark. The processed data was used for report generation in Tableau for the client.
- Used Spark-SQL functions in the Scala code to process capital market data.
Skills
Apache Spark
Tools
IntelliJ IDEA, PuTTY, Hive
Company
Data Lake
Role
Backend Developer
Description
This project was part of a Customer Engagement Platform (CEP) initiative for a healthcare client (an American multinational biopharmaceutical company) creating a data lake to centralize all product, customer, professional, and sales data.
- Developed pipelines with the SnapLogic tool to process data stored in Hive.
- Implemented complex logic despite SnapLogic's limited built-in functionality.
Skills
SnapLogic, SQL Workbench
Tools
SnapLogic
Company
Enterprise Analytics
Role
Backend Developer
Description
This project was for a technology client.
- Developed Scala code to process data stored in Hive.
- Reduced BI-layer downtime by performing Hive optimization.
- Created Talend jobs to launch Spark code in cluster for data processing.
Skills
Hadoop, Apache Spark
Company
Workflow Automation
Role
Backend Developer
Company
Global Piracy Analytics
Role
Backend Developer
Description
This project is for an American architecture-software client, detecting piracy of their product and converting pirate users into legitimate customers.
- Developing Scala code to detect pirate users and running it on a Spark cluster.
- Automating the workflow using Oozie and Jenkins.
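A much-simplified sketch of the detection step: the real pipeline runs on a Spark cluster, but the core filtering rule can be shown in plain Scala. The `licenseKey` field and the not-in-issued-keys rule are hypothetical stand-ins for the client's actual detection logic.

```scala
// Illustrative usage record; real telemetry carried many more fields.
case class UsageRecord(userId: String, licenseKey: String)

// Flag users whose license key was never issued by the vendor.
// (Hypothetical rule standing in for the client's real detection logic.)
def pirateUsers(usage: Seq[UsageRecord], issuedKeys: Set[String]): Set[String] =
  usage.collect { case u if !issuedKeys(u.licenseKey) => u.userId }.toSet
```

On the cluster the same rule would be expressed as a Spark join or filter over the usage dataset, with Oozie scheduling the job.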
Skills
Apache Spark Oozie Jenkins AWS Glue