Rajat J.

Data Engineer with Spark, Azure

Bengaluru, India

Experience: 6 Years

54179.6 USD / Year

  • Availability: Immediate


About Me

===I have solutions to all of your data-related problems here===

My core expertise is writing highly scalable ELT jobs with Python, Spark, and Hadoop.

 


Below is my detailed skillset, so that we can quickly get started and solve your problem.

 

 

===Data Engineering===

• Creating data pipelines on cloud platforms like Amazon Web Services (AWS).

• Writing Extract-Load-Transform (ELT) jobs for data processing using technologies like Hive and PySpark (a minimal sketch follows this list).

• Building real-time data pipelines for streaming data using Apache Kafka.
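
A minimal sketch of such an ELT job in PySpark (the bucket, paths, column names, and transformation are illustrative assumptions, not code from an actual engagement):

# Extract raw CSV from S3, transform it, and load it back as partitioned Parquet.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("elt-sketch").getOrCreate()

# Extract: read raw CSV landed in S3 (s3a:// access needs the hadoop-aws package).
raw = spark.read.option("header", True).csv("s3a://my-bucket/raw/events/")

# Transform: parse the timestamp, drop malformed rows, derive a partition column.
clean = (
    raw.withColumn("event_ts", F.to_timestamp("event_ts", "yyyy-MM-dd HH:mm:ss"))
       .filter(F.col("event_ts").isNotNull())
       .withColumn("event_date", F.to_date("event_ts"))
)

# Load: write partitioned Parquet for downstream Hive/Spark consumers.
clean.write.mode("overwrite").partitionBy("event_date").parquet("s3a://my-bucket/curated/events/")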

 

===Data Engineering Skills===

Expertise:

Amazon Web Services:

S3

Tools & Libraries:

PySpark, Spark, Scala, Python, Hadoop, Hive, SparkML, Airflow.

 

Database:

Postgres, MySQL, Oracle, DynamoDB, MongoDB, MSSQL

 

===Image Processing===

• Tesseract

• ABBYY FineReader

 

===Tools & Libraries===

Jupyter Notebook, PyCharm, SQL

Portfolio Projects

Description

  • Extracted data from scanned PDF images using Python with Tesseract OCR and ABBYY OCR (a sketch of this step follows the list).
  • Trained the OCR to read characters in the proper format and saved the resulting configuration for reuse on other complex files.
  • Developed a self-service portal based on the Python Flask framework that displays the extracted data based on dynamic queries.
  • Supervised job scheduling via Oozie and managed data ingestion.
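
A hedged sketch of the PDF text-extraction step with pytesseract (the file name, DPI, and page-segmentation mode are assumptions for illustration; pdf2image additionally requires Poppler):

import pytesseract
from pdf2image import convert_from_path  # renders each PDF page as a PIL image

# Render the scanned PDF at 300 DPI for better OCR accuracy.
pages = convert_from_path("scanned_document.pdf", dpi=300)

for i, page in enumerate(pages):
    # --psm 6 assumes a single uniform block of text per page.
    text = pytesseract.image_to_string(page, config="--psm 6")
    with open(f"page_{i}.txt", "w", encoding="utf-8") as f:
        f.write(text)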

Description

1) For address data of the United States, I provided data engineering solutions covering transformation, validation, and formatting using Spark 3, Python 3, PySpark, AWS, and other Python libraries for address validation.

2) For election data of the United States, I provided data engineering solutions covering transformation, validation, and formatting using Spark 3, Python 3, PySpark, and AWS, and validated US voter data against 240 million records with the above solution.

3) Completed the Logic App functionality course and implemented an automated workflow for loading data from different sources into Microsoft Azure SQL Database.

4) Completed the Azure Data Factory fundamentals course and implemented an automated, trigger-based workflow for loading data from different sources such as Salesforce, SQL Database, and SharePoint lists (sketches of the validation pass and a pipeline trigger follow this list).
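
A hedged sketch of the kind of validation/formatting pass described in items 1 and 2 (the schema, the ZIP-code regex, and the paths are illustrative assumptions):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("address-validation").getOrCreate()

addresses = spark.read.parquet("s3a://my-bucket/raw/addresses/")

validated = (
    addresses
    .withColumn("zip", F.trim(F.col("zip")))
    # Flag rows whose ZIP matches the 5-digit (optionally ZIP+4) US format.
    .withColumn("zip_valid", F.col("zip").rlike(r"^\d{5}(-\d{4})?$"))
    # Normalize state codes to upper case for downstream matching.
    .withColumn("state", F.upper(F.trim(F.col("state"))))
)

validated.write.mode("overwrite").parquet("s3a://my-bucket/curated/addresses/")

And a minimal sketch of triggering an Azure Data Factory pipeline run from Python, as in item 4 (the subscription ID, resource group, factory, pipeline name, and parameters are placeholders):

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

# Kick off the load pipeline with a hypothetical run parameter.
run = adf_client.pipelines.create_run(
    resource_group_name="my-rg",
    factory_name="my-data-factory",
    pipeline_name="load_salesforce_to_sql",
    parameters={"load_date": "2024-01-01"},
)
print(run.run_id)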

Description

  • Worked on exposing RESTful services using the Bottle framework and Python (a minimal sketch follows this list).
  • Wrote Python scripts for storing and fetching client metadata in MongoDB.
  • Used PyUnit for testing the Python scripts internally.
  • Handled deployment workflows for the production and development environments.
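
A minimal sketch of such a Bottle service backed by MongoDB (the database and collection names, fields, and port are illustrative assumptions):

from bottle import Bottle, request, run
from pymongo import MongoClient

app = Bottle()
metadata = MongoClient("mongodb://localhost:27017")["clients"]["metadata"]

@app.post("/metadata")
def store_metadata():
    # Store the JSON body sent by the client and return the new document id.
    doc = request.json
    result = metadata.insert_one(doc)
    return {"id": str(result.inserted_id)}

@app.get("/metadata/<client_id>")
def fetch_metadata(client_id):
    # Fetch by client id, hiding Mongo's internal _id field.
    return metadata.find_one({"client_id": client_id}, {"_id": 0}) or {}

if __name__ == "__main__":
    run(app, host="localhost", port=8080)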
