Now you can Instantly Chat with Rajat!
=== I have solutions to all of your data-related problems here ===
My core expertise is writing highly scalable ELT jobs with Python, Spark, and Hadoop.
Below is my detailed skill set, so that we can quickly get started and solve your problem.
• Creating data pipelines on cloud platforms such as Amazon Web Services (AWS).
• Writing Extract-Load-Transform (ELT) jobs for data processing using technologies such as Hive and PySpark.
• Building real-time data pipelines for streaming data using Apache Kafka.
========= Data Engineering Skills =========
Amazon Web Services:
Tools & Libraries:
PySpark, Spark, Scala, Python, Hadoop, Hive, SparkML, Airflow.
Postgres, MySQL, Oracle, DynamoDB, MongoDB, MSSQL
===Tools & Libraries===
Jupyter Notebook, PyCharm, SQL
Data & Analytics
Global IT Data Lake
- Extracted data from scanned images in PDF files using Python Tesseract OCR and ABBYY OCR.
- Trained the OCR engine to read characters in the proper format and saved the trained file for reuse on other, more complex files.
- Developed a self-service portal on the Python Flask framework that displays the extracted data in response to dynamic queries.
- Supervised job scheduling via Oozie and managed data ingestion.
Skills: PySpark, Hive, Python, Apache Sqoop, Shell Scripting
Federal Election Commission
1) For address data of the United States, I provided data engineering solutions including #Transformation #Validation #Formatting using #Spark3 #Python3 #PySpark #AWS and other Python libraries for validating addresses.
2) For election data of the United States, I provided data engineering solutions including #Transformation #Validation #Formatting using #Spark3 #Python3 #PySpark #AWS, and validated the US voter data against 240 million records from the above solution.
3) Completed the #LogicApp functionality course and implemented an automated workflow for loading data from different sources into #Microsoft #Azure #SQL #Database.
4) Completed the Azure Data Factory fundamentals course and implemented an automated, triggered workflow for loading data from different sources such as #salesforce #Sqldatabase #Sharepoint #List.
Boeing Aircraft Simulation
- Exposed RESTful services using the Bottle framework and Python.
- Wrote Python scripts for storing and fetching metadata received from clients in MongoDB.
- Used PyUnit to test the Python scripts internally.
- Handled the deployment workflow for the production and development environments.