Designed and developed an end-to-end ETL pipeline for cloud and on-prem environments.
Knowledge in:
* Data collection and DBMS (Oracle / MongoDB)
* Python programming and advanced analytics (VS Code / Spyder / Jupyter)
* Statistical Analysis with R (...
Passionate about exploring and learning new skills in the Big Data domain.
Technology lover and team player.
Familiar with Big Data frameworks.
Data & Analytics
Networking & Security
Bitcoin Blockchain Analyser
The project covers preparing and preprocessing blockchain JSON block data. The JSON data is ingested into the Hadoop Distributed File System (HDFS), with object-relational mapping through Plain Old Java Objects (POJOs) using the Jackson library and Encoders. An algorithm then fetches the block data from HDFS and finds the block elements that contain specific buyers involved in a fraudulent money trail. The data gathered by Spark or Spark SQL can then be written to MongoDB, from which Tableau can readily use it for data visualization.
Skills: Apache Spark, Java (All Versions), MongoDB, Tableau, Python, Shell Scripting, Apache Hadoop, Big Data, Spark SQL, BlockChain
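The filtering step in the project description (finding block elements that involve specific addresses) can be illustrated with a minimal plain-Python sketch. This is not the project's Java/Spark code. The JSON layout (`hash`, `tx`, `inputs`, `outputs`, `addr`), the function name `find_flagged_transactions`, and the sample addresses are all hypothetical and chosen only for illustration.

```python
import json


def find_flagged_transactions(block, flagged_addresses):
    """Return one record per transaction that touches a flagged address.

    `block` is a dict in a hypothetical, simplified layout:
    {"hash": ..., "tx": [{"hash": ..., "inputs": [{"addr": ...}], "outputs": [{"addr": ...}]}]}
    """
    hits = []
    for tx in block.get("tx", []):
        # Collect every address that appears on either side of the transaction.
        addresses = {
            entry.get("addr")
            for side in ("inputs", "outputs")
            for entry in tx.get(side, [])
        }
        matched = sorted(addresses & flagged_addresses)
        if matched:
            hits.append(
                {"block": block.get("hash"), "tx": tx.get("hash"), "addresses": matched}
            )
    return hits


# Made-up sample block; real block JSON carries many more fields.
sample_block = json.loads("""
{
  "hash": "block1",
  "tx": [
    {"hash": "tx1", "inputs": [{"addr": "addr_A"}], "outputs": [{"addr": "addr_C"}]},
    {"hash": "tx2", "inputs": [{"addr": "addr_B"}], "outputs": [{"addr": "addr_D"}]}
  ]
}
""")

flagged = {"addr_B"}
hits = find_flagged_transactions(sample_block, flagged)
print(hits)
```

In a Spark job the same predicate would typically be applied as a filter over a distributed dataset of blocks rather than a Python loop; the sketch only shows the matching logic.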
Data Transformation Module
Implemented Type 2 Slowly Changing Dimensions for the Data Transformation Module in the ETL pipeline.
Language - Scala
Framework - Spark
Metadata - Oracle
The engine was config-driven. It was originally designed to be invoked through spark-submit, but we later added an API call to the module from the UI.
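For readers unfamiliar with Type 2 Slowly Changing Dimensions, the idea can be sketched in plain Python: when a tracked attribute changes, the current row is closed out and a new current row is appended, so history is preserved. This is an illustration only, not the module's Scala/Spark implementation. The function name `apply_scd2` and the column names (`valid_from`, `valid_to`, `is_current`, `customer_id`, `city`) are assumptions made for the example.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, as_of):
    """Return a new dimension table (list of dicts) with Type 2 history applied.

    dimension: existing rows with `key`, the tracked columns, and the
               hypothetical bookkeeping columns valid_from / valid_to / is_current.
    incoming:  latest source records with `key` and the tracked columns.
    """
    result = [dict(row) for row in dimension]  # copy so the input is not mutated
    current = {row[key]: row for row in result if row["is_current"]}

    for rec in incoming:
        existing = current.get(rec[key])
        if existing is not None and all(existing[c] == rec[c] for c in tracked):
            continue  # nothing changed, keep the current row as it is
        if existing is not None:
            # Close out the old version instead of overwriting it.
            existing["valid_to"] = as_of
            existing["is_current"] = False
        new_row = {
            key: rec[key],
            **{c: rec[c] for c in tracked},
            "valid_from": as_of,
            "valid_to": None,
            "is_current": True,
        }
        result.append(new_row)
        current[rec[key]] = new_row
    return result


dim = [
    {"customer_id": 1, "city": "Pune", "valid_from": date(2020, 1, 1),
     "valid_to": None, "is_current": True},
]
incoming = [
    {"customer_id": 1, "city": "Mumbai"},  # changed attribute
    {"customer_id": 2, "city": "Delhi"},   # new key
]
updated = apply_scd2(dim, incoming, "customer_id", ["city"], date(2021, 6, 1))
for row in updated:
    print(row)
```

In a config-driven engine, the key column and the list of tracked columns would plausibly come from metadata rather than being passed by hand as they are here.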