MADHUSUDAN C.

Associate Analyst

Bengaluru, India

Experience: 2 Years

16457.1 USD / Year

  • Immediate: Available

About Me

I want to work in the field of machine learning and artificial intelligence and do creative work that has a positive impact on society. I have good problem-solving skills and am always open to taking on new challenges...

Portfolio Projects

Description

Responsibilities:

• GIS Data Load Utility: Developed a utility that transforms large GeoJSON files containing geospatial data and loads them into Hive tables.
• Data Transformation: Developed mappings to transform data according to the data scientists' requirements and load it into an Exadata database.
• Kafka Development: Developed Kafka pipelines to load data from source systems to HDFS in Avro format.
• NiFi Development: Developed NiFi pipelines to extract data from SFTP, process it, and write it to Kafka topics.
• Code Deployment Management: Took ownership of deploying different code bases to multiple environments.
• Data Loads: Loaded historical data into the Integration environment through various pipelines built on utilities, Oracle GoldenGate, Kafka, ODI, etc.
• Defect Fixes: Resolved various data defects raised from the loads and by the data scientists.
• Data Load Approach Design: Helped design the approach for loading historical data in production.
• Scripting: Created shell scripts to automate tasks such as searching for files across environments and reading logs.
• Python Automation: Developed utilities to automate many repetitive development tasks.

Description

Proof of technology for Oracle GoldenGate and Kafka integration. Developed an end-to-end pipeline for ingesting trail files into Hive using Oracle GoldenGate and the Confluent Kafka framework. Debugged various issues and came up with new solutions. Created Python and shell scripts to automate the testing of all the Kafka topics.

Description

Gathered requirements from the client. Performed data preprocessing such as tokenization, stemming, and bucketing. Analyzed an audio dataset.

Description

Performed exploratory data analysis. Performed data preprocessing such as removing duplicates, filling null values, and NLP-related processing. Created data visualizations.

Description

This is a classic classification problem. The dataset contains comments along with their toxicity levels. The objective is to predict the toxicity of unseen comments. The main task was to featurize the data: I performed various data analyses and created features such as the count of toxic words in a document and the presence of smileys and exclamation marks. After data preprocessing, I applied various ML models such as Logistic Regression and SVM.
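
The hand-crafted features described above could be computed along the following lines. This is only a minimal sketch: the TOXIC_WORDS lexicon, the smiley pattern, and the extract_features name are illustrative assumptions, not the word list or code used in the original project.

```python
import re

# Hypothetical toxic-word lexicon (assumption); the project's real list is not given.
TOXIC_WORDS = {"idiot", "stupid", "hate"}

# Matches simple ASCII smileys such as :) ;-( =D (assumed pattern).
SMILEY_RE = re.compile(r"[:;=][-']?[)(DPp]")


def extract_features(comment: str) -> dict:
    """Return simple hand-crafted features for a single comment."""
    # Lowercase word tokens; punctuation is dropped.
    tokens = re.findall(r"[a-z']+", comment.lower())
    return {
        "toxic_word_count": sum(token in TOXIC_WORDS for token in tokens),
        "has_smiley": int(bool(SMILEY_RE.search(comment))),
        "exclamation_count": comment.count("!"),
    }
```

Features like these would typically be appended to a text representation (for example TF-IDF vectors) before training a classifier such as Logistic Regression or SVM.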

Description

Amazon Fine Food Reviews is a classic classification problem. The dataset contains reviews of various food items from Amazon. The objective is to determine the polarity of each review. For this problem I performed data cleaning, preprocessing, and exploratory data analysis, and used different machine learning models to predict the polarity of the reviews.

Description

Cancer diagnosis is a multiclass classification problem. The dataset has ID, Gene, Mutation, and Class fields, along with the related research text. The business objective is to predict the final class, out of 9 classes, for cancer diagnosis. To achieve this, I performed data analysis, cleaning, and featurization, and trained various ML models.

Description

This is a multi-class classification problem. The dataset contains a Title, a Description, and the respective tags. The objective is to predict the tags for unseen data as accurately as possible. For this problem I performed data cleaning, preprocessing, and exploratory data analysis. To improve model performance, I gave the Title feature more weight.
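
One simple way to give the Title more weight is to count its tokens several times when building a bag-of-words representation. The sketch below illustrates that idea only; the source does not say how the weighting was actually done, and the weighted_bag_of_words name and the default title_weight of 3 are assumptions.

```python
from collections import Counter


def weighted_bag_of_words(title: str, description: str, title_weight: int = 3) -> Counter:
    """Count tokens, adding title_weight for each title token (assumed scheme)."""
    # Description tokens count once each.
    counts = Counter(description.lower().split())
    # Title tokens contribute title_weight each, so they dominate the vector.
    for token in title.lower().split():
        counts[token] += title_weight
    return counts
```

The resulting counts can then be fed to any vectorizer or model that accepts term frequencies, with title_weight tuned on validation data.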

Description

This is a graph-based problem. The dataset consists of pairs of connected nodes at a point in time, and the objective is to predict which nodes are most likely to connect. The dataset provides a follower-following relationship graph.
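
A common baseline feature for this kind of link prediction is the Jaccard similarity between the sets of accounts two nodes follow. The sketch below is an illustration under that assumption; the source does not state which features or models were used, and the function names are hypothetical.

```python
from collections import defaultdict


def build_followees(edges):
    """Map each follower to the set of nodes it follows."""
    followees = defaultdict(set)
    for follower, followee in edges:
        followees[follower].add(followee)
    return followees


def jaccard_score(followees, u, v):
    """Jaccard similarity of the followee sets of u and v (0.0 if both are empty)."""
    a = followees.get(u, set())
    b = followees.get(v, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

Node pairs with a higher score share more followees and are, under this heuristic, more likely to connect.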
