SRIKANTH Y.

Data Scientist

Hyderabad, India

Experience: 6 Years

About Me

8.5 years of IT experience analyzing complex problems and translating them into scalable and efficient data science problems. Experience in Machine Learning, Deep Learning, Natural Language Processing and Big Data Engineering. E...

Skills

Positions

ML/AI Engineers

Data Analysts

Data Scientist

Data Engineer

Portfolio Projects

Description

Implemented an OCR solution using the OpenCV and Tesseract libraries to extract text from printed documents with 99% accuracy and from handwritten documents with 85% accuracy.

Description

Built a deep learning model (CNN, LSTM, CTC loss) that reads handwritten words and digits on bank checks. The model is used to automate manual check verification and saves 67% of the manual effort.

Description

Classified collection call transcripts as relevant or non-relevant in order to evaluate agent performance and reduce regulatory breaches. The model was built using a BERT transformer and achieved 95% accuracy and an F1 score of 94%.

Description

Provided PR servicing for inactive credit card customers by routing them to PR agents, who make them offers, when they call Citi. Two models were built, one for the Responder group and one for the Control group. Together, the two models determine whether an inactive caller should be redirected to a PR agent. The R-squared value is 40% for the Responder model and 35% for the Control model.

Description

Built a sentiment classification engine for clinical data. It reports the trend of a disease based on the patient's clinical notes over a period of time. Examples of trends are the disease recurring, becoming more severe or becoming less severe.

Description

Created a big data pipeline to extract data from RDBMS into HDFS. The data is processed in Spark and used for QlikView dashboards. Handled the end-to-end implementation except for QlikView.

Description

Developed PySpark jobs that read data from AWS S3, transform it and write it back to S3 as Parquet files, including the creation of user-defined functions in PySpark. Optimized the performance of existing scripts. Built a streaming application using Spark and Kafka. Implemented REST APIs using Python. Created data pipelines using Airflow.

Description

Built a Credit Engine that calculates the probability of loan default for SME customers. The Credit Engine receives data from external and internal APIs and processes it. The processed data is provided as input to a classification model, which calculates the probability of default.

Description

Extracted data from RDBMS into Hadoop. Performed ETL on risk data using PySpark and Hive. Stored the transformed data in Hive tables for use in QlikView dashboards. Generated files with contact details and due amounts for defaulted customers.

Description

Built a machine learning model for an insurance client that predicts whether a customer will let their policy lapse. Logistic regression and tree-based algorithms were used for prediction.

Description

Built a classification model using NLP techniques to predict document categories for an insurance client. Developed a preprocessing pipeline to clean the documents.

Description

Developed a Credit Score Engine to calculate credit score variables for SME customers. The Credit Engine receives data from external and internal APIs, processes it and decides whether a loan should be approved or rejected for the customer. Created parsers for input files, including XML. Processed data using the Pandas and NumPy libraries. Developed a REST API using Flask. Improved performance by implementing asynchronous programming to process data from different APIs. Implemented data load into a MariaDB columnar store.
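
The asynchronous processing of several APIs mentioned above can be sketched roughly as follows. This is a minimal illustration and not the engine's actual code: the source names and delays are hypothetical, and `asyncio.sleep` stands in for real API calls.

```python
import asyncio


async def fetch_source(name: str, delay: float) -> dict:
    """Stand-in for one API call; asyncio.sleep simulates network latency."""
    await asyncio.sleep(delay)
    return {"source": name, "status": "ok"}


async def fetch_all(sources: dict) -> list:
    """Start every call at once and wait for all of them to finish."""
    calls = [fetch_source(name, delay) for name, delay in sources.items()]
    # gather runs the coroutines concurrently and keeps the input order.
    return await asyncio.gather(*calls)


# Hypothetical source names and latencies in seconds, for illustration only.
SOURCES = {"bureau_api": 0.1, "bank_api": 0.1, "internal_api": 0.1}

results = asyncio.run(fetch_all(SOURCES))
for result in results:
    print(result["source"], result["status"])
```

Because the three simulated calls wait concurrently, the total wait is close to the slowest single call rather than the sum of all three.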

Description

Developed a supply chain finance model that predicts whether finance should be provided to sellers. The model takes inventory details, accounts receivable and accounts payable data as input. Created the data model from the OFBiz ERP model. Ingested data into Hadoop using Sqoop. Logistic regression is used to evaluate the credit risk of SMEs.

Description

Designed and developed web scrapers to extract text from Singapore and Hong Kong customs websites. Implemented web crawlers using the Scrapy framework in Python. Deployed the web crawlers in AWS with rotating proxies, which prevents websites from blocking the crawlers.

Description

Identified the profile of customers who have a propensity to let their insurance policies lapse. The initial model was developed for three product categories: Term Life, Whole Life and Universal Life policies. Level Premium policies were then identified within these three major product categories. The alternative approach was to develop the model with four reclassified product categories: Level Premium Period, Term Life, Whole Life and Universal Life. This brought a significant improvement in model accuracy. Algorithms and language: decision trees were used to derive rules in R, and logistic regression was used in R to predict the probability of customer churn.

Description

Provided a solution to analyze agent performance based on several attributes such as demography, products sold and new business. The goal is to improve the existing knowledge used for agent segmentation in a supervised predictive framework and to predict the policy in-force quantity. Approach: univariate and bivariate analysis of different variables; handling of outliers; feature engineering; summary statistics by agency; model building. Algorithms implemented in Python: decision trees and neural networks.

Description

A health care provider uses a ticketing system for all telephone calls received across all departments. Calls can concern a new appointment, a cancellation, lab queries, medical refills, insurance or general doctor advice, among others. The challenge is to classify each ticket into the appropriate category based on the text in the summary and description of the call. Approach: cleaning the data, which involves converting it to the required format; corpus creation; pre-processing; document-term matrix creation; splitting the data into train, validation and test datasets; fitting the models on the train dataset and validating them on the validation dataset. Algorithms used in R: SVM, Random Forest and Naive Bayes.
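
The approach above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's code: it is written in Python rather than R, it covers only the Naive Bayes model (multinomial, with add-one smoothing), it skips the train/validation/test split, and the sample tickets are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Pre-processing: lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def train(tickets):
    """Build per-category term counts (a small document-term matrix)."""
    term_counts = defaultdict(Counter)  # category -> term -> count
    doc_counts = Counter()              # category -> number of tickets
    for text, category in tickets:
        term_counts[category].update(tokenize(text))
        doc_counts[category] += 1
    vocabulary = {t for counts in term_counts.values() for t in counts}
    return term_counts, doc_counts, vocabulary


def predict(text, model):
    """Return the category with the highest log posterior."""
    term_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        total_terms = sum(term_counts[category].values())
        score = math.log(doc_counts[category] / total_docs)
        for term in tokenize(text):
            if term not in vocabulary:
                continue  # ignore terms never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = term_counts[category][term] + 1
            score += math.log(count / (total_terms + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical training tickets, for illustration only.
TRAIN = [
    ("Patient wants to book a new appointment next week", "New Appointment"),
    ("Need to schedule an appointment with the doctor", "New Appointment"),
    ("Please cancel my appointment on Friday", "Cancellation"),
    ("Caller asked to cancel the visit", "Cancellation"),
    ("Requesting a refill of blood pressure medication", "Medical Refills"),
    ("Pharmacy needs a prescription refill", "Medical Refills"),
]

model = train(TRAIN)
print(predict("I need a refill for my prescription", model))  # Medical Refills
```

A real ticket corpus would also need the train, validation and test split described above before any accuracy figure could be reported.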

Description

This project for a telecom client involves processing monthly bills for fixed-line customers and generating PDF files of mobile bills for end users. Developed Informatica mappings enabling the extraction, transport and loading of data into target tables. Analyzed, designed, developed, implemented and maintained moderate to complex initial-load and incremental-load mappings to provide data for the enterprise data warehouse. Worked with the memory cache to improve the throughput of sessions containing Rank, Lookup, Joiner, Sorter and Aggregator transformations. Responsible for migrating the project between environments (Dev, QA, UAT, Prod).
