About Me
Experienced Lead Data Scientist in the BFSI lending sector with 13 years of overall experience. Proficient in Databricks, Python, and Excel, specializing in machine learning and advanced statistical techniques. Proven track record of delivering impac...
Skills
Portfolio Projects
Description
- Predicting the fair market value of foreign exchange deals booked at SGTS based on historical data.
- Dataset of approx. 50 fields and 5,000 records: performed data visualization (seaborn) and EDA (dataset.head(), info(), describe(), etc.), identified missing values (Imputer), removed unwanted fields, encoded categorical fields, scaled the data, then trained and tested.
- Model selection based on k-fold cross-validation and grid search to find the best parameters and accuracy.
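The model-selection step above can be illustrated with a minimal, standard-library-only sketch. It is not the original project code: the data is synthetic, the k-nearest-neighbours regressor and the names `k_fold_indices`, `cv_mse`, and `grid_search` are assumptions chosen for illustration, and mean squared error stands in for whatever metric the project used. In practice scikit-learn's `GridSearchCV` would typically do this work.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle row indices and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def knn_predict(train, x, n_neighbors):
    """Predict y as the mean target of the n_neighbors closest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:n_neighbors]
    return mean(y for _, y in nearest)

def cv_mse(data, n_neighbors, folds):
    """Average mean squared error over the held-out folds."""
    fold_scores = []
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in range(len(data)) if i not in held_out]
        errors = [(knn_predict(train, data[i][0], n_neighbors) - data[i][1]) ** 2
                  for i in fold]
        fold_scores.append(mean(errors))
    return mean(fold_scores)

def grid_search(data, grid, k=5):
    """Score every candidate value with k-fold CV and return the best one."""
    folds = k_fold_indices(len(data), k)
    scores = {value: cv_mse(data, value, folds) for value in grid}
    best = min(scores, key=scores.get)  # lower MSE is better
    return best, scores

# Synthetic data (assumption): y = 2x plus Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(60)]
data = [(x, 2.0 * x + rng.gauss(0, 1.0)) for x in xs]

best_k, scores = grid_search(data, [1, 3, 5, 9])
print(best_k, scores)
```

Every candidate is scored on the same folds, so the comparison between parameter values is like for like.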
Description
Predicting whether Sony sister companies would default, based on historical data.
· Performed data visualization (seaborn) and EDA (dataset.head(), info(), describe(), etc.), identified missing values (Imputer), removed unwanted fields, encoded categorical fields, scaled the data, then trained and tested.
· Model selection based on k-fold cross-validation and grid search to find the best parameters and accuracy.
· Measured accuracy using a confusion matrix and classification report.
· Increased accuracy by 7% (achieving 88% accuracy) by tuning the hyperparameters.
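The evaluation step above, a confusion matrix plus the metrics found in a classification report, can be sketched in plain Python as follows. The labels are made up for illustration and the function names `confusion_counts` and `summarize` are hypothetical; scikit-learn's `confusion_matrix` and `classification_report` would normally be used instead.

```python
from collections import Counter

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = Counter(tp=0, fp=0, fn=0, tn=0)
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts

def summarize(counts):
    """Derive accuracy, precision, recall and F1 from the four counts."""
    tp, fp, fn, tn = (counts[key] for key in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Made-up labels: 1 = defaulter, 0 = non-defaulter.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = summarize(counts)
print(dict(counts))  # {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 3}
print(metrics)
```

For default prediction, recall on the defaulter class is often worth reading alongside accuracy, because a missed defaulter usually costs more than a false alarm.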
Description
POC: Sentiment analysis of Amazon reviews of Sony products
· Cleaned the data using the Python library re to remove numbers, punctuation, etc., keeping only the letters a-z.
· Converted reviews to lowercase.
· Removed irrelevant stopwords from the reviews (using the nltk library) to prepare them for machine learning.
· Stemmed the words using PorterStemmer to keep only the root of each word.
· Created a bag-of-words model using CountVectorizer.
· Applied a classification model to this bag-of-words model.
· Measured accuracy using a confusion matrix and classification report.
· Increased accuracy by 7% by selecting the model through k-fold cross-validation and grid search for the best parameters.
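The cleaning and bag-of-words steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the original code: the tiny `STOPWORDS` set is a hypothetical stand-in for nltk's English stopword list, stemming with `PorterStemmer` is left out to avoid the nltk dependency, `bag_of_words` only mimics what `CountVectorizer` produces, and the two reviews are invented.

```python
import re
from collections import Counter

# Hypothetical stand-in for nltk's English stopword list.
STOPWORDS = {"the", "is", "a", "and", "this", "it", "i", "to", "of"}

def clean(review):
    """Keep letters only, lowercase, split into tokens, drop stopwords."""
    letters_only = re.sub(r"[^a-zA-Z]", " ", review)
    tokens = letters_only.lower().split()
    return [token for token in tokens if token not in STOPWORDS]

def bag_of_words(reviews):
    """Return a sorted vocabulary and one row of token counts per review."""
    documents = [clean(review) for review in reviews]
    vocabulary = sorted({token for doc in documents for token in doc})
    matrix = []
    for doc in documents:
        counts = Counter(doc)
        matrix.append([counts[token] for token in vocabulary])
    return vocabulary, matrix

# Invented sample reviews.
reviews = ["The sound is great, great bass!", "Battery died after 2 days."]

vocabulary, matrix = bag_of_words(reviews)
print(vocabulary)  # ['after', 'bass', 'battery', 'days', 'died', 'great', 'sound']
print(matrix)      # [[0, 1, 0, 0, 0, 2, 1], [1, 0, 1, 1, 1, 0, 0]]
```

Each row of `matrix` is the feature vector a classifier would be trained on, with one column per vocabulary word.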
Description
Generating key insights and suggesting action items to increase sales in the sales regions.
Fetching data from multiple sources and making adjustments to reflect accurate sales and profitability.
Generating insights from the data using Excel pivot tables.
Comparing revenue and profitability with last year and suggesting action items for next year to increase sales and profitability.
Description
Identify the customers most likely to take a personal loan. Data considered: on-us and off-us features such as Salaried, CC Utilization, Recency, Vintage, Enquiry, etc. An XGBoost model is used to rank the dataset into 10 deciles, and the top 3 deciles of probable customers are shared for the campaign every month. Business impact: the top 3 deciles (30% of offers) capture 65% of the business, reducing campaign cost by 30% (50 Cr yearly).
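The decile-based targeting described above can be sketched as follows. The scores and conversion flags are made up and do not reproduce the 65% figure; `assign_deciles` and `capture_rate` are hypothetical helper names, and the scores would in practice come from the trained model rather than a hand-written list.

```python
def assign_deciles(scores):
    """Return a decile per customer in input order (1 = highest scores)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    deciles = [0] * n
    for rank, index in enumerate(order):
        deciles[index] = rank * 10 // n + 1
    return deciles

def capture_rate(deciles, converted, top=3):
    """Share of all conversions that fall in the top `top` deciles."""
    total = sum(converted)
    captured = sum(flag for decile, flag in zip(deciles, converted)
                   if decile <= top)
    return captured / total if total else 0.0

# Made-up model scores for 20 customers (higher = more likely to convert).
scores = [0.05 * i for i in range(1, 21)]
# Made-up outcomes: customers at these positions took the loan.
converted = [0] * 20
for position in (19, 18, 16, 14, 9, 3):
    converted[position] = 1

deciles = assign_deciles(scores)
rate = capture_rate(deciles, converted, top=3)
print(f"Top 3 deciles capture {rate:.0%} of conversions")
```

Comparing the capture rate with the share of customers contacted (30% for three deciles) gives the lift that justifies the smaller campaign.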
Description
Build a risk scorecard model to identify the customers most likely to default. Data considered: on-us and off-us features such as age, occupation, enquiry, active PL, CIBIL score, bounces, etc. An XGBoost model was built to swap in good-performing customers and swap out bad-performing customers. Business impact: swapping in 12 lakh good-segment offers brings revenue growth of 80 Cr monthly; swapping out 8 lakh risky-segment offers eliminates risk worth 10 Cr of business loss monthly.
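A swap-in/swap-out comparison of this kind can be sketched as below. The incumbent policy is not described in the source, so the CIBIL cutoff used here is purely an assumption, as are the model-score cutoff, the customer records, and the function name `swap_sets`.

```python
def swap_sets(customers, max_p_default, min_cibil):
    """Compare an incumbent rule with a model-based rule.

    Swap-in:  rejected by the incumbent rule but accepted by the model.
    Swap-out: accepted by the incumbent rule but rejected by the model.
    """
    swap_in, swap_out = [], []
    for customer in customers:
        old_ok = customer["cibil"] >= min_cibil          # assumed incumbent policy
        new_ok = customer["p_default"] <= max_p_default  # model-based policy
        if new_ok and not old_ok:
            swap_in.append(customer["id"])
        elif old_ok and not new_ok:
            swap_out.append(customer["id"])
    return swap_in, swap_out

# Made-up customers; p_default would come from the scorecard model.
customers = [
    {"id": "A", "cibil": 780, "p_default": 0.02},
    {"id": "B", "cibil": 690, "p_default": 0.03},
    {"id": "C", "cibil": 750, "p_default": 0.12},
    {"id": "D", "cibil": 640, "p_default": 0.20},
]

swap_in, swap_out = swap_sets(customers, max_p_default=0.05, min_cibil=700)
print(swap_in, swap_out)  # ['B'] ['C']
```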
Description
Identify customers likely to attrite and offer a top-up or reduced IRR on the next PL to retain the book. Data considered: on-us, off-us, and Account Aggregator features such as Salaried, CC Utilization, Recency, Vintage, Enquiry, ABB > 25K, Retention Ratio, etc. A Random Forest model is used to rank the dataset into 10 deciles, and customers in the top 3 deciles, those most likely to attrite, are offered a top-up or lower IRR to retain the book. Business impact: a foreclosure rate of 10% accounts for 100 Cr of book run-off monthly; implementing the ML model reduced foreclosure by 1% (10 Cr book growth) monthly.
Description
New lead generation using text mining to identify new opportunities in Corporate Solutions based on keywords such as "to build", "to construct", "acquire", and "merge". Python and Elasticsearch are used to mine news from an external data source (Dow Jones) on a weekly basis, and the resulting leads are forwarded to the Sales team to pursue new business. Benefit case: the model generated new business, increasing new business revenue by 4%.
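The keyword-matching idea can be illustrated with a small regular-expression sketch. It deliberately does not use Elasticsearch or any Dow Jones feed: the headlines are invented, the stems in `KEYWORD_STEMS` are an assumption derived from the keywords listed above, and `find_leads` is a hypothetical helper name.

```python
import re

# Assumed stems for the listed keywords, so "acquires" and "merger" also match.
KEYWORD_STEMS = ["build", "construct", "acquir", "merg"]
PATTERN = re.compile(r"\b(?:" + "|".join(KEYWORD_STEMS) + r")\w*", re.IGNORECASE)

def find_leads(headlines):
    """Return (headline, matched words) for headlines containing a keyword."""
    leads = []
    for headline in headlines:
        hits = sorted({match.group(0).lower()
                       for match in PATTERN.finditer(headline)})
        if hits:
            leads.append((headline, hits))
    return leads

# Invented headlines standing in for the weekly news pull.
headlines = [
    "Acme Corp to build a new plant in Pune",
    "Quarterly results beat estimates",
    "Globex acquires Initech and plans to merge operations",
]

leads = find_leads(headlines)
for headline, hits in leads:
    print(hits, "->", headline)
```

Plain prefix matching is crude: it misses related forms such as "acquisition", so a production search would more likely rely on the search engine's own stemming and analyzers.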
Description
The Customer 360 tool generates automated reports, dashboards, and analyses. Regular analyses currently supported include sales forecasting, pipeline creation, cross-selling, loss cause analysis, likelihood to renew policies, and profitability analysis. Benefit case: removed manual intervention for analysis and regular BAU reports.