About Me
Data Scientist with a strong programming background and expertise in applying predictive modeling, data processing, and data mining algorithms to solve challenging business problems.
Involved in the Python open-source community and passionate about d...
Skills
Data & Analytics
Programming Language
Database
Web Development
Others
Positions
Portfolio Projects
Company
Sentiment analysis
Description
Used product reviews to analyze customer sentiment toward the product.
Split the dataset randomly into training and test sets (70/30), then applied TF and TF-IDF weighting (unigrams and bigrams) to build a sparse feature matrix.
Compared several models using cross-validation, including Decision Tree and Logistic Regression as well as deep learning techniques such as LSTM and ANN (Artificial Neural Network), to select the best model, then evaluated it on the test data.
Calculated accuracy, precision, and recall from the confusion matrix, and computed the ROC curve and AUC to assess the trade-off between true positive rate and false positive rate.
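The steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code: the function names (split_70_30, unigrams_and_bigrams, tfidf_vectors, confusion_metrics) are hypothetical, tokenization is plain whitespace splitting, and the weighting is the unsmoothed tf * log(N / df) form, which may differ from what a given library computes.

```python
import math
import random
from collections import Counter


def split_70_30(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) in a 70/30 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * 0.3)
    return shuffled[n_test:], shuffled[:n_test]


def unigrams_and_bigrams(text):
    """Lowercase, split on whitespace, and return unigrams followed by bigrams."""
    tokens = text.lower().split()
    return tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def tfidf_vectors(documents):
    """Return one sparse {term: weight} dict per document (tf * log(N / df))."""
    counts = [Counter(unigrams_and_bigrams(doc)) for doc in documents]
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for c in counts for term in c)
    n_docs = len(documents)
    return [
        {term: tf * math.log(n_docs / doc_freq[term]) for term, tf in c.items()}
        for c in counts
    ]


def confusion_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

Storing each document as a dictionary of its non-zero terms mirrors the sparse matrix mentioned above: most n-grams never occur in a given review, so only the terms that do occur are kept.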
Skills
Spyder
Tools
Spyder
Company
Patient Records
Description
Used patient record data to build a model that predicts when a patient will be readmitted to the hospital.
Developed a classification model for predicting the probability of hospital readmission.
The output was used by insurance providers and the hospital accounts team.
Captured readmission risk by categorizing patient attributes as numerical, categorical, or text.
Used data from more than 150,000 hospital admissions, with logistic regression as the machine learning technique.
Applied K-means clustering, an unsupervised learning method, to cluster the data.
Applied PCA (Principal Component Analysis) to reduce the dimensionality of the dataset.
Calculated the ROC curve and AUC to assess the trade-off between true positive rate and false positive rate.
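As an illustration of the ROC and AUC step, the sketch below builds the curve from predicted readmission probabilities and integrates it with the trapezoidal rule. It uses only plain Python; the function names (roc_points, auc) and the sample labels and scores are made up for the example and are not taken from the project.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists, sweeping the decision threshold from high to low.

    Assumes binary labels (1 = readmitted) with at least one example of each class.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(fp / negatives)
            tpr.append(tp / positives)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for x0, x1, y0, y1 in zip(fpr, fpr[1:], tpr, tpr[1:])
    )


# Hypothetical labels and predicted probabilities, for illustration only.
labels = [0, 0, 1, 1]
probabilities = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(labels, probabilities)
area = auc(fpr, tpr)
```

Each point on the curve corresponds to one threshold: lowering it flags more patients as likely readmissions, which raises the true positive rate at the cost of a higher false positive rate.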
Skills
Spyder
Tools
Spyder