About Me
Data scientist with six years of experience in predictive modeling, natural language processing, data processing, and data mining, applied to solving challenging business problems. Strong background in computer programming languages, and knowledge of various types of ...
Skills
Portfolio Projects
Description
Built a dashboard from sales data spread across multiple tables. I first connected SQL to Tableau/Power BI and extracted the sales data, then created a schema and joined the tables through data modeling. After that, I cleaned and formatted the data, and finally built the visuals in Tableau/Power BI per the client's requirements.
ETL process
Data Modeling
Data Visualization
Scheduling
Data Security
Data Management
Description
Take a sentence and convert it into a vector, then do the same for many other sentences. Find the sentences with the smallest Euclidean distance or the smallest angle (highest cosine similarity) between them. Words and sentences are converted into high-dimensional vectors whose geometric position encodes meaning, which gives a measure of semantic similarity between sentences.
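The comparison step described above can be sketched in plain Python. This is a minimal illustration, not the project's code: the three-dimensional vectors are made-up toy values (real sentence embeddings have hundreds of dimensions), and the function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def most_similar(query, candidates):
    """Index of the candidate with the highest cosine similarity to the query."""
    return max(range(len(candidates)),
               key=lambda i: cosine_similarity(query, candidates[i]))

# Toy "sentence vectors" (hypothetical values, for illustration only)
query = [1.0, 0.0, 1.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
best = most_similar(query, candidates)  # index of the closest candidate by angle
```

Cosine similarity ignores vector length, so the second candidate counts as a perfect match even though its Euclidean distance from the query is not zero. Which measure fits better depends on whether the magnitude of the embedding carries meaning.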
Description
Researched and implemented NLP methods to extract relevant information from 50k+ SEC filings using Stanford CoreNLP and spaCy in Python. Developed ML and NLP models to analyze the business model and board leadership structure of companies, using text classifiers built with NLTK and scikit-learn that reached 90% validation accuracy.
Description
Identification of financial documents based on the text they contain. Applied OCR to scanned documents and extracted text from PDF documents using the pypdf and ocrmypdf libraries. Pre-processed the extracted text with the re and nltk libraries. Converted the text into vectors using TF-IDF and Word2Vec. Classified the text with ML algorithms such as SVM, Decision Tree, Random Forest, and XGBoost. Used deep learning techniques such as BERT, DistilBERT, and encoder-decoder models for better accuracy on sequential data. After classification, integrated the model into an application using Flask.
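To make the TF-IDF vectorization step concrete, here is a small sketch in plain Python. It uses the textbook weighting, term frequency times log(N / document frequency), which is an assumption: library implementations such as scikit-learn's apply smoothing and normalization, so their numbers differ. The function name and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return (vocabulary, weights), where weights holds one {term: tf-idf} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return sorted(df), weights

# Hypothetical snippets of extracted document text
docs = ["invoice total due", "bank statement total"]
vocabulary, weights = tfidf(docs)
```

A term that occurs in every document, such as "total" here, receives a weight of zero, while terms specific to one document type keep a positive weight. Those are the features a downstream classifier such as an SVM can separate on.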
Description
Used the tabula library to extract tabular data from PDFs, and object detection techniques to identify tables dynamically. Annotated the tables with labelme and used Faster R-CNN, YOLOv3, and YOLOv5 for table detection. After detection, used clustering techniques to generate each table in CSV format. Prepared summaries of large documents with extractive and abstractive methods, using the gensim and sumy libraries and methods such as LSA and LexRank. For topic modeling, worked with LDA to identify the topics in a large corpus.
Description
Built an analytics engine to determine the critical reception of an artist's work on social media using sentiment analysis and opinion mining in Python. Developed a data pre-processing module with NLTK involving multiple NLU steps, and used sentiment scoring for feature engineering. Built a tweet sentiment classifier using an ensemble of Naive Bayes and Logistic Regression with 87.89% accuracy.
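One common way to combine two classifiers is soft voting, which averages their predicted class probabilities. The sketch below assumes that approach for illustration only; the description above does not say how the Naive Bayes and Logistic Regression models were actually combined. The function name and the probability values are hypothetical.

```python
def soft_vote(model_probs, weights=None):
    """Average per-class probabilities across models and return (label, averaged).

    model_probs: one list of class probabilities per model, all the same length.
    weights: optional per-model weights; equal weighting if omitted.
    """
    if weights is None:
        weights = [1.0] * len(model_probs)
    total = sum(weights)
    n_classes = len(model_probs[0])
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: averaged[c])
    return label, averaged

# Hypothetical outputs for one tweet: [P(negative), P(positive)]
naive_bayes_probs = [0.6, 0.4]
logistic_regression_probs = [0.3, 0.7]
label, averaged = soft_vote([naive_bayes_probs, logistic_regression_probs])
```

Here the two models disagree, and the more confident one decides the outcome. A hard vote over predicted labels would instead end in a tie.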
Description
Pre-processed the data and indexed it as a time series. Visualized the data using time-series decomposition to separate the series into three distinct components: trend, seasonality, and noise. Built and trained forecasting models using LSTM and the Prophet time-series model to predict the sales amount. Assessed prediction accuracy by plotting actual against predicted values, and plotted the sales forecast with the Matplotlib library.
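The decomposition into trend, seasonality, and noise can be illustrated with a simple additive model in plain Python. This is a sketch under stated assumptions, not the project's code: it handles only odd periods, estimates the trend with a centered moving average, and runs on a synthetic series. Libraries such as statsmodels provide fuller implementations.

```python
def decompose_additive(series, period):
    """Split a series into (trend, seasonal, residual) with series = trend + seasonal + residual.

    Trend and residual are None at the edges, where the moving-average window does not fit.
    """
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    if len(series) < 2 * period:
        raise ValueError("need at least two full cycles")
    half = period // 2
    n = len(series)
    # Trend: centered moving average over one full cycle
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = sum(series[i - half:i + half + 1]) / period
    # Seasonality: mean detrended value at each position in the cycle
    buckets = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            buckets[i % period].append(series[i] - trend[i])
    means = [sum(b) / len(b) for b in buckets]
    center = sum(means) / period  # re-center so the seasonal effects sum to zero
    seasonal = [means[i % period] - center for i in range(n)]
    # Noise: whatever trend and seasonality do not explain
    residual = [None if trend[i] is None else series[i] - trend[i] - seasonal[i]
                for i in range(n)]
    return trend, seasonal, residual

# Synthetic sales series: linear growth plus a repeating 3-step pattern
pattern = [1, 0, -1]
sales = [i + pattern[i % 3] for i in range(9)]
trend, seasonal, residual = decompose_additive(sales, period=3)
```

On this synthetic series the trend recovers the linear growth, the seasonal component recovers the repeating pattern, and the residual is zero. On real sales data the residual shows how much variation the two structured components leave unexplained.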
Description
This project predicts whether a particular customer will cease his or her relationship with a company. Pre-processed a dataset of more than 1 million records and hundreds of features. Built and trained machine learning models using Logistic Regression, Random Forest, XGBoost, and a Voting Classifier. Evaluated the predictions using classification reports, the confusion matrix, and the AUC score.
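The evaluation metrics mentioned above can be computed by hand on a tiny example. The sketch below is illustrative only: the labels and scores are made up, the function names are hypothetical, and the pairwise AUC is quadratic in the number of samples, so a dataset of a million records would call for a rank-based or library implementation instead.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, where 1 means the customer churned."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return tn, fp, fn, tp

def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up labels and predicted churn probabilities for four customers
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

matrix = confusion_matrix(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

The confusion matrix depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds. That is why the two are usually reported together for imbalanced problems such as churn.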
+1 646 305 2118
+91 9875 492266
