Marcin W.

Data Scientist & Machine Learning Engineer

United Kingdom

Experience: 17 Years

115200 USD / Year

  • Immediate: Available

About Me

He has almost 20 years of experience in computer science, in both industry and academia. He focuses on solutions that increase revenue and reduce costs using the simplest possible means for the problem (the KISS principle), oft...

He has worked with a wide variety of platforms and solutions across the whole data processing pipeline: from gathering, modeling, and storing data, through analyzing it and finding insights, to building machine learning models and deploying them in production.

He also has experience presenting results publicly at top scientific conferences, as well as giving academic lectures and mentoring younger colleagues. As an independent academic researcher, he developed strong time management and organizational skills.

Moreover, he has coordinated the work of data processing teams at several startups, supervised students working toward their master's and PhD degrees, and coordinated the efforts of experienced scientists on research projects.
 

Portfolio Projects

Ischemic Stroke Risk Assessment

https://mwylot.net/portfolio/stroke-risk-assessment

Company

Ischemic Stroke Risk Assessment

Description

The goal of this project was to develop an AI backend engine for an intelligent decision support system that assesses the risk of ischemic stroke. The system was developed to enable preventive interventions for patients at high risk of a stroke. In collaboration with a health insurer, we collected historical electronic health records, sociodemographic data, and quality-of-life data to train and evaluate machine learning models.

The data was analyzed to find which aspects of the collected datasets had the highest impact on ischemic stroke risk. This was an iterative process carried out with multiple medical doctors and researchers. It led to the selection of more than two hundred candidate features, which were then engineered and formed the basis for the feature selection process.

During the model selection phase, several models were evaluated, e.g., XGBoost, SVM, Random Forest, and a Neural Network. Area under the ROC curve (AUC) was used as the performance metric because the data was imbalanced. The best result, 0.86, was achieved by a Neural Network with three hidden layers of 32 neurons each. The model was deployed and served as a REST API within a Docker container, which enabled horizontal scalability of the solution.
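To illustrate why the area under the ROC curve suits imbalanced data, here is a minimal pure-Python sketch of the metric using the rank-sum (Mann-Whitney) identity. It is not the project's evaluation code. The function name `roc_auc` and the toy labels and scores are assumptions made for this example.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative example")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy imbalanced data (hypothetical): 18 negatives, 2 positives.
toy_labels = [0] * 18 + [1] * 2
constant_scores = [0.0] * 20  # a "model" that always predicts the majority class
accuracy = sum(1 for y in toy_labels if y == 0) / len(toy_labels)
auc_constant = roc_auc(toy_labels, constant_scores)
```

On this toy data the constant predictor reaches 90% accuracy, yet its AUC is 0.5, the value expected from random ranking. That gap is the reason AUC is often preferred over accuracy when positives are rare.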

Tools

Keras

Company

Predicting Blood Transfusion Needs

Description

The goal of this project was to develop an AI-backed intelligent decision support system that assists medical doctors in making decisions about blood transfusions. In collaboration with multiple hospitals, we collected historical data about blood transfusions, patients, and medical tests relevant to blood transfusion.

The data was analyzed to find which aspects of the collected datasets had the highest impact on whether a blood transfusion was performed. This was an iterative process carried out with multiple medical doctors and researchers. It led to the selection of multiple factors, which formed the basis for the feature selection process.

During the model selection phase, several models were evaluated, e.g., XGBoost, SVM, Random Forest, k-nearest neighbors, and a Neural Network. Area under the ROC curve (AUC) was used as the performance metric because the data was imbalanced. After model selection and hyperparameter tuning, the performance metric value was 0.896. The model was deployed and served as a REST API within a Docker container, which enabled horizontal scalability of the solution.
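As a rough illustration of serving a model as a REST API, the sketch below exposes a stateless prediction endpoint using only the Python standard library. It is not the project's service. The `/predict` path, the feature names, the weights in `predict_risk`, and the helper `make_server` are all hypothetical placeholders standing in for a trained model.

```python
import json
import math
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict_risk(features):
    """Placeholder scorer: a fixed logistic function, NOT a trained model."""
    # Hypothetical weights and feature names, for illustration only.
    weights = {"age": 0.03, "hemoglobin": -0.2}
    z = sum(w * float(features.get(name, 0.0)) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404, "unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            features = json.loads(self.rfile.read(length))
            body = json.dumps({"score": predict_risk(features)}).encode()
        except (ValueError, TypeError, AttributeError):
            self.send_error(400, "expected a JSON object of numeric features")
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(host="127.0.0.1", port=8080):
    """Build the HTTP server; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), PredictHandler)
```

Calling `make_server().serve_forever()` starts the service. Because the handler keeps no state between requests, several identical container replicas can sit behind a load balancer, which is the usual basis for horizontal scaling.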

Tools

Keras

SolarData - Simulation the Performance of Photovoltaic Energy Systems

https://mwylot.net/portfolio/solardata-simulation-the-performance-of-photovoltaic-energy-systems

Company

SolarData - Simulation the Performance of Photovoltaic Energy Systems

Description

This platform simulates the amount of energy produced by a photovoltaic energy system. It extracts weather data from GRIB files (a binary format for weather data). Based on this data and geographic coordinates, the system computes how much energy a PV installation can produce for particular weather conditions and a particular location. Finally, it generates a graphical report that presents this data in charts.
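To give a feel for the weather-to-energy step, here is a simplified textbook sketch. It is not SolarData's actual model chain. It assumes a NOCT-based cell temperature estimate and a PVWatts-style linear DC power model, and it treats the irradiance values as already being in the plane of the array. All function names, default parameters, and sample values are assumptions for illustration.

```python
def cell_temperature(air_temp_c, irradiance_w_m2, noct_c=45.0):
    """Estimate cell temperature with the standard NOCT approximation."""
    return air_temp_c + (noct_c - 20.0) / 800.0 * irradiance_w_m2


def dc_power_w(irradiance_w_m2, air_temp_c,
               rated_power_w=5000.0, temp_coeff_per_c=-0.004):
    """PVWatts-style DC power: scales with irradiance, derated by cell heat.

    rated_power_w is the array rating at 1000 W/m^2 and 25 C (assumed value);
    temp_coeff_per_c is the relative power change per degree C (assumed value).
    """
    t_cell = cell_temperature(air_temp_c, irradiance_w_m2)
    power = (rated_power_w * (irradiance_w_m2 / 1000.0)
             * (1.0 + temp_coeff_per_c * (t_cell - 25.0)))
    return max(power, 0.0)


def energy_kwh(samples, step_hours=1.0):
    """Sum energy over (irradiance, air temperature) samples of equal duration."""
    return sum(dc_power_w(g, t) for g, t in samples) * step_hours / 1000.0


# Hypothetical hourly samples: (irradiance in W/m^2, air temperature in C).
hourly_samples = [(0.0, 12.0), (400.0, 16.0), (800.0, 20.0), (400.0, 18.0)]
daily_yield_kwh = energy_kwh(hourly_samples)
```

A production system would add further steps, such as transposing irradiance to the panel plane and modeling inverter losses. The sketch only shows how irradiance and temperature combine into an energy estimate.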

SolarData is used to evaluate current performance and to determine the future value of PV generation projects, expressed as the predicted energy yield. By extension, it influences how PV projects and technologies are perceived in terms of investment risk. SolarData uses the PV Performance Modeling Collaborative (PVPMC) framework to simulate the performance of a defined photovoltaic energy system at a specific location.

Skills

Data Engineer

Tools

Python

Patient Blood Management

Company

Patient Blood Management

Description

We build a patient blood management system to optimize the use of blood throughout the process from donation to transfusion. I design and implement machine learning algorithms and expert systems to analyze medical data. I work with Python and its machine learning and data science packages (NumPy, pandas, scikit-learn, Keras, notebooks), decision rules (durable_rules), Docker, and Docker Compose.

Tools

Numpy