About Me
Data Engineer with a strong background in data analysis, machine learning, and software development. Proven experience in developing and deploying data-driven solutions to solve complex business problems. Skilled in Python, SQL, ETL, and cloud comput...
Skills
Positions
Portfolio Projects
Description
I developed a clustering algorithm with PySpark that groups bike stations by either their location or their characteristics, using the static geographical information for CityBike's stations in Brisbane, available in JSON format as "Brisbane_CityBike.json". I then wrote code to industrialize the clustering algorithm so that it can be launched daily on 10 GB of data in a Spark cluster.
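As a rough illustration of location-based clustering, here is a minimal sketch and not the project's actual code. It uses plain-Python k-means in place of PySpark so that it runs without a cluster, and the station records and the field names `name`, `latitude` and `longitude` are hypothetical.

```python
import json
import random

# Hypothetical sample records; the real file's schema may differ.
RAW_STATIONS = """[
  {"name": "A1", "latitude": -27.460, "longitude": 153.020},
  {"name": "A2", "latitude": -27.470, "longitude": 153.030},
  {"name": "A3", "latitude": -27.465, "longitude": 153.025},
  {"name": "B1", "latitude": -27.560, "longitude": 153.080},
  {"name": "B2", "latitude": -27.570, "longitude": 153.090},
  {"name": "B3", "latitude": -27.565, "longitude": 153.085}
]"""


def squared_distance(p, q):
    """Squared Euclidean distance between two (lat, lon) pairs."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, iterations=20, seed=0):
    """Cluster (lat, lon) pairs with a fixed number of Lloyd iterations."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centroids


stations = json.loads(RAW_STATIONS)
points = [(s["latitude"], s["longitude"]) for s in stations]
labels, centroids = kmeans(points, k=2)
clusters = {s["name"]: label for s, label in zip(stations, labels)}
```

On a Spark cluster the same two steps would normally be delegated to a distributed k-means implementation rather than written by hand.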
Description
I used Python to retrieve data and metadata from my Spotify stories as a JSON file. I then cleaned the data with a pandas DataFrame, keeping the relevant fields and checking for null values. Once cleaned, the data is loaded into a SQL database using SQLAlchemy so that I can run SQL queries to analyse my stories data. I also created a workflow with Apache Airflow to visualize the process as a DAG structured like an ETL process.
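The extract, clean and load steps can be sketched roughly as follows. This is an illustrative outline, not the project's code: the standard-library `json` and `sqlite3` modules stand in for pandas and SQLAlchemy so the example runs without extra dependencies, and the records, the `plays` table and the field names `track`, `artist` and `played_at` are all hypothetical.

```python
import json
import sqlite3

# Hypothetical sample export; the real data's fields may differ.
RAW = """[
  {"track": "Song A", "artist": "Artist X", "played_at": "2023-01-01T10:00:00"},
  {"track": "Song B", "artist": null,       "played_at": "2023-01-01T10:05:00"},
  {"track": "Song C", "artist": "Artist X", "played_at": "2023-01-01T10:09:00"},
  {"track": "Song D", "artist": "Artist Y", "played_at": "2023-01-01T10:15:00"}
]"""

FIELDS = ("track", "artist", "played_at")


def extract(raw):
    """Extract: parse the JSON export into a list of dicts."""
    return json.loads(raw)


def transform(records):
    """Transform: keep only the relevant fields and drop rows with nulls."""
    return [
        tuple(record[field] for field in FIELDS)
        for record in records
        if all(record.get(field) is not None for field in FIELDS)
    ]


def load(rows, conn):
    """Load: write the cleaned rows into a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS plays (track TEXT, artist TEXT, played_at TEXT)"
    )
    conn.executemany("INSERT INTO plays VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW)), conn)

# Example analysis query: number of plays per artist.
plays_per_artist = conn.execute(
    "SELECT artist, COUNT(*) FROM plays GROUP BY artist ORDER BY artist"
).fetchall()
```

Keeping each stage as its own function means each one could be wrapped as a separate task in a workflow scheduler, which matches the extract, transform, load shape of the DAG.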
Description
The project consists of connecting Alteryx to various data sources and using Alteryx features to prepare the data. Preparation consists of cleaning the data and blending different tables on certain columns. I then created different charts to visualize the data in order to get insights.