About Me
Young professional working towards a diploma in Data Science with experience in Python. Focused on PySpark, database administration, SQL, NoSQL, Airflow, OOP, AWS Cloud, Docker, ETL, Django, data modeling with Dash Plotly, and the command line....
Skills
Positions
Portfolio Projects
Description
In this project:
- I used speedtest-CLI (Linux software) to collect data about my home internet speed, and scheduled the task with Cron (Linux software) so that it repeats periodically.
- I then organized the output into a CSV dataset using Pandas (Python module).
- From the dataset, I produced an interactive graph of my upload and download rates using Plotly (Python module).
- Finally, I shared the graph on the internet through S3 (AWS Simple Storage Service) using Boto3 (Python module).
Link to see the output on the AWS server: GRAPH
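As a rough illustration of the collection and ingestion steps above, the sketch below turns one JSON record of the kind printed by `speedtest-cli --json` into a CSV row. It is not the project's code: the function names, the column names, and the sample values are made up for illustration, and it uses the standard `csv` module instead of Pandas to keep the example dependency-free.

```python
import csv
import io
import json

# Hypothetical column layout; the project's real dataset may differ.
FIELDS = ["timestamp", "download_mbps", "upload_mbps", "ping_ms"]


def to_row(raw_json):
    """Convert one speedtest-cli JSON record into a flat dict.

    speedtest-cli reports download/upload in bits per second,
    so both are converted to megabits per second here.
    """
    record = json.loads(raw_json)
    return {
        "timestamp": record["timestamp"],
        "download_mbps": round(record["download"] / 1_000_000, 2),
        "upload_mbps": round(record["upload"] / 1_000_000, 2),
        "ping_ms": round(record["ping"], 2),
    }


def append_row(csv_file, row, write_header=False):
    """Append one row to an open CSV file object."""
    writer = csv.DictWriter(csv_file, fieldnames=FIELDS)
    if write_header:
        writer.writeheader()
    writer.writerow(row)


# Made-up sample record, shaped like speedtest-cli's JSON output.
sample = json.dumps({
    "timestamp": "2022-01-01T12:00:00Z",
    "download": 95_500_000.0,
    "upload": 20_250_000.0,
    "ping": 12.5,
})

buffer = io.StringIO()  # stands in for the real CSV file
row = to_row(sample)
append_row(buffer, row, write_header=True)
print(buffer.getvalue())
```

In a real run, the `StringIO` buffer would be replaced by the dataset file opened in append mode, and the header would be written only when the file is first created.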
To use the code for the first time:
- First, install the requirements by running "install_requirements.sh" with bash.
- Second, create the dataset by running "creatingDataset.py" with Python.
Use "mainFile.py" to run:
os.system("bash speedTest.sh")
To collect data about your internet speed.
os.system("python3 populatingSpeedtest.py")
To populate the database with the new data.
os.system("python3 dash_plotly.py")
To visualize the models in the dashboard.
To see the dashboard, open "http://127.0.0.1:8050/" in your web browser.
You can also view the database in the terminal with "tabulateSpeedtest.py".
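The three `os.system` calls listed above run one after another regardless of whether the previous step succeeded. A possible variant, sketched below using the script names given above, relies on `subprocess.run` with `check=True` so that the pipeline stops at the first failing step. The `run_pipeline` function and its `runner` parameter are hypothetical names introduced for this example, not part of the project.

```python
import subprocess

# Steps in the order described above; script names come from the project text.
STEPS = [
    ["bash", "speedTest.sh"],               # collect internet speed data
    ["python3", "populatingSpeedtest.py"],  # add the new data to the database
    ["python3", "dash_plotly.py"],          # start the dashboard
]


def run_pipeline(steps=STEPS, runner=subprocess.run):
    """Run each step in order and stop at the first failure.

    `runner` is injectable so the function can be tried out
    without actually executing the scripts.
    """
    completed = []
    for command in steps:
        # With subprocess.run, check=True raises CalledProcessError
        # when the command exits with a non-zero status.
        runner(command, check=True)
        completed.append(command[-1])
    return completed


# Dry run with a stand-in runner that only records the commands.
calls = []
done = run_pipeline(runner=lambda cmd, check: calls.append(cmd))
print(done)
```

Calling `run_pipeline()` with no arguments from the project folder would execute the real scripts in the same order.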
Description
General description
- Project I built during my internship at the Ministry of Communications of Brazil to integrate spreadsheets from different sectors linked to GESAC Financial Control (WIFI-BRAZIL, which aims to bring internet connections to public schools).
Final goal:
- Load the data into PowerBI to create a model that integrates the different sectors linked to GESAC Financial Control.
PowerBI
Spreadsheets:
- Control of Parliamentary Amendments
Emendas2021_resumo.xlsx
- Control of Credit Note and Commitment
Controle de Empenhos e NC 2022.xlsm
Challenges:
- Business understanding: preparing formal documents to understand the data, based on the initial stages of CRISP-DM, and creating a data dictionary.
- Making recommendations so that future data can be entered atomically, following database normalization.
- Advanced data ingestion to read Excel files with different formats.
financial_control_wifi-BR.py
- Creating a primary key for the worksheets with the names of Deputies and Senators, to work around inconsistencies in the data entered by different professionals at the Ministry of Communications, so that Financial Control could then be modeled in PowerBI.
- Creating an algorithm that compares entities to fill in the civil name of every Parliamentarian. The matching relied on a dataset listing the different parliamentary names alongside each person's civil name (spreadsheet: Proponentes.xlsx), a database I built from an extraction of the Open Government Data.
Public Data
- CÂMARA
- SENADO
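The entity-comparison challenge described above can be approximated in a few lines. The sketch below is not the algorithm used in the project: it assumes a lookup from known parliamentary names to civil names (standing in for Proponentes.xlsx), normalizes accents and case, and uses `difflib.get_close_matches` from the standard library as the comparison. The function names are hypothetical and all names in the sample data are invented.

```python
import difflib
import unicodedata


def normalize(name):
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def build_index(known_names):
    """Map each normalized parliamentary name to its civil name."""
    return {normalize(alias): civil for alias, civil in known_names.items()}


def civil_name(entered, index, cutoff=0.8):
    """Return the civil name for a hand-typed name, or None if nothing is close."""
    key = normalize(entered)
    if key in index:
        return index[key]
    matches = difflib.get_close_matches(key, list(index), n=1, cutoff=cutoff)
    return index[matches[0]] if matches else None


# Invented sample data: parliamentary name -> civil name.
known = {
    "Zé da Silva": "José Carlos da Silva",
    "Maria Souza": "Maria Aparecida de Souza",
}
index = build_index(known)

print(civil_name("ZE DA SILVA ", index))    # same name after normalization
print(civil_name("Maria Sousa", index))     # one-letter typo, fuzzy match
print(civil_name("Unknown Person", index))  # nothing close enough
```

The resolved civil name could then serve as the key that joins the worksheets. The `cutoff` value is a guess and would need tuning against real entries, since a threshold that is too low can merge two different people.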
Description
myScripts
Here I have gathered my main scripts. DESCRIPTION OF THE FILES and FOLDERS:
FOLDERS:
calcalus_2: a project I created to help solve Calculus 2 problems using "SymPy", a Python module for advanced math operations such as integrals and derivatives.
gpx:
In this project, I tested an open-source running app and used its GPX data to explore the information it records and to produce graphs with "mplleaflet" and "folium", both Python modules.
speedtest-CLI_dataEngineering:
This project has its own separate repository. You can find it at the link below:
https://github.com/s33ding/speedtest-CLI_dataEngineering.
teaching_english:
In this project I made a program to help my girlfriend practice English listening, verb conjugation, and spelling while also practicing typing on the keyboard. For this I used:
- gtts and playsound (Python modules) - to play the sound of the words.
- os (Python module) - to work with the file system.
- pynput (Python module) - to control keyboard input.
- colorama (Python module) - to color the terminal output, showing which part of the text the person has reached while typing the spelling.
- random (Python module) - to pick a random word from the data frame for the program to use.
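The core loop of such a trainer (pick a word, then find where the typed spelling goes wrong) might look like the sketch below. It is only an illustration under simplifying assumptions: the word list is a plain Python list rather than a data frame, the function names are hypothetical, and audio playback, keyboard capture, and terminal colors are left out.

```python
import random


def pick_word(words, rng=random):
    """Pick a random word to practise."""
    return rng.choice(words)


def first_error(target, typed):
    """Return the index of the first wrong character, or -1 if the spelling is correct.

    A missing or extra character counts as an error at the
    position where the two strings stop agreeing.
    """
    for i, expected in enumerate(target):
        if i >= len(typed) or typed[i] != expected:
            return i
    return -1 if len(typed) == len(target) else len(target)


# Invented word list; a seeded generator keeps the demo repeatable.
words = ["house", "water", "friend"]
word = pick_word(words, random.Random(42))
print(word)
print(first_error("house", "hose"))
```

The returned index is the position a terminal color change could highlight, so the learner sees exactly where the typed word diverges.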