Now you can Instantly Chat with Satyaranjan!
About Me
Around 3 years of IT experience with Python, PySpark, Hive, and other Hadoop components. Good understanding of Hadoop components for handling structured and semi-structured data for data quality checks in a big data environment. Developed a data quality product for a banking client, per business requirements, using Python and PySpark in a distributed environment (the Hadoop ecosystem). Good knowledge of Python packages for collecting data from different sources and cleansing it by applying data quality and data profiling rules. Experienced in handling large datasets and applying transformations and data quality checks to large DataFrames using PySpark. Banking domain experience, including 2 years with Lloyds Banking Group in different roles and responsibilities. Hands-on experience with Hadoop components, Python, PySpark, Hive, and HDFS, using various Python and PySpark packages. Good at understanding and analyzing business rules for data quality and data profiling and implementing them in PySpark and Python. Decent knowledge of software methodologies such as Agile, and of JIRA boards for sprint activity and task management. Flexible and versatile; able to adapt to new technologies and work in challenging environments.
Portfolio Projects
Description
The project was to generate a report for the bank covering all customers who hold accounts in different countries, using data from different source systems. Because the data volume was very large, we used Hadoop and various Hadoop components to perform ETL on the data and generate an XML report from it.