Nitish S.

Java / Scala / AWS / Big Data Developer

Lucknow, India

Experience: 5 Years

Rate: 50,516.20 USD / Year

  • Availability: Immediate

About Me

Technologies experienced in: Java, Scala

Big Data components experienced in: deep knowledge of the Big Data ecosystem, including Spark, Spark Streaming, Kafka, Kafka Streams, HBase, Hive, ZooKeeper, YARN, MapReduce, Docker, Sqoop, MongoDB, JDB...

Skills

Portfolio Projects

Description

Worked as a Big Data Engineering Analyst in the Citi Data – Big Data & Analytics Engineering organization. The role required prior hands-on experience with the Hadoop ecosystem and Java, and contributed to the architecture, engineering, and custom development of the Hadoop offering within the Citi Big Data Platform.

Responsibilities:
● Involved in requirement analysis, design, coding, and implementation.
● Developed solutions to process data into HDFS, analyzed the data using Spark and Spark Streaming, and produced summary results from Hadoop.
● Used Sqoop to import data from RDBMSs into the Hadoop ecosystem.
● Loaded and transformed structured, semi-structured, and unstructured data sets and analyzed them by running Hive queries and Spark SQL (see the sketch after this list).
● Worked with various file formats: Avro, ORC, Parquet, SequenceFile, text, CSV, XML, etc.
● Managed and reviewed log files.
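
A minimal sketch of the kind of Spark SQL analysis described above, in Scala: it reads Sqoop-landed data from HDFS in two of the file formats listed and runs a summary query. The paths, column names, and query are illustrative assumptions, not the actual Citi pipeline.

import org.apache.spark.sql.SparkSession

object FormatAnalysisSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration; on the cluster this would run under YARN.
    val spark = SparkSession.builder()
      .appName("format-analysis-sketch")
      .master("local[*]")
      .getOrCreate()

    // Placeholder HDFS paths standing in for Sqoop-imported tables.
    val orders = spark.read.parquet("hdfs:///data/orders")
    val customers = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs:///data/customers")

    orders.createOrReplaceTempView("orders")
    customers.createOrReplaceTempView("customers")

    // A summary result of the kind produced from Hadoop, expressed in Spark SQL.
    spark.sql(
      """SELECT c.region, COUNT(*) AS order_count
        |FROM orders o
        |JOIN customers c ON o.customer_id = c.id
        |GROUP BY c.region
        |ORDER BY order_count DESC""".stripMargin
    ).show()

    spark.stop()
  }
}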

Description

NRT is a project that moves data end to end (E2E) in near real time. Worked on development modules such as the HBase collector, Oracle collector, and Kafka collector, as well as on implementation, for the PayPal Payment Data Engineering team. The purpose of this project is to capture all data streams from different sources and store them in a secure cloud stack built on technologies including Hadoop, Spark, and Kafka. We also built new processing pipelines over transaction records, user profiles, files, and communication data ranging from emails to instant messages, and used Spark to enrich and transform data into internal data models powering search, data visualization, and analytics.

Responsibilities:
● Designed and implemented scalable infrastructure and platforms for large-scale data ingestion, aggregation, integration, and analytics in Hadoop, including MapReduce, Spark, Spark Streaming, Kafka, HDFS, and Hive.
● Wrote Sqoop scripts to import, export, and update data between HDFS/Hive and relational databases.
● Developed utilities for importing data from sources such as HDFS and HBase into Spark RDDs.
● Implemented business analysts' requirements end to end using Spark DataFrame functions.
● Designed and created data models for customer data using the HBase query APIs.
● Created Hive tables, then loaded and analyzed data using Hive queries.
● Used Kafka to capture and process real-time and near-real-time streaming data.
● Used Spark SQL and Spark Streaming for streaming data analysis.
● Developed Spark code in Java and Scala to transform data, create DataFrames, and run Spark SQL and Spark Streaming applications in Scala.
● Developed a custom partitioner in Kafka (see the sketch after this list).
● Added a salting mechanism to HBase row keys and the corresponding Spark programs to avoid region hotspotting.
● Implemented Kerberos for authentication.
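
A minimal sketch of a custom Kafka partitioner like the one mentioned above, in Scala. It assumes records are keyed by something like an account ID so that related events stay on one partition; the class name and key semantics are illustrative, not PayPal's actual code.

import java.util.{Map => JMap}
import org.apache.kafka.clients.producer.Partitioner
import org.apache.kafka.common.Cluster
import org.apache.kafka.common.utils.Utils

// Pins all records sharing a key to the same partition via a stable
// murmur2 hash of the key bytes.
class AccountIdPartitioner extends Partitioner {
  override def configure(configs: JMap[String, _]): Unit = ()

  override def partition(topic: String, key: AnyRef, keyBytes: Array[Byte],
                         value: AnyRef, valueBytes: Array[Byte],
                         cluster: Cluster): Int = {
    val numPartitions = cluster.partitionsForTopic(topic).size
    if (keyBytes == null) 0 // keyless records fall back to the first partition
    else Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions
  }

  override def close(): Unit = ()
}

It would be enabled on a producer with props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, classOf[AccountIdPartitioner].getName).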

Jun 2019 - Feb 2020

Description

Amdocs Data Hub (ADH) is an end-to-end platform that enables communication service providers to develop big data solutions spanning data integration, data storage, and reporting. It processes and stores data in a unified data store based on the Amdocs Logical Data Model, and can consolidate and compact that data, then analyze it and report business insights from it. Worked on development modules such as the Golden Gate collector and the Kafka collector, and on the implementation of entities.
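
A minimal sketch, in Scala, of what a Kafka collector stage like the ones above might look like using Spark Structured Streaming: it reads raw change events from a topic and lands them in the unified store as Parquet. The broker address, topic name, and paths are placeholder assumptions, and the spark-sql-kafka connector is assumed to be on the classpath.

import org.apache.spark.sql.SparkSession

object KafkaCollectorSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-collector-sketch")
      .master("local[*]")
      .getOrCreate()

    // Subscribe to a placeholder topic of change events.
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker1:9092")
      .option("subscribe", "cdc-events")
      .load()
      .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

    // Land raw records as Parquet for downstream consolidation and modelling.
    val query = events.writeStream
      .format("parquet")
      .option("path", "hdfs:///adh/raw/cdc-events")
      .option("checkpointLocation", "hdfs:///adh/checkpoints/cdc-events")
      .start()

    query.awaitTermination()
  }
}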

Designation: Sr. Software Engineer.

Responsibilities:

Product Experience:
● Development of different features for the ADH product in:
  - Languages: Java, Scala.
  - Big Data components: in-depth knowledge of the Big Data ecosystem, including Spark, Spark Streaming, Kafka, Kafka Streams, HBase, Hive, ZooKeeper, YARN, MapReduce, and HUE.
  - Hadoop distributions: Cloudera, Hortonworks.
  - Cloud experience: Amazon Web Services (AWS), including EC2, Kinesis, EMR, Amazon Redshift, and S3 (see the sketch after this list).
● Defect fixing with strong debugging skills.
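
A minimal sketch of the AWS side listed above, using the AWS SDK for Java from Scala to list the objects staged in an S3 bucket; the bucket name is a placeholder and credentials are assumed to come from the SDK's default provider chain.

import com.amazonaws.services.s3.AmazonS3ClientBuilder
import scala.collection.JavaConverters._

object S3ListingSketch {
  def main(args: Array[String]): Unit = {
    // Uses the default credential chain (env vars, config profile, or EC2 role).
    val s3 = AmazonS3ClientBuilder.defaultClient()

    // Placeholder bucket; a real job might inspect data staged for an EMR run.
    val listing = s3.listObjectsV2("example-adh-staging-bucket")
    listing.getObjectSummaries.asScala.foreach { obj =>
      println(s"${obj.getKey} (${obj.getSize} bytes)")
    }
  }
}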

On-Site Delivery Experience (Interaction with Customers):
● Assess and understand customer requirements, then provide the required effort and resource estimations.
● Contribute to the architecture, detailed design, and development of varied Big Data solutions.
● Incorporate continuous integration into the delivery line.
● Responsible for designing, coding, and testing solutions deliverable to clients.
● Conduct unit testing and troubleshooting.
● Apply appropriate development tools.
● Set priorities for projects, including equipment and resources, to ensure timely delivery of agreed projects.
● Assess and communicate risk in relation to solution delivery.
● Monitor and challenge KPIs for vendor performance and identify gaps and areas of service improvement.
● Ensure simplification and repeatability of development code.
● Foster an innovative culture and approach across the ETL development team.
● Apply the relevant security and risk management protocols as required.
● Maintain solution documentation as appropriate.
● Collaborate with teams to integrate systems.
● Provide third-level support in post-production as required.
