Nitish S.

Java / Scala / AWS / Big Data Developer

  • Overall Experience: 7 Years  


Time zones ready to work

  • New Delhi [UTC +5:30]

Willing to travel to client location: Yes  

About Me 

Technologies experienced in: Java, Scala

Big Data components experienced in: deep knowledge of the Big Data ecosystem - Spark, Spark Streaming, Kafka, Kafka Streams, HBase, Hive, Zookeeper, YARN, MapReduce, Docker, Sqoop, MongoDB, JDBC, JSON, XML, Google Protocol Buffers, etc.

Hadoop distributions experienced in: Cloudera, Hortonworks.

AWS Cloud experience in:
  • Fit AWS solutions inside a Big Data ecosystem.
  • Leverage Apache Hadoop in the context of Amazon EMR.
  • Identify the components of an Amazon EMR cluster, then launch and configure an Amazon EMR cluster.
  • Use common programming frameworks available for Amazon EMR.
  • Improve the ease of use of Amazon EMR by using Hadoop User Experience (Hue).
  • Use in-memory analytics with Apache Spark on Amazon EMR.
  • Use S3 for storage.
  • Identify the benefits of using Amazon Kinesis for near-real-time Big Data processing.
  • Leverage Amazon Redshift to efficiently store and analyze data.

Technical Expertise:
  • Languages: Java SE
  • Tools: Git, Maven, Putty, Perforce
  • Servers: Apache Tomcat
  • Operating Systems: Windows, UNIX
  • IDE: IntelliJ


Professional Experience:

Clairvoyant Experience – 06/19

Project: ODM (Batch Processing) & Near Real Time (NRT)
Developed a data lake for a client in the financial domain using Spark with Java and Scala on AWS. Streaming was handled with Spark Streaming and Kafka. Kerberos was used for server authentication and to keep data secure.

Stack used: Java 7 & 8, data structures, AWS, Spark, Spark Streaming, Kafka, Kafka Streams, HBase, Hive.
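As an illustration of the kind of ingestion described above, here is a minimal sketch of a Spark Structured Streaming job in Scala that reads from Kafka and lands the data in an S3-backed lake; the broker, topic, and path names are placeholders, not taken from the project.

    import org.apache.spark.sql.SparkSession

    object KafkaToDataLake {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("kafka-to-datalake")
          .getOrCreate()

        // Read the raw event stream from Kafka (broker and topic are placeholders).
        val events = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "transactions")
          .load()
          .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value", "timestamp")

        // Persist each micro-batch to the data lake as Parquet (paths are placeholders).
        events.writeStream
          .format("parquet")
          .option("path", "s3a://datalake/raw/transactions")
          .option("checkpointLocation", "s3a://datalake/checkpoints/transactions")
          .start()
          .awaitTermination()
      }
    }

A job like this needs the spark-sql-kafka connector on the classpath and, on a Kerberized cluster, the appropriate security options on the Kafka source.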

Amdocs Experience: Software Developer - 11/16 - 06/19

Project: Amdocs Data Hub (ADH).

The project was to support and enhance the existing Amdocs product (ADH), which takes data from sources (Oracle, CSV files, etc.) and loads it into the Hadoop environment. The role was to implement enhancements in ADH, analyze the code, and fix bugs. Built some new pipelines from scratch, such as the Kafka Collector, File Collector, and CSV Collector.


Stack used: Java 7 & 8, data structures, AWS, Spark, Spark Streaming, Kafka, Kafka Streams, HBase, Hive, Zookeeper, YARN, HQL, SQL.
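A minimal sketch of the kind of collector described above (not ADH's actual implementation): a Spark job in Scala that pulls a table from Oracle over JDBC and lands it as a Hive table. Connection details, table names, and credentials are placeholders.

    import org.apache.spark.sql.SparkSession

    object OracleToHive {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("oracle-to-hive")
          .enableHiveSupport()
          .getOrCreate()

        // Pull the source table over JDBC (URL, table, and credentials are placeholders).
        val customers = spark.read
          .format("jdbc")
          .option("url", "jdbc:oracle:thin:@db-host:1521/ORCL")
          .option("dbtable", "CRM.CUSTOMERS")
          .option("user", "etl_user")
          .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
          .option("driver", "oracle.jdbc.OracleDriver")
          .load()

        // Land the data in the Hadoop environment as a managed Hive table.
        customers.write.mode("overwrite").saveAsTable("adh.customers")
      }
    }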


Intra-Amdocs Inter Unit project: Updation Tool

Description: The tool was developed as microservices.

Technologies & Tools: Java, Spring Boot, REST services, Couchbase, Core Java, Putty.

Detailed Achievements: Coded Spring Boot services with REST APIs and integrated business logic with a Couchbase database.



Portfolios

FDR

Role:

The Citi Data – Big Data & Analytics Engineering Organization is actively recruiting for a Big Data Engineering Analyst. Candidates with prior hands-on experience of the Hadoop ecosystem will be preferred. The candidate must have Java experience and will contribute to the architecture, engineering, and custom development required of the Hadoop offering within the Citi Big Data Platform.

Responsibilities:
  • Involved in requirement analysis, design, coding, and implementation.
  • Processed data into HDFS by developing solutions; analyzed the data using Spark and Spark Streaming and produced summary results from Hadoop.
  • Used Sqoop to import data from RDBMS into the Hadoop ecosystem.
  • Involved in loading and transforming sets of structured, semi-structured, and unstructured data, analyzing them by running Hive queries and Spark SQL (see the sketch after this list).
  • Worked on various file formats - Avro, ORC, Parquet, sequence files, text files, CSV, XML, etc.
  • Managed and reviewed log files.
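The sketch below illustrates the file-format and Spark SQL work mentioned in the list; the paths, table name, and columns are invented for the example, not taken from the engagement.

    import org.apache.spark.sql.SparkSession

    object FileFormatAnalysis {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("file-format-analysis")
          .enableHiveSupport()
          .getOrCreate()

        // Load the same logical dataset from a few of the formats listed above (paths are placeholders).
        val parquetDf = spark.read.parquet("/data/landing/accounts_parquet")
        val orcDf     = spark.read.orc("/data/landing/accounts_orc")
        val csvDf     = spark.read.option("header", "true").csv("/data/landing/accounts_csv")
        println(s"ORC rows: ${orcDf.count()}, CSV rows: ${csvDf.count()}")

        // Register one of them and summarise it with Spark SQL.
        parquetDf.createOrReplaceTempView("accounts")
        spark.sql(
          """SELECT country, COUNT(*) AS n
            |FROM accounts
            |GROUP BY country
            |ORDER BY n DESC""".stripMargin).show()
      }
    }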


Skills: Big Data, Relational Database Management System (RDBMS), Apache Avro, Comma Separated Values (CSV), XML


Near Real Time (NRT)

Role:

NRT is a project that allows the flow of data end to end in near real time. Worked on development modules such as the HBase Collector, Oracle Collector, and Kafka Collector, as well as the implementation part. Worked for the PayPal Payment Data Engineering team. The purpose of this project is to capture all data streams from different sources and store them in our secure cloud stack based on technologies including Hadoop, Spark, and Kafka. We also built new processing pipelines over transaction records, user profiles, files, and communication data ranging from emails to instant messages. Spark is used to enrich and transform data into internal data models powering search, data visualization, and analytics.

Responsibilities:
  • Designed and implemented scalable infrastructure and a platform for large-scale data ingestion, aggregation, integration, and analytics in Hadoop, including MapReduce, Spark, Spark Streaming, Kafka, HDFS, and Hive.
  • Wrote Sqoop scripts to import, export, and update data between HDFS/Hive and relational databases.
  • Developed utilities for importing data from sources such as HDFS/HBase into Spark RDDs.
  • Implemented the BA's requirements end to end using Spark DataFrame functions.
  • Designed and created data models for customer data using HBase query APIs.
  • Created Hive tables, then loaded and analyzed data using Hive queries.
  • Utilized Kafka to capture and process real-time and near-real-time streaming data.
  • Used Spark SQL and Spark Streaming for streaming data analysis.
  • Developed Spark code in Java and Scala to perform data transformations, create DataFrames, and run Spark SQL and Spark Streaming applications in Scala.
  • Developed a custom partitioner in Kafka (see the sketch after this list).
  • Added a salting mechanism to HBase row keys in Spark programs to avoid region hotspotting.
  • Implemented Kerberos for authentication.
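Below is a minimal sketch of what a custom Kafka partitioner can look like in Scala; the key handling and partition assignment are hypothetical, not the partitioning logic used on the project.

    import java.util
    import org.apache.kafka.clients.producer.Partitioner
    import org.apache.kafka.common.Cluster
    import org.apache.kafka.common.utils.Utils

    // Hypothetical scheme: reserve the last partition for one high-volume key and
    // hash all other keys across the remaining partitions.
    class AccountPartitioner extends Partitioner {
      override def partition(topic: String, key: Any, keyBytes: Array[Byte],
                             value: Any, valueBytes: Array[Byte], cluster: Cluster): Int = {
        val numPartitions = cluster.partitionsForTopic(topic).size()
        if (numPartitions < 2 || keyBytes == null) 0
        else if (key == "HOT_ACCOUNT") numPartitions - 1
        else Utils.toPositive(Utils.murmur2(keyBytes)) % (numPartitions - 1)
      }

      override def close(): Unit = ()
      override def configure(configs: util.Map[String, _]): Unit = ()
    }

A partitioner like this is enabled on the producer through the partitioner.class configuration property.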

(Jun 2019 - Feb 2020)


Skills: Oracle, Hadoop, Hadoop Distributed File System (HDFS), Apache Sqoop, Kerberos


Amdocs Data Hub on Amdocs Insight Platform

Role:

Amdocs Data Hub is an end-to-end platform enabling communication service providers to develop big data solutions, including data integration, data storage, and reporting. It processes and stores data in a unified data store based on the Amdocs Logical Data Model, and it can consolidate and compact that data and then analyze and report business insights based on it. Worked on development modules such as the GoldenGate Collector and Kafka Collector, and on the implementation of entities.

Designation: Sr. Software Engineer.

Responsibilities:

Product Experience:
  • Development of different features for the ADH product in:
      Languages: Java, Scala.
      Big Data components: in-depth knowledge of the Big Data ecosystem - Spark, Spark Streaming, Kafka, Kafka Streams, HBase, Hive, Zookeeper, YARN, MapReduce, Hue.
      Hadoop distributions: Cloudera, Hortonworks.
      Cloud experience: Amazon Web Services (AWS) - EC2, Kinesis, EMR, Amazon Redshift, S3.
  • Defect fixing with strong debugging skills.

On-Site Delivery Experience (Interaction with Customers):
  • Assess and understand the customer requirement, then provide the required estimates of effort and resources.
  • Contribute to the architecture, detailed design, and development of varied Big Data solutions.
  • Incorporate continuous integration into the delivery line.
  • Responsible for designing, coding, and testing solutions deliverable to clients.
  • Conduct unit testing and troubleshooting.
  • Apply appropriate development tools.
  • Set priorities for projects, including equipment and resources, to ensure timely delivery of agreed projects.
  • Assess and communicate risk in relation to solution delivery.
  • Monitor and challenge KPIs for vendor performance and identify gaps and areas of service improvement.
  • Ensure simplification and repeatability of dev code.
  • Foster an innovative culture and approach across the ETL dev team.
  • Apply the relevant security and risk management protocols as required.
  • Maintain solution documentation as appropriate.
  • Collaborate with teams to integrate systems.
  • Provide third-level support in post-production as required.


Skills: Java (All Versions), Apache Scala, Apache Spark, Spark Streaming, Apache Kafka, Apache HBase, Apache Hive, Zookeeper, YARN, MapReduce, Hue, Cloudera, Hortonworks, AWS, AWS EC2, Kinesis, AWS S3


Employment

Sr. Software Developer.

2020/06 -

Skills: Big Data, UNIX Shell Scripting, Shell Scripting, Apache Scala, Gemfire Distributed Caching Mechanism, TeamCity, Urban Code

Your Role and Responsibilities:

Responsibilities:

  • Involved in requirement analysis, design, coding, and implementation.
  • Processed data into HDFS by developing solutions; analyzed the data using Spark and Spark Streaming and produced summary results from the Hadoop ecosystem.
  • Used Sqoop to import data from RDBMS into the Hadoop ecosystem.
  • Involved in loading and transforming sets of structured, semi-structured, and unstructured data, analyzing them by running Hive queries and Spark SQL.
  • Worked on various file formats - Avro, ORC, Parquet, sequence files, text files, CSV, XML, etc.

Implemented secured jobs in Spark and Kafka, secured with SSL and SASL authentication and authorized with Kerberos.
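A minimal sketch, in Scala, of the kind of client configuration such a secured Kafka job uses; the broker address, topic name, group id, and truststore details are placeholders, and the Kerberos principal and keytab would be supplied through the JAAS configuration.

    import java.time.Duration
    import java.util.{Collections, Properties}
    import org.apache.kafka.clients.consumer.KafkaConsumer
    import org.apache.kafka.common.serialization.StringDeserializer
    import scala.jdk.CollectionConverters._

    object SecureKafkaConsumer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "broker1:9093") // TLS listener (placeholder)
        props.put("group.id", "nrt-consumer")
        props.put("key.deserializer", classOf[StringDeserializer].getName)
        props.put("value.deserializer", classOf[StringDeserializer].getName)

        // SASL over SSL with Kerberos (GSSAPI).
        props.put("security.protocol", "SASL_SSL")
        props.put("sasl.mechanism", "GSSAPI")
        props.put("sasl.kerberos.service.name", "kafka")
        props.put("ssl.truststore.location", "/etc/security/kafka.client.truststore.jks")
        props.put("ssl.truststore.password", "changeit")

        val consumer = new KafkaConsumer[String, String](props)
        consumer.subscribe(Collections.singletonList("transactions"))
        while (true) {
          val records = consumer.poll(Duration.ofMillis(500))
          records.asScala.foreach(r => println(s"${r.key} -> ${r.value}"))
        }
      }
    }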


Sr. Software Engineer.

2020/02 - 2020/06

Skills: AWS, Hadoop, YARN, Cluster

Your Role and Responsibilities:

Project:

  • Implemented solutions including advanced AWS components (EMR, EC2, etc.) integrated with Big Data/Hadoop distribution frameworks such as Zookeeper, YARN, Spark, Scala, and NiFi.
  • Designed and implemented Spark jobs to be deployed and run on existing active clusters.
  • Worked on NiFi data pipelines to process large data sets and configured lookups for data validation and integrity.
  • Worked in Spark/Scala, improving the performance and optimization of existing applications running on the EMR cluster (see the sketch after this list).
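As a hedged illustration of the kind of tuning applied to existing EMR applications, the Scala sketch below shows two common changes: broadcasting a small dimension table to avoid a shuffle, and caching a reused intermediate result. Dataset paths and column names are invented for the example.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.broadcast

    object EmrJobTuning {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("emr-job-tuning")
          .getOrCreate()

        // Paths and columns are placeholders.
        val events = spark.read.parquet("s3://bucket/events")
        val lookup = spark.read.parquet("s3://bucket/country_codes")

        // Broadcast the small lookup table instead of shuffling it, and cache the
        // enriched result because it feeds two separate aggregations below.
        val enriched = events.join(broadcast(lookup), Seq("country_code")).cache()

        enriched.groupBy("country").count().show()
        enriched.groupBy("event_type").count().show()
      }
    }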


Sr. Software Engineer.

2019/06 - 2020/02

Skills: Java (All Versions), Oracle

Your Role and Responsibilities:

Project: Near Real Time (NRT) - (Jun 2019 - Feb 2020)

NRT is a project that allows the flow of data end to end in near real time. Worked on development modules such as the HBase Collector, Oracle Collector, and Kafka Collector, as well as the implementation part.

Worked for the PayPal Payment Data Engineering team. The purpose of this project is to capture all data streams from different sources and store them in our secure cloud stack based on technologies including Hadoop, Spark, and Kafka. We also built new processing pipelines over transaction records, user profiles, files, and communication data ranging from emails to instant messages. Spark is used to enrich and transform data into internal data models powering search, data visualization, and analytics.

 

Responsibilities:

  • Designed and implemented scalable infrastructure and a platform for large-scale data ingestion, aggregation, integration, and analytics in Hadoop, including MapReduce, Spark, Spark Streaming, Kafka, HDFS, and Hive.
  • Wrote Sqoop scripts to import, export, and update data between HDFS/Hive and relational databases.
  • Developed utilities for importing data from sources such as HDFS/HBase into Spark RDDs.
  • Implemented the BA's requirements end to end using Spark DataFrame functions.
  • Designed and created data models for customer data using HBase query APIs.
  • Created Hive tables, then loaded and analyzed data using Hive queries.
  • Utilized Kafka to capture and process real-time and near-real-time streaming data.
  • Used Spark SQL and Spark Streaming for streaming data analysis.
  • Developed Spark code in Java and Scala to perform data transformations, create DataFrames, and run Spark SQL and Spark Streaming applications in Scala.
  • Developed a custom partitioner in Kafka.
  • Added a salting mechanism to HBase row keys in Spark programs to avoid region hotspotting (see the sketch after this list).
  • Implemented Kerberos for authentication.
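The salting item above refers to a standard technique for spreading monotonically increasing keys across HBase regions. A minimal, self-contained sketch of the idea in Scala follows; the bucket count and key format are assumptions, not the project's actual scheme.

    object RowKeySalting {
      // Assumed bucket count; in practice it is chosen to match the number of regions.
      val SaltBuckets = 16

      // Prefix the natural key with a bucket id derived from its hash so that
      // sequential keys land in different regions.
      def saltedKey(naturalKey: String): String = {
        val bucket = (naturalKey.hashCode & Integer.MAX_VALUE) % SaltBuckets
        f"$bucket%02d|$naturalKey"
      }

      def main(args: Array[String]): Unit = {
        // Sequential transaction ids end up in different buckets.
        Seq("txn-000001", "txn-000002", "txn-000003").foreach { k =>
          println(s"$k -> ${saltedKey(k)}")
        }
      }
    }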

Software Engineer

2016/11 - 2019/06

Skills: Apache Spark, Apache Kafka, Apache Hive, Apache HBase, Zookeeper, Kerberos, Unix, Putty, AWS EC2, Spark Streaming

Your Role and Responsibilities:

Responsibilities:

 

Product Experience:

  • Development of different features for the ADH product in:
      Languages: Java (7 & 8), Scala.
      Big Data components: in-depth knowledge of the Big Data ecosystem - Spark, Spark Streaming, Kafka, Kafka Streams, HBase, Hive, Zookeeper, YARN, MapReduce, Hue.
      Hadoop distributions: Cloudera, Hortonworks.
      Cloud experience: Amazon Web Services (AWS) - EC2, Kinesis, EMR, Amazon Redshift, S3.
  • Defect fixing with strong debugging skills.

 

 

 

On-Site Delivery Experience (Interaction with Customers):

  • Assess and understand the customer requirement, then provide the required estimates of effort and resources.
  • Contribute to the architecture, detailed design, and development of varied Big Data solutions.
  • Incorporate continuous integration into the delivery line.
  • Responsible for designing, coding, and testing solutions deliverable to clients.
  • Conduct unit testing and troubleshooting.
  • Apply appropriate development tools.
  • Set priorities for projects, including equipment and resources, to ensure timely delivery of agreed projects.
  • Assess and communicate risk in relation to solution delivery.
  • Monitor and challenge KPIs for vendor performance and identify gaps and areas of service improvement.
  • Ensure simplification and repeatability of dev code.
  • Foster an innovative culture and approach across the ETL dev team.
  • Apply the relevant security and risk management protocols as required.
  • Maintain solution documentation as appropriate.
  • Collaborate with teams to integrate systems.
  • Provide third-level support in post-production as required.

Education

2012 - 2016


2010 - 2011


2009 - 2010


Skills

Algorithm Development, Amazon Relational Database Service, Apache Ant, Apache Spark, API Development

Achievements

Received several organizational awards, including "Employee of the Quarter".

Certification

I bring seven years of extensive expertise in Big Data technologies, specializing in Java and Scala programming languages, coupled with a comprehensive proficiency in AWS Cloud services. My professional background encompasses a proven track record of successfully navigating and implementing complex data solutions within the dynamic landscape of Big Data analytics and cloud computing.

Preferred Languages

English - Fluent