About Me
Over 12 years of IT industry experience designing and developing Analytics and Big Data solutions using Java/J2EE and the Hadoop ecosystem with SQL and NoSQL databases. Oracle Certified Java Programmer (SE 6), Cloudera Certified Hado...
Skills
Positions
Portfolio Projects
Description
The platform was built with a vision of becoming the golden source for Settlement Instructions across the organization. Since SI is a key component of all trade completion and validation, it is important for the system to be highly resilient and always available while catering to high throughput. Not only must the incoming data be processed within milliseconds, but the outgoing data (served via APIs) must also withstand heavy concurrent loads as well as spikes during market opening and closing times. The system was designed to serve the reference data with additional enrichment using business rules, Gruve validations, third-party data enrichment, OFAC screenings and many other mandatory data normalizations. This helps other departments and platforms in the organization get cleansed data with low latency and high availability.
Description
The platform is used to generate analytics on the usage, applicability and effectiveness of JIVE as the communication platform across the organization. It provides dashboards for different scenarios and usage patterns across a varied selection of time frames and semantics, and also allows customers to create their own custom selections and charts for generating usage statistics.
Contribute
- Architectural Design with Client Side architect
- Datamart Design and Implementation
- Designing DataWarehouse and ETL process for Data loading and management
- Designing Data processing engine
Description
- Understand Client's Business needs and translate them into Functional and Non-Functional Requirements
- Work in parallel with the Client-side Architects to design the Architecture of the System
- Design and Develop Data Processing Pipeline in Spring Integration / Spring Boot (see the sketch after this list)
- Design the Datamart for the Data Processing Engine
- Schema Design for SQL Server Datamart and SSIS Data Load Design
- Work hand in hand with the team on daily developments and integrations
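Purely illustrative sketch of what such a Spring Integration pipeline can look like inside a Spring Boot application (not the project code): the rawRecords channel name, the trim/uppercase transform standing in for real enrichment, and the console handler are hypothetical placeholders.

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.integration.dsl.IntegrationFlow;
import org.springframework.integration.dsl.IntegrationFlows;

// Assumes spring-boot-starter-integration is on the classpath.
@SpringBootApplication
public class PipelineApplication {

    public static void main(String[] args) {
        SpringApplication.run(PipelineApplication.class, args);
    }

    // Hypothetical flow: take raw records from an input channel, enrich them,
    // and hand them to a downstream handler (all names are placeholders).
    @Bean
    public IntegrationFlow processingFlow() {
        return IntegrationFlows.from("rawRecords")
                .transform(String.class, payload -> payload.trim().toUpperCase()) // stand-in for real enrichment
                .handle(message -> System.out.println("Persisting: " + message.getPayload()))
                .get();
    }
}
```

Messages sent to the rawRecords channel flow through the transformer and end up in the handler; the real pipeline would plug the actual enrichment and persistence steps into those two points.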
Contribute
- Design Architecture of System
- Business Requirement Understanding and Designing Application
- Database Schema Design and Modelling
- Design and develop Reports and Dashboards
Description
- Design the Architecture of the System
- Understand Client’s Business needs and translate them into Functional and Non-Functional Requirements
- Design and Develop complete microservice-driven architecture in Spring Boot with embedded Tomcat as the web deployment container (see the sketch after this list)
- Schema Design for the whole use case (storage and reporting requirements)
- Develop Reports and Dashboards in AngularJS
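A minimal sketch, assuming a plain spring-boot-starter-web setup (which runs on embedded Tomcat by default); the class name, endpoint path and JSON payload are hypothetical examples of the kind of REST service an AngularJS dashboard would call, not the actual project API.

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// Minimal Spring Boot microservice; spring-boot-starter-web brings in embedded Tomcat.
@SpringBootApplication
@RestController
public class ReportServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(ReportServiceApplication.class, args);
    }

    // Hypothetical endpoint an AngularJS dashboard could call for report data.
    @GetMapping("/reports/{id}")
    public String report(@PathVariable String id) {
        return "{\"reportId\":\"" + id + "\",\"status\":\"ready\"}";
    }
}
```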
Contribute
- Design the Architecture of the System
- Business Requirement translation into Technical design
- Hive and Spark Design and implementation
- Report Generation and Management in Tableau
Description
- Design the Architecture of the System
- Understand Client’s Business needs and translate them into Functional and Non-Functional Requirements
- Design and Develop Data Processing Pipeline in Spark (Java Driver); see the sketch after this list
- Schema Design for Hive Data Storage System
- Design KPIs as per the Business needs.
- Develop Reports and Dashboards in Tableau
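An illustrative sketch of a Spark job with a Java driver reading from and writing back to Hive (not the project code); the analytics.events table, the region/amount columns, and the KPI itself are hypothetical placeholders.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class KpiJob {
    public static void main(String[] args) {
        // Hive-enabled session; assumes the cluster exposes a Hive metastore.
        SparkSession spark = SparkSession.builder()
                .appName("kpi-job")
                .enableHiveSupport()
                .getOrCreate();

        // Hypothetical table and columns, placeholders for the real schema.
        Dataset<Row> events = spark.sql("SELECT region, amount FROM analytics.events");

        // Simple KPI: total amount per region, written back to a Hive table.
        events.groupBy("region")
              .sum("amount")
              .write()
              .mode("overwrite")
              .saveAsTable("analytics.kpi_amount_by_region");

        spark.stop();
    }
}
```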
Contribute
- Develop ETL Data Processing Engine
- Develop Text Search Module
- Performance Enhancements
- Design and Develop MongoDB schema
- Develop Microservices and Orchestration Engine
Description
- ETL-based Data Processing Engine using Alteryx, custom Java code, and server-side MongoDB JavaScript
- Full-text Advanced Search implementation in Elasticsearch (see the sketch after this list)
- Assisted with Schema Design for MongoDB and Elasticsearch
- Basic understanding of implementation of BI reports
- Performance enhancements at various points of the data processing and search mechanisms
- Development of a one-click orchestration layer that collects data, cleans and processes it, applies rule sets and business logic, validates and filters it, performs natural language processing and associative graph modeling, and presents it to the user on the web application
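Hedged sketch of the full-text search piece using the Elasticsearch high-level REST client (not the project implementation); the documents index, the content field, the query text, and the localhost endpoint are assumptions for illustration.

```java
import java.io.IOException;

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class FullTextSearch {
    public static void main(String[] args) throws IOException {
        // Hypothetical local cluster and index name.
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {

            // Full-text match query against a "content" field (placeholder field name).
            SearchSourceBuilder source = new SearchSourceBuilder()
                    .query(QueryBuilders.matchQuery("content", "natural language processing"))
                    .size(10);

            SearchRequest request = new SearchRequest("documents").source(source);
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);

            for (SearchHit hit : response.getHits().getHits()) {
                System.out.println(hit.getId() + " -> " + hit.getSourceAsString());
            }
        }
    }
}
```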
Contribute
- Business Requirement gathering and Solution Designing
- MongoDB schema Design and Data Model
- Web Services Development
- Performance Management and Optimization
- Orchestration Engine
Description
The product is responsible for the collection, processing, analysis, and reporting of data collected from different social networks such as Facebook, Twitter, Google+, YouTube, and Instagram. Data collection is based on pre-specified keywords, user-selected keywords, or any user/person on social media. Registered users can see sentiment analysis and trends for different keywords, populated from the data collected across these social networks.
Contribute
- Requirement gathering and SOW submission
- Development of Workflows in Oozie
- Development of MapReduce Pipeline
- Web Services development
- Administration of Hadoop Cluster
Description
- Requirement gathering and SOW submission
- Understanding of requirements for business layer
- Creating an architecture for the solution
- Understanding and integrating all the systems by means of a workflow engine, Oozie
- Development of MapReduce code
- Development of Sqoop jobs
- Administration of the Big Data cluster
- Database design for Persistence layer in InfiniDB
Contribute
- Domain Understanding and Analysis
- Testing Pipeline Automation
- Development of MapReduce Components
- Development of Webservices
- Performance Optimization
Description
The project includes creating a software product for the secondary and tertiary analysis of data collected from next-generation sequencing of DNA samples using SOLiD technology. The analysis comprises mapping and alignment to the reference human genome, SNP detection, gene counting, small RNA counts and coverage detection, CNV detection, etc.
Description
- Setting up and configuring the Hadoop cluster and other involved frameworks like Flume, Sqoop, etc.
- Configuring Flume nodes for data collection
- Creating a custom Flume decorator for the selective upload of data to HDFS
- Writing the Mapper and Reducer for analysis of this data and collection of the required information (see the sketch after this list)
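A minimal MapReduce sketch in the same spirit (not the sequencing analysis itself): a generic mapper/reducer pair that counts records per key. The tab-delimited input format and the choice of the first field as the key are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RecordCountJob {

    // Emits (key, 1) for each input line; the first tab-delimited field stands in
    // for whatever attribute the real analysis aggregates on.
    public static class RecordMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private final Text outKey = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split("\t");
            if (fields.length > 0) {
                outKey.set(fields[0]);
                context.write(outKey, ONE);
            }
        }
    }

    // Sums the counts per key.
    public static class RecordReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> counts, Context context)
                throws IOException, InterruptedException {
            long total = 0;
            for (LongWritable c : counts) {
                total += c.get();
            }
            context.write(key, new LongWritable(total));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "record-count");
        job.setJarByClass(RecordCountJob.class);
        job.setMapperClass(RecordMapper.class);
        job.setCombinerClass(RecordReducer.class);
        job.setReducerClass(RecordReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```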
Description
The project involves building a consolidated data warehouse that stores data captured from different sources. An ETL tool needs to be implemented to extract data from those sources, transform it, and load it into the data warehouse. From this warehouse, reports are generated for stakeholders and other concerned officials to show the performance of the portal and other activity- or revenue-related information.
Description
This project involves the development of a data archival solution where the data is stored in MongoDB (metadata and small files in the database, large files in GridFS) and the corresponding metadata information is stored in Solr for indexing purposes (via Netty for performance enhancement).
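A minimal sketch of the MongoDB/GridFS storage path, assuming the MongoDB Java sync driver; the archive database, documents collection, and file names are hypothetical, and the Solr indexing side is omitted.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.bson.Document;
import org.bson.types.ObjectId;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.gridfs.GridFSBucket;
import com.mongodb.client.gridfs.GridFSBuckets;

public class ArchiveStore {
    public static void main(String[] args) {
        // Hypothetical connection string and database name.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase db = client.getDatabase("archive");

            // Small files / metadata go into a regular collection ...
            db.getCollection("documents")
              .insertOne(new Document("name", "invoice-2020-001").append("sizeBytes", 42));

            // ... while large files are streamed into GridFS.
            GridFSBucket bucket = GridFSBuckets.create(db);
            byte[] payload = "large file contents (placeholder)".getBytes(StandardCharsets.UTF_8);
            try (InputStream in = new ByteArrayInputStream(payload)) {
                ObjectId fileId = bucket.uploadFromStream("invoice-2020-001.pdf", in);
                System.out.println("Stored GridFS file with id " + fileId);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```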
Description
This project involves setting up MongoDB as the core repository for the storage of video and playback-related data (0.5 million records per minute) and the retrieval and update of data per heartbeat (every 60 seconds). This information is used to let users register a limited number of devices with the application and maintain a consistent view of the audios/videos being watched across those devices.
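A hedged sketch of the per-heartbeat write using the MongoDB Java driver: a single upsert keyed by user and device. The media database, playback_state collection, and field names are assumptions, not the project's schema.

```java
import org.bson.Document;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.UpdateOptions;
import com.mongodb.client.model.Updates;

public class PlaybackHeartbeat {
    public static void main(String[] args) {
        // Hypothetical connection string, collection, and field names.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> playback =
                    client.getDatabase("media").getCollection("playback_state");

            // One upsert per 60-second heartbeat: keyed by user + device,
            // recording the current video and position.
            playback.updateOne(
                    Filters.and(Filters.eq("userId", "u123"), Filters.eq("deviceId", "d456")),
                    Updates.combine(
                            Updates.set("videoId", "v789"),
                            Updates.set("positionSeconds", 1325),
                            Updates.currentDate("lastHeartbeat")),
                    new UpdateOptions().upsert(true));
        }
    }
}
```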
Description
The POC's main purpose is to simulate the analysis of large amounts of data and store the results linked to the source data, organized in contexts, so that knowledge can be kept over time and links between contexts can be discovered. The POC contained four basic phases:
1. Data storage from the source repository to the repository.
2. The entire data set was indexed in Solr.
3. Patent data was fetched from Solr and inserted into Neo4j for graph mapping and patent family calculation.
4. Updated patent family information was written back to Solr.
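An illustrative sketch of phase 3, assuming SolrJ and the Neo4j Java driver; the patents collection, the id/family_id fields, the Cypher model (Patent and Family nodes with a BELONGS_TO relationship), and both endpoints are hypothetical.

```java
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;
import org.neo4j.driver.Session;

import static org.neo4j.driver.Values.parameters;

public class PatentGraphLoader {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoints, collection name, credentials, and field names.
        try (HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/patents").build();
             Driver neo4j = GraphDatabase.driver("bolt://localhost:7687", AuthTokens.basic("neo4j", "password"));
             Session session = neo4j.session()) {

            // Pull patent documents from Solr ...
            QueryResponse response = solr.query(new SolrQuery("*:*").setRows(100));
            for (SolrDocument doc : response.getResults()) {
                String patentId = String.valueOf(doc.getFieldValue("id"));
                String familyId = String.valueOf(doc.getFieldValue("family_id"));

                // ... and build the graph: one node per patent, linked to its family node.
                session.run("MERGE (p:Patent {id: $id}) "
                          + "MERGE (f:Family {id: $family}) "
                          + "MERGE (p)-[:BELONGS_TO]->(f)",
                        parameters("id", patentId, "family", familyId));
            }
        }
    }
}
```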
Description
The main purpose of this project was to simulate a use case for Big Data technologies where IT logs were collected from the organization's common server and analyzed in real time. The mail logs were collected continuously from the IT server using Flume and dumped to HDFS. Hadoop MapReduce was then used to analyze these logs, aggregate the relevant information, and create statistics. This output was pushed to MySQL for storage using Sqoop, and Pentaho was used on top for BI analytics on this data.
Description
This project deals with the performance testing of the next-generation sequencing algorithm (KB Base Caller algorithm) for both diagnostic and research purposes. Molokai is data-collection software for the collection of data corresponding to sequencing, fragment analysis, and HID using the specified base caller algorithm.