Vamsi ts
- Big Data Hadoop expert with over 11 years of experience in Application Development, Production Support & Customizations in variou...
- Trained in implementing Big Data technologies (Hadoop, Hive, Pig, Sqoop, HBase, Flume, Oozie, Spark, Scala); knowledge of Hadoop and Spark architecture and execution model, Scala, and the major components of the Hadoop ecosystem (HDFS, Hive, Pig, Oozie, Sqoop, MapReduce and YARN)
- Experienced in Pig, Hive, MapReduce and the Hadoop Distributed File System (HDFS)
- Knowledge of importing data from databases such as Oracle and MySQL into HDFS and Hive, and exporting it back, using Sqoop
- Trained in collecting and storing streaming data such as log data in HDFS using Flume, and in creating tables, partitioning and bucketing tables, and creating UDFs in Hive
Data & Analytics
The purpose of the project is to develop an end-to-end, OLAP cube-based solution covering everything from ingestion to analytics. The solution is based on the open source Big Data software Hadoop, together with Angular and REST.
- Leading the offshore team using Agile methodology.
- Practicing backlog grooming, release and sprint planning, daily standups and impediment removal.
- Collaborating closely with the software development, product and business teams on developing an OLAP-based Hadoop product.
- Managing the OLAP-based Hadoop product from concept to delivery.
- Creating and managing the estimates, project plan, project schedule and resource allocation to ensure that targets are reached.
- Involved in all aspects of OLAP platform development, including collecting requirements, writing high-quality documents, doing sprint planning, and coordinating all efforts to scope, schedule and deploy new features.
- Working on front-end development using Angular.
Environment: AngularJS, REST, Hadoop, HDFS, Hive, MapReduce, Spark, Kafka
The purpose of the project is to store terabytes of log information generated by the Telecom website and extract meaningful information from it. The solution is based on the open source Big Data software Hadoop. The data is stored in the Hadoop file system and processed using Spark. Processing involves getting the raw data from the servers, processing it to obtain product and pricing information, extracting various reports from that information, and exporting the results for further processing.
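The extract, report and export steps described above can be illustrated with a minimal sketch. It uses plain Python rather than Spark so that it stays self-contained, and the log line format, the field names (`product`, `price`) and the function names are all hypothetical; the project's real log layout is not given here.

```python
import csv
import io
import re
from collections import defaultdict

# Hypothetical log format (an assumption, not the project's real layout):
#   <timestamp> product=<id> price=<decimal>
LINE_RE = re.compile(r"product=(?P<product>\S+)\s+price=(?P<price>\d+(?:\.\d+)?)")


def extract(lines):
    """Yield (product, price) pairs from raw log lines, skipping non-matching lines."""
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            yield match.group("product"), float(match.group("price"))


def price_report(pairs):
    """Aggregate per product: number of hits, minimum and maximum price seen."""
    stats = defaultdict(lambda: {"hits": 0, "min": float("inf"), "max": float("-inf")})
    for product, price in pairs:
        entry = stats[product]
        entry["hits"] += 1
        entry["min"] = min(entry["min"], price)
        entry["max"] = max(entry["max"], price)
    return dict(stats)


def export_csv(report):
    """Serialize the report as CSV text for further processing downstream."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["product", "hits", "min_price", "max_price"])
    for product in sorted(report):
        entry = report[product]
        writer.writerow([product, entry["hits"], entry["min"], entry["max"]])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        "2016-01-01T00:00:00 product=A price=10.5",
        "malformed line without fields",
        "2016-01-01T00:00:01 product=A price=9",
        "2016-01-01T00:00:02 product=B price=3.25",
    ]
    print(export_csv(price_report(extract(sample))))
```

In a Spark job the same three stages would typically map onto a parse step, a grouped aggregation and a write to HDFS; the sketch only shows the shape of the data flow.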
- Delivered project needs on time and within the agreed acceptance criteria in a hybrid-methodology environment during an attempted transition to Agile.
- Developed, managed and tracked the project plan to implement requested features.
- Facilitated grooming and planning sessions with the team.
- Tracked and reported on project progress.
Environment: Hadoop, HDFS, Hive, UNIX, Spark, Scala, Flume, Oozie, HBase
4medica is the nation's leading provider of cloud-based clinical data exchange, which provides clinicians with a unified, real-time view of patient information across disparate care locations. The company's flagship clinical integration platform, Integrated Electronic Health Record (iEHR), builds upon organizations' existing technologies to supply the exact level of health connectivity needed to address meaningful use requirements, from basic health information exchange to integration with existing electronic health records (EHRs), practice management systems and other healthcare applications.
- Understood process requirements and provided use cases for business, functional & technical requirements
- Managed programming code independently for intermediate to complex modules following development standards; planned and conducted code reviews for changes and enhancements that ensured standards compliance and systems interoperability
- Interacted with users for requirement gathering; prepared functional specifications and low-level design documents
- Provided overall leadership to the entire project team including managing deliverables of other functional team leaders
- Communicated with internal/external clients to determine specific requirements and expectations; managed client expectations as an indicator of quality
- Created and managed the estimates, project plan, project schedule, resource allocation and expenses to ensure that targets were reached
- Worked with relevant Resource Managers for project staffing and resource releases
Environment: HDFS, Apache Pig, Hive, Sqoop, Java, UNIX, SQL
The Genome Analysis project analyzes human genome data using Hadoop. A single human genome contains about 3 billion base pairs. This is less than 1 gigabyte of data, but the intermediate data produced by a DNA sequencer, which is required to produce a sequenced human genome, is many hundreds of times larger. Beyond the huge storage requirement, deep genomic analysis across large populations of humans requires enormous computational capacity as well. Efforts exist for adapting existing genomics data structures to Hadoop, but these don’t support the full range of analytic requirements. Our approach is to implement an end-to-end analysis pipeline based on GATK and running on Hadoop.
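The "less than 1 gigabyte" figure can be checked with a short calculation. This is a sketch under one stated assumption that the text does not spell out: each base is one of four symbols (A, C, G, T) and is therefore packed into 2 bits. The function name is illustrative only.

```python
BASE_PAIRS = 3_000_000_000  # approximate figure quoted in the project description
BITS_PER_BASE = 2           # assumption: 4 possible bases (A, C, G, T) fit in 2 bits


def packed_size_bytes(base_pairs, bits_per_base=BITS_PER_BASE):
    """Bytes needed to store a sequence when bases are bit-packed (rounded up)."""
    return (base_pairs * bits_per_base + 7) // 8


if __name__ == "__main__":
    size = packed_size_bytes(BASE_PAIRS)
    print(f"{size / 1e9:.2f} GB")  # prints "0.75 GB"
```

At one byte per base the same sequence would take about 3 GB, so the sub-gigabyte figure only holds for a packed encoding like the one assumed here.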
- Writing Pig scripts to process the genome data.
- Writing script files to perform Hadoop operations.
- Processing data and loading it into HDFS.
- Handled importing of data from various data sources, performed transformations using Hive and loaded data into HDFS.
- Ingested data from logs and relational databases using Flume and Sqoop.
- Importing and exporting data into HDFS, Pig, Hive and HBase using Sqoop.
Environment: HDFS, Apache Pig, Hive, HBase, Sqoop, Flume