
Data Engineering

Platform for Self-Service Data Access with Automated Privacy Control

The Opportunity:

This is a unique opportunity to get in on the ground floor of a world-class team, helping to fundamentally re-architect how companies handle personal data.



You'll be responsible for building out our core tech infrastructure and for driving the design and build of our platform and company. The role is hands-on: you will build and direct data projects across multiple codebases, with a direct impact on the entire product.



You will:

Because of the data-centric nature of the product, you will work with a variety of databases, storage technologies and ETL pipelines. You will get hands-on experience with data anonymisation techniques. You will collaborate closely with the founders to define the technical roadmap of the platform. Finally, we embrace DevOps methodologies for product development, so you will be involved in everything from our build systems to our test framework to containerisation.



Build Data Pipelines

- Define and build data pipelines using Apache Spark and Kafka.

- Work with SQL, NoSQL and API-based data sources.
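A real pipeline here would run on Spark and Kafka, which need live infrastructure; purely as an illustration of the extract-transform-load shape such a pipeline takes (all names below are invented for the sketch, not any product API):

```python
from typing import Callable, Iterable

Record = dict

def run_pipeline(source: Iterable[Record],
                 transforms: list[Callable[[Record], Record]],
                 sink: list) -> int:
    """Pull records from a source, apply each transform in order, load into sink.

    In production the source might be a Kafka topic or a SQL/NoSQL/API
    extractor and the sink a warehouse table; here both are in-memory lists.
    """
    count = 0
    for record in source:
        for transform in transforms:
            record = transform(record)
        sink.append(record)
        count += 1
    return count

# Example: clean whitespace, then cast a string field to an integer.
raw = [{"name": " Alice ", "age": "34"}, {"name": "Bob", "age": "29"}]
out = []
n = run_pipeline(raw,
                 [lambda r: {**r, "name": r["name"].strip()},
                  lambda r: {**r, "age": int(r["age"])}],
                 out)
```

The same chain-of-transforms structure maps directly onto Spark DataFrame operations once the in-memory source and sink are swapped for real connectors.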



Execute data transformations

- Build and write code that performs scalable data transformations on large datasets.

- Implement frameworks such as WhiteNoise, a differential privacy toolkit for analytics and machine learning.
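To make the differential privacy idea concrete: the core mechanism behind toolkits like WhiteNoise is adding calibrated noise to query results. A minimal Laplace-mechanism sketch in plain Python (this illustrates the concept only, not the toolkit's actual API):

```python
import math
import random

def private_count(records, predicate, epsilon: float, rng=None) -> float:
    """Differentially private count: true count plus Laplace(1/epsilon) noise.

    A counting query changes by at most 1 when one record is added or
    removed (sensitivity 1), so the Laplace scale is 1/epsilon.
    """
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    # Inverse-CDF sampling of Laplace(0, 1/epsilon):
    u = rng.random() - 0.5  # uniform in [-0.5, 0.5)
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_count + noise

# Example: how many people are over 30, with privacy budget epsilon = 1.
rows = [{"age": a} for a in [25, 37, 41, 52]]
noisy = private_count(rows, lambda r: r["age"] > 30, epsilon=1.0)
```

Smaller epsilon means more noise and stronger privacy; production toolkits additionally track the cumulative budget spent across queries.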



Build a Unified Data Access Control Layer

- Unified policy enforcement across all data endpoints used by services, applications, and users.

- Integration of policy frameworks such as Open Policy Agent to provide a policy layer for data access.

- Building a solution that decouples data access policies from data sources.
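Open Policy Agent itself is queried over its REST API with policies written in Rego; as a language-agnostic sketch of the decoupling idea described above (the rule schema and function names here are invented for illustration), policies live as plain data apart from any data source and are consulted at every endpoint:

```python
# Policies are data, stored and versioned separately from the databases
# they protect: each rule names a role, an action, a dataset, and the
# columns that role must never see.
POLICIES = [
    {"role": "analyst", "action": "read", "dataset": "customers",
     "deny_columns": {"email", "ssn"}},
    {"role": "admin", "action": "read", "dataset": "customers",
     "deny_columns": set()},
]

def authorize(role: str, action: str, dataset: str, columns: set) -> set:
    """Return the subset of columns the role may access; deny by default."""
    for rule in POLICIES:
        if (rule["role"] == role and rule["action"] == action
                and rule["dataset"] == dataset):
            return columns - rule["deny_columns"]
    raise PermissionError(f"{role} may not {action} {dataset}")

# Every service, application or user request passes through the same check:
visible = authorize("analyst", "read", "customers", {"name", "email"})
```

Because the enforcement point only consults the policy store, changing who may see what never requires touching the underlying SQL, NoSQL or API sources.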



A selection of the technologies that you will work with:

Docker and Kubernetes

Apache Kafka

Apache Spark

Open Policy Agent

GitHub Actions

Application of anonymisation techniques such as synthetic data, differential privacy and data masking

NLP and automated data type recognition at scale
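To give a flavour of automated data type recognition: the simplest layer is rule-based classification of field values, which NLP models then extend to free text. A hedged sketch (the patterns below are illustrative only, far from production-grade):

```python
import re

# Illustrative patterns; a real detector combines rules like these with
# statistical NLP models and column-level context.
PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.]+$"),
    "date":  re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "phone": re.compile(r"^\+?[\d\s()-]{7,15}$"),
}

def detect_type(value: str) -> str:
    """Return the first matching semantic type, else 'text'."""
    for name, pattern in PATTERNS.items():
        if pattern.match(value):
            return name
    return "text"
```

Once a column is tagged as, say, `email`, the platform can automatically apply the right anonymisation technique (masking, synthesis, or suppression) without a human labelling each dataset.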



Requirements:


  • Experience working with NoSQL, SQL and API-based data sources.

  • Experience building, deploying and maintaining data pipelines.

  • Based within one hour of the Central European Time zone.



Positions

Full-Stack Developer

Data Scientist


Must have Skills

  • SQL

    Beginner

  • NoSQL

    Beginner

  • Apache Spark

    Beginner

  • ETL Pipeline

    Beginner

Job Type

Client Payroll

Up to 200 K/Year USD (Annual salary)

Longterm (Duration)

Fully Remote

Languages

English - Basic


Padraig O | United States