Data Engineer


  • full-time data engineering position, based in London, UK (not remote)
  • VC-backed company with technical founders (from Google, Oxford, Palantir)
  • multiple openings, looking for both junior and senior developers
  • focus on using Deep Learning and NLP for Software Development
  • plenty of work to be done on data pipelines, internal tools, and more
  • modern infrastructure built on AWS, Kubernetes, Terraform and Ansible
  • impact-driven, focusing on delivering the most value to your colleagues
  • opportunity to work on Machine Learning
  • generous salary and considerable equity

About the company

Software engineers are only human, and machines are very unforgiving. We leverage modern NLP and Machine Learning techniques to address the cognitive limitations that make software so difficult to understand, produce and maintain.

Our first product augments human developers with AI to simplify and automate the code review process, enabling engineers to fix problems before shipping code to production.

Data-driven development

At the heart of Prodo’s AI is a lot of data. We’re looking for people who love to work with gigabytes of both highly structured and fuzzy data, and extract as much relevant information as possible.

We get data from pretty much everywhere:

  • version control systems, such as Git
  • online repositories (think GitHub)
  • generated statistics around code
  • metrics based on developer behaviour
  • static analysis
  • runtime analysis
  • interactions with our own UI
  • and much more

All of this feeds into various machine learning models (primarily neural networks), where we have a whole new set of interesting problems you’ll get to work on:

  • creating data pipelines to ensure that our input is always getting better
  • manipulating code into terse, powerful graph structures
  • powering enrichment with ML
  • building internal tools to facilitate data collection and experimentation
  • managing infrastructure for data collection, ML modelling and publication
  • running pipelines in a repeatable and measurable fashion
  • extracting insights and turning them into actionable feedback
  • scaling analysis to hundreds of machines on AWS

Team culture

Engineers at Prodo quickly become experts in the field. The breadth of problems we’re solving requires us to learn new things every day, make quick decisions and implement scalable solutions. We take pride in our product’s value over specific features, so we move fast and in small increments to make sure we’re always delivering.

We always try to use the best technology for the job, and we change what we use as the job changes. Everything is conversation-driven, with the data team sitting squarely in the middle.

We’re big fans of highly structured graph data, containers, pair programming, type systems, falafel, squash, flexible working hours, and healthy amounts of sleep.

Current stack

Our stack will never be set in stone, and newcomers will have the opportunity (and responsibility) to question and improve any technical choices made before they joined. But just to give you a flavour of our stack today, we are currently using:

  • AWS to host GPU instances for neural network training
  • PyTorch for ML modelling
  • Kubernetes for running the product
  • Terraform for spawning VMs
  • Ansible for managing them when they’re up
  • Packer for building AMIs and Docker images
  • Python for our in-house experiment runner
  • zsh, for when you just need to get something done fast

We don’t expect you to be an expert in all of the above, but we do expect you to be willing to learn how we currently do things and to help us improve.

How to apply

Does this sound intriguing? Please email us with "Data Engineer" in the subject line, a CV, and a little about yourself to start a discussion.

Let us know if you’re already in London, or your current location if you plan on relocating.

You might also want to check out our product development or research scientist roles.
