Recraft
ML Data Engineer
Salary
Competitive salary
Work type
Onsite
Level
Mid
Category
Data Engineering
About the role
About Us
Founded in the US in 2022 and now based in London, UK, Recraft is an AI tool for professional designers, illustrators, and marketers, setting a new standard for excellence in image generation.
We designed a tool that lets creators quickly generate and iterate on original images, vector art, illustrations, icons, and 3D graphics with AI. Over 3 million users across 200 countries have produced hundreds of millions of images using Recraft, and we’re just getting started.
Join a universe of professional opportunities, develop and support large-scale projects, and shape the future of creativity. We are committed to making Recraft an essential, daily tool for every designer and setting the industry standard. Our mission is to ensure that creators can fully control their creative process with AI, providing them with innovative tools to turn ideas into reality.
If you’re passionate about pushing the boundaries of AI, we want you on board!
Job Description
At Recraft, we’re building the next generation of generative models across images and text. We’re looking for an ML Data Engineer to scale our data pipelines for unstructured data (primarily images) and keep our training flows fast, reliable, and repeatable. You’ll design and operate high-throughput ingestion and preprocessing on Kubernetes, evolve our internal data-pipeline framework, and work hand-in-hand with ML engineers to ship datasets that move model quality forward.
Key Responsibilities
Develop and maintain data-ingestion pipelines to source and prepare large-scale image (and occasional text/HTML) datasets from open, publicly accessible, and permitted sources.
Own the end-to-end flow: raw data → quality/beauty/relevance filtering → dedup/validation → ready-to-train artifacts.
Operate and improve our Kubernetes-based data-pipeline framework (distributed jobs, retries, monitoring, automation).
Work with S3-style object storage: efficient layouts, lifecycle, throughput, and cost awareness.
Add tooling around pipelines (progress/health visualization, metrics, alerts) for observability and faster iteration.
Collaborate closely with ML engineers to align datasets with training needs and accelerate experimentation.
Requirements
Must-have
Strong Python fundamentals; you write clean, maintainable, production-ready code.
Solid hands-on Kubernetes experience (containers, jobs, batch/distributed processing).
Proven track record with unstructured data, especially images (loading, filtering, transforming at scale).
Experience developing data-ingestion or parsing tools for publicly accessible sources, including handling real-world reliability and failure cases gracefully.
Comfort with S3/object storage and moving lots of data efficiently and safely.
Pragmatic, detail-oriented, ownership mindset; you enjoy making systems reliable and fast.
Nice-to-have
Familiarity with ML workflows (PyTorch) and downstream training considerations.
Experience with image quality scoring, captioning, or image-to-text pipelines.
Experience building DAG/workflow visualizations or pipeline UX tooling.
DevOps fluency: Docker, CI/CD, infra automation.
What We Offer
Competitive salary and equity.
We’re able to offer Skilled Worker visa sponsorship in the UK for qualified candidates.
Real impact on model quality: your pipelines directly power training runs and product improvements.
Ownership with support: autonomy to design and improve systems, alongside experienced ML peers.
Modern stack: Python, Kubernetes, S3, and an internal pipeline framework built for scale.
Growth: a fast-moving environment where shipping well-engineered systems is the norm.