Pangaea Data
Data Engineer (Healthcare Data)
Salary
Competitive salary
Work type
Hybrid
Level
Mid
Category
Data Engineering
About the role
As a Data Engineer, you will join Pangaea’s team to design and develop integrated applications for its PALLUX platform.
About Pangaea Data
Pangaea Data (Pangaea) is a business based in South San Francisco and London, founded by Dr Vibhor Gupta and Prof Yike Guo (Director, Data Science Institute at Imperial College London; Provost, Hong Kong University of Science and Technology). They have worked in medicine and computing for over 20 years and have raised over $300 million through their academic research, including a $110 million grant focused on development work on large language models in medicine. Pangaea’s AI platform, PALLUX, is configured on clinical guidelines to find more untreated (undiagnosed, miscoded, at-risk) and under-treated patients with hard-to-diagnose conditions for screening and treatment at the point of care. Pangaea’s advisors include industry veterans from healthcare and the life sciences, such as Lord David Prior (former chairman, NHS England) and Mr. Andy Palmer (former CIO, Novartis).
The Role
As Data Engineer (Healthcare Data), you will join Pangaea’s team to lead and support the development of reliable, scalable, and secure data solutions. The ideal candidate will be experienced with healthcare data standards (e.g., FHIR, OMOP), possess a strong understanding of data privacy regulations (e.g., HIPAA, GDPR), and have the technical expertise to design and implement data pipelines, storage systems, and integrations.
This role will continue to evolve as the business grows, but in the short term it will also involve development of the software product and collaboration with the clinical and scientific team. A strong software engineering background and knowledge of AI, especially Machine Learning and Natural Language Processing, are essential. For the right candidate, this is a senior technical position with scope to grow into a leadership role.
Key technical responsibilities will include:
- Design, implement, and maintain ETL pipelines to collect, clean, and transform healthcare data from various sources such as EHR systems, APIs, and databases
- Ensure data quality and integrity through robust testing and validation processes
- Optimize storage solutions for structured and unstructured healthcare data using databases (e.g., MongoDB) and cloud-based data platforms (e.g., Azure Cosmos DB, Microsoft Fabric)
- Collect and maintain gold standard datasets for evaluation and benchmarking, with clear instructions, version control, and API documentation
- Maintain strict compliance with data privacy regulations such as HIPAA, GDPR, and other local healthcare policies
- Work closely with the clinical team to understand data requirements and translate them into technical solutions
- Collaborate with the AI team to provide clean, well-structured datasets for research and AI/ML models
- Stay up to date with the latest data engineering technologies and best practices
Mandatory Requirements
Technical skills:
- Experience working with Electronic Health Records (EHR) systems (e.g., Epic, Cerner)
- A university qualification (Bachelor's, Master's, Doctorate) with at least two years of university study in Computer Science, Informatics, Data Science, Engineering, or a related field
- Experience in data engineering, with a focus on healthcare data preferred
- Familiarity with NoSQL databases (e.g., MongoDB) and relational databases (e.g., PostgreSQL, MySQL)
- 5+ years of experience working with Python and SQL
- Knowledge of ETL tools (e.g., Apache Airflow) and cloud platforms (e.g., AWS, Azure, GCP)
- Understanding of data modelling concepts and best practices; experience with healthcare data standards (e.g., HL7, FHIR, ICD, SNOMED, DICOM) preferred
- Excellent problem-solving and communication skills
Personal traits:
- Ability to communicate complex ideas effectively, both verbally and in writing
- Ability to engage all levels of the company and the customers’ organizations
- Ability to work collaboratively in a team environment
Nice to Have
- 3-5 years of experience managing teams
- Experience working on large-scale, commercial software development projects
- Experience with research communities and/or efforts, including having published papers (being listed as author) at AI/ML/NLP/CV conferences (e.g., Bio-IT, NeurIPS, ICML, ICLR, ACL, CVPR, and KDD) and in journals
- Experience and knowledge of deploying AI and data solutions for healthcare and pharmaceuticals at scale
Perks and Benefits
- Flexible working hours
- Salary dependent on experience
- Package of attractive benefits including private medical insurance and monthly travel card
- You will join a dedicated, highly renowned team, with the opportunity to grow and develop your professional skills and profile
- You will have the opportunity to learn about building a startup business from experienced professionals and serial entrepreneurs
Application Contact Information
Your application should include a CV and a cover letter highlighting your relevant experience and motivation. Please send these to careers@pangaeadata.ai