Senior Data Architect
Omilia · Poland · Remote-friendly
- Company: Omilia
- Location: Poland
- Employment type: Full-time
- Posted: April 15, 2026
About this job
Accountabilities
- Own the Training Environment data architecture end-to-end: dataset design and schema for all ML training pipelines, including dialog corpora for LLM training, conversational steps for NLU models, annotated evaluation sets, and whole-call recordings for speech-to-speech model development.
- Define and govern data selection and sampling strategy: establish criteria that determine which production conversations have the highest training value, including diversity-optimized sampling, confidence-based filtering, edge-case prioritization, and deduplication strategies.
- Build and maintain the data catalog and dataset discovery infrastructure: enable ML engineers across LLM, NLU, Speech, and Agentic teams to find, understand, and use training data without friction.
- Define annotation …
This is a short summary. The full description is on the employer’s page.