Senior MLOps Engineer
NN Digital Hub is a subsidiary company of NN Group located in Madrid, Spain. We deliver IT services and solutions for the different international Business Units.
As we scale our agentic AI strategy, we need to transform Databricks Mosaic AI into a self\-service platform that empowers AI engineers across business units. Your role is to build and productize AI capabilities — from RAG pipelines to model serving — and deliver them as scalable, governed, and developer\-friendly services.
The Data Intelligence Platform is the runtime for models and agentic AI at NN. It runs on Databricks MLflow and Mosaic, deploying models and agents on Databricks serverless or NN’s Kubernetes (Seldon) where needed. Mosaic Vector Search provides RAG for GenAI; the platform spans Azure and AWS to meet latency and proximity needs. As Senior MLOps Engineer, you will standardize build → deploy → serve → monitor across these topologies and make production operations audit\-ready by design.
Your Impact as Senior MLOps Engineer: What You Are Going to Do
Define the reference MLOps stack for the Agentic Platform: CI/CD patterns and environment promotion with approvals; serving topologies (Databricks serverless vs Kubernetes/Seldon) with decision records; evaluation \& drift monitoring with MLflow/Mosaic; and SLO\-based run operations. Integrate Unity Catalog lineage/policies into pipelines so teams meet governance requirements without friction and support the Run expansion.
Key responsibilities include:
- Own CI/CD standards and environment promotion for models/agents: pipelines, approvals, artifact provenance, immutable releases, and rollback/canary patterns.
- Standardize serving topologies: Databricks serverless vs Kubernetes/Seldon, with clear decision records, SLOs (latency/reliability), and cost/performance guardrails.
- Implement evaluation and monitoring: MLflow\-based offline/online evaluation, drift/quality checks, and end\-to\-end telemetry dashboards for model/agent behavior and costs, with all security \& compliance basics kept in mind.
- Integrate governance\-by\-design: enforce Unity Catalog lineage/policies and capture approvals and evidence in pipelines to support audits and marketplace readiness.
- Operate the Agentic Platform to SLOs: incident response/on\-call, capacity planning, cost optimization, post\-mortems, and continuous improvement of golden paths.
As part of our Employee Experience, we offer a range of competitive benefits to improve the physical, mental, and professional wellbeing of our employees. Among the benefits are:
- We work with a hybrid model
- An initial allowance so you can equip your workspace at home
- Telework allowance and subsidies
- Flexible working hours and 2 months of intensive working hours in summer so you can fully take advantage of your time
- We provide life insurance and a pension plan for all our employees
- We offer an objectives\-based bonus as a performance reward
- You can get to the office as you please. We have a free parking lot for cars, motorbikes, electric cars with chargers, and bikes.
- Get to know our flexible compensation options, such as a transport card, nursery vouchers, health insurance with Sanitas, training…
- We care about people. That is why we are involved in society, offering volunteering activities and volunteering time to all our employees.
- We care about your wellbeing. We have a Wellness Program available.
- We are digital and we love technology. Also, every team works under agile methodology.
What we are looking for:
- 5\+ years in MLOps/SRE or platform engineering with production ML/AI workloads; deep experience with Databricks MLflow and enterprise CI/CD is highly preferred.
- Design/operation of multi\-env releases (dev → test → prod) with approvals, secrets, and identity; IaC (Terraform/Crossplane) and cloud networking on Azure and/or AWS.
- Hands\-on with evaluation/monitoring for models/agents; comfort with SLOs, incident response, and capacity/cost management.
- Familiarity with tools like ArgoCD, Crossplane, Istio, Knative, OpenSearch, Prometheus, and Grafana.
- Experience operating Kubernetes/Seldon for model serving and migrating toward Databricks Mosaic when appropriate.
- Familiarity with agentic AI evaluation patterns (task success, tool reliability) and RAG observability (Vector Search health).
- Experience in multi\-cloud operations and cross\-region latency management.
- Working knowledge of Unity Catalog lineage/policies and how they integrate with delivery pipelines and catalogs; ability to produce audit\-ready evidence.
Please send your CV in English.
*At Nationale\-Nederlanden we are committed to diversity. We are proud to be an inclusive organization and we offer equal opportunities, regardless of race, cultural background, gender, gender identity, religion, national origin, age, disability, marital status, and sexual orientation. One of our core values is taking care of our employees so they can give their best within a respectful environment.*