Machine Learning Engineer - Healthcare
The mission of The University of Texas M. D. Anderson Cancer Center is to eliminate cancer in Texas, the nation, and the world through outstanding programs that integrate patient care, research, prevention, and education. Core to the success of our mission is the ability to orchestrate multidimensional data, data analytics, and machine learning to create sustainable impact within a framework of responsible AI. We are building a dynamic team of AI experts that can help us consistently and responsibly accelerate the impact of AI across the enterprise, driving long-lasting improvements in cancer care.
Summary:
We are seeking an ML Engineer to help build and scale the AI/ML platform that underpins data science and enterprise machine learning operations. This role is central to enabling a robust AI lifecycle management framework, with responsibilities spanning the development, deployment, and monitoring of production-quality machine learning models that support both clinical and business operations. The ML Engineer will also contribute to platform reliability, automation, and integration, ensuring seamless workflows for data scientists and model developers. In addition, the role will support the evaluation and validation of external AI/ML models and products. Beyond technical delivery, this position will help foster team collaboration, drive a culture of innovation, and strengthen the processes and technological foundations that accelerate enterprise-wide adoption of data science and MLOps best practices.
Technical Expertise
Support development, administration, and maintenance of the AI/ML platform (built on Dataiku, Kubernetes, and Azure), ensuring reliability, scalability, and seamless integration with enterprise systems.
Orchestrate training and deployment pipelines in Dataiku, targeting both Azure and on-premises Kubernetes clusters.
Develop and maintain MLOps workflows for versioning, monitoring, reproducibility, and governance.
Manage containerized environments using Docker and Kubernetes to support data science workloads.
Provide platform support and troubleshooting for data scientists and ML engineers, enabling efficient model development and deployment.
Monitor performance, security, and compliance of the AI/ML platform, ensuring adherence to enterprise and regulatory standards.
Analytical Skills
Support the design and operation of scalable pipelines in Dataiku, Kubernetes, and Azure, ensuring reliable feature management, ML artifact tracking, and data quality monitoring.
Troubleshoot, test, and resolve issues across the platform with strong debugging and problem-solving skills.
Assist with data integration and interoperability, applying healthcare data standards and ontologies (e.g., DICOM, HL7, FHIR) where required.
Professionalism: Oral and Written
Provide knowledge transfer and technical assistance by proactively sharing platform expertise, best practices, and methodologies with peers and end users.
Support data analytics and automation workflows by reviewing requests, enabling access to data, and assisting with analysis and interpretation when needed.
Clearly communicate results, updates, and platform status in project meetings and, when appropriate, external workshops, conferences, or collaborations.
Collaborate effectively with leaders, peers, end users, and support teams, ensuring responsive and professional communication across technical and non-technical stakeholders.
Other duties as assigned
EDUCATION
- Required: Bachelor's degree in Computer Science, Software Engineering, Data Science, Physics, Math & Statistics, or another related engineering discipline.
- Preferred: Master's degree in Computer Science, Software Engineering, Data Science, Physics, Math & Statistics, or another related engineering discipline.
WORK EXPERIENCE
- Required: 3 years of experience in machine learning engineering, data science, data engineering, and/or software engineering.
- Required: 1 year of experience with a Master's degree.
- No experience required with a PhD.
Preferred Experience/Skills: Experience with MLOps platforms and/or cloud AI certifications, strong proficiency in CI/CD and automation of the AI lifecycle, and experience working on healthcare-focused machine learning projects. Experience with Azure and/or Kubernetes. Proficiency in services such as Azure Kubernetes Service (AKS) and Azure ML (or similar).
This position may be responsible for maintaining the security and integrity of critical infrastructure, as defined in Section 113.001(2) of the Texas Business and Commerce Code, and therefore may require routine reviews and screening. The ability to satisfy and maintain all requirements necessary to ensure the continued security and integrity of such infrastructure is a condition of hire and continued employment.