Senior HPC Engineer – IFM

Vacancy Overview

Application Open:

Full-Time

MBZUAI’s Institute of Foundation Models (IFM) is seeking a Senior HPC Engineer to provide technical leadership in designing, operating, and evolving the large-scale GPU infrastructure that supports frontier AI research. IFM operates one of the world’s largest AI-focused supercomputing environments, and the role contributes directly to its research and development.

Key Responsibilities
• Lead operation and optimization of large-scale GPU clusters.
• Drive reliability, scalability, and performance improvements.
• Lead troubleshooting and root cause analysis of complex issues.
• Design and validate new cluster deployments and upgrades.
• Collaborate with researchers to optimize distributed AI training.
• Lead vendor engagement and technical reviews.
• Mentor junior engineers.
• Define monitoring, operational standards, and capacity planning processes.
• Participate in major incident management and escalations.

Academic Qualification

Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, Software Engineering, Information Technology, Applied Mathematics, Physics, or a related discipline.
Master’s degree preferred.

Professional Experience Required
Essential:
• 5+ years in HPC, Linux infrastructure, cloud infrastructure, distributed systems, or large-scale production environments.
• Experience with Slurm and Linux administration.
• Experience troubleshooting compute, storage, and networking systems.

Preferred:
• GPU cluster operations.
• NVIDIA technologies including CUDA, NCCL, NVLink, and GPUDirect.
• InfiniBand networking.
• Weka, Lustre, BeeGFS, or similar storage platforms.
• Azure, AWS, or GCP.
• Terraform, Ansible, or Infrastructure-as-Code.
• PyTorch Distributed, Megatron-LM, DeepSpeed, FSDP, or large-scale AI training environments.
