HPC Application Engineer

Vacancy Overview

Full-Time

Job Purpose:

MBZUAI is seeking a High-Performance Computing (HPC) Application Engineer to support research teams in optimizing, managing, and troubleshooting HPC software environments and applications. This role includes providing scientific computing support, enhancing parallel computing performance, managing Slurm job scheduling, and delivering user training. The ideal candidate will have deep expertise in scientific programming, high-performance libraries, and application profiling to ensure efficient and effective utilization of HPC resources.

Key Responsibilities:

HPC Software & Environment Management:

  • Install, configure, and maintain HPC system software, libraries, and scientific applications.
  • Manage training jobs and research applications using Slurm.
  • Ensure software compatibility with HPC hardware and workflows.
  • Develop custom software environments for research teams.

User Support & Research Collaboration:

  • Work directly with researchers to understand and optimize computational workflows.
  • Provide debugging, profiling, and performance tuning for HPC applications.
  • Advise users on best practices for parallel programming, job scheduling, and data management.
  • Conduct workshops and training on optimizing applications for HPC systems.

Performance Optimization & Troubleshooting:

  • Profile and optimize scientific computing applications to maximize efficiency.
  • Use performance analysis tools (NVIDIA Nsight, Intel VTune, or others) to improve application performance.
  • Debug and troubleshoot failed jobs, inefficient resource usage, and software bottlenecks.

Emerging Technologies & Future Development:

  • Explore and integrate new HPC frameworks, AI/ML acceleration libraries, and parallel computing techniques.
  • Contribute to HPC documentation, automation, and user experience improvements.
  • Collaborate with IT, system engineers, and research teams to improve software deployment and system utilization.

Other Duties:

  • Perform all other duties as reasonably directed by the line manager that are commensurate with these functional objectives.

Academic Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, Computational Science, or a related field.
  • A postgraduate degree is preferred.

Professional Experience:

Essential

  • Three or more years of experience working with HPC software, scientific computing, and parallel applications.
  • Strong programming skills in Python, Go, C++, or CUDA.
  • Experience with job scheduling (Slurm) or another HPC platform.
  • Strong problem-solving skills in debugging, profiling, and optimizing computational applications.
  • Excellent English communication skills, a collaborative attitude, and the ability to work effectively with engineers at all levels.
  • Experience with source control systems, build tools, and continuous integration pipelines.
  • Hardworking, self-motivated, and detail-oriented, with a proven ability to meet tight deadlines.

Preferred

  • A PhD with 2+ years of equivalent practical or research experience.
  • Experience with AI/ML frameworks (TensorFlow, PyTorch) and HPC acceleration techniques.
  • Background in scientific research or domain-specific computing.
  • Contributions to open-source scientific computing projects.
  • Experience in higher education or research institutions, with an understanding of core research facility operations.
  • Proficiency in data analytics for process optimization and continuous improvement.
  • Working proficiency in additional languages is a plus.
