Research Scientist – World Modelling


Vacancy Overview

Application Open:

Full-Time

Job Purpose:

MBZUAI is looking to recruit a Research Scientist for the Institute of Foundation Models (IFM) team to develop foundational world models that accurately simulate the physical world, collaborating closely with engineering and data teams on large-scale training challenges. The role includes designing scalable data annotation pipelines, developing rigorous performance benchmarks, optimizing inference for real-time interaction, and advancing multimodal training systems. Expertise in visual tokenization, quantitative evaluation methods, and scaling laws for video pretraining is highly desirable. Candidates should hold a postgraduate degree in a related field and have a proven research track record.

 

Key Responsibilities:

  • Develop the foundational world model to accurately simulate the physical world.
  • Collaborate with engineering and data teams to tackle key challenges in training the world model on large-scale clusters.
  • Develop metrics and evaluation benchmarks to better assess model performance.
  • Design and implement a scalable and efficient data annotation pipeline to ensure high-quality labeled data for training and evaluation.
  • Optimize inference efficiency to enable real-time interaction.

Areas of Focus

  • Scalable Training Systems: Develop and optimize infrastructure for training multimodal LLMs and video diffusion models at massive scale.
  • Efficient Data Pipelines: Build scalable video data pipelines and annotation frameworks to support high-quality training data.
  • Inference Optimization: Enhance inference efficiency through optimization and distillation techniques to enable real-time interaction.
  • Visual Tokenization: Develop methods for discretizing visual features into tokens for improved model representation.
  • Quantitative Evaluation: Establish rigorous benchmarks to assess physical accuracy, controllability, and intelligence.
  • Scaling Laws for Video Pretraining: Investigate scaling law principles to guide efficient video pre-training strategies.

Academic Qualifications:

  • MSc or PhD in Machine Learning or Computer Science, or equivalent industry experience.

Professional Experience:

  • Experience in large-scale model training (LLMs or Diffusion Models) on large clusters.
  • Hands-on experience with state-of-the-art video generative models (e.g., Sora, Veo 2, MovieGen, CogVideoX).
  • Experience in building and optimizing large-scale video data pipelines.
  • Experience in accelerating diffusion model inference for improved efficiency.
  • Exceptional problem-solving and troubleshooting skills to tackle complex technical challenges.
  • Strong systems and engineering expertise in deep learning frameworks such as PyTorch.
  • Strong communication and collaboration skills for effective cross-functional teamwork.
  • Ability to navigate ambiguity and drive projects in rapidly evolving research areas.
  • Research contributions to top-tier conferences or journals (e.g., ICML, ICLR, NeurIPS, ACL, CVPR, COLM), with published work in relevant domains.

Apply Now:
