**This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Research Engineer - Pre-training in Mexico.**
This role sits at the core of next-generation AI model development, focusing on advancing the large-scale pre-training systems that power state-of-the-art models. You will work on cutting-edge architectures spanning small, large, and multimodal models, directly influencing model capability, efficiency, and scalability. Operating in a highly research-driven, distributed engineering environment, you will help push the boundaries of what modern AI systems can achieve. The position combines deep scientific exploration with hands-on engineering on massive GPU clusters: you will design and optimize training pipelines that run across thousands of NVIDIA GPUs, ensuring performance at scale. This is an opportunity to contribute to foundational AI breakthroughs while collaborating with world-class researchers and engineers in a fast-paced, innovation-focused setting.
Accountabilities
----------------
In this role, you will lead and contribute to the development of large-scale pre-training systems and model architectures that enhance intelligence and efficiency. You will design experiments, build scalable training frameworks, and improve model performance through iterative research and engineering work.
* Conduct large-scale pre-training of AI models on distributed GPU clusters, ensuring scalability, stability, and performance
* Design, prototype, and optimize novel model architectures, including transformer and non-transformer approaches
* Run experiments, analyze results, and refine methodologies to improve training efficiency and model quality
* Identify and resolve bottlenecks in training systems, data pipelines, and model performance
* Improve distributed training infrastructure to support next-generation AI workloads
* Collaborate with researchers and engineers to translate experimental ideas into production-ready training systems
* Contribute to the evolution of high-performance AI training systems and frameworks
Requirements
------------
The ideal candidate has deep expertise in AI research and large-scale model training, with strong technical foundations in machine learning, distributed systems, and deep learning frameworks. You should be comfortable working in highly complex, GPU-intensive environments and driving research from concept to implementation.
* PhD or strong academic/research background in Computer Science, Machine Learning, NLP, or related fields (preferred)
* Hands-on experience with large-scale LLM pre-training on distributed GPU infrastructure (thousands of GPUs)
* Strong understanding of transformer architectures and advanced model design techniques
* Experience with distributed training frameworks and large-scale AI systems
* Proficiency in PyTorch and Hugging Face ecosystem for model development and training
* Strong skills in debugging, optimizing, and improving model and system performance
* Ability to design experiments, interpret results, and iterate on research hypotheses
* Strong collaboration and communication skills in research-driven environments
Benefits
--------
* Competitive compensation package aligned with AI research market standards
* Remote-friendly and globally distributed work environment
* Opportunity to work on frontier AI research at massive scale
* Access to high-performance computing infrastructure and large GPU clusters
* Collaborative environment with top-tier AI researchers and engineers
* High autonomy in research direction and experimentation
* Exposure to state-of-the-art AI systems and multimodal model development
* Professional growth in a fast-evolving, innovation-driven AI ecosystem