Senior AI Software Engineer – Model Training (f/m/d) - Aleph Alpha Careers (Heidelberg)

Our Mission

Aleph Alpha is one of the few companies in Europe doing serious foundation model pre- and post-training. We're building models that have general-purpose capabilities, and specifically excel at addressing the needs of our customers.

We're looking for exceptional Software Engineers to join our model training team. Most of the team is based in Heidelberg.

Team Culture

At Aleph Alpha, we foster a culture built on ownership, autonomy, and empowerment. Teams and individual contributors are trusted to take responsibility for their work and drive meaningful impact. We maintain a flat organizational structure with efficient, supportive management that enables quick decision-making, open communication, and a strong sense of shared purpose.

We believe a strong engineering culture is the key to model training success. We like Extreme Programming and favor trunk-based development. We often mob-program, which keeps us aligned and means we always learn from each other.

About the Role

As a Software Engineer in Model Training, you'll work across our full stack. Some weeks you might be optimizing how training loads are scheduled on our cluster and making the pipeline more robust and performant so we can iterate faster. Other weeks, you'll be enabling large-scale code execution for reinforcement learning. And at other times, you might dig deep into our evaluation codebase to lift inference throughput on evals.

No two days are the same. Things move fast, and your ability to focus and prioritize is what lets you unblock the team day-to-day while designing the high-quality tooling and infrastructure that speeds us up long-term.

We're still building out our training pipeline and infrastructure. Some pieces exist, some don't, and you'll have real influence on what gets built and how. Your work directly shapes how quickly we can experiment and improve our models.

Your responsibilities

Co-own the training pipeline end-to-end. Design, build, and maintain the infrastructure and components that let us iterate fast on experiments.
Build high-quality tooling. Model training is a continuous effort, and we deliberately invest in our tooling and infrastructure to stay successful long term.
Collaborate across disciplines. We believe in cross-functional teams. Engineers and researchers work closely so we can learn from each other and iterate faster together.
Champion good engineering practices. Working incrementally, maintaining fast feedback loops, and refactoring continuously keep a team successful long-term, especially when moving fast.
Shape the direction of the team. Our culture empowers individuals to take ownership. If you see that we'll need more GPUs, a different storage system, or a change to how the team is set up, you should drive this change.

Your profile

In the model training team, we hire slowly and deliberately. We recognise that we need top talent to deliver the best models, and we value ability over experience: if you think you would be a good fit for this role and training an LLM in Europe excites you, we want to hear from you.

Requirements

A track record of taking initiative to deliver high-impact work.
Experience contributing to high-performing teams.
Degree in computer science, engineering, or a related field.
Willingness to relocate to Germany. Our primary working locations are Heidelberg (preferred) and Berlin, although there is some flexibility to work from other locations in Germany with regular travel to Heidelberg (potentially weekly).
Ability to write software that other strong engineers want to read and build on.
Desire to take ownership of problems and collaborate with other teams to solve them.
Deep interest in how state-of-the-art foundation models work.
Strong communication skills, with the ability to convey technical solutions to diverse audiences.

Nice-to-haves

Experience working with distributed systems.
Experience working with Kubernetes.

We do not require prior experience in machine learning for this role, but we do value your eagerness to learn. If you have prior experience in ML, we will be particularly excited about:

Experience bringing AI research innovations into production.
Experience in areas such as large-scale data processing or distributed computation for foundation model training or inference.
Experience with performance engineering: profiling, benchmarking, and optimizing code for throughput, latency, or memory.

Compensation and benefits

Become part of an AI revolution!
30 days of paid vacation
Access to a variety of fitness & wellness offerings via Wellhub
Mental health support through nilo.health
Substantially subsidized company pension plan for your future security
Subsidized Germany-wide transportation ticket
Budget for additional technical equipment
Flexible working hours and a hybrid working model for better work-life balance
Virtual Stock Option Plan
JobRad® Bike Lease
