- Top 10% of responders: Luma AI is in the top 10% of companies in terms of response time to applications
- Responds within two weeks: Based on past data, Luma AI usually responds to incoming applications within two weeks
- Growth Stage: Expanding market presence
Senior Distributed Systems Engineer
- $180k – $250k
- Full Time
Posted: 5 months ago
Visa Sponsorship: Not Available
Relocation: Allowed
Hiring contact
Terry Kim
About the job
We are looking for people with strong ML and distributed systems backgrounds. This role sits within our Research team and collaborates closely with researchers to build the platforms for training our next generation of foundation models.
Responsibilities
- Work with researchers to scale up the systems required for our next generation of models trained on multi-thousand GPU clusters.
- Profile and optimize our model training codebase to achieve best-in-class hardware efficiency.
- Build systems to distribute work across massive GPU clusters efficiently.
- Design and implement methods to robustly train models in the presence of hardware failures.
- Build tooling to help us better understand problems in our largest training jobs.
Experience
- 5+ years of work experience.
- Experience working with multimodal ML pipelines, high-performance computing, and/or low-level systems.
- Passion for diving deep into systems implementations and understanding their fundamentals in order to improve their performance and maintainability.
- Experience building stable and highly efficient distributed systems.
- Strong generalist Python and software engineering skills, including significant experience with PyTorch.
- Nice to have: experience working with high-performance C++ or CUDA.
- Please note this role is not intended for recent graduates.
Compensation
- *The pay range for this position in California is $180,000 - $250,000/yr; however, base pay offered may vary depending on job-related knowledge, skills, candidate location, and experience. We also offer competitive equity packages in the form of stock options and a comprehensive benefits plan.*
Your application is reviewed by real people.
About the company
11-50
3D Scanning
- Top Investors: This company has received a significant amount of investment from top investors