Cerebras’ systems are designed with a singular focus on machine learning. Our processor is the Wafer Scale Engine (WSE), a single chip with performance equivalent to a cluster of GPUs, giving users cluster-scale capability with the simplicity of programming a single device. This programming simplicity lets large model training scale out with simple data parallelism to performance that would otherwise require thousands of GPUs, so ML practitioners can focus on their machine learning rather than on parallelizing and distributing their applications across many devices. The Cerebras hardware architecture offers unique capabilities, including orders-of-magnitude higher memory bandwidth and unstructured sparsity acceleration, that are not available on traditional GPUs. With a rare combination of cutting-edge hardware and deep expertise in machine learning, we stand among the select few organizations worldwide capable of conducting large-scale, innovative deep learning research and developing novel ML algorithms not possible on traditional hardware.
About the Role
Cerebras has senior and junior research scientist roles open, focused on the co-design and demonstration of novel, state-of-the-art ML algorithms on this unique, specialized architecture. We are working on research areas including advancing and scaling foundation models for natural language processing and multi-modal applications, new weight and activation sparsity algorithms, and novel efficient training techniques. A key responsibility of our group is to ensure that state-of-the-art techniques can be applied systematically across many important applications.
As part of the Core ML team, you will have the unique opportunity to research state-of-the-art models as part of a collaborative and close-knit team. We deliver important demonstrations of Cerebras’ capabilities and publish our findings to support and engage with the community. A key aspect of the senior role is also to provide active guidance and mentorship to other talented and passionate scientists and engineers.
Research Directions
Our research focuses on improving state-of-the-art foundation models in NLP, computer vision, and multi-modal settings by studying many dimensions unique to the Cerebras architecture:
- Scaling laws to predict and analyze large-scale training improvements: accuracy/loss, architecture scaling, and hyperparameter transfer
- Sparse and low-precision training algorithms that reduce training time and increase accuracy, such as weight and activation sparsity, mixture-of-experts, and low-rank adaptation
- Optimizers, initializers, and normalizers that improve training dynamics and efficiency
Here is a sampling of our recent publications and releases:
- Cerebras-GPT: The first open-source, compute-optimal LLM suite with models up to 13B parameters [paper] [huggingface]
- BTLM: A state-of-the-art 3B LLM with 7B-level performance, the most popular 3B model on Hugging Face [paper] [huggingface]
- SlimPajama: A 627B-token, cleaned and deduplicated version of RedPajama [blog] [huggingface]
- Sparse Pretrain, Dense Fine-tune: Sparse GPT models up to 6.7B parameters matching dense downstream accuracy [paper]
- Sparse-IsoFLOP Transformations: Creating larger sparse models with higher accuracy at the same compute [paper]
Responsibilities
- Develop novel training algorithms that advance state-of-the-art in model quality and compute efficiency
- Develop novel network architectures that address foundational challenges in language and multi-modal domains
- Co-design ML algorithms that take advantage of the unique capabilities of existing Cerebras hardware, and collaborate with engineers to co-design next-generation architectures
- Design and run research experiments that show novel algorithms are efficient and robust
- Analyze results to gain research insights, including training dynamics, gradient quality, and dataset preprocessing techniques
- Publish and present research at leading machine learning conferences
- Collaborate with engineers on co-designing the product to bring the research to customers
Requirements
- Strong grasp of machine learning theory, fundamentals, linear algebra, and statistics
- Experience with state-of-the-art models, such as GPT, LLaMA, DALL-E, PaLI, or Stable Diffusion
- Experience with machine learning frameworks, such as TensorFlow and PyTorch
- Strong track record of research success, demonstrated through publications at top conferences or journals (e.g., ICLR, ICML, NeurIPS) or through patents and patent applications
Bonus:
- Experience with distributed training concepts and frameworks such as Megatron, DeepSpeed, and/or Fairseq FSDP
- Experience with training speed optimizations, such as model architecture transformations to target hardware, or low-level kernel development
Why Join Cerebras
People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:
- Build a breakthrough AI platform beyond the constraints of the GPU
- Publish and open source their cutting-edge AI research
- Work on one of the fastest AI supercomputers in the world
- Enjoy job stability with startup vitality
- Enjoy a simple, non-corporate work culture that respects individual beliefs
Read our blog: Five Reasons to Join Cerebras in 2024.
Apply today and join us at the forefront of groundbreaking advancements in AI.
Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.