About the role
Data is the most important ingredient in training larger and better models. Getting the best data is critical to making our models more intelligent and empathetic. How good is our current data? What new data do we need to ensure our success? What data mixture should we use in the training runs?
As a passionate LLM data researcher, you love uncovering interesting subsets of data, creating clear dashboards to communicate your findings, and proposing ideas for unexplored opportunities. You are also extremely interested in learning how to support large language model development with unparalleled data quality and performance insights. You believe in data-centric AI, and know how to improve LLMs from the data perspective. You want to be involved in training ever larger foundation models.
Responsibilities
Curate, mine, and analyze datasets for LLMs
Work with our pre-training/post-training team to identify datasets needed for specific user experiences
Run hundreds of careful ablation experiments to improve our models’ performance on benchmarks we care about
Help maintain and improve core tables in our data lake used for research across the company
Alongside the data platform team, you will be responsible for making sure this vital resource is available, understood, and of the highest quality.
Who we’re looking for
Required Experience:
A bachelor’s degree in a quantitative field (Physics, Mathematics, Computer Science); PhD preferred.
2+ years of industry experience training and evaluating LLMs.
Experience mining text and graphical data
Experience with Spark
Additional Desired Experience:
Experience with hill climbing on LLM benchmarks
Experience with cloud platforms like GCP
Experience with Kubernetes
Data visualization skills
SQL Wizardry
You will be a good fit if you are proactive and have a “get things done” mindset. Given our current pace of growth and load on our systems, most people have had a significant impact during their first week at the company.
About Character.AI
Founded in 2021, Character is a leading AI company offering personalized experiences through customizable AI ‘Characters.’ As one of the most widely used AI platforms worldwide, Character enables users to interact with AI tailored to their unique needs and preferences.
In just two years, we achieved unicorn status and were named Google Play’s AI App of the Year – a testament to our groundbreaking technology and vision.
Ready to shape the future of Consumer AI? 🚀
At Character, we value diversity and welcome applicants from all backgrounds. As an equal opportunity employer, we do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, or disability. Your unique perspectives are vital to our success.