- Top 10% of responders: Vals AI is in the top 10% of companies in terms of response time to applications
- Responds within a week: Based on past data, Vals AI usually responds to incoming applications within a week
- B2B
- Early Stage: Startup in initial stages
- Recently funded: Raised funding in the past six months
- Website
- Location
- Company size: 1-10 people
- Total raised: $5M
- Company type: Seed
- Markets
Vals AI careers
Measuring model ability is the most challenging part of building applications that can automate any given part of the economy. There are no good techniques or benchmarks for evaluating LLM performance on business-relevant tasks, so adoption in enterprise production settings has been limited (see Wittgenstein’s ruler).
This problem materializes wherever LLMs have potential: AI tool companies need to understand whether the product they are building will satisfy customer demand, enterprises need to determine how suitable models and vendors are when making purchasing decisions, and researchers need a north star toward which to expand model ability.
Today, answering these questions amounts to hiring a human review team to manually evaluate model outputs. This is prohibitively expensive and slow.
Vals AI is building the enterprise benchmark of LLMs and LLM apps on real-world business tasks. In doing so, we are creating the infrastructure and certification to automatically audit LLM applications, verifying they are ready for consumption.
We've raised mid-seven figures from some of the top institutional investors and strategic angels in Silicon Valley. See our benchmarks and launch announcement in Bloomberg: bloomberg.com/news/newsletters/2024-04-11/this-startup-is-trying-to-test-how-…
Member of Technical Staff
In office • Palo Alto
$125k – $175k • 0.2% – 0.5% • 3 months ago
Benchmarking Engineer (part-time)
In office • Austin
$40k – $120k • No equity • 3 months ago
Head of AI
In office • Palo Alto
$175k – $250k • 0.5% – 1.5% • 3 months ago
Founding Product Engineer
In office • Austin
$150k – $230k • 0.5% – 1.5% • 3 months ago
Rayan Krishnan
Total raised
$5M
Funded over
1 round
Latest round