
Anthropic Challenges Current AI Benchmarking Practices with New Funding Initiative

Written by
ArticleGPT

Reviewed and fact-checked by the HIX.AI Team

2 min read · Jul 02, 2024

In a Nutshell

Anthropic’s new program, unveiled on Monday, will distribute funds to external organizations that can develop benchmarks to effectively assess AI models' performance and impact.

Anthropic is introducing a new funding initiative to tackle the problems with current AI benchmarking practices, where existing benchmarks limit the ability to assess the performance and influence of AI models.

Existing benchmarks often fall short of accurately representing how the average person uses AI systems. They fail to capture the nuances and complexities of real-world usage, which limits their ability to offer meaningful insight into AI model performance.

Additionally, many of these benchmarks were developed before the advent of modern generative AI, raising questions about their relevance and applicability.

Anthropic's Funding Initiative

The program aims to identify and fund third-party organizations capable of creating benchmarks that can effectively measure advanced capabilities in AI models.

“Our investment in these evaluations is intended to elevate the entire field of AI safety, providing valuable tools that benefit the whole ecosystem,” Anthropic published on its official blog.

The need for new benchmarks that can evaluate AI models more accurately is urgent: “Developing high-quality, safety-relevant evaluations remains challenging, and the demand is outpacing the supply,” the blog post added.

Focus Areas for New Benchmarks

Anthropic's new benchmarks will focus on evaluating AI models' advanced capabilities, particularly in relation to AI security and societal implications.

These benchmarks will assess a model's ability to carry out tasks that have significant implications, such as cyberattacks, weapon enhancement, and manipulation or deception of individuals through deepfakes or misinformation.

Furthermore, Anthropic aims to develop an "early warning system" to identify and assess AI risks related to national security and defense. While details about this system are not disclosed in the blog post, Anthropic emphasizes its commitment to addressing these risks.

The funding program will also support research into benchmarks for "end-to-end" tasks, exploring AI's potential in various domains.

These tasks include aiding scientific research, conversing in multiple languages, mitigating ingrained biases, and filtering out toxicity.

Anthropic intends to develop new platforms that empower subject-matter experts to generate their own assessments and carry out extensive trials involving thousands of users.

The company has employed a dedicated coordinator for this initiative and is exploring opportunities to acquire or expand projects with scalability potential.

CEO Dario Amodei has emphasized the wider impact of AI and the necessity for thorough solutions to tackle possible inequality issues.

In an interview with Time Magazine, Amodei highlighted the importance of finding solutions beyond Universal Basic Income to ensure that advancements in AI technology benefit the wider public.

Based on 2 sources

Anthropic looks to fund a new, more comprehensive generation of AI benchmarks

Anthropic is launching a program to fund the development of new types of benchmarks capable of evaluating the performance and impact of AI models, including generative models like its own Claude.

Google's Alphabet And Amazon-Backed Anthropic Lead Effort To Redefine AI Evaluation Standards

Anthropic’s new program, revealed on Monday, will allocate funds to third-party organizations capable of creating benchmarks that can effectively evaluate the performance and impact of AI models.
