
“The Final Exam for Humanity” to Assess Robust AI


Scientists Getting Ready for “The Final Exam for Humanity” to Assess Robust AI

 

 

 


However, there’s one subject the organizers won’t test AI on.

AI experts are requesting the “hardest and broadest set of questions ever” in an effort to challenge the most sophisticated AI systems in use now, as well as those that will be developed in the future.


The test, dubbed “Humanity’s Last Exam” in the industry, is being crowdsourced, according to Reuters, by the Center for AI Safety (CAIS) and the training data labeling company Scale AI. Scale AI raised a cool billion dollars during the summer, for a $14 billion total valuation.

As noted by Reuters, the submission period for this “exam” opened one day after the preview results for OpenAI’s new o1 model were released. According to Dan Hendrycks, executive director of CAIS, o1 appears to have “destroyed the most popular reasoning benchmarks.”
Hendrycks coauthored two papers in 2021 that included suggestions for AI testing to see if models might outperform undergraduates. The AI systems being evaluated at the time were answering questions almost at random, but as Hendrycks points out, modern models have “crushed” the 2021 tests.

Abstract Thinking


While the 2021 testing criteria primarily grilled the AI systems on math and social studies, “Humanity’s Last Exam” will, as the CAIS executive director said, incorporate abstract reasoning to make it harder.
The two institutions organizing the test also plan to keep the test criteria confidential rather than open them up to the public, to make sure the answers don’t end up in any AI training data.

Submissions are due November 1, and experts in fields as far-flung as rocketry and philosophy are being encouraged to submit questions that would be difficult for those outside their areas of expertise to answer. Once the submissions have undergone peer review, winners will be offered co-authorship of a paper associated with the test and prizes of up to $5,000 sponsored by Scale AI.


While the organizers are casting a very wide net for the types of questions they’re seeking, they told Reuters that there’s one thing that will not be on the exam: anything about weapons, because it’s too dangerous for AI to know about.

 

 

