Monthly Shaarli

All links of one month in a single page.

February, 2025

Humanity's Last Exam

We introduce Humanity's Last Exam, a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. The dataset consists of 2,700 challenging questions across over a hundred subjects. We publicly release these questions, while maintaining a private test set of held-out questions to assess model overfitting.

Open Euro LLM

A series of foundation models for transparent AI in Europe