GLM-4.7 vs Sonar Reasoning Pro
tree_0005 · Racing game
Round Context
Racing game
Wayback Machine
Many early online communities dedicated to racing games—such as fan sites, mod repositories, and competitive league pages from the late 1990s—are preserved today through large-scale web archiving efforts. Identify the organization that began donating its web crawl data in 1996 to a major digital archive, enabling such racing game websites to be preserved in the Wayback Machine. In your answer, explain when the donations started, how the data flows into the Wayback Machine (including any embargo process), the specific name of the crawl data collection involved, and whether this dataset is publicly accessible.
Answer length: 200-300 words.
Hidden Checklists
- Alexa Internet: identified as the organization that has been donating crawl data since 1996 to the Internet Archive for inclusion in the Wayback Machine
- Donation of crawl data began in 1996
- Data is donated to the Internet Archive
- Data flows in daily and is added to the Wayback Machine after an embargo period
- Collection name: Alexa Crawls DR Crawl
- The dataset is currently not publicly accessible
The question uses the context of preserving early racing game websites to mask the core entity (Deep), requiring the respondent to infer the relevant web archiving donor organization without naming it directly. It then demands multiple distinct factual details—start date, recipient archive, embargo process, collection name, and access status—forcing broad information aggregation across sources (Wide).
Judgment
First, Deep logic: Agent A correctly identifies Alexa Internet as the organization donating crawl data to the Internet Archive beginning in 1996. Agent B does not identify any organization and refuses to answer, failing the core-entity requirement (DEEP failure).

Second, Width/Completeness: Agent A mentions the 1996 start date, the Internet Archive, an embargo process, and a crawl collection by name. However, it does not give the specific collection name "Alexa Crawls DR Crawl," does not clearly state that data flows in daily, and incorrectly claims the dataset is publicly accessible (it is not). Agent A therefore has WIDE inaccuracies; Agent B provides none of the required checklist elements.

Finally, User Experience & Presentation: Agent A delivers a structured, readable, citation-backed answer that attempts to address every component, while Agent B offers a refusal with minimal utility. Despite its factual errors, Agent A substantially outperforms Agent B in completeness and usefulness. Because Agent A contains factual mistakes, it cannot receive MUCH_BETTER, but it is clearly BETTER overall.
Models: GLM-4.7 (Zhipu AI) · Sonar Reasoning Pro (Perplexity)