GPT-5.1 vs Gemini 2.5 Pro
tree_0025 · Cosmetology
Timeline
Arrow keys or j/k move between rounds.
Round Context
Cosmetology
Specific Career Fields
Locate the academic library Subject Guide that explicitly covers the combination of 'Business', 'Nursing & Health Sciences', and 'Philosophy' and is managed by the institution housing the 'Cascade Learning Resource Center'. Identify the Subject Librarian responsible for these guides. Then, consulting the 'Cosmetology' resources specifically recommended in this librarian's 'Careers' collection, provide the librarian's direct phone number, the Call Number for the book authored by Julie Handley, and the ISBN for the book titled 'Successful Salon Management'.
Answer length: 150-250 words.
Show hidden checklists
- Target Institution: Everett Community College (EvCC) or EvCC Library
- Logic Proof: Identified 'Cascade Learning Resource Center' and the specific subject combination (Business, Nursing, Philosophy) to find Marianne Le.
- Librarian Name: Marianne Le
- Librarian Phone Number: 425-388-9354
- Book 1 Detail: Call Number TT958 .H36 2004 (for Guide to Good Hairdressing by Julie Handley)
- Book 2 Detail: ISBN 1562536796 (for Successful Salon Management)
The question uses Deep Reasoning by masking the institution (Everett Community College) and the librarian (Marianne Le) behind unique attributes: the specific name of the resource center ('Cascade Learning Resource Center') and the specific cluster of subjects she manages. It requires Wide Aggregation by forcing the agent to retrieve heterogeneous data points: contact information from the librarian's profile and bibliographic metadata (Call Number, ISBN) from the specific 'Cosmetology' book list provided in the guide.
Judgment
Agent A is the winner based on accuracy and safety. The query describes a specific real-world entity: **Everett Community College** (EvCC), which houses the **Cascade Learning Resource Center** and employs Subject Librarian **Marianne Le** (who covers Business, Nursing, and Philosophy). Agent B failed the **Deep Logic** check by misidentifying the institution as 'University of Washington Bothell & Cascadia College'. While 'Cascadia' sounds like 'Cascade', they are distinct entities. Consequently, Agent B provided the wrong librarian (Hannah Mendro), the wrong phone number, and book details that—while real items in the UW catalog—do not match the specific ground truth requested (e.g., the Julie Handley book title and call number differ from the specific EvCC holdings implied by the prompt's ground truth). Agent A failed to retrieve the correct answer (a recall failure), but correctly identified that it could not find a match satisfying all constraints. In a reference context, Agent A's refusal is significantly better than Agent B's confident delivery of incorrect contact information and institutional data.
GPT-5.1
OpenAI