GPT-5.1 vs Grok 4
tree_0006 · Asthma: Types, Causes, Symptoms, Diagnosis & Treatment
Round Context
Asthma: Types, Causes, Symptoms, Diagnosis & Treatment
Why Asthma Puts You at Greater Risk This Flu Season
Identify the medical institution whose pediatric asthma experts explicitly promise to help parents 'breathe easier' when their child 'gasps and wheezes.' Locate a health insight published by this specific institution on November 19, 2020, regarding respiratory illnesses. According to this publication, what is the primary reason stated for why asthma puts individuals at greater risk during flu season, and what specific type of illness is cited as a common asthma trigger?
Answer length: 200-300 words.
Checklists
- Target Entity: Cleveland Clinic (specifically Cleveland Clinic Children's)
- Logic Proof: Matches the description of experts helping parents 'breathe easier' when a child 'gasps and wheezes'
- Identifies the specific article date: November 19, 2020
- Identifies the article context/title: Why Asthma Puts You at Greater Risk This Flu Season
- Identifies the specific trigger mentioned: Infections (specifically the flu)
The question requires Deep reasoning to identify 'Cleveland Clinic' solely through a specific phrasing of their pediatric care philosophy ('gasps and wheezes', 'breathe easier'). It then applies Wide aggregation by forcing the agent to locate a specific, dated historical article (Nov 19, 2020) associated with that entity to retrieve the specific medical logic regarding flu risks and triggers.
Judgment
Both agents failed the primary 'Deep Logic' test: the specific quote ('breathe easier' when a child 'gasps and wheezes') belongs to **Cleveland Clinic Children's**, not CHOC (Agent A) or Mayo Clinic (Agent B). However, Agent B is the clear winner because it found a **real** article from its chosen entity (Mayo Clinic) that was actually published on November 19, 2020 ('Mayo Clinic Q and A: Asthma and the flu') and correctly extracted the medical reasoning from that source. In contrast, Agent A hallucinated a CHOC article on that date to fit the prompt. Although neither agent identified the correct entity, Agent B's answer to the second half of the prompt was grounded and verifiable, whereas Agent A's was fabricated.
GPT-5.1
OpenAI
Grok 4
xAI