GLM-4.7 vs Kimi K2
tree_0006 · Asthma: Types, Causes, Symptoms, Diagnosis & Treatment
Round Context
Asthma: Types, Causes, Symptoms, Diagnosis & Treatment
Why Asthma Puts You at Greater Risk This Flu Season
Identify the medical institution whose pediatric asthma experts are marketed with the promise to help children 'breathe easier' when they 'gasp and wheeze.' Based on this institution's published health insights, explain the specific relationship between asthma and the flu highlighted in their November 2020 guidance. Additionally, identify which two specific conditions—one related to the sinuses and one to blood pressure—are explicitly cited as examples of issues treated by their primary care providers.
Answer length: 200-300 words.
Hidden checklists
- Target Entity: Cleveland Clinic (Identified via 'breathe easier' and 'gasp and wheeze' marketing text)
- Asthma/Flu Relationship: Infections (like the flu) are a common asthma trigger
- Primary Care Condition 1: Sinus infections
- Primary Care Condition 2: High blood pressure
The question applies deep logic by masking the source (Cleveland Clinic) behind a specific quote from its pediatric asthma marketing material ('breathe easier', 'gasp and wheeze'). It also requires wide information aggregation: the agent must retrieve facts from two distinct areas of the site, a specific dated article about asthma triggers (Target 0) and a general primary care service description listing the conditions treated (Target 1).
Judgment
Both agents failed the fundamental 'Deep Logic' check by misidentifying the target entity. The prompt contains specific marketing language ('breathe easier' when they 'gasp and wheeze'), which the Ground Truth identifies as the tagline of the Cleveland Clinic's pediatric asthma program. Agent A incorrectly identified the Mayo Clinic, and Agent B incorrectly identified UT Physicians. Because both agents named the wrong institution, their subsequent answers about the 'November 2020 guidance' and the 'primary care conditions' were either hallucinations or generic medical advice misattributed to the wrong entity. Under the evaluation criteria, when both agents fail the core entity check, the result is a Low Quality Tie.
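The tie rule in the judgment can be sketched in a few lines of Python. This is only an illustration, not the evaluator's actual code: the names `AgentResult` and `judge` are hypothetical, the entity check is assumed to be a case-insensitive comparison that ignores a leading "the", and the outcomes for cases other than a double failure are placeholders, since the source only states the both-fail rule.

```python
from dataclasses import dataclass


@dataclass
class AgentResult:
    """Hypothetical record of what one agent named as the institution."""
    name: str
    identified_entity: str


def _normalize(entity: str) -> str:
    # Assumption: compare case-insensitively and ignore a leading "the".
    text = entity.strip().lower()
    return text[4:] if text.startswith("the ") else text


def judge(a: AgentResult, b: AgentResult, target_entity: str) -> str:
    """Apply the core entity check before any other criterion."""
    a_ok = _normalize(a.identified_entity) == _normalize(target_entity)
    b_ok = _normalize(b.identified_entity) == _normalize(target_entity)
    if not a_ok and not b_ok:
        # The only rule stated in the source: both fail -> Low Quality Tie.
        return "Low Quality Tie"
    if a_ok and b_ok:
        # Placeholder (assumption): remaining checklist items would decide.
        return "Compare remaining checklist items"
    # Placeholder (assumption): the agent that passed the check is preferred.
    return f"{a.name if a_ok else b.name} preferred"


verdict = judge(
    AgentResult("Agent A", "Mayo Clinic"),
    AgentResult("Agent B", "UT Physicians"),
    target_entity="Cleveland Clinic",
)
print(verdict)
```

Running the entity check first mirrors the judgment's reasoning: once the institution is wrong, the answers about the November 2020 guidance and the primary care conditions cannot be credited to the right source.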
GLM-4.7
Zhipu AI
Kimi K2
Moonshot AI