6 requests · 4 rated by you
Manually score responses. Your ratings are compared against the LLM judge's scores for the same requests; a low correlation between the two signals that the judge needs work.
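
As a rough illustration of the calibration check described above, the sketch below computes a correlation between judge scores and human ratings, using the four rated requests listed on this page. The page does not say which statistic the tool uses, so Pearson correlation is an assumption here, and the names `RATED_PAIRS` and `pearson` are hypothetical.

```python
import math

# (judge_score, human_normalized) for the four rated requests on this page.
RATED_PAIRS = [(92, 75), (42, 0), (95, 100), (35, 75)]


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs.

    Assumption: the tool may use a different statistic (e.g. a rank
    correlation); this is only one plausible choice.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two rated pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / math.sqrt(var_x * var_y)


r = pearson(RATED_PAIRS)
print(f"judge vs. human correlation: {r:.2f}")  # prints 0.61 for these pairs
```

With only four rated requests the estimate is very noisy, so rating more responses before drawing conclusions about the judge would be reasonable.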

customer-support-v2 · v7

claude-sonnet-4-5 · 5/16/2026, 3:16:01 PM

Judge: 92
You: 75

User input

Hi, I've been charged twice for the same subscription this month. Can you help?

Response

I see two charges on your account from yesterday. I've refunded the duplicate — it should reflect in your bank within 3-5 business days. Sorry for the trouble!

Your rating
4/5 (75 normalized)

customer-support-v2 · v7

claude-sonnet-4-5 · 5/16/2026, 3:00:01 PM

Judge: 88

User input

Your support is useless, I want a refund NOW

Response

I'm really sorry to hear about your experience. Could you tell me what went wrong so I can make it right? I'll personally make sure we resolve this.

Your rating

data-extraction · v3

claude-haiku-4-5 · 5/16/2026, 2:43:01 PM

Judge: 42
You: 0

User input

Extract: "John Smith, $4,200.50, due 2025-01-15"

Response

```json {"name": "John Smith", "amount": 4200.50, "due_date": "2025-01-15"} ```

Your rating
1/5 (0 normalized)

email-classifier · v2

gpt-4o-mini · 5/16/2026, 1:58:01 PM

Judge: 95
You: 100

User input

Subject: 50% OFF EVERYTHING — Last chance!

Response

{"category": "spam", "confidence": 0.94, "reasoning": "Promotional language + urgency"}

Your rating
5/5 (100 normalized)

customer-support-v2 · v7

claude-sonnet-4-5 · 5/16/2026, 12:28:01 PM

Judge: 78

User input

How do I delete my account?

Response

You can delete your account from Settings → Account → Delete account. Note that this action is irreversible and all your data will be permanently removed.

Your rating

data-extraction · v3

claude-haiku-4-5 · 5/16/2026, 11:28:01 AM

Judge: 35
You: 75

User input

Extract amount from: "approximately 19 bucks"

Response

{"amount": null, "note": "amount is ambiguous"}

Your rating
4/5 (75 normalized)
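
The normalized values shown with the ratings above (1/5 → 0, 4/5 → 75, 5/5 → 100) are consistent with a linear mapping of the 1-5 scale onto 0-100, which puts them on the same range as the judge scores. The sketch below shows that mapping; it is inferred from those three data points rather than documented on the page, and `normalize_stars` is a hypothetical name.

```python
def normalize_stars(stars: int, scale_max: int = 5) -> float:
    """Map a 1..scale_max rating linearly onto 0..100.

    Assumption: inferred from the values on this page
    (1/5 -> 0, 4/5 -> 75, 5/5 -> 100), not from documentation.
    """
    if not 1 <= stars <= scale_max:
        raise ValueError(f"rating must be between 1 and {scale_max}")
    return (stars - 1) / (scale_max - 1) * 100


for stars in (1, 4, 5):
    print(f"{stars}/5 -> {normalize_stars(stars):.0f} normalized")
```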