Math · Grades HS
Open Up High School Mathematics Traditional
Open Up Resources
Judge anthropic/claude-opus-4-7 · Signal Studio judge (claude-opus-4-7) · 2026-05-25
97% indicator exact match
Gateways 3/3 Meets
Agreement vs gold
Indicator level — gold and the judge as two raters over the rubric criteria.
Exact match: 96.7% (30 indicators vs gold)
Weighted κ: 0.90 (ordinal-weighted agreement)
MAE: 0.07 (lower is better)
Signed bias: +0.07 (judge over-scores)
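As a rough illustration of how the indicator-level figures relate to each other, the sketch below computes exact match, MAE, and signed bias from two paired score lists. The function names are hypothetical, and the scores of the 29 agreeing indicators are placeholders, since the report lists only the one divergent indicator (gold 2, judge 4). The report does not say which weighting its κ uses, so the linear-weighted version here is an assumption, and it is not run on the placeholder data because κ depends on the full score distribution.

```python
from collections import Counter


def agreement_metrics(gold, judge):
    """Exact-match rate, mean absolute error, and signed bias (judge minus gold)."""
    if len(gold) != len(judge) or not gold:
        raise ValueError("gold and judge must be non-empty and of equal length")
    n = len(gold)
    diffs = [j - g for g, j in zip(gold, judge)]
    return {
        "exact_match": sum(d == 0 for d in diffs) / n,
        "mae": sum(abs(d) for d in diffs) / n,
        "signed_bias": sum(diffs) / n,  # positive means the judge over-scores
    }


def linear_weighted_kappa(gold, judge):
    """Cohen's kappa with linear weights |rank_i - rank_j| (assumed weighting)."""
    n = len(gold)
    cats = sorted(set(gold) | set(judge))
    rank = {c: k for k, c in enumerate(cats)}
    pairs = Counter(zip(gold, judge))
    gold_marg, judge_marg = Counter(gold), Counter(judge)
    # Observed and chance-expected mean disagreement.
    obs = sum(abs(rank[g] - rank[j]) * c for (g, j), c in pairs.items()) / n
    exp = sum(
        abs(rank[g] - rank[j]) * gold_marg[g] * judge_marg[j]
        for g in cats
        for j in cats
    ) / (n * n)
    if exp == 0:
        return float("nan")  # undefined when both raters use a single category
    return 1.0 - obs / exp


# Placeholder data: 29 agreeing indicators (values arbitrary) plus the
# single divergence from the report (gold 2, judge 4).
gold = [2] * 30
judge = [2] * 29 + [4]

m = agreement_metrics(gold, judge)
print(f"{m['exact_match']:.1%}")   # 96.7%
print(f"{m['mae']:.2f}")           # 0.07
print(f"{m['signed_bias']:+.2f}")  # +0.07
```

With a single disagreement of 2 points across 30 indicators, MAE and signed bias are both 2/30, which is why the two values coincide.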
Gateway rollup
Indicator scores rolled up to EdReports gateway ratings (sequential gating + no-0s cap), gold vs judge.
G1 · Focus and Coherence · 24/24 pts · Gold Meets → Judge Meets · agree
G2 · Rigor and Mathematical Practices · 16/16 pts · Gold Meets → Judge Meets · agree
G3 · Teacher & Student Supports · 16/16 pts · Gold Meets → Judge Meets · agree
Gateway-level agreement: exact 100% · κ 1.00 (3 gateways)
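A minimal sketch of what a rollup with sequential gating and a no-0s cap might look like follows. The report does not spell out the rules, so everything here is an assumption: the function names, the point thresholds, the reading of the cap as "any indicator scored 0 prevents a Meets rating", and the simplification that a gateway is rated only when every earlier gateway was rated Meets. The example data is invented for illustration and is not taken from this review.

```python
def rate_gateway(scores, meets_min, partial_min):
    """Rate one gateway from its indicator scores.

    meets_min and partial_min are caller-supplied point thresholds
    (hypothetical values in the example below).
    """
    total = sum(scores)
    if total >= meets_min:
        rating = "Meets"
    elif total >= partial_min:
        rating = "Partially Meets"
    else:
        rating = "Does Not Meet"
    # Assumed no-0s cap: a zero on any indicator blocks a Meets rating.
    if rating == "Meets" and 0 in scores:
        rating = "Partially Meets"
    return rating


def roll_up(gateways):
    """Apply assumed sequential gating over ordered gateways.

    gateways: list of (name, scores, meets_min, partial_min) tuples.
    A gateway is rated only if all earlier gateways were rated "Meets";
    otherwise it is marked "Not Reviewed".
    """
    ratings = {}
    gate_open = True
    for name, scores, meets_min, partial_min in gateways:
        if not gate_open:
            ratings[name] = "Not Reviewed"
            continue
        ratings[name] = rate_gateway(scores, meets_min, partial_min)
        gate_open = ratings[name] == "Meets"
    return ratings


# Invented example with hypothetical thresholds.
example = [
    ("G1", [4, 4, 2], 9, 6),
    ("G2", [2, 0, 2], 4, 3),  # enough points, but contains a 0
    ("G3", [2, 2], 3, 2),
]
print(roll_up(example))
# {'G1': 'Meets', 'G2': 'Partially Meets', 'G3': 'Not Reviewed'}
```

Under these assumptions, rollup agreement can stay at 100% even when an indicator score differs, as long as the difference does not move a gateway total across a threshold.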
Divergences (1)
Indicators where the judge's score differed from gold. Amber = judge under-scored, blue = over-scored.
1g: 2 → 4 (gold Partially Meets Expectations · judge 4 pts)