ELA · Grades 6-8

EL Education K-8 Language Arts

EL Education

Judge anthropic/claude-opus-4-7 · Signal Studio judge (claude-opus-4-7) · 2026-05-25

86% indicator exact match

Gateways 3/3 Meets

Agreement vs gold

Indicator level — gold and the judge as two raters over the rubric criteria.

Exact match · 85.7% · 28 indicators vs gold

Weighted κ · 0.84 · ordinal-weighted agreement

MAE · 0.21 · lower is better

Signed bias · +0.14 · judge over-scores
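
As a rough cross-check, the exact match, MAE, and signed bias figures can be reproduced from the four divergences listed in this report. The sketch below assumes that the other 24 of the 28 indicators match gold exactly, and that the gold point values for the divergent indicators are 1, 2, 2, and 2 (read from the digit pairs shown beside each divergence, which is an interpretation rather than something the report states). `agreement_summary` is a hypothetical helper, not part of Signal Studio.

```python
# (gold_points, judge_points) per divergent indicator.
# ASSUMPTION: gold points are inferred from the digit pairs in the report;
# judge points are the values stated as "judge N pts".
divergences = {"1l": (1, 2), "2h": (2, 4), "3j": (2, 4), "3b": (2, 1)}
N_INDICATORS = 28  # "28 indicators vs gold"


def agreement_summary(divergences, n_indicators):
    """Summarise judge-vs-gold agreement, treating unlisted indicators as exact matches."""
    diffs = [judge - gold for gold, judge in divergences.values()]
    mismatches = sum(1 for d in diffs if d != 0)
    return {
        # Share of indicators where judge and gold agree exactly.
        "exact_match": (n_indicators - mismatches) / n_indicators,
        # Mean absolute error in points, averaged over all indicators.
        "mae": sum(abs(d) for d in diffs) / n_indicators,
        # Mean signed error; positive means the judge scores higher than gold.
        "signed_bias": sum(diffs) / n_indicators,
    }


summary = agreement_summary(divergences, N_INDICATORS)
print(summary)
```

Under these assumptions the three values round to 85.7%, 0.21, and +0.14, in line with the figures above. Weighted κ is not reproduced here because it needs the full per-indicator score table, which this page does not show.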

Gateway rollup

Indicator scores rolled up to EdReports gateway ratings (sequential gating + no-0s cap), gold vs judge.

G1 · Text Quality and Complexity

28/28 pts

Gold Meets · Judge Meets · agree

G2 · Building Knowledge

32/32 pts

Gold Meets · Judge Meets · agree

G3 · Usability

24/25 pts

Gold Meets · Judge Meets · agree

Gateway-level agreement: exact 100% · κ 1.00 (3 gateways)
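
The rollup described above (sequential gating plus a no-0s cap) could be sketched roughly as follows. Everything specific in this sketch is an assumption made for illustration: the cutoff fractions are placeholders rather than published EdReports thresholds, the cap is taken to mean that any indicator scored 0 blocks a "Meets" rating, and gating is taken to mean that a gateway is rated only if every earlier gateway was rated "Meets". `rate_gateway` and `roll_up` are hypothetical names.

```python
def rate_gateway(points, max_points, has_zero, meets_frac=0.9, partial_frac=0.5):
    """Rate one gateway. Cutoff fractions are PLACEHOLDERS, not real thresholds."""
    share = points / max_points
    if share >= meets_frac:
        rating = "Meets"
    elif share >= partial_frac:
        rating = "Partially Meets"
    else:
        rating = "Does Not Meet"
    # No-0s cap (assumed form): an indicator scored 0 blocks a "Meets".
    if has_zero and rating == "Meets":
        rating = "Partially Meets"
    return rating


def roll_up(gateways):
    """Rate gateways in order; stop rating once one falls short of "Meets"."""
    ratings = []
    gate_open = True
    for points, max_points, has_zero in gateways:
        if not gate_open:
            ratings.append("Not Rated")
            continue
        rating = rate_gateway(points, max_points, has_zero)
        ratings.append(rating)
        # Sequential gating (assumed form): later gateways need "Meets" here.
        gate_open = rating == "Meets"
    return ratings


# Point totals from this report; has_zero=False is an assumption.
judge_gateways = [(28, 28, False), (32, 32, False), (24, 25, False)]
print(roll_up(judge_gateways))
```

With the point totals shown above and no zero-scored indicators, this sketch rates all three gateways "Meets", consistent with the 3/3 result reported here.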

Divergences (4)

Indicators where the judge's score differed from gold. Amber = judge under-scored, blue = over-scored.

1l · 1 → 2 · gold Partially Meets Expectations · judge 2 pts
2h · 2 → 4 · gold Partially Meets Expectations · judge 4 pts
3j · 2 → 4 · gold Partially Meets Expectations · judge 4 pts
3b · 2 → 1 · gold Meets Expectations · judge 1 pt
Live read from the Signal Studio spine (core-data).