Thursday, April 30, 2009

NYSED's Secret Shame

Skitch is my new favorite toy.


Sue VanHattum said...

Also of note: You only need 20 right out of 87 to score a 50!

Perhaps the scaled score is percentile? If so, then half the test-takers are getting less than a quarter of the questions right.

If it's multiple choice, that might be a score that's truly close to zero. (Since you'd get a quarter or a fifth of them right by random chance.)

Kate Nowak said...

This test has 30 multiple choice questions, with 4 choices, worth 2 points each, so 60 available points come from multiple choice. So guessing on the MC gets you halfway to passing.
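The guessing arithmetic can be checked with a quick sketch. It uses only the figures quoted in the comments above and assumes that the 87 is the total number of raw points and that every multiple-choice answer is a uniform random guess. The names are made up for illustration.

```python
from fractions import Fraction

# Figures quoted in the comments; TOTAL_RAW assumes "87" means total raw points.
MC_QUESTIONS = 30
CHOICES = 4
POINTS_PER_MC = 2
TOTAL_RAW = 87


def expected_guess_points(questions, choices, points_each):
    """Expected raw points from guessing uniformly at random on every question."""
    return Fraction(questions * points_each, choices)


expected = expected_guess_points(MC_QUESTIONS, CHOICES, POINTS_PER_MC)
print(expected)                                # 15
print(f"{float(expected / TOTAL_RAW):.0%}")    # 17%
```

So pure guessing is worth about 15 raw points on average. If that is halfway to passing, the passing raw score would sit somewhere around 30 of the 87 points.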

Chris Wellons said...

Wow! That is extremely misleading. A student only needs to get 1/3 of the exam correct when it seems they need 2/3. Even more, students that are just barely failing (60-64) get graded a second time by a different grader and keep the higher score if it passes.

I made a plot that I think helps illustrate the shame better than the table: nys-regents-score-plot.png

Sue VanHattum said...

I don't know how to find the info Chris mentioned about 2nd grading. I'm also curious how they scale it. It doesn't look exactly linear...

Chris Wellons said...

Sue: Found it here: Integrated Algebra Regents Exams Homepage.

Kate Nowak said...

Chris - that plot is great. I bet if we had data for the number of students who earned each raw score, we could see why it was scaled the way it was. (Namely, so that an embarrassing number of kids wouldn't fail.)

Kevin said...

The scaled score is probably percentiles. This is a common way of scaling tests that are intended for comparing individuals within a group. It tells you nothing about how the group as a whole is doing, of course, but it does allow you to quickly see where an individual places, independent of the overall difficulty of the test.

Some tests (such as SAT) try to use a scaling that means roughly the same thing from one version of the test to the next (though even they redo the scaling every couple of decades). A constant-difficulty scaling like that is better than either a raw score (which varies with question difficulty) or percentile (which can't detect movement of the group as a whole).
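A small sketch of the point about percentiles. The raw scores below are invented purely for illustration (they are not Regents data), and percentile rank is taken here to mean the share of scores strictly below a given score, which is only one of several common definitions.

```python
from bisect import bisect_left


def percentile_rank(scores, raw):
    """Percentage of scores that fall strictly below `raw`."""
    ordered = sorted(scores)
    return 100 * bisect_left(ordered, raw) / len(ordered)


# Hypothetical raw scores for one group of ten students.
hypothetical_raw = [10, 15, 20, 20, 25, 30, 40, 55, 60, 70]

# The same group on an easier test: everyone gains 10 raw points.
easier_test = [s + 10 for s in hypothetical_raw]

print(percentile_rank(hypothetical_raw, 25))  # 40.0
print(percentile_rank(easier_test, 35))       # 40.0
```

The student's raw score rises from 25 to 35, but the percentile stays at 40 because the whole group moved together, so a percentile scale alone cannot show whether the group improved.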

Sue VanHattum said...

Good blog post today on New York's standardized tests, at Bridging Differences, here.