
Thursday, April 30, 2009

NYSED's Secret Shame

Skitch is my new favorite toy.


  1. Also of note: You only need 20 right out of 87 to score a 50!

    Perhaps the scaled score is a percentile? If so, then half the test-takers are getting less than a quarter of the questions right.

    If it's multiple choice, that might be a score that's truly close to zero. (Since you'd get a quarter or a fifth of them right by random chance.)

  2. This test has 30 multiple choice questions, with 4 choices, worth 2 points each, so 60 available points come from multiple choice. Pure guessing earns about 15 of those points on average (30 × 1/4 × 2), so guessing on the MC gets you roughly halfway to passing.

  3. Wow! That is extremely misleading. A student only needs to get 1/3 of the exam correct when it seems they need 2/3. What's more, students who are just barely failing (60-64) get graded a second time by a different grader and keep the higher score if it passes.

    I made a plot that I think helps illustrate the shame better than the table: nys-regents-score-plot.png

  4. I don't know how to find the info Chris mentioned about 2nd grading. I'm also curious how they scale it. It doesn't look exactly linear...

  5. Chris - that plot is great. I bet if we had data for the numbers of students that scored each raw score, we could see why it was scaled the way it was. (Namely, it didn't let an embarrassing number of kids fail.)

  6. The scaled score is probably percentiles. This is a common way of scaling tests that are intended for comparing individuals within a group. It tells you nothing about how the group as a whole is doing, of course, but it does allow you to quickly see where an individual places, independent of the overall difficulty of the test.

    Some tests (such as SAT) try to use a scaling that means roughly the same thing from one version of the test to the next (though even they redo the scaling every couple of decades). A constant-difficulty scaling like that is better than either a raw score (which varies with question difficulty) or percentile (which can't detect movement of the group as a whole).

  7. Good blog post today on New York's standardized tests at Bridging Differences.
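
The guessing arithmetic from comment 2 can be checked with a rough sketch like the one below. The question count, number of choices, and points per question come from that comment. The passing raw score is an assumption: comment 3 says a student needs about 1/3 of the exam, so it is taken here as one third of the 87 raw points mentioned in comment 1, which may not match the official conversion chart. The function and variable names are made up for illustration.

```python
from fractions import Fraction


def expected_guess_points(questions: int, choices: int, points_each: int) -> Fraction:
    """Expected raw points from guessing uniformly at random on every question."""
    return Fraction(questions, choices) * points_each


# Figures from comment 2: 30 questions, 4 choices each, 2 points apiece.
guess_points = expected_guess_points(30, 4, 2)

# ASSUMPTION: passing takes about 1/3 of 87 raw points (comments 1 and 3),
# not a value read off the real conversion chart.
assumed_passing_raw = Fraction(87, 3)

share_of_passing = guess_points / assumed_passing_raw

print(f"Expected points from guessing: {float(guess_points):.1f}")
print(f"Share of assumed passing score: {float(share_of_passing):.0%}")
```

Under those assumptions, blind guessing on the multiple choice alone covers a bit more than half of the raw points needed to pass.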

