
The fallacy with using tests of the kids as a marker for the quality of the teachers is that you just can't do that and get reliable results.

A bunch of black kids from the ghettos, who have their parents in jail/dead, worrying about siblings etc. and no cash to spend on basic school equipment, much less a decent meal every day, will have vastly lower scores than a bunch of white kids with helicopter parents.

If schools were adequately financed e.g. to provide free, healthy meals, proper study rooms and free school supplies, that could at least reduce the gap.

Unfortunately, kids can't vote and a large number of poor black kids end up in jail or dead anyway so they can't vote even when they're old. And so, schools remain the first place to go when politicians need to cut expenses.




> The fallacy with using tests of the kids as a marker for the quality of the teachers is that you just can't do that and get reliable results.

Which is not what is being proposed by people arguing for teacher evaluations drawing on standardized testing (https://en.wikipedia.org/wiki/Value-added_modeling), as the very name 'value-added' implies.


VAM is a good idea but it's really hard to get right and the trend has been to make it very high stakes for teachers. You didn't really address the examples which mschuster91 provided and that's important to understanding the problem:

1. Limited parental support (note: this does not imply bad parents – working 3 jobs to pay the bills leaves little time to help with homework)

2. Unstable living environment

3. Strong financial restrictions

4. Need to care for siblings[1]

5. Food insecurity

How do you construct a VAM model which recognizes that a teacher who got a class full of students suffering from one or more of those problems and managed to improve them by one grade level had a LOT more work, and more complicated work, than the teacher in a wealthy suburb who got a bunch of students with affluent, involved parents who are both pushing their kids hard to excel and providing tons of extra support outside of school?

This isn't just a philosophical debate, either, since school districts are tying large parts of compensation to test scores. Starting with a hard job which doesn't pay particularly well, how many years are you going to spend not getting bonuses for your hard work or even being arbitrarily punished before you give up and find an easier job?

One estimate has ~12% of NYC public school teachers being punished by the flawed VAM in use there:

http://mathbabe.org/2015/04/03/how-many-nyc-are-arbitrarily-...

That's a high number to begin with and downright shameful when you consider that those schools are already facing a hard time getting qualified teachers. If hiring is hard, you really need to make an effort to retain the people you do manage to find.

[1] My wife has had students who felt pressure to skip after-school extracurricular activities or even go to an inferior college so they could care for younger siblings while their parents worked. That's not wrong in the sense of everyone involved having a sympathetic motive, but it's a huge burden which more affluent kids never even have to think about, which is why simple-sounding ideas like making college admission or scholarships merit-based end up reinforcing the existing socioeconomic status quo rather than changing it.


> You didn't really address the examples which mschuster91 provided and that's important to understanding the problem:

On the contrary, I addressed it entirely. mschuster91 seems to be under the impression that the teacher evaluation schemes boil down to nothing but the simplest possible before-after comparison of grades of students, ignoring all issues of demographics, differing student quality, differing school circumstances, etc. Such a scheme is indeed absurd, as his counterexample proves, but it is not what has been proposed by pretty much everyone! The actual proposals are well aware of what he thinks is the fatal problem, and go to often elaborate lengths to model and adjust for these sorts of heterogeneities in order to quantify the value-added of a particular teacher. The problem is recognized, included, and mostly dealt with. Whether the solution works entirely or is worthwhile is unclear, but he's arguing against a strawman.
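As a rough illustration of the difference between a raw score comparison and a value-added one, here is a minimal sketch. The data, the teacher labels, and the single-predictor model are all made up for illustration; real VAMs adjust for many more factors, and this is not any district's actual formula.

```python
from statistics import fmean

# Hypothetical toy data: (teacher, prior_year_score, current_year_score).
students = [
    ("A", 40, 50), ("A", 45, 55), ("A", 50, 60), ("A", 55, 65),
    ("B", 70, 74), ("B", 75, 79), ("B", 80, 84), ("B", 85, 89),
]

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def raw_means(rows):
    """Naive comparison: average current-year score per teacher."""
    by_teacher = {}
    for teacher, _prior, current in rows:
        by_teacher.setdefault(teacher, []).append(current)
    return {t: fmean(v) for t, v in by_teacher.items()}

def value_added(rows):
    """Average gap between actual and predicted score, per teacher.

    The prediction uses only the prior-year score (a simplifying assumption).
    """
    slope, intercept = fit_line([r[1] for r in rows], [r[2] for r in rows])
    residuals = {}
    for teacher, prior, current in rows:
        predicted = intercept + slope * prior
        residuals.setdefault(teacher, []).append(current - predicted)
    return {t: fmean(v) for t, v in residuals.items()}

raw = raw_means(students)
va = value_added(students)
print("raw means:  ", raw)
print("value added:", va)
```

In this toy data, teacher A's class finishes with lower raw scores than teacher B's, but comes out ahead once each student's starting point is taken into account.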

> One estimate has ~12% of NYC public school teachers being punished by the flawed VAM in use there:

So I've looked at http://mathbabe.org/2015/04/02/the-arbitrary-punishment-of-n... and I have zero idea what she is trying to show. She assumes independence and treats it as a coin flip. Ummm.... what? With that sort of logic, you could show no one could expect to score a 1600 on the SAT. When criticized she links to a real analysis†, which shows considerable non-independence which means her numbers are wrong and will overstate how many will be denied tenure based on the VAMs. By the way, why are you phrasing it as 'punished'? That sounds like you're assuming your conclusion. If VAM doesn't affect hiring decisions, there's no point in bothering with it in the first place, is there? But if it does affect hiring decisions, that means teachers are being 'punished'...?
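To make the independence point concrete, here is a small sketch with invented numbers (not the NYC data, and not the calculation from the linked post). It assumes a hypothetical rule where a teacher is flagged if any one of several yearly scores lands in the bottom fraction `p`, and compares the coin-flip calculation with a simulation in which a teacher's yearly scores share a correlation `rho`.

```python
import random
from statistics import NormalDist

def p_any_flag_independent(p, years):
    """Chance of at least one bottom-p year if years are independent coin flips."""
    return 1 - (1 - p) ** years

def p_any_flag_simulated(p, years, rho, n_teachers=20000, seed=0):
    """Same chance when each teacher's yearly scores correlate at rho.

    Each yearly score is a stable teacher effect plus yearly noise, scaled so
    that every score is standard normal (an assumption made for illustration).
    """
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(p)           # score below this counts as bottom p
    a, b = rho ** 0.5, (1 - rho) ** 0.5
    flagged = 0
    for _ in range(n_teachers):
        teacher = rng.gauss(0, 1)
        if any(a * teacher + b * rng.gauss(0, 1) < cutoff for _ in range(years)):
            flagged += 1
    return flagged / n_teachers

indep = p_any_flag_independent(0.10, 4)        # hypothetical: bottom 10%, 4 years
sim_uncorrelated = p_any_flag_simulated(0.10, 4, rho=0.0)
sim_correlated = p_any_flag_simulated(0.10, 4, rho=0.35)
print(indep, sim_uncorrelated, sim_correlated)
```

Under these assumptions the correlated case flags fewer teachers than the coin-flip formula, because the low scores pile up on the same teachers instead of being spread evenly across everyone.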

† not that I think too much of it either, since it relies mostly on an argument from incredulity and pointing angrily at some scatterplots, and tries to ignore the r=.35 correlation of ratings from two subjects; to put an r=.35 in perspective, the correlation between years of education and intelligence is only ~r=.55! Even the best IQ tests won't correlate with Gf more than r=.7 or so. r=.35 is pretty good for a single pair. I don't know why he thinks a .24 is 'minuscule' when that means you're predicting half of variance... (I wonder if this is a graphing problem? He doesn't seem to jitter the datapoints, which for a large amount of discrete data will hide a lot of the density; a plot of r=.35 of n=6k should look much more striking, like this: http://imgur.com/KcwmJJH ) For implications, look at the first graph and think about classification rates. Look at the datapoints at 100 along one axis, then look across to see how many correspond to <10 on the other; hardly any do, and the 100s are almost all mapped onto 80+ on the other axis. Or look at the 0s. In terms of identifying the bottom decile, it's doing a good job.
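One way to think about classification rates at a given correlation is to simulate it. The sketch below draws n=6000 pairs from a bivariate normal with r=.35 (the normal shape is an assumption; the real ratings need not follow it) and asks how often a top-decile rating on one axis lands in the bottom decile on the other.

```python
import random

def simulate_pairs(r, n, seed=0):
    """Draw n (x, y) pairs from a standard bivariate normal with correlation r."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = r * x + (1 - r * r) ** 0.5 * rng.gauss(0, 1)
        pairs.append((x, y))
    return pairs

def share_of_top_in_bottom(pairs, frac=0.10):
    """Fraction of the top `frac` on x that falls in the bottom `frac` on y."""
    n = len(pairs)
    k = int(n * frac)
    by_x = sorted(range(n), key=lambda i: pairs[i][0])
    by_y = sorted(range(n), key=lambda i: pairs[i][1])
    top_x = set(by_x[-k:])
    bottom_y = set(by_y[:k])
    return len(top_x & bottom_y) / k

share_r35 = share_of_top_in_bottom(simulate_pairs(0.35, 6000))
share_r0 = share_of_top_in_bottom(simulate_pairs(0.0, 6000))
print(share_r35, share_r0)
```

With no correlation, about one in ten top-decile ratings would fall in the bottom decile on the other axis by chance, so the r=.35 figure is best read against that baseline.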



