A VAM Supporter Struggles with Value-Added Data Pt. 1

This is part 1 of a series on how we use value-added data in Tennessee and across the nation. The entire series can be found below:

Part 2: how value-added data impacts teachers in the classroom
Part 3: how to use value-added data constructively
Part 4: some final thoughts and questions

Before teaching, I attended public policy school at UW-Madison. While studying economics, statistics, and policy analysis, I became deeply convinced of the importance of using numbers to objectively evaluate public programs. Numbers, I reasoned, were fair. They couldn’t lie and weren’t subject to human whims, and were thus a perfect basis for sound policymaking.

Fast forward to this June. Over the past year I’d been hearing a lot about the inappropriateness of using value-added models in teacher-based policy decisions such as salary, licensure, and hiring and firing. Value-added modeling uses past student test scores, coupled with other variables, to “predict” future student performance. The assumption is that as long as all other variation is accounted for, any deviation from the “predicted” score can be attributed to the teacher and the school; a positive deviation is counted as above-average growth. Given my background, I set out to write a piece in support of value-added data systems like Tennessee’s TVAAS system.
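
To make the mechanics concrete, here is a toy sketch of that basic idea in Python. This is not the TVAAS algorithm, which is proprietary and far more elaborate; the data, the simple linear model, and the teacher effects below are all simulated assumptions, meant only to show how a residual from a prediction gets relabeled as “teacher value-added.”

```python
# Toy illustration of the value-added idea (NOT the TVAAS algorithm).
# We "predict" current scores from prior scores, then average each
# teacher's residuals as that teacher's "value-added" estimate.
import numpy as np

rng = np.random.default_rng(0)
n_students, n_teachers = 200, 10

teacher = rng.integers(0, n_teachers, n_students)  # student-teacher assignment
prior = rng.normal(50, 10, n_students)             # last year's scores
true_effect = rng.normal(0, 2, n_teachers)         # simulated teacher effects
current = 5 + 0.9 * prior + true_effect[teacher] + rng.normal(0, 8, n_students)

# Fit the prediction model: current ~ prior
X = np.column_stack([np.ones(n_students), prior])
coef, *_ = np.linalg.lstsq(X, current, rcond=None)
residual = current - X @ coef

# The model attributes each teacher's average residual to the teacher.
for t in range(n_teachers):
    print(f"teacher {t}: estimated value-added = {residual[teacher == t].mean():+.2f}")
```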

However, my research into these systems has created more questions than answers. The result is that, as of this September, I find myself more conflicted about the appropriateness of value-added data systems than when I started. As a supporter of value-added modeling, I want to share three findings that have made me more skeptical about the appropriate use of value-added modeling in education policy decisions.

First, these systems may not be as accurate at isolating the teacher’s portion of student growth as I thought they were. I’d heard this contention from bloggers such as Diane Ravitch for some time, but what got my attention was a statement by the American Statistical Association (ASA) on value-added measures in education. A highly respected professional organization, the ASA highlights the fact that, according to most studies, only 1-14 percent of variation in test scores can be attributed to individual teachers. The rest, they assert, belongs to factors outside the classroom such as “family background, poverty, curriculum and unmeasured influences.”

This suggests that while some of the growth generally attributed to teachers can in fact be credited to us, a large portion of the variation in test scores rests on variables outside of our control. This assertion by the ASA carries some troubling policy consequences, which I’ll explore later.

Second, these models are subject to human opinion and therefore human error. As most value-added researchers will tell you, not all value-added systems are the same: different entities use different equations or methods for calculating value-added growth (as TVAAS itself puts it, “not all value-added systems are created equal”), and these equations produce different results for the exact same group of teachers. Notably, one 2009 study that I read found considerable year-to-year changes in Florida teachers’ evaluation scores, where some teachers were marked as ineffective one year but effective the next. The same study found that up to 80 percent of the variation in test scores could be explained by “unstable or random components (i.e., not teachers).” Another study, at Stanford, found a high rate of variability in scores from year to year when using different models.

This is disturbing to me because it suggests that the results from these systems are not as objective as they first appear. Different models produce different results, and we can’t all agree on one model that is “the best.” Because model selection is susceptible to human opinion, it is also susceptible to human error, which can in turn lead to erroneous policy decisions.
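
To see how much the choice of model alone can matter, here is a small simulated comparison (hypothetical numbers throughout, not any state’s actual model). The same classrooms are scored two ways: with a simple gain-score model and with a covariate-adjusted regression. When students are sorted non-randomly to teachers, the two specifications can rank the same teachers differently.

```python
# Score the same simulated classrooms with two different "value-added"
# specifications and compare the resulting teacher rankings.
import numpy as np

rng = np.random.default_rng(1)
n_students, n_teachers = 300, 12
teacher = rng.integers(0, n_teachers, n_students)
prior = rng.normal(50, 10, n_students) + 0.5 * teacher  # non-random sorting
current = 10 + 0.8 * prior + rng.normal(0, 8, n_students)

# Model (a): average raw gain (current - prior) per teacher
gain = current - prior
gain_score = np.array([gain[teacher == t].mean() for t in range(n_teachers)])

# Model (b): average residual from a regression of current on prior
X = np.column_stack([np.ones(n_students), prior])
coef, *_ = np.linalg.lstsq(X, current, rcond=None)
resid = current - X @ coef
reg_score = np.array([resid[teacher == t].mean() for t in range(n_teachers)])

# Same data, same teachers, two different orderings
print("gain-score ranking: ", np.argsort(-gain_score))
print("regression ranking: ", np.argsort(-reg_score))
```

Note that this simulation includes no real teacher effect at all, yet each specification still confidently produces a ranking, and the two rankings disagree, which is exactly the instability those studies describe.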

Third, I’m concerned that value-added models are increasingly being used as the basis for a growing set of policy decisions. Currently, value-added results are being used in some positive ways here in Tennessee. They are incorporated into teacher evaluations, which are then used to identify teachers in need of support. If this were all that we did with this data, I’d be satisfied, even given the information highlighted above.

However, policy makers haven’t stopped there. Value-added results are being used in high-stakes decisions such as teacher surplussing and hiring and firing. We’ve also seen efforts by the legislature and state board of education to tie teacher pay and licensure directly to value-added data. This is concerning given that these systems may be less predictive and less objective than they first appear. If the models’ own creators admit that not all models are created equal (read: some are worse than others), and a large percentage of the variance in scores is attributable to factors other than teachers, is it really appropriate to base these high-stakes decisions on their outcomes?

That said, I’m not ready to give up on value-added data systems just yet. I still think there is a place for the data they generate. These systems do have some predictive power, even if its strength is debatable, and they can therefore provide teachers with valuable data to inform their instructional practices. I think there might even be a place for value-added data in teacher evaluations for tested subjects, albeit at a much lower percentage than current levels, and only as long as it is not used in high-stakes decision making.

For those looking for additional reading in support of value-added systems, some of the questions I’ve raised are also directly addressed by value-added researchers, notably by SAS, the company behind Tennessee’s TVAAS system. They provide some very thorough answers, though not enough to put me fully at ease. I’m still struggling with how best to use this valuable but imperfect tool; as of now, I think we need to be much more cautious about how we use it in education policy, both here in Tennessee and nationally.

This piece approaches the topic from a policy perspective grounded in research. I’ll be publishing a second piece in the coming days sharing some of my concerns with the use of value-added data from the perspective of a practicing teacher. 

1 comment for “A VAM Supporter Struggles with Value-Added Data Pt. 1”

  1. ezra
    September 23, 2014 at 11:06 pm

    I don’t normally comment on fellow contributors’ pieces, but I will make an exception because we’ve talked about this before. I’m very happy you decided to take on this subject. I realize that there is a part 2 to come, but I wanted to point out some things that add to the criticism.

    My biggest problem with TVAAS stems from it being proprietary. SAS, whose research you link to, owns this system: http://www.comptroller.tn.gov/Repository/RE/Tennessee%20VAAS%202013.pdf. Making money off a product is not the biggest problem, though; the larger issue is that the system is not open to scrutiny. Neither I, you, nor any other teacher or educator has seen the TVAAS algorithm. We only see the results. You can’t really evaluate, test, and measure what you can’t see. TVAAS isn’t dark matter. There is no peer review of TVAAS. Furthermore, there is no way to tweak the system should we find an issue (which we can’t because, well, you know…). This raises red flags for the teacher and researcher in me.

    Ultimately, if it’s a good system, you should be paid. I understand that. Not everyone is willing to put their work out there for free. However, that shouldn’t hinder an open assessment of that work. If you trust your system, it should be open to scrutiny and evaluation.
