There is no better display of America’s obsession with rankings than the popularity of the high school ranking system invented by Washington Post columnist Jay Mathews. With the claim that it ranks “America’s Most Challenging Schools,” the Challenge Index, as it is popularly known, has bewitched and beguiled the nation.
Part of the charm, if you can call it that, is that the methodology is deceptively simple. Take the total number of Advanced Placement (AP), International Baccalaureate (IB), and Advanced International Certificate of Education (AICE) tests given at a school each year and divide by the number of seniors who graduated in May or June. In other words, it is a simple ratio: directly proportional to the number of qualifying tests administered at a school each year, and inversely proportional to the number of graduating seniors.
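To make the arithmetic concrete, here is a minimal sketch of the calculation; the function name and the sample figures are hypothetical, chosen only for illustration:

```python
def challenge_index(tests_given: int, graduating_seniors: int) -> float:
    """Total AP/IB/AICE tests given in a year divided by the number of
    graduating seniors. Scores play no role; only the count of tests
    taken enters the calculation."""
    return tests_given / graduating_seniors

# Hypothetical school: 1,200 qualifying tests, 400 graduating seniors.
print(challenge_index(1200, 400))  # 3.0
```

Note what the formula counts: a failing score and a perfect score contribute identically to the numerator.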
Recently, Jay Mathews and this columnist have engaged in an exchange of views on this purported single-metric measure of the quality of education in our nation’s high schools. Jay argues that “AP, IB and Cambridge test participation is the best index we have of actual learning in high school.” Is mere participation in a test “the best index we have of actual learning,” or is it a gross mismeasure of education?
Jay argues that “the reason why you CAN be assured that an AP, IB or Cambridge course is challenging is because the tests are written and graded by outside experts, unlike all the other high school courses. They can’t be dumbed down without the test scores revealing that.” Yet the Index makes no effort to determine what the test scores reveal. Therein lies the first unsubstantiated assumption inherent to the Challenge Index: that AP, IB, and AICE instruction is of identical quality and rigor in schools across the nation.
He insists, “Go interview a few AP and IB students, and then compare their experiences to those who don’t take those courses. Some of the instruction may be disappointing, but the nature of those courses means that on average their instructors are going to be better than teachers teaching regular courses. Principals make sure of that.” Here lies the second unsubstantiated assumption underlying the supposed measure of actual learning: that, on average, AP, IB, and AICE instructors are better teachers than those teaching regular courses.
Maryland’s largest school district, Montgomery County Public Schools (MCPS), publishes reports that allow for a test of Mathews’ assumption. Consider, for example, the report on 2015 Advanced Placement Exam Participation and Performance for Students in Montgomery County Public Schools and Public School Students in the State of Maryland and the Nation, dated December 10, 2015. Two high schools with fairly similar student demographics, Winston Churchill and Walter Johnson, had an almost identical number of students take the Advanced Placement English Language and Composition test (281 and 280 students, respectively). However, just 19.2% scored a 5 at Churchill, while 24.6% earned that score at Walter Johnson. Nearby Walt Whitman had 163 students take the test, with 42.9% scoring a 5. At Quince Orchard, 67% of the 91 students who took Advanced Placement Calculus AB scored a 5; at Richard Montgomery, 96 students took the same test and just 27.1% scored a 5. The assumption that the standard of AP instruction is consistent across schools, even within a single school district, simply does not hold water.
Are AP, IB, and AICE instructors better teachers than those teaching regular courses? Schools lack an incentive to assign their best teachers to these courses. After all, better teaching doesn’t improve a school’s ranking: recruiting more students to take the test does.
According to the MCPS report, one demographic group, Asian students, tended to take five or more AP tests in a given year. Thus, schools with large populations of Asian students are likely to enjoy higher rankings. On average, schools in wealthier areas tended to have larger numbers of students taking any particular AP class, with consequently higher rankings.
The Index also doesn’t disclose that some schools benefit from housing magnet programs that attract the best and the brightest from a wide geographic area. For example, Richard Montgomery and Montgomery Blair attract students from other parts of the county to their highly selective magnet programs, favorably skewing their rankings. Consider a rough illustration of the impact of magnet programs on a school’s ranking. According to the 2014-2015 enrollment figures, there were 619 seniors at Blair, and the total number of AP tests taken in 2014 was 2271, making for an Index of 3.669. However, Blair administered 617 AP tests to the magnet Class of 2014 alone, which consisted of 95 graduating seniors. Assuming that these numbers remained roughly the same from 2014 to 2015, one could conclude that the remaining 1654 AP tests were administered to 524 graduating non-magnet students, making for an Index of 3.156. The difference of approximately 0.513 would drop the 2015 Blair ranking among Maryland schools from 19th to around 31st. The indication is that Blair would likely rank alongside schools of similar demographics if the magnet were excluded.
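A short sketch, reusing the figures above, makes the adjustment explicit (the variable names are illustrative):

```python
# Figures cited above from the 2014-2015 MCPS enrollment and AP reports.
total_tests = 2271     # AP tests administered at Blair in 2014
total_seniors = 619    # graduating seniors at Blair
magnet_tests = 617     # AP tests taken by the magnet Class of 2014
magnet_seniors = 95    # graduating magnet seniors

index_with_magnet = total_tests / total_seniors
index_without_magnet = (total_tests - magnet_tests) / (total_seniors - magnet_seniors)

print(round(index_with_magnet, 3))     # 3.669
print(round(index_without_magnet, 3))  # 3.156
```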
At this columnist’s request, Mathews asked for the disaggregated data for Blair, reporting back on January 14, 2016, that “Brian Edwards [the individual responding to public information requests at that time] tells me they don’t do AP data in that form so he can’t answer your good question.” However, the fact that the magnet program did publish AP data for its graduating seniors indicates that the school has the means to generate such data. Clearly, schools can skew rankings in their favor by selectively withholding data.
In a response to a draft version of this column, Jay argues, “I don’t think my arguments are unsubstantiated. They are based on hundreds of interviews with teachers and students. If you are going to argue that anything not backed by data is unsubstantiated, then most of our greatest reporting triumphs, such as Watergate, remain unsubstantiated. What you have to do to legitimately attack my argument is reporting of similar depth, and get different answers.”
However, the Index is not a news story. It is promoted as “the best index we have of actual learning.” Such a claim must be supported by verifiable, data-based research. The reported opinions of “hundreds” of “teachers and students” do not buttress the elevation of the Index from mere opinion to an actual metric of educational outcomes. The comparison of an unsubstantiated educational metric to the watershed Watergate story hardly seems fair to the Post’s Bob Woodward and Carl Bernstein.
The Index might have been conceived with the best of intentions. But the absurdity of assuming that a single metric, the number of tests taken in a given year, can somehow measure actual learning in high school, the malleability of the data, and the unjustified and unsupported assumptions all conspire to make the Index a mismeasure of education. The notable absence of substantive research linking AP courses to better college outcomes, not to mention the dearth of evidence equating every AP course to college-level instruction, makes the Index a questionable measure of anything other than schools’ propensity to implement educational interventions of dubious value, all in an effort to boost rankings.