Good time to be a data nerd

Despite the Trump administration’s worrisome embrace of “alternative facts,” education researchers have cause for optimism: They now have access to richer and more reliable data than ever before. 


Using data to understand and improve education has been at the top of the to-do list for most education reformers in recent years. Data-driven decision making and its first cousin, evidence-based policy making, were mantras of the Obama administration across policy areas and were considered cornerstones to many of the administration’s key programs.

Despite having its day in the sun during the Obama years, data collection has been thrown some shade since President Trump came to town. Only days after he took office, leaks emerged from federal agencies indicating that the administration was clamping down on research and information, especially on issues that the administration is less than enthusiastic about, like climate change and civil rights enforcement. Those rumors, coupled with the introduction of “alternative facts” into policy discussions, have created an environment of great concern among researchers and others who care about data and evidence.

However, despite the Trump administration casting a pall on the scientific community, education researchers actually have something to get excited about.  Two data sets — the U.S. Department of Education’s Civil Rights Data Collection (CRDC) and the Stanford Education Data Archive (SEDA) — now offer researchers, policymakers, and the public a more comprehensive (albeit sometimes depressing) picture of U.S. schools. Unlike the results of assessments like NAEP and PISA, these data go far beyond the boundaries of student achievement data. The CRDC and SEDA provide us with detailed information about the many factors and contexts that affect student achievement.

Civil Rights Data Collection

The CRDC officially came into being in 1968 during the civil rights movement. At that time, the department’s Office for Civil Rights (OCR) started collecting a wide range of student data: enrollment numbers as well as specific information about education programs and services, all of it disaggregated by race/ethnicity, sex, English learner status, and disability.

Federal policymakers clearly understood that information was power, and the CRDC remains an important aspect of OCR’s strategy for “administering and enforcing the civil rights statutes for which it is responsible.” In its almost 50 years, the CRDC has become more robust, adding data sources and elements that cover a range of highly relevant issues for educators, policymakers, researchers, the media, and the general public.

To support its education agenda, the Obama administration revamped the CRDC survey, requiring schools and districts to provide information on preschool students and school discipline tactics, two areas that are particularly important when evaluating the conditions of learning that affect minority and at-risk communities. The administration also widened the net for collecting data: The 2011-12 school year data marked the first time since 2000 that the department collected data from all 97,000 public schools and 16,500 school districts — representing 49 million students.

By including detailed survey questions about school discipline, bullying, chronic student absenteeism, teacher and student equity, preschool education, and college- and career-readiness, the Obama administration was able to shine a bright, data-rich light onto the many inequities that still affect student achievement. (However, the researcher in me has to point out that the CRDC is by no means a perfect database, if there is such a thing. The data is self-reported, and the survey instrument is a lengthy and time-consuming 81 pages.)

The disturbing addendum to the CRDC story is that the Trump administration has made clear its disdain for the OCR’s activities under President Obama, so continued federal funding for the CRDC could very well be at risk. It would be terribly shortsighted and wasteful to put the database on the chopping block, however, especially considering the vast federal resources that have already gone into its development. Even more important, if the American people are going to continue to support the promise of public education, they need to understand the very real inequities that still exist for many students across the country. The CRDC’s longitudinal data, with its ability to show trends over time, can help build that understanding.

Stanford Education Data Archive

The Stanford Education Data Archive (SEDA) was developed with a similar mission in mind. Sean Reardon, professor of poverty and inequality at Stanford University, has led the effort to create a database focused on the factors that contribute to persistent racial and socioeconomic achievement gaps. Reardon’s work has been instrumental in helping educators and policymakers understand the insidious effect of income inequality on educational outcomes. Using graphs and charts that provide an unsettling visual of income inequality across the U.S., SEDA shows the extent to which socioeconomic conditions affect student achievement. In a 2016 study, for example, Reardon’s research showed that children in the school districts with the highest concentrations of poverty score an average of more than four grade levels below children in the richest districts. School districts, even those in very close geographic proximity, often provide vastly different educational experiences to their students. Reardon and his team of researchers have taken clear aim at educational inequities by providing detailed, disaggregated data about the geographic and socioeconomic conditions that profoundly affect American schools.

Unlike the CRDC, SEDA does not rely on federal funding alone — the archive is supported by private funding in addition to federal grants. This only underscores, however, a sad truth about the current state of education research. The fact that many of our most motivated and able scholars have to depend on private support reflects poorly on the nation’s commitment to educational equity and school improvement.

Having said that, research is only as good as it is useful. The CRDC and SEDA are important contributors to the field precisely because these datasets are highly relevant to improving policy and practice and because they are accessible to the public. My team at the Center on Education Policy has used each of these resources numerous times, and we see endless potential to apply the data in new ways. We even see opportunities to pursue research questions that can only be explored by cross-referencing the two datasets. For data nerds, these are heady times.

Databases for regular folks

But what about those educators and policymakers who are not data nerds? Does the complexity of these databases limit their value in the very real world of teachers and school and district leaders? The power that goes along with access to important information should not be limited to a small group of researchers or policy elites, no matter how good their intentions.

Recent survey work by the Center on Education Policy (where I am executive director) shows that a majority of both teachers and district leaders feel they are not included in decision making at the district, state, or national levels.

What we don’t want to do is further alienate teachers and local leaders by collecting data that can only be understood and used by a select few located in state capitals, on Capitol Hill, or at elite universities. That would defeat the purpose of bringing this important information to light.

As we contemplate the incredible amount of educational data that is now available, we should think carefully about accessibility and relevance. The CRDC and SEDA will have limited value if their treasures are barely known and lightly used. The same holds true for the smaller batches of data produced by state and local assessments and other data-driven instructional tools. When education leaders and policymakers push for more data-driven decision making in education, they need to remember that at the end of the day, it is people who make the data useful and actionable, and those people need to know how to access data and effectively apply it to their work. The Data Quality Campaign, an organization that has worked for a decade to broaden the use of data in education, articulates this point perfectly: Everyone who has a stake in education needs the right data in the right format at the right time to serve our students along their unique journeys. Well-stated and spot on. Now we just have to figure out how to make that happen.


Citation: Ferguson, M. (2017). Good time to be a data nerd.  Phi Delta Kappan 98 (7), 74-75.

MARIA FERGUSON is executive director of the Center on Education Policy at George Washington University, Washington, D.C.
