Do students’ high scores on international assessments translate to low levels of creativity?




In a previous issue of Phi Delta Kappan, Yong Zhao (2012) urged readers to stop worrying so much about the relatively poor performance of American students on measures such as the Programme for International Student Assessment (PISA), which compares the mathematics, science, and reading achievement of 15-year-olds from dozens of countries (OECD, 2010, 2013). When the 2009 PISA scores were released, noted Zhao, “The results received extensive media coverage in the United States, all emitting a sense of shock, urgency, and anxiety” (p. 56) about American students’ ability to compete with peers from China and other economic rivals. But that anxiety is misplaced, said Zhao. Teenagers in Shanghai may score highest in the world on math tests, but what the Chinese really want is for their students to excel in ways that PISA doesn’t measure: They want them to be more creative, like the Americans.

It’s no surprise, Zhao added, that some of PISA’s low- and average-performing countries, such as the U.S., are better at producing entrepreneurs like Steve Jobs and innovative companies like Apple, Spotify, and Facebook. As Apple’s cofounder Steve Wozniak explained in an interview with the British Broadcasting Corporation, the individualistic culture of the American educational system allows creative people to flourish. By contrast, schools in places like China and Singapore (also a high scorer on PISA) are too homogeneous and “structured” to give rise to great artists, musicians, singers, writers, athletes, and the like (p. 57).

However, Zhao’s argument relies far too much on anecdotal evidence and self-reported data, and it doesn’t hold up to careful analysis of results from PISA and other international assessments. No doubt, some Chinese and Singaporeans would agree that their educational systems should be less rigidly structured. But it’s wrong to argue that students who perform well on PISA tend to be less creative or entrepreneurial than their counterparts in other countries.

Not creative? A hasty conclusion

PISA and other large-scale international assessments have been scrutinized from various angles and perspectives (e.g., Gorur & Wu, 2015; Grek, 2009; Kreiner & Christensen, 2014; Schoultz, Säljö & Wyndhamn, 2001), and some of this scrutiny has brought about actual change in education policy. For example, David Baker and Gerald LeTendre (2005) explained that after American students performed poorly on the 1995 Third International Mathematics and Science Study (TIMSS), policy advocates used the results to justify the educational reforms of No Child Left Behind. Further, argued Baker and LeTendre (and see Johansson, 2016; Phillips & Ochs, 2003), when policy makers give too much weight to these sorts of international benchmarks, they can easily be tempted to copy the approaches they see in the highest-scoring countries, leading to an unhealthy homogenization of their school systems.

Although it’s true that test scores have been misused, Zhao’s critique of PISA (arguing that high PISA scores in East Asian countries are related to low levels of creativity) is difficult, if not impossible, to support. Certainly, nobody other than Zhao has claimed to establish any causal relationship between increasing scores on international large-scale assessments and decreasing levels of innovation across countries.

To back up his thesis, Zhao compares the mathematics scores from the 2009 PISA with results of the Global Entrepreneurship Monitor (GEM) study, which is a survey of, among other things, perceived levels of entrepreneurial capability (i.e., an individual’s confidence in his or her ability to succeed in entrepreneurship) in a wide range of countries (Bosma, Wennekers, & Amorós, 2012). At first glance, the comparison is striking: Countries with high scores on PISA (such as Japan, Korea, and Singapore) all had low scores on their perceived entrepreneurial capacity. At the same time, countries that performed in the middle of the pack on PISA (such as Sweden and the U.S.) ranked fairly high on entrepreneurial capacity, and one of PISA’s lowest performers (the United Arab Emirates) reported the highest entrepreneurial capacity of all.

Zhao finds this pattern to be evidence of a statistically significant relationship, showing that countries with high PISA scores tend to be less innovative. Moreover, he asserts, many Chinese and Singaporeans themselves “blame their education for their shortage of creative and entrepreneurial talents,” and states that if they’re correct, then the relationship “could be causal” (p. 60). That is, Zhao appears to be arguing that the way these countries teach math causes their students not only to do well on tests but also to become less creative. Thus, he concludes, it’s a mistake for U.S. policy makers to pursue reforms that make their schools more like those in East Asia. If they continue to push for more emphasis on standardization, test taking, and highly rigorous academic work, then students’ creativity will be seriously harmed, and the American workforce will become less innovative.

But on closer inspection, this argument turns out to be pretty weak. It may be true that Singapore and China haven’t produced many world-famous musicians, but other than that, there isn’t much evidence that East Asians suffer from a lack of creativity or, if they do, that it has anything to do with their PISA scores.

Zhao’s reliance on self-reported levels of entrepreneurial capacity is especially problematic. For one thing, the construct is not well defined, making it difficult to interpret people’s self-assessments. For another, it is unclear why entrepreneurial abilities should be equated with creativity, or why creativity should be distinguished from mathematical proficiency. Actually, mathematical reasoning and problem solving are often described as deeply creative activities. For example, Haylock (1997) argued that there are at least two major ways in which the term creativity is used in the context of mathematics: 1) thinking that is divergent and overcomes fixation and 2) the thinking behind a product that is perceived as outstanding by a large group of people. Further, creativity is associated with long periods of work and reflection rather than rapid and unique insights.

Perhaps more important, the context is so different from one country to another that it may not be possible to compare those self-assessments at all, or to know what to make of results that seem to show that people in East Asia are less innovative than their counterparts in the West. For example, it can be very difficult to start a business or pursue other forms of entrepreneurship in a country that is tightly governed by the state, while it is relatively easy to do so in the U.S. or Sweden — one would expect such differences to affect the GEM results.

But for the sake of argument, let’s assume that people’s self-assessments are entirely accurate, meaning that people in East Asia are in fact less innovative, as the GEM data appear to show. In that case, we should also expect them to perceive themselves to be stronger in math, consistent with the PISA results.


To see if this is true, I checked the data from PISA’s 2012 student background questionnaire, which included three questions asking participants to rate (on a scale of 1-4) their own mathematical abilities (Table 1). In particular, I focused on students from countries that performed well on PISA’s math section but ranked low on the GEM survey, and vice versa. These included Korea, Japan, and Singapore (East Asian countries that, along with Finland, ranked highest on PISA), as well as Sweden, the U.S., and the United Arab Emirates (U.A.E.), which ranked below average on PISA but had the highest perceptions of entrepreneurial capability in the GEM study (Bosma et al., 2012).

Using SPSS (a common statistical program), I computed each country’s mean score on the 1-4 scale. A mean around 3 indicates that students perceive themselves to be good at mathematics, while a mean around 2 indicates that they do not think they are particularly good at math.
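
For readers who want to try this kind of calculation themselves, here is a minimal sketch in Python rather than SPSS. It is not the analysis reported in this article: the country labels and responses are invented for illustration, and the function name country_means is hypothetical. It simply averages each student’s three 1-4 ratings and then averages those student scores within each country.

```python
from collections import defaultdict
from statistics import mean

# Invented responses (NOT PISA data): country label plus three items rated 1-4.
responses = [
    ("A", 3, 4, 3),
    ("A", 2, 3, 3),
    ("B", 2, 2, 1),
    ("B", 3, 2, 2),
]

def country_means(rows):
    """Average each student's items, then average students within a country."""
    per_country = defaultdict(list)
    for country, *items in rows:
        per_country[country].append(mean(items))
    return {c: round(mean(scores), 2) for c, scores in per_country.items()}

result = country_means(responses)
```

With these made-up numbers, country A lands around 3 (students rate themselves as good at math) and country B around 2. A real analysis would also need to handle missing responses and the survey’s sampling weights, which this sketch ignores.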


Table 2 shows the countries’ overall performance on the 2012 PISA math section, along with their students’ self-perceptions of their mathematical ability. As we can see, the lowest-achieving countries have the highest scores on their perceived skills. And while perceived ability was similar for Singapore, Sweden, and the U.K., those countries’ PISA results are very different: Singapore’s results suggest that its students are roughly 1-2 years ahead of their peers from Sweden and the U.K. (Strietholt et al., 2014).

Notably, while students in Japan and Korea performed near the top on PISA, they ranked their own math ability much lower than did students in any of the other countries. Likewise, while the U.S. and U.A.E. performed near the bottom of this list on PISA, students from those countries ranked their own math ability the highest — U.A.E. stands out here, having both the lowest PISA score and highest self-reported ability.

Note that the disconnect between PISA scores and self-reported mathematical ability is very similar to the disconnect that Zhao found between PISA scores and self-reported entrepreneurial capacity: People from countries that score high on the math assessment perceive themselves to be less capable at math (i.e., their perceptions are incorrect), and they also perceive themselves to be less entrepreneurial (a perception that Zhao assumes to be correct). Well, which is it? Are self-perceptions of abilities accurate or not?

In fact, there is good reason to believe that self-perceptions are wrong in both cases, and that East Asians’ PISA scores bear no relation to their entrepreneurial ability. A better explanation for the disconnect between test scores and perceived abilities has to do with national and/or cultural differences in the ways people respond to attitude surveys.

A number of studies have shed light on these differences. For example, when Jia He and Fons Van de Vijver (2016) investigated the association between motivation and achievement using PISA data from 64 countries, they found a negative correlation at the country level — the countries with lower self-reported levels of motivation outperformed countries that had higher levels of motivation. Within all countries, the association between student motivation and achievement was positive, suggesting that motivation does in fact matter to educational outcomes. So why was the association negative at the country level? The authors point to cultural differences in response styles: People in East Asia tend to be more modest, and people in the U.S. tend to be more outwardly confident.
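
The pattern described above, a positive association inside every country alongside a negative association between countries, can be reproduced with a few invented numbers. The sketch below is an illustration only: the country labels and the (motivation, achievement) pairs are synthetic and are not taken from PISA or from He and Van de Vijver’s study, and the helper name pearson is my own.

```python
from statistics import mean

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Synthetic (self-reported motivation, achievement) pairs, invented for illustration.
countries = {
    "modest_high_scorer": [(1.8, 540), (2.0, 560), (2.2, 580)],
    "middle": [(2.4, 490), (2.6, 510), (2.8, 530)],
    "confident_low_scorer": [(3.0, 440), (3.2, 460), (3.4, 480)],
}

# Correlation among students inside each country.
within = {name: pearson(*zip(*pairs)) for name, pairs in countries.items()}

# Correlation between country-level averages.
country_motivation = [mean(m for m, _ in pairs) for pairs in countries.values()]
country_achievement = [mean(a for _, a in pairs) for pairs in countries.values()]
between = pearson(country_motivation, country_achievement)
```

In this toy data, every within-country correlation is strongly positive while the between-country correlation is strongly negative. The reversal arises because each invented country reports on a different baseline, which is the kind of response-style difference the authors describe.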

Let’s be cautious with self-reported measures

As this example suggests, it can be problematic to rely on people’s self-reported measures of their abilities. Within a group, people tend to give relatively accurate estimates of where they rank, but that may not be the case when it comes to comparisons across groups with different frames of reference (Johansson, Myrberg, & Rosén, 2012; Marsh, 1986).

Further, evidence strongly suggests that in a variety of social and academic contexts, people with low ability are most likely to overestimate their own knowledge or skills, while top performers are most likely to underestimate their relative strengths, a pattern often referred to as the Dunning-Kruger effect (Dunning et al., 2003; Kruger & Dunning, 1999).

Finally, if there are relatively few world-famous East Asian writers, musicians, and other artists, perhaps that has mostly to do with the West’s dominance in exporting its own values and tastes. It would be hard to argue that a country like Japan (home to manga stories, anime movies, spectacular cuisine, and on and on) lags behind any other part of the world in terms of creativity. And when it comes to economic matters, it is worth noting the gross domestic product for East Asian countries shows marked advances in recent years, and researchers are suggesting that high scores on PISA and other international assessments can be connected to their productivity (Hanushek & Woessmann, 2016). And despite Zhao’s warnings about the supposedly stifling nature of East Asian school systems, those countries appear to be producing all the entrepreneurs and innovators they need.



Baker, D. & LeTendre, G. (2005). National differences, global similarities: World culture and the future of schooling. Palo Alto, CA: Stanford University Press.

Bosma, N., Wennekers, S., & Amorós, J. (2012). Global entrepreneurship monitor: 2011 extended report: Entrepreneurs and entrepreneurial employees across the globe. London, England: Global Entrepreneurship Research Association.

Dunning, D., Johnson, K., Ehrlinger, J., & Kruger, J. (2003). Why people fail to recognize their own incompetence. Current Directions in Psychological Science, 12 (3), 83-87.

Gorur, R. & Wu, M. (2015). Leaning too far? PISA, policy and Australia’s “top five” ambitions. Discourse: Studies in the Cultural Politics of Education, 1-18.

Grek, S. (2009). Governing by numbers: the PISA “effect” in Europe. Journal of Education Policy, 24 (1), 23-37.

Hanushek, E.A. & Woessmann, L. (2016). Knowledge capital, growth, and the East Asian miracle. Science, 351 (6271), 344.

Haylock, D. (1997). Recognising mathematical creativity in schoolchildren. Zentralblatt für Didaktik der Mathematik, 29 (3), 68–74.

He, J. & Van de Vijver, F.J.R. (2016). The Motivation-Achievement Paradox in international educational achievement tests: Toward a better understanding. In R.B. King & A.B.I. Bernardo (Eds.), The psychology of Asian learners: A Festschrift in Honor of David Watkins (pp. 253-268). Singapore: Springer Singapore.

Johansson, S. (2016). International large-scale assessments: What uses, what consequences? Educational Research, 58 (2), 139-148.

Johansson, S., Myrberg, E., & Rosén, M. (2012). Teachers and tests: Assessing pupils’ reading achievement in primary schools. Educational Research and Evaluation, 18 (8), 693-711.

Kreiner, S. & Christensen, K. (2014). Analyses of model fit and robustness. A new look at the PISA scaling model underlying ranking of countries according to reading literacy. Psychometrika, 79 (2), 210-231.

Kruger, J. & Dunning, D. (1999). Unskilled and unaware of it: How difficulties in recognizing one’s own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology, 77 (6), 1121-1134.

Marsh, H.W. (1986). Verbal and math self-concepts: An internal/external frame of reference model. American Educational Research Journal, 23 (1), 129-149.

Organisation for Economic Co-operation and Development (OECD). (2010). PISA 2009 results. Paris, France: Author.

Organisation for Economic Co-operation and Development (OECD). (2013). PISA 2012 results. Paris, France: Author.

Phillips, D. & Ochs, K. (2003). Processes of policy borrowing in education: Some explanatory and analytical devices. Comparative Education, 39 (4), 451-461.

Schoultz, J., Säljö, R., & Wyndhamn, J. (2001). Conceptual knowledge in talk and text: What does it take to understand a science question? Instructional Science, 29 (3), 213-236.

Strietholt, R., Bos, W., Gustafsson, J-E., & Rosén, M. (2014). Educational policy evaluation through international comparative assessments. New York, NY: Waxmann.

Zhao, Y. (2012). Flunking innovation and creativity. Phi Delta Kappan, 94 (1), 56-61.


Citation: Johansson, S. (2018). Do students’ high scores on international assessments translate to low levels of creativity? Phi Delta Kappan, 99 (7), 57-61.


STEFAN JOHANSSON is senior lecturer in the Department of Education and Special Education at the University of Gothenburg, Sweden.
