
Artur Nilsson's research blog

Is there a generalizability crisis in psychological science?

Uncategorised Posted on Tue, November 16, 2021 11:11:00

Tal Yarkoni recently published an article arguing that psychological science suffers from a generalizability crisis. Although this article has caused quite a stir in the field (and quite a bit of confusion), the issue Yarkoni discusses is by no means new. It has been known to methodologists for at least a couple of decades. It is also intimately connected to the problems that led to the demise of positivism and falsificationism in the philosophy of science. But Yarkoni has provided a new statistical formulation of this problem and brought it to the attention of mainstream researchers in psychology.

I discuss the basic methodological issue, divorced from the statistical formulation, below.

The fundamental problem

Research questions, hypotheses, and conclusions in psychological science tend to be expressed in very general terms, without contextual qualifications. The implicit assumption is often that the research addresses human nature and general principles of behavior. There are at least two reasons for this.

  • The first is the nomothetic ideal that stems from our discipline’s positivist heritage and encourages the pursuit of context-free laws.
  • The second is that an article today comes across as more important (or “cooler”), and has a greater chance of being published in a high-impact journal, if it purports to unveil a basic truth about human psychology than if it only covers the behavior of, say, American college students in a particular artificial situation.

But are such broad formulations really justifiable? Is it defensible to hide or ignore potential contextual contingencies?

In order to draw valid conclusions from empirical research, our claims need to match the observations that were made. This means that all the theoretical concepts that are implicitly or explicitly used need to match concrete aspects of the study. This is known as the problem of construct validity in the methodological literature. Following Campbell’s classical typology, we need to obtain construct validity in terms of participants (or whatever are the units of research), instruments, manipulations, and situations.

The problem that researchers commonly rely exclusively on WEIRD (Western, Educated, Industrialized, Rich, Democratic) participant samples is well known (see my previous blog post). The fact that we also need samples of instruments, manipulations, and situations that cover the constructs, causes, and situations we want to make claims about is less well known. Research articles are full of sweeping claims about broad theoretical constructs, although what the studies have actually examined are specific, often arbitrarily selected operationalizations of these constructs out of a broad universe of possible operationalizations. Nor do researchers typically use a statistical (multilevel) model that permits generalization to populations of instruments, manipulations, and situations.
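The point can be made concrete with a toy simulation (all numbers, including the effect-size distribution, are made up for illustration): when each study samples only a couple of operationalizations from a heterogeneous universe, study-level estimates swing widely from one operationalization sample to the next.

```python
import random
import statistics

random.seed(1)

# Hypothetical universe of 1,000 possible operationalizations of a
# construct, each with its own true effect size (all numbers made up).
universe = [random.gauss(0.3, 0.2) for _ in range(1000)]

def study_estimate(k):
    """Mean effect in a study that samples k operationalizations."""
    return statistics.mean(random.sample(universe, k))

# Simulate many studies that use 2 vs. 20 operationalizations each.
few_stimuli = [study_estimate(2) for _ in range(2000)]
many_stimuli = [study_estimate(20) for _ in range(2000)]

print(round(statistics.stdev(few_stimuli), 3))   # study-to-study spread, k = 2
print(round(statistics.stdev(many_stimuli), 3))  # study-to-study spread, k = 20
```

A multilevel model with random effects for operationalizations is the statistical analogue of averaging over a larger sample of them rather than treating two arbitrary picks as the whole universe.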

Solving the problem

This problem might seem insurmountable at first glance. How would it be possible to obtain representative samples of all features of the study? How would we specify the total universe of instruments, manipulations, and situations?

We would do it in the same way we obtain representative samples from the total population of individuals. Obtaining a completely random sample is not feasible in a psychology that aspires to make broad claims about all of humanity. Contra positivist philosophy of science, making broad inductivist generalizations that are completely uncontaminated by theoretical assumptions is not possible. We inevitably rely on theoretical assumptions about relevant variations among the instances of the population when we generalize. We might, for example, try to make the sample of participants representative in terms of age, gender, education, income, nation, political identity, ethnicity, culture, or geographic region—that is, we are assuming that these specific variations are relevant to the phenomena we are studying. In the same way, we need to think carefully about theoretically relevant variations in the instruments (e.g., self-report, peer report, or observations), manipulations (e.g., techniques, frequency, or duration), or situations (e.g., anonymous or non-anonymous) and take these into consideration in our studies.

Would it perhaps be the case, as some commentators have thought, that the generalizability problem does not even emerge on a falsificationist philosophy of science?

Popper argued that (a) science is nomothetic, and (b) universal claims can be deductively falsified (through modus tollens) but never verified, and therefore (c) science is characterized by falsification. Would it perhaps then be enough to test one prediction that operationalizes the hypothesis, out of many possible predictions? After all, logic dictates that the hypothesis cannot be true if any prediction that follows from it turns out to be false. The problem with this line of reasoning is that Popper’s deductivist account of falsification never worked out. When a result conflicts with the hypothesis, this does not necessarily mean that the hypothesis is false; it might be that some auxiliary assumption about, for instance, the instruments, participants, or manipulations was false rather than the hypothesis (this is also known as the Duhem-Quine thesis). What is deductively refuted is the conjunction of the hypothesis and all of the myriad auxiliary assumptions that we need to rely on to test it. Therefore, as Popper was forced to acknowledge, a falsification is in the end a practical decision–and a complex one–about what to attribute the results to, rather than a simple deductive relation between hypothesis and observation statement. To rigorously falsify or corroborate a hypothesis in a falsificationist spirit, we need to repeat the study with different populations, instruments, and situations, guided by theoretical assumptions and critical thinking about relevant variations, and analyze the extent to which the result is robust versus sensitive to different auxiliary assumptions.
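In symbols (a standard way of stating the Duhem-Quine point, not Yarkoni's notation): if the hypothesis H yields a testable prediction P only in conjunction with auxiliary assumptions A1, …, An, then a failed prediction refutes only the conjunction.

```latex
% The prediction follows from the hypothesis plus auxiliaries:
(H \wedge A_1 \wedge \cdots \wedge A_n) \rightarrow P
% Modus tollens on a failed prediction therefore yields only:
\neg P \;\vdash\; \neg (H \wedge A_1 \wedge \cdots \wedge A_n)
% which is consistent with H being true and some auxiliary A_i being false.
```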


The generalizability problem is a real problem in psychological science and the philosophy of science–it cannot be dissolved by simple philosophical maneuvers. At the same time, it does not constitute an insurmountable crisis that should lead to the abandonment of the entire enterprise of quantitative psychology. A more constructive move is for researchers, editors, and reviewers to get better at promoting careful, nuanced, contextually contingent generalizations over wild, hyped-up overgeneralizations, and tests of the robustness of results across relevant variations in participants, instruments, manipulations, and situations. Scientific inquiry is, for all its successes, inevitably fallible, imperfect, and limited. It should always be subject to critical scrutiny and improvement.

Authoritarianism on the left and the right: A sequel

Uncategorised Posted on Fri, October 08, 2021 11:49:11

In an earlier blog post, I discussed the problems with a new scale developed by Conway and colleagues to measure left-wing authoritarianism. I wrote that a better scale for measuring this construct could benefit research in psychology. Shortly after this, another team of researchers (Costello et al., 2021) published a new left-wing authoritarianism scale. Although I have not studied this scale myself, it looks more promising and lacks the obvious problems of the Conway et al. scale. It is most likely an improvement on previous measures of left-wing authoritarianism.

Nevertheless, the paper written by Costello et al. on their new scale is disappointing in some respects.

A misleading narrative

It perpetuates a sensationalist (and polemical) narrative that was later picked up by The Atlantic and Quillette. According to this narrative, there is a widespread and remarkably biased belief among psychological researchers that authoritarianism on the left does not exist, and therefore it has not been studied. In the introduction of their paper, Costello et al. write: “Many decades later, a core question remains unresolved: are some individuals on the left disposed to authoritarianism?”

Although it is possible to find a handful of extravagant quotations on the rarity of left-wing authoritarianism in the history of the field, authoritarianism scholars have rarely denied the existence of left-wing authoritarianism. In fact, a plethora of research programs have studied authoritarianism on the left for decades, particularly in formerly Communist countries in Eastern Europe.

A persistent empirical finding is that authoritarianism is, other things equal, more common among right-wing conservatives than others, even when it is measured in an ideologically neutral way. This is the conclusion I drew (together with John Jost) in a recent review (Nilsson and Jost, 2020) of research on the association between authoritarianism and political ideology. Costello et al. falsely attribute the claim that left-wing authoritarianism is “‘the Loch Ness Monster’ of political psychology” to us. Nowhere in this paper did we deny that authoritarianism on the left exists or that it is common in some contexts. We even discussed left-wing authoritarianism in Eastern Europe and Southeast Asia. As I wrote in my previous blog post, left-wing authoritarianism “is common in contexts where the socially sanctioned authorities and norm systems are left leaning.” There is no conflict between any of these assertions and the observation, made by Popper (1945) among others, that totalitarian ideologies on both the left and the right pose a threat to the open society.

Conceptual confusion

In fact, the sort of research presented by Conway et al. and Costello et al. (and many others), which operates within a standard variable-oriented paradigm, has no relevance at all to the question of whether “some individuals on the left are authoritarian”. Statistical analyses of this sort address associations between variables–they cannot provide existence proofs. The notion that constructing a scale to measure something proves that it “exists” (in some unspecified sense) is based on naive ontological ideas. Psychological constructs are abstract mathematical idealizations. Personality scales measure idealized quantities of a statistical abstraction–the “average person”.

This research is also not relevant to the question of whether a “shared psychological core” underlies authoritarianism on the left and on the right. To validly address this question, it is necessary to construct an ideologically neutral measure of this core and demonstrate that it is measurement invariant across left- and right-wingers who score highly on it. Thereafter, the extent to which ideology moderates the associations between the core of authoritarianism and other phenomena can be studied to map similarities and differences between manifestations of authoritarianism on the left and the right.

Left- and right-wing authoritarianism scales may of course be useful for many other purposes. But if we want to investigate the association between authoritarianism in general and ideology, then we need neutral scales that are applicable to both left- and right-wingers (see Nilsson & Jost, 2020).

What is science anyway? On trust in science, critical thinking, and the Swedish covid response

Philosophy and meta-theory Posted on Wed, December 30, 2020 18:30:29

Trust in science is a central pillar of modern democracies. Reliance on the expertise of scientific authorities is a powerful heuristic, because it is impossible for one person to be an expert on everything. This heuristic works best if we trust the scientists who are the leading researchers on the topic of interest. Nevertheless, trust in science should never be unconditional. The notion of the scientist as an unassailable authority is antithetical to the very idea of science. Science is the best tool we have for understanding our world, but individual scientists are not flawless arbiters of the truth—they are often wrong and sometimes irrational (e.g., subject to groupthink, emotional conviction, and reasoning biases). A scientific attitude must therefore incorporate a readiness to think critically when there are grounds for doing so.

Critical thinking is an epistemic virtue as long as it is grounded in rational argumentation and analysis of evidence. Evidence-based critique should never be dismissed just because the critic is not an authority on the topic of interest—attacking the epistemic authority of the critic is the pseudo-scientist’s game. The perspectives of epistemic outsiders might contain insights that could inform even the expert’s understanding of his or her subject area. Whatever grains of truth the outsider’s critique might contain should be harvested. Even if the critique turns out to be wrong or ill-founded, responding to the critique might produce a more nuanced understanding of the topic in question. Critical exchange in this sense is part of the very fabric of science.

Trivial as it may seem, this point is often not well understood. High levels of trust in science are not always coupled with an understanding of the scientific attitude to the pursuit of knowledge. The Swedish covid response provides a good illustration of this.

The Swedish covid response and the case of face masks

Over the past year, the Swedish Public Health Authority has made numerous severe misjudgments concerning the spread of the coronavirus SARS-CoV-2 in Sweden and the efficacy of preventive measures. Scientists make mistakes; to err is human. But several things are notable. First, representatives of the Swedish Public Health Authority have several times casually dismissed critiques from a plethora of highly qualified virologists, epidemiologists, immunologists, mathematicians, and other academics, both nationally and internationally. Second, they have consistently erred on the side of underestimating the dangers of SARS-CoV-2 and the need for precautionary measures, and they have failed to learn from their mistakes. Third, some (but of course far from all) of the claims they have made have been highly questionable and based on fallacious arguments.

In spite of this, few critical questions have been asked by journalists and science reporters. Many journalists were initially more concerned with dismissing or ridiculing critics (e.g., calling them “hobby-epidemiologists”), and later on with which scientific authorities should be trusted, than with asking questions about evidence and exposing vacuous claims and blatantly fallacious arguments.

The claims about face masks made by the Swedish state epidemiologist Anders Tegnell, among others, are an interesting case in point. Even as scientific evidence for the efficacy of face masks in fighting the covid pandemic has grown, Tegnell has persisted in claiming that the evidence is in fact “weak” and that the studies that provide evidence for the efficacy of face masks have “problems”. It is difficult to know what exactly these vague assertions are supposed to mean, because no reporter has asked him, but I am guessing that he is referring to the fact that there is no positive evidence from a fully randomized, controlled, double-blinded trial with tens of thousands of participants yet. Tegnell has repeatedly claimed that the randomized controlled trial conducted by Bundgaard et al. (2020), which did not provide clear evidence for the efficacy of face masks in Denmark, is the best study on this topic so far. There are three reasons that the invocation of this study is grossly misleading (several other scientists have already commented on this, including the statistician Olle Häggström here):

  1. This study only investigated whether face masks protect the person who wears the mask from being infected, but research suggests that face masks reduce the transmission of viruses mainly by preventing those who wear them from infecting others (although some masks also protect the bearer)—that is, the central hypothesis was not tested.
  2. This study was conducted in the late spring of 2020 when the transmission rate had already declined a great deal, presumably because of the seasonality of coronaviruses. The potential for aerosol transmission over longer distances is much greater in the winter, and therefore face masks have greater potential utility at this time of year.
  3. This study only had the statistical power to detect extremely large effects, which were in turn wildly unlikely in the first place given the selection of outcome measure and the timing of the study.
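To put rough numbers on the power point: with approximately the Danish trial's scale (about 3,000 participants per arm and an infection risk around 2% in the control arm; these are my ballpark figures for illustration, not the trial's exact design), a normal-approximation power calculation for a two-proportion test shows that only very large effects were detectable:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_two_proportions(p_control, reduction, n_per_arm, z_crit=1.959964):
    """Approximate power of a two-sided two-proportion z-test
    (normal approximation, alpha = .05 by default)."""
    p_treat = p_control * (1.0 - reduction)
    se = math.sqrt(p_control * (1.0 - p_control) / n_per_arm
                   + p_treat * (1.0 - p_treat) / n_per_arm)
    return norm_cdf((p_control - p_treat) / se - z_crit)

# Illustrative figures only: ~3,000 per arm, ~2% infection risk in controls.
for reduction in (0.2, 0.5):
    print(f"{reduction:.0%} risk reduction: power = "
          f"{power_two_proportions(0.02, reduction, 3000):.2f}")
```

Under these assumptions, a 20% risk reduction would be detected only about a fifth of the time, whereas decent power requires an implausibly large effect of around 50%.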

There are many other studies that have provided evidence that face masks reduce SARS-CoV-2 transmission. For instance, a German natural experiment by Mitze et al. (2020) suggested that the introduction of face mask regulations in different German regions produced a 45% reduction in the number of new infections.

It is true that we do not have certain evidence that face masks effectively reduce transmission. A completely ideal controlled experiment in natural settings with very high statistical power is yet to be reported. But we do not have this kind of evidence for the effect of smoking on lung cancer or the effect of hand washing on SARS-CoV-2 transmission either. We still have good reasons to believe that all these effects exist. For instance, tobacco smoke causes cellular changes that are known to be associated with lung cancer, soap is known to dissolve viruses, and face masks have been known to block around 85-95% of virus-containing droplets and aerosols since the beginning of the current pandemic.

Some skepticism about new findings and epistemic conservatism is understandable and can, to some extent, be warranted. But the same standards of evidence should be applied to new ideas regardless of whether they are consistent with your own preconceptions or not. Anders Tegnell has made a plethora of claims with little or no evidence to back them up. For instance, he has claimed that the Swedish recommendations were effective in reducing SARS-CoV-2 transmission in the spring of 2020. By his own standards of evidence (when discussing research on face masks) there is no evidence whatsoever for this claim—there is not even a control or reference point, and no attempt to rule out alternative explanations such as disease seasonality. He has also claimed that Sweden has done quite well in handling the pandemic based on anecdotal comparisons with other countries, again with no scientific grounds—for instance, without attempts to statistically account for differences between the countries. There is, as I mentioned, probably not even strong evidence for the efficacy of hand washing according to the evidentiary standards Tegnell applied to research on face masks, even though the Public Health Authority has recommended hand washing from the start while refusing to recommend usage of face masks.

Public opinion on face masks in Sweden has begun to shift recently, and the Public Health Authority has reluctantly begun to recommend (rather than “not forbid”) face mask usage during rush hour on public transportation and in hospitals. Yet in crowded malls most people use hand sanitizers incessantly but not a face mask, although the current research suggests that widespread face mask usage would be a lot more effective in combating SARS-CoV-2, which is now known to be airborne. It is likely that the deficient scientific thinking on this issue and others among representatives of the Public Health Authority has caused great harm.

Final thoughts on trust in science

Science should have a very prominent position in a modern, secular, democratic society. Trust in science needs to be high for a society to thrive. But our trust in scientists should never be unconditional. Our allegiance should ultimately be to scientific argument, evidence, method, and genuine expertise rather than to the provincially sanctioned authorities of the day. Trust in science is not deference to authority or worship of anything. New generations of students and science journalists should be taught to distinguish genuine science from its pale imitations and to distinguish genuine evidence-based critiques of scientific ideas from fake news, conspiracy theories, crackpot ideas, and ideological fanaticism.

Ideology, authoritarianism, and the asymmetry debates

Uncategorised Posted on Mon, April 20, 2020 02:42:25

According to one of the classical psychological theories of ideology, conservatism is associated with a simple, intuitive, unsophisticated, rigid, and authoritarian cognitive and psychological style. This rather unflattering portrait of conservatives has been the target of criticism lately. Critics have argued that it is a product of a “liberal bias” and hostility toward conservatism among social and political psychologists. Studies have been designed to show that the associations between the aforementioned characteristics and political ideology are symmetrical—or in other words, that extremists of any ideological persuasion are simple-minded, rigid, authoritarian, and susceptible to cognitive biases.

Some of the criticism of the classical “rigidity-of-the-right” theory is undoubtedly warranted. But the problem with this theory is not that it is all wrong. The problem is that it has proved to be too simplistic, and some of the criticism of it is also too simplistic. There are probably ideological asymmetries, symmetries, and extremism effects. It all depends on what specific aspect of cognitive or psychological style you focus on. There are also considerable differences between different kinds of left- and right-wing ideologies, which have often been lumped together, somewhat misleadingly, under the broad labels of “liberalism” and “conservatism” (see Nilsson, Erlandsson, & Västfjäll, 2019; Nilsson et al., 2020).

In this blog post, I will discuss the case of authoritarianism, which has recently generated heated debates. This is a case for which I believe there is currently pretty good evidence for the existence of ideological asymmetries. Authoritarianism tends to be higher among right-wing conservatives all other things equal (see Nilsson & Jost, 2020).

Authoritarianism on the right and on the left

In 1981, Bob Altemeyer introduced a new instrument designed to measure Right-Wing Authoritarianism (RWA), which has become tremendously popular. This instrument has been rightly criticized because it confounds authoritarianism and right-wing or conservative ideology (some of the items refer to conservative issues, e.g., “You have to admire those who challenged the law and the majority’s view by protesting for women’s abortion rights, for animal rights, or to abolish school prayer”). This content overlap could produce a spurious or inflated correlation between authoritarianism and conservatism.

To address this problem, Conway et al. (2018) introduced a parallel scale to measure Left-Wing Authoritarianism (LWA) by rewriting RWA items so that they would refer to “liberal” authorities and norms. For instance, in one of the items, “the proper authorities in government and religion” was replaced with “the proper authorities in science with respect to issues like global warming and evolution”. The authors found that LWA was strongly associated with liberal forms of prejudice, dogmatism, and attitude strength in US convenience samples. On this basis, they concluded that authoritarianism exists among left-wingers and right-wingers in essentially equal degrees.

This conclusion may seem appealing at first glance, and the paper has been uncritically cited as evidence of blatant liberal bias in political psychology by some influential figures. But this research is methodologically flawed.

Content validity. The first critical issue in validating a scale concerns the theoretical correspondence between the content of the items that comprise the scale and the definition of the construct that the scale is purported to measure. Consider the following items:

  • “Progressive ways and liberal values show the best way of life.”
  • “There is absolutely nothing wrong with Christian Fundamentalist camps designed to create a new generation of Fundamentalists.” (reverse-scored)
  • “It’s always better to trust the judgment of the proper authorities in science with respect to issues like global warming and evolution than to listen to the noisy rabble rousers in our society who are trying to create doubts in people’s minds.”
  • “With respect to environmental issues, everyone should have their own personality, even if it makes them different from everyone else.” (reverse-scored)

The content of these items is a combination of liberal political views and a rejection of epistemological and moral relativism. For a liberal individual to score low on LWA, s/he would have to think that all values are equal, that scientists are no more trustworthy than non-scientists with respect to scientific issues, and that fundamentalist indoctrination is totally unproblematic. If anything, such a person sounds to me more (rather than less) susceptible to authoritarian tendencies.

Even if we were to accept that the Conway LWA scale conceptually parallels Altemeyer’s RWA scale (which is debatable), relying on either of these scales to assess the empirical association between authoritarianism and political ideology is pointless because of the problem with content overlap. Fortunately, criticism of the RWA scale has stimulated the development of more psychometrically rigorous measures of authoritarianism that disentangle different aspects of authoritarianism and remove ideological content and language (here is a sample item: “We should believe what our leaders tell us”). So far, these developments have suggested that an association between authoritarianism and right-wing conservatism remains when content overlap is removed (see Nilsson & Jost, 2020).

Structural validity. A second critical issue in validating a scale concerns the correspondence between the theorized structure of the constructs that are purportedly measured and the empirically derived structure in the data. Conway et al. have not reported any tests of structural validity in any of the papers or online supplements that I am aware of. They only report Cronbach’s alpha reliabilities, which say nothing about dimensionality, and their data are not openly accessible, which makes it difficult to evaluate the scale. This is particularly troubling given the aforementioned problems with content validity. Although some of the other items may capture elements of left-wing authoritarianism, it is currently impossible to tell which dimensions the instrument does or does not measure.
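A small simulation (with made-up loadings and sample sizes) shows why alpha is silent about dimensionality: six items that measure two completely unrelated factors can still yield a respectable alpha.

```python
import random
import statistics

random.seed(0)

def cronbach_alpha(items):
    """items: one list of scores per item, same respondents in each."""
    k = len(items)
    total_scores = [sum(scores) for scores in zip(*items)]
    sum_item_vars = sum(statistics.variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_vars / statistics.variance(total_scores))

n = 5000
factor_1 = [random.gauss(0, 1) for _ in range(n)]
factor_2 = [random.gauss(0, 1) for _ in range(n)]  # independent of factor_1

# Six items: three load only on factor 1, three only on factor 2.
items = [[score + random.gauss(0, 0.5) for score in factor]
         for factor in (factor_1, factor_1, factor_1,
                        factor_2, factor_2, factor_2)]

print(round(cronbach_alpha(items), 2))  # respectable alpha, two factors
```

Despite the scale being cleanly two-dimensional, alpha comes out around the conventional "acceptable" threshold, which is why factor-analytic tests of structure are needed on top of reliability coefficients.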

Nomological network. A third issue in validating a scale (at least according to some methodologists) is whether the scale correlates in a predictable or theoretically meaningful manner with conceptually adjacent and non-adjacent constructs. The main results Conway et al. (2018) invoke to support the validity of their LWA scale are a set of correlations between LWA and measures of dogmatism, intolerance, and attitude strength designed for left-wing liberals. However, there is so much overlap between these scales that finding anything other than a strong correlation between them would be almost impossible (i.e., the hypothesis that is purportedly tested is not falsifiable). For instance, these are items used to measure left-wing dogmatism (LWD) and left-wing authoritarianism (LWA):

  • LWD: “I don’t trust the modern-day scientific experts on global warming very much” (reverse-scored)
  • LWA: “It’s always better to trust the judgment of the proper authorities in science with respect to issues like global warming and evolution than to listen to the noisy rabble rousers in our society who are trying to create doubts in people’s minds.”
  • LWD: “With respect to environmental issues, there is no “ONE right way” to live life; everybody has to create their own way.” (reverse-scored)
  • LWA: “With respect to environmental issues, everyone should have their own personality, even if it makes them different from everyone else.” (reverse-scored)

I can’t help thinking that you could produce whatever correlation you want by adjusting formulations and content overlap.

General points. The argumentation by Conway and colleagues is a bit insidious in implying that either you embrace the conclusions of their paper or you think that LWA is a myth or an “invalid” construct.

  • Is LWA a “valid” construct?

Validity is not a property of constructs. A construct can be more or less useful, and it can have referents or lack referents.

  • Is LWA a “viable” or useful construct?

There is no evidence that the scale introduced by Conway and colleagues is useful. But that does not mean that the theoretical concept of authoritarianism on the left is not useful. A more psychometrically rigorous and unbiased LWA scale could benefit research in psychology.

  • Does authoritarianism on the left exist?

Yes, it does. It is common in contexts where the socially sanctioned authorities and norm systems are left leaning.

Final thoughts

Research on left-wing and right-wing authoritarianism should be judged according to the same standards of evidence. This should lead us to the conclusion that neither Altemeyer’s RWA scale nor Conway’s LWA scale can be used for assessing the association between authoritarianism and ideology. Altemeyer’s scale serves other purposes (e.g., measuring ideology or predicting prejudice). An LWA scale could too. But the very idea of basing such a scale on an RWA scale that was designed almost four decades ago is misguided. Although we still have a long way to go with respect to improving measurement practices in psychology, there have undoubtedly been major advances in psychometrics since Altemeyer’s days, and better measures of authoritarianism are available today (see Nilsson & Jost, 2020).

None of this is to suggest that the “rigidity-of-the-right” theory got everything right or that liberal bias is a myth—the picture is, as I wrote earlier, probably a lot more complicated, and liberal bias has probably shaped research in some areas. But there is a risk that researchers could go too far in the other direction—from “it’s all asymmetrical”, to “it’s all symmetrical”, or from liberal bias to anti-liberal bias—as the case of authoritarianism appears to suggest. The history of ideas is full of examples of dramatic over-reactions. For instance, the failures of positivist philosophy led many intellectuals to adopt radical forms of relativism, and the scientific failures of “soft” psychodynamic and humanistic personality psychologies led many researchers to adopt a remarkably naïve trait-theoretical reductionism. The truth is rarely found in either of the extremes.

New research on bullshit receptivity

Comments on new research Posted on Mon, July 08, 2019 01:40:23

The notions of “alternative facts” and fake news have rapidly gone viral. Although research on receptivity to falsehoods is useful, there is also a problem here. These notions are often used for ideological rather than scientific purposes—the real facts of the ingroup tribe are pitted against the lies of the other tribes. We need more research that focuses not on what facts people subscribe to but on how they engage with evidence and arguments, and on how to promote a more scientific (as opposed to ideological or tribalist) attitude among the public.

One interesting new line of research focuses on the notion of receptivity to bullshit, which the philosopher Harry Frankfurt famously defined (in his book “On Bullshit”) as a statement produced for a purpose other than conveying truth (e.g., persuading or impressing others).

One type of bullshit is that which emerges when someone does not really know the answer to a question but tries to say something that sounds convincing anyway in order to come off as competent. An example is the student who tries to pass an exam by writing something that sounds good enough to fool the teacher. This is a type of bullshit focused on self-promotion. It has been addressed in a recent paper by Petrocelli (2018).

Another type is political bullshit. This is the type of bullshit that results when a person says whatever he or she can to place his or her own party or ideology in the best possible light and to persuade others to join it. This is the type of bullshit that often makes political debates and opinion journalism so predictable and boring: facts are tortured and twisted to fit into an ideological "box", and the whole thing is more a game of trying to "score" goals on the opposing team and getting cheered on by your own team than a serious engagement in rational debate in which you are open to pursuing the truth and learning something new. This type of bullshit is focused on promoting an ingroup cause or ideology rather than the self.

It is, however, pseudo-profound bullshit that has been the main focus of recent research.

Receptivity to pseudo-profound bullshit

Pseudo-profound bullshit is composed of sentences designed to sound intellectually profound, through the use of buzzwords and jargon, that are actually vacuous. This type of bullshit has a long history in intellectual (or pseudo-intellectual) circles. There has even been a culture of bullshitting in some academic circles, particularly in some quarters of continental and postmodern philosophy. For instance, see this funny YouTube clip, in which the philosopher John Searle recounts a conversation in which the famous postmodernist Michel Foucault remarked that in Paris at least 10% of your writing needs to be incomprehensible for you to be considered a serious and profound thinker. The postmodern movement was also the target of the infamous hoax perpetrated by the physicist Alan Sokal, who was able to publish an article crammed with bullshit in a leading postmodern journal. This is how Sokal described the article when he made the hoax public:

I intentionally wrote the article so that any competent physicist or mathematician (or undergraduate physics or math major) would realize that it is a spoof … I assemble a pastiche — Derrida and general relativity, Lacan and topology, Irigaray and quantum gravity — held together by vague rhetoric about "nonlinearity", "flux" and "interconnectedness." Finally, I jump (again without argument) to the assertion that "postmodern science" has abolished the concept of objective reality. Nowhere in all of this is there anything resembling a logical sequence of thought; one finds only citations of authority, plays on words, strained analogies, and bald assertions.

Another prominent source of pseudo-profound bullshit is New Age literature, particularly the alliance between pseudo-science and spirituality that has come to be symbolized by the well-known New Age guru Deepak Chopra. A Swedish book called "Life through the eyes of quantum physics" that recently hit the best-seller lists provides an almost parodic illustration of this sort of pseudo-profound bullshit. The book is full of vague Chopraesque claims about quantum consciousness and its "scientifically proven" power to shape reality, including preventing serious illnesses such as cancer, promoting success in life, altering the magnetic field of the earth, and causing miracles. The authors not only lacked knowledge of the basics of quantum physics, they had no interest in it either (as interviews have made apparent); their interest was in selling New Age spirituality with the help of pop-bullshitting about quantum physics and superficial narratives about Eastern spiritual wisdom.

The reason that pseudo-profound bullshit is so pernicious, I suspect, is in part that it plays on the human yearning for a deep sense of mystery and understanding of the cosmos. Our existential predicament is mind-boggling and anxiety-provoking, and it is comforting to believe that there are gurus or other authorities out there with a deeper sense of the truth, and to attribute one's inability to understand what they say to one's own ignorance.

Recent findings

How do you study bullshit receptivity scientifically? First, you need a sample of bullshit sentences. Fortunately, there is a very simple, algorithmic way of constructing such sentences: you let a computer randomly string together impressive-sounding buzzwords into a syntactically correct sequence. There are a number of such bullshit generators available online, including the Postmodernism Generator and the Wisdom of Chopra. These sentences are by definition bullshit, since they are constructed without any concern for truth.
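The construction procedure can be sketched in a few lines of Python. The buzzword lists below are hypothetical stand-ins for illustration; the actual studies drew their sentences from online generators such as the ones mentioned above.

```python
import random

# Hypothetical buzzword vocabularies (illustrative only; the published
# studies used sentences from online generators such as the Wisdom of Chopra).
NOUNS = ["wholeness", "consciousness", "the cosmos", "intuition", "potentiality"]
VERBS = ["quiets", "transforms", "illuminates", "transcends", "gives rise to"]
ADJECTIVES = ["infinite", "hidden", "subtle", "self-aware", "unbounded"]
OBJECTS = ["phenomena", "truth", "energy", "possibilities", "meaning"]

def generate_bullshit(rng=random):
    """Randomly assemble a syntactically well-formed but vacuous sentence."""
    sentence = (f"{rng.choice(NOUNS)} {rng.choice(VERBS)} "
                f"{rng.choice(ADJECTIVES)} {rng.choice(OBJECTS)}.")
    # Capitalize the first letter so the output reads as a sentence.
    return sentence[0].upper() + sentence[1:]

if __name__ == "__main__":
    rng = random.Random(42)  # seeded for reproducibility
    for _ in range(3):
        print(generate_bullshit(rng))
```

Because each word is chosen at random, whatever meaning the output seems to carry is accidental, which is exactly what makes such sentences bullshit in Frankfurt's sense.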

In a pioneering paper that won them the Ig Nobel Prize, Pennycook, Cheyne, Barr, Koehler, and Fugelsang (2015) constructed a set of bullshit sentences (e.g., "Wholeness quiets infinite phenomena") through this method, with a focus on New Age jargon, and then let people rate how profound they thought these sentences were. They found that receptivity to the bullshit sentences was associated with an intuitive cognitive style, a lack of reflectiveness, supernatural beliefs, and other related constructs. Pennycook and Rand (2019) later also found that this sort of receptivity to pseudo-profound bullshit plays a role in receptivity to fake news.

My colleagues and I constructed a Swedish measure based on the Pennycook et al. (2015) paradigm. We have used this measure to address, among other things, the debates in political psychology over whether there are ideological asymmetries in epistemic orientations (Nilsson, Erlandsson, & Västfjäll, 2019). We found in essence that social conservatism (and particularly moral intuitions about ingroup loyalty, respect for authority, and purity) is robustly associated with receptivity to pseudo-profound bullshit, consistent with the classical notion of a “rigidity of the right”. Interestingly, we also found a particularly high bullshit receptivity among persons who vote for the green party in Sweden, and a very low bullshit receptivity among right-of-center social liberals.

What are the mechanisms driving these differences? A part of it appears to be a failure to critically engage with information. Like Pennycook and colleagues, we have found that bullshit receptivity is robustly associated with low cognitive reflection, and we have also found it to be negatively associated with numeracy and positively associated with confirmation bias.

But this cannot be the whole story. For example, the greens were close to the average in terms of cognitive reflectiveness in our study. We speculated that their high bullshit receptivity is instead due to a strong openness to ideas that is not always tempered by critical thinking. Interestingly, two papers suggesting that this is indeed a mechanism underlying bullshit receptivity appeared right after our paper was accepted for publication. Bainbridge, Quinlan, Mar, and Smillie (2019) found that receptivity to pseudo-profound bullshit is associated with the personality construct "apophenia"—the tendency to see patterns where none exist—which is a form of trait openness. Walker, Turpin, Stolz, Fugelsang, and Koehler (2019) measured illusory pattern perception through a series of cognitive tests rather than personality questions but came to a similar conclusion—bullshit-receptive persons tend to endorse patterns where none exist.

There may of course also be other mechanisms that contribute to receptivity to pseudo-profound bullshit. For example, Pennycook and colleagues have suggested that perceptual fluency contributes to receptivity to fake news. It is possible that persons who are commonly exposed to a specific type of pseudo-profound jargon are more likely to be receptive to this kind of bullshit.

Another great addition to this growing body of research is a paper by Čavojová, Secară, Jurkovič, and Šrol (2019), which presents conceptual replications of many of the key findings on receptivity to pseudo-profound bullshit in Slovakia and Romania. I often lament that psychology fails to take the problem of WEIRD samples and studies seriously, but these studies certainly do. By demonstrating that the research paradigm I have discussed here is meaningful and useful outside of the U.S. and Western Europe, they put this new, fascinating field on firmer ground.

Key papers


Bainbridge, T. F., Quinlan, J. A., Mar, R. A., & Smillie, L. D. (2019). Openness/Intellect and susceptibility to pseudo-profound bullshit: A replication and extension. European Journal of Personality, 33(1), 72-88.

Čavojová, V., Secară, E-C., Jurkovič, M., & Šrol, J. (2019). Reception and willingness to share pseudo-profound bullshit and their relation to other epistemically suspect beliefs and cognitive ability in Slovakia and Romania. Applied Cognitive Psychology, 33(2), 299-311.

Nilsson, A., Erlandsson, A., & Västfjäll, D. (2019). The complex relation between receptivity to pseudo-profound bullshit and political ideology. Personality and Social Psychology Bulletin.

Pennycook, G., Cheyne, J. A., Barr, N., Koehler, D. J., & Fugelsang, J. A. (2015). On the reception and detection of pseudo-profound bullshit. Judgment and Decision Making, 10(6), 549-563.

Pennycook, G. & Rand, D. G. (2019). Who falls for fake news? The roles of bullshit receptivity, overclaiming, familiarity, and analytic thinking. Journal of Personality.

Petrocelli, J. V. (2018). Antecedents of bullshitting. Journal of Experimental Social Psychology, 76, 249-258.

Walker, A. C., Turpin, M. H., Stolz, J. A., Fugelsang, J. A., & Koehler, D. J. (2019). Finding meaning in the clouds: Illusory pattern perception predicts receptivity to pseudo-profound bullshit. Judgment and Decision Making, 14(2), 109-119.

Meta-theoretical myths in psychological science

Philosophy and meta-theory Posted on Wed, November 28, 2018 02:05:00

There is a lot of talk of “meta science” in psychology these days. Meta science is essentially the scientific study of science itself—or, in other words, what has more traditionally been called “science studies”. The realization that psychological science (at least as indexed by articles published in high-prestige journals) is littered with questionable research practices, false positive results, and poorly justified conclusions has undoubtedly sparked an upsurge in this area.

The meta-scientific revolution in psychology is sorely needed. It is, however, really a meta-methodological revolution so far. It has done little to rectify the lack of rigorous meta-theoretical work in psychology, which dates back all the way to the behaviorist expulsion of philosophy from the field (for example, see this paper by Toulmin & Leary, 1985). Psychology is today, as philosopher of psychology André Kukla has remarked (in this book), perhaps more strongly empiricist than any scientific field has been at any point in history. Although many researchers have an extremely advanced knowledge of statistics and measurement, few have more than a superficial familiarity with contemporary philosophy of science, mind, language, and society. When psychologists discuss meta-theoretical issues, they usually do it without engaging with the relevant philosophical literature.

I will describe three meta-theoretical myths that I think are hurting theory and research in psychology. This is not a complete list. I might very well update it later.

1. Scientific explanation is equivalent to the identification of a causal mechanism

This is on all counts an extremely common assumption in psychological science. In this respect, psychological theorizing is remarkably discordant with contemporary philosophical discussions of the nature of scientific explanation. While there can be little doubt that mechanistic explanation is a legitimate form of explanation, the notion that all scientific explanations fall (or should fall) in this category has not been a mainstream view among philosophers for several decades. Even some of the once most vocal proponents of explanatory reductionism abandoned this stance long ago. One of today’s leading philosophers of science, Godfrey-Smith (2001, p. 197) goes as far as to assert (in this book) that “It is a mistake to think there is one basic relation that is the explanatory relation . . . and it is also a mistake to think that there are some definite two or three such relations. The alternative view is to recognize that the idea of explanation operates differently within different parts of science—and differently within the same part of science at different times.”

Psychology is particularly diverse in terms of levels of explanation, ranging from instincts and neurobiology to intentionality and cultural embeddedness. For example, functional explanations (in which the existence or success of something is explained in terms of its function) are very popular in cognitive psychology. In my own field, personality and social psychology, a lot of the explanations are implicitly intentional (reason-based) explanations (a mental event or behavior is explained in terms of beliefs, desires, goals, intentions, emotions, and other intentional states of a rational agent). The reasoning is often that it would be rational for people to act in a particular way (people should be inclined to do this or that because they have this or that belief, goal, value, emotion, etc.) and that this explains why they de facto tend to act in this way. Even though the researchers seldom recognize it themselves, this is not a mechanistic explanation. The cause of the action is described in intentional rather than mechanistic terms. Not all causal explanations are mechanistic explanations (a very famous essay by the philosopher Donald Davidson that first made this case can be found here).

It is of course possible to argue that these are not real scientific explanations—that the only real scientific explanations are mechanistic. The important thing to realize is that this is akin to saying that much, perhaps most, of psychological research really is not science. In fact, even the so-called causal mechanisms purportedly identified in psychological research are generally quite different from those identified in the natural sciences. Psychological research is usually predicated on a probabilistic, aggregate-level notion of causality (x causes y in the population if and only if x raises the probability of y in the population on average, ceteris paribus) and a notion of probabilistic, aggregate-level mediation as mechanistic explanation, while the natural sciences often employ a deterministic notion of causality.

2. Statistical techniques contain assumptions about ontology and causality

I do not know how widespread this myth really is, but I have personally encountered it many times. Certainly, statistical tests can be based on specific assumptions about the ontology (i.e., the nature of an entity or property) of the analyzed elements and the causal relations between them. But the idea that these assumptions would therefore be intrinsic to the statistical tests is fallacious. Statistical tests merely crunch numbers—that is all they do. They are predicated on statistical assumptions (e.g., regarding distributions, measurement levels, and covariation). Assumptions about ontology and causality stem wholly from the researcher who seeks to make inferences from statistical tests to theoretical claims. They are, ideally, based on theoretical reasoning and appropriate empirical evidence (or, less ideally, on taken-for-granted conventions and presuppositions).

One common version of this myth is the idea that techniques such as path analysis and structural equation modeling, which fit a structural model to the data, are based on the assumption that the predictor variables cause the outcome variables. This idea is also related to the notion that tests of mediation are inextricably bound up with the pursuit of mechanistic explanation from a reductionist perspective. These ideas are false. Structural models are merely complex models of the statistical relations between variables. Mediation analyses test whether there is an indirect statistical relation between two variables through their joint statistical relation to an intermediate variable. These tests yield valuable information about the change in one variable in light of changes in other variables, which is necessary but far from sufficient for making inferences about causality. The conflation of statistical techniques with "causal analysis" in the social sciences is based on historical contingencies (i.e., that this is what they were initially used for) rather than rational considerations (for example, see this paper by Denis & Legerski, 2006).
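The point that mediation analysis is purely statistical can be made concrete with a minimal sketch in Python, using hypothetical toy data. The standard indirect-effect estimate is just a product of two regression slopes computed from covariances; nothing in the arithmetic encodes any causal assumption.

```python
from statistics import mean

def slope(x, y):
    """OLS slope of y on x: covariance of x and y over variance of x."""
    mx, my = mean(x), mean(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

def residuals(x, y):
    """Residuals of y after regressing out x."""
    b, mx, my = slope(x, y), mean(x), mean(y)
    return [yi - (my + b * (xi - mx)) for xi, yi in zip(x, y)]

def indirect_effect(x, m, y):
    """a*b estimate of the indirect path x -> m -> y:
    a = slope of m on x; b = partial slope of y on m controlling for x,
    obtained here via the Frisch-Waugh residual trick."""
    a = slope(x, m)
    b = slope(residuals(x, m), residuals(x, y))
    return a * b

# Hypothetical toy data in which m tracks x and y tracks m:
x = [1, 2, 3, 4]
m = [3, 3, 5, 9]
y = [9, 9, 15, 27]
print(indirect_effect(x, m, y))  # 6.0
```

The same 6.0 would come out regardless of whether x causes m, m causes x, or a third variable drives both; the causal interpretation is supplied entirely by the researcher, not by the test.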

Yet another related idea is that statistical tests are based on presuppositions regarding the reality of the variables that are analyzed. It is true in a trivial sense that there is little point in performing a statistical test unless you assume that the analyzed variables have at least some reference to something out there in the world—or, in other words, that something is causing variation in scores on the variable. But the critical assumption is just that something is measured (much like science in general presupposes that there is something there to be studied). Assumptions about the ontology of what is measured are up to the researcher. For example, statistical analyses of “Big Five” trait data are consistent with a wide variety of assumptions regarding the ontology of the Big Five (e.g., that they are internal causal properties, behavioral regularities, abstract statistical patterns, instrumentalist fictions, socially constructed personae). Furthermore, the finding that scores on an instrument have (or do not have) desirable statistical properties does not tell us whether the constructs it purportedly measures are in some sense real or not. A simple realistic ontology is not necessary; nor is it usually reasonable, which brings us to the third myth.

3. Psychological constructs have a simple realistic ontology

At least some versions of this myth appear to be very common in psychological science. In its extreme form, it amounts to the idea that even abstract psychological constructs correspond to real internal properties under the skin, like organs, cells, or synapses, that are cut into the joints of nature in a determinate way. There are several fundamental problems here.

First, scientific descriptions in general are replete with indeterminacy. There are often multiple equally valid descriptions that are useful for different purposes. In biology, for example, there are several different notions of 'species' (morphological, genetic, phylogenetic, allopatric), with somewhat different extensions, that are used in different branches of the field. In chemistry, even the periodic table of elements—the paradigmatic example of a scientific taxonomy—may be less determinately "cut into the joints of nature" than popular opinion would suggest (see this paper by the philosopher of science John Dupré). In psychology, the indeterminacy is much greater still. The empirical bodies of data are often difficult to overview and assess, both the phenomena themselves and the process of measurement may be complicated, and intentional descriptions in particular have messy properties. Debates over whether, for example, personality traits, political proclivities, or emotions "really" are one-, two-, or n-dimensional are therefore, from a philosophical perspective, misguided (and, by the way, another common mistake is to confuse conceptual representations such as these, which can have referents but not truth values, with theories, which do have truth values!). What matters is whether the models are useful. Sometimes multiple models have legitimate uses, for example by describing a phenomenon with different levels of granularity and bandwidth. There are practical benefits in having the scientific community unite around a common model, but this is often not motivated by the genuine superiority of one model over its competitors.

Second, psychological constructs are commonly identified in terms of individual differences between persons. They are, in this sense, statistical idealizations or convenient fictions ("the average person") that are useful for describing between-person variation in a group. The differences exist between persons rather than within any particular person (as particularly James Lamiell has argued for decades, for example in this paper). It is of course possible to study psychological attributes that we have good reasons for ascribing to individuals in terms of between-person constructs. But the opposite chain of reasoning is fallacious; it is not possible to directly infer the existence or structure of an attribute at the level of the individual from models or constructs that usefully represent between-person variation at the level of the group aggregate (see, for example, this recent paper by Fisher, Medaglia, & Jeronimus, 2018). For example, it is misleading to describe personality traits such as the "Big Five" as internal causal properties, as has often been the case (see also this interesting paper by Simon Boag). This does not (contrary to what some critics have argued) necessarily imply that such between-person constructs are useless for describing the psychology of individuals, but only that a naïve realistic ontology of the phenomena that they identify is precluded.

Third, at least insofar as we employ intentional descriptions (and possibly other descriptions as well), portraying persons as basically rational agents that harbor beliefs, desires, emotions, intentions, and other intentional states, we are faced with an additional problem. On this level of description, a person's ontology is not just causally impacted by the external world; it is in part constituted by his or her relation to the world (this is often called the 'externalism' of the mental). This is because intentional states derive a part of their content from those aspects of the world they represent. The world affords both the raw materials that can be represented and acted upon and frameworks for how to represent and organize these raw materials. It is, in this sense, necessary for making different kinds of intentional thought and action possible. Therefore, at least some psychological attributes exist in the person's embedment in the world—fully understanding them requires an understanding of both the person's internal psychological properties and his or her world, including both personal circumstances of life and the collective systems of meaning that actions (both behavioral and mental) are embedded within (see, for example, this classical paper by Fay & Moon, 1977).

On top of this, we have the problem most thoroughly explicated by the philosopher of science Ian Hacking (in this book) that many psychological attributes are moving targets with an interactive ontology. This means that the labels we place on the attributes (e.g., that certain sexual orientations have been viewed as pathological, immoral, or forbidden) elicit reactions in those who have the attributes and responses from the surrounding social environment that, in turn, change the attributes.

Psychology is still WEIRD

Comments on new research Posted on Wed, November 14, 2018 15:20:26

Psychological science is fraught with problems. One of these problems that has recently attracted widespread attention is the proliferation of false positives, which is rooted in a combination of QRPs (questionable research practices), including “p-hacking” (choosing analytical options on the basis of whether they render significant results) and “HARKing” (hypothesizing after the results are known), and very low statistical power (i.e., too few participants). Overall, psychology has responded vigorously to this problem, although much remains to be done. Numerous reforms have been put in place to encourage open science practices and quality in research.

Another problem that has become widely recognized recently is that psychological research often makes inferences about human beings in general based on studies of a thin slice of humanity. As Henrich, Heine, & Norenzayan (2010) noted in a landmark paper, participants in psychological research are usually drawn from populations that are WEIRD (Western, Educated, Industrialized, Rich, Democratic), which are far from representative of mankind—in fact, they turn out to frequently be rather eccentric, even when it comes to basic cognitive, evolutionary, and social phenomena such as cooperation, reasoning styles, and visual perception (see also this interesting preprint by Schultz, Bahrami-Rad, Beauchamp, & Henrich that very thoroughly discusses the historical origins of WEIRD psychology).

The paper by Henrich and colleagues has racked up almost 5000 Google Scholar citations. Yet a recent paper by Rad, Martingano, and Ginges (2018) suggests that the impact of the Henrich et al. paper on actual research practices in psychology has been minimal, at least as indexed by research published in the high-prestige journal Psychological Science. Rad et al. find that researchers persist in relying on WEIRD samples and show little awareness of the WEIRD problem: “Perhaps the most disturbing aspect of our analysis was the lack of information given about the WEIRDness of samples, and the lack of consideration given to issues of cultural diversity in bounding the conclusions” (p. 11402).

Explaining the persistence of the WEIRD problem

How can it be that psychology has responded so vigorously to the problem with false positives, yet so inadequately to the WEIRD problem? Surely both problems are equally serious, are they not? I can think of at least three possible explanations.

1. First and foremost, the WEIRD problem is a manifestation of a much broader problem. It is a manifestation of the lasting influence of the marriage between logical positivism and behaviorism that shaped psychology for almost half a century. Psychological research was supposed to yield universal facts, just like physics, by employing “neutral”, culture-free materials and methods, a quantitative methodology, and hard-core empiricism. Given the vast historical impact of this ideal, it is no mystery that psychology remains both WEIRD and theoretically unsophisticated. This is simply the implicit paradigm under which psychology has operated for more than a century. While the problem with false positives is a problem signaling a crisis within this paradigm, the WEIRD problem is a meta-problem with the paradigm itself.

2. Second, it is possible that researchers do not realize the severity of the WEIRD problem because they are immersed in a homogeneous community of like-minded individuals with similar concerns, and their exposure to other intellectual cultures is limited. Here it is important to note that the WEIRD problem is not limited to participant selection. It is a problem of testing WEIRD theories on WEIRD samples with WEIRD methods. I personally often find psychological theories and concepts US-centric (e.g., the reification of “liberals” and “conservatives” in political psychology or the pre-occupation with the self and neglect of other aspects of the person’s worldview in personality psychology)—which is not surprising given that most of the leading researchers in psychology are from the United States—and I still live in the broader Western cultural sphere.

3. A third possible explanation for the persistence of the WEIRD problem is that there are many practical difficulties involved in conducting research in non-WEIRD contexts. A lot of things could go wrong. You need high-quality translations of research materials. You also need to obtain a reasonable degree of measurement invariance across languages and populations to be able to make meaningful comparisons between them. Even so, the results may not be at all what you expected. Perhaps the theories and instruments do not perform as they are supposed to do. Of course, on a purely scientific basis such findings would be extremely important. But perhaps researchers still find it is easier to just stick to studying well-known populations under well-known conditions in order to more easily find support for their hypotheses and publish their work.

Moving forward

The WEIRD problem needs to attain the same status as the false-positives problem in psychology. As Rad, Martingano, and Ginges (2018) suggest, authors need to do a much better job reporting sample characteristics, explicitly tying findings to populations, justifying the sampled population, discussing the generalizability of the findings, and investigating existing diversity in their samples. Journals and funders need to start encouraging these practices. Given all the work involved in conducting non-WEIRD research and the fierce competition over research funding and space in high-impact journals, we are unlikely to see any real change unless the inclusion of non-WEIRD research gives extra points.

When it comes to the problem with WEIRD perspectives, psychology might need to become more open to scholarship born out of non-WEIRD (particularly non-US) contexts. An increased openness to philosophical, meta-theoretical, historical, and anthropological scholarship in general, which is for the most part completely ignored in psychological science today, would be particularly helpful. That would help us both to address the WEIRD-problem and to make psychology a more theoretically sophisticated science.

The evolutionary foundations of worldviews

Comments on new research Posted on Wed, November 07, 2018 15:27:16

When taking a graduate course on evolutionary psychology a few years ago, I thought a bit about the potential evolutionary bases of worldviews. I was specifically interested in the opposition between humanistic and normativistic perspectives posited by Silvan Tomkins's Polarity Theory (more information here), which is encapsulated in the following quotation: "Is man the measure, an end in himself, an active, creative, thinking, desiring, loving force in nature? Or must man realize himself, attain his full stature only through struggle toward, participation in, conformity to a norm, a measure, an ideal essence basically prior to and independent of man?" (Tomkins, 1963).

Evolutionary bases of normativism and humanism

Drawing on Tomkins’ (1987) notion that “the major dynamic of ideological differentiation and stratification arises from perceived scarcity and the reliance upon violence to reduce such scarcity”, I suggested (in my term paper) that conditions of resource scarcity should have fostered a tough-minded climate where the strong and hostile could prove their worth by contributing to resource provision, and those weak or vulnerable were met with anger, contempt, and disgust. I suggested that humanism is to a greater extent rooted in the problem of forming stable alliances with other persons and groups, which requires interpersonal trust and empathy.

Because psychological traits co-evolve as entire “packages” in response to particular adaptive contexts, it is reasonable to predict that humanism and normativism co-vary with other psychological and physiological traits that also help to solve the respective adaptive problems. Normativism may have co-evolved with other traits that helped to solve the problem of resource acquisition, such as aggressiveness, physical strength and formidability, risk-taking, conscientiousness, persistence, and diligence—this should be true at least among men, who are thought of as the primary resource providers in an evolutionary context. Humanism may instead have co-evolved with traits such as empathy, altruism, agreeableness, and concern for the welfare of individuals, which are crucial for social bonding.

Egalitarianism and upper-body strength

Interestingly, a portion of the aforementioned hypotheses has subsequently been tested. The results of twelve studies conducted in various countries are reported in a recent paper by Michael Bang Petersen and Lasse Laustsen titled Upper-body strength and political egalitarianism: Twelve conceptual replications. Drawing on models of animal conflict behavior, Petersen and Laustsen suggest that attitudes related to resource conflict (i.e., egalitarianism) should be related to upper-body strength among males, which was crucial for the resolution of resource conflicts in our evolutionary past. They argue that "formidable individuals and their allies would be more likely to prevail in resource conflicts and needed to rely less on norms that enforced sharing and equality within or between groups in order to prosper".

The measures of upper-body strength employed include both self-report measures and objective measures of formidability. The one major limitation of these studies—and this is a major limitation—is that there was, as far as I understand it, no control for significant environmental factors such as time spent in the gym, physical exercise background, occupation, or use of performance-enhancing drugs (although other more indirectly relevant variables such as socioeconomic status and unemployment experiences were taken into consideration). Nevertheless, it is interesting to note that the authors find a clear relationship among men (but not women) between physical formidability and social dominance orientation (which encompasses egalitarianism) but not between formidability and right-wing authoritarianism.

Toward an evolutionary understanding of worldviews?

In order to establish that there is genetic covariation (not just covariation in general) between formidability and worldviews, future research needs to do a better job controlling for crucial environmental influences (recent studies have apparently started to do this). Behavioral genetics methods can also be employed to more directly assess genetic covariation. In addition to this, a broader range of worldview dimensions (e.g., normativism and humanism, which are correlated with authoritarianism and social dominance) and physiological predispositions could easily be taken into consideration. Let us hope that this is indeed what will happen over the next few years.
