Do we need to challenge thoughts in cognitive behavior therapy?

Richard J. Longmore a, Michael Worrell a,b


Cognitive behavior therapy (CBT) emphasizes the primacy of cognition in mediating psychological disorder. It aims to alleviate distress by modifying cognitive content and process, realigning thinking with reality. Recently, various authors have questioned the need for CBT therapists to use logico‐rational strategies to directly challenge maladaptive thoughts. Hayes [Hayes, S.C. (2004). Acceptance and commitment therapy and the new behavior therapies. In S.C. Hayes, V.M. Follette, & M.M. Linehan (Eds.), Mindfulness and acceptance: Expanding the cognitive behavioral tradition. (pp. 1–29). New York: Guilford] has identified three empirical anomalies in the research literature. Firstly, treatment component analyses have failed to show that cognitive interventions provide significant added value to the therapy. Secondly, CBT treatments have been associated with a rapid symptomatic improvement prior to the introduction of specific cognitive interventions. Thirdly, there is a paucity of data showing that changes in cognitive mediators instigate symptomatic change. This paper critically reviews the empirical literature that addresses these significant challenges to CBT. A comprehensive review of component studies finds little evidence that specific cognitive interventions significantly increase the effectiveness of the therapy. Although evidence for the early rapid response phenomenon is lacking, there is little empirical support for the role of cognitive change as causal in the symptomatic improvements achieved in CBT. These findings are discussed with reference to the key question: Do we need to challenge thoughts in CBT?


  1. Introduction
  2. CBT component analysis studies
    2.1. Component analysis studies for depression
      2.1.1. Behavioral activation
      2.1.2. Automatic thoughts
      2.1.3. Cognitive therapy
    2.2. Component analysis studies of anxiety disorders
    2.3. Summary of component analysis studies
  3. Behavioral experiments
  4. The rapid early response debate
  5. Cognitive mediation
  6. Conclusion
  References


In his 1995 paper, David A. Clark set out to define the principles that distinguish cognitive therapy from other therapeutic approaches:

“The therapist and patient collaborate to identify distorted cognitions, which are derived from maladaptive beliefs or assumptions. These cognitions and beliefs are subjected to logical analysis and empirical hypothesis‐testing which leads individuals to realign their thinking with reality.” (Clark, 1995; p. 155, emphases in original).

Clark, in common with other leading cognitive therapists including Aaron T. Beck (Beck, 1970; DeRubeis, Tang, & Beck, 2001), asserts that a fundamental postulate of the cognitive model of psychopathology is that cognitive change is central to treating psychological disorder, stating that “all therapies work by altering dysfunctional cognitions, either directly or indirectly” (p. 158). Hence, modification of maladaptive cognition is both the process by which cognitive therapy is effective, as well as the mechanism of change in psychotherapy more generally.

In line with this fundamental postulate, the authors of treatment manuals for cognitive behavior therapy (CBT) invariably describe techniques for modifying the meaning of thoughts (e.g., Beck, Rush, Shaw, & Emery, 1979; Beck, 1995). Hackmann (1997), in common with these authors, draws attention to specific techniques for challenging the meaning of dysfunctional thoughts on the basis of their internal logic: for example, evaluating the evidence for and against a thought using written thought records, eliciting more realistic thoughts and looking for evidence of distorted thinking.

As well as distinguishing CBT theory, this emphasis on working with dysfunctional cognitions defines what cognitive therapists actually do in their therapy sessions. Blagys and Hilsenroth (2002) conducted a review of the psychotherapy process literature and found that evaluating, challenging and modifying thoughts was one of the hallmarks that distinguished CBT practice from that of other therapies.

However, is the direct, explicit modification of maladaptive cognitions a necessary or sufficient intervention in CBT? Hayes (2004) identified three “empirical anomalies” in the CBT outcome literature. First, component analyses do not show that cognitive interventions provide added value to the therapy. Second, CBT treatment is often associated with a rapid, early improvement in symptoms that most likely occurs before the implementation of any distinctive cognitive techniques. Third, measured changes in cognitive mediators (the thoughts and beliefs held by the cognitive model to underpin disorder) do not seem to precede changes in symptoms. In the same vein, Orsillo, Roemer, Lerner, and Tull (2004) note that a problem in evaluating the mechanisms of change in cognitive behavior therapy is that CBT is “a general label for a variety of techniques, any of which may actually be the active ingredient of treatment” (p. 71).

The logical, “rationalist” approach to modifying cognition has also been subject to critical reappraisal at a theoretical level by researchers proposing that multi‐level cognitive architectures provide a more accurate description of human cognition. For example, Brewin, in his recent M.B. Shapiro Award Lecture (Lawson, 2005), questions the proposition that challenging thoughts leads to changes in feelings and behaviors. Drawing on the findings of cognitive science, he proposes that human cognition comprises multiple memory systems and knowledge stores, not all of which are open to introspection. Further, he suggests that these multiple systems give rise to multiple self‐representations. He concludes that therapy is better employed in a constructivist strengthening of more helpful representations, rather than the logico‐deductive challenging of unhelpful representations. Likewise, Teasdale (1997) contrasts “propositional” meanings (which are semantic and declarative) with “implicational” meanings (implicit, holistic meanings which reflect the “felt sense” of experience and are closely linked to emotion). For Teasdale (1997), implicational beliefs “do not have a specific truth value that can be assessed” (p. 152). Therefore, therapy should focus on changing the client’s “actual way of being” (p. 150) rather than aiming at logically challenging beliefs. For Teasdale, there is little value in exposing the logical flaws in the client’s thought processes: to do so is merely to focus on semantic, declarative meanings without engaging emotional processes.

These are provocative critiques. CBT is rightly proud of its tradition as an empirically based therapy. It seeks to modify and improve interventions on the basis of their demonstrable effectiveness. This paper investigates the “empirical anomalies” identified by Hayes (2004) in order to determine whether they call into question some of the fundamental tenets of the therapy. It will examine what the research literature tells us about the value of the cognitive change procedures that form an explicit part of the therapy, as well as the evidence pertaining to the implicit role of cognitive modification as the mediating mechanism of symptomatic improvement. To what extent does the effectiveness of CBT depend upon direct cognitive interventions? Does therapeutic change rely on cognitive change? In short, do we need to challenge thoughts in CBT?

CBT component analysis studies

Studies which attempt to split the therapeutic elements comprising CBT and deliver separate components of the therapy either to different groups (between‐subjects designs) or to the same participants in sequence (within‐subjects designs) provide useful evidence about the effectiveness of specific CBT interventions. A literature search for such studies was undertaken using the PsycINFO and Medline databases, limited to publications in English since 1980. References from studies elicited by this method were also followed up. A summary of the relevant CBT component analysis studies is given in Table 1.

Component analysis studies for depression

Perhaps the most significant component analysis study to have examined the active elements of CBT is that of Jacobson et al. (1996). Indeed, this particular paper is cited by Hayes (2004) to support his conclusion that the cognitive components of CBT do not actually add to the effectiveness of the treatment.

Table 1

CBT component analysis studies

AT = Automatic Thoughts; BA = Behavioral Activation; CBT = Cognitive Behavior Therapy; CT = Cognitive Therapy; ERP = Exposure and Response Prevention; IE = Imaginal Exposure; RET = Rational Emotive Therapy; SCD = Self Control Desensitisation; SIT = Self‐Instructional Training.

The participants in the Jacobson et al. (1996) study comprised 152 people who met the Diagnostic and Statistical Manual – Third Edition (Revised) criteria for major depression (DSM‐IIIR; American Psychiatric Association, 1987). They were randomly allocated to one of three treatment conditions: Behavioral Activation (BA), Automatic Thoughts (AT), and Cognitive Therapy (CT). Treatment protocols were produced for each of the conditions. The main elements of each treatment are described as follows:

Behavioral activation

  • Monitoring daily activities and assessing the pleasure and mastery involved.
  • Assigning new daily activities to increase pleasure and mastery.
  • Imaginal rehearsal of activities before they are undertaken.
  • Problem solving any obstacles to undertaking new activities.
  • Interventions to address social skills deficits such as assertiveness and communication training.

Automatic thoughts

This treatment condition comprised the above elements of the activation condition, plus techniques designed to assess and modify negative automatic thoughts as follows:

  • Identifying automatic thoughts arising in‐
  • Use of Daily Thought Records.
  • Examining evidence for and against automatic thoughts.
  • Examining attributional biases in the way participants assess their successes and failures.
  • Homework assignments in which participants assess the validity of their negative interpretations.

Cognitive therapy

This condition comprised the elements of the above two conditions, plus techniques designed to modify dysfunctional assumptions and core beliefs as follows:

  • Using the ‘downward arrow’ technique and discussion to identify core beliefs.
  • Identifying the advantages and disadvantages of core beliefs and assumptions; identifying alternatives.
  • Homework exploring the operation of core beliefs and assumptions and applying alternative beliefs.

Treatment comprised 20 sessions. For all three conditions, treatment was provided by four trained cognitive therapists with a range of 8 to 12 years post‐qualification experience, all of whom had experience in at least one previous clinical trial for cognitive therapy. Indeed, Jacobson and Gortner (2000) note that the therapists in the trial all had an allegiance to CT. Protocol adherence in the treatment conditions (for example, ensuring that AT techniques did not ‘leak’ into the BA treatment condition) was monitored by randomly selecting tapes of 20% of the sessions and blind rating them for their use of all the techniques from the three conditions. Adherence proved to be high: for example, in the BA condition, BA interventions were rated as ten times more prevalent than AT or CT interventions combined.
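
The adherence check described above (drawing a random 20% of session tapes for blind rating) can be illustrated with a minimal sketch. The tape identifiers, the function name and the seed below are hypothetical and are not taken from Jacobson et al. (1996); the code only shows one plausible way of drawing such a sample.

```python
import random


def sample_sessions_for_rating(session_ids, fraction=0.20, seed=None):
    """Draw a simple random sample of session tapes for adherence rating.

    Hypothetical helper: the 20% fraction comes from the text, everything
    else (names, seeding, rounding rule) is an illustrative assumption.
    """
    rng = random.Random(seed)            # local generator, reproducible if seeded
    k = round(len(session_ids) * fraction)
    return sorted(rng.sample(session_ids, k))


# Hypothetical pool of 100 taped sessions, identified without any condition
# label so that raters remain blind to the treatment arm.
tapes = [f"tape_{i:03d}" for i in range(1, 101)]
selected = sample_sessions_for_rating(tapes, seed=0)
print(len(selected), "tapes selected for blind rating")
```

Keeping condition labels out of the identifiers is one simple way to preserve rater blindness; how blinding was actually achieved in the trial is not specified in the text.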

Outcome, as measured by the Beck Depression Inventory (BDI; Beck et al., 1979) and the Hamilton Rating Scale for Depression (HRSD; Hamilton, 1967), showed no significant differences between the conditions, either at the conclusion of treatment or at 6‐month follow‐up. For example, in the BA condition (n = 56) mean BDI pre‐treatment was 29.3, which had reduced to 9.1 post‐treatment. In the AT condition (n = 43), mean BDI was 29.1 pre‐treatment, compared with 9.3 post‐treatment, while for the CT condition (n = 50), the comparable figures were 29.8 and 10.3, respectively. Gortner, Gollan, Jacobson and Dobson (1998) provide two‐year follow‐up data for the study’s participants, which show no significant differences between treatment conditions in terms of relapse rates. Therefore, it was concluded that behavioral activation alone was as effective in treating depression as BA combined with cognitive interventions.
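
As an illustrative aid, the group means quoted above can be converted into raw and percentage reductions in BDI score. The sketch below uses only the pre‐ and post‐treatment means reported in the text; the dictionary layout and function name are hypothetical, and the calculation says nothing about statistical significance, which depends on variance data not reproduced here.

```python
# Mean BDI scores as quoted in the text for Jacobson et al. (1996).
bdi_means = {
    "BA": {"pre": 29.3, "post": 9.1},
    "AT": {"pre": 29.1, "post": 9.3},
    "CT": {"pre": 29.8, "post": 10.3},
}


def bdi_change(pre, post):
    """Return (absolute drop, percentage drop) in mean BDI, rounded to 1 dp."""
    drop = pre - post
    return round(drop, 1), round(100 * drop / pre, 1)


summary = {cond: bdi_change(m["pre"], m["post"]) for cond, m in bdi_means.items()}

for cond, (drop, pct) in summary.items():
    print(f"{cond}: mean BDI fell by {drop} points ({pct}% of baseline)")
```

On these figures the three conditions show reductions of roughly 20 BDI points each (about 65% to 69% of baseline), which is consistent with the reported absence of significant between‐group differences.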

Dobson and Khatri (2000) consider that the Jacobson et al. (1996) study has potentially serious implications for both the theory and practice of CBT for depression. In practical terms, behavioral activation is simpler and more cost effective, both in the training of therapists and delivery to patients. Further, they suggest that the efficacy of behavioral interventions in the trial must lead to doubt regarding the significance of cognitive factors in the etiology and maintenance of depression.

However, the findings of the Jacobson et al. (1996) study need to be qualified in several respects. Despite being a component study, the treatment conditions remained fairly broad and complex. It could be argued that the BA condition contained cognitive elements in its protocol: for example, the imaginal rehearsal of activities and problem solving of obstacles may entail the modification of dysfunctional thoughts and assumptions. In addition, it should be borne in mind that the AT and CT conditions were not less effective than BA, but rather equally effective – a finding that would not be expected if cognition played no role in the maintenance of depression.

The findings of Jacobson et al. run so radically counter to the CBT paradigm that Jacobson and Gortner (2000) report they gave rise to questions in the research community regarding the quality of the cognitive therapy provided in the study. However, they point out that the recovery rates in the study (about 60% of participants in each condition no longer meeting diagnostic criteria at termination) compare favorably with those of previous outcome studies of CBT for depression. Therefore, the case at issue is not that CT performed poorly, but rather that BA performed so well. Despite this, it was generally recognized that such potentially important results require replication. Therefore a further research program, the University of Washington Treatment for Depression Study, was instigated.

This new study has been running since 1997. The results have not yet been formally subjected to peer review and journal publication. However, some details of the design are given in Jacobson and Gortner (2000). Rather than being conceptualized as component research comparing the elements of CBT, the University of Washington study conceives of behavioral activation as a separate treatment in its own right. The aim of BA for depression is to increase access to environmental “anti‐depressant reinforcers.” Jacobson and Gortner give examples of what this might mean in practice: “People who have lost their jobs are aided in finding employment. Those who have lost their friends or lovers are aided in finding new people. Those with relationships that have gone sour are aided in fixing them.” (Jacobson & Gortner, 2000, p. 113). BA is compared in the new study with CT, pharmacotherapy and pill placebo conditions. According to Jacobson and Gortner, the three active treatment conditions are being administered and supervised by advocates of each approach who are experts in their field. Jacobson and Gortner state the intention to enter 500 participants into the trial.

Members of the study’s research team have presented some of the acute phase results at conferences (Dimidjian et al., 2003). These findings are also alluded to in Martell, Addis and Dimidjian (2004). Here, it is stated that BA proved as effective as antidepressant medication, and that both produced superior outcomes to cognitive therapy, which was no more effective than the pill placebo condition (Martell et al., 2004, p. 155). Given that the University of Washington study purports to be the largest outpatient therapy trial for depression yet undertaken, these would appear to be perplexing results for the proponents of cognitive therapy as a treatment for depression. However, putting aside the comparison with BA, the Washington results would seem to contradict many previous studies which have shown CT to be as effective as pharmacotherapy as a treatment for moderate depression (e.g., Hollon et al., 1992; Murphy, Simons, Wetzel, & Lustman, 1984) and severe depression (DeRubeis, Gelfand, Tang, & Simons, 1999; Hollon & DeRubeis, 2004). Therefore, it will be necessary to wait for the publication of the study’s data before its full implications can be assessed.

Two studies have sought to compare the components of CBT for depression by delivering them not to separate groups of participants, but instead to the same participants in separate phases of treatment. In the first such study, Zettle and Hayes (1987) compared cognitive restructuring with behavioral homework for 12 people with depression. Time series analyses revealed no superiority for either element. Further, the order in which the components were introduced did not influence effectiveness. Unfortunately, these results are difficult to interpret because of Zettle and Hayes’ decision to incorporate an initial treatment component – “distancing,” which they describe as helping participants to recognize that depressogenic beliefs are hypotheses rather than facts, through the use of similes and reattribution techniques. For all participants, distancing came before the behavioral and cognitive components were introduced. Therefore, the “distancing” intervention (which is ostensibly cognitive) may have blunted the impact of the study’s overtly cognitive phase of treatment. The second study, reported in Jarrett and Nelson (1987), compared the effectiveness of logically analysing dysfunctional thoughts with that of behavioral experiments in 37 depressed subjects in group treatment across 12 sessions. Once again, there was no difference in outcome between the two classes of intervention.

Component analysis studies of anxiety disorders

In the light of Jacobson et al.’s findings, Dobson and Khatri (2000) called for component analysis research into the active elements of CBT for disorders other than depression. In particular, they noted that such research should address whether the cognitive components of CBT contributed to the efficacy of treatment for the anxiety disorders, above and beyond exposure to anxiety‐provoking stimuli. There is a scattering of studies in the research literature that allow such a comparison. These will now be considered.

Borkovec and Costello (1993) had previously found that a CBT package comprising applied relaxation, self‐control desensitisation and brief cognitive therapy proved an effective treatment for generalized anxiety disorder (GAD). Consequently, Borkovec, Newman, Pincus and Lytle (2002) set out to find the active component of this package by comparing cognitive therapy (CT) with self‐control desensitisation (SCD) and a condition in which the two elements were combined (CBT). What is illuminating about this study is that the component treatment conditions were relatively “pure.” The CT condition examined worry thoughts through logical analysis, examination of evidence and probability estimates, labelling logical errors, decatastrophizing and identifying alternative thoughts and beliefs. The SCD condition comprised progressive relaxation training and the imaginal rehearsal of anxiety‐provoking situations in order to establish cued relaxation. For Borkovec et al., the CT condition targeted the cognitive response system, while the SCD condition aimed at ameliorating physiological responsiveness.

Sixty‐nine participants meeting diagnostic criteria for GAD received 14 sessions of treatment. At termination and at a 2‐year follow‐up, all the treatment conditions had proven to be equally effective on a comprehensive battery of anxiety measures. From this result, Borkovec et al. concluded that “targeting some response processes in therapy for a sufficiently long period of time might therefore affect all of the other processes involved in the maintenance of anxiety” (p. 296). Therefore, this study provides evidence that challenging thoughts is potentially effective for GAD, but no more effective than a desensitisation procedure, and certainly not a necessary element of treatment. Corollary evidence to this effect is provided by Öst and Breitholtz (2000), who compared applied relaxation to cognitive therapy for GAD (the CT condition incorporated behavioral experiments – the study was not intended as a component analysis comparing cognitive with behavioral elements of treatment) and found no difference between outcomes on any measure for their 36 participants, either at the termination of treatment or at a 1‐year follow‐up.

Two component studies of posttraumatic stress disorder (PTSD) have sought to separate exposure from cognitive interventions in order to specify the active element of treatment. Tarrier et al. (1999) compared the effectiveness of imaginal exposure and cognitive therapy in a between‐subjects design with 72 participants who had been diagnosed on the Clinician Administered PTSD Scale (CAPS; Blake et al., 1990). The cognitive therapy condition involved looking at beliefs regarding the meaning of the trauma event and the attributions made following it. Therapists in this condition avoided discussion of the trauma itself in order to prevent inadvertent exposure, although they had access to trauma narratives from the assessment phase of the study. The aim of the CT condition was to focus on the interpretation of the event rather than the event itself. In contrast, the imaginal exposure condition involved describing the trauma event as if it were happening, while attempting to visualise it. This condition precluded interventions aimed at cognitive restructuring. Participants received 16 sessions of therapy, and there were no significant differences on outcome measures between the two conditions at the termination of treatment or 6‐month follow‐up, with one exception: although both conditions provided equal levels of clinical improvement, a significantly larger minority of participants showed a worsening of symptoms at the termination of imaginal exposure than at the termination of CT (9 participants in the IE condition, compared with 3 in the CT condition). This difference was no longer evident by the 6‐month follow‐up.

Lovell et al. (2001) also compared imaginal exposure with cognitive restructuring for PTSD, but incorporated a treatment condition that combined these elements. They hypothesised that the different treatment conditions might impact differently on the specific symptom clusters of PTSD, with exposure having greatest impact on anxiety‐based symptoms such as re‐experiencing and avoidance, while cognitive restructuring would produce greater change in appraisal‐related features such as numbing, detachment and guilt. At termination, there were no significant differences between the treatment conditions, either in terms of overall outcome or differential impact on specific symptoms.

Two studies have compared the effectiveness of purely cognitive interventions with that of exposure alone in the treatment of social phobia. Emmelkamp, Mersch, Vissia and van der Helm (1985) evaluated rational emotive therapy (RET) against self‐instructional training and exposure in vivo for 34 participants in a between‐subjects design. After six 2.5‐h group treatment sessions, there were no significant differences between the RET and exposure groups on a range of outcome measures. Mattick, Peters and Clarke (1989) compared exposure in vivo with cognitive restructuring and a condition that combined the two treatments. Treatments were administered in a group format. Again, there were no significant differences between the treatment conditions in terms of endstate functioning or clinical improvement, either at the termination of treatment or at 3‐month follow‐up, although the group receiving cognitive restructuring alone showed significantly greater gains on two subscales of the Fear Questionnaire (Marks & Mathews, 1979).

More commonly, studies of social phobia have compared full CBT treatment protocols with exposure alone. Three have reported superior outcomes for CBT. Firstly, Butler, Cullington, Munby, Amies and Gelder (1984) compared self‐directed exposure with exposure plus anxiety management (AM). The AM package comprised relaxation and distraction techniques along with rational self‐talk. Addition of AM produced a significantly superior outcome on measures of social avoidance. However, the heterogeneity of the AM treatment, along with the extreme brevity of the cognitive component (participants received just 2.5 h of AM training in total), makes it impossible to identify the active element of treatment. Secondly, Mattick and Peters (1988) compared group CBT with exposure in vivo. CBT participants performed significantly better on a behavioral test involving exposure to situations derived from individually constructed fear hierarchies. Further, 48% of the exposure group reported ongoing avoidance of phobic situations at a 3‐month follow‐up, compared with 14% of the CBT group. However, using precisely the same treatment protocols, therapists and measures, the study by Mattick et al. (1989) failed to replicate these findings. Finally, Hofmann (2004) compared group CBT and exposure in vivo, finding that the CBT group scored significantly lower on the Social Phobia and Anxiety Inventory (SPAI; Turner, Beidel, Dancu, & Stanley, 1989) at a 6‐month follow‐up. However, Hofmann notes the limited scope of the exposure exercises in his study: “Although participants feared numerous social situations, this intervention focused primarily on the patients’ public speaking anxiety” (Hofmann, 2004, p. 393). Sub‐optimal exposure, delivered in a group format, may have prevented the generalization of gains to other phobic situations. For the CBT group, however, participants considered the full range of their individual fears, and it is possible that this may have allowed a more creative, self‐directed use of exposure (or lessening of avoidance) after the termination of active treatment.

Recognizing the methodological shortcomings of many studies in this area, Feske and Chambless (1995) conducted a meta‐analysis of data from fifteen studies of CBT and exposure treatments for social phobia. They concluded that “exposure with and without cognitive modification are equally effective in the treatment of social phobia” (Feske & Chambless, 1995, p. 712).

Three treatment studies for obsessive compulsive disorder (OCD) allow for a comparison of the effectiveness of behavioral and cognitive interventions. For these studies, the behavioral intervention in each case has been exposure and response prevention (ERP), which involves exposure to the obsessional thought and resisting the performance of the accompanying compulsive ritual. van Oppen et al. (1995) followed a treatment protocol that involved participants receiving either “pure” cognitive therapy (without behavioral experimentation) or ERP for six sessions, before behavioral experiments were added to the CT condition. Measures taken at the sixth session showed no significant differences between the two conditions. Likewise, both Emmelkamp and Beens (1991) and de Haan et al. (1997) compared purely cognitive interventions with ERP for randomly assigned OCD sufferers. For both studies, there was no significant difference in outcome between the two conditions.

Studies that compare full CBT packages with ERP alone are also illuminating with regard to whether adding cognitive interventions improves the effectiveness of ERP. In the van Oppen et al. (1995) study described above, behavioral experiments involving exposure were added to the pure CT condition after 6 weeks of treatment, and this produced a slight superiority for the full CBT package as compared with the pure ERP condition at the termination of treatment. Vogel, Stiles, and Götestam (2004) compared ERP combined with relaxation training (as an attention placebo control) with ERP combined with CT. They found no significant difference in the treatment outcome for those who completed treatment, but they did find a significantly lower dropout rate for those receiving the ERP plus CT combination. In another study, McLean et al. (2001) compared CBT to ERP in the group treatment of OCD. They found that ERP produced significantly superior outcomes: for example, at 3‐month follow‐up, 13% of patients treated with CBT were recovered, compared with 45% of patients treated with ERP alone. More recently, the same research group has compared CBT to ERP delivered on an individual basis, finding no significant difference in outcome (Whittal, Thordarson, & McLean, 2005). Clark (2004) concludes “At this time there is no evidence that adding cognitive interventions to ERP […] is clinically more effective than ERP alone for a heterogeneous sample of patients with OCD.” (p. 275).

Summary of component analysis studies

Generally, it can be noted that there is only a limited number of component analysis studies that seek to specify the active elements of CBT. Further, there is a virtual absence of multiple baseline studies comparing the effectiveness of behavioral and cognitive interventions. This is somewhat surprising given the volume of studies that exists on the effectiveness of CBT as a whole, the effort expended on specifying the particular cognitive distortions that may be implicated in particular disorders, and the emphasis that the cognitive model of psychopathology places on the role of cognition in mediating distress. From the component analysis studies that have been examined here, however, it is possible to conclude that, for a range of clinical problems, specifically cognitive interventions do not produce superior outcomes to the behavioral components of CBT. In studies examining depression, behavioral activation alone is as effective as behavioral activation combined with cognitive interventions. In studies examining the anxiety disorders, exposure‐based interventions are of comparable efficacy to techniques that focus on examining thoughts.

What conclusions are possible in light of these empirical findings? First, it could be argued that there simply have not been a sufficient number of studies of sufficient quality undertaken to allow the true potency of cognitive interventions to be demonstrated. On this view, as more is learned regarding the specific cognitive distortions and cognitive processes underlying specific disorders, it will become increasingly possible to measure and target these cognitions, improving the overall efficacy of the therapy. However, the obvious counterargument is that until the added value of cognitive interventions is empirically demonstrated, it would be wise to maintain a scientific neutrality: “we have no evidence it works better,” is preferable to “we know it works better, but haven’t quite managed to show it yet.” Second, it could be argued that the relentless equality of outcome between conditions in the above studies provides evidence that common, non‐specific therapy factors underpin clinical improvement. Third, reiterating the conclusion drawn by Borkovec et al. (2002), it could be hypothesised that the different components of CBT examined in the above studies work on different systems. According to Lang’s ‘three systems’ model (Lang, 1985, 1988), there are physiological, behavioral, and cognitive aspects to psychological problems. On this account, producing change in one system will induce change in the other two. For example, for depression, behavioral activation works directly on the behavioral system, but produces change in the cognitive and physiological systems; likewise, cognitive interventions produce change in the cognitive system, but also lead to corresponding changes in the behavioral and physiological systems. If this were the case, then it would not be surprising that component studies show equal outcomes for behavioral and cognitive interventions.

Behavioral experiments

Looking merely at component studies, it would be possible to conclude that we do not need to challenge thoughts in CBT: that explicitly cognitive interventions provide little or no added value to behavioral ones. However, it could be argued that cognitive interventions work best precisely when they are combined with behavioral aspects of the therapy. In other words, the CBT whole may be distinctly greater than the sum of its parts. Nowhere might this synergy better be illustrated than in the use of behavioral experiments.

Behavioral experiments (BEs) are planned, experiential activities designed to test the validity of beliefs. For Bennett‐Levy et al. (2004) they are a form of collaborative empiricism aimed at achieving cognitive change: “BEs in cognitive therapy are primarily a means of checking the validity of thoughts, perceptions, and beliefs, and/or constructing new operating principles and beliefs.” (p. 11). They contrast this cognitive change mechanism with behavior therapy’s emphasis on exposure and habituation. Indeed, for Bennett‐Levy et al., behavioral experiments confirm the cognitive primacy hypothesis.

Bennett‐Levy (2003) compared the effectiveness of automatic thought records (ATRs) with BEs for three groups of student participants undertaking cognitive therapy training courses (N = 27), who practiced these techniques on themselves. Participants rated the effectiveness of each technique on a scale from 1 to 10, where 1 = no effect and 10 = very strong effect, for three outcomes: a) increased awareness of internal processes, b) belief change, and c) behavior change. Bennett‐Levy also gathered qualitative data on how participants experienced the techniques from written reflections, interviews and group reflection. It was found that there was no difference between ATRs and BEs in terms of increased awareness of internal processes. However, in terms of belief change and behavior change, BEs were rated as significantly more effective than ATRs. A grounded theory analysis of the qualitative data revealed that participants felt BEs involved greater emotion, anxiety and impact on “emotional belief.” Bennett‐Levy concludes that this is in line with Teasdale’s (1997) Interacting Cognitive Subsystems (ICS) model. Namely, ATRs operate on the propositional code, with its emphasis on declarative truth value, while BEs impact more on the holistic implicational code, with its emphasis on “heart level,” emotional belief.

However, there are problems with Bennett‐Levy’s (2003) study. Most notably, his participants were students studying cognitive therapy. They were a self‐selecting sample with a presumed allegiance to cognitive therapy and knowledge of its theory and methods. Further, they were not a clinical sample. Extrapolating the findings to clinical groups is difficult, because such groups may struggle with a higher order of dysfunctional beliefs and patterns of avoidance.

Bennett‐Levy aligns his explanation for the effectiveness of BEs with Teasdale’s (1997) ICS model. However, Teasdale’s views differ from those of Bennett‐Levy et al. (2004) in terms of the role afforded to cognitive change:

“it may not be sufficient simply to gather data about experience and to evaluate beliefs against this evidence. Rather, it may be necessary to arrange for actual experiences in which new or modified models are created” (Teasdale, 1997, p. 150; emphasis in original).

For Teasdale, the value of BEs is that they represent “enactive procedures” that activate different schematic models. They change the patient’s “actual way of being” (p. 150) rather than providing evidence which, rationally considered, leads to belief change.

It remains a moot point whether behavioral experiments need to challenge thoughts in order to be effective. It is equally plausible that cognitive change follows behavioral change, rather than driving it. Studies testing these alternative hypotheses could potentially compare the effectiveness of full BEs (with their emphasis on identifying and challenging beliefs), with targeted behavioral exposure. In the absence of such research, it is open to question whether behavioral experiments gain anything from prioritising cognitive change.

The rapid early response debate

As noted in the introduction, one of the “empirical anomalies” identified by Hayes (2004) as casting doubt on the need for cognitive interventions in CBT is that “Clinical improvement in CBT often occurs before the presumptively key features have been adequately implemented” (Hayes, 2004, p. 4). This phenomenon has been termed “rapid early response”.

Ilardi and Craighead (1994) reviewed data from eight studies of the efficacy of CBT for depression. They found that in 7 out of the 8 studies, between 60% and 70% of the total improvement patients experienced across the course of therapy happened within the first 4 weeks. They hypothesized that initial improvement was unlikely to be explained by cognitive modification techniques, and concluded instead that non‐specific factors mediated the majority of the improvement seen in CBT.

However, Tang and DeRubeis (1999) re‐examined the data reviewed by Ilardi and Craighead. They found that in 7 out of the 8 studies, the therapy actually comprised two sessions per week within the initial 4‐week period. Therefore, patients were receiving a dose of eight sessions of therapy within that time – an amount easily sufficient to allow the introduction of cognitive modification techniques and see resulting improvements. In addition, Tang and DeRubeis separated out the data for those who responded well to treatment and those who did not. This was done for two of the original eight studies, where full data were available. Responders were defined as those whose BDI score at the completion of treatment was less than 10; non‐responders were those with a BDI of 10 or more. Separating the data in this way demonstrated that, within the first 4 weeks of treatment, in one study responders experienced 56% of their total improvement after 53% of their therapy sessions, and in the other study 43% of their total improvement after 42% of their therapy sessions. On the other hand, non‐responders experienced almost 100% of their total improvement in the first 4 weeks. Thus, those for whom the therapy worked well experienced a steady improvement across the course of treatment, while those for whom it did not work so well experienced initial gains which then levelled off.
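The dose argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, the two (improvement, sessions) pairs are the responder percentages quoted above, and the cutoff of 10 is the responder definition quoted above. It is not a reconstruction of the analysis Tang and DeRubeis actually ran.

```python
def is_responder(final_bdi, cutoff=10):
    """Responder rule quoted above: end-of-treatment BDI below 10."""
    return final_bdi < cutoff


def improvement_per_dose(pct_improvement, pct_sessions):
    """Share of total improvement divided by share of sessions received.

    A value near 1.0 means improvement is roughly proportional to dose;
    a value well above 1.0 would indicate a disproportionately early response.
    """
    return pct_improvement / pct_sessions


# Responder figures for the two studies, as reported in the text above:
# (percent of total improvement, percent of therapy sessions received).
responders = {"study_1": (56, 53), "study_2": (43, 42)}

ratios = {name: round(improvement_per_dose(imp, dose), 2)
          for name, (imp, dose) in responders.items()}
print(ratios)  # {'study_1': 1.06, 'study_2': 1.02}
```

Both ratios sit close to 1, which is the sense in which responders' early improvement is roughly proportionate to the number of sessions they had received.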

Ilardi and Craighead (1999) replied to the criticisms of Tang and DeRubeis by noting that in their original study, both responders and non‐responders still showed a rapid early response to treatment. Further, they maintain that even though there were two treatment sessions per week, 4 weeks of treatment still constitutes too short a time frame for patients to absorb and respond to cognitive modification techniques. Wilson (1999) asserts that early rapid response may be a general phenomenon, applying to conditions other than depression. For example, Wilson et al. (1999) found that by session five of treatment, people with bulimia nervosa showed a 76% improvement in their frequency of binge eating.

On balance, evidence for the early rapid response phenomenon is not compelling. Tang and DeRubeis note that eight sessions is sufficient to allow for cognitive modification techniques to be introduced. Indeed, Fennell and Teasdale (1987) state that introducing patients to the cognitive rationale and providing initial homework tasks may be sufficient to produce substantial improvement in some cases. When the number of sessions is accounted for instead of the number of weeks, then the early response to therapy is not greatly disproportionate to the dose of sessions received. Of course, the lack of evidence for an early rapid response in cognitive therapy is not in itself evidence for the effectiveness of cognitive procedures.

Cognitive mediation

The final “empirical anomaly” in the CBT research literature alluded to by Hayes (2004) is that “Changes in cognitive mediators often fail to explain the impact of CBT” (p. 4). Indeed, Ilardi and Craighead (1999) note that “the status of targeted cognitive modification as the sine qua non of patient improvement in CBT remains in doubt” (p. 298). Here, it is necessary to return to the distinction between cognitive intervention as procedure in the practice of CBT, and cognitive change as the mechanism underpinning symptom improvement in the CBT model of psychopathology. According to the CBT model, cognitive interventions work through changing underlying cognitive structures. However, unless their effects were demonstrated to be mediated by changes in such underlying structures, there would remain the possibility that they work through other means. Likewise, behavioral interventions, although not directly targeting dysfunctional beliefs, could theoretically work by altering cognitions. Hence, the status of cognition as a mediating mechanism is a matter for separate empirical enquiry from the effectiveness of the CBT interventions aimed at changing these cognitions.

Comparison studies show that cognitive change can be a product of other therapeutic treatments, apart from CBT. Not only can cognitive change occur as a result of other treatments, but also the degree of change can be equal to that produced by CBT. For example, McManus, Clark, and Hackmann (2000) studied outcomes for 23 participants with social phobia. They received either cognitive therapy or pharmacotherapy with instructions for self‐administered exposure. Belief change for the key cognitions thought to underlie social phobia was measured by the Social Probability and Cost Questionnaire (SPCQ; modified from Foa, Franklin, Perry, & Herbert, 1996). It was found that by the termination of treatment, there were equally significant reductions in participants’ estimations of the probability and cost of negative social events in both treatment conditions. Likewise, Simons, Garfield and Murphy (1984) studied both symptom and cognitive change for 28 depressed outpatients who received either cognitive therapy or pharmacotherapy. Cognitive measures included the Dysfunctional Attitudes Scale (DAS; Weissman & Beck, 1979) and the Automatic Thoughts Questionnaire (ATQ; Hollon & Kendall, 1980). Both CT and pharmacotherapy produced similar levels of change on the cognitive as well as the symptom measures. As a result, Simons et al. concluded that cognitive change was a part of the improvement seen in treatment, rather than the primary cause of improvement. These findings are reinforced by a meta‐analysis conducted by Oei and Free (1995) that concluded that cognitive change is no greater as a result of CBT than as a result of drug treatment or other therapies.

Hollon and DeRubeis (2004) acknowledge these findings. However, they note that “the relevant question is not whether change in cognition is specific to cognitive therapy, but whether the pattern of change over time is consistent with causal agency.” (p. 56). They cite a study by DeRubeis et al. (1990) that tracked symptomatic and cognitive changes during the treatment of depression with either cognitive therapy or pharmacotherapy. In this study, changes in cognition for those treated with CT predicted changes in symptoms, while this was not the case for those who received pharmacotherapy. Hollon and DeRubeis (2004) state that this finding supports the conclusion that cognitive change was the mechanism of change in the CT condition, but the consequence of change for those receiving pharmacotherapy. Tang and DeRubeis (1999) have also examined the phenomenon of the “sudden gain” that some patients make during cognitive therapy for depression. By examining individual data for three patients across two studies, they found that sudden decreases in depressive symptoms (as measured by the BDI) were often preceded in the previous therapy session by patients verbalizing a change in at least one important belief. Reviewing this evidence as a whole, Hollon and DeRubeis (2004) conclude that “there are indications that theoretically specific ingredients drive the change in depression by virtue of inducing change in existing beliefs and information‐processing strategies” (p. 57).

However, is this conclusion warranted? In the original DeRubeis et al. (1990) study, four measures of cognition were used: the Automatic Thoughts Questionnaire (ATQ), the Hopelessness Scale (HS; Beck, Weissman, Lester & Trexler, 1974), the Dysfunctional Attitudes Scale (DAS) and the Attributional Style Questionnaire (ASQ; Seligman, Abramson, Semmel, & von Baeyer, 1979). DeRubeis et al. compared participants who completed a 12‐week course of CT (including data from participants who received CT plus imipramine) (n = 32) with participants who had received 12 weeks of pharmacotherapy alone (n = 32). Measures of cognition and depression were taken pretreatment, after 6 weeks (midtreatment) and at 12 weeks (post‐treatment). Both groups showed significant and comparable improvements in depression at the termination of treatment. DeRubeis et al. also found that for both the CT and pharmacotherapy groups, there was significant improvement on all four cognitive measures between pretreatment and midtreatment. Further, there were no significant differences between the groups in the degree of improvement on any of the cognitive measures. Therefore, pharmacotherapy appears as effective as cognitive therapy in producing cognitive change. The key finding of DeRubeis et al., however, is that for participants receiving CT, greater midtreatment changes on the ASQ and DAS were predictive of lower posttreatment depression scores. From this, they conclude that cognitive change is instrumental to the amelioration of depression for those receiving CT, while being epiphenomenal to the improvement of those receiving drug treatment alone.

This raises the question as to why equally significant changes in cognitive content for the pharmacotherapy group failed to predict how well participants fared. Regardless of the treatment modality, the cognitive model would hold that “Alterations in the content of the person’s underlying cognitive structures affect his or her affective state and behavioral pattern” (Beck et al., 1979, p. 8). DeRubeis et al. are aware of this anomaly. They hypothesise that measured cognitive changes in the CT group are indicative of changes at a deeper level: namely, the use of active cognitive and behavioral problem solving strategies. However, if such strategies are the true “theoretically specific ingredients” driving change in CT for depression, then they are neither specified nor measured in the study. Indeed, DeRubeis et al. state that “the hypothesis we offer is speculative and not, in its specific form, supported by the present findings” (p. 867).

Jacobson et al. (1996), as part of their component study of behavioral and cognitive interventions for depression, measured changes in putative cognitive mediators using the Automatic Thoughts Questionnaire (ATQ), and the Expanded Attributional Style Questionnaire (EASQ; Peterson & Villanova, 1988). They attempted to establish if there was a correlation between early cognitive change on these measures and late change in depression symptom scores. In contrast to the findings of DeRubeis et al., they did not find a causal, temporal link between changes in cognitive mediators and later changes in symptom scores for the AT (automatic thoughts) or CT (cognitive therapy) treatment conditions. However, they did find that change on two subscales of the EASQ (stable and global attributional style) correlated with later improvements in depression for patients treated in the BA (behavioral activation) condition. The study of Jacobson et al. (1996), using similar measures and temporal analyses to the study of DeRubeis et al., failed to demonstrate evidence for cognitive mediation.
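The temporal logic behind this kind of test can be sketched in a few lines: early change on a cognitive measure is correlated with later change in symptoms. The code below is a simplified stand-in and not the analysis reported by either DeRubeis et al. or Jacobson et al.; the function names are hypothetical, and the scores are made-up values for four imaginary patients, chosen only so that the relationship is perfectly linear.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def early_change_vs_late_change(cog_pre, cog_mid, dep_mid, dep_post):
    """Correlate pre-to-mid cognitive change with mid-to-post symptom change."""
    early_cog_change = [pre - mid for pre, mid in zip(cog_pre, cog_mid)]
    late_dep_change = [mid - post for mid, post in zip(dep_mid, dep_post)]
    return pearson_r(early_cog_change, late_dep_change)


# Made-up scores for four hypothetical patients (not data from any study).
r = early_change_vs_late_change(
    cog_pre=[60, 60, 60, 60], cog_mid=[55, 50, 45, 40],
    dep_mid=[20, 20, 20, 20], dep_post=[18, 16, 14, 12],
)
print(round(r, 3))
```

Even a strong correlation of this kind establishes temporal precedence only. It does not rule out a third variable driving both changes, which is part of why the mediation question remains open.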

Burns and Spangler (2001) also examined the causal relationship between cognitive change and symptom change in depression and anxiety using structural equation modelling. Dysfunctional attitudes for 521 outpatients with depression and anxiety were measured using the Burns Perfectionism Scale (PS; Burns, 1980) and the Burns Interpersonal Attitude Scale (BIAS; Persons, Burns, Perloff, & Miranda, 1993). They found that although dysfunctional attitudes were correlated with changes in depression and anxiety, there was no causal effect linking belief change with symptom change.

Finally, Hollon and DeRubeis’s assertion that cognitive change is the cause of symptom change in cognitive therapy but the consequence of symptom change in pharmacotherapy is not borne out by the temporal relationship between data from the Simons et al. (1984) study. Data illustrating this point is given in Table 2.

Table 2

Changes in cognitive variables and BDI score (after Simons et al., 1984)

CT = cognitive therapy; BDI = Beck Depression Inventory; ATQ = Automatic Thoughts Questionnaire; DAS = Dysfunctional Attitudes Scale.

It can be seen from Table 2 that there is a similar gradient of change in cognitive variables and symptoms of depression for both the cognitive therapy and pharmacotherapy conditions. Indeed, if anything, changes on cognitive measures occur earlier in the drug treatment condition relative to changes in BDI scores. The opposite temporal pattern might be expected if cognitive change were driving symptom change in CT, but a consequence of symptom change in pharmacotherapy.

Teasdale et al. (2001) report findings from the Cambridge and Newcastle trial of cognitive therapy for residual depression. Here, 158 patients were randomized to receive antidepressant medication and clinical management either alone or together with 18 sessions of cognitive therapy. A battery of cognitive measures was administered in order to see if cognitive change mediated relapse rates. It was found that an extreme responding style on these questionnaires predicted early relapse, rather than the specific content of cognitions. Teasdale et al. take this to be indicative of a “black and white” thinking style that both mediates relapse and responds differentially to treatment with CT. However, the clarity of these findings is somewhat compromised by the fact that a non‐depressed control group produced more extreme positive responses on the same questionnaires.

In summary, the evidence that cognitive variables mediate therapeutic change in CBT is somewhat limited. The key findings by DeRubeis et al. (1990) have not been replicated. Although Tang and DeRubeis (1999) present data on “sudden gains” in therapy following belief change, this was only for three patients. The finding of Teasdale et al. (2001) of a mediating cognitive style (extreme responding) does not fully differentiate recovered depressives from non‐clinical controls. Meanwhile, a variety of studies has shown that cognitive change is an outcome of other treatments, to the same extent as in CBT. Jacobson et al. (1996) suggest that difficulty identifying cognitive mediators of therapeutic change may reflect the poor quality of measures of beliefs and attitudes. Nevertheless, an important element of the rationale for cognitive interventions–that changes in cognition mediate therapeutic change in CBT–currently lacks empirical support.


This paper has attempted to answer an important question at the heart of CBT theory and practice: do we need to challenge thoughts in CBT? In essence, this question concerns the logical, rationalist methods used to challenge dysfunctional thoughts and beliefs. Nor is the question an arbitrary one. Proponents of a new generation of behavioral therapies propose that the rational challenging of thoughts is superfluous (Hayes, Follette, & Linehan, 2004). Likewise, those who claim that human cognition comprises multiple processes, stores and codes seek new mechanisms by which to achieve change at these various levels (Segal, Williams, & Teasdale, 2002). These theoretical developments and others like them have led to an increased emphasis on behavioral change, constructivism and attentional control – the so‐called “Third Wave” of CBT. At the same time, they have led to a decreased emphasis on the rational challenge of the content of thoughts.

This paper has attempted an empirical rather than a theoretical examination of the status of cognitive interventions in CBT. A comprehensive review of component research into the active elements of CBT was undertaken. This examined studies both of depression and the anxiety disorders. The review showed that, almost without exception, component studies found no difference in effectiveness between the cognitive and behavioral elements of CBT. Nor did cognitive interventions provide “added value” to behavioral interventions. Taken together, these studies provide a substantial body of research showing that cognitive interventions are not a necessary component of the therapy. While it could be the case that non‐specific therapy factors or blunt outcome measures explain these findings, a further possibility is that interventions are effective when they help people to switch “modes” (Beck, 1996) or “schematic models” (Teasdale, 1997). More precisely, psychological states comprise interacting cognitive, affective, behavioral and physiological elements. Any treatment which effectively targets one of these systems may lead to a change in all of them (Borkovec et al., 2002).

It is possible that component studies are flawed because in seeking to dismantle the separate parts of CBT, they neutralise what makes it effective: the interaction of cognitive and behavioral techniques. Behavioral experiments were considered from this perspective. While Bennett‐Levy (2003) compared the effectiveness of behavioral experiments to that of purely cognitive interventions, the key comparison will come when their efficacy is measured against purely behavioral interventions. This will elucidate whether their emphasis on cognitive hypothesis testing is justified, and in turn whether there is an interaction effect between the behavioral and cognitive elements.

Another doubt cast upon the validity of cognitive interventions in CBT is the finding that the therapy produces a rapid early improvement in symptoms; one which putatively precedes the application of cognitive techniques. The research on this issue was reviewed and it was found that evidence for such an effect was lacking.

The issue of cognitive mediation was then examined. If, as Clark (1995) asserts, CBT works by modifying dysfunctional cognitions and realigning thinking with reality, then it should be possible to show cognitive change mediating symptomatic improvement over the course of therapy. Evidence pertaining to this hypothesis was reviewed. It was shown that there is currently little empirical support for cognitive mediation.

Taken together, these findings reveal a worrying lack of empirical support for some of the fundamental tenets of CBT. There is a paucity of evidence that cognitive interventions forming the core procedural aspects of CBT are differentially effective in reducing distress. Further, there is a lack of evidence that their effectiveness, such as it is, is mediated cognitively. Given this state of affairs, it is somewhat surprising that there is not more research on these issues. Suggestions for further research might include the following:

  1. Studies comparing cognitive measures and symptom measures throughout the course of therapy would allow a more complete examination of temporal changes in beliefs and cognitive mediation. Development of new and better measures of the beliefs thought to underlie particular disorders would enhance this research.
  2. Multiple baseline studies of the cognitive and behavioral elements of CBT.
  3. Further component studies comparing CBT with its behavioral and cognitive elements in order to test specific hypotheses (for example, Do behavioral experiments work by testing beliefs, or through exposure?).

CBT has always stressed its status as an empirically grounded therapy. A multitude of studies has shown it to be an effective treatment for a range of psychological disorders. Nothing should detract from this achievement. However, it is important for the development of CBT that it keeps sight of its empirical roots. Anomalous findings and gaps in knowledge represent an opportunity to clarify mechanisms of change and build better, more effective therapeutic interventions. This paper is an invitation to renewed curiosity in line with these principles.




A Component Analysis of Cognitive-Behavioral Treatment for Depression

The purpose of this study was to provide an experimental test of the theory of change put forth by A. T. Beck, A. J. Rush, B. F. Shaw, and G. Emery (1979) to explain the efficacy of cognitive-behavioral therapy (CT) for depression. The comparison involved randomly assigning 150 outpatients with major depression to a treatment focused exclusively on the behavioral activation (BA) component of CT, a treatment that included both BA and the teaching of skills to modify automatic thoughts (AT), but excluding the components of CT focused on core schema, or the full CT treatment. Four experienced cognitive therapists conducted all treatments. Despite excellent adherence to treatment protocols by the therapists, a clear bias favoring CT, and the competent performance of CT, there was no evidence that the complete treatment produced better outcomes, at either the termination of acute treatment or the 6-month follow-up, than either component treatment. Furthermore, both BA and AT treatments were just as effective as CT at altering negative thinking as well as dysfunctional attributional styles. Finally, attributional style was highly predictive of both short- and long-term outcomes in the BA condition, but not in the CT condition.


The cognitive model of depression (Beck, Rush, Shaw, & Emery, 1979) states that depressed individuals have stable cognitive schemas (also referred to as underlying assumptions or core beliefs) that develop as a consequence of early learning. These schemas predispose people toward negative interpretations of life events (i.e., cognitive distortions or automatic thoughts [ATs]), which in turn, lead the depressed person to engage in depressive behavior. Cognitive-behavioral therapy (CT) for depression includes interventions that focus on publicly observable behavior, dysfunctional ATs, and inferred underlying cognitive structures or schemas. The treatment is conducted in a progressive manner so that the therapist first focuses on overt behavior change; teaches the client to assess and, when necessary, correct situation-specific distortions in thinking; and finally moves to the identification and modification of more stable depressive schemas and presumed cognitive structures.

A number of investigators have documented the clinical usefulness of CT for depression. In a meta-analysis of this approach, Dobson (1989) suggested that CT is at least as powerful and perhaps more effective than behavior therapy, pharmacotherapy, and other psychotherapies or waiting-list control conditions. Some have questioned the state of this evidence (Hollon, Shelton, & Loosen, 1991), in part based on the results of the Treatment of Depression Collaborative Research Program (TDCRP; Elkin et al., 1989). However, even in the TDCRP, CT showed long-term effects that were at least as durable, if not more durable, than pharmacotherapy or interpersonal psychotherapy (Shea et al., 1992).

Beck and his associates are quite specific about the hypothesized active ingredients of CT, stating throughout their treatment manual (Beck et al., 1979) that interventions aimed at cognitive structures or core schema are the active change mechanisms. Despite this conceptual clarity, the treatment is so multifaceted that a number of alternative accounts for its efficacy are possible. We label two primary competing hypotheses the “activation hypothesis” and the “coping skills” hypothesis.

According to the activation hypothesis, CT effects change through the activation of clients; that is, by instigating them to become active again and to put themselves in contact with available sources of reinforcement. Instigative interventions play a major role particularly in the early stages of CT and may be largely responsible for its effectiveness. It has been noted that much of the change during CT occurs within the first few weeks (Rush, Beck, Kovacs, & Hollon, 1977), when instigations toward activation play a prominent role in the treatment. Previous studies that found CT more effective than behavioral activation ([BA] e.g., Shaw, 1977) may not have used activation strategies that work as well as those used in CT. If an entire treatment based on activation interventions proved to be as effective as CT, the cognitive model of change in CT (stipulating the necessary interventions for the efficacy of CT) would be called into question. Moreover, in addition to these important theoretical questions, there are important practical considerations: Are the elaborate cognitive interventions directly designed to modify core schema necessary? It may be that a much more parsimonious set of treatment procedures would have comparable effects.

A second hypothesis that could explain the efficacy of CT is the coping skills hypothesis. According to this hypothesis, clients learn to cope with depressing events and depressogenic thinking during CT, and it is this new set of skills that, along with activation, accounts for the alleviation of depressive behavior. In other words, it is not that core cognitive structures are altered, but that people learn effective coping strategies for dealing with life stress and the ATs associated with these events. If structural changes in core schema are really necessary for changes in clients’ depression, then CT should be significantly more effective than a treatment that stops with the training in modifying automatic dysfunctional thinking in specific situations.

To test these competing hypotheses, we conducted an experiment comparing three treatments for major depression in adults: one that included only behavioral activation (BA treatment), another that included both BA and work on ATs (AT treatment), and a third that corresponded to the full cognitive therapy of depression (CT treatment). The full CT treatment included not only work on BA and AT, but also a direct focus on identifying and modifying core depressogenic schema. According to the cognitive theory of depression, CT should work significantly better than AT, which in turn, should work significantly better than BA.

A second purpose of the present study was to examine the correlations between changes in specific mechanisms and outcome, both within and across treatments. For example, independently of whether BA worked as well as CT, was CT more successful at modifying cognitive schema than BA? In other words, do the various treatments differentially affect the processes that they are supposed to affect? A related question concerns whether the three treatments operate by means of different mechanisms, independently of their overall efficacy. For example, BA may work as well as CT but for different reasons. Do the variables that correlate with a positive acute treatment response differ across the treatments? One may expect BA to be more highly correlated with outcome in the BA condition than in the CT condition. More generally, it would be of interest to know how treatments effect change, independently of how well they work.



The sample consisted of 152 participants who met criteria for major depression according to the Diagnostic and Statistical Manual of Mental Disorders (3rd edition, revised; DSM-III-R; American Psychiatric Association, 1987), scored at least 20 on the Beck Depression Inventory (BDI; Beck et al., 1979), and scored 14 or greater on the 17-item Hamilton Rating Scale for Depression (HRSD; Hamilton, 1967). DSM-III-R diagnoses were based on the Structured Clinical Interview for DSM-III-R (SCID; Spitzer, Williams, & Gibbon, 1987). Originally, training was provided by Michael First from the biometrics research department at the New York State Psychiatric Institute, where the SCID was developed. Further training and supervision was provided by Donna Miller, an experienced psychiatrist and expert SCID interviewer who was on site. The interviews themselves were conducted by clinical psychology graduate students, carefully trained and supervised by Miller. Raters were not informed of treatment condition. Interrater reliability between Miller and SCID raters was .90, on the basis of the percentage of times Miller and the rater agreed on the primary diagnosis. Raters were also not informed of which tapes were being rated by Miller.

Similarly, the HRSD was administered as an adjunct to the SCID. For a previous study, our research group had rewritten the HRSD so that it could be inserted into a structured diagnostic interview (Whisman et al., 1989). This version of the HRSD has excellent psychometric properties and is highly reliable. Moreover, although it is a clinical interview, it can be administered by technicians without loss of reliability. Raters were not informed of the treatment condition of the participant or of which tapes were being assessed for reliability.

Eighty percent of the participants were referred directly from Group Health Cooperative, the largest health maintenance organization (HMO) in the state of Washington. The remainder were recruited from public service announcements. Of the original 152 participants accepted into the study, 110 were women and 42 were men. One hundred thirty-seven of this group completed therapy (defined as receiving at least 12 sessions of treatment). Thus, in all there were 15 dropouts (a 10% attrition rate); three of these dropouts refused random assignment and never had a treatment session. The rates of attrition during acute treatment were comparable in the three conditions.

Exclusion criteria included a number of concurrent psychiatric disorders (bipolar or psychotic subtypes of depression, panic disorder, current alcohol or other substance abuse, past or present schizophrenia or schizophreniform disorder, organic brain syndrome, and mental retardation). We also excluded participants who were in some concurrent form of psychotherapy, who were receiving psychotropic medication, or who needed to be hospitalized because of imminent suicide potential or psychosis.

After qualifying for the study, participants were randomly assigned to one of the three treatment conditions after matching to ensure group equivalence on the following variables: number of previous episodes of depression, presence or absence of dysthymia, severity of depression, gender and marital status (married, divorced, single, or widowed).

Table 1 shows means and standard deviations for the demographic variables. We first correlated these variables with primary measures of treatment outcome to determine whether they should be used as covariates in the primary analyses. None of them (gender, years of education, marital status, or percent Caucasian) were significantly correlated with either posttreatment BDI and HRSD scores, or changes from pre- to posttreatment on these two measures of depression severity. There were also no significant differences between treatment conditions on any of these variables, nor did the treatments differ in their pretreatment BDI scores. However, clients receiving CT and AT treatment had significantly higher pretreatment HRSD scores than did those receiving BA treatment, F(2, 148) = 3.52, p < .05.


Four experienced cognitive therapists provided treatment in all three conditions. Their average age was 43.5 years (range, 37 to 49 years), and they averaged 14.8 years of postdegree clinical experience (range, 7 to 20 years). They had been practicing CT treatment for an average of 9.5 years since their formal training, with a range of 8 to 12 years.

All four therapists had participated in at least one previous clinical trial in which they served as research therapists for CT treatment. Despite their having previous experience, a year was devoted to training the therapists in and piloting the component (BA and AT) treatments. Three manuals were created for this study, one for each treatment condition.[1] All were based on the original CT manual (Beck et al., 1979) but included specific guidelines for prescribed and proscribed interventions in each treatment condition.

We developed a system for monitoring and calibrating for protocol adherence. One of the coauthors (Keith S. Dobson) listened to a randomly determined 20% of all audiotaped therapy sessions. Therapists were immediately contacted if a protocol violation occurred. In addition, monthly meetings were held involving authors Neil S. Jacobson and Keith S. Dobson, the project coordinator, and all therapists to discuss any ambiguities regarding protocol or treatment integrity issues. Therapists were also encouraged to “flag” any ambiguities in past sessions or any concerns they had about adherence in upcoming sessions. Finally, as we describe later, adherence was systematically evaluated independently of these calibration procedures.


BA condition. In the Beck et al. (1979) CT manual, a common early strategy is to identify behavior problems and to invoke a series of interventions designed to activate people in their natural environment. These early strategies consist mostly of semistructured activities. Included in the list of interventions considered to have an activation focus are (a) monitoring of daily activities, (b) assessment of the pleasure and mastery that is achieved by engaging in a variety of activities, (c) the assignment of increasingly more difficult tasks that have the prospect of engendering a sense of pleasure or mastery, (d) cognitive rehearsal of scheduled activities, in which participants imagine themselves engaging in various activities with the intent of finding obstacles to the imagined pleasure or mastery expected from those events, (e) discussion of specific problems (e.g., difficulty in falling asleep) and the prescription of behavior therapy techniques for dealing with them, and (f) interventions to ameliorate deficits in social skills (e.g., assertiveness, communication skills). In the BA condition, activation is the exclusive focus for 20 sessions.

Activation and the modification of dysfunctional thoughts (AT condition). From a CT perspective, activation is only the first area that requires assessment and modification in depression. Thus, although it was possible to structure an entire protocol around this behavioral intervention, Beck et al. (1979) advocate that the therapist move relatively quickly into cognitive interventions. A number of techniques have been developed to identify and modify automatic dysfunctional thoughts. In particular, cognitive therapists listen for “cognitive distortions,” which are negative construals of different events that precipitate sad feelings and depressed behavior. The CT therapist typically moves past activation interventions to begin to assess these negative patterns of thinking and to teach clients to be aware of them so that they can then be modified. Within this general framework, a number of different techniques have been developed to assess and modify dysfunctional thoughts. These include the following: (a) noticing mood shifts in therapy sessions and asking for the thoughts that preceded the mood shift, (b) using a daily record of dysfunctional thoughts as a form of personal diary in which the clients note particularly problematic events and the types of thoughts that surrounded those events, (c) reexamining thoughts in specific situations and determining whether the event warranted the types of conclusions that the patient had drawn about it, (d) helping clients learn how to respond in a more functional manner to negative thinking, (e) examining the possibility of attributional biases or mistakes in the way the clients see the causes of various successes and failures in their lives, and (f) the development of homework assignments in which the clients assess the validity of their negative interpretations. In this study, the AT condition permitted the use of all interventions from the BA condition and those listed earlier. The only proscription in this condition was against working on underlying core beliefs or schemas.

CT condition. CT, in its complete form, includes the identification and modification of more general patterns of thought that are stable and presumably the causes of cognitive distortions and negative feelings. There are a number of specific interventions that are typical of therapists when they attempt to modify schema. They include the following: (a) use of the “downward arrow,” a technique in which the therapist asks the client for their explanations about why certain problems have emerged, which then leads to the therapist hypothesizing various types of general concerns and eventually to the identification of core beliefs; (b) the explicit identification of underlying assumptions and core beliefs, either by direct report of the client or by inference on the part of the therapist; (c) the identification of alternative assumptions or core beliefs; (d) the discussion of the advantages and disadvantages of holding various assumptions or core beliefs; (e) the discussion of the short-term versus the long-term advantages of various assumptions or beliefs; (f) the assignment of homework that allows patients to determine whether they actually use certain assumptions or core beliefs in the way they deal with their life circumstances and to explore the application of other assumptions to those circumstances; and (g) the use of the same techniques involved in modifying dysfunctional thinking, except in this case they are applied to core beliefs rather than to situation-specific dysfunctional thinking.

In this study, the CT condition allowed the use of the full range of BA, AT, and CT interventions. To ensure a fair test of the core schema hypothesis, however, we required that a minimum of eight sessions have a primary focus on assumptive work.

Outcome Measures

All participants were evaluated before therapy, at the time of termination, and at 6-, 12-, 18-, and 24-month follow-ups. In this article, we focus on the immediate effects of treatment and on those at the 6-month follow-up. We include measures of depressive symptoms and the presence or absence of major depression, which were based on reports from clients and from clinical evaluators.

To assess the presence or absence of major depression at posttest, clinical evaluators gave participants a modified version of the Longitudinal Interval Follow-Up Evaluation II (LIFE; Keller et al., 1987), developed to assess the longitudinal course of psychiatric disorders. The LIFE includes a semistructured interview that allows one to assess psychopathology over the previous 6 months. In our modified version, criteria for the diagnosis of depression were changed from the Research Diagnostic Criteria used on the original LIFE to those used in the DSM-III-R. To determine presence or absence of major depression, we used weekly psychiatric ratings on a scale ranging from 1 (absence) to 6 (presence). We used the LIFE measure to determine whether participants continued to meet DSM-III-R criteria for major depression at posttest.

Participants were also given the 17-item version of the HRSD, administered by a clinical evaluator. This is a widely used interviewer-based measure of depression severity.

As a second, self-report measure of depression severity, the BDI (Beck et al., 1979) was administered to participants before and after treatment. This is another widely used measure of depression severity that correlates highly with the HRSD, has excellent psychometric properties (Beck, Steer, & Garbin, 1988), and is sensitive to clinical change (Edwards et al., 1984; Lambert, Shapiro, & Bergin, 1986).

Data Analysis

All analyses of outcome were conducted on those participants who completed at least 12 sessions of treatment (“completers”), those who completed the maximum allotment of 20 treatment sessions (“maximum completers”), and those who had at least one session of therapy but dropped out before completing 12 treatment sessions (“dropouts”). For dropouts, the last available score on each outcome measure served as the termination score. Posttest HRSD and BDI scores served as the primary measures of depression severity. Analyses of covariance, with pretreatment scores on the dependent measures used as covariates, were applied to compare the efficacy of three treatments.
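The analysis of covariance described above can be illustrated with a minimal sketch. It assumes a single covariate (the pretest score) and a common within-group slope; the function name `ancova_one_way` and the scores in the usage lines are hypothetical and are not data from this study.

```python
from statistics import mean


def ancova_one_way(groups):
    """One-way ANCOVA with one covariate (illustrative sketch).

    groups maps a condition label to a list of (pretest, posttest) pairs.
    Returns the F statistic for the treatment effect, its degrees of
    freedom, and the covariate-adjusted posttest mean of each condition.
    """
    pairs_all = [pair for pairs in groups.values() for pair in pairs]
    n_total, k = len(pairs_all), len(groups)
    grand_pre = mean([x for x, _ in pairs_all])
    grand_post = mean([y for _, y in pairs_all])

    def cross_sums(pairs, mx, my):
        # Sums of squares and cross-products around the given means.
        sxx = sum((x - mx) ** 2 for x, _ in pairs)
        sxy = sum((x - mx) * (y - my) for x, y in pairs)
        syy = sum((y - my) ** 2 for _, y in pairs)
        return sxx, sxy, syy

    # Pooled within-group sums (full model: one intercept per condition).
    sxx_w = sxy_w = syy_w = 0.0
    group_means = {}
    for label, pairs in groups.items():
        mx = mean([x for x, _ in pairs])
        my = mean([y for _, y in pairs])
        group_means[label] = (mx, my)
        sxx, sxy, syy = cross_sums(pairs, mx, my)
        sxx_w, sxy_w, syy_w = sxx_w + sxx, sxy_w + sxy, syy_w + syy
    slope = sxy_w / sxx_w
    sse_full = syy_w - sxy_w ** 2 / sxx_w

    # Total sums (reduced model: one intercept for everyone).
    sxx_t, sxy_t, syy_t = cross_sums(pairs_all, grand_pre, grand_post)
    sse_reduced = syy_t - sxy_t ** 2 / sxx_t

    df_between, df_error = k - 1, n_total - k - 1
    f_stat = ((sse_reduced - sse_full) / df_between) / (sse_full / df_error)
    adjusted = {
        label: my - slope * (mx - grand_pre)
        for label, (mx, my) in group_means.items()
    }
    return f_stat, (df_between, df_error), adjusted


# Hypothetical (pretest, posttest) scores for two conditions.
example = {
    "BA": [(20, 10), (25, 12), (30, 17)],
    "CT": [(20, 15), (25, 17), (30, 22)],
}
f_stat, dfs, adjusted_means = ancova_one_way(example)
```

Adjusting each posttest mean for the pretest covariate is what allows conditions to be compared even when their pretreatment scores differ.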

Treatment response was also analyzed categorically. To assess the percentage of participants in each treatment condition who either recovered or improved but failed to recover, we looked at the percentage who scored 8 or less on the BDI. These criteria, although arbitrary, were recommended by Frank et al. (1991) in an effort to standardize measures of recovery in depression research.[2] Participants were categorized as improved but not recovered if they no longer met DSM-III-R criteria for major depression at posttest but continued to report BDI scores greater than 8. Contingency table analyses were used to compare treatments in improvement and recovery rates.
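The categorical rule above can be sketched as a small function. This is an illustration only: the names are hypothetical, and it assumes that "recovered" requires both no longer meeting criteria for major depression and a BDI score of 8 or less, which is one reading of the criteria described here.

```python
from collections import Counter


def classify_response(meets_mdd_criteria, bdi_score, bdi_cutoff=8):
    """Label one participant's posttest status (illustrative sketch).

    Assumption: recovery requires no current major depression AND a BDI
    score at or below the cutoff; the cutoff of 8 comes from the text.
    """
    if meets_mdd_criteria:
        return "not improved"
    if bdi_score <= bdi_cutoff:
        return "recovered"
    return "improved, not recovered"


def response_counts(participants):
    """Count categories for a list of (meets_mdd_criteria, bdi_score)."""
    return Counter(classify_response(mdd, bdi) for mdd, bdi in participants)


# Hypothetical posttest records, not data from the study.
counts = response_counts([(False, 5), (False, 12), (True, 25), (False, 8)])
```

Counts of this kind, tabulated by treatment condition, are the input to a contingency table analysis.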


Adherence to Treatment Protocols

The measure of treatment integrity used in the present study was a modified version of the National Institute of Mental Health Collaborative Study Psychotherapy Rating Scale (CSPRS; Hollon, Evans, Elkin, & Lowery, 1984). Items included both techniques designated by the treatment manual and those prohibited or proscribed by it. Ideally, the three treatment conditions should have been most different on items reflecting interventions addressing the modification of dysfunctional ATs and core schema, as all three conditions included BA. Moreover, protocol violations should have been kept to a minimum. Our scale had 7 items measuring the use of interventions focused on BA, 12 measuring work on ATs, and 7 measuring work on underlying assumptions (UA) or core schema, as well as 3 items reflecting interventions that are proscribed in all three conditions. We were also interested in “potency,” that is, the ratio of interventions that are essential to the treatment to those that are compatible with it but neither unique nor essential to CT (Waltz, Addis, Koerner, & Jacobson, 1993). Five items were added in the category ENU (essential but not unique to BA, AT, or CT; e.g., setting an agenda, assigning homework); also, 11 items were added in the category COMPAT, which reflected nonessential interventions that are compatible with all conditions (e.g., skills training, assessing general functioning) but essential to none.

Thus, the total scale had 45 items, grouped into the aforementioned six scales. Raters listened to a tape of the therapy session, taking notes as they listened, and then rated each item on a scale ranging from 0 (not at all) to 6 (extensively or thoroughly). Nine clients were randomly selected from each condition for adherence ratings, for a total of 27 clients. For each of these clients, one early, one middle, and one late session were randomly selected, with sessions 1 and 20 excluded. Thus, a total of 81 tapes were rated.

Trained coders were kept masked to treatment condition. Intraclass correlation coefficients were used to determine interrater reliability. The mean intraclass correlation was .81, with coefficients ranging from .73 to .89 across the six scales.

As Table 2 indicates, therapists were successful at keeping the treatments distinct. Therapists confined themselves to BA interventions in that condition and to BA and AT interventions in the AT condition and used all three types of interventions in the CT condition. The average ratings were exactly as we had expected: BA items were common in all three conditions but most common in the BA condition; AT interventions were common in both CT and AT conditions; and work on core schema was common only in the CT condition. On an absolute basis, almost no protocol violations were detected.

Another test of adherence involved asking the following question: Did the rate of occurrence of BA, AT, and UA interventions exceed the random fluctuations expected by chance? This question addressed whether the little bit of AT and UA work that occurred in BA and AT conditions, respectively, differed significantly from the “noise level” that one would expect by chance. Simultaneously, it allows one to be assured that the mean ratings of BA, AT, and UA interventions differed significantly from zero when they were supposed to differ. There were no statistically significant deviations from treatment protocols in any condition.



In the BA condition, only BA interventions significantly differed from zero, t(26) = 4.95, p < .05. In the AT condition, both the AT intervention, t(26) = 3.89, p < .05, and the BA intervention, t(26) = 3.63, p < .05, occurred to a significant degree, but UA interventions did not, t(26) = 1.03, ns. However, in the CT condition, all three types of interventions occurred to a statistically significant degree: For BA, t(26) = 4.61, p < .05; for AT, t(26) = 4.47, p < .05; for CT, t(26) = 3.1, p < .05.
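A test of whether mean ratings differ from zero can be sketched as a one-sample t test. The function name and the ratings below are hypothetical; the 26 degrees of freedom reported above are consistent with 27 rated tapes per condition (9 clients with 3 sessions each), although the text does not state the unit of analysis explicitly.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """Return (t, df) for a one-sample t test of the mean against mu0."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)  # stdev uses n - 1
    return (mean(values) - mu0) / standard_error, n - 1


# Hypothetical adherence ratings on the 0-6 scale for 27 tapes.
ratings = [0, 1, 0, 2, 1, 0, 0, 3, 1] * 3
t_value, df = one_sample_t(ratings)
```

The resulting t value is then compared with the critical value for its degrees of freedom at the chosen significance level.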

Finally, to compare the treatment conditions for potency, we examined the scale totals for ENU, COMPAT, and PROSCR (proscribed) interventions. No significant between-group differences emerged for any of these three scales.

Keith S. Dobson, who provided supervision for all therapists in the project, randomly selected tapes in the CT condition and rated them for competence on the Cognitive Therapy Scale (CTS), the accepted instrument for assessing competence in CT. The convention, albeit arbitrary, is to use a score of 40 as the cutoff for competence on the CTS. The overall means were above 40, as were the means for each therapist: For Therapists 1-4, Ms = 45.16, 44.01, 47.91, and 46.17, respectively.

Treatment Outcome

Table 3 presents the means, standard deviations, and results of our primary outcome analyses. Results are presented first for the total sample (including dropouts) and then for each of three subsamples: maximum completers, completers, and dropouts. Pretreatment group differences were assessed through one-way analyses of variance (ANOVAs). With the exception of the HRSD on the total sample, there were no significant pretreatment differences between conditions.

The primary treatment outcome analyses consisted of 3 (Treatment Group) × 4 (Therapist) multivariate analyses of covariance (MANCOVAs), with posttest BDI and HRSD scores serving as dependent variables and pretest scores on the respective pretest measures serving as covariates. The MANCOVAs for the total sample failed to uncover statistically significant differences among treatments, F(4, 252) = 1, ns; therapists, F(6, 252) = 1, ns; or Therapist × Treatment interactions, F(12, 252) < 1, ns. Similar MANCOVAs for the completers showed group equivalence for treatments, F(4, 238) = 1, ns; therapists, F(6, 238) < 1, ns; and Therapist × Treatment interactions, F(12, 238) < 1, ns. Finally, for the subsample of maximum completers, there were no differences among treatments, F(4, 226) = 1, ns; therapists, F(6, 226) < 1, ns; or Treatment × Therapist interactions, F(12, 226) < 1, ns. To protect familywise error rates, we chose .01 as our level of significance. However, none of the MANCOVAs were significant at even the .05 level. Table 3 indicates the results of ANCOVAs for each subsample and each measure. Because none of the therapist or Therapist × Treatment interaction effects were significant, only the main effects for treatment are presented in Table 3. As indicated, there were no significant differences between the treatments on either the BDI or the HRSD. When we looked at the results separately for each therapist, we similarly found no differences among treatment conditions on either measure.

We also looked at the proportion of clients in each condition who improved and recovered to assess the clinical significance of each treatment condition (Jacobson, Follette, & Revenstorf, 1984; Jacobson & Truax, 1991). Table 4 presents the improvement and recovery rates for each of the treatments in each of the four samples. Chi-square analyses revealed no significant differences between treatments on improvement or recovery in any of the four samples. The mean improvement rate was 62.3% for the complete sample, 66% for maximum completers, 58.3% for partial completers, and 16.7% for dropouts. The mean recovery rate was 51.5% for the complete sample, 54.5% for maximum completers, 58.3% for partial completers, and 5.6% for dropouts. Dropouts had significantly lower rates of improvement and recovery than maximum completers (χ²[1, N = 141] = 7.8, p < .01; and χ²[1, N = 141] = 9.5, p < .01, respectively) and completers (χ²[1, N = 149] = 9.51, p < .01; and χ²[1, N = 149] = 7.8, p < .01, respectively).
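A chi-square comparison of improvement or recovery rates can be sketched as a Pearson test on a contingency table. The function name and the counts are hypothetical, and no continuity correction is applied; the text does not say whether one was used in the reported analyses.

```python
def chi_square(table):
    """Pearson chi-square statistic and df for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical counts: rows are two groups, columns are
# (recovered, not recovered). These are not the study's data.
statistic, df = chi_square([[10, 20], [20, 10]])
```

Comparing two groups on a binary outcome gives 1 degree of freedom, as in the statistics reported above; comparing three treatment conditions on a binary outcome gives 2.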

Six-Month Follow-Up Results

Table 3 includes follow-up scores on the BDI and the HRSD for all participants. We were able to obtain follow-up data on all but one participant (a 99% retention rate). We conducted ANCOVAs on both measures, with follow-up scores as the dependent variables and posttest scores as covariates. This analysis provides a parametric test for changes during the follow-up period on depressive symptoms as a function of treatment condition. The analyses found that there were no significant differences between treatment conditions, F(2, 132) < 1, ns. As Table 5 indicates, the treatments were also equivalent in the ultimate impact of therapy: this conclusion is derived from ANCOVAs in which follow-up scores served as dependent variables and pretest scores served as covariates. Thus, the three treatments did not differ either in the overall impact of therapy through the 6-month follow-up or in changes in depressive symptoms over the first 6 months after posttest.

Table 5 shows the percentage of participants in each condition who had recovered during the course of therapy and relapsed by the time of the 6-month follow-up, based on the LIFE interview. Relapse was defined as meeting criteria for major depression, and we used three different definitions of recovery: 8 consecutive weeks of not meeting criteria for major depression, ending therapy with a BDI score of 8 or less, and ending therapy with an HRSD score of 7 or less. Contingency table analyses indicated that, regardless of how recovery was defined, groups did not differ significantly in relapse rates.

We also compared the recovered participants in all three treatment conditions on the number of “well weeks” during the follow-up period, again using three criteria for recovery. A well week was defined as a week when there were no or minimal symptoms, based on the LIFE interview. The maximum score was 26. As Table 6 shows, there were no significant differences between treatment conditions, regardless of how recovery was defined.
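Counting well weeks from weekly ratings can be sketched as follows. The threshold is an assumption: the text defines a well week only as one with no or minimal symptoms, so treating ratings of 1 or 2 on the 1-to-6 scale as "well" is a hypothetical mapping, as are the function name and the example ratings.

```python
def count_well_weeks(weekly_ratings, well_threshold=2, max_weeks=26):
    """Count follow-up weeks rated at or below well_threshold.

    Assumption: ratings of 1-2 on the 1-6 weekly scale stand for
    'no or minimal symptoms'. The maximum of 26 weeks is from the text.
    """
    if len(weekly_ratings) > max_weeks:
        raise ValueError("more weekly ratings than follow-up weeks")
    return sum(1 for rating in weekly_ratings if rating <= well_threshold)


# Hypothetical 26-week course: well, a symptomatic stretch, then well.
ratings = [1] * 10 + [4] * 6 + [2] * 10
well_weeks = count_well_weeks(ratings)
```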

To summarize, none of our follow-up analyses uncovered differences between groups. CT did not lead to decreased relapse, or better long-term functioning in terms of depressive symptoms, than did either of the component treatments.



Mechanisms of Change

In this series of analyses, we took two approaches. First, we looked at the impact of each treatment condition on the processes that it was expected to affect, as well as those allegedly outside its domain; thus, we examined the degree to which each condition resulted in increased behavioral activation, decreased negative thinking, and alterations in depressogenic cognitive structures. Second, we tried to establish a temporal relationship between changes in particular mechanisms and outcome. We used the Pleasant Events Schedule (both frequency and pleasure ratings) as our measure of behavioral activation (MacPhillamy & Lewinsohn, 1971), the Automatic Thoughts Questionnaire (Hollon & Kendall, 1980) as our measure of dysfunctional thinking, and the Expanded Attributional Style Questionnaire (EASQ; Peterson & Villanova, 1988) as our measure of cognitive structures.

To examine the degree to which the three treatments resulted in mechanism changes, we compared pre- and posttreatment scores on the three measures using paired t tests. When clients were considered as an aggregate, there were significant improvements on each of these mechanism measures. Clients in all conditions increased their frequency and enjoyability of pleasant events; decreased their negative thinking; and showed significantly lowered tendencies to attribute negative events to internal, stable, and global factors.

We examined differential outcomes based on treatment condition. We used ANCOVAs to compare between-groups changes in these measures, with the pretreatment score on each of the criterion variables serving as covariate. None of these measures changed differentially as a function of treatment condition.

It was still possible that change in a mechanism could be a cause of later depression change in one treatment and a consequence in another (Hollon, DeRubeis, & Evans, 1987). One way of evaluating this possibility was to examine the temporal relationship between change in depression and cognitive and behavioral mechanisms. Again, following DeRubeis and Feeley (1990), we calculated residual change scores from pre- to midtreatment (early change) and from mid- to posttreatment (late change) on both the BDI and each mechanism measure. Table 7 shows the correlations between early residual change in cognitive and behavioral mechanisms and late residual change in depression in each treatment. Contrary to what was expected, early change in two subscales of the EASQ was associated with later change in the BA but not in the CT treatment. Participants in the BA treatment who made less negative attributions early in treatment became less depressed later in treatment. Also contrary to expectation, early change in frequency of pleasant events was associated with later change in depression in the CT treatment but not in the BA treatment.
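The residual change approach can be sketched in a few lines. The sketch assumes that a residual change score is the residual from a simple linear regression of the later score on the earlier score, which is the usual definition; the function names and all scores are hypothetical.

```python
from math import sqrt
from statistics import mean


def residual_change(earlier, later):
    """Residuals from regressing later scores on earlier scores."""
    mx, my = mean(earlier), mean(later)
    sxx = sum((x - mx) ** 2 for x in earlier)
    sxy = sum((x - mx) * (y - my) for x, y in zip(earlier, later))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(earlier, later)]


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / sqrt(saa * sbb)


# Hypothetical scores for five participants in one condition.
pre_mechanism = [4.0, 5.0, 6.0, 5.5, 4.5]
mid_mechanism = [3.5, 4.0, 5.5, 4.0, 4.5]
mid_bdi = [18, 15, 22, 12, 20]
post_bdi = [10, 9, 20, 5, 16]

early_mechanism_change = residual_change(pre_mechanism, mid_mechanism)
late_depression_change = residual_change(mid_bdi, post_bdi)
r = pearson_r(early_mechanism_change, late_depression_change)
```

Because each residual is the part of the later score not predicted by the earlier one, the correlation asks whether unusually large early mechanism change goes with unusually large later change in depression, computed separately within each treatment.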

We also examined the correlations between early residual change in depression and late residual change in cognitive and behavioral mechanisms. Early change in depression was not significantly related to later change in the EASQ or the PES in either of the treatment conditions.




We found no evidence in this study that CT is any more effective than either of its components. When one examines the means and standard deviations on our outcome measures, the null findings are unlikely to be attributable to inadequate power. The outcomes were quite comparable across treatment conditions and across outcome measures. Given the fact that our criteria for recovery were more stringent than in many previous studies, it is hard to compare the outcomes of this and other studies. However, our recovery rates were comparable with those of the TDCRP; despite a more severely depressed sample in this treatment study than in the TDCRP (as evidenced by higher mean BDI scores), the magnitude of change for participants in this study was comparable with those of previous CT studies.



The finding that BA alone is equal in efficacy to more complete versions of CT is important for both the theory and treatment of depression. We have ruled out threats to the internal validity of this study and of the results given earlier, which suggests that these are valid findings. Our competence ratings showed that the therapists were performing CT within the range typically viewed by experts as competent; also, the absence of superiority for CT is not accounted for by unwanted overlap between treatments. The adherence ratings suggest that the treatments were quite discriminable and that the therapists did an excellent job of sticking to the treatment protocols. Thus, despite the fact that the treatments were distinct, the outcomes were indistinguishable, at least in the short term.

Furthermore, the treatments were not significantly different at follow-up. The parametric analyses included the entire sample, thus preserving random assignment. With these analyses, there were no overall differences between groups at the time of the 6-month follow-up, and groups did not change differentially during the follow-up period. All groups maintained their treatment gains for the most part during the short follow-up period. When relapse rates were examined, either parametrically in terms of the number of well weeks or nonparametrically in terms of the proportion of participants who had relapsed, CT once again failed to outperform component treatments.

Thus, participants with depression who received BA alone did as well as those who were additionally taught coping skills to counter depressive thinking. Furthermore, both component groups improved as much as those who received interventions aimed at modifying cognitive structures, specifically underlying assumptions and core schema. These findings run contrary to hypotheses generated by the cognitive model of depression put forth by Beck and his associates (1979), who proposed that direct efforts aimed at modifying negative schema are necessary to maximize treatment outcome and prevent relapse. These results are all the more surprising, given that they run counter to the allegiance effect (Robinson, Berman, & Neimeyer, 1990), which is quite commonly related to outcome in psychotherapy research. All of the therapists expected CT to be the most effective treatment, and morale was low whenever a case was assigned to BA. Moreover, Keith S. Dobson, one of the clinical supervisors in the TDCRP, expected CT to outperform the alternative treatments. In short, although the null hypothesis can never be accepted, especially in response to one study with negative findings, the distinctiveness of the treatments as well as the allegiance of the therapists and supervisor make the absence of a treatment effect more convincing than would otherwise be the case.

These results raise questions as to the theory of change put forth in the CT book by Beck and his associates. They also raise questions as to the necessary and sufficient conditions for change in CT. These questions are more pronounced in light of the failure to find evidence that the various treatments were associated with differential change in the mechanisms they targeted. In fact, our analyses of moderator effects yielded the counterintuitive finding that changes in attributional style were most inclined to be followed by decreased depression in BA, not in CT, as one would expect given the cognitive theory of change. It seems as if clients who responded positively to activation were also those who altered their predictions regarding how they would respond to negative life events that might occur. Because this was not a predicted finding, it should be interpreted with caution. Nevertheless, if measures of attributional style are thought of as predictions regarding hypothetical future encounters rather than measures of cognitive structure, it may be that patients with depression who respond positively to activation instructions are also those who make more optimistic predictions once they are provided with interventions designed to place them in touch with potential sources of positive reinforcement. Of course, it is also possible that BA-focused treatments are more effective ways of changing the way people think than treatments that explicitly attempt to alter thinking. Perhaps the exposure to naturally reinforcing contingencies produces changes in thinking more effectively than the explicitly cognitive interventions do.

If BA and AT treatments are as effective as CT and also are as likely to modify the factors that are thought to be necessary for change to occur, then not only the theory but also the therapy may be in need of revision. Both BA and AT are more parsimonious treatments than CT and might be more accessible to less experienced or paraprofessional therapists. Because the intervention choices are fewer and more straightforward, these component treatments may also be more amenable to less costly alternatives to psychotherapy, such as self-administered or peer support treatments (cf. Christensen & Jacobson, 1993).

Many questions need to be answered before one can draw negative conclusions about the theory of change put forth by Beck et al. (1979). For one thing, it may be that CT will prove to be effective in preventing recurrence relative to the component treatments. If that proves to be the case, we will have shown that the schema modification component of CT has a prophylactic effect, although it may not facilitate acute treatment response. As our 12-month, 18-month, and 2-year follow-up data come in, we will be able to compare the treatments in terms of their relapse-recurrence prevention.

Finally, we acknowledge current limitations in our ability to measure the constructs that were targeted for intervention by the three treatment conditions. It could be that the absence of an association between treatment condition and target mechanism has more to do with the inadequacy of currently available measuring instruments than with the absence of differential change mechanisms. This concern is especially acute for measures of negative schema, in which paper-and-pencil measures have been criticized. We recognize the limitations of these methods and acknowledge that if proper measures existed, the association between mechanism and treatment condition might indeed be stronger.



Cognitive Behavioral Therapy: Escape From the Binds of Tight Methodology

CBT has become rooted as proven dogma in the treatment of depression despite large problems remaining in methodology of CBT clinical trials and the logic behind how CBT works. This article will describe the major methodologic problems in the clinical trials of CBT.

Dedicated to the Amazing Houdini

American stunt performer (March 24, 1874 – October 31, 1926)

The start of Chapter 3 of the famous book Feeling Good by David D. Burns1 on cognitive behavioral therapy (CBT) hit me, “Depression is not an emotional disorder at all! Every bad feeling you have is the result of your negative thinking.” In this paper, I intend to give this conclusion some good natured trouble.

CBT has become rooted as a proven dogma in the treatment of depression in spite of large problems remaining in the methods of CBT clinical trials and the logic of how CBT works. In this paper, I would like to discuss 4 major problems. (“Depression” in this article is defined as DSM-IV major depressive disorder [MDD]2).

  1. The premise of CBT that negative cognitions are the cause of MDD is the only instance in all of medicine and psychiatry where a symptom of an illness is also construed to be the cause.

The diagnosis of MDD includes negative cognitions as a symptom (ie, feeling worthless or excessive or inappropriate guilt2), and it is known both in clinical practice and in research, that negative cognitions may resolve either with antidepressant medications or with cycling out of the depression.3

Negative cognitions as a symptom can also make depression worse. However, asthmatic coughing as a symptom of asthma may also make asthma worse. If we make a therapy that helps to decrease coughing in asthma, we might conclude the therapy was efficacious in asthma. Is that really true? It depends on whether you state that the therapy helped asthmatics to feel and function better (true) or that the therapy is the fundamental treatment of asthma (false).

  2. The statement that CBT clinical trials are “randomized and controlled” obfuscates that the studies are not double-blind (ie, neither subjects nor therapists in psychotherapy studies are blind to the type of treatment).

No CBT study (indeed, no psychotherapy study) can be a double-blind study. They may be single-blinded, in that the rater may not know the treatment the patient received, but neither the patients nor the therapists can be blinded to the type of therapy given (two out of three of the persons involved in the trial, ie, all of the persons involved in the treatment, are unblinded). Moreover, the patient must be an active participant in correcting negative distorted thoughts, so they are quite aware of the treatment group they are in.

While a drug study can use a double-blinded placebo control, psychotherapy arms that are called controls are not blind placebos. The therapist is also likely to be a believer in the therapy approach and may transmit this hope to the patient in some way, and large uncontrolled bias is the result in these studies.

In addition, MDD studies are known to have large random error because of the inclusion of subjects with a variety of mildly low mood, investigator and patient preference, imperfect rating instruments, etc. Bias can then lead to a result very far from the true value (Figure 1).4 A study on bias in treatment outcome studies concluded that the results of unblinded randomized clinical trials (RCTs) tended to be biased toward beneficial effects if the RCTs’ outcomes were subjective (as they are in psychotherapy studies) as opposed to objective.5

A recent meta-analysis examined the effectiveness of CBT when placebo control and blindedness were factored in.6 Pooled data from published trials of CBT in schizophrenia, MDD, and bipolar disorder that used controls for non-specific effects of intervention were analyzed. This study concluded that (1) CBT is no better than non-specific control interventions in the treatment of schizophrenia and does not reduce relapse rates; (2) CBT is not an effective treatment strategy for prevention of relapse in bipolar disorder; (3) CBT treatment effects are small in treatment studies of MDD. For MDD, the authors note that the pooled effect size was very low at 0.28 (Hamilton Depression Scale) and 0.27 (Beck Depression Inventory). Remember, these studies are still not double-blind, which makes the findings of even small efficacy questionable.

When medication arms are compared to a psychotherapy arm, they also include a blind placebo drug arm, but no blind psychotherapy arm. This inherently makes the study prejudiced against the medication arms, making these studies fatally flawed.7

Another recent meta-analysis found no differences between directive or non-directive therapies when controlled for researcher allegiance, and that most of the effects of therapy were realized by non-specific factors.8

  3. Symptoms in MDD include primary symptoms such as low mood, and negative cognitions as secondary reactions to these symptoms such as hopelessness and despair that may be easily assuaged by a psychotherapy. The person is then deemed a responder because “responder” is defined as a 50% improvement on a rating scale.

That negative cognitions may resolve either with antidepressant medications or with cycling out of the depression supports the notion that the negative cognitions were secondary to the depressed mood.3

The concept of primary and secondary symptoms is also not new to psychiatry. Some authors have described psychological symptoms as a consequence of physical symptoms,9 and negative symptoms in schizophrenia may be classified as primary and etiologically related to the core pathology of schizophrenia, or secondary negative symptoms, some of which are derivative of other symptoms of schizophrenia (ie, reclusive behavior resulting from paranoia).10

In this way, negative conclusions such as, “I don’t deserve anything,” “I am a nobody,” “no one likes me,” etc, can be seen as a psychological reaction to depressed mood. Many therapists have seen that giving persons hope and support can alleviate symptoms and decrease depression scores, but the person still suffers from the disorder. Even if the psychological symptoms of negative self content are construed to not be secondary, it is easy to consider how they may be more pliable to un-blinded psychotherapeutic intervention.

In addition, because “response” in a clinical trial of MDD is defined as a 50% improvement on a rating scale, “response” can result from the assuaging of psychological pain, thus making the patient a “responder” in a clinical trial without actually changing the underlying biologic illness of MDD.11
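The arithmetic behind the 50% criterion can be made concrete with a minimal sketch. The function names and the scores used below are hypothetical and chosen only for illustration; they are not taken from any trial or from a specific rating scale.

```python
def percent_improvement(baseline, endpoint):
    """Percentage drop in a symptom score from baseline to endpoint."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - endpoint) / baseline


def is_responder(baseline, endpoint, threshold=50.0):
    """True when the improvement meets the threshold (50% by default)."""
    return percent_improvement(baseline, endpoint) >= threshold


# Hypothetical patient: the score falls from 24 to 12.
# This counts as "response", yet half of the baseline score remains.
patient_is_responder = is_responder(24, 12)
residual_score = 12
```

The sketch shows only that "response" is a relative threshold: it says nothing about which items on the scale changed or about how much symptom burden is left.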

While CBT trials have been shown to maintain gains in depression over long study periods, we are still left with the problem that a psychotherapy trial cannot be double-blinded, which puts the validity of these long-term studies into question.

  4. Patients’ response to psychotherapy can strongly differ depending on whether they have non-melancholic, melancholic, or psychotic MDD (Figure 2), and this can critically affect the results of a clinical trial.

Pre-treatment severity of MDD symptoms portends better outcome with antidepressants,12 suggesting that the worse MDD is, the more biologically based the etiology. In addition, patients with non-melancholic MDD are not deemed to clearly have an illness with biologic underpinnings compared to those with more severe melancholic and psychotic features.13,14

If a patient does not have an actual biologic depression, then they more easily improve because their non-melancholic depressed mood was based on personality issues and/or psychosocial problems.15 The psychotherapy intervention being studied will then be NON-inferior to any other intervention, and if the patients and therapists are studying CBT, the element of hope and expectation on the part of the patients to get better in these non-blinded trials will bias the results in favor of the CBT arm of the study. In addition, non-melancholic MDD is thought to be the most common form of MDD,16 thus the most likely type of MDD subject to enter a CBT trial, and the informed consent procedure biases the subjects who enter to those that are favorably inclined to the psychotherapy.17

Until a reliable biologic marker for MDD is discovered, only persons with at least a melancholic MDD should be considered to have MDD for the purpose of being included in such trials, assuming they can be double-blind. Unfortunately, these studies can never be double-blinded.

The combination of points raised in this article leads to the conclusions that:

  1. CBT may help depressed persons function and feel better, but it is not an intervention proven to treat the core pathology of MDD
  2. CBT is not a psychotherapeutic modality proven to be better than any other type of psychotherapeutic intervention for MDD
  3. CBT should not be given as mono-therapy for persons with melancholic or psychotic MDD
  4. Our field must not allow studies that are not double-blinded to be called “controlled” or “evidence-based”; they need to be in a different category, ie, “uncontrolled clinical data”

I can see what they mean now on the back side of the Feeling Good book that states, “The amazing, scientifically proven techniques described by eminent psychiatrist David D. Burns, MD, will show you what you can do immediately lift your spirits and develop a positive outlook on life.”1 Reminds me of a Houdini show promotion poster circa 1920.





Cognitive behavioural therapy for major psychiatric disorder: does it really work? A meta-analytical review of well-controlled trials

Background. Although cognitive behavioural therapy (CBT) is claimed to be effective in schizophrenia, major depression and bipolar disorder, there have been negative findings in well-conducted studies and meta-analyses have not fully considered the potential influence of blindness or the use of control interventions.

Method. We pooled data from published trials of CBT in schizophrenia, major depression and bipolar disorder that used controls for non-specific effects of intervention. Trials of effectiveness against relapse were also pooled, including those that compared CBT to treatment as usual (TAU). Blinding was examined as a moderating factor.

Results. CBT was not effective in reducing symptoms in schizophrenia or in preventing relapse. CBT was effective in reducing symptoms in major depression, although the effect size was small, and in reducing relapse. CBT was ineffective in reducing relapse in bipolar disorder.

Conclusions. CBT is no better than non-specific control interventions in the treatment of schizophrenia and does not reduce relapse rates. It is effective in major depression but the size of the effect is small in treatment studies. On present evidence CBT is not an effective treatment strategy for prevention of relapse in bipolar disorder.

Received 20 August 2008; Revised 11 March 2009; Accepted 18 March 2009; First published online 29 May 2009

Key words: Bipolar disorder, cognitive therapy, depression, schizophrenia.




Cognitive behavioural therapy (CBT) has been widely adopted by psychiatry in recent years, but its increase in use in the severe disorders of schizophrenia, major depression and bipolar disorder is particularly noteworthy. This is because it challenges what has, until recently, been a dominance of biological approaches to these disorders. Thus, although contemporary accounts of schizophrenia (e.g. Picchioni & Murray, 2007) emphasize biological factors in its aetiology and consider neuroleptic drugs to be the mainstay of treatment, official UK treatment guidelines from the National Institute for Clinical Excellence (NICE) also state that psychological interventions are indispensable and that CBT should be offered to all patients (NICE, 2003, 2009). Psychological factors may loom larger in the aetiology of major affective disorder, but when it comes to treatment, the emphasis in the literature, particularly in bipolar disorder, has once again been firmly on pharmacotherapy. Attitudes may be changing here too, however. References to the effectiveness of CBT are pervasive in the UK depression treatment guideline (NICE, 2004); a government initiative is under way in the UK to provide CBT for depression and anxiety in 250 dedicated therapy centres (Layard, 2006); and CBT is being advocated for relapse prevention in bipolar disorder (e.g. Scott & Colom, 2005; Basco & Rush, 2007).

Nevertheless, a cursory look at the literature reveals well-conducted trials where CBT has had negative findings in all three disorders. For example, large-scale trials of CBT in schizophrenia have failed to find significant advantages over befriending (Sensky et al. 2000) or supportive counselling (Lewis et al. 2002). In depression, the National Institute of Mental Health (NIMH) study of brief psychotherapeutic interventions found only marginal evidence for the effectiveness of interpersonal psychotherapy and none for cognitive therapy (Elkin et al. 1989). A recent large trial of CBT for prevention of relapse in bipolar disorder found no advantage over treatment as usual (TAU) (Scott et al. 2006). In fact, the perceived efficacy of CBT in all three disorders seems to rest principally on meta-analysis, where it has been concluded, for example, that: ‘The positive results … can therefore be taken as confirming the promise of cognitive behavioural treatment in schizophrenia’ (Pilling et al. 2002); ‘cognitive therapy has been demonstrated effective in patients with mild or moderate depression and its effects exceed those of antidepressants’ (Gloaguen et al. 1998); and ‘the use of psychological therapies as an adjunct to medication [in bipolar disorder] is likely to be clinically and cost effective’ (Scott et al. 2007).

A feature of these and other meta-analyses, however, is the lack of consideration they have given to bias caused by lack of blinding and the failure to use a control intervention. For example, out of seven meta-analytical reviews of CBT for schizophrenia (Gould et al. 2001; Rector & Beck, 2001; Pilling et al. 2002; Jones et al. 2004; Tarrier & Wykes, 2004; Zimmermann et al. 2005; Wykes et al. 2008), only two (Zimmermann et al. 2005; Wykes et al. 2008) examined the influence of blindness on effect size, and neither of these attempted to establish the treatment’s effectiveness in trials that used both blinding and a control intervention. Nor was blindness addressed in either of the two benchmark meta-analyses of CBT for depression (Gloaguen et al. 1998; Churchill et al. 2001). The way in which CBT was compared against other psychological interventions in Gloaguen et al.’s (1998) meta-analysis has also been criticized (Parker et al. 2003).

Noting that there is increasing evidence that inadequate quality of trials can translate into biased findings of systematic reviews in health care, Jüni et al. (2001) recommended that the influence of study quality should be examined routinely. They also argued that it is preferable to do this by examining the influence of key components of methodological quality individually rather than by means of summary scores from quality scales, which are problematic for several reasons. This meta-analysis therefore examines the effectiveness of CBT in studies that have attempted to guard against two of the most familiar and important sources of bias in treatment trials, lack of blinding and failure to use a control intervention.


We included studies that examined the effectiveness of CBT in adults (i.e. not adolescents or elderly subjects) meeting any diagnostic criteria for schizophrenia (some of which also allowed patients with schizoaffective disorder and delusional disorder), major depression or bipolar disorder. CBT was defined as an intervention whose core elements include the recipient establishing links between their thoughts, feelings and actions and target symptoms; correcting misperceptions, irrational beliefs and reasoning biases related to these target symptoms, involving monitoring of one’s own thoughts, feelings and behaviours with respect to the symptom; and/or the promotion of alternative ways of coping with target symptoms.

The studies were required to use a control intervention that the study investigators either explicitly considered not to have specific therapeutic effects or which might reasonably be regarded as lacking these (e.g. supportive therapy, psycho-education, relaxation). We also included studies comparing CBT to pill placebo (which have only been carried out in major depression). Blindness of evaluations was not specified as a requirement for inclusion, but was examined as a moderator variable. In keeping with the general approach of meta-analysing methodologically rigorous trials, we did not include studies with small sample sizes (<10 participants in either group) or studies that were identified by the authors as pilot studies. Excluded studies are given as Supplementary material (available in the online version of the paper). We also meta-analysed studies of CBT for prevention of relapse, even though many of these used TAU as the comparison condition rather than a control intervention. This was on grounds that (a) relapse is a relatively objective outcome measure that should be robust to the effects of subject and observer bias; and (b) relapse prevention has been a major focus of studies of CBT in depression and constitutes the only type of study that has been carried out in bipolar disorder. Nevertheless, we also examined the use of TAU or a control intervention as a moderator variable, where possible, in these studies. To be included, studies had to use a symptomatic definition of relapse, rather than simply equating this with rehospitalization, and had to define relapse according to predetermined criteria.

Studies were searched using existing comprehensive meta-analyses of CBT for schizophrenia (Jones et al. 2004), depression (Gloaguen et al. 1998; Churchill et al. 2001; Vittengl et al. 2007) and bipolar disorder (Scott et al. 2007), supplemented by electronic searches of the literature (Medline, EMBASE and PsycINFO). For the electronic search, we chose inception dates of 5 years before the publication of the earliest of the above meta-analyses, which would have captured earlier studies. The search was conducted up to the end of January 2009. Review articles and the reference lists of all obtained papers were checked, as were research databases for trials. Only published studies were included. There were no restrictions on year of publication or language. Tables A1 and A2 in the Appendix provide details on the included studies.

Data were synthesized using standard meta-analytical techniques. Studies comparing the effect of CBT against a control intervention were pooled from continuous measures (i.e. symptom scores) using an effect size measure, Cohen’s d (Hedges’ correction was used). The end-point was the end of the acute treatment phase as defined by the investigators. In line with common meta-analytical practice, effect sizes obtained from a range of different symptom rating scales were pooled; we did not attempt to carry out separate analyses for the different scales, unless there were fundamental conceptual differences between them (e.g. self-rated versus observer-rated). Odds ratios (ORs) were calculated for relapse rates. Fixed-effects analysis was used in both cases (random effects analyses gave similar results). Intention-to-treat analysis was used if relevant data were available (typically in relapse studies) or, if not, on the numbers remaining at the end of the study period. Two of the investigators extracted effect sizes and ORs by consensus. All results were checked twice. Heterogeneity was assessed by means of the Q-statistic.
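The pooling steps described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the software used in the review: it assumes the usual inverse-variance fixed-effect model and the common approximation to the sampling variance of a standardized mean difference. The function names and the example numbers are hypothetical.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Cohen's d with Hedges' small-sample correction, plus its variance."""
    df = n_t + n_c - 2
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # Hedges' correction factor
    # Common large-sample approximation to the variance of d
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return j * d, j ** 2 * var_d


def fixed_effect_pool(effects, variances):
    """Inverse-variance fixed-effect pooled estimate, its SE, and Cochran's Q."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return pooled, se, q


# Hypothetical end-of-treatment symptom scores (lower is better), so a
# negative effect size favours the treatment arm, as in the review.
studies = [
    hedges_g(10.0, 12.0, 4.0, 4.0, 20, 20),
    hedges_g(15.0, 15.5, 5.0, 5.0, 40, 40),
]
pooled, se, q = fixed_effect_pool([g for g, _ in studies], [v for _, v in studies])
ci_low, ci_high = pooled - 1.96 * se, pooled + 1.96 * se
```

Under the null hypothesis of homogeneity, Q is referred to a chi-squared distribution with one fewer degrees of freedom than the number of studies, which is why the results below are reported as, for example, Q(8) for nine trials.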



Effectiveness on symptoms

Nine trials were found. We excluded two studies of first-episode psychosis (Jackson et al. 2008; Lecomte et al. 2008) because they both contained a high proportion of patients (>20%) with affective psychotic diagnoses. The studies were carried out on both acute and chronic patients and the period of treatment ranged from 5 weeks to 9 months. The control interventions used were supportive counselling/supportive therapy (n=5), befriending (n=1), group psycho-education (n=1), recreational therapy (n=1) and social activity therapy (n=1). Two were open studies and seven were carried out under blind conditions. Several studies did not provide overall symptom scores but instead gave separate scores for positive and negative symptoms (and sometimes disorganization or general psychopathology). To maximize the number of usable studies, therefore, a combined effect size for all symptoms for each study was first calculated by averaging the effect sizes for these symptoms (this was done using the individual effect sizes and standard errors, using a random effects model and testing for homogeneity in each case). Effects on positive and negative symptoms were then examined separately.

The findings are shown in Fig. 1. The pooled effect size was −0.08 [95% confidence interval (CI) −0.23 to +0.08, p=0.34] (the negative sign favours CBT). The studies were not significantly heterogeneous [Q(8)=9.28, p=0.32]. As Fig. 1 suggests, the two non-blind studies had a significantly larger pooled effect size than the seven blind studies (−0.63 v. 0.00) [QB(1)=6.38, p=0.01]. Dividing studies into those carried out on acute patients (n=1), mixed or unspecified patients (n=6) and chronic patients (n=2) did not reveal differences [effect size +0.10, −0.17 and −0.04 respectively, QB(2)=1.82, p=0.40]. The overall effect size was increased only slightly by excluding the single study that used a group therapy form of CBT (Bechdolf et al. 2004) (effect size for eight studies=−0.11, 95% CI −0.29 to +0.06, p=0.19).

Eight studies reported findings for positive symptoms and seven for negative symptoms. The pooled effect size for positive symptoms was −0.19 (95% CI −0.37 to −0.02, p=0.03), favouring CBT. Once again, however, the result was moderated by blindness: the effect size in the six blind studies was −0.08 compared to −0.87 in the two non-blind studies [QB(1)=9.28, p=0.002]. The pooled effect size for negative symptoms was −0.02 (95% CI −0.22 to +0.18); here, blindness did not moderate the effect size [effect size for five blind studies +0.04 v. −0.26 for two non-blind studies, QB(1)=1.36, p=0.24].
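The moderator tests reported as QB can be illustrated with a short sketch. It assumes the standard fixed-effect partition of heterogeneity, in which total Q equals the Q within subgroups plus the Q between subgroups; the review does not spell out its computation, so this is one conventional reading. The effect sizes and variances are invented for illustration and are not the trial data.

```python
def pool(effects, variances):
    """Inverse-variance weighted mean and Cochran's Q for one set of studies."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - mean) ** 2 for w, e in zip(weights, effects))
    return mean, q


def q_between(groups):
    """Between-subgroup heterogeneity: Q_total minus the summed Q_within.

    groups maps a subgroup label to a (effects, variances) pair.
    Returns (Q_B, degrees of freedom).
    """
    all_effects = [e for effects, _ in groups.values() for e in effects]
    all_variances = [v for _, variances in groups.values() for v in variances]
    _, q_total = pool(all_effects, all_variances)
    q_within = sum(pool(effects, variances)[1] for effects, variances in groups.values())
    return q_total - q_within, len(groups) - 1


# Hypothetical subgroups: blind versus non-blind evaluations.
example = {
    "blind": ([0.0, 0.1], [0.04, 0.04]),
    "non_blind": ([-0.6, -0.7], [0.04, 0.04]),
}
qb, df = q_between(example)
```

QB is then compared with a chi-squared distribution on df degrees of freedom, which is one for a blind versus non-blind contrast.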

Effectiveness against relapse

Eight studies were found. These had follow-up periods of 6 months to 3 years. We did not include two studies (Drury et al. 2000; Turkington et al. 2008) because there was a 5-year interval between treatment and assessment during which there was no intervention or evaluation. Three of the studies compared CBT against TAU, and five included comparison groups of supportive counselling. Six rated relapse under blind conditions and two under non-blind conditions. The studies defined relapse in terms of increases in positive symptoms, usually requiring that the increase lasted a specified period and sometimes with a requirement of hospitalization or change in management (see Appendix).

The findings are shown in Fig. 2. The pooled OR for these studies was 1.17 (95% CI 0.88–1.55, p=0.29), non-significantly favouring TAU. The studies were not significantly heterogeneous [Q(7)=11.89, p=0.10]. Blindness moderated the effect size at trend level [OR for six blind studies 1.35 v. 0.72 for two non-blind studies, QB(1)=3.28, p=0.07]. However, use of control intervention was not a significant moderating factor [QB(1)=0.02, p=0.89]. Once again, there was nothing to suggest that inclusion of studies using group CBT was influencing the result [OR for six studies using individual CBT 1.12 v. 1.01 for two studies using group CBT, QB(1)=0.20, p=0.66].

In the study of Garety et al. (2008a) we analysed relapse data in patients who had made a full or partial recovery. However, Garety et al. (2008b) have argued that these rates do not reflect the true intention-to-treat effect because patients were randomized to CBT or TAU while they were ill; some failed to recover (CBT, n=9; TAU, n=18) and so did not have the opportunity to relapse. Adjusting the total numbers for CBT and TAU to include patients who were randomized but did not recover made little difference to the pooled OR (1.20, 95% CI 0.91–1.59, p=0.19).
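For the relapse analyses, a pooled OR of this kind can be sketched as below. The sketch assumes inverse-variance pooling of log odds ratios with the Woolf variance; the review states only that a fixed-effects analysis was used, so the exact estimator (for example, Mantel-Haenszel) is an assumption here. The counts are hypothetical, and cells with zero events are not handled.

```python
import math


def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Log odds ratio of relapse (treatment v. comparison) and its Woolf variance."""
    a, b = events_t, n_t - events_t  # relapsed / not relapsed, treatment arm
    c, d = events_c, n_c - events_c  # relapsed / not relapsed, comparison arm
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pooled_odds_ratio(studies):
    """Fixed-effect pooled OR with a 95% confidence interval.

    studies is a list of (events_t, n_t, events_c, n_c) tuples.
    """
    pairs = [log_odds_ratio(*study) for study in studies]
    weights = [1 / var for _, var in pairs]
    mean = sum(w * lor for w, (lor, _) in zip(weights, pairs)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return math.exp(mean), math.exp(mean - 1.96 * se), math.exp(mean + 1.96 * se)


# Hypothetical relapse counts; an OR above 1 means more relapse with treatment.
or_pooled, ci_low, ci_high = pooled_odds_ratio([(10, 30, 5, 30), (12, 40, 12, 40)])
```

Changing the denominators n_t and n_c, as in the adjustment for patients who were randomized but did not recover, alters the non-relapse cells b and d and therefore the OR, without changing the number of relapses counted.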

Major depression

Effectiveness against symptoms

Ten studies were found. These all excluded patients with bipolar disorder or psychotic depression. Six of the studies compared patients against a control psychological intervention and four against pill placebo. The studies all measured symptoms using the observer-rated Hamilton Depression Rating Scale (HAMD) or the self-rated Beck Depression Inventory (BDI), or both. Because the former scale is observer rated and the latter a self-rating questionnaire, we meta-analysed data from these scales separately.

Figure 3 shows the result for the nine studies using the HAMD. The pooled effect size was −0.28 (95% CI −0.45 to −0.12, p=0.001), significantly favouring CBT. The studies were not heterogeneous [Q(8)=9.40, p=0.31]. The effect size was significantly greater in the four studies comparing CBT to pill placebo than in the five comparing it to control psychological intervention [−0.41 v. 0.00, QB(1)=4.94, p=0.03]. Blindness of evaluations did not significantly moderate the effect size in these studies [pooled effect size for five blind studies −0.39 v. −0.16 in three non-blind studies; QB(1)=1.00, p=0.32] (the study of Scott and Freeman, 1992 was excluded from this analysis because of uncertainty over whether blindness had been maintained).

The pooled effect size for the eight studies using the BDI was similar at −0.27 (95% CI −0.45 to −0.08, p=0.004). Use of psychological control intervention (five studies) or pill placebo (three studies) did not moderate the effect size in these studies (−0.27 v. −0.27). The BDI is a self-rated scale and so none of these studies could be considered blind.

Effectiveness against relapse

Nine studies were included. We excluded four studies (Evans et al. 1992; Hollon et al. 2005; Segal et al. 2006; Dobson et al. 2008) because of systematic bias: the patients in the control group, but not those in the CBT group, had been treated with antidepressant medication until immediately before withdrawal at the start of the study, so potentially increasing the risk of depressive relapse in this group. All but one of the studies compared CBT to TAU (Perlis et al. 2002 compared it to pill placebo), and in all but one cases relapse was determined by an assessor who was blind to allocation. Relapse was typically defined as development of symptoms meeting diagnostic criteria for major depression; however, three studies allowed a supplementary criterion based on development of depressive symptoms exceeding a predetermined threshold but not meeting criteria for major depression (Shea et al. 1992; Paykel et al. 1999; Perlis et al. 2002).

The studies are summarized in Fig. 4. The pooled OR was 0.53 (95% CI 0.40–0.71, p<0.001). The studies were not significantly heterogeneous [Q(7)=8.60, p= 0.38]. All, or nearly all, of the studies were blind (blindness was not commented on in the study of Shea et al. 1992), and all but one (Perlis et al. 2002) compared CBT to TAU. Therefore, these moderating variables were not examined.

In two studies patients in both groups remained on antidepressant medication throughout the follow-up period, whereas in five, both groups were withdrawn from medication either before study entry or within the first 20 weeks of a 2-year follow-up (in the other two studies some patients in both groups were treated). The pooled ORs for studies on treated and untreated patients were 0.52 and 0.45 respectively [QB(1)=0.17, p=0.67].

Bipolar disorder

Effectiveness against relapse

There were no includable trials of CBT as a treatment for acutely ill patients. Four controlled trials of CBT for prevention of relapse have been carried out and are shown in Fig. 5. They all compared CBT to TAU and the assessments were all made under blind conditions. In three of the studies relapse was defined as development of symptoms sufficient to meet diagnostic criteria for major depression, mania, hypomania, or a mixed state; the fourth required a defined period of moderate/severe or incapacitating depressive or manic symptoms. The pooled OR for the four studies was non-significant at 0.78 (95% CI 0.53–1.15, p=0.22).


Studies of psychological therapies in major psychiatric disorder have not used, and perhaps will never be able to use, precisely the same methodology as that used to establish the efficacy of drug treatments, namely the double-blind, placebo-controlled trial. However, when those studies whose design approximates to this methodology are reviewed, their findings are at variance with the conclusions expressed in review articles, meta-analyses, editorials and even government documents.

The contrast is at its starkest in schizophrenia. In a recent editorial, Kingdon (2006) stated: ‘More than 20 randomized controlled trials and five meta-analyses have shown cognitive behaviour therapy to be beneficial in schizophrenia, reducing both positive and negative symptoms during therapy and beyond.’ Yet pooling the results of nine trials comparing CBT to non-specific control interventions reveals no indication of effectiveness. Nor does meta-analysis of a similar-sized body of evidence of CBT for relapse prevention yield any evidence of an effect. CBT for schizophrenia thus finds itself in the unusual position of being recommended in the revised NICE guideline (NICE, 2009), despite having failed in all of the treatment studies that used both control interventions and blind evaluations, and after the authors of the largest trial of relapse prevention (Garety et al. 2008a) concluded that ‘generic CBT for psychosis is not indicated for routine relapse prevention in people recovering from a recent relapse of schizophrenia.’

It could be objected that our meta-analysis of positive symptom scores revealed a small but significant effect size [−0.19 (95% CI −0.37 to −0.02), p=0.03] in favour of CBT. However, this advantage seemed clearly to reflect the lack of blindness of two of the trials; CBT showed no evidence of effectiveness against positive symptoms in the pooled results from six trials that used both control interventions and blind evaluations. Another ground for appeal might be that one relatively large study of the effectiveness of CBT in schizophrenia (Sensky et al. 2000) found that, although CBT was no better than a control intervention of befriending at the end of the 9-month treatment period, it did show a significant advantage at follow-up a further 9 months later. However, delayed or enduring effects have not been observed in other studies (Tarrier et al. 1999, 2004), and the most recent meta-analysis (NICE, 2008) found effect sizes for CBT against ‘active controls’ (mainly non-specific control interventions, but in one case cognitive remediation therapy) of only −0.18 (95% CI −0.39 to +0.03, five studies) at 12-month follow-up and −0.08 (95% CI −0.40 to +0.24, three studies) at 24 months.

A final objection could be that, in the meta-analysis of relapse rates, we did not include studies that used hospitalization as an index of relapse. This decision excluded a large study which found that CBT significantly reduced the rate of subsequent hospitalization in schizophrenia (Turkington et al. 2006). The NICE (2009) meta-analysis of this and four other studies also found a significant advantage for CBT in reducing rehospitalization (relative risk 0.76, 95% CI 0.61–0.94). Nevertheless, hospitalization is not the same thing as relapse; the decision to admit a schizophrenic patient depends not only on their clinical status but also on considerations of whether there is support outside hospital, whether the patient is likely to comply with treatment at home, etc., judgements of which could be influenced by knowledge that he or she is in the active treatment arm of a trial. Indeed, the fact that Turkington et al.’s (2006) trial, where hospitalization was the outcome measure, and Garety et al.’s (2008a) similarly large trial, where relapse was the outcome measure, had such completely contradictory results attests to the reality of the difference between these two measures.

However, CBT does emerge from our meta-analytical review as an effective treatment for major depression, both as a treatment for acute symptoms and for relapse prevention. Nevertheless, there is a qualification to this conclusion: at 0.28 (HAMD) and 0.27 (BDI) the pooled effect size for the acute treatment studies was in the small range, implying only modest therapeutic benefit. These findings bear comparison with those of the most exhaustive meta-analysis of psychological treatments for depression to date, the National Health Service (NHS) R&D Health Technology Assessment systematic review of brief psychological treatments for depression (Churchill et al. 2001). This found that all of a range of psychotherapeutic interventions showed significant advantages when compared to TAU or a waiting list control. CBT was also found to be significantly superior to supportive therapy. However, here the authors went on to state: ‘The overall quality score of the trials appeared to have a considerable effect on recovery and mean differences, with lower-scoring trials demonstrating a pronounced and highly significant difference and higher-scoring trials demonstrating no significant differences.’ Perhaps, more than anything else, our review makes it clear that a large, methodologically rigorous trial comparing CBT to a non-specific control intervention in depression, similar to the several that exist in schizophrenia, has yet to be carried out. We were able to find only five such studies, all of which were small and only one of which was carried out under blind conditions. This might be considered a somewhat slender evidence base on which to introduce 250 treatment centres providing CBT for depression and anxiety across the UK.

For understandable reasons, little work has examined the usefulness of CBT in patients who are acutely manic or hypomanic. However, pilot studies (Lam et al. 2000; Scott et al. 2001) gave grounds for optimism for its use in relapse prevention. Three out of the four formal trials then went on to find no significant advantage for CBT, including one with very large numbers (n=253). Meta-analysis of these trials supports the conclusion that this form of psychological therapy is ineffective in preventing relapse in bipolar disorder.

A certain amount of ambiguity concerning the nature of control interventions is evident in the meta-analytical literature on CBT. Sometimes the term ‘active control’ is used (e.g. NICE, 2009), with the implication, not always correct, that, similar to how the term is used in drug studies, the therapy is being compared against an intervention that also has established therapeutic benefits. In other meta-analyses, a strategy is adopted of evaluating CBT systematically against a range of different therapies, some of which, such as relaxation and supportive counselling, would be expected to have little or no therapeutic effect, whereas others, such as psychodynamic therapy, have clear therapeutic aims (e.g. Churchill et al. 2001; Cuijpers et al. 2008). However, it is important not to lose sight of the fact that we only included studies using control interventions that lacked any specific therapeutic effect. Thus, for example, Sensky et al. (2000) described befriending as a non-specific control intervention, whose benefits for people with schizophrenia do not have any underlying theoretical or empirical basis, where the sessions focused on neutral topics, such as hobbies, sports and current affairs, and in which psychotic or affective symptoms were not directly tackled in any way. Similarly, Churchill et al. (2001), in the NHS R&D Health Technology Assessment systematic review of brief psychological treatments for depression, defined supportive therapy as ‘an inclusive term, often used in treatment outcome trials to describe an attention-placebo condition to provide a comparison to active manualized psychological interventions.’ Certainly, these interventions can result in symptomatic improvement, but there is no mystery as to why this should occur. Psychological interventions are susceptible to the so-called Hawthorne effect (e.g. Gillespie, 1991), the tendency of people singled out for a study of any kind to improve their performance or behaviour simply because of the special attention they receive. (The name derives from an electricity plant in the USA where a famous series of studies established that just about any intervention significantly increased the workers’ productivity.)

Should evidence from well-controlled studies outweigh evidence from poorly controlled ones? Until recently the answer to this question would have been emphatically yes; it is a familiar story in medicine for a treatment to show promise in one or more open studies, and then perhaps be successful in a crossover trial, only to go on to fail miserably in double-blind, placebo-controlled, parallel group trials. This simple algorithm has been complicated by meta-analysis, which typically includes all studies, good and poor, published and unpublished, in an effort to arrive at the best possible estimate of the size of the treatment effect. Use of such a broad-brush approach makes subsequent examination of study qualities desirable, even mandatory. Yet there seems to have been a reluctance to do this in the meta-analytical literature on CBT in major psychiatric illness. Even the otherwise exemplary Cochrane meta-analysis of schizophrenia (Jones et al. 2004), which carried out separate analyses of CBT against TAU and supportive counselling, still failed to examine the moderating effect of blindness. The authors of meta-analyses of CBT for depression seem unperturbed by the fact that they are basing their conclusions on studies that have often been carried out against TAU or a waiting list control; that have not always been randomized; that sometimes failed to use diagnostic criteria; and that so far have ignored the moderating effect of blindness altogether. These issues are not trivial; the findings of our meta-analysis could be viewed as an object lesson on the importance of taking such sources of bias into account.



Critique of ‘Cognitive behavioural therapy for major psychiatric disorder: does it really work? A meta-analytical review of well-controlled trials’



Letter to the Editor: A comment on Lynch et al. (2009)






Meta-analysis (MA) is an essential tool for summarizing evidence for a specific intervention, but is prone to bias and not objective per se. Because many MAs have failed to report procedures in a transparent way that enables readers to assess strengths and weaknesses, a group of researchers developed the QUORUM guidelines (Moher et al. 1999; update: Moher et al. 2009). These list 19 major criteria which are deemed essential for transparent reporting of the method and results in a systematic review (overall there are 27 guidelines, referring to title, abstract, introduction, methods, results, discussion and funding). Lynch et al. (2009) only comply with five of these. For example, they do not present the full electronic search strategy including search terms or describe the process of study selection (e.g. screening, determining eligibility) or the process of data extraction (e.g. were different raters involved in the data extraction and how did they agree?). They do not list and define all variables for which data were sought and, although they emphasize the risk of over-interpreting results from methodologically weak studies, they do not describe methods for assessing risk of bias in the included studies, such as quality of randomization and blinding or drop-out rates. Moreover, they do not transparently describe the synthesis of results. The results section contains no flow diagram of the study selection or numbers of studies screened and there is no description of the included studies with regard to relevant study characteristics.

This lack of reporting makes it extremely difficult to understand their selection of studies. For example, one study that used an active control design (Levine et al. 1998) and was included in other meta-analyses (Lincoln et al. 2008; Wykes et al. 2008) was not even listed in the list of excluded studies (see supplementary online Appendix in Lynch et al. 2009), whereas a study by Hogarty et al. (1997) that used an intervention that was not considered as CBT by the author of that study or the authors of other meta-analyses was included. Some studies were excluded because of using additional elements in the intervention, such as motivational interviewing or family inclusion, whereas others were included although they also used motivational interviewing (Haddock et al. 2009) or involved family members (Drury et al. 1996). Other exclusion criteria are listed more explicitly but lack a strong rationale. For example, why was the label ‘pilot study’ an exclusion criterion, given all other criteria were fulfilled? This resulted in the exclusion of two relevant studies. Further, why was relapse restricted to defined symptom changes whereas studies focusing on rehospitalization or follow-up symptom scores – for which beneficial effects of CBT have been demonstrated (Lincoln et al. 2008) – were excluded? Despite other disadvantages, rehospitalization rates or days would have been the least prone to observer bias, which is what the authors were aiming at. The exclusion of studies that focused on rehospitalization alone reduced the pool of relevant studies by another five. Finally, a number of not previously defined exclusion criteria were added in the results section or appeared in the list of excluded studies, such as co-morbid substance abuse, the use of cognitive remediation as a control intervention, exceeding a certain percentage of affective psychoses or the use of 5-year follow-up periods. These criteria reduced the number of included studies by a further five. As the authors do not, in fact, restrict their analyses to blind or active-controlled studies, it is difficult to ascertain what the 13 studies that survived this selection process have in common.

What do we learn from this meta-analysis?

Several recent MAs (Zimmermann et al. 2005; Lincoln et al. 2008; Wykes et al. 2008) that identified small to medium effects for CBT have also investigated the effect of study quality on effect size. In particular, the MA by Wykes et al. (2008), which included 34 RCTs, tested the moderating effect of different aspects of study quality alone and in combination. They also found overall effect sizes to be smaller, albeit still significant, in studies using blind symptom ratings. Rating bias in observer-rated scales is a problem that is not restricted to psychological interventions (Margraf et al. 1991) and might be solved by focusing more on self-rating scales, which have been shown to assess positive symptoms with adequate reliability (Lincoln et al. in press). The MA by Lynch et al. is also not the first to take the study design into account. A separate integration of effect size according to whether studies included an active control intervention in addition to TAU or merely TAU has been conducted in other MAs, which also demonstrate effect sizes to be smaller in the active control group designs, but not absent (Jones et al. 2004; Zimmermann et al. 2005; Lincoln et al. 2008). Due to the larger data basis and more transparent methodology, the results from these MAs are more conclusive than those resulting from the selective methodology employed by Lynch et al.

Finally, even if the integration of all effect sizes from blind and actively controlled studies failed to find an effect for CBT, the conclusion that CBT is ineffective might be overly hasty. Although Lynch et al. set the premise that the control interventions must be unspecific, a closer look at the included control interventions reveals some of them to involve rather specific elements that are not always clearly distinguishable from CBT. For example, Durham et al. (2003) used a psycho-dynamic approach, developed to enable patients with psychosis to come to terms with past psychotic episodes and understand them in the context of their life history and feelings. The studies by Bechdolf et al. (2004) and Valmaggia et al. (2005), which also failed to find CBT superior to the control intervention, used psycho-education in the control intervention, which is certainly specific and has even been demonstrated to be effective under certain conditions (Lincoln et al. 2007). Exclusion of these three studies would have increased the effect size in the MA by Lynch et al. by over 50%. Although authors of an MA cannot be held responsible for inconsistencies in the primary outcome studies, they are responsible for selecting and integrating studies in a way that allows them to draw valid conclusions with regard to their primary hypotheses. In order to analyse the impact of CBT over and above non-specific effects resulting from therapist contact and supportive listening, an analysis of individual well-designed studies might have been more convincing.

In sum, it can be noted that Lynch et al. point the finger at some known weaknesses in the evaluation research of CBT for psychosis. However, they do not add much to the existing knowledge, apart from underlining once again that effect sizes are smaller in blind studies and when there are strong control interventions. In light of the evidence from other MAs and the methodological constraints in the MA by Lynch et al., the absence of significant effects should not be over-interpreted.


Meta-analysis is, as Lincoln points out, a tool. As such, it is doubtful whether anyone can be prescriptive about how and when it should be used. Recent examples of meta-analyses which were carried out on a subset of all the available data, and which did not go into exhaustive detail in their methods, but which nevertheless had clinically useful findings include Geddes et al. (2009) on lamotrigine for bipolar depression, Cuijpers et al. (2009) on psychotherapy for depression and Leucht et al. (2009) on atypical neuroleptics for schizophrenia. None of these studies featured flow charts.

Can tinkering with the studies we included and excluded in our meta-analyses make the pooled effectiveness of CBT for schizophrenia significant? The study of Levine et al. (1998), which Lincoln highlights, had six patients in the CBT arm and six in the control (supportive therapy) arm, and so was excluded on the (stated) grounds of being too small. Adding this study (ES −2.23) and three other small/pilot studies [Haddock et al. 1999 (n=8, 10, ES +0.57); Turkington & Kingdon, 2000 (n=10, 5, ES −1.14); Cather et al. 2005 (n=15, 13, ES +0.04)] to the meta-analysis of CBT against symptoms makes little difference to the pooled effect size (−0.09, 95% CI −0.25 to 0.06, p=0.22).
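One way to see why such small studies move a pooled estimate so little is to look at their inverse-variance weights. The sketch below is illustrative only and is not the authors' calculation: it assumes a fixed-effect, inverse-variance model, uses a common large-sample approximation for the variance of a standardized mean difference, and infers the total weight of the enlarged meta-analysis from the reported 95% CI (−0.25 to 0.06). The helper `smd_variance` is a hypothetical name.

```python
def smd_variance(n1, n2, d):
    """Approximate sampling variance of a standardized mean difference (assumed formula)."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))

# (label, size of one arm, size of the other arm, effect size) as quoted in the text;
# the variance formula is symmetric in the two arm sizes, so their order does not matter.
small_studies = [
    ("Levine et al. 1998", 6, 6, -2.23),
    ("Haddock et al. 1999", 8, 10, 0.57),
    ("Turkington & Kingdon 2000", 10, 5, -1.14),
    ("Cather et al. 2005", 15, 13, 0.04),
]

# Inverse-variance weight of each small study
weights = {label: 1.0 / smd_variance(n1, n2, d) for label, n1, n2, d in small_studies}
small_weight = sum(weights.values())

# Total weight implied by the reported pooled 95% CI: W = 1 / SE^2 (fixed-effect assumption)
pooled_se = (0.06 - (-0.25)) / (2 * 1.96)
total_weight = 1.0 / pooled_se ** 2

share = small_weight / total_weight
for label, w in weights.items():
    print(f"{label}: weight {w:.2f}")
print(f"share of total weight carried by the four small studies: {share:.1%}")
```

Under these assumptions the four small studies together carry roughly a tenth of the total weight, so even a large individual effect size such as −2.23 shifts the pooled estimate only modestly. A random-effects model would distribute the weights differently.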

Arguing for the exclusion of the study of Hogarty et al. (1997) from the meta-analysis of CBT against relapse in schizophrenia faces two problems. First, their definition of personal therapy emphasized identification and management of psychosis-related affect dysregulation through a process of internal coping, and so conforms to definitions of CBT. Second, this study was included in the meta-analyses of Pilling et al. (2002), the Cochrane review (Jones et al. 2004), and the original and revised NICE guidelines (NICE, 2003, 2009). It should also be noted that the lack of a significant pooled effect for CBT in our meta-analysis does not depend on the inclusion of this study (pooled OR for the remaining seven studies: 1.13, 95% CI 0.84–1.52, p=0.42).
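For a ratio measure such as the odds ratio, the confidence interval is approximately symmetric on the log scale rather than the raw scale, so any consistency check has to be carried out on logarithms. The sketch below assumes a Wald-type interval on the log odds ratio and applies it to the pooled OR of 1.13 (95% CI 0.84–1.52); `ratio_p_from_ci` is a hypothetical helper, not part of the original analysis.

```python
import math

def ratio_p_from_ci(ratio, lower, upper, z_crit=1.96):
    """Recover (SE of log ratio, z, two-sided p) from a ratio estimate and its 95% CI.

    Assumption: the interval is exp(log(ratio) +/- z_crit * SE).
    """
    log_se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / log_se
    # Two-sided p under the standard normal
    p = math.erfc(abs(z) / math.sqrt(2))
    return log_se, z, p

log_se, z, p = ratio_p_from_ci(1.13, 0.84, 1.52)
# On the log scale the point estimate should sit at the geometric mean of the limits
geometric_midpoint = math.sqrt(0.84 * 1.52)

print(f"SE(log OR)={log_se:.3f}, z={z:.2f}, p={p:.2f}")
print(f"geometric midpoint of the CI: {geometric_midpoint:.2f}")
```

Under this assumption the recovered p value rounds to the reported 0.42, and the geometric midpoint of the interval agrees with the quoted odds ratio to two decimal places.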

We feel it was uncontroversial to exclude two studies that Lincoln alludes to, which compared CBT to befriending (Jackson et al. 2008) and to social skills training (Lecomte et al. 2008), because they contained significant numbers of patients with affective psychosis. On the other hand, it required truly Solomonic judgement to decide whether or not to include a study which compared CBT to cognitive remediation therapy (Penadés et al. 2006), an intervention which, while potentially therapeutic, would not be expected to have any effect on psychotic symptoms. As it happens, however, none of these studies found a significant advantage for CBT.

The study using motivational interviewing plus CBT, which Lincoln considers we should have included or at least justified excluding (Haddock et al. 2003), was carried out on dual-diagnosis patients, not on patients just with schizophrenia.

Lincoln’s final point is that control interventions like befriending, supportive counselling and psychoeducation might not be completely therapeutically inert. This begs the question: if CBT cannot be shown to be better than these, does it really deserve such passionate advocacy?





Over-simplification and exclusion of non-conforming studies can demonstrate absence of effect: a lynching party?

A commentary on ‘Cognitive behavioural therapy for major psychiatric disorder: does it really work? A meta-analytical review of well-controlled trials’ by Lynch et al. (2009)



A number of meta-analyses of cognitive behavioural therapy (CBT) in severe mental illness have been published by its advocates (Wykes et al. 2008), independent commentators (Jones et al. 2004) and now finally by its opponents (McKenna, 2001, 2006; Turkington & McKenna, 2003). Lynch et al. (2009) conclude that there is a small effect of CBT in severe depression but no effect in bipolar disorder and schizophrenia.

In bipolar disorder, there have been relatively few studies and the largest reported a negative finding which has overshadowed promising earlier findings. This may be related to the treatment group selected in that study or the therapeutic intervention used. Lam et al. (2003) have described an adaptation of CBT, whereas the intervention in the larger study seems to have been more behavioural in form, concentrating on early intervention, treatment adherence and relapse prevention (Scott et al. 2006).

The findings on schizophrenia are more interesting. Meta-analyses and reviews published so far have found favourably for CBT. Lynch et al. (2009) find differently. For relapse prevention, this may be because a large new study has produced negative findings in this area and also because of the use of studies that do not fit their inclusion criteria. For example, Hogarty’s Personal Therapy has very different origins and practice from CBT and was not described as such by him (Hogarty et al. 1997). Most importantly, the total exclusion of studies using hospitalization as a proxy for relapse led to a substantial underestimate of effect.

Lynch et al. (2009) review studies which have an active control but wrongly conclude that such controls ‘lacked any specific therapeutic effects’. Befriending, for example, appears to have positive effects on delusions including paranoia but not hallucinations (Samarasekera et al. 2007). Active controls allow for differentiation of effects from the non-specific but very important effects of developing a relationship with patients. Studies comparing with treatment as usual are summarily dismissed but these do have relevance in assessing generalizability in effectiveness studies.

Studies in this area have used different time scales and target symptoms and previous meta-analyses have made allowance for this. Lynch et al. (2009) do not, focusing only on end-of-treatment scoring. This has allowed them to claim that ‘perhaps the best study published to date … conducted by Sensky and colleagues’ (Beck et al. 2008) was one which failed. It did not indeed show a difference on most measures at the treatment end-point – apart from suicidality (Bateman et al. 2007) – but such effects were apparent at the 9-month follow-up and maintained at 5 years (Turkington et al. 2008). Lynch et al. (2009) justify excluding these beneficial effects because some other studies did not show this, despite others – not quoted – which did (Drury et al. 2000; Turkington et al. 2006).

It is also disingenuous to hold up double-blind placebo trials of medication as if they were infallible: the effects of bias and other factors are still clearly a major concern as has emerged with evaluation of the latest generation of antipsychotics (Tyrer & Kendall, 2009). Blindness is rarely assessed in these trials. Yet the side-effects of drugs such as haloperidol compared with olanzapine and quetiapine are markedly different such that it would be expected that patient and rater would frequently be aware of which drug was being provided. Generalizability is also a major concern – recruiting patients with delusional beliefs, especially paranoia, to any study is difficult but to medication trials especially so.

The approach of Lynch et al. (2009) to mental disorder using concepts and terms such as schizophrenia (van Os, 2009) is also being increasingly recognized as too blunt for psychosocial interventions. Successful targeted studies are emerging in early psychosis (Morrison et al. 2004) and where psychosis is associated with substance abuse (Haddock et al. 2003), command hallucinations (Trower et al. 2004), post-traumatic stress disorder (Mueser et al. 2008) and anger (Haddock et al. 2009). Such diversity makes meta-analysis that much more complex and none of these studies was included in this meta-analysis. Over-simplification and exclusion of non-conforming studies can readily demonstrate limited or absent effect. However, the acceptability of CBT to patients and carers as well as practitioners suggests that positive findings in the clinical studies undertaken so far are valid and generalizable in clinical practice. These are now being used and evaluated further in countries from China to Pakistan to the USA.




Letter to the Editor: An agenda for the next decade of psychotherapy research and practice



Lynch et al. (2010) provide a fascinating meta-analysis showing that when cognitive behavioural therapy (CBT) is compared with a psychotherapy or pill placebo control group, CBT is no more effective in reducing symptoms of schizophrenia or bipolar disorder than other approaches, and is only slightly better at improving depression. This finding stands in stark contrast to the large literature that shows that CBT is effective when compared with waiting-list controls (e.g. Roth & Fonagy, 2005). Rather, the finding is consistent with a now overwhelming body of evidence suggesting that all the main established psychotherapies are equivalently effective (e.g. Luborsky et al. 1975; Wampold et al. 1997; Ward et al. 2000). Whilst CBT works, it does not appear to work better than other approaches. We suggest that the research focus should now move from establishing the effectiveness of any one technique, towards studying what common mechanisms underlie all therapeutic contact. Similarly, we suggest that practitioners should now decide what therapy to practise on grounds other than simple efficacy.

Component isolation studies do not support the argument that CBT operates through the theoretically expected mechanisms. These studies involve two therapy groups that are identical in all ways, except that one group has had the theoretically ‘active’ component removed. A recent meta-analysis (Ahn & Wampold, 2001) showed that removing the theoretically ‘active’ component had no effect on the effectiveness of the therapies. Further, the data trended towards suggesting that the group without the theoretically active component was actually more effective (an effect in the opposite direction).

Component isolation studies coupled with evidence of equivalence between psychotherapies suggest that all therapies operate through the same common mechanisms. Counselling psychology has long suggested this is the case, proposing mechanisms such as the quality of the therapeutic relationship (Rogers, 1957), whilst others have focused on the process of engaging in psychotherapy (Wampold, 2007), or the cognitive changes associated with all therapeutic change (Higginson et al. in press). Lynch et al.’s (2010) study suggests that future research would be better advised to focus on empirically establishing the mechanisms by which all therapies work.

Unfortunately, with increasing evidence that CBT is not more effective than other therapies, the CBT movement appears to be in some crisis, and to be responding not with an increased focus on mechanisms, but rather by spawning endless ‘third-wave’ CBT approaches. These have not been shown to be superior to what is currently available, largely as they are not being compared with other existing therapies. Indeed, some third-wave approaches (see Mansell, 2008) appear to bear little resemblance to CBT as originally conceived. Perhaps in diversifying the practice of CBT in search of superior efficacy, we are witnessing the dissolution of the ‘brand image’, and the shift to an approach that cannot be distinguished from counselling psychology. We certainly hope that attitudes within the CBT community will become so inclusive, but in this case there needs to be a greater focus on common mechanisms, and less on increasingly arbitrary divides between the ‘in-group’ (CBT) and the ‘out-group’ (counselling psychologists).

With the argument for the overwhelming superiority of the effectiveness of CBT being empirically disproven, practitioners are now left with the question of which therapy to choose, to which there are at least four possible answers. First, a defeatist approach would be that the decision is irrelevant, as all therapy is equal. Second, a more pragmatic approach would be to focus therapeutic activity around the currently best-supported mechanisms. This would be aided greatly by more empirical work into what these mechanisms are. Third, therapists could focus on outcomes other than simple effectiveness. An attractive approach would be to choose the cheapest therapy (Bower et al. 2000). Alternatively, other outcomes could be studied such as authenticity (Wood et al. 2008); there is some suggestion that whilst therapies are equivalent on the outcome of the presenting problem, CBT may outpace humanistic therapy on cognitive exploration, but underperform on affective exploration and insight (Shechtman & Pastor, 2005) – thus superiority, equivalence, or inferiority may be outcome specific (Joseph & Wood, in press).

Fourth, the decision could be based on moral grounds. All therapeutic contact is based on the therapist’s assumptions and beliefs (Wood & Joseph, 2007). For example, there is a basic philosophical distinction to be made between directive and non-directive therapies. An unintended consequence of directive psychotherapies may be to disempower the client, and direct them to a life course that is wrong for them (Joseph & Linley, 2006).

Debates over the role of therapy are not new, but recent years have seen the debate become overshadowed by arguments for effectiveness when in fact this is in part a political and moral decision (Proctor, 2005). We welcome Lynch et al.’s (2010) landmark study as it provides a compelling argument that the psychological science of therapeutic practice needs to chart a new course, one that can now focus on finding common mechanisms, and seriously debate the considerations underpinning the choice of therapy. Such work is especially urgent if new attempts to improve access to psychotherapy services (Layard, 2006; Boyce & Wood, 2009) end up supporting one therapy over another, on the false assumption that it is superior, to the disenfranchisement of other effective therapies.





[1]     These manuals are available from Neil S. Jacobson.

[2]      Results were virtually unchanged when alternative criteria recommended by Frank et al. (1991) were adopted: HRSD scores less than 7 or at least 8 weeks of not meeting criteria for major depression.