A meta-analysis showed that control group outcomes in psilocybin trials for depression were significantly weaker than those in selective serotonin reuptake inhibitor (SSRI) and esketamine trials, suggesting that psilocybin’s large observed treatment effects may be inflated by methodological factors such as functional unblinding and expectancy bias.
The burgeoning interest in psychedelic-assisted therapies for depression has propelled psilocybin to the forefront of psychiatric research and policy debate. Although granted FDA breakthrough therapy designation for treatment-resistant depression (TRD) in 2018 and major depressive disorder (MDD) in 2019, psilocybin has faced scrutiny for the methodological challenges it presents in clinical trials, particularly regarding blinding and patient expectancy.1
In a recent meta-analysis published in JAMA Network Open, Hieronymus et al explored these concerns by examining whether control group outcomes in psilocybin trials differ meaningfully from those in trials of 2 more established treatments: selective serotonin reuptake inhibitors (SSRIs) and esketamine (Spravato; Janssen Pharmaceutical).2 The findings have broad implications for clinicians, payers, and regulators seeking to interpret antidepressant efficacy claims in a postpandemic mental health landscape that has become increasingly receptive to psychedelic interventions.
Psilocybin has demonstrated acute antidepressant efficacy in several clinical trials, with standardized effect sizes often exceeding those of conventional therapies. However, concerns have been raised about the degree to which these results may be influenced by “functional unblinding,” where the psychoactive effects of psilocybin make it obvious to participants and clinicians who is receiving the active treatment. This can compromise the integrity of control conditions and inflate treatment effects via expectancy bias or nocebo responses in control participants. To probe this issue, Hieronymus et al compared standardized mean changes (SMCs) and standardized mean differences (SMDs) across 17 randomized controlled trials (RCTs) using the Montgomery-Åsberg Depression Rating Scale (MADRS) as the primary outcome measure.
Hieronymus et al synthesized data from 4 psilocybin trials (n=373), 2 esketamine trials (n=573), and 11 SSRI trials (n=4014). Eligible trials were selected based on predefined criteria including adult patient populations aged 18 to 65 years, use of the MADRS scale, double-blind design, and trial durations of 2 weeks or longer. Control treatments included inert placebo, low-dose psilocybin, or niacin in psilocybin trials; oral antidepressant plus intranasal placebo in esketamine trials; and placebo pills in SSRI trials.
Data extraction focused on within-group changes (baseline to end point) and between-group differences. Effect sizes were calculated using raw score standardization. To test whether the type of study population (psilocybin, SSRI, or esketamine) moderated outcomes, the authors used the Omnibus Test of Moderators (QM) and calculated the proportion of variance explained (R²).
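As a rough illustration of what raw score standardization means, the sketch below computes a standardized mean change as the baseline-to-end-point improvement divided by the baseline standard deviation. This is one common reading of the term; the function name and the MADRS values are hypothetical and are not taken from any of the included trials, and the authors' exact estimator (for example, any small-sample correction) is not described here.

```python
def standardized_mean_change(baseline_mean, endpoint_mean, baseline_sd):
    """Standardized mean change (SMC) using raw score standardization.

    Assumption: the change is scaled by the baseline (pretreatment) SD.
    """
    if baseline_sd <= 0:
        raise ValueError("baseline_sd must be positive")
    # MADRS scores fall as depression improves, so baseline minus
    # end point is positive when the group improved.
    return (baseline_mean - endpoint_mean) / baseline_sd


# Hypothetical control arm: MADRS 32 at baseline, 27 at end point, SD 5.
smc = standardized_mean_change(32.0, 27.0, 5.0)
print(f"SMC = {smc:.2f}")
```

On this scale, an SMC of 1.00 means the group improved by 1 baseline standard deviation, which is why a control group SMC near 0.50 stands out against values near 1.00.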
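The most striking finding was that control group outcomes in psilocybin trials were significantly worse than in SSRI and esketamine trials: the pooled control group SMC was 0.50 in psilocybin trials, compared with 1.00 in SSRI trials and 1.12 in esketamine trials.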
Importantly, study population type was a significant moderator for both between-group effect sizes (QM = 10.7; P = .005) and control treatment SMCs (QM = 10.4; P = .005), but not for active treatment SMCs (QM = 1.21; P = .55). In other words, the differences in trial results were driven by how well control participants fared, and not by differences in how well the active treatments performed.
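The reported P values can be roughly cross-checked under one assumption that the article does not state: that QM follows a chi-square distribution with 2 degrees of freedom (3 treatment classes minus 1). With 2 degrees of freedom, the upper-tail probability has the closed form exp(-QM/2), so the sketch below needs only the standard library. The function name is hypothetical.

```python
import math


def qm_p_value_df2(qm):
    """Upper-tail P value for a chi-square statistic with 2 degrees of freedom.

    Assumption: df = 2 (3 treatment classes minus 1); not stated in the article.
    """
    if qm < 0:
        raise ValueError("qm must be non-negative")
    # For df = 2 the chi-square survival function is exactly exp(-x / 2).
    return math.exp(-qm / 2.0)


# QM values as reported in the article.
reported_qm = {
    "between-group effect sizes": 10.7,
    "control treatment SMCs": 10.4,
    "active treatment SMCs": 1.21,
}

for label, qm in reported_qm.items():
    print(f"{label}: QM = {qm}, P ~ {qm_p_value_df2(qm):.3f}")
```

Under that assumption the results land close to the reported values (approximately .005, .006, and .55); small gaps are expected because the published QM statistics are themselves rounded.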
Response rates (defined as a ≥50% reduction in MADRS score from baseline) followed the same pattern: only 19% of control participants in psilocybin trials responded, compared with 33% in SSRI trials and 42% in esketamine trials. Active treatment response rates, by contrast, were more consistent across drug classes: 48% for psilocybin, 46% for SSRIs, and 52% for esketamine.
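A back-of-the-envelope calculation makes the pattern concrete. The sketch below takes the response rates quoted above and computes the active-minus-control gap for each drug class, along with the spread of rates within the active and control arms. These gaps are simple arithmetic on the pooled percentages, not results reported by Hieronymus et al, and they ignore trial weighting and uncertainty; the variable and function names are hypothetical.

```python
# Response rates in percent, as quoted in the article: (active, control).
response_rates = {
    "psilocybin": (48, 19),
    "SSRI": (46, 33),
    "esketamine": (52, 42),
}


def response_gap(active_pct, control_pct):
    """Active-minus-control response rate, in percentage points."""
    return active_pct - control_pct


gaps = {name: response_gap(a, c) for name, (a, c) in response_rates.items()}

# Range (max minus min) of response rates across the 3 drug classes.
active_rates = [a for a, _ in response_rates.values()]
control_rates = [c for _, c in response_rates.values()]
active_spread = max(active_rates) - min(active_rates)
control_spread = max(control_rates) - min(control_rates)

for name, gap in gaps.items():
    print(f"{name}: active-control gap = {gap} percentage points")
print(f"spread across active arms: {active_spread} points")
print(f"spread across control arms: {control_spread} points")
```

The active arms differ by only 6 percentage points, whereas the control arms differ by 23, so most of psilocybin's larger active-control gap traces back to its control arm.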
Dropout rates also varied substantially. SSRI trials had the highest attrition, with 32% and 35% dropout in active and control arms, respectively. Psilocybin (5% active; 11% control) and esketamine (12% active; 8% control) trials reported much lower dropout rates, likely reflecting their shorter duration and more intensive participant engagement.
These findings suggest that the apparent superiority of psilocybin in clinical trials may be due in part to underperformance in control conditions. This could reflect either methodological bias (eg, inadequate blinding) or sampling bias (eg, enrolling patients with unusually low responsiveness to control treatments). The authors note that functional unblinding is more difficult to avoid with psychedelic interventions in which psychoactive effects are both immediate and conspicuous.
Importantly, while esketamine shares some administration similarities with psilocybin—including being given under supervision and causing acute perceptual effects—it did not exhibit the same level of control group underperformance. This implies that psilocybin trials may be uniquely vulnerable to expectancy-related biases.
The primary limitation of the study is its inability to determine the causes behind the observed differences in control group outcomes across the 3 drug classes (psilocybin, esketamine, and SSRIs). While the meta-analysis clearly demonstrated that participants in psilocybin trials experienced significantly less improvement in depression scores in the control arms compared with those in SSRI and esketamine trials, the reasons for this disparity remain unresolved. Contributing to this uncertainty is the limited and heterogeneous nature of the available psilocybin literature: only 4 psilocybin trials (n=373) were available for analysis, compared with 11 SSRI trials (n=4014), and the esketamine comparison likewise rests on just 2 trials (n=573).
Although the comparator groups (SSRIs and esketamine) are not exhaustive of all antidepressant options, their control group outcomes (SMC of 1.00 for SSRIs and 1.12 for esketamine) were consistent with previous meta-analytic findings. One notable methodological consideration was the inclusion of study findings from Carhart-Harris et al, which compared 25 mg of psilocybin with a combined regimen of 1 mg psilocybin plus the SSRI escitalopram (Lexapro; Forest Laboratories, Inc), rather than an inert placebo.2,3 Hieronymus et al noted that this study was retained in the analysis for 2 reasons: first, because escitalopram is expected to be at least as effective as a placebo; and second, because the control group in this trial showed the largest pre- to posttreatment improvement among the 4 psilocybin studies, so excluding it would have resulted in an even lower pooled control group effect size for psilocybin.2 Sensitivity analyses that omitted this trial did not materially change the overall results, reinforcing the robustness of the findings.
The meta-analysis also excluded studies involving older adults or those with acute suicidality, which limits generalizability. Notably, a post hoc analysis of 3 esketamine trials in patients with suicidal ideation showed an unusually high control group response (SMC = 1.87), supporting the idea that baseline severity and acuity may modulate placebo responsiveness.
As enthusiasm grows for psychedelic-assisted treatments in mental health care, the findings of this meta-analysis underscore the importance of critically assessing the methodological context in which psilocybin's antidepressant efficacy has been evaluated. While psilocybin continues to show promise in reducing depressive symptoms, Hieronymus et al explain that its large observed effect sizes may be partially driven by unusually poor outcomes in control groups rather than superior performance of the active intervention itself. Specifically, control participants in psilocybin trials demonstrated significantly lower response rates (19%) and smaller symptom improvements (SMC = 0.50) compared with those in SSRI (SMC = 1.00) and esketamine (SMC = 1.12) trials, suggesting a unique vulnerability to expectancy bias and functional unblinding.
These findings raise caution for stakeholders considering broad clinical or formulary adoption of psilocybin. For regulators and payers, they highlight the need for more rigorous trial designs—particularly those that minimize unblinding and address patient expectations—to ensure that efficacy signals are not overstated. Additionally, the limited number and heterogeneity of psilocybin trials to date underscore the need for further research using diverse and representative patient populations. Until such evidence emerges, claims about psilocybin’s broad effectiveness in depression should be interpreted with careful attention to trial context and control group performance.
REFERENCES
1. Heal DJ, Smith SL, Belouin SJ, Henningfield JE. Psychedelics: threshold of a therapeutic revolution. Neuropharmacology. 2023;236:109610. doi:10.1016/j.neuropharm.2023.109610
2. Hieronymus F, López E, Werin Sjögren H, Lundberg J. Control group outcomes in trials of psilocybin, SSRIs, or esketamine for depression: a meta-analysis. JAMA Netw Open. 2025;8(7):e2524119. doi:10.1001/jamanetworkopen.2025.24119
3. Carhart-Harris R, Giribaldi B, Watts R, et al. Trial of psilocybin versus escitalopram for depression. N Engl J Med. 2021;384(15):1402-1411. doi:10.1056/NEJMoa2032994