Discrepancies in autologous bone marrow stem cell trials and enhancement of ejection fraction (DAMASCENE): weighted regression and meta-analysis
Poor reporting can cause serious harm
- “Discrepancies in autologous bone marrow stem cell trials and enhancement of ejection fraction (DAMASCENE): weighted regression and meta-analysis” by A N Nowbar and colleagues (BMJ 2014;348:g2688, doi:10.1136/bmj.g2688).
Objective To investigate whether discrepancies in trials of use of bone marrow stem cells in patients with heart disease account for the variation in reported effect size in improvement of left ventricular function.
Design Identification and counting of factual discrepancies in trial reports, and sample size weighted regression against therapeutic effect size. Meta-analysis of trials that provided sufficient information.
Data sources PubMed and Embase from inception to April 2013.
Eligibility criteria for selecting studies Randomised controlled trials evaluating the effect of autologous bone marrow stem cells for heart disease on mean left ventricular ejection fraction.
Results There were over 600 discrepancies in 133 reports from 49 trials. There was a significant association between the number of discrepancies and the reported increment in ejection fraction with bone marrow stem cell therapy (Spearman’s r=0.4, P=0.005). Trials with no discrepancies were a small minority (five trials) and showed a mean ejection fraction effect size of −0.4%. The 24 trials with 1-10 discrepancies showed a mean effect size of 2.1%. The 12 with 11-20 discrepancies showed a mean effect size of 3.0%. The three with 21-30 discrepancies showed a mean effect size of 5.7%. The high discrepancy group, comprising five trials with over 30 discrepancies each, showed a mean effect size of 7.7%.
Conclusions Avoiding discrepancies is difficult but is important because discrepancy count is related to effect size. The mechanism is unknown but should be explored in the design of future trials because in the five trials without discrepancies the effect of bone marrow stem cell therapy on ejection fraction is zero.
Why do the study?
This study is all about the reliability of the published evidence that underpins many therapeutic decisions. Randomised trials are the gold standard evaluation of any treatment, and it’s important that published reports can be trusted. This study explores unreliable reporting of trials and how that can affect overall findings. The authors look specifically at trials of a new and invasive treatment for heart disease: the infusion of stem cells harvested from a patient’s own bone marrow. Poor reporting of trials can be a serious problem if it distorts results and gives the impression that a treatment works better than it really does (inflates the effect size), or causes less harm. The authors chose to study stem cell treatment for heart disease because researchers have been unable to explain conflicting results from published trials. Poor reporting is one possible contributor to the inconsistency.
What did the authors do?
These authors searched systematically for randomised controlled trials testing autologous bone marrow stem cells to help improve ventricular function in adults with heart disease. They used a comprehensive search strategy that included research databases, research registers, and hand searching reference lists.
Eight researchers went through each report looking for discrepancies in the design, methods, or results. They defined discrepancy as “two (or more) reported facts that cannot both be true because they are logically or mathematically incompatible.”
They also went through each report and extracted the effect size of the treatment, which indicates the extra improvement in ejection fraction in treated patients compared with controls. Ejection fraction is a measure of left ventricular function. It’s expressed as a percentage. The higher the better.
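To make the arithmetic concrete (with made-up numbers, not figures from any of the trials): if ejection fraction improved by five percentage points in the stem cell group but by only two percentage points in the control group over the same period, the effect size would be 5 − 2 = 3 percentage points.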
The authors then looked for a statistical correlation between their exposure (discrepancies in reporting) and outcome (the size of the treatment effect), and did other analyses adjusting for confounding factors such as sample size. Adjustments help to isolate the exposure we are interested in from the obscuring effects of other trial features that are also associated with effect size. Smaller trials, for example, tend to report bigger effect sizes.
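As a rough illustration of that kind of analysis, here is a minimal sketch in Python, using invented trial values rather than the authors’ dataset; the Spearman correlation and the sample size weighted straight-line fit below are simplified stand-ins for the methods described in the paper.

```python
# Minimal sketch with invented data; not the DAMASCENE dataset or code.
import numpy as np
from scipy import stats

# One entry per trial: discrepancy count, reported effect size
# (percentage points of ejection fraction), and sample size.
discrepancies = np.array([0, 2, 5, 8, 12, 15, 19, 24, 33, 41])
effect_size = np.array([-0.5, 1.8, 1.0, 2.5, 3.2, 2.9, 3.5, 5.5, 7.0, 8.2])
sample_size = np.array([120, 60, 80, 40, 30, 55, 25, 20, 18, 15])

# Rank based (Spearman) correlation between discrepancies and effect size.
rho, p = stats.spearmanr(discrepancies, effect_size)
print(f"Spearman's r = {rho:.2f}, P = {p:.4f}")

# Sample size weighted straight-line fit: np.polyfit applies weights to the
# unsquared residuals, so sqrt(n) weights each trial by its sample size.
slope, intercept = np.polyfit(discrepancies, effect_size, 1, w=np.sqrt(sample_size))
print(f"Weighted fit: effect size = {intercept:.2f} + {slope:.2f} x discrepancies")
```

Weighting by sample size reflects the point made above: small trials, which tend to report bigger effects, are given less influence over the fitted line.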
Finally, the authors did pooled analyses (meta-analyses) of groups of trials with no discrepancies, 1-10, 11-20, 21-30, and more than 30 discrepancies to see if trials with more discrepancies reported a bigger effect than trials with fewer discrepancies.
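The grouping step can be sketched in the same spirit, again with invented numbers and a plain average standing in for the formal, precision weighted meta-analysis the authors performed:

```python
# Illustrative grouping into the paper's discrepancy bands; invented values,
# and simple averaging rather than a proper meta-analysis.
from collections import defaultdict

# (discrepancy count, effect size in percentage points) for each trial.
trials = [(0, -0.3), (0, -0.5), (4, 1.9), (9, 2.4), (14, 3.1),
          (18, 2.8), (25, 5.6), (36, 7.5), (52, 8.0)]

def band(count):
    """Return the discrepancy band a trial falls into."""
    if count == 0:
        return "0"
    if count <= 10:
        return "1-10"
    if count <= 20:
        return "11-20"
    if count <= 30:
        return "21-30"
    return ">30"

grouped = defaultdict(list)
for count, effect in trials:
    grouped[band(count)].append(effect)

for name in ["0", "1-10", "11-20", "21-30", ">30"]:
    effects = grouped.get(name, [])
    if effects:
        print(f"{name:>5} discrepancies: mean effect {sum(effects) / len(effects):.1f}%")
```

If trials with more discrepancies really do report bigger effects, the mean effect size should rise from one band to the next, which is the pattern the authors found.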
What did the study find?
The authors found 49 different trials of stem cell therapy in 133 published reports. They identified 604 discrepancies in these few dozen trials, including numbers that didn’t add up, contradictory statements about randomisation, figures that did not match the data, and women present in early reports who seemed to have become men by later reports of the same trial. Individual trials contained between 0 and 89 discrepancies each.
Initial analyses showed a clear and statistically significant correlation between the number of discrepancies in trials and the size of the reported effect (Spearman’s r=0.4, P=0.005). Figure 1 shows the correlation clearly. Figure 2 shows how effect size grew as the mean number of discrepancies increased.
The five trials with no discrepancies reported no extra improvement in ejection fraction for adults given stem cells (mean extra improvement −0.4%). At the other end of the scale, the five trials with more than 30 discrepancies each reported a mean extra improvement of 7.7% for adults given stem cells. Poor reporting seems to make the treatment look good, although you should never infer causality from observational studies. All we can say with confidence is that poor reporting is associated with significantly bigger effect sizes.
Adjusted analyses and the formal meta-analysis of a subset of trials confirmed the link between poor reporting and bigger effect sizes. The association was independent of sample size and of trial quality as assessed by the Cochrane risk of bias tool.
Perhaps the most striking finding is that only five trials out of 49 were reported without inconsistencies or errors.
What are the study’s strengths and weaknesses?
This is a big study that used a good search strategy to identify trials in a systematic way. Eight authors went through each one and had to agree on the discrepancies they identified. This gives us confidence in the validity of the exposure. They found a new and important association, with implications beyond this particular specialist treatment.
All studies have limitations. These authors were able to look only at discrepancies in published reports of trials. They weren’t able to look at discrepancies between published reports and protocols of the same trial, or between published reports and records of trials in trial registers. These “prepublication” sources are an important resource because they record a trial as it was conceived, before the first patient was recruited. By comparing prepublication records of trials with their full published reports, we can see if the trial changed in any way during conduct or analysis. This can help identify all sorts of biases, including selective reporting of results that are statistically significant, mutating outcomes, extra unplanned analyses, or even non-publication of trials.
The authors found an association between discrepancies and effect sizes in reports of trials. They think it’s the discrepancies driving this association: poorly reported trials exaggerate the benefits of treatment with stem cells. But the opposite is also possible, since trials with big effect sizes might be published more often (in multiple reports), affording a greater opportunity for discrepancies. The number of published reports of a trial is an important confounding factor that the authors weren’t able to control in their analyses.
They also had to exclude five trials published in Chinese because extra discrepancies might have crept in during translation.
Finally, the authors had to make up their own way of defining and counting discrepancies. Others might have done it differently, and perhaps obtained different results. The next step is to develop a valid, accepted, and reproducible way of measuring discrepancies in reports of trials so researchers can repeat this study in a comparable way.
What do the findings mean?
First and foremost, these findings suggest that the published evidence on stem cell treatment for adults with heart disease is littered with poorly explained inconsistencies or discrepancies, and that these inconsistencies are associated with exaggerated effectiveness. Meticulously reported trials (a small minority) reported no effect at all from this invasive treatment. If published reports are error prone and biased, then patients suffer. Doctors, misled by the published evidence, prescribe unproved treatments and expose patients to unknown harms. No treatment is risk free.
Unreliable reporting has implications for health policy and for health budgets. All treatments come at a cost. Whoever pays the bills needs to know how best to spend their limited budgets. Unreliable evidence risks diverting money into suboptimal treatments when it would be better spent elsewhere. Again, patients suffer, along with all taxpayers in publicly funded systems such as the NHS.
There are also implications for how we organise and appraise research in the future. If published reports can be unreliable, perhaps we should be looking for more reliable evidence in raw datasets, or prepublication sources including commercially sensitive data held by drug companies. Clinical guidelines, health policies, and, by extension, therapeutic decisions are currently made on the basis of published research, often in the form of meta-analyses (pooled results from all existing trials). There’s a growing recognition that this approach gives us only half the story. We should try harder to complete the picture, with all unpublished data and reports, before accepting that a new treatment can work better than an existing alternative.
The main message for medical students and new doctors is perhaps to maintain a healthy scepticism about published reports of trials, particularly trials of new and high tech treatments. Clear, careful, and above all transparent reporting of trials really matters. Poor reporting can cause serious harm.

Alison Tonks, associate editor, BMJ
Correspondence to: email@example.com
Competing interests: As an associate editor, AT helps select research papers for publication in The BMJ.
Provenance and peer review: Commissioned; not externally peer reviewed.
- Freemantle N, Rait G. Trials of autologous bone marrow stem cells for heart disease. BMJ 2014;348:g2750.
- Lehman R, Loder E. Missing clinical trial data. BMJ 2012;344:d8158.
- Loder E, Tovey D, Godlee F. The Tamiflu trials. BMJ 2014;348:g2630.
- Loder E, Godlee F, Barbour V, Winker M, on behalf of the PLOS Medicine editors. Restoring the integrity of the clinical trial evidence base. BMJ 2013;346:f3601.
Cite this as: Student BMJ 2014;22:g3288