A methodological review of how heterogeneity has been examined in systematic reviews of diagnostic test accuracy

Authors: Dinnes J, Deeks J, Kirby J, Roderick P

Journal: Health Technology Assessment Volume: 9 Issue: 12

Publication date: March 2005

DOI: http://dx.doi.org/10.3310/hta9120


Dinnes J, Deeks J, Kirby J, Roderick P. A methodological review of how heterogeneity has been examined in systematic reviews of diagnostic test accuracy. Health Technol Assess 2005;9(12).








Objectives

To review how heterogeneity has been examined in systematic reviews of diagnostic test accuracy studies.

Data sources

Centre for Reviews and Dissemination's Database of Abstracts of Reviews of Effects (DARE).

Review methods

Systematic reviews that evaluated a diagnostic or screening test by including studies that compared a test with a reference test were identified from DARE. Reviews for which structured abstracts had been written up to December 2002 were screened for inclusion. Data extraction was undertaken using standardised data extraction forms.


Results

A total of 189 systematic reviews met the inclusion criteria. The median number of included studies was 18; meta-analyses included more studies (median 22) than narrative reviews (median 11). Graphical plots showing the spread of study results were provided in 56% of meta-analyses; in 79% of these the plots showed sensitivity and specificity in the receiver operating characteristic (ROC) space. Statistical tests to identify heterogeneity were used in 32% of reviews: 41% of meta-analyses and 9% of reviews using narrative synthesis. The chi-squared test and Fisher's exact test, applied to individual aspects of test performance, were the most common. In contrast, only 16% of meta-analyses used correlation coefficients to test for a threshold effect.

A narrative synthesis was used in 30% of reviews. Of the meta-analyses, 52% carried out statistical pooling alone, 18% conducted only summary receiver operating characteristic (SROC) analyses and 30% used both methods of statistical synthesis. For those undertaking SROC analyses, the main differences between the models were the weights chosen for the regression, although in 42% of cases the use of, or choice of, weights was not reported. The proportion of reviews using statistical pooling alone declined from 67% in 1995 to 42% in 2001, with a corresponding increase in the use of SROC methods from 33% to 58%. However, two-thirds of those using SROC methods also carried out statistical pooling rather than presenting only SROC models. Reviews using SROC analyses also tended to present their results as some combination of sensitivity and specificity rather than using alternative, perhaps less clinically meaningful, summaries such as diagnostic odds ratios.

Three-quarters of meta-analyses attempted to investigate possible sources of variation statistically, using subgroup or regression analysis. The impact of clinical or socio-demographic variables was investigated in 74% of these reviews and test- or threshold-related variables in 79%. At least one quality-related variable was investigated in 63% of reviews; the most commonly considered were the use of blinding, sample size, the reference test used and the avoidance of verification bias.
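To make the indices and models discussed above concrete, the sketch below computes per-study sensitivity, specificity and the diagnostic odds ratio (DOR) from 2x2 counts, and fits an unweighted Moses–Littenberg SROC regression (D = a + bS, where D and S are the difference and sum of the logit true- and false-positive rates). The 2x2 counts are hypothetical, purely for illustration; they are not data from the review, and the unweighted fit is only one of the weighting choices the review found reported inconsistently.

```python
import math

# Hypothetical 2x2 counts (TP, FP, FN, TN) from three illustrative
# primary studies; values are invented, not taken from the review.
studies = [(90, 10, 20, 80), (45, 5, 15, 85), (70, 30, 10, 70)]

def accuracy_indices(tp, fp, fn, tn):
    """Per-study sensitivity, specificity and diagnostic odds ratio."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    dor = (tp * tn) / (fp * fn)  # cross-product odds ratio
    return sens, spec, dor

def moses_littenberg_sroc(studies):
    """Unweighted Moses-Littenberg regression D = a + b*S, where
    D = logit(TPR) - logit(FPR) (the log-DOR) and
    S = logit(TPR) + logit(FPR) (a proxy for the threshold)."""
    logit = lambda p: math.log(p / (1 - p))
    D, S = [], []
    for tp, fp, fn, tn in studies:
        tpr = tp / (tp + fn)
        fpr = fp / (fp + tn)
        D.append(logit(tpr) - logit(fpr))
        S.append(logit(tpr) + logit(fpr))
    n = len(studies)
    s_bar, d_bar = sum(S) / n, sum(D) / n
    b = (sum((s - s_bar) * (d - d_bar) for s, d in zip(S, D))
         / sum((s - s_bar) ** 2 for s in S))
    a = d_bar - b * s_bar
    return a, b  # intercept a is the expected log-DOR at S = 0

for st in studies:
    print("sens=%.2f spec=%.2f DOR=%.1f" % accuracy_indices(*st))
print("SROC intercept=%.2f slope=%.2f" % moses_littenberg_sroc(studies))
```

A slope b near zero indicates the DOR is roughly constant across thresholds (a symmetric SROC curve); a non-zero slope is one sign of the threshold effect that, per the results above, only a minority of meta-analyses tested for.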


Conclusions

The emphasis on pooling individual aspects of diagnostic test performance, and the under-use of statistical tests and graphical approaches for identifying heterogeneity, perhaps reflect uncertainty about the most appropriate methods to use, as well as greater familiarity with more traditional indices of test accuracy. This underlines the difficulty and complexity of carrying out such reviews, and it is strongly suggested that meta-analyses be carried out with the involvement of a statistician familiar with the field. Further methodological work is needed on the statistical methods available for combining diagnostic test accuracy studies, as are sufficiently large, prospectively designed primary studies comparing two or more tests for the same target disorder. The use of individual patient data meta-analysis in diagnostic test accuracy reviews should be explored to allow heterogeneity to be considered in more detail.
