This review systematically examines the research literature published in the period 2002-8 on structured violence risk assessment instruments designed for use in mental health services or the criminal justice system. It adopted much broader inclusion criteria than previous reviews in the same area in order to capture and summarise data on the widest possible range of available instruments.
To address two questions: (1) what study characteristics are associated with a risk assessment instrument score being significantly associated with a violent outcome? and (2) which risk assessment instruments have the highest level of predictive validity for a violent outcome?
Nineteen bibliographic databases were searched from January 2002 to April 2008, including PsycINFO, MEDLINE, Cumulative Index to Nursing and Allied Health Literature, Allied and Complementary Medicine Database, British Nursing Index, International Bibliography of the Social Sciences, Education Resources Information Centre, The Cochrane Library and Web of Knowledge.
Inclusion criteria for studies were (1) evaluation of a structured risk tool; (2) outcome measure of interpersonal violence; (3) participants aged 17 years or over; and (4) participants with a mental disorder and/or at least one offence and/or at least one indictable offence. A series of bivariate analyses, using either a chi-squared test or Spearman's rank-order correlation, was conducted to explore associations between study characteristics and outcomes. Data from a subset of studies reporting area under the curve (AUC) analysis were combined to provide estimates of mean validity.
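As a minimal illustration of the AUC statistic referred to above, the sketch below computes an AUC for a single study as the proportion of (violent, non-violent) participant pairs in which the violent participant has the higher instrument score, with ties counted as half, and then takes a simple mean across studies. The function name `auc`, the scores, the outcomes and the per-study AUC values are all hypothetical and are not taken from the review; the code is not the review's own analysis procedure.

```python
from statistics import mean


def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison.

    scores   -- instrument scores, one per participant
    outcomes -- 1 if the participant had a violent outcome, else 0
    Returns the proportion of (violent, non-violent) pairs in which the
    violent participant scored higher; tied pairs count as 0.5.
    """
    violent = [s for s, y in zip(scores, outcomes) if y == 1]
    non_violent = [s for s, y in zip(scores, outcomes) if y == 0]
    if not violent or not non_violent:
        raise ValueError("need at least one participant in each outcome group")
    wins = sum(
        1.0 if v > n else 0.5 if v == n else 0.0
        for v in violent
        for n in non_violent
    )
    return wins / (len(violent) * len(non_violent))


# Hypothetical single study: six participants, three with a violent outcome.
scores = [12, 25, 7, 30, 18, 15]
outcomes = [0, 1, 0, 1, 0, 1]
study_auc = auc(scores, outcomes)  # 8 of 9 pairs are ordered correctly

# Hypothetical AUCs from two studies of one instrument, combined as a
# simple (unweighted) mean.
mean_auc = mean([0.70, 0.75])
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of violent from non-violent participants.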
For the overall set of included studies (n = 959), over three-quarters (77%) were conducted in the USA, Canada or the UK. Two-thirds of all studies were conducted either with offenders who had no formal mental health diagnosis (43%) or with forensic samples with a formal diagnosis (25%). The Psychopathy Checklist-Revised was tested in the largest number of studies (n = 192). Most studies (78%) reported a statistically significant (p < 0.05) relationship between the instrument score and a violent outcome. Prospective data collection (chi-squared = 4.4, p = 0.035), number of people recruited (U = 27.8, p = 0.012) and number of participants at end point (U = 26.9, p = 0.04) were significantly associated with predictive validity. For those instruments tested in five or more studies reporting AUC values, the General Statistical Information on Recidivism instrument had the highest mean AUC (0.73).
Agreement between pairs of reviewers in the initial pilot exercises was good but less than perfect, so discrepancies may be present given the complexity and subjectivity of some aspects of violence research. Only five of the seven calendar years (2003-7) are completely covered, with partial coverage of 2002 and 2008. There is no weighting for sample or effect sizes when results from studies are aggregated.
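The practical effect of the absence of weighting can be sketched with a small, entirely hypothetical example: when a small study reports a high AUC and a large study a lower one, an unweighted mean treats them equally, whereas a sample-size-weighted mean is pulled towards the larger study. The function names and all figures below are illustrative assumptions, not data from the review, and sample-size weighting is only one of several possible weighting schemes.

```python
def unweighted_mean(aucs):
    """Simple mean: every study counts equally."""
    return sum(aucs) / len(aucs)


def sample_size_weighted_mean(aucs, sample_sizes):
    """Mean in which each study's AUC is weighted by its sample size."""
    if len(aucs) != len(sample_sizes):
        raise ValueError("aucs and sample_sizes must have the same length")
    total_n = sum(sample_sizes)
    return sum(a * n for a, n in zip(aucs, sample_sizes)) / total_n


# Hypothetical studies of one instrument: a small study with a high AUC
# and a large study with a lower AUC.
aucs = [0.80, 0.60]
sample_sizes = [50, 450]

simple = unweighted_mean(aucs)                            # about 0.70
weighted = sample_size_weighted_mean(aucs, sample_sizes)  # about 0.62
```

In this example the two summaries differ by about 0.08, which shows how an unweighted aggregate can be shaped disproportionately by small studies.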
A very large number of studies examining the relationship between a structured instrument and a violent outcome were published in this relatively short 7-year period. The general quality of the literature is weak in places (e.g. over-reliance on cross-sectional designs) and a vast range of distinct instruments have been tested to varying degrees. However, there is evidence of some convergence around a small number of high-performing instruments and identification of the components of a high-quality evaluation approach, including AUC analysis. The upper limits (AUC 0.85) of instrument-based prediction have probably been achieved and are unlikely to be exceeded using instruments alone.
The National Institute for Health Research Health Technology Assessment and Research for Patient Benefit programmes.