Abstract
Objectives
To perform a systematic review of diagnostic test accuracy studies that manipulate or investigate the context of interpretation, in particular those that modify or conceal sample characteristics (e.g. disease prevalence or reporting intensity) or research setting (“laboratory” versus “field”). We also investigated recall bias.
Methods
We searched the biomedical literature to March 2010 using 3 complementary strategies. We included imaging studies quantifying the effect on diagnosis of modifying the context of observers’ interpretations, varying disease prevalence, concealing sample characteristics, changing reporting intensity, or recall bias.
Results
We reviewed 11,247 abstracts, examined 201 full texts and ultimately included 12 studies. There were 5 to 9520 patients and 2 to 129 observers per study. Nine studies investigated clinical review bias of sample level information. Only 3 studies investigated prevalence, 2 of which investigated maximum enrichment well below the levels often used by researchers. We identified no research specifically directed at concealing disease prevalence. Available research found no evidence of an effect of recall bias or “washout” on study results.
Conclusions
Several sources of bias central to the design of diagnostic test accuracy studies are poorly researched; the implications for evidence-based practice remain uncertain. We suggest further research to guide methodological design, particularly in the context of screening.
Key Points
- Imaging research studies often ignore the possible effect of disease prevalence
- It is unclear how the expectation of disease influences radiological interpretation
- The potential effect of observer recall bias is poorly researched
- Such factors might introduce bias into radiological research methodology
- This systematic review attempts to illustrate these points
Acknowledgements
This article represents independent research commissioned by the National Institute for Health Research (NIHR) under its Programme Grants for Applied Research funding scheme (RP-PG-0407-10338). This work was undertaken at University College London Hospital (UCLH) and University College London (UCL), which receive a proportion of funding from the NIHR Comprehensive Biomedical Research Centre funding scheme. The views expressed in this publication are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.
Cite this article
Boone, D., Halligan, S., Mallett, S. et al. Systematic review: Bias in imaging studies - the effect of manipulating clinical context, recall bias and reporting intensity. Eur Radiol 22, 495–505 (2012). https://doi.org/10.1007/s00330-011-2294-0
DOI: https://doi.org/10.1007/s00330-011-2294-0