How reliable are assessments of clinical teaching?

A review of the published instruments

  • Review
  • Journal of General Internal Medicine

Abstract

BACKGROUND: Learner feedback is the primary method for evaluating clinical faculty, despite few existing standards for measuring learner assessments.

OBJECTIVE: To review the published literature on instruments for evaluating clinical teachers and to summarize themes that will aid in developing universally appealing tools.

DESIGN: Searching 5 electronic databases revealed over 330 articles. Excluded were reviews, editorials, and qualitative studies. Twenty-one articles describing instruments designed for evaluating clinical faculty by learners were found. Three investigators studied these papers and tabulated characteristics of the learning environments and validation methods. Salient themes among the evaluation studies were determined.

MAIN RESULTS: Many studies combined evaluations from both outpatient and inpatient settings and some authors combined evaluations from different learner levels. Wide ranges in numbers of teachers, evaluators, evaluations, and scale items were observed. The most frequently encountered statistical methods were factor analysis and determining internal consistency reliability with Cronbach’s α. Less common methods were the use of test-retest reliability, interrater reliability, and convergent validity between validated instruments. Fourteen domains of teaching were identified and the most frequently studied domains were interpersonal and clinical-teaching skills.
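As an illustration of the internal consistency statistic named above, the following is a minimal sketch of Cronbach's α computed from a table of ratings. The function name `cronbach_alpha`, the data layout (one row per completed evaluation form, one column per scale item), and the sample scores are assumptions made for this example; they are not taken from any of the reviewed instruments.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of rows (one row per evaluation form).

    Each row holds one numeric score per scale item. Uses the standard
    formula alpha = k/(k-1) * (1 - sum(item variances) / variance(totals)).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two evaluation forms")
    k = len(ratings[0])
    if k < 2:
        raise ValueError("need at least two scale items")
    if any(len(row) != k for row in ratings):
        raise ValueError("every form must score the same items")

    # Sample variance of each item (column) across all forms.
    item_variances = [variance(column) for column in zip(*ratings)]
    # Sample variance of the total score per form.
    total_variance = variance([sum(row) for row in ratings])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: four forms, three items on a 1-5 scale.
example_ratings = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 3],
]
alpha = cronbach_alpha(example_ratings)  # about 0.947 for this made-up data
```

Values closer to 1 indicate that the items move together across forms. The sketch treats every form as an independent observation, which is a simplification when the same learner rates several teachers.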

CONCLUSIONS: Characteristics of teacher evaluations vary between educational settings and between different learner levels, indicating that future studies should use more narrowly defined study populations. A variety of validation methods including temporal stability, interrater reliability, and convergent validity should be considered. Finally, existing data support the validation of instruments composed solely of interpersonal and clinical-teaching domains.
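Temporal stability (test-retest reliability) is often summarized as the correlation between scores from two administrations of the same instrument. The sketch below uses a plain Pearson correlation for that purpose. This is one common choice and an assumption of this example, since the reviewed studies may have used other coefficients such as an intraclass correlation. The function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (>= 2)")
    mean_first = sum(first) / len(first)
    mean_second = sum(second) / len(second)

    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    sxx = sum((a - mean_first) ** 2 for a in first)
    syy = sum((b - mean_second) ** 2 for b in second)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary in both administrations")

    return sxy / sqrt(sxx * syy)


# Hypothetical mean scores for five teachers, rated on two occasions.
time_1 = [4.2, 3.8, 4.9, 3.1, 4.5]
time_2 = [4.0, 3.9, 4.7, 3.3, 4.6]
stability = pearson_r(time_1, time_2)
```

A coefficient near 1 suggests that teachers keep roughly the same relative standing across the two occasions; it does not by itself show that the absolute scores agree.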

Author information

Corresponding author

Correspondence to Thomas J. Beckman MD.

About this article

Cite this article

Beckman, T.J., Ghosh, A.K., Cook, D.A. et al. How reliable are assessments of clinical teaching? J Gen Intern Med 19, 971–977 (2004). https://doi.org/10.1111/j.1525-1497.2004.40066.x

  • DOI: https://doi.org/10.1111/j.1525-1497.2004.40066.x
