Abstract
As interest grows in creating computerized versions of established paper-and-pencil (P&P) questionnaires, it becomes increasingly important to explore whether changing the administration modes of questionnaires affects participants’ responses. This study investigated whether mode effects exist when administering the Center for Epidemiologic Studies Depression (CES-D) scale by a personal digital assistant (PDA) versus the classic P&P mode. The Differential Functioning of Items and Tests (DFIT) procedure identified mode effects on the overall test and individual items. A mixed-effects regression model summarized the mode effects in terms of CES-D scores, and identified interactions with covariates. When the P&P questionnaire was administered first, scores were higher on average (2.4–2.8 points) than those of the other administrations (PDA second, PDA first, and P&P second), and all 20 questionnaire items exhibited a statistically significant mode effect. Highly educated people and younger people demonstrated a smaller difference in scores between the two modes. The mode-by-order effect influenced the interpretation of CES-D scores, especially when screening for depression using the established cut-off scores. These results underscore the importance of evaluating the cross-mode equivalence of psychosocial instruments before administering them in non-established modes.
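To make the screening implication concrete, the sketch below shows how an average elevation of 2.4 to 2.8 points could move a respondent across a screening threshold. It assumes the conventional CES-D cut-off of 16 on the 0 to 60 total-score scale, which the abstract does not state explicitly, and it treats the reported average difference as a fixed shift applied to every respondent, which is a simplification. The function names are hypothetical and are not taken from the study.

```python
from __future__ import annotations

# Assumption: 16 is the conventional CES-D screening cut-off; the abstract
# refers only to "established cut-off scores" without naming a value.
CUTOFF = 16
# Average elevation reported in the abstract for P&P-first administrations.
MODE_SHIFT_RANGE = (2.4, 2.8)


def screens_positive(score: float, cutoff: int = CUTOFF) -> bool:
    """Return True when a CES-D total score meets or exceeds the cut-off."""
    return score >= cutoff


def classification_flips(baseline: float, shift: float, cutoff: int = CUTOFF) -> bool:
    """Return True when adding the shift changes the screening result."""
    return screens_positive(baseline, cutoff) != screens_positive(baseline + shift, cutoff)


def vulnerable_scores(shift: float, cutoff: int = CUTOFF, max_score: int = 60) -> list[int]:
    """List integer baseline scores whose screening result changes under the shift."""
    return [s for s in range(max_score + 1) if classification_flips(s, shift, cutoff)]


if __name__ == "__main__":
    for shift in MODE_SHIFT_RANGE:
        print(shift, vulnerable_scores(shift))
```

Under these assumptions, only baseline scores just below the cut-off change classification, which is where a mode-by-order effect of this size would matter most for screening decisions.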
Abbreviations
- CDIF: Compensatory differential item functioning
- CES-D: Center for Epidemiologic Studies Depression Scale
- CFI: Comparative Fit Index
- DFIT: Differential Functioning of Items and Tests
- DIF: Differential item functioning
- DTF: Differential test functioning
- GRM: Graded response model
- IFA: Item Factor Analysis
- IRT: Item response theory
- MMPI-2: Minnesota Multiphasic Personality Inventory-2
- P&P: Paper and pencil
- PDA: Personal digital assistant
- RMSEA: Root Mean Squared Error of Approximation
- TLI: Tucker–Lewis Index
Acknowledgments
This research was supported in part by a cancer prevention fellowship funded by National Cancer Institute grant R25 CA57730, Robert M. Chamberlain, PhD, Principal Investigator. Data collection was funded in part through a core facility funded by NCI #CA16672. The authors thank Laura Sherman, MD, for her expertise in psychiatry and depression; Veronica Avolevan, Denise Rahming, and Stacie Scruggs for helping collect the data; Carol Rosenblum for coordinating the data collection; Ahmed Khalil for programming the PDA survey administration software; the clinical staff in the Genitourinary Medical Oncology, Gastrointestinal Medical Oncology and Psychiatry clinics at The University of Texas M. D. Anderson Cancer Center for their cooperation and support; and Sandra L. Young, Margaret Newell, Nancy Nabilsi, and reviewers for editorial comments and critiques. The authors are also grateful to the patients and their caregivers who participated in this study.
Cite this article
Swartz, R.J., de Moor, C., Cook, K.F. et al. Mode effects in the center for epidemiologic studies depression (CES-D) scale: personal digital assistant vs. paper and pencil administration. Qual Life Res 16, 803–813 (2007). https://doi.org/10.1007/s11136-006-9158-0
DOI: https://doi.org/10.1007/s11136-006-9158-0