Analyzing data with clumping at zero. An example demonstration

J Clin Epidemiol. 2000 Oct;53(10):1036-43. doi: 10.1016/s0895-4356(00)00223-7.

Abstract

This article demonstrates two approaches to analyzing the relationship of multiple covariates to an outcome that has a high proportion of zero values. One approach is to categorize the continuous outcome (including the zero category) and then fit a proportional odds model. The other is to use logistic regression to model the probability of a zero response and ordinary least squares linear regression to model the non-zero continuous responses. Both approaches are demonstrated using outcomes data on hours of care received from the Springfield Elder Project. A crude linear model including both zero and non-zero values is also fitted for comparison. We conclude that the choice of approach depends on the data. If the proportional odds assumption is valid, then the proportional odds model appears to be the method of choice; otherwise, the combination of logistic regression and a linear model is preferable.
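The second approach described above (logistic regression for the probability of a zero, plus least squares for the non-zero responses) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are simulated rather than taken from the Springfield Elder Project, and the names `fit_logistic`, `fit_two_part`, `predict_mean`, and `hours` are hypothetical. The logistic part is fitted with a plain Newton-Raphson loop so the example needs only NumPy.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)                  # variance weights
        grad = X.T @ (y - p)               # score vector
        hess = X.T @ (X * w[:, None])      # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def fit_two_part(X, y):
    """Part 1: model P(y == 0). Part 2: OLS on the non-zero responses only."""
    is_zero = (y == 0).astype(float)
    beta_zero = fit_logistic(X, is_zero)
    nonzero = y != 0
    beta_pos, *_ = np.linalg.lstsq(X[nonzero], y[nonzero], rcond=None)
    return beta_zero, beta_pos


def predict_mean(X, beta_zero, beta_pos):
    """Overall expected outcome: P(non-zero) times the mean given non-zero."""
    p_zero = 1.0 / (1.0 + np.exp(-(X @ beta_zero)))
    return (1.0 - p_zero) * (X @ beta_pos)


# Simulated data (an assumption for illustration only): one covariate,
# an outcome "hours" that is exactly zero for a large share of subjects.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
p_zero_true = 1.0 / (1.0 + np.exp(-(0.5 - 1.0 * x)))
zero = rng.random(n) < p_zero_true
positive_hours = np.clip(10.0 + 3.0 * x + rng.normal(scale=2.0, size=n), 0.5, None)
hours = np.where(zero, 0.0, positive_hours)

beta_zero, beta_pos = fit_two_part(X, hours)
fitted = predict_mean(X, beta_zero, beta_pos)
print("zero-part coefficients:    ", beta_zero)
print("non-zero-part coefficients:", beta_pos)
```

The two parts are estimated separately, so a covariate may affect whether any care is received differently from how it affects the amount of care among recipients. The crude linear model mentioned in the abstract would correspond to a single least-squares fit of `hours` on `X` using all rows, zeros included.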

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Activities of Daily Living
  • Aged
  • Disabled Persons
  • Female
  • Health Services Needs and Demand
  • Health Services for the Aged
  • Humans
  • Least-Squares Analysis
  • Linear Models
  • Logistic Models
  • Male
  • Models, Statistical*
  • Odds Ratio
  • Outcome Assessment, Health Care / statistics & numerical data*