Latent class models for classification

doi:10.1016/S0167-9473(02)00179-2

Computational Statistics & Data Analysis

Volume 41, Issues 3–4, 28 January 2003, Pages 531-537

https://doi.org/10.1016/S0167-9473(02)00179-2

Abstract

An overview is provided of recent developments in the use of latent class (LC) and other types of finite mixture models for classification purposes. Several extensions of existing models are presented. Two basic types of LC models for classification are defined: supervised and unsupervised structures. Their most important special cases are presented and illustrated with an empirical example.

Introduction

Let $y$ denote a discrete dependent, outcome, target, or output variable, and $z$ a vector of independent, input, predictor, or attribute variables. Classification involves predicting the discrete outcome variable $y$ as accurately as possible using the information on the $z$ variables. Recently, latent class (LC), or finite mixture (FM), models have been proposed as classification tools in the field of neural networks (Jacobs et al., 1991; Bishop, 1995, pp. 212–220), as well as in the field of Bayesian (or belief) networks (Kontkanen et al., 1996; Monti and Cooper, 1999; Meilă and Jordan, 2000). This paper gives an overview of these developments and presents several extensions of the proposed models.

Classification using a statistical model involves specifying either a model for $P(y \mid z)$, as in regression analysis, or a model for $P(z \mid y)$, as in discriminant analysis. In the next two sections, we present two basic types of LC models for classification: they involve specifying a model for $P(y \mid z)$ and $P(z \mid y)$, respectively. Subsequently, we illustrate the most important special cases of these two basic types with an empirical example. The paper ends with a short discussion.

Section snippets

Supervised classification structures

The first basic type of LC model for classification involves specifying a model for the conditional distribution of $y$ given $z$, where a discrete hidden variable $x$ serves as an intervening variable. More precisely, the assumed probability structure for $P(y, z)$ is $P(y, z) = P(z)\,P(y \mid z) = P(z) \sum_{x} P(x \mid z)\,P(y \mid z, x),$ where $P(z)$ is treated as fixed. Besides the above probability structure, regression-type constraints are imposed on the model probabilities. Since both the latent variable and the outcome variable are
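As a minimal sketch of this mixture structure, the Python code below evaluates $P(y \mid z) = \sum_{x} P(x \mid z)\,P(y \mid z, x)$ for a single case. It assumes multinomial-logit forms for both $P(x \mid z)$ and $P(y \mid z, x)$ as one possible reading of "regression-type constraints"; the function names and all parameter values are hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    """Turn real-valued scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def linear(intercept, slopes, z):
    """Linear predictor: intercept + slopes . z"""
    return intercept + sum(b * v for b, v in zip(slopes, z))

def supervised_lc_predict(z, class_params, outcome_params):
    """Return [P(y=0|z), P(y=1|z), ...] = sum_x P(x|z) * P(y|z,x).

    class_params[x]      = (intercept, slopes) for the logit of P(x|z)
    outcome_params[x][y] = (intercept, slopes) for the logit of P(y|z,x)
    Both logit forms are assumptions made for this illustration.
    """
    p_x = softmax([linear(a, b, z) for a, b in class_params])
    n_outcomes = len(outcome_params[0])
    p_y = [0.0] * n_outcomes
    for x, weight in enumerate(p_x):
        p_y_given_x = softmax([linear(a, b, z) for a, b in outcome_params[x]])
        for y in range(n_outcomes):
            p_y[y] += weight * p_y_given_x[y]
    return p_y

# Hypothetical parameters: two latent classes, binary outcome, two predictors.
class_params = [(0.0, [0.0, 0.0]), (0.5, [1.0, -1.0])]
outcome_params = [
    [(0.0, [0.0, 0.0]), (1.0, [0.5, 0.0])],   # latent class 0
    [(0.0, [0.0, 0.0]), (-1.0, [0.0, 0.5])],  # latent class 1
]

probs = supervised_lc_predict([1.0, 0.0], class_params, outcome_params)
predicted = max(range(len(probs)), key=probs.__getitem__)  # most probable outcome
```

The predicted outcome is simply the category with the largest mixture probability. Note that with a single latent class the function reduces to an ordinary multinomial logit model.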

Unsupervised classification structures

In the second basic type of LC model for classification, one models the conditional distribution of the $z$ variables given $y$, $P(z \mid y)$. The decomposition of $P(y, z)$ is now $P(y, z) = P(y)\,P(z \mid y) = P(y) \sum_{x} P(x \mid y)\,P(z \mid y, x).$ Since the likelihood function used in the estimation is based on $P(z \mid y)$ or $P(y, z)$, there is no direct relationship between model fit and classification performance. These methods belong, therefore, to the family of unsupervised classification or unsupervised learning methods. The predictive
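To illustrate how a model of this second type can still be used for prediction, the sketch below computes $P(y \mid z)$ from the joint distribution by Bayes' rule, $P(y \mid z) = P(y, z) / \sum_{y'} P(y', z)$. It assumes categorical predictors that are mutually independent within each combination of $y$ and $x$ (the usual local independence assumption, adopted here for illustration only); the names and probability tables are hypothetical.

```python
def joint(y, z, p_y, p_x_given_y, p_z_given_yx):
    """P(y, z) = P(y) * sum_x P(x|y) * prod_k P(z_k | y, x).

    p_z_given_yx[y][x][k][v] is P(z_k = v | y, x); the product over k
    reflects the local independence assumption stated above.
    """
    total = 0.0
    for x, p_x in enumerate(p_x_given_y[y]):
        likelihood = 1.0
        for k, value in enumerate(z):
            likelihood *= p_z_given_yx[y][x][k][value]
        total += p_x * likelihood
    return p_y[y] * total

def posterior(z, p_y, p_x_given_y, p_z_given_yx):
    """Return [P(y=0|z), P(y=1|z), ...] by normalising the joint over y."""
    joints = [joint(y, z, p_y, p_x_given_y, p_z_given_yx)
              for y in range(len(p_y))]
    norm = sum(joints)
    return [j / norm for j in joints]

# Hypothetical model: binary outcome, two latent classes, two binary predictors.
p_y = [0.6, 0.4]
p_x_given_y = [[0.7, 0.3], [0.2, 0.8]]
p_z_given_yx = [
    [[[0.9, 0.1], [0.8, 0.2]], [[0.5, 0.5], [0.4, 0.6]]],  # y = 0
    [[[0.3, 0.7], [0.6, 0.4]], [[0.2, 0.8], [0.1, 0.9]]],  # y = 1
]

post = posterior((1, 1), p_y, p_x_given_y, p_z_given_yx)
predicted = max(range(len(post)), key=post.__getitem__)  # most probable outcome
```

Because the parameters describe $P(y, z)$ rather than $P(y \mid z)$ directly, the classification rule is a by-product of the fitted joint distribution, which is consistent with the remark that model fit and classification performance are not directly linked.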

An application

We applied the various LC models for classification to data on 9949 employees of a large national (American) corporation who were asked about their job satisfaction (see Table 5.10 in Agresti, 1990). The outcome variable (job satisfaction) has two levels: satisfied and not satisfied. The predictors are race, gender, age (three age groups), and regional location (seven regions). The data set was randomly split into a training and a validation sample, consisting of 5007 and 4942 cases,
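A random split of this kind, together with a validation error rate, can be sketched as follows. Only the sample sizes (9949, 5007, 4942) come from the text; the seed, the function names, and the use of a simple shuffle are assumptions for illustration, and no actual data or results are reproduced.

```python
import random

def random_split(n_cases, n_train, seed=0):
    """Shuffle case indices and split them into training and validation sets."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    indices = list(range(n_cases))
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])

def error_rate(actual, predicted):
    """Proportion of cases whose predicted outcome differs from the observed one."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)

train_idx, valid_idx = random_split(9949, 5007)
```

A model would be estimated on the cases in `train_idx`, and `error_rate` would then be applied to its predictions for the cases in `valid_idx`.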

Discussion

We described two basic types of LC models for classification. Advantages of the unsupervised methods are that their estimation is much faster, that they are less prone to local maxima, and that they can easily deal with missing data in the predictor variables. The most important advantage of the supervised methods is their better classification performance.

Among the unsupervised methods, the standard LC model (including its factor variant) yields the results that are easiest to interpret.

References (13)

A. Agresti, Categorical Data Analysis (1990)
C.M. Bishop, Neural Networks for Pattern Recognition (1995)
C.C. Clogg et al., Latent structure analysis of a set of multi-dimensional contingency tables, J. Amer. Statist. Assoc. (1984)
C.M. Dayton et al., Concomitant-variable latent-class models, J. Amer. Statist. Assoc. (1988)
J.A. Hagenaars, Categorical Longitudinal Data—Loglinear Analysis of Panel, Trend and Cohort Data (1990)
R.A. Jacobs et al., Adaptive mixtures of local experts, Neural Comput. (1991)

There are more references available in the full text version of this article.

