II. International Congress on Critical Care on the Internet –

CIMC 2000

Conference:

Evaluation of the Severity of Illness

 

Philipp G. H. Metnitz


Current address:

Departement Réanimation Médicale,

Hôpital St. Louis, Université Lariboisière-St. Louis,

1 Avenue Claude Vellefaux, 75010 Paris, France.

 

Email: philipp.metnitz@univie.ac.at


Modern intensive care faces increasing medical, ethical, and economic demands. Quality management provides the tools to deal with these issues. Intensive care treatment is, however, a complex process carried out in a very heterogeneous population and influenced by several variables, such as cultural background and differences between health care systems. It is therefore extremely difficult to reduce the quality of intensive care to something measurable, to determine it, and to compare it between institutions.

 

Although quality has a variety of dimensions, the main interest today focuses on effectiveness and efficiency: other issues are less relevant if the care being provided is ineffective or harmful. The priority must therefore be to evaluate effectiveness. The instrument available to measure effectiveness in intensive care is outcome research. The starting point for this research was the high variability in medical processes [1] found during the first part of the 20th century, when epidemiology was developing. This variation in medicine, including the lack of standardization, led to the search for the "optimal" therapy. Outcome research provided the methods to compare different groups of patients and institutions. Risk adjustment is now used to standardize different groups of patients, which makes it possible to evaluate associations between treatments and outcome.

 

Assessment of the severity of illness through prediction of hospital mortality is the method of choice for risk adjustment in intensive care. Recent studies have shown, however, that the performance of these prediction models can vary considerably when they are applied to populations different from the one in which they were developed [2,3].

 

Before a general severity of illness score is applied, its performance has to be tested in terms of discrimination and calibration. Discrimination refers to the model's ability to distinguish between nonsurvivors and survivors, assigning higher scores to patients who die. Calibration refers to the accuracy of the prediction when the numbers of predicted and observed deaths are compared over the range of severity of illness. Customization of a general severity score by deriving a new logistic regression equation has been found useful when the calibration of general scoring systems is poor [3,4].
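
To make these two concepts concrete, the sketch below (Python, with simulated data; it is not the analysis code of the studies cited here) computes discrimination as the area under the ROC curve and calibration as a Hosmer-Lemeshow-type comparison of observed and predicted deaths across strata of predicted risk.

import numpy as np
from sklearn.metrics import roc_auc_score

def hosmer_lemeshow(y_obs, p_pred, n_groups=10):
    """Compare observed and predicted deaths in strata of predicted risk."""
    order = np.argsort(p_pred)
    chi2 = 0.0
    for g in np.array_split(order, n_groups):
        obs = y_obs[g].sum()            # observed deaths in the stratum
        exp = p_pred[g].sum()           # expected deaths = sum of predicted risks
        var = exp * (1 - exp / len(g))  # binomial variance approximation
        chi2 += (obs - exp) ** 2 / max(var, 1e-9)
    return chi2

# y: hospital outcome (1 = died), p: predicted risk of death (e.g. from SAPS II)
y = np.array([0, 0, 1, 0, 1, 1, 0, 0, 1, 0])
p = np.array([0.10, 0.20, 0.80, 0.15, 0.60, 0.90, 0.30, 0.05, 0.70, 0.25])

print("Discrimination (area under ROC curve):", roc_auc_score(y, p))
print("Calibration (Hosmer-Lemeshow chi-square):", hosmer_lemeshow(y, p, n_groups=2))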

 

During the evaluation of the ASDI Documentation Standard for Intensive Care Medicine, a lack of calibration of the SAPS II [5] in Austrian patients was found [6]. There are several possible explanations for this lack of calibration. First, SAPS II does not take into account all the factors that are known to influence outcome. Second, our results, as well as other studies [2], demonstrated that the lack of uniformity of fit of the SAPS II was also attributable to factors that are included in the model. Third, a variety of factors (known and unknown) are not included in the SAPS II but contribute to the phenomenon of unmeasured case mix.

 

For this reason, the SAPS II was recalibrated using "first-level customization" [7]. The resulting prognostic model, SAPS II-AM (AM for Austrian Model), was later validated in a larger cohort of patients [8]. The improved prognostic performance of the SAPS II-AM-99 can be seen from the corresponding calibration curves (Figure 1).
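
The sketch below illustrates what first-level customization means in practice: the SAPS II points themselves are left unchanged, and only the logistic equation converting the score into a predicted hospital mortality is re-estimated on the local data set. The data are simulated, the covariates (score and ln(score + 1)) follow the commonly cited form of the original SAPS II equation, and the resulting coefficients are purely illustrative; they are not the SAPS II-AM values.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated local cohort: SAPS II scores and hospital outcome (1 = died).
# The original published equation is used here only to generate plausible data.
saps = rng.integers(0, 100, size=2000).astype(float)
logit = -7.7631 + 0.0737 * saps + 0.9971 * np.log(saps + 1)
died = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# First-level customization: refit the logistic equation on the local cohort,
# keeping the score itself (and hence data collection) unchanged.
X = sm.add_constant(np.column_stack([saps, np.log(saps + 1)]))
model = sm.Logit(died, X).fit(disp=False)
print(model.params)   # customized coefficients: intercept, score, ln(score + 1)

# Customized predicted risk of hospital death for a patient with SAPS II = 40
x_new = np.array([[1.0, 40.0, np.log(41.0)]])
print(model.predict(x_new))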

 

Severity of illness data and O/E ratios are increasingly used by governmental and commercial institutions to assess the clinical performance of ICUs [9]. Following a recent study suggesting that such data might be useful to classify ICUs into different levels of clinical and economic performance [10], different models of using these data as a measure of effectiveness in Austrian ICUs were discussed by official institutions. Although these models have not yet been instituted, it seemed important to evaluate the performance of the statistical models used to generate these data and to detect any potential confounders.

 

Analyzing Austrian data, we found that customization changed both predicted hospital mortality and O/E ratios to varying degrees [8]. Moreover, O/E ratios differed widely across subgroups defined by the reason for admission. This behavior will inevitably influence the overall O/E ratio of an ICU: an ICU with a high proportion of a specific patient group could accordingly exhibit lower or higher O/E ratios. Table 1 shows the O/E ratios by reason for admission for patients admitted consecutively to 35 adult Austrian ICUs in 1998 (n=7851).
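
As an illustration of how such figures can be derived, the sketch below computes O/E ratios, overall and per reason-for-admission subgroup, from hypothetical patient-level data (hospital outcome, predicted risk from a model such as SAPS II, and admission category). The confidence interval uses a simple Poisson approximation for the observed deaths and is not necessarily the method used for Table 1.

import numpy as np
import pandas as pd

def oe_ratio(observed, predicted):
    """O/E ratio with an approximate 95% CI (Poisson assumption for observed deaths)."""
    o = observed.sum()      # observed hospital deaths
    e = predicted.sum()     # expected deaths = sum of predicted risks
    ratio = o / e
    se = np.sqrt(o) / e     # rough standard error of the ratio
    return ratio, ratio - 1.96 * se, ratio + 1.96 * se

# Hypothetical patient-level data
df = pd.DataFrame({
    "died":      [0, 1, 0, 1, 0, 0, 1, 0, 1, 0],
    "pred_risk": [0.10, 0.70, 0.20, 0.60, 0.05, 0.30, 0.80, 0.15, 0.90, 0.20],
    "admission": ["trauma", "sepsis", "trauma", "sepsis", "neuro",
                  "neuro", "sepsis", "trauma", "sepsis", "neuro"],
})

print("Overall O/E:", oe_ratio(df["died"], df["pred_risk"]))
for reason, subgroup in df.groupby("admission"):
    print(reason, oe_ratio(subgroup["died"], subgroup["pred_risk"]))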

 

These results demonstrate that today's severity scoring systems, such as the SAPS II, are limited in that they do not measure (and adjust for) a substantial part of what constitutes case mix. Changes in the distribution of patient characteristics (known and unknown) therefore influence prognostic accuracy. When O/E ratios are used as a measure of effectiveness (or of other dimensions of quality), these data must be interpreted critically.

 

 


Table 1. O/E ratios by reason for admission category.

O/E: observed-to-expected mortality ratio; CI: confidence interval; SAPS II: original SAPS II model, as described by Le Gall et al. [5]; SAPS II-AM-99: customized Austrian model; n=7851.

Figures 1a and 1b. Calibration curves for the SAPS II and the SAPS II-AM-99.

 

(a) Original SAPS II model (n=2901). (b) Austrian SAPS II model (SAPS II-AM-99), validation sample (n=1451). Columns: number of patients. Squares: mean predicted hospital mortality. Circles: mean observed hospital mortality. x-axis: predicted risk of death; y-axis: observed hospital mortality.

 


References



[1] Berwick DM. Controlling variation in health care: a consultation from Walter Shewhart. Med Care 1991;29(12):1212-1225.

[2] Moreno R, Apolone G, Reis Miranda D. Evaluation of the uniformity of fit of general outcome prediction models. Intensive Care Med 1998;24:40-47.

[3] Moreno R, Apolone G. Impact of different customization strategies in the performance of a general severity score. Crit Care Med 1997;25:2001-2008.

[4] Le Gall JR, Lemeshow S, Leleu G, Klar J, Huillard J, Rue M, Teres D, Artigas A, for the Intensive Care Scoring Group. Customized probability models for early severe sepsis in adult intensive care patients. JAMA 1995;273:644-650.

[5] Le Gall JR, Lemeshow S, Saulnier F. A new Simplified Acute Physiology Score (SAPS II) based on a European/North American multicenter study. JAMA 1993;270(24):2957-2963.

[6] Metnitz PGH, Vesely H, Valentin A, Popow C, Hiesmayr M, Lenz K, Krenn CG, Steltzer H. Evaluation of an interdisciplinary data set for national ICU assessment. Crit Care Med 1999;27(8):1486-1491.

[7] Metnitz PGH, Valentin A, Vesely H, Alberti C, Lang T, Lenz K, Steltzer H, Hiesmayr M. Prognostic performance and customization of the SAPS II: results of a multicenter Austrian study. Intensive Care Med 1999;25(2):192-197.

[8] Metnitz PGH, Vesely H, Valentin A, Lang T, Le Gall JR. Ratios of observed to expected mortality are affected by differences in case mix and quality of care. Intensive Care Med 2000;26:1466-1472.

[9] Green J, Wintfeld N. Report cards on cardiac surgeons: assessing New York State's approach. N Engl J Med 1995;332(18):1229-1232.

[10] Rapoport J, Teres D, Lemeshow S, Gehlbach S. A method for assessing the clinical performance and cost-effectiveness of intensive care units: a multicenter inception cohort study. Crit Care Med 1994;22(9):1385-1391.