Cave Consulting Group, Improving Efficiency and Quality in the Healthcare System
Efficiency Measurement Errors

Recent evidence has demonstrated that leading practitioner efficiency measurement systems have only about 30% agreement across measurement systems. This means that when one system ranks a practitioner as "inefficient," only about 30% of the other systems ranked the same practitioner as inefficient. The remaining 70% of systems ranked the same practitioner as "efficient" (K. Grazier and J.W. Thomas. A Comparative Evaluation of Risk-Adjustment Methodologies for Profiling Physician Practice Efficiency. A report to the Robert Wood Johnson Foundation, September 2002).
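To make the agreement figure concrete, the following minimal Python sketch shows one way such a rate could be computed. It is not the calculation used in the cited report; the function name, the system names, and the practitioner identifiers are all hypothetical.

```python
def agreement_rate(flags_by_system):
    """Average share of *other* systems that agree with each 'inefficient' flag.

    flags_by_system maps a system name to the set of practitioners that
    system ranks as inefficient. All names here are hypothetical.
    """
    systems = list(flags_by_system)
    if len(systems) < 2:
        raise ValueError("need at least two systems to measure agreement")
    shares = []
    for system in systems:
        others = [s for s in systems if s != system]
        for practitioner in flags_by_system[system]:
            # Count how many of the other systems flag the same practitioner.
            agreeing = sum(practitioner in flags_by_system[o] for o in others)
            shares.append(agreeing / len(others))
    return sum(shares) / len(shares) if shares else 0.0


# Made-up flags from three made-up systems.
flags = {
    "system_a": {"dr_1", "dr_2"},
    "system_b": {"dr_2", "dr_3"},
    "system_c": {"dr_4"},
}
rate = agreement_rate(flags)
```

With these made-up flags, only dr_2 is flagged by more than one system, so the average agreement is low.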

These findings show that existing systems have significant error in attempting to accurately identify inefficient practitioners. The error needs to be eliminated, or significantly reduced, if purchasers are to save money by identifying and dealing with inefficient practitioners. Every practitioner falsely measured as efficient (or inefficient) leads to continued inefficiency in the healthcare marketplace.

The methodologies developed by Cave Consulting Group are designed to reduce eight common errors in practitioner efficiency and quality measurement. Reducing these errors is essential if market quality and efficiency are to be improved.

We list the eight errors below and describe how our methodologies correct for each one. Other errors exist, but these are the most important.

1. Support an approach based on services per 1,000 members
2. Use a practitioner's actual episode composition
3. No severity-of-illness measure by medical condition
4. No age category assignment by medical condition
5. No tracking mechanism for related complication episodes of care
6. Under-reported charges attributed to partial episodes
7. Over-reported charges attributed to variable chronic disease episode endpoints
8. No minimum number of episodes of care


1. Support an Approach Based on Services per 1,000 Members

Many practitioner efficiency and quality methodologies continue to examine "services per 1,000 members." This approach probably adds the most to efficiency measurement error (see Efficiency and Quality Measurement references).

Some methodologies still attempt to adjust very heterogeneous "services per 1,000 members" categories by age and gender--and then compare one practitioner's utilization pattern to a peer group average. However, age and gender explain less than 5% of the variance in a patient's medical expenditures. This means that over 95% of the variance is unexplained, and may be attributed to differences in patient health status.

Some methodologies adjust "services per 1,000 members" based on specific ICD-9 (or diagnosis) algorithms that measure expected resource intensity. The idea is that a patient's diagnosis codes will provide more predictive power than age and gender alone. However, the most predictive of the published and marketed models explain only 20% to 30% of the variance in a patient's medical expenditures. This means that 70% or more of the variance continues to be unexplained, and may be attributed to differences in patient health status.
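"Variance explained" here is the usual R-squared statistic. The short Python sketch below shows the standard calculation; the expenditure figures are invented for illustration and do not come from any of the models discussed.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` that is explained by `predicted`."""
    mean = sum(actual) / len(actual)
    ss_total = sum((a - mean) ** 2 for a in actual)
    if ss_total == 0:
        raise ValueError("actual values have no variance to explain")
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_residual / ss_total


# Invented annual expenditures for four patients.
actual = [100.0, 200.0, 300.0, 400.0]

# A model that predicts the same average for everyone explains nothing.
flat_model = r_squared(actual, [250.0, 250.0, 250.0, 250.0])

# A model that predicts every patient exactly explains everything.
perfect_model = r_squared(actual, actual)
```

An R-squared of 0.20 to 0.30 sits much closer to the flat model than to the perfect one, which is why the remaining variance matters so much in practitioner comparisons.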

Practitioners often criticize the "services per 1,000 members" methodologies that use a predictive case-mix adjustment factor. Practitioners state that the methodologies do not appropriately adjust for differences in patient health status--rightly stating that their patients may be "sicker."

Some models claim to predict over 50% of a patient's medical expenditures, but these models generally do not examine all of a patient's claims (including ambulatory, outpatient facility, and inpatient facility) and/or have significantly manipulated the claims data (e.g., eliminating outliers, observing trimmed means, log transforming variables). The resulting model may be more predictive, but generally does not have real-life applications.


The best predictive models on the market today explain only 20% to 30% of the variance in a patient's medical expenditures. This means that 70% or more of the variance is unexplained, and may be attributed to differences in patient health status. Consequently, including all--or almost all--patients in practitioner efficiency and quality measurement will result in unstable and inaccurate ratings.


Our approach:
A methodology developed by the Cave Consulting Group examines condition-specific, longitudinal episodes of care. We apply an efficiency measurement system known as the "marketbasket approach." This approach has gone through over a decade of research and development, and is widely published in academic and trade journals (see Efficiency and Quality Measurement references). We have used the marketbasket approach to examine practitioner efficiency and quality with employers, HMOs, insurance companies, TPAs, physician groups, and physician-hospital organizations--totaling over 30 million member-years.

The approach develops marketbaskets of the most common medical conditions for each specialty type. The methodology uses an indirect standardization technique for weighting together the episodes within the core group of medical conditions in a consistent fashion--thereby allowing each practitioner's efficiency and quality performance to be more accurately compared to one another. The same standardized weights are applied, regardless of each practitioner's actual episode composition.

An advantage of the marketbasket approach over other efficiency methodologies is that examining only common conditions and easier-to-treat patient episodes results in a fair apples-to-apples comparison of each practitioner's practice patterns. Therefore, the variation in treatment patterns is related to actual practitioner efficiency, and not to sicker or healthier patients.
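The idea of applying the same standardized weights to every practitioner can be illustrated with a minimal Python sketch. This is not the Cave Consulting Group's actual algorithm; the function name, the conditions, the weights, and the charges are all hypothetical, and the sketch assumes the practitioner has episodes for every condition in the marketbasket.

```python
def marketbasket_score(practitioner_avg, peer_avg, weights):
    """Ratio of weighted practitioner charges to weighted peer charges.

    The same `weights` are applied to every practitioner, regardless of how
    many episodes of each condition that practitioner actually treated.
    A score above 1.0 means higher charges than the peer group.
    """
    practitioner_total = 0.0
    peer_total = 0.0
    for condition, weight in weights.items():
        # Assumption: average charges exist for every marketbasket condition.
        practitioner_total += weight * practitioner_avg[condition]
        peer_total += weight * peer_avg[condition]
    return practitioner_total / peer_total


# Hypothetical marketbasket weights and average charges per episode.
weights = {"sinusitis": 0.5, "hypertension": 0.3, "diabetes": 0.2}
peer_avg = {"sinusitis": 200.0, "hypertension": 800.0, "diabetes": 4000.0}
practitioner_avg = {"sinusitis": 220.0, "hypertension": 800.0, "diabetes": 3600.0}

score = marketbasket_score(practitioner_avg, peer_avg, weights)
```

Because the weights are fixed, a practitioner who happens to treat mostly diabetes is scored on the same mix of conditions as one who treats mostly sinusitis.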

2. Use a Practitioner's Actual Episode Composition

In measuring practitioner efficiency and quality, many traditional methodologies examine a practitioner's actual episode composition as compared to a specialty-specific peer group--and then compare the efficiency and quality of that practitioner to another practitioner. This approach is the second most important factor leading to efficiency measurement error (see Efficiency and Quality Measurement references). The following example will demonstrate why.

Assume one general internist treats 100% sinusitis episodes. A second internist treats 100% ischemic heart disease episodes. Under many efficiency measurement methods, both practitioners' practice patterns will be compared to their same peer group--and then the practitioners' efficiency scores will be compared to one another.

However, this is not a fair statistical comparison, as the internist treating ischemic heart disease has a greater chance of practice pattern variability due to the intensity of services needed to treat ischemic heart disease--as compared to sinusitis episodes. That is, chronic conditions (such as ischemic heart disease) have more variability around average episode treatment charges as compared to acute conditions (such as sinusitis) because more services are used to treat chronic conditions and the episodes have longer durations. Consequently, practitioners treating chronic episodes have a greater chance of practice pattern variability and, therefore, receiving an inefficient ranking than practitioners treating acute conditions.

A correlation analysis helps to clarify this point. Using a more traditional efficiency measurement methodology, the Cave Consulting Group examined the correlation between a practitioner's efficiency score and the practitioner's episode case-mix composition. The analysis showed that lower-volume practitioners with a higher case-mix index for episodes treated were more likely to receive an inefficient score as compared to practitioners with a lower case-mix index score. We suggest you perform a correlation analysis using your current efficiency measurement system and observe first-hand the possible correlation between patient case-mix composition and practitioner inefficiency ratings.
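A reader who wants to run this kind of check could start from a minimal Python sketch such as the following, which computes a Pearson correlation between case-mix index and efficiency score. The data are invented, and the convention that a higher score means a less efficient practitioner is an assumption of this example, not a description of any particular system.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return covariance / math.sqrt(var_x * var_y)


# Invented data: one entry per practitioner.
case_mix_index = [0.8, 1.0, 1.2, 1.5]
efficiency_score = [0.9, 1.0, 1.1, 1.3]  # assumption: higher = less efficient

r = pearson(case_mix_index, efficiency_score)
```

A strongly positive value on real data would suggest that the scoring system is penalizing case mix rather than measuring practice patterns.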


A correlation analysis showed that lower-volume practitioners with a higher patient case-mix index for episodes treated were more likely to receive an inefficient score as compared to practitioners with a lower patient case-mix index.


Our approach: A methodology developed by the Cave Consulting Group builds a broad-based marketbasket of medical conditions for each specialty type. The methodology uses an indirect standardization technique for weighting together the episodes within the core group of medical conditions in a consistent fashion--thereby allowing each practitioner's efficiency and quality performance to be more accurately compared to one another. The same standardized weights are applied, regardless of each practitioner's actual episode composition.

For a given practitioner specialty, the marketbasket of medical conditions does not change significantly over time. This means that any trend increase reflected by a marketbasket is independent of changes in patient case-mix, health status, and demographics. Instead, the trend reflects service inflation, service volume increases, and service intensity increases to treat the episodes within the static set of medical conditions.

3. No Severity-of-Illness Measure by Medical Condition

Some claims-based episode groupers and methods do not have a severity-of-illness index by medical condition. This issue is the third most important factor leading to efficiency and quality measurement error. The reason is that the formulated episodes have significant heterogeneity (see Efficiency and Quality Measurement references). The end result may be practitioner efficiency differences that are attributed to inaccurate episode severity-of-illness coding--and not to practice patterns variation.

Moreover, some claims-based episode groupers stratify formulated episodes for a medical condition by the presence or absence of a specific surgery or service (e.g., knee derangement with and without surgery; ischemic heart disease with and without heart catheterization). The reason for performing this stratification is to reduce episode heterogeneity for a medical condition. In effect, the stratification serves as a sort of severity-of-illness adjustment.

However, stratification based on the presence of surgery or a high-cost service results in at least two practitioner efficiency measurement errors: (1) performing surgery versus not performing surgery is the practice pattern variation we need to examine in determining practitioner efficiency and quality, and this variation is not captured in more traditional methodologies; and (2) the episodes of care are unnecessarily separated into smaller groups whereby practitioners may not have enough episodes to examine in any one smaller group. Consequently, the stratified episodes of care need to be recombined for accurate practitioner efficiency and quality measurement.

Our approach: The Cave Consulting Group implements a severity-of-illness index by medical condition to reduce the heterogeneity in longitudinal episodes of care. Then, we examine the most prevalent, condition-specific severity-of-illness classes treated by practitioners of a given specialty type.

For those clients that have implemented an episode grouper without a severity-of-illness index, the Cave Consulting Group applies a more sophisticated outlier analysis by medical condition and practitioner to reduce episode heterogeneity.

4. No Age Category Assignment by Medical Condition

Many methodologies do not examine condition-specific episodes by age category. Yet, studies have illustrated that broad-based age bands are important to separately examine--even after episodes have been assigned a severity-of-illness index (see Efficiency and Quality Measurement references). The reason is that practitioners tend to treat children and adults differently for most conditions. For example, children are less likely than adults to receive a chest x-ray and potent antibiotics for many medical conditions. The end result may be practitioner efficiency and quality differences that are attributed to patient age differences--and not to practice patterns variation.

Our approach: The Cave Consulting Group uses broad-based age classes to reduce the heterogeneity in condition-specific episodes of care. Then, we examine the most prevalent age groups treated by practitioners of a given specialty type.
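As an illustration only, the Python sketch below groups episodes into comparison cells defined by medical condition, severity-of-illness class, and broad age band. The age cut points, field names, and sample episodes are hypothetical and are not taken from the Cave Consulting Group's methodology.

```python
from collections import defaultdict


def age_band(age):
    """Map an age in years to a broad band. The cut points are hypothetical."""
    if age < 18:
        return "child"
    if age < 65:
        return "adult"
    return "senior"


def group_into_cells(episodes):
    """Group episode charges by (condition, severity class, age band)."""
    cells = defaultdict(list)
    for episode in episodes:
        key = (episode["condition"], episode["severity"], age_band(episode["age"]))
        cells[key].append(episode["charges"])
    return dict(cells)


# Invented episodes for illustration.
episodes = [
    {"condition": "asthma", "severity": 1, "age": 9, "charges": 300.0},
    {"condition": "asthma", "severity": 1, "age": 41, "charges": 450.0},
    {"condition": "asthma", "severity": 1, "age": 12, "charges": 280.0},
]
cells = group_into_cells(episodes)
```

Comparing practitioners within a cell keeps pediatric and adult episodes of the same condition from being averaged together.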

5. No Tracking Mechanism for Related Complication Episodes of Care

Many methodologies do not link the charges and utilization from a patient's complication episodes to the patient's underlying medical condition. Complications are those episodes that are clinically related to the primary medical condition. Consequently, many condition-specific episodes have under-reported charges (see Efficiency and Quality Measurement references).

For example, practitioners code up to 70% of an average diabetic's charges under complications related to diabetes (e.g., neuropathies, circulatory, eye, renal) and not under diabetes care itself. Therefore, without considering and including related complication episodes with the actual diabetes episode, practitioner efficiency and quality differences may be attributed to incomplete episode charges and utilization--and not to treatment pattern variations.

Furthermore, models that attempt to predict diabetic patients with an unstable medical condition will produce erroneous results. The reason is that important patient utilization data has not been analyzed by the predictive model.

Our approach: The methodologies developed by the Cave Consulting Group include the charges and utilization from both a patient's underlying medical condition episode and the related complication episodes. Therefore, we examine the patient's overall condition-specific treatment pattern. For instance, when examining treatment patterns for diabetic patients, our average episode charge per diabetic patient is over $4,000. Many efficiency measurement systems only track $1,250 in average episode charges (that is, these systems underestimate actual diabetic treatment charges by about 70%).

The difference is that, for diabetic patients, many other efficiency measurement systems do not include the episodes associated with eye complications (e.g., retinal involvement, visual disturbances), circulatory complications (e.g., peripheral vascular disease, cardiovascular disease), renal complications (e.g., glomerulonephritis, renal failure), and nerve complications (e.g., neuropathies, neuritis). Instead, the systems track only charges and utilization coded directly as diabetes treatment (e.g., ICD-9 code 250). Some systems do track that complications may exist for a diabetic patient, but most of these systems only have a field that states the number of patient complications.
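The effect of rolling complication episodes into the underlying condition can be sketched in Python as follows. The episode records, the condition labels, and the individual charges are invented; they were chosen only so that the totals mirror the $1,250 and $4,000 figures quoted above.

```python
def total_condition_charges(episodes, underlying, related):
    """Sum charges for the underlying condition plus related complications."""
    included = {underlying} | set(related)
    return sum(e["charges"] for e in episodes if e["condition"] in included)


# Invented episodes for one diabetic patient.
episodes = [
    {"condition": "diabetes", "charges": 1250.0},
    {"condition": "retinopathy", "charges": 900.0},
    {"condition": "neuropathy", "charges": 850.0},
    {"condition": "renal_failure", "charges": 1000.0},
    {"condition": "sinusitis", "charges": 150.0},  # unrelated, never included
]
complications = ["retinopathy", "neuropathy", "renal_failure"]

diabetes_only = total_condition_charges(episodes, "diabetes", [])
with_complications = total_condition_charges(episodes, "diabetes", complications)

# Share of condition-related charges missed when complications are ignored.
under_reported_share = 1 - diabetes_only / with_complications
```

Here the share missed works out to 68.75%, consistent with the "about 70%" stated above.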

6. Under-Reported Charges Attributed to Partial Episodes

Some methodologies do not separate partial from complete episodes of care when measuring practitioner efficiency and quality (see Efficiency and Quality Measurement references). Partial episodes result because a patient enrolled in a health plan during the study period or disenrolled during the study period. However, including partial episodes leads to inaccurate efficiency and quality measurement because of under-reported episode charges--especially when some practitioners have more partial episodes than other practitioners. The end result may be practitioner efficiency and quality differences that are attributed to the inclusion of partial episodes--and not to practice patterns variation.

Our approach: Our methodologies exclude from efficiency and quality measurement all partial patient episodes of care. To achieve this objective, the Cave Consulting Group has developed sophisticated algorithms that examine the beginning and ending points of all acute- and chronic-condition episodes, and subsequently label each patient episode as partial or complete.
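The Cave Consulting Group's algorithms are not described here, so the following Python sketch shows only one simple way partial episodes might be labeled. It assumes an episode is complete when the patient was enrolled for a claim-free buffer before the first claim and after the last claim; the `clean_days` parameter and its default of 30 days are hypothetical.

```python
from datetime import date, timedelta


def label_episode(first_claim, last_claim, enrolled_from, enrolled_to, clean_days=30):
    """Label an episode 'complete' or 'partial' from enrollment coverage.

    Assumption: the true start and end of an episode are observable only if
    the patient was enrolled for `clean_days` before the first claim and
    `clean_days` after the last claim.
    """
    buffer = timedelta(days=clean_days)
    start_observed = enrolled_from <= first_claim - buffer
    end_observed = last_claim + buffer <= enrolled_to
    return "complete" if start_observed and end_observed else "partial"


# Patient enrolled for all of 2002; episode falls in the middle of the year.
mid_year = label_episode(
    date(2002, 3, 1), date(2002, 4, 1), date(2002, 1, 1), date(2002, 12, 31)
)

# Same enrollment, but the first claim arrives ten days after enrollment.
early = label_episode(
    date(2002, 1, 10), date(2002, 2, 1), date(2002, 1, 1), date(2002, 12, 31)
)
```

In the second case the episode may have begun before the patient enrolled, so its charges could be under-reported and it is labeled partial.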

7. Over-Reported Charges Attributed to Variable Chronic Disease Episode Endpoints

Some methodologies do not appropriately end a patient's episode of care before measuring a practitioner's efficiency and quality. For example, chronic conditions may continue indefinitely and, therefore, patient episodes of care may be of various durations (e.g., 60 days or 600 days)--depending on the amount of available patient claims data (see Efficiency and Quality Measurement references). The end result may be practitioner efficiency and quality differences that are attributed to excessively long or variable chronic condition episode durations--and not to practice patterns variation.

Our approach: The Cave Consulting Group's methodologies hold patient chronic disease episodes to an appropriate and consistent duration, so that practitioner efficiency and quality measurement is accurate.
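One simple way to give chronic episodes a consistent duration is to count only the claims that fall within a fixed window after the first claim. The Python sketch below illustrates that idea; the fixed-window rule, the 180-day default, and the claim data are assumptions of this example rather than the Cave Consulting Group's stated method.

```python
from datetime import date, timedelta


def charges_in_window(claims, window_days=180):
    """Sum charges for claims dated within `window_days` of the first claim.

    `claims` is a list of (service_date, charge) pairs. The fixed window is a
    hypothetical rule used to make chronic episode durations comparable.
    """
    if not claims:
        return 0.0
    start = min(service_date for service_date, _ in claims)
    cutoff = start + timedelta(days=window_days)
    return sum(charge for service_date, charge in claims if service_date < cutoff)


# Invented claims for one chronic episode spanning eight months.
claims = [
    (date(2002, 1, 1), 100.0),
    (date(2002, 5, 1), 200.0),
    (date(2002, 9, 1), 300.0),  # falls outside the 180-day window
]
windowed_total = charges_in_window(claims)
```

With the window applied, a patient with 600 days of claims history and one with 180 days contribute episodes of the same length.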

8. No Minimum Number of Episodes of Care

Some methodologies do not require a minimum number of condition-specific episodes when comparing a practitioner's practice patterns to a peer group. Instead, one or two episodes of care are enough. However, there may be significant episode of care heterogeneity in one or two condition-specific episodes--even after applying a sophisticated severity-of-illness index. Consequently, examining one episode here-and-there for a practitioner may introduce significant error into a practitioner's efficiency and quality measurement. The end result may be practitioner efficiency and quality differences that are attributed to the heterogeneity in the low number of episodes examined--and not to practice patterns variation.

Our approach: The Cave Consulting Group's methodologies require a practitioner to have a minimum number of episodes for each medical condition. If the minimum episode number is not achieved, then a standardized methodology is applied to measure a practitioner against a peer group (see Efficiency and Quality Measurement references).
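A minimum-episode rule of this kind can be sketched in a few lines of Python. The threshold of 10 episodes and the sample data are hypothetical, and the sketch does not attempt the standardized fallback methodology mentioned above.

```python
from collections import Counter


def conditions_meeting_minimum(episode_conditions, min_episodes=10):
    """Return {condition: episode count} for conditions at or above the minimum.

    `episode_conditions` lists the condition of each episode attributed to one
    practitioner. The default threshold is a hypothetical value.
    """
    counts = Counter(episode_conditions)
    return {c: n for c, n in counts.items() if n >= min_episodes}


# Invented episode list for one practitioner.
practitioner_episodes = ["sinusitis"] * 12 + ["diabetes"] * 2

eligible = conditions_meeting_minimum(practitioner_episodes)
```

Only the conditions returned here would be compared directly against the peer group; the two diabetes episodes are too few to support a stable comparison.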




Copyright © 2003 Cave Consulting Group, All Rights Reserved