Efficiency Measurement Errors

Recent evidence has demonstrated that leading
practitioner efficiency measurement systems have only about 30% agreement with
one another. This means that when one system ranks a practitioner as "inefficient,"
only about 30% of the other systems rank the same practitioner as inefficient.
The remaining 70% of systems rank the same practitioner as "efficient"
(K. Grazier and J.W. Thomas. A Comparative Evaluation of Risk-Adjustment Methodologies
for Profiling Physician Practice Efficiency. A report to the Robert Wood Johnson
Foundation, September 2002). These findings show that
existing systems have significant error in attempting to accurately identify inefficient
practitioners. The error needs to be eliminated, or significantly reduced, if
purchasers are to save money by identifying and dealing with inefficient practitioners.
Every practitioner falsely measured as efficient (or inefficient) leads to continued
inefficiency in the healthcare marketplace. The methodologies developed
by the Cave Consulting Group are designed to reduce eight common errors in practitioner
efficiency and quality measurement. This error reduction is important if
market quality and efficiency are to be improved. We list the eight potential
errors below and describe how our suggested methodologies correct for each one.
There are other errors, but these are the most important.

1. Support an approach based on services per 1,000 members
2. Use a practitioner's actual episode composition
3. No severity-of-illness measure by medical condition
4. No age category assignment by medical condition
5. No tracking mechanism for related complication episodes of care
6. Under-reported charges attributed to partial episodes
7. Over-reported charges attributed to variable chronic disease episode endpoints
8. No minimum number of episodes of care

1. Support an Approach Based on Services per 1,000 Members

Many practitioner efficiency and quality methodologies continue to examine
"services per 1,000 members." This approach probably adds the most to efficiency
measurement error (see Efficiency and Quality Measurement references).

Some methodologies still attempt to adjust very heterogeneous
"services per 1,000 members" categories by age and gender, and then
compare one practitioner's utilization pattern to a peer group average. However,
age and gender explain less than 5% of the variance in a patient's medical expenditures.
This means that over 95% of the variance is unexplained, and may be attributed
to differences in patient health status.

Some methodologies adjust "services per 1,000 members" based on specific ICD-9
(or diagnosis) algorithms that measure expected resource intensity. The idea is
that a patient's diagnosis codes will provide more predictive power than age and
gender alone. Yet the most predictive of the published and marketed models explain
only 20% to 30% of the variance in a patient's medical expenditures. This means
that 70% or more of the variance continues to be unexplained, and may be attributed
to differences in patient health status.

Practitioners often criticize the "services per 1,000 members" methodologies
that use a predictive case-mix adjustment factor. They state that the methodologies
do not appropriately adjust for differences in patient health status, rightly
noting that their patients may be "sicker."

Some models claim to explain over 50% of the variance in a patient's medical
expenditures, but these models generally do not examine all of a patient's claims
(including ambulatory, outpatient facility, and inpatient facility) and/or have
significantly manipulated the claims data (e.g., eliminating outliers, observing
trimmed means, log transforming variables). The resulting model may be more
predictive, but generally does not have real-life applications.

Because even the best predictive models on the market today leave 70% or more of
the variance in a patient's medical expenditures unexplained, including all (or
almost all) patients in practitioner efficiency and quality measurement
will result in unstable and inaccurate ratings.
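
To make the "variance explained" figures concrete, the following minimal Python sketch computes the share of variance in patient expenditures that a set of predictions explains (R-squared), and the share left unexplained. The function name and the five expenditure figures are hypothetical and chosen only for illustration; they are not drawn from any study or model referenced here.

```python
def variance_explained(actual, predicted):
    """Return R-squared: the share of variance in `actual` explained by `predicted`."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_actual = sum(actual) / len(actual)
    total_ss = sum((a - mean_actual) ** 2 for a in actual)
    if total_ss == 0:
        raise ValueError("actual values have no variance to explain")
    residual_ss = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - residual_ss / total_ss


# Hypothetical annual expenditures (dollars) for five patients, and the
# amounts a hypothetical age-and-gender model predicted for each of them.
actual_spend = [200, 400, 600, 800, 3000]
predicted_spend = [900, 950, 1000, 1050, 1100]

explained = variance_explained(actual_spend, predicted_spend)
unexplained = 1.0 - explained
print(f"explained: {explained:.1%}, unexplained: {unexplained:.1%}")
```

By this measure, a model that explains 30% of the variance still leaves 70% of the differences in patient expenditures unaccounted for.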

Our approach: A methodology developed by the Cave Consulting Group examines
condition-specific, longitudinal episodes of care. We apply an efficiency measurement
system known as the "marketbasket approach." This approach has gone
through over a decade of research and development, and is widely published in
academic and trade journals (see Efficiency
and Quality Measurement references). We have used the marketbasket approach
to examine practitioner efficiency and quality with employers, HMOs, insurance
companies, TPAs, physician groups, and physician-hospital organizations--totaling
over 30 million member-years.
The approach develops marketbaskets of the
most common medical conditions for each specialty type. The methodology uses an
indirect standardization technique for weighting together the episodes within
the core group of medical conditions in a consistent fashion--thereby allowing
each practitioner's efficiency and quality performance to be more accurately compared
to one another. The same standardized weights are applied, regardless of each
practitioner's actual episode composition. An advantage of the marketbasket
approach over other efficiency methodologies is that examining only common conditions
and easier-to-treat patient episodes results in a fair apples-to-apples comparison
of each practitioner's practice patterns. Therefore, the variation in treatment
patterns is related to actual practitioner efficiency, and not to sicker or healthier
patients.

2. Use a Practitioner's Actual Episode Composition

In
measuring practitioner efficiency and quality, many traditional methodologies
examine a practitioner's actual episode composition as compared to a specialty-specific
peer group--and then compare the efficiency and quality of that practitioner
to another practitioner. This approach is the second most important factor leading
to efficiency measurement error (see Efficiency
and Quality Measurement references). The following example will demonstrate
why. Assume one general internist treats 100% sinusitis episodes. A second
internist treats 100% ischemic heart disease episodes. Under many efficiency measurement
methods, both practitioners' practice patterns will be compared to their same
peer group--and then the practitioners' efficiency scores will be compared to
one another. However, this is not a fair statistical comparison, as the
internist treating ischemic heart disease has a greater chance of practice pattern
variability due to the intensity of services needed to treat ischemic heart disease--as
compared to sinusitis episodes. That is, chronic conditions (such as ischemic
heart disease) have more variability around average episode treatment charges
as compared to acute conditions (such as sinusitis) because more services are
used to treat chronic conditions and the episodes have longer durations. Consequently,
practitioners treating chronic episodes have a greater chance of practice pattern
variability and, therefore, of receiving an inefficient ranking than practitioners
treating acute conditions.

A correlation analysis helps to clarify this point. Using a more traditional
efficiency measurement methodology, the Cave Consulting Group examined the
correlation between a practitioner's efficiency score and the practitioner's
episode case-mix composition. The analysis showed that lower-volume practitioners
with a higher case-mix index for episodes treated were more likely to receive an
inefficient score than practitioners with a lower case-mix index. We suggest you
perform a correlation analysis using your current efficiency measurement system
and observe first-hand the possible correlation between
patient case-mix composition and practitioner inefficiency ratings.
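
A correlation check of this kind can be sketched in a few lines of Python. The sketch below computes a Pearson correlation coefficient between a case-mix index and an efficiency score for each practitioner. The variable names, the five sample values, and the convention that a higher efficiency score means a less efficient practitioner are all assumptions made for illustration; substitute the fields your own measurement system produces.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / sqrt(var_x * var_y)


# Hypothetical per-practitioner values, listed in the same practitioner order.
# Assumption: a higher efficiency score means more costly than the peer group.
case_mix_index = [0.80, 0.95, 1.00, 1.10, 1.30]
efficiency_score = [0.90, 0.98, 1.02, 1.05, 1.20]

r = pearson_correlation(case_mix_index, efficiency_score)
print(f"correlation between case-mix index and efficiency score: {r:.2f}")
```

A strongly positive coefficient would suggest that the scoring system may be penalizing practitioners for the mix of episodes they treat rather than for their practice patterns.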

Our approach: A methodology developed by the Cave Consulting Group builds
a broad-based marketbasket of medical conditions for each specialty type. The
methodology uses an indirect standardization technique for weighting together
the episodes within the core group of medical conditions in a consistent fashion--thereby
allowing each practitioner's efficiency and quality performance to be more accurately
compared to one another. The same standardized weights are applied, regardless
of each practitioner's actual episode composition.
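
The idea of applying the same standardized weights to every practitioner can be illustrated with a short Python sketch. It is one plausible reading of the description, not the Cave Consulting Group's actual algorithm: the function name, the two-condition marketbasket, the weights, and all charge figures are hypothetical. Each practitioner's average charge per episode is weighted by the fixed marketbasket weights and divided by the peer group's weighted average, so the number of episodes a practitioner happens to treat for each condition does not enter the score.

```python
def marketbasket_ratio(weights, practitioner_avg, peer_avg):
    """Weighted practitioner charge divided by weighted peer group charge.

    Each argument maps a medical condition to a number. Every condition in
    `weights` must appear in both charge dictionaries.
    """
    weighted_practitioner = sum(w * practitioner_avg[c] for c, w in weights.items())
    weighted_peer = sum(w * peer_avg[c] for c, w in weights.items())
    if weighted_peer == 0:
        raise ValueError("peer group weighted charge is zero")
    return weighted_practitioner / weighted_peer


# Hypothetical marketbasket for one specialty: fixed weights that sum to 1.
weights = {"sinusitis": 0.6, "hypertension": 0.4}

# Hypothetical average charge per episode (dollars) for the peer group.
peer_avg = {"sinusitis": 200.0, "hypertension": 500.0}

# Hypothetical average charge per episode for two practitioners.
practitioner_a = {"sinusitis": 220.0, "hypertension": 450.0}
practitioner_b = {"sinusitis": 200.0, "hypertension": 500.0}

ratio_a = marketbasket_ratio(weights, practitioner_a, peer_avg)
ratio_b = marketbasket_ratio(weights, practitioner_b, peer_avg)
print(f"practitioner A: {ratio_a:.3f}, practitioner B: {ratio_b:.3f}")
```

In this sketch a ratio below 1.0 means the practitioner's weighted charges are lower than the peer group's, and a ratio above 1.0 means they are higher.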
For a given practitioner
specialty, the marketbasket of medical conditions does not change significantly
over time. This means that any trend increase reflected by a marketbasket is independent
of changes in patient case-mix, health status, and demographics. Instead, the
trend reflects service inflation, service volume increases, and service intensity
increases to treat the episodes within the static set of medical conditions.

3. No Severity-of-Illness Measure by Medical Condition

Some claims-based episode
groupers and methods do not have a severity-of-illness index by medical condition.
This issue is the third most important factor leading to efficiency and quality
measurement error. The reason is that the formulated episodes have significant
heterogeneity (see Efficiency and Quality
Measurement references). The end result may be practitioner efficiency
differences that are attributed to inaccurate episode severity-of-illness coding--and
not to practice patterns variation. Moreover, some claims-based episode
groupers stratify formulated episodes for a medical condition by the presence
or absence of a specific surgery or service (e.g., knee derangement with and without
surgery; ischemic heart disease with and without heart catheterization). The reason
for performing this stratification is to reduce episode heterogeneity for a medical
condition. In effect, the stratification serves as a sort of severity-of-illness
adjustment. However, stratification based on the presence of surgery or
a high-cost service results in at least two practitioner efficiency measurement
errors: (1) performing surgery versus not performing surgery is the practice pattern
variation we need to examine in determining practitioner efficiency and quality,
and this variation is not captured in more traditional methodologies; and (2)
the episodes of care are unnecessarily separated into smaller groups whereby practitioners
may not have enough episodes to examine in any one smaller group. Consequently,
the stratified episodes of care need to be recombined for accurate practitioner
efficiency and quality measurement.

Our approach: The Cave
Consulting Group implements a severity-of-illness index by medical condition to
reduce the heterogeneity in longitudinal episodes of care. Then, we examine the
most prevalent, condition-specific severity-of-illness classes treated by practitioners
of a given specialty type. For those clients that have implemented an episode
grouper without a severity-of-illness index, the Cave Consulting Group applies
a more sophisticated outlier analysis by medical condition and practitioner to
reduce episode heterogeneity.

4. No Age Category Assignment by Medical Condition

Many methodologies do not examine condition-specific
episodes by age category. Yet, studies have illustrated that broad-based age bands
are important to separately examine--even after episodes have been assigned a
severity-of-illness index (see Efficiency
and Quality Measurement references). The reason is that practitioners
tend to treat children and adults differently for most conditions. For example,
children are less likely than adults to receive a chest x-ray and potent antibiotics
for many medical conditions. The end result may be practitioner efficiency and
quality differences that are attributed to patient age differences--and not to
practice patterns variation.

Our approach: The Cave Consulting
Group uses broad-based age classes to reduce the heterogeneity in condition-specific
episodes of care. Then, we examine the most prevalent age groups treated by practitioners
of a given specialty type.

5. No Tracking Mechanism for Related Complication Episodes of Care

Many methodologies do not link
the charges and utilization from a patient's complication episodes to his underlying
medical condition. Complications are those episodes that are clinically related
to the primary medical condition. Consequently, many condition-specific episodes
have under-reported charges (see Efficiency
and Quality Measurement references). For example, practitioners
code up to 70% of an average diabetic patient's charges under complications related
to the diabetes (e.g., neuropathies and circulatory, eye, and renal complications)
rather than under diabetes care itself.
Therefore, without considering and including related complication episodes with
the actual diabetes episode, practitioner efficiency and quality differences may
be attributed to incomplete episode charges and utilization--and not to treatment
pattern variations. Furthermore, models that attempt to predict diabetic
patients with an unstable medical condition will produce erroneous results. The
reason is that important patient utilization data has not been analyzed by the
predictive model.

Our approach: The methodologies developed
by the Cave Consulting Group include the charges and utilization from both a
patient's underlying medical condition episode and the related complication
episodes. Therefore, we examine the patient's overall condition-specific
treatment pattern. For instance,
when examining treatment patterns for diabetic patients, our average episode charge
per diabetic patient is over $4,000. Many efficiency measurement systems track only
$1,250 in average episode charges (that is, the systems underestimate actual diabetic
treatment charges by about 70%). The difference is that many other efficiency
measurement systems do not include for diabetic patients the episodes associated
with eye complications (e.g., retinal involvement, visual disturbances), circulatory
complications (e.g., peripheral vascular disease, cardiovascular disease), renal
complications (e.g., glomerulonephritis, renal failure), and nerve complications
(e.g., neuropathies, neuritis). Instead, the systems track only charges and utilization
coded directly as diabetes treatment (e.g., ICD-9 code 250). Some systems do track
that complications may exist for a diabetic patient, but most of these systems
only have a field that states the number of patient complications.

6. Under-Reported Charges Attributed to Partial Episodes

Some methodologies
do not separate partial from complete episodes of care when measuring practitioner
efficiency and quality (see Efficiency and
Quality Measurement references). Partial episodes result when a patient
enrolls in or disenrolls from a health plan during the study
period. However, including partial episodes leads to inaccurate efficiency and
quality measurement because of under-reported episode charges--especially when
some practitioners have more partial episodes than other practitioners. The end
result may be practitioner efficiency and quality differences that are attributed
to the inclusion of partial episodes, and not to practice pattern variation.

Our approach: Our methodologies exclude from efficiency and quality measurement
all partial patient episodes of care. To achieve this objective, the Cave Consulting
Group has developed sophisticated algorithms that examine the beginning and ending
points of all acute- and chronic-condition episodes, and subsequently label each
patient episode as partial or complete.

7. Over-Reported Charges Attributed to Variable Chronic Disease Episode Endpoints

Some methodologies
do not appropriately end a patient's episode of care before measuring a practitioner's
efficiency and quality. For example, chronic conditions may continue indefinitely
and, therefore, patient episodes of care may be of various durations (e.g., 60
days or 600 days)--depending on the amount of available patient claims data (see
Efficiency and Quality Measurement
references). The end result may be practitioner efficiency and quality differences
that are attributed to excessively long or variable chronic condition episode
durations, and not to practice pattern variation.

Our approach: The Cave Consulting Group's methodologies ensure that patient
chronic disease episodes are of an appropriate and consistent duration, so that
practitioner efficiency and quality are measured accurately.

8. No Minimum Number of Episodes of Care

Some methodologies do not require a minimum number of
condition-specific episodes when comparing a practitioner's practice patterns
to a peer group. Instead, one or two episodes of care are enough. However, there
may be significant episode of care heterogeneity in one or two condition-specific
episodes--even after applying a sophisticated severity-of-illness index. Consequently,
examining one episode here and there for a practitioner may introduce significant
error into a practitioner's efficiency and quality measurement. The end result
may be practitioner efficiency and quality differences that are attributed to
the heterogeneity in the low number of episodes examined--and not to practice
patterns variation.

Our approach: The Cave Consulting Group's
methodologies require a practitioner to have a minimum number of episodes for
each medical condition. If the minimum episode number is not achieved, then a
standardized methodology is applied to measure a practitioner against a peer group
(see Efficiency and Quality Measurement
references).

Copyright © 2003 Cave Consulting Group, All Rights Reserved