CMS star ratings found lacking in some areas, especially risk, study finds

A leading quality rating for the nation’s hospitals appears not to adequately account for the risks of undergoing certain procedures at certain hospitals, particularly joint replacement surgery, according to a new study by researchers at Hospital for Special Surgery (HSS).

The study’s central finding is that the overall hospital quality star ratings program from the Centers for Medicare and Medicaid Services (CMS) is unreliable in several respects. In particular, the star system significantly understates the risk of complications among patients who undergo total joint arthroplasty (TJA) at hospitals that perform relatively few of these surgeries.

The findings showed that the star algorithm fails to fully capture the commonly observed pattern that higher surgical volume is associated with better quality outcomes.

IMPACT

CMS launched the star ratings in July 2016 as part of a broader effort to promote value-based care — higher quality outcomes at the lowest possible cost.

The system, which is due for an update this month, currently includes 57 performance measures covering seven categories. Together they capture mortality, patient safety, readmission to the hospital, effectiveness and timeliness of care and other relevant factors in hospitalization. Hospitals can receive an overall rating of between one and five stars.

The rub, according to HSS, is that in creating the system, CMS didn’t fully account for procedure volume in its algorithm.

Hospitals that perform a high volume of a particular surgery or intervention typically have better outcomes for those procedures than facilities that see fewer such patients. But in some cases the star system omits a measure for hospitals that performed fewer than 25 (but more than zero) of the relevant procedures over a three-year period.

As a result, the authors speculate that the ratings would change, perhaps significantly, if those data were incorporated into the model. Moreover, because the star system links all hospitals through relative ratings, the changes could affect other facilities in the database as well.
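The linkage between hospitals can be illustrated with a toy example. The Python sketch below is not the CMS algorithm: the hospitals, case counts, complication rates and the rank-based `relative_stars` function are all hypothetical, and the rates shown for hospitals under the 25-case threshold stand in for estimated values. It only shows that when stars are assigned relative to the whole group, adding previously excluded hospitals can move the stars of hospitals whose own data did not change.

```python
MIN_CASES = 25  # reporting threshold described in the article

# Hypothetical data: hospital -> (cases over three years, complication rate).
# Rates for hospitals below MIN_CASES stand in for estimated values.
hospitals = {
    "A": (400, 0.020),
    "B": (250, 0.025),
    "C": (120, 0.030),
    "D": (60, 0.035),
    "E": (30, 0.040),
    "F": (20, 0.045),
    "G": (15, 0.050),
    "H": (12, 0.055),
    "I": (8, 0.060),
    "J": (3, 0.065),
}


def relative_stars(rates, levels=5):
    """Assign 1..levels stars by rank within the group.

    A lower complication rate earns more stars. This is a toy
    stand-in for relative rating, not the CMS method.
    """
    ordered = sorted(rates, key=rates.get)  # best (lowest rate) first
    n = len(ordered)
    return {name: levels - (i * levels) // n for i, name in enumerate(ordered)}


# Ratings computed only from hospitals that meet the threshold.
reported = {h: rate for h, (cases, rate) in hospitals.items() if cases >= MIN_CASES}
stars_before = relative_stars(reported)

# Ratings recomputed once the low-volume estimates are included.
everyone = {h: rate for h, (cases, rate) in hospitals.items()}
stars_after = relative_stars(everyone)

# Hospitals that were already rated but whose stars moved anyway.
changed = sorted(h for h in reported if stars_before[h] != stars_after[h])
print(changed)
```

In this made-up data set every shift happens to be upward, which says nothing about the direction of change in the study. The point is only that a hospital's rating depends on who else is in the comparison group.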

WHAT ELSE YOU SHOULD KNOW

The HSS team assessed four measures for which high-volume hospitals tend to perform better than low-volume hospitals: complications and readmissions for TJA, and mortality and readmissions for cardiac surgery. They used three methods to estimate the values missing for low-volume facilities in the public CMS database.

For three of the four measures, including the estimates had no effect on the overall ratings — suggesting that the star ratings do not reflect the volume-outcome relationship for these measures.

For the fourth measure, complications after TJA, nearly 40 percent of hospitals saw their score change once the estimates of the low-volume data were added to the model. Of those, roughly a third gained a star or more while the remaining two-thirds lost a star or more.

Although the exact percentages differed depending on the method the researchers used to estimate the missing values, the overall trend was the same for each of the three approaches.

The researchers also showed that the underlying safety domain model is not stable, because it heavily weights a single quality measure. Slight changes to the underlying data, such as the values for complications after TJA, can force the model to “flip” and heavily weight a different quality measure. This can dramatically change a hospital’s star rating.
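This kind of flip can be reproduced in miniature. The sketch below does not implement the CMS safety domain model. As a stand-in, it takes the leading principal component of two hypothetical measures and treats its squared loadings as weights. The function name `dominant_weights` and all of the numbers are assumptions made for illustration.

```python
import math


def dominant_weights(var_a, var_b, cov):
    """Weights on two measures from the leading principal component.

    The inputs describe the 2x2 covariance matrix
    [[var_a, cov], [cov, var_b]]. The returned pair holds the squared
    loadings of its leading eigenvector, which sum to 1.
    """
    # Closed-form orientation of the principal axis of a 2x2 symmetric matrix.
    theta = 0.5 * math.atan2(2 * cov, var_a - var_b)
    return math.cos(theta) ** 2, math.sin(theta) ** 2


# Two hypothetical data sets that differ only slightly in their variances.
before = dominant_weights(1.00, 0.98, 0.01)
after = dominant_weights(0.98, 1.00, 0.01)

print(before)  # most of the weight falls on the first measure
print(after)   # most of the weight falls on the second measure
```

A shift of 0.02 in each variance is enough to move most of the weight from one measure to the other in this toy setup, which is the sort of sensitivity the researchers describe.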