Measuring hospital quality and performance
The Quarterly 2012

This article was written by Dr Susan Keam, derived from material presented by Professor Russell Mannion on Wednesday the 12th of October 2011 at The Great Healthcare Challenge: Achieving Patient Centered Outcomes.

Hospital performance measurement is not a new activity – it is a global phenomenon. Performance data have been published since the mid-1980s in the US and since 1999 in the UK. In the last 10 years, many other countries have introduced various forms of hospital performance measurement.

The UK NHS is the largest organisation in Europe, and the fourth largest organisation in the world. It employs 1.3 million people and has an annual budget of £100 billion. So improving NHS performance not only improves patient care and outcomes, but is also of major macroeconomic significance, particularly in the current economic climate.

The history of hospital performance measurement goes back a long way in the UK. In the 1860s, Florence Nightingale, the 'passionate statistician', pioneered the systematic collection, analysis and dissemination of comparative hospital outcomes data (risk-adjusted outcome measures for all of London's hospitals) in order to understand and improve performance.

Florence Nightingale also anticipated a lot of the problems associated with performance measurement, including case mix, manipulation and gaming, all dysfunctional consequences that were also found in the NHS in our research. Despite her efforts, not all doctors agreed with the idea of measuring hospital performance and so after a couple of years, the British Medical Association decided to stop collecting these data, and it would be another 140 years before the UK again had national performance measures.

Lesson 1 from the NHS experience: doctors may not like the introduction of performance measurement systems, so introducing them successfully takes effort to get doctors on board and engaged.

How do you measure and improve performance in healthcare?

The simplest approach is first to measure some aspects of performance, then to analyse the data taking the context into account, and finally to act on the results, which produces a change in the health system (figure 1).

Figure 1. A simple model of the performance measurement system

Stage 1: Measurement

Before starting to measure performance, the purpose of the measurement system needs to be agreed and the stakeholders identified, because these factors determine what gets measured and the way the system is designed.

Purposes may include:

  • Ensuring accountability to a range of stakeholders (public, patients, regulatory agencies)
  • Facilitating market mechanisms (patients, purchasers). Informing patients lets them make informed decisions when choosing their healthcare provider (facilitating patient choice and market forces)
  • Central government control
  • Performance improvement (internal [versus past] and comparative)
  • Epidemiological and public health data

Then, the dimensions (output or process measures) that should be measured need to be identified.

Commonly used dimensions include:

  • Health outcomes
  • Cost effectiveness
  • Quality
  • Safety
  • Finance
  • Continuity
  • Patient satisfaction
  • Equity
  • Responsiveness
  • Democracy

The selection and weighting of measures is a subjective exercise, not just a matter of resolving technical issues around reliability, validity, credibility and other psychometric properties. Getting support for the process at ground level requires democratic input from the stakeholders who will be using the performance measurement system. It's also important to remember that underpinning all of this are the political issues of which dimensions you choose, how they are weighted, how they are prioritised, and who selects them.

Lesson 2 from the NHS experience: Involve stakeholders, but remember measurement is a subjective exercise.
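The point about weighting can be made concrete with a small sketch. It is not taken from the presentation: the hospital names, the scores and the two sets of weights are all invented for illustration, and a weighted average is only one of many possible ways to combine dimensions. It shows how the same underlying scores can yield opposite rankings depending on who chooses the weights.

```python
# Hypothetical, already-normalised scores (0 = worst, 1 = best) on three
# of the dimensions listed above. All figures are invented for illustration.
scores = {
    "Hospital A": {"health_outcomes": 0.9, "patient_satisfaction": 0.5, "finance": 0.4},
    "Hospital B": {"health_outcomes": 0.6, "patient_satisfaction": 0.8, "finance": 0.9},
}


def composite(hospital_scores, weights):
    """Weighted average of dimension scores; weights need not sum to 1."""
    total_weight = sum(weights.values())
    return sum(hospital_scores[d] * w for d, w in weights.items()) / total_weight


def rank(all_scores, weights):
    """Hospital names ordered from best to worst composite score."""
    return sorted(all_scores, key=lambda h: composite(all_scores[h], weights), reverse=True)


# Two hypothetical stakeholder groups that prioritise different dimensions.
clinician_weights = {"health_outcomes": 0.7, "patient_satisfaction": 0.2, "finance": 0.1}
manager_weights = {"health_outcomes": 0.2, "patient_satisfaction": 0.2, "finance": 0.6}

print(rank(scores, clinician_weights))  # Hospital A ranks first
print(rank(scores, manager_weights))    # Hospital B ranks first
```

Neither ranking is more "correct" than the other; the difference comes entirely from the choice of weights, which is why that choice needs stakeholder agreement.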

The next big issue to confront is to decide whether to use outcome measures or process measures to determine performance. In the UK, focus was initially on process measures, but under the current coalition government, the emphasis has shifted to outcome measures. Nevertheless, each has its own advantages and disadvantages (tables I and II).

Table I. Comparison of the Relative Advantages and Disadvantages of Outcome Indicators

Advantages

  1. Focus: directs attention towards the patient (rather than the service) and helps nurture 'whole system' collaboration.
  2. Goals: more clearly represent the goal of the NHS.
  3. Meaningful: tend to be more meaningful to some of the potential users of clinical indicators (patient, purchasers).
  4. Innovation: providers are encouraged to experiment with new modes of delivery.
  5. Far sighted: encourages providers to adopt long term strategies such as health promotion which may realise long term benefits.
  6. Manipulation: are less likely to be manipulated than process indicators – although providers can influence risk adjusted outcome by exaggerating the severity of patients (upstaging).

Disadvantages

  1. Attribution: may be influenced by many factors that are outside the control of a health care organisation.
  2. Sample size: require large sample sizes to detect a statistically significant difference.
  3. Timing: may take a long period of time to observe.
  4. Interpretation: may be difficult to interpret if the process that produced the outcome occurred far in the past.

Table II. Comparison of the Relative Advantages and Disadvantages of Process Measures

Advantages

  1. Readily measured: utilization of health technologies is relatively easily measured without significant bias or error.
  2. Easily interpreted: utilization rates of different technologies can readily be interpreted by reference to the evidence base rather than inter-unit comparisons.
  3. Sample size: compared to outcome indicators, process indicators can identify significant deficiencies with much smaller sample sizes.
  4. Unobtrusive: can frequently be assessed unobtrusively (e.g. data stored in administrative or medical records).
  5. Indicators for action: failures identified in the process of care provide clear guidance on what must be remedied to improve health care quality. They are also more quickly acted upon than some outcome indicators which only become available after a long time has elapsed (when it becomes too late to act on the data).
  6. Coverage: can capture aspects of care (such as speed of access and patient experience) that are often valued by patients apart from health outcomes.

Disadvantages

  1. Salience: may have little meaning to patients unless the link to outcomes can be explained.
  2. Specificity: they are often quite specific to a single disease or single type of medical care so that process measures across several clinical areas or aspects of service delivery may be required to represent quality for a particular group of patients.
  3. Ossification: may stifle innovation and the development of new modes of care.
  4. Obsolete: usefulness may dissipate as technology and modes of care change.
  5. Adverse behaviour: process indicators are easily manipulated and may give rise to gaming and other adverse behaviour.

So, going back to the simple model, at the first stage we have to decide the purpose of the system, identify the stakeholders we are serving, and then decide what dimensions of performance measurement we want to include in the system and what sort of indicators (outcome or process or a mixture) are wanted. In practice, most good performance measurement systems include a judicious mix of outcome and process indicators, with the mix tailored to the circumstances.
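The sample-size contrast between outcome and process indicators noted in the tables can be illustrated with a rough calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions with equal group sizes; the rates (4% versus 5% mortality, 70% versus 80% adherence to a care process), the 5% significance level and the 80% power are assumptions chosen purely for illustration, not figures from the NHS.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients needed per group to detect p1 vs p2.

    Two-sided test of two proportions, equal group sizes,
    normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical outcome indicator: mortality of 4% versus 5%.
outcome_n = patients_per_group(0.04, 0.05)

# Hypothetical process indicator: 70% versus 80% adherence to a care process.
process_n = patients_per_group(0.70, 0.80)

print(outcome_n, process_n)
```

Under these assumptions the outcome comparison needs several thousand patients per group while the process comparison needs a few hundred, because rare outcomes with small absolute differences are hard to distinguish from chance.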

Stage 2: Analysis

In this stage of the performance management model we need to select the performance criterion to use and the external factors to adjust for, in order to achieve a more accurate assessment of performance. The three most common criteria used when assessing performance are: comparison with past performance within the same organisation; comparison with an arbitrarily defined target (e.g. the nationally defined targets commonly used in the UK); and comparison with similar organisations (league table). Incentives to provide information vary according to the criterion used. Different criteria have been used within the NHS at different times, with different effects and with different incentives attached to them.
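As a minimal sketch of how the choice of criterion changes the verdict, the function below judges a single indicator value against each of the three criteria. The function name `assess`, the indicator (a waiting time in weeks) and all the numbers are hypothetical.

```python
from statistics import mean


def assess(current, previous, target, peers, lower_is_better=True):
    """Judge one indicator value against the three common criteria."""
    if lower_is_better:
        at_least_as_good = lambda a, b: a <= b
        strictly_better = lambda a, b: a < b
    else:
        at_least_as_good = lambda a, b: a >= b
        strictly_better = lambda a, b: a > b
    return {
        "improved_on_past": strictly_better(current, previous),
        "met_target": at_least_as_good(current, target),
        "better_than_peer_average": strictly_better(current, mean(peers)),
    }


# Hypothetical waiting times in weeks (lower is better).
result = assess(current=16.0, previous=19.0, target=18.0, peers=[12.0, 14.0, 15.0])
print(result)
```

Here the same hospital has improved on its own past and met the target, yet still sits below the average of its peers, so which criterion is attached to an incentive matters.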

External factors that must be adjusted for include:

  • Variations in case mix (e.g. when considering surgical morbidity, adjustment is needed for severity of cases operated on by each surgeon); however, it's important to note that the risk adjustment models used are not perfect. Results can vary between models, and they also don't take into account the effects of comorbidities.
  • Variations in the external environment (e.g. under the old UK hospital star rating system, where a higher number of stars indicated a better hospital, hospitals were penalised if they had a high readmission rate. This was based on the assumption that readmission rates were due to poor discharge practices; however, other community-based factors may have contributed to readmissions. Hospitals were being penalised by the NHS for factors that they had no control over).
  • Chance variability (random variation).
  • Coding differences/errors (wrong inferences can result from incomplete or inaccurate information).
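The case-mix adjustment in the first point can be sketched with indirect standardisation, which compares observed deaths with the number expected given the risk profile of the patients treated. This is one common technique, not necessarily the model used in the NHS; the risk groups, reference rates and surgeon data are invented for illustration.

```python
# Hypothetical reference death rates by risk group (e.g. pooled across
# all surgeons). All numbers here are invented for illustration.
reference_rate = {"low": 0.01, "medium": 0.05, "high": 0.20}


def expected_deaths(case_mix):
    """Deaths expected if each risk group died at the reference rate."""
    return sum(n * reference_rate[group] for group, n in case_mix.items())


def standardised_ratio(observed_deaths, case_mix):
    """Observed / expected deaths; above 1 means worse than expected."""
    return observed_deaths / expected_deaths(case_mix)


# Surgeon A operates on many high-risk patients, surgeon B on few.
surgeon_a = {"observed": 20, "case_mix": {"low": 100, "medium": 100, "high": 100}}
surgeon_b = {"observed": 10, "case_mix": {"low": 250, "medium": 40, "high": 10}}

for name, s in (("A", surgeon_a), ("B", surgeon_b)):
    crude = s["observed"] / sum(s["case_mix"].values())
    ratio = standardised_ratio(s["observed"], s["case_mix"])
    print(name, round(crude, 3), round(ratio, 2))
```

On crude mortality surgeon A looks twice as bad as surgeon B, but after adjustment A has fewer deaths than expected and B has more. The result depends entirely on the reference rates and risk grouping, which is why different adjustment models can give different answers.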

As a consequence of these reasons and more, measured performance doesn't fully equate with actual performance because we can't fully control for all these background factors. Consequently, we need some sort of softer measure in addition to the quantitative measures before making any judgement about performance. An example would be the soft intelligence regarding a doctor's clinical competence and performance which may circulate around professional circles and informal networks.

These background factors can give rise to Type I and Type II interpretation errors. Type I errors occur when organisations or individuals with average or good performance are assessed as underperforming because of these background factors, while Type II errors occur when an underperforming organisation or individual is assessed as adequate for the same reasons. Traditionally the NHS has focused on avoiding Type I errors; however, since the Bristol problems and other more recent health scandals, the focus has switched to avoiding Type II errors, that is, to correctly detecting underperforming organisations and staff. In other words, there has been a shift from trusting clinicians to deliver high quality care to an overemphasis on checking what they do, with all the attendant consequences, such as transaction costs.
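The trade-off between the two error types can be sketched for the simplest background factor, chance variability. The example assumes a unit is flagged as underperforming when its death count reaches a threshold, and that deaths follow a binomial distribution; the patient numbers, the "average" and "poor" true death rates and both thresholds are hypothetical.

```python
from math import comb


def binom_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def error_rates(n_patients, flag_at, average_rate, poor_rate):
    """Return (Type I, Type II) error probabilities for a flagging rule.

    Type I: a unit with the average true rate is flagged by chance.
    Type II: a unit with the poor true rate escapes being flagged.
    """
    type_1 = binom_tail(n_patients, average_rate, flag_at)
    type_2 = 1 - binom_tail(n_patients, poor_rate, flag_at)
    return type_1, type_2


# Hypothetical setting: 200 patients, average true death rate 5%, poor 8%.
low_threshold = error_rates(200, flag_at=14, average_rate=0.05, poor_rate=0.08)
high_threshold = error_rates(200, flag_at=18, average_rate=0.05, poor_rate=0.08)

print(low_threshold)
print(high_threshold)
```

Raising the threshold protects average units from being wrongly flagged (fewer Type I errors) but lets more poor units pass as adequate (more Type II errors), so a system cannot minimise both at once without more data.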

Stage 3: Action

There's no point in collecting and processing all this data and analysing it for the background factors if we don't act on it and use it to improve the system in some way. There are two dimensions to consider when implementing changes:

  • dissemination of information and the incentives attached to implementation and
  • the types of information needed by different stakeholder groups who are going to use the data, including consumers, clinicians and managers.

Incentive: “A reward (or sanction) associated with a particular aspect of performance”

The results of the performance management assessment and analysis need to be disseminated to all the people likely to use them (clinicians, managers, policy makers, GPs etc.), and we also need to consider the types of reward or sanction (e.g. performance management) that must be established for successful implementation of any required change.

Types of rewards or sanctions include:

  • Personal financial rewards (in recent years there has been a shift towards external rewards over intrinsic motivation)
  • Intrinsic rewards (a job well done)
  • Peer reputation
  • Career advancement
  • Additional budget for service development
  • Time to pursue other activities (e.g. research)
  • Reduced level of inspection for high performers
  • A probationary period
  • Requirement to engage in professional development or re-training
  • Loss of livelihood
  • Dismissal

The US experience of making performance data available

The evaluation of the impact of performance data varies according to the needs of key stakeholders, including consumers, physicians and providers.

US consumers may want access to comparative data, but when it is provided they rarely search for the information, and when they find it, they don't understand it, don't trust it and don't use it (experience in the UK with NHS data is similar). Instead, people tend to use information from family and friends (soft, informal intelligence networks) rather than hard, quantitative data when making choices relating to performance and quality. A lesson from this is that we may need to infiltrate these informal networks to communicate our findings and data, rather than relying on traditional methods.

Very few physicians use data to influence referral behaviour; most use informal professional networks rather than hard quantitative data and only a small proportion of physicians share the information with patients. Similar results were seen in surveys of doctors in the NHS. In contrast, provider organisations make extensive use of hard performance data. They are highly sensitive to publication of data, and put a lot of effort into monitoring physician performance, benchmarking and marketing their scores. Also, hospitals operating in competitive markets are more likely to respond.

Dysfunctional consequences of performance incentives

Studies in the UK and the US have shown that while there are many benefits from effective incentives, there can also be dysfunctional consequences (the enemies of virtuous performance measurement). These include:

  • Tunnel vision – because things are measured, they get done (concentration on areas included in the clinical measurement to the exclusion of other important unmeasured areas; e.g. perinatal mortality indicators may have distorted clinical priorities and exacerbated a bias towards hospital-based care at the expense of community ante-natal services; waiting list indicators distorting clinical priorities). The way around this is to build a more balanced scorecard.
  • Measure fixation (the pursuit of success as measured rather than intended; e.g. in the NHS the five-minute waiting time criterion in A&E led to the employment of 'hello' nurses who merely made contact with the patient in order to meet the target. It hit the measure, but not the spirit of the measure).
  • Suboptimisation (the pursuit by managers of their own narrow objectives, at the expense of strategic coordination; e.g. targets met for the acute sector, such as higher rates of day care emergency and shorter lengths of stay, do not acknowledge the increased burden implied for community services).
  • Myopia (concentration on short-term issues to the exclusion of long-term considerations that may only show up in performance measures in many years' time; e.g. curative services [as measured by short-term processes] given higher priority than preventive services [as measured by long-term outcome]).
  • Complacency (lack of ambition for improvement brought about by adequate comparative performance; e.g. managers reported that they could improve performance but were happy to be in the middle range of league table rankings and not attract attention).
  • Misrepresentation (the deliberate manipulation of data by staff, ranging from 'creative' accounting to fraud, so that reported behaviour differs from actual behaviour; e.g. upstaging [making a situation look worse than it actually is], fiddling waiting list figures).
  • Misinterpretation (incorrect inferences about performance brought about by the difficulty of accounting for the full range of potential influences on a performance measurement; e.g. failure to take fully into account case mix in interpreting mortality 'league tables'; Type I and type II errors).
  • Bullying and intimidation (excessive management pressure to meet performance targets; e.g. threats, bullying and intimidation reported by staff to meet NHS 'star' rating targets).
  • Ossification (organisational paralysis due to an excessively rigid regime of measurement; e.g. using day-case rates as an indicator of performance in gynaecology may inhibit the adoption of latest techniques for treating cases on an outpatient basis).
  • Ghettoisation (polarities in provision because of the differing ability of 'high' and 'low' performing organisations to attract and retain staff, a particular problem for zero starred hospitals).
  • Erosion of public trust (decline in public trust brought about by publicising poor performance ['Naming and shaming', adverse media reporting]; e.g. zero star hospitals, asymmetry of trust). Public trust is built slowly, but can be eroded quickly.
  • Lower staff morale (decreased staff satisfaction and commitment caused by poor reported performance; e.g. healthcare is a people-based service and poor morale impacts on the quality of patient care).


Effective measurement of hospital quality and performance requires the following considerations:

  • Address design issues at the measurement, analysis and action stages.
  • Pilot performance management systems carefully before national roll-out.
  • Recognise that selecting dimensions to measure is not merely a technical exercise, but has a subjective component around who selects them, how they are weighted and how they are prioritised.
  • Involve stakeholders.
  • Anticipate dysfunctional consequences and put in place strategies to avoid or mitigate them.

Professor Russell Mannion
University of Birmingham, UK.

