Development and Testing of a Short Form of the Patient Activation Measure

Objective. The Patient Activation Measure (PAM) is a 22-item measure that assesses patient knowledge, skill, and confidence for self-management. The measure was developed using Rasch analyses and is an interval level, unidimensional, Guttman-like measure. The current analysis is aimed at reducing the number of items in the measure while maintaining adequate precision. Study Methods. We relied on an iterative use of Rasch analysis to identify items that could be eliminated without loss of significant precision and reliability. With each item deletion, the item scale locations were recalibrated and the person reliability evaluated to check if and how much of a decline in precision of measurement resulted from the deletion of the item. Data Sources. The data used in the analysis were the same data used in the development of the original 22-item measure. These data were collected in 2003 via a telephone survey of 1,515 randomly selected adults. Principal Findings. The analysis yielded a 13-item measure that has psychometric properties similar to the original 22-item version. The scores for the 13-item measure range in value from 38.6 to 53.0 (on a theoretical 0–100 point scale). The range of values is essentially unchanged from the original 22-item version. Subgroup analysis suggests that there is a slight loss of precision with some subgroups. Conclusions. The results of the analysis indicate that the shortened 13-item version is both reliable and valid.

the short form (PAM-13) is described and the psychometric properties of the short-form PAM are compared with those of the original 22-item PAM. Finally, the potential clinical and research applications of the short form measure are discussed.

BACKGROUND
Patients make many choices in their day-to-day lives that have major implications for their health and their need for care. Chronic disease patients often must follow complex treatment regimens, monitor their conditions, make lifestyle changes, and make decisions about when they need to seek professional care and when they can handle a problem on their own. Effectively functioning in the role of self-manager, particularly when living with one or more chronic illnesses, requires a high level of knowledge, skill, and confidence.
Imagine clinicians trying to treat a patient while completely blinded to the patient's record and list of clinical symptoms. Yet, when clinicians encourage patient engagement in their care, they do so blinded to any information on the patient's capabilities for taking on a self-management role. What often results is a ''one size fits all'' patient education approach. If, however, clinicians had information on their patients' level of knowledge and skill to self-manage, they could target self-care education and support to individual patient needs and presumably be more effective in supporting patients' self-management.
Suggesting that a patient lose 20 pounds, start going to the gym, and take hypertension medication regularly is unlikely to result in the desired outcome if that patient has little understanding that they even have a chronic illness, of the nature of that illness, or that they must play a part in managing it. However, by starting with appropriate goals that fit the patient's level of activation, and working toward increasing activation step by step, patients can experience small successes and steadily build up the confidence and skill for effective self-management (Bandura 1991; Battersby et al. 2003).
Supporting patients in their role as self-managers is an essential element of high quality chronic illness care. As with other dimensions of quality, the ability to measure is a prerequisite to improvement. The recent Institute of Medicine Summit on Crossing the Quality Chasm suggested new directions in quality measurement, which are consistent with the use of the PAM for this purpose: First, measurement should focus on the patient, including patient experience and patient outcomes. This could include intermediate patient outcomes such as knowledge and skills for self-management. This approach acknowledges that the patient should be at the center of measurement and care processes, and emphasizes that patients are essential players in their own health outcomes.
Second, measurement should be integrated into the care delivery process and improve the care of the patient being measured. That is, by measuring intermediate patient outcomes, there is the opportunity to improve care for that patient, as well as assess quality across groups of patients.
Finally, measurement should be longitudinal and capture what happens to patients over time. It may be necessary to measure at more than one point in time to understand how care is affecting patients' experiences, their capabilities for self-management, and their quality of life, health, and ability to function. (Institute of Medicine 2004).
A measure of patient activation could also be used to manage whole patient populations. For example, delivery systems can stratify their enrolled patient populations, not only by health risk level (level of resource consumption), as is often done, but also by activation level, allowing for early intervention with patients who lack the skills to self-manage before they inevitably move to a higher health risk group.
A shorter version of the PAM would greatly enhance the feasibility of measuring activation in a clinical setting and would make survey administration much less burdensome and costly. To this end creation of the short form PAM was undertaken.

METHODS
The optimum set of items in a measure has four characteristics: each item contributes substantive content central to the construct being measured; the items are well spaced along the measurement scale from easy to difficult; each item's location on the measurement scale is precisely estimated (a small standard error of measurement); and each item contributes sufficiently unique (nonredundant) information about the amount of the construct to justify the response burden created by its inclusion. These criteria guided the item reduction process.
A telephone survey of 1,515 randomly selected adults in the U.S., aged 45 years and older, was carried out in 2003 (see Note 1). Respondents were selected via random digit dialing and a screening question to determine age eligibility. No other eligibility requirement was used. A 48 percent response rate was achieved with a protocol of a minimum of 12 call backs. Respondents ranged in age from 45 to 97, with 66 percent of the sample under the age of 65. Half the sample had a high school education or less and 32 percent had a household income of under $25,000. Seventy-nine percent of the sample reported at least one chronic disease. These numbers compare well with the 2000 U.S. Census, in which 53 percent of those aged 45 years and older had a high school education or less and 64 percent of those older than 45 were under the age of 65.
These data were used in the development of the original 22-item measure and again here to develop and test the short form PAM (see Hibbard et al. [2004] for details of the survey). To identify potential items for deletion, both statistical and conceptual approaches were used. We relied on an iterative use of Rasch analysis to identify items that could be eliminated without loss of precision and reliability. Within each of the four stages of patient activation, items were identified that could be eliminated while still maintaining the strong psychometric properties of the original measure. Candidate items were deleted one at a time. With each item deletion, the item scale locations were recalibrated and the person reliability evaluated to check whether, and by how much, the precision of measurement declined as a result of the deletion. When more than one item was a good candidate for deletion, all candidate items in that scale range or content domain were tested, and the deletion resulting in the smallest decrease in precision of person measurement was retained. Item fit values between 0.5 and 1.5 are considered adequate (Smith 1996), and this range was used as a benchmark in decisions about item deletions. Item reduction was considered complete when further deletions resulted in unacceptably low levels of reliability and precision. We had no predetermined number of items for the shorter version of the PAM. Once the item reduction was achieved, we evaluated the performance of the reduced PAM within various subgroups and compared the results with those of the 22-item PAM. Construct validity of the PAM 13 was assessed and compared with that of the PAM 22.
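The one-at-a-time deletion loop described above can be sketched in code. This is only an illustration under stated assumptions, not the authors' procedure: it swaps in Cronbach's alpha as a simple stand-in for Rasch person reliability, it does not recalibrate item locations or check item fit, and it ignores the conceptual (content) judgments the authors also applied. The function names `cronbach_alpha` and `reduce_items` and the `min_reliability` floor are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(rows, items):
    """Cronbach's alpha for the chosen item columns.

    Assumption: used here as a stand-in for Rasch person reliability.
    """
    k = len(items)
    item_var = sum(pvariance([row[i] for row in rows]) for i in items)
    total_var = pvariance([sum(row[i] for i in items) for row in rows])
    return (k / (k - 1)) * (1 - item_var / total_var)


def reduce_items(rows, min_reliability):
    """Delete one item at a time, keeping the deletion that costs the least reliability."""
    kept = list(range(len(rows[0])))
    while len(kept) > 2:
        # Score the shorter scale that results from dropping each remaining item.
        trials = [
            (cronbach_alpha(rows, [i for i in kept if i != drop]), drop)
            for drop in kept
        ]
        best_alpha, best_drop = max(trials)
        if best_alpha < min_reliability:
            break  # every further deletion would fall below the floor
        kept.remove(best_drop)
    return kept
```

Here `rows` is a list of respondents, each a list of numeric item responses. The loop stops when no single deletion keeps reliability at or above the floor, which mirrors the stopping rule in the text; what counts as ''unacceptably low'' is left to the caller.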

FINDINGS
The item reduction analysis resulted in a 13-item measure that has psychometric properties similar to the original 22-item version. Figure 1 shows the item scale calibrations of the PAM 13. The 13 items have a calibrated scale range from 38.6 to 53.0 (on a theoretical 0-100 point scale), compared with 38.3-54.5 for the 22 items. Table 1 shows the item infit and outfit statistics for the two versions. All of the infit and outfit statistics for the 13-item version of the PAM fall well within the 0.5-1.5 acceptable range and are essentially the same as in the 22-item version. Table 2 shows the person reliability statistics for subgroups in the population when using the 13- and 22-item PAM. The 13-item version has slightly lower reliability for some subgroups: those with no chronic illness, those 85 years or older, those with self-rated poor health, and those with lower income and education. Thus, there is some loss of precision with the shorter version of the PAM; however, these lower reliabilities still fall within an acceptable range. Table 2 also shows the mean PAM-13 scores for different subgroups in the population. As with the PAM 22, we see that those who are female, are younger, have more education, and have better self-reported health have significantly higher PAM scores (p < .001). Race is also significantly associated with PAM score (p < .05).
Table note. Meas: The calibrated scale value of the item. This represents how much activation is required to endorse the item. SEM: The standard error of measurement in estimation of the item difficulty. SEM is the precision of the item difficulty estimation and is shown in 0-100 units. Infit: Infit mean square error is one of two quality control fit statistics assessing item dimensionality (the degree to which the item falls on the same single, real number line as the rest of the items). Infit is an information-weighted residual of observed responses from model expected responses and is most sensitive to item fit when the item is located near the person's scale location. Outfit: Outfit mean square error fit statistic is most sensitive to item dimensionality when the item scale location is distant from the person's scale location. PAM, Patient Activation Measure.
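For readers unfamiliar with the infit and outfit statistics defined in the table note, the conventional Rasch mean-square formulas can be written as follows. The notation is an assumption introduced here for illustration and does not appear in the original: x_{ni} is the observed response of person n to item i, E_{ni} is the model-expected response, W_{ni} is the model variance of the response, and N is the number of persons.

```latex
\[
z_{ni} = \frac{x_{ni} - E_{ni}}{\sqrt{W_{ni}}}, \qquad
\mathrm{Outfit}_i = \frac{1}{N}\sum_{n=1}^{N} z_{ni}^{2}, \qquad
\mathrm{Infit}_i = \frac{\sum_{n=1}^{N} W_{ni}\, z_{ni}^{2}}{\sum_{n=1}^{N} W_{ni}}
                 = \frac{\sum_{n=1}^{N} \left(x_{ni} - E_{ni}\right)^{2}}{\sum_{n=1}^{N} W_{ni}}
\]
```

Both statistics have an expected value of about 1 when the data fit the model, which is why a band such as 0.5-1.5 is read as acceptable. Because W_{ni} is largest when the person's location is close to the item's, the weighting makes infit most responsive to well-targeted responses, while the unweighted outfit is dominated by unexpected responses from persons far from the item.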
When the 13-item PAM score is regressed on the 22-item PAM score it accounts for 92 percent of the variation in the 22-item version estimated activation. This verifies what would be expected from the comparative reliability estimates; minimal information was lost in the item reduction process.
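The ''percent of variation accounted for'' in a regression with a single predictor is the squared Pearson correlation between the two scores. A minimal sketch of that calculation is given below; the function name `r_squared` is hypothetical and the inputs stand for paired 13-item and 22-item scores, none of which are reproduced from the study data.

```python
def r_squared(x, y):
    """Share of variance in y explained by a simple linear regression on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squares around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)
```

With one predictor the result is the same whichever score is treated as the outcome, so a value of 0.92 would correspond to the 92 percent reported above.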
Finally, to assess the construct validity of the 13-item PAM, variables that have been conceptually and empirically linked with the 22-item PAM are examined for their relationship to activation as it is measured in the 13-item PAM. Table 3 shows that the preventive behaviors, the disease-specific self-management behaviors, and the consumeristic behaviors are all strongly linked with activation scores using the 13-item PAM and that there is little difference in these relationships regardless of whether the short or the long form of PAM is used.

DISCUSSION
The results indicate that the shortened 13-item version is both reliable and valid. The shorter version of the PAM will make it more feasible to use activation scores to inform patient care plans. However, for PAM users who are seeking the highest level of measurement precision, the PAM 22 may be more desirable.
PAM scores can provide insight into possible strategies for supporting activation among patients at different points along the continuum. Patients who score at the bottom of the measure may still believe that the doctor will ''fix'' them. Patients whose scores are somewhat higher, but are still in the bottom half, may understand that they must be involved in their care but still lack the basic knowledge about their conditions and treatments that is necessary for them to act effectively. Thus, patients scoring in the bottom half of the measure likely need to work on self-awareness of their role in the care process and on gaining basic knowledge about their conditions. Patients whose scores are in the upper half are beginning to gain confidence in their ability to take on self-management behaviors and make lifestyle changes. At this stage, experiencing a series of small successes will likely build a sense of self-efficacy and increase activation (Battersby et al. 2003). Patients scoring near the upper range of the measure are likely to have made changes in their lifestyles but may still have difficulty maintaining them when new situations arise or when they are under stress. Thus, for those patients scoring in the upper half of the PAM, working on developing a sense of self-efficacy for taking on and maintaining behaviors is paramount. Attaining the basic knowledge and beliefs reflected in the early stages of activation is likely necessary for building a sense of efficacy for the self-management tasks involved in the later stages. We hypothesize that patients need to pass sequentially through each of these stages on the way to becoming effective self-managers. These stages have some similarities with the stages of change in the Transtheoretical Model (Prochaska and DiClemente 1983; Prochaska, Redding, and Evers 1997), which includes precontemplation, contemplation, preparation, action, and maintenance stages.
The Transtheoretical Model emphasizes motivation and readiness and does not explicitly deal with issues of skill and knowledge acquisition. Further, the Transtheoretical Model focuses on one behavior at a time and requires the development of a measurement tool specific to that behavior. The idea of tailoring interventions to the patient's stage is similar for both models.
While the PAM has strong psychometric properties, research is still needed to make it fully ready for use in different settings and with different populations. PAM users are beginning to translate the measure into other languages. The degree to which the measure is valid and reliable in these different language translations and among different cultures is unknown and deserves investigation. While early evidence indicates the measure is valid and reliable for different chronic illnesses, this too requires further study. Replication studies of the measure with different populations in different settings are underway and will add to our understanding of these questions.
Research that tests which interventions are effective in encouraging and supporting patient advancement through the stages is a high priority. It is very likely that a strategy that will help a patient move from stage one (believing the patient has an active role) to stage two (having the confidence and knowledge to take action) is different from what will help her move into stage three (taking action). That is, once a patient score or stage is known, what interventions are efficacious in increasing that patient's activation?
In addition to using the PAM score to inform interactions with patients, an alternative approach has been tried in pilot efforts. Because the items can be ordered by difficulty, it is possible to visually scan patient responses and observe when their answers begin to move away from ''strongly agree.'' Clinicians can use this as an opportunity to begin a conversation with the patient about the item where responses changed. For example, ''I see you are less sure about your medications, let's talk about that.'' Using the PAM in this way can sharpen the specificity of the interaction with the patient, increasing the probability that individual barriers and issues can be identified and dealt with. Using both the visual scan and the PAM score or stage may be the most effective use of the measure. If this were the case, it might be advantageous to use the full 22-item PAM to allow for more opportunities to identify problems specific to an individual patient. Just using the ''visual scan'' approach is the easiest way to use the measure in a clinical encounter, particularly when electronic data collection is not an option, because there is no scoring involved and no data entry. Testing the efficacy of this ''low tech'' approach is also a priority.
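Where responses are captured electronically, the ''visual scan'' could be approximated with a few lines of code. The sketch below assumes the responses are already listed from the easiest to the most difficult item; the function name `first_departure`, the item labels, and the response wording are hypothetical placeholders, not the actual PAM items or response categories.

```python
def first_departure(responses, top="strongly agree"):
    """Return the first item, in difficulty order, not answered with `top`.

    `responses` is a list of (item_label, answer) pairs ordered from the
    easiest to the most difficult item. Returns None if every answer is `top`.
    """
    for item, answer in responses:
        if answer.strip().lower() != top:
            return item
    return None


# Usage with placeholder labels (not the real PAM item wording).
answers = [
    ("item 1", "Strongly agree"),
    ("item 2", "Strongly agree"),
    ("item 3", "Agree"),
    ("item 4", "Disagree"),
]
print(first_departure(answers))
```

The item returned marks where a clinician might open the conversation. It is a prompt for discussion only and does not replace the calibrated PAM score or stage.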
Among the interventions that do increase activation, what effect do they have on patient health outcomes and costs?
Research on the use of the PAM for managing enrolled patient populations is also needed. Would early intervention with patients identified through screening as having both clinical risk factors and low skills (low PAM scores) reduce costs and improve health outcomes?
Using the PAM as a basis for designing care plans and for assessing individual and patient population progress appears to be a viable approach, and one that warrants controlled testing to determine whether patients whose care plans are informed by PAM scores have better outcomes and require fewer health care resources than those patients whose care plans are not so informed.

ACKNOWLEDGMENT
Support for this study was provided by the Robert Wood Johnson Foundation through grants #045524 and #050787.

NOTE
1. Because they responded to the PAM items in a way that suggests that they were doing so in a rote or insincere manner (such as giving the same response to all items), 46 of the 1,515 respondents were eliminated from the analysis.