Usability Testing of a Reusable Pulse Oximeter Probe Developed for Health-Care Workers Caring for Children < 5 Years Old in Low-Resource Settings

Abstract. Hypoxemia measured by pulse oximetry predicts child pneumonia mortality in low-resource settings (LRS). Existing pediatric oximeter probes are prohibitively expensive and/or difficult to use, limiting LRS implementation. Using a human-centered design, we developed a low-cost, reusable pediatric oximeter probe for LRS health-care workers (HCWs). Here, we report probe usability testing. Fifty-one HCWs from Malawi, Bangladesh, and the United Kingdom participated, and seven experts provided reference measurements. Health-care workers and experts independently measured the peripheral arterial oxyhemoglobin saturation (SpO2) in children < 5 years old. Health-care worker measurements were classed as successful if recorded within 5 minutes and physiologically appropriate for the child, using expert measurements as the reference. All expert measurements were considered successful if obtained in < 5 minutes. We analyzed the proportion of successful SpO2 measurements obtained in < 1, < 2, and < 5 minutes and used multivariable logistic regression to predict < 1 minute successful measurements. We conducted four testing rounds with probe modifications between rounds, and obtained 1,307 SpO2 readings. Overall, 67% (876) of measurements were successful and achieved in < 1 minute, 81% (1,059) in < 2 minutes, and 90% (1,181) in < 5 minutes. Compared with neonates, increasing age (infant adjusted odds ratio [aOR]: 1.87, 95% confidence interval [CI]: 1.16, 3.02; toddler aOR: 4.33, 95% CI: 2.36, 7.97; child aOR: 3.90, 95% CI: 1.73, 8.81) and being asleep versus being calm (aOR: 3.53, 95% CI: 1.89, 6.58) were associated with < 1 minute successful measurements. In conclusion, we designed a novel, reusable pediatric oximetry probe that was effectively used by LRS HCWs on children. This probe may be suitable for LRS implementation.


INTRODUCTION
Pneumonia is the leading infectious cause of death in children < 5 years old, and an estimated 85% of all global pneumonia deaths occur in sub-Saharan Africa and South Asia. 1,2 Severe pneumonia can be associated with hypoxemia, defined by the World Health Organization (WHO) as a peripheral arterial oxyhemoglobin saturation (SpO2) < 90%. 3 Hypoxemia is strongly associated with child pneumonia mortality in low-resource settings (LRS) and can be detected noninvasively by pulse oximetry. 4 Hospital oxygen systems using pulse oximetry in LRS are associated with reduced pneumonia mortality. 5 Current WHO guidelines recommend the use of pulse oximetry at peripheral facilities only if available and provide no guidance for its use at the community level. 6,7 However, it is widely recognized that routine pulse oximetry screening could improve pediatric pneumonia management in LRS. [8][9][10] Routine pulse oximetry use for children in LRS has been limited despite the availability of high-quality, low-cost pulse oximeters designed for use in LRS, such as the Lifebox® oximeter (New Taipei City, Taiwan). 9,11 Lifebox® Foundation is a nonprofit organization focused on safer surgery and anesthesia in LRS, and the foundation currently makes available a high-quality pulse oximeter and probe priced at $250 USD/unit. A recent pneumococcal vaccine effectiveness study demonstrated successful implementation and use of the Lifebox® oximeter with an adult probe on > 13,000 children with clinical pneumonia between 2012 and 2014 across the routine health system in Malawi. 9,12 However, the authors noted that SpO2 measurements were difficult to obtain in some children, particularly in younger infants and neonates, and that a low-cost, reusable probe designed specifically for children would significantly advance implementation of pulse oximetry in LRS. 9 This project evolved from this key finding.
In high-income settings, single-use adhesive probes costing ∼$10 each are commonly used but are an unsustainable solution for LRS. A reusable probe, optimized for measuring SpO2 on children of all ages by health-care workers (HCWs) of varying training backgrounds, would potentially be a key advancement for improving pneumonia care in LRS.
We used a human-centered design (HCD) approach 13 that engaged end-users and experts from multiple disciplines in the development of a reusable, low-cost pediatric oximeter probe. This study presents the summative usability testing process of our HCD approach. Our objective was to evaluate the usability of the probe by end-users and experts across a range of settings and children against an aspirational target product profile (TPP) goal of 95% of readings achieved within 1 minute (Supplemental Appendix 1). As HCWs in LRS are overburdened and have limited time per patient, we established this goal as an ideal time to obtain an SpO2 reading, based on inputs from experts and end-users. If achievable, this could optimize implementation feasibility of pulse oximetry in this setting.

MATERIALS AND METHODS
We conducted usability testing of a novel pediatric pulse oximeter probe (LB-01), developed using HCD, using feedback from a modified Delphi technique to aid probe design refinements. The probe was used in combination with the Lifebox® oximeter (version 1.5). Participants were HCWs in two LRS with a high pneumonia burden (Malawi and Bangladesh) and one high-resource setting (the United Kingdom [UK]), included as a site with highly trained HCWs. The research described in this article does not evaluate the device's accuracy. Our study team (M. B.) evaluated the accuracy of this pulse oximetry probe separately in an in-vivo study at the University of California San Francisco. The device passed all testing according to pulse oximeter device regulatory standards. 14,15
Human-centered design with modified Delphi method. The modified Delphi method is a series of consecutive investigations or rounds that seek organized, incremental feedback to achieve the most accurate views from experts. 16 We incorporated this approach within our HCD process and stepwise usability testing, with end-users and experts providing feedback between each round. This allowed us to consider end-user-driven probe refinements before further testing.
Settings. Mchinji, Malawi. Mchinji is located in central Malawi where health care is provided by community health workers (CHWs) called Health Surveillance Assistants, nurses, and non-physician clinicians (clinical officers). 17 All testing was conducted at the district hospital. Mchinji health-care providers have used the Lifebox® oximeter and adult clip probe since 2012. 9,12
Sylhet, Bangladesh. In northeast Bangladesh, we conducted the study in collaboration with the research consortium Projahnmo in Sylhet. Physicians and nurses staff Projahnmo-supported clinics, and CHWs perform household surveillance. Projahnmo clinical staff and CHWs have used Masimo Rad5® pulse oximeters (Irvine, CA) and reusable pediatric wrap probes since 2015.
London, UK. In the United Kingdom, we conducted the study at the Great Ormond Street Hospital where pulse oximetry using single-use probes is routine. Health-care workers were highly trained nurses familiar with pulse oximetry, but not reusable probes, and not the Lifebox ® oximeter.
Recruitment. Study staff purposefully recruited HCWs with prior pulse oximetry experience, but without prior project involvement. In the LRS, this included CHWs. All HCWs received training that included an approximately 1 hour overview of pulse oximetry, orientation to the device, practice with using the device on other HCWs and volunteer children, and other protocol specifics (Supplemental Appendix 2). We reimbursed LRS HCWs for travel.
We recruited children aged < 5 years from inpatient and outpatient settings using convenience sampling and categorized them by age: neonates (0-28 days), infants (1-11 months), toddlers (12-23 months), and children (24-59 months). We excluded children who were clinically unstable, were receiving oxygen, or had an SpO2 < 95% on expert screening. Seven clinicians with extensive pediatric oximetry experience in LRS conducted participant screening and obtained reference pulse
Sample size. For usability testing, we needed at least 15 users per study site to ensure we identified most device issues. 18,19 For statistical analysis, we needed at least 292 measurements to test the probe against our pre-defined aspirational goal of 95% of measurements in < 1 minute with 2.5% precision. To allow for stratified analyses by site, end-user type, and age group, we aimed to recruit 17 HCWs at each study site, with each testing 12 children divided equally by age categories (N = 204 per site). Experts recorded approximately the same number of measurements per site, also equally divided by age strata.
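The figure of 292 measurements is consistent with the standard normal-approximation formula for estimating a single proportion, n = z² p(1 − p) / d², with p = 0.95 and d = 0.025. The following is a minimal sketch of that arithmetic; the 95% confidence level (z = 1.96) and the function name are our assumptions, as the text does not state how the calculation was performed.

```python
import math

def sample_size_for_proportion(p, precision, z=1.96):
    """Measurements needed to estimate a proportion p within +/- precision.

    Uses the normal-approximation formula n = z^2 * p * (1 - p) / d^2.
    z = 1.96 (95% confidence) is an assumption; the text does not state it.
    """
    n = (z ** 2) * p * (1 - p) / (precision ** 2)
    return math.ceil(n)  # round up to a whole number of measurements

# Aspirational goal: 95% of measurements in < 1 minute, 2.5% precision.
required = sample_size_for_proportion(p=0.95, precision=0.025)

# Planned HCW measurements per site: 17 HCWs, each testing 12 children.
per_site = 17 * 12

print(required, per_site)
```

Under these assumptions, the planned per-site total falls below the overall requirement, so the requirement is met by pooling measurements across sites and rounds rather than within any single site.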
Data collection. We used a Lifebox® oximeter (version 1.5). To allow the expert to conduct reference SpO2 measurements, an independent observer recorded the child's demographic data, condition, and clinical features, and timed the expert SpO2 measurement (Supplemental Appendix 3). The observer was a member of the research team who was present to provide independent timing and data recording of the measurement process. Children were first screened by the experts, and those with an SpO2 > 95% were permitted to participate in HCW testing. For expert measurements, the expert and/or caregiver could distract the child and support the limb. The independent observer started a timer once probe placement was complete and stopped it when the expert said "stop," signifying that, in the expert's view, this was a successful SpO2 reading. Experts were trained to consider an SpO2 reading reliable if the measurement had a consistent, high-amplitude plethysmography waveform, accompanied by an SpO2 and heart rate that, in their judgment, were biologically plausible for that child. The SpO2, heart rate, and additional observations were then recorded. If an SpO2 was not obtained at the first location, the expert could adjust the probe or use another location for a maximum of 5 minutes, at which point the testing was stopped. Biologically implausible measurements were, for example, measurements that had a normal SpO2 but a heart rate lower than the approximate 10th centile for the age of the child, 20 or measurements that had a severely abnormal SpO2 value in an otherwise clinically stable child.
Health-care workers were blinded to expert measurements. To allow the expert to observe HCW testing without distractions and ensure accurate timing, the independent observer also timed all HCW measurements (Supplemental Appendix 3). Health-care workers followed the exact same measurement process as completed by experts and used the same criteria for determining whether the SpO 2 was reliable. The expert also determined whether the HCW reading was reliable in their judgment by using the same metrics used with the expert measurement, and the observer separately recorded this information without the HCW's knowledge. The HCW could attempt to obtain an SpO 2 for a maximum of 5 minutes, at which point the testing was stopped. No additional guidance or retraining was provided to HCWs by experts during or between measurements if the HCWs were taking measurements incorrectly. De-identified data were captured electronically and uploaded onto secure servers.
Following testing, HCWs completed a written usability questionnaire (Supplemental Appendix 4). Each HCW self-completed the questionnaire in writing after finishing their SpO2 measurements, with the research team clarifying questions as necessary.
Data analysis. The primary outcome was a successful measurement in < 1 minute; secondary outcomes were successful measurements in < 2 and < 5 minutes. To be relevant for real-world practice, a successful HCW SpO2 reading was defined as one completed within 5 minutes, with a consistent, high-amplitude plethysmographic waveform, and displaying a value > 95% (clinically stable) or, if the HCW SpO2 was < 95%, within ±2% of an immediately repeated expert measurement. Expert readings were all assumed to be reliable, and therefore were considered successful if achieved in < 5 minutes.
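The success definition and time thresholds above can be expressed as a simple classification rule. The sketch below is illustrative only: the `Reading` fields, function names, and example values are hypothetical, and because the definition does not say how a value of exactly 95% is handled, we assume here that it requires comparison with the repeated expert measurement.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Reading:
    seconds: float                            # time from probe placement to "stop"
    spo2: int                                 # HCW-reported SpO2 (%)
    good_waveform: bool                       # consistent, high-amplitude waveform
    expert_repeat_spo2: Optional[int] = None  # immediately repeated expert value

def is_successful(r: Reading, limit_s: float = 300.0) -> bool:
    """Apply the success definition to one HCW reading."""
    if r.seconds > limit_s or not r.good_waveform:
        return False
    if r.spo2 > 95:                # clinically stable value
        return True
    # Assumption: values of 95% or lower need a repeated expert measurement.
    if r.expert_repeat_spo2 is None:
        return False
    return abs(r.spo2 - r.expert_repeat_spo2) <= 2

def proportion_successful_within(readings, threshold_s):
    """Proportion of all readings that were successful in < threshold_s seconds."""
    hits = sum(1 for r in readings if is_successful(r) and r.seconds < threshold_s)
    return hits / len(readings)

# Hypothetical example data, not study data.
readings = [
    Reading(25, 98, True),        # successful in < 1 minute
    Reading(90, 97, True),        # successful in < 2 minutes
    Reading(200, 93, True, 94),   # low value, agrees with expert repeat
    Reading(40, 92, True, 97),    # low value, disagrees with expert repeat
    Reading(310, 99, True),       # exceeded the 5-minute limit
]
for minutes in (1, 2, 5):
    print(minutes, proportion_successful_within(readings, minutes * 60))
```

Note that the denominator is all attempted readings, so unsuccessful attempts count against every time threshold.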
We described the proportion of measurements in < 1, < 2, and < 5 minutes. We stratified results by the child's age, end-user cadre, and study site; differences between the proportion of successful readings in < 1 minute were evaluated by using χ2 tests, and median time to reading by Kruskal-Wallis tests. We used univariable and multivariable logistic regressions for predictors of successful measurements < 1 minute. We a priori selected the following predictors: child's age, weight, ethnicity, measurement site and relocation, child's condition, end-user cadre, and study site. Analyses were adjusted for clustering within children using robust standard errors. All analyses were conducted using Stata 14 (StataCorp LLC, College Station, TX).
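Logistic regression coefficients are estimated on the log-odds scale, so an adjusted odds ratio is exp(β) and its 95% CI is exp(β ± 1.96 × SE), which makes the interval symmetric on the log scale. As a hedged illustration, the sketch below back-calculates the standard error from the asleep-versus-calm estimate reported in the abstract (aOR 3.53, 95% CI 1.89 to 6.58) and rebuilds the interval. The function names are ours, and the use of z = 1.96 is an assumption about how the interval was computed.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its standard error to an OR and CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower, upper, z=1.96):
    """Recover the log-scale standard error from a reported OR confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Reported estimate for asleep versus calm: aOR 3.53 (95% CI: 1.89, 6.58).
beta = math.log(3.53)
se = se_from_ci(1.89, 6.58)

odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(beta, se)
print(round(odds_ratio, 2), round(ci_lower, 2), round(ci_upper, 2))
```

The rebuilt limits agree with the reported interval to within rounding of the published values, which is what one would expect if the interval was computed on the log-odds scale.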
Likert scale questions from usability questionnaires were described, primarily for HCW probe usability on age categories. Common themes from free-text responses about usability, challenges, and suggested improvements were coded.
Ethics. Ethical approval, including review of our consent forms and information sheets, was provided by Malawi (ref: 16

RESULTS
Testing process. We conducted four iterative testing rounds, with HCW and expert feedback between the first three rounds and expert feedback after the fourth round (Figure 1). We tested the same probe design in the first and second rounds. Another expert-only testing session was conducted in Malawi using a Nellcor® box (Medtronic, Minneapolis, MN) to triangulate our results with a device that was compatible with the probe and incorporated motion-sensitive software. This allowed us to discriminate between the oximeter and probe performance by controlling for the oximeter's algorithmic design.
Round 2 testing confirmed round 1 observations, so we refined the probe before round 3, adding a firmer pad beneath the light detector and a revised pivot to open the probe wider (Figure 2). Based on round 3 feedback, we shifted the light-emitting diode and detector 5 mm away from the probe hinge, and changed the internal padding curvature. The fourth round of expert-only testing completed an LRS field performance check on the final probe design.
A total of 5.8% (76/1,307) of all SpO2 measurements were biologically implausible (Supplemental Table 1).
Pulse oximeter box sensitivity testing. A total of 106 measurements were taken using the Nellcor® box, achieving a median time of 29.8 seconds and 67% in < 1 minute (Supplemental Table 2). There was no statistical difference between the median time to reading and round 1 Malawi expert readings (P = 0.32), but there was an upward trend in the proportion of successful measurements achieved < 5 minutes by Nellcor® (94.8% versus 99.1%, P = 0.06).
Child's behavioral state. The child's behavioral state had an important relationship with the time to a successful SpO2 (Supplemental Table 3). The median time to a successful SpO2 was shorter if the child was asleep (24.
Predictors of successful measurements. Results from univariable and multivariable analysis of factors associated with successful measurements < 1 minute are in Table 4. In the adjusted model, increasing age and being asleep were associated with achieving an SpO2 in < 1 minute. The child being agitated or crying, repositioning the probe, and first placing the probe across the foot were all associated with failing to measure an SpO2 in < 1 minute.
Health-care worker feedback. All 51 HCWs completed the questionnaire. Overall, 74% of HCWs either strongly or somewhat agreed that the probe was easy to use on all children aged 0-59 months. There was an upward trend in ease of use with increasing age across all sites (Figure 3), but there were differences in responses between the sites. Eighty-eight percent of respondents strongly or somewhat agreed that the pulse oximeter would make their jobs easier, and 90%

agreed that it would help them diagnose pneumonia. The main challenges raised by participants were the probe's size relative to neonates, and readings during movement, with only 38% of respondents agreeing that it is easy to get a reading in a moving child. Additional HCW feedback is reported in Supplemental Table 4.

DISCUSSION
We used an HCD process to design a novel pulse oximeter probe for use by HCWs, and evaluated the usability of the probe on children < 5 years old in low- and high-resource environments. Our primary outcome was the time to a successful SpO2 measurement. We achieved this 67% of the time in < 1 minute, 81% in < 2 minutes, and 90% in < 5 minutes, although we identified differences across testing rounds, with different user cadres and in different ages. In the final round, when all probe modifications were included, experts achieved a reading in < 5 minutes in 100%, < 2 minutes in 79%, and < 1 minute in 61% of children. Although these results suggest feasibility for use of this probe in LRS, and therefore support LRS implementation, it is important to acknowledge that they were lower than our a priori aspirational target of 95% of readings in < 1 minute. Notably, this target was more ambitious than an expert-developed TPP, which set < 2 minutes as the ideal and < 5 minutes as the minimum performance. In retrospect, our study target was too ambitious for a lower-cost device designed for newborns to 5 year olds, but the device meets the wider clinically acceptable minimum performance, even though it is unequipped with more sophisticated motion-tolerant technology. We believe that achieving a time to measurement target of > 90% of measurements in < 1 minute is necessary for optimal implementation of pulse oximetry in LRS with high patient volumes and few health-care providers. Additional investments in pulse oximetry development are needed to meet this target. We are unaware of any other commercial pulse oximeter that has quantified end-user oximetry usability across the pediatric age spectrum by a range of end-users.
Through expert and end-user engagement, our HCD process established time to a successful SpO2 measurement as this study's primary endpoint for usability testing. Consistently recording a successful SpO2 quickly is critical for feasibility of routine pulse oximetry screening of children in busy, understaffed LRS. Despite this, limited published data have examined SpO2 measurement times in LRS. A study from Malawi in children < 5 years old, using the Lifebox® oximeter (version 1.0) and an adult clip probe, found that only 45% of HCWs reported that on average SpO2 measurements took < 2 minutes. 9 Here, we achieved 81% of measurements in < 2 minutes, suggesting that both the redesigned probe and previous box microprocessor upgrades have markedly improved performance. Emdin et al. 21 explored HCW oximetry testing on infants < 60 days old in Pakistan, reporting readings in 94.4% of infants in < 1 minute and 99% in < 5 minutes. There are two likely explanations for the Pakistan study's higher proportion of readings in < 1 minute. First, they used a pulse oximeter monitor and reusable probe with motion and low perfusion technology (Rad-5v® and LNCS® Y-I multisite sensor, Masimo®, > $700 USD; Irvine, CA). However, this is prohibitively expensive technology for wide-scale LRS implementation. The Lifebox® box and redesigned probe are expected to cost between $100 and   22 We recommend that for future pulse oximetry usability testing in LRS investigators consider the metric of time to a successful SpO2 measurement as the reference standard from which oximeter usability is evaluated.
Our study highlighted the factors associated with longer measurement times as a proxy for difficult SpO 2 measurements. We found that HCWs took longer to achieve an SpO 2 in < 1 year olds than in older children. In addition, we found that it took longer to achieve an SpO 2 if the child was agitated or crying than when sleeping or calm. These findings were supported by HCW feedback and previous studies noting patient age and cooperation as associated with successful measurements. 9,22,23 It is key that these factors are considered when developing future LRS pulse oximeters. Our findings suggest that low-cost motion tolerant technological innovation is essential for future devices.
In both Malawi and Bangladesh, there was a trend that CHWs performed better than non-CHWs who have more extensive training and education. Our previous focus group work suggests that CHWs place a higher "value" on pulse oximetry, which may lead to more careful adherence to SpO 2 protocols, and highlights the enormous potential of community-based implementation programs for pulse oximetry. 9,24 We also found that experts consistently performed better than HCWs in LRS testing, a difference not seen in the United Kingdom. LRS HCWs reported a higher proportion of biologically implausible measurements (18.9%) than UK HCWs (0.5%). Several reasons may account for these findings. First, LRS HCWs are likely to be less familiar with the science behind pulse oximetry and the interpretation of SpO 2 readings in the broader clinical context. This lack of clinical education and training may lead to poorer comprehension of the biological plausibility of SpO 2 readings, and was probably exacerbated in Bangladesh where HCWs were less familiar with the Lifebox ® and had been using oximetry for a shorter period of time than in Malawi. Second, LRS HCWs without pediatric training may be less adept at applying techniques with children to reduce movement and agitation to achieve a successful SpO 2 . This includes correct probe placement, appropriate support of the limb, or other distraction techniques including using toys or breastfeeding. Finally, LRS HCWs may have felt obligated to report an SpO 2 to experts even if they believed that the SpO 2 was biologically implausible. These differences could potentially be addressed with improved education, rather than basic task-specific training and ongoing mentorship approaches, a critical consideration for wide-scale implementation of pulse oximetry in LRS.
This study had several limitations. The act of directly observing HCWs obtaining oximetry readings may have caused them to change their usual practice, the Hawthorne effect, altering the final measurements HCWs provided. 25 Because the HCWs were aware that their measurement time was being recorded, this could have led to hastier, inaccurate readings, thinking a faster time was more important than what we defined as a successful measurement. This may be more pronounced in LRS and may therefore, in part, account for the lower proportion of successful readings provided by LRS HCWs. We did anticipate this potential bias before the study, and during trainings, we stressed the need for HCWs to believe the SpO2 to be true. There were also notable differences between patient populations across settings. For example, more LRS children had infectious diagnoses and more UK children had chronic illnesses. United Kingdom children with chronic illnesses may be more familiar with pulse oximetry, and therefore more compliant with measurements. Finally, because of the convenience sample design and patient availability, we re-tested 332 of 526 children; repeat testing may have led to bias if the experts, HCWs, or children modified their behavior between measurements. To account for this, we adjusted for clustering within children in the regression analysis.
FIGURE 3. Feedback from health-care worker usability questionnaire from Malawi, Bangladesh, and the United Kingdom (UK). Answers in response to the question: "How easy did you find the probe to use in XX?" presented for the different age categories. This figure appears in color at www.ajtmh.org.
In conclusion, this HCD usability study indicates that, in children < 5 years old, it is possible to use a pulse oximetry probe to achieve successful SpO2 readings in 67% of children in < 1 minute, 81% in < 2 minutes, and 90% in < 5 minutes. These results are encouraging for an innovative pediatric pulse oximetry probe that is reusable and low cost. We believe that this design is an appropriate "universal probe" suitable for use by LRS HCWs on patients of any age, including newborns. Our findings highlight the factors associated with longer measurement times, in particular movement artifact, and suggest that task-specific training is sufficient for LRS study settings, but enhanced training and ongoing supervision are still likely necessary to successfully and sustainably implement pulse oximetry in non-study LRS settings. We recommend that future pulse oximetry usability testing studies in LRS use an HCD process that incorporates feedback from field experts and end-users. We additionally recommend that future usability studies use time to a successful SpO2 measurement as the standard for assessing implementation feasibility of oximeter devices in LRS. Next steps could focus on developing low-cost LRS pediatric pulse oximeters as specialized spot-check devices with high motion tolerance that display the most reliable, single SpO2 reading for easier HCW interpretation.