The Evaluation of Preference and Perceived Quality of Health Communication Icons Associated with COVID-19 Prevention Measures

Icons have been widely utilized to describe and promote COVID-19 prevention measures. The purpose of this study was to analyze the preference and subjective design features of 133 existing icons associated with COVID-19 prevention measures published by the health and medical organizations of different countries. The 133 icons represent nineteen different function names, such as “Wash Hands” and “Wear Face Mask”. A total of 57 participants were recruited to perform two different tests: a ranking test and a subjective rating test. The ranking test was conducted to elicit the preference ranking of seven icon designs representing each function name. It was followed by a subjective rating test using 13 semantic scales on the two most preferred icons to analyze their perceived quality. Spearman’s correlation was applied to derive the possible correlations between users’ rankings and the semantic scales, and Friedman’s test was also performed to determine whether ranking 1 and ranking 2 truly differed in terms of each semantic scale, so that the data could be interpreted meaningfully. Generally, findings from the current study showed that the image presented in the icon is the key point that affects the icons’ perceived quality. Interestingly, Spearman’s correlation analysis between preference ranking and semantic scales showed that vague–clear, weak–strong, incompatible–compatible, and ineffective–effective were the four semantic scales most strongly correlated with the preference ranking. Considering the significant relationships between the semantic distances and the functions, images depicted in an icon should be realistic and as close as possible to its respective function to cater to users’ preferences. In addition, the results of Spearman’s correlation and Friedman’s test also indicated that compatibility and clarity of icon elements are the main factors determining a particular icon’s preferability. 
This study is the first comprehensive study to evaluate the icons associated with the COVID-19 prevention measures. The findings of this study can be utilized as the basis for redesigning icons, particularly for icons related to COVID-19 prevention measures. Furthermore, the approach can also be applied and extended for evaluating other medical icons.


Introduction
Icons have been widely utilized as a tool to promote COVID-19 prevention measures during the pandemic. They are intended to represent complex information about COVID-19 preventive measures quickly and clearly. Their goal is to enable visual communication that allows easier transmission of ideas and information compared to written communication [1]. Effective icon designs can overcome language barriers and can successfully convey useful information since they reduce translation requirements and give the information behind them an international look [2].
As for COVID-19 prevention icons, their success and effectiveness may have a big impact with regard to virus containment. The best way to prevent illness from the COVID-19 virus is to understand how the virus spreads and to avoid being exposed to it [3]; therefore, it is important that users accurately understand and recognize which COVID-19 preventive measure each icon represents. Furthermore, factors that may influence the effectiveness of these icons should be taken into consideration, such as the icon formats [4].
Chi and Dewi [4] classified icons under seven format categories, namely, image-related, concept-related, arbitrary, semi-abstract, word, abbreviation, and combined. Image-related icons are the typical representations of the object or action, concept-related icons represent concepts that are close to but not exactly the concrete image of the action or object, and arbitrary icons have no obvious reference to their intended meaning and can only be meaningful and understood through education [4][5][6][7]. Semi-abstract icons, on the other hand, combine image-related icons (a concrete representation of an action or object) with concept-related or arbitrary icons (an abstract representation of an action or object) [8]. Aside from graphical icons, textual and combined icons are also considered icon format categories if textual elements are added to the icons [9]. Furthermore, textual icons can be divided into two classifications: word and abbreviation. Because of the obvious mapping of the image-related icon to its referent, it is superior for fast and accurate recognition, while textual icons are better for reaction time [10][11][12][13]. Other graphical icon formats, namely concept-related and arbitrary, are harder to understand immediately since they have a less obvious connection to their intended meaning. Arbitrary icons should be avoided because the need to educate people first to comprehend their meaning requires a considerable amount of funding and time [2].
The majority of the existing COVID-19 prevention icons based on the infographics released by the World Health Organization and other medical or health organizations of different countries are in image- and concept-related formats. For example, the icons that intend to convey protocols about travel restrictions mostly use a concrete illustration of bags or suitcases or a person pulling baggage as a representation (Figure 1). Semi-abstract icon formats were also applied to some preventive measure functions. On semi-abstract travel restriction icons, aside from the concrete pictorial representations of the intended meaning, arbitrary symbols are applied [8]. There are icons with a red-colored circle around them or a punctuation mark in them, representing that something is prohibited or should be avoided. However, despite the advantages and contribution that the visual icons provide in raising awareness about COVID-19 prevention, there are instances in which the icons used to represent a COVID-19 preventive measure may cause confusion and incorrect comprehension, especially for icons with a similar context (e.g., the confusion between the icon for shortness of breath and the one for breathing difficulty). The image element of the icon is the major cause of confusion among its readers. In this case, designers should consider and evaluate the icon characteristics to determine the icon's recognition performance.
Icon characteristics can be classified into physical (external) and psychophysical (internal). Previous studies about icon physical/external characteristics such as size [14], color [15,16], spacing, and density [15] provided a number of practical guidelines for icon design. Nonetheless, to address the semantic information conveyed by different icons, other studies focused on the influence of icon internal characteristics [17], widely known as psychophysics in human factors and ergonomics. For example, subjective evaluation methods were utilized by Ng and Chan to explore the effects of sign design characteristics on the comprehensibility of traffic and safety signs [18,19]. On the other hand, McDougall et al. carried out a series of studies regarding icon identification to investigate how these characteristics affect users' cognitive performance [20][21][22][23][24]. These findings suggested that several factors may modify the performance of an icon, such as its communicativeness, complexity, layout, and semantic distance. They were supported by studies that demonstrated that simple and less complex icons can be more easily recognized [2][25][26][27] and concrete icons can be identified more accurately and quickly by the users [12,28]. Complexity pertains to the intricacy of the details in the icon [19], while semantic distance is the measure of the closeness of what is illustrated in the icon to its true intended meaning [19]. Furthermore, it is also suggested that an icon may perform better if it can express its intended message clearly and if its features are arranged carefully [21].
Despite the wide and frequent application of visual icons as a medium for visual communication for COVID-19, no study has yet focused on existing COVID-19 prevention icons. In accordance with the International Standards Organization [29], it is necessary to develop icon design principles to ensure visual clarity and subjective preference for enhancing icon recognition and usability. Moreover, there is a lack of studies evaluating the perceived quality of existing medical-related icons for the broad population rather than for medical staff only.
The purpose of this study is to analyze the existing icons of COVID-19 prevention measures published by the health and medical organizations of different countries. A ranking test and a subjective rating test were utilized to evaluate the collected icons. This study is the first to analyze the effectiveness of existing icons for COVID-19 prevention measures. The findings are beneficial for human factors engineers, industrial designers, and even governments, particularly for designing medical-related icons.

Participants
A total of 57 participants aged between 18 and 40 years old were recruited to participate in this study (Table 1). All of them were residents of the National Capital Region (Manila), Philippines. Since the data collection was conducted during the COVID-19 pandemic, this study followed the COVID-19 safety protocols of the Department of Health-Philippines. As required by the National Ethical Guidelines for Health and Health-Related Research 2017 of the Philippine Health Research Ethics Board, all participants were fully informed of the purpose of the research as well as all the procedures within the experiments. The respondents were also asked to complete a consent form before performing the required tasks. Finally, they were paid 200 PHP after completing the experiment.

Icon Collection
Seven existing icons representing each of the considered COVID-19 preventive measures functions were collected from the COVID-19 prevention infographics released by the Department of Health (DOH) Philippines, World Health Organization (WHO), European Centre for Disease Prevention and Control, and other medical organizations. A total of 133 icons representing the preventive measures of COVID-19 were evaluated and assessed through a ranking test experiment [30]. This was then followed by a subjective rating test for the top two icons of each respondent from the ranking test. The online experiment was posted and distributed through social media platforms. Table 2 shows the nineteen function names of icons related to COVID-19 preventive measures, while all the icons collected for the current study are displayed in Table 3.

Ranking Test
In the first phase of the experiment, participants were tasked to rank COVID-19 preventive measure icons within the same function name. Following Chi and Dewi [4], the experiment was administered with a computer program developed using JavaScript and PHP, in which respondents ranked the displayed icons from 1 to 7 (see Figure 2). Each participant ranked the most preferred icon under a function name as 1, the next preferred as 2, and so on. Thus, the least favored icon was ranked as 7. The icons were laid out in a circular manner to avoid a possible sequence effect [31,32] or location bias [33]. The function names were also stated next to the displayed icons to provide appropriate context and description for each function [34]. The experiment was conducted online.
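As an illustration only (the study's own program was written in JavaScript and PHP and is not reproduced here), the ranking data for one function name could be checked and summarized as sketched below. The function names `is_valid_ranking` and `summarize_rankings` and the sample data are hypothetical; the sketch assumes each participant supplies one rank from 1 to 7 per icon with no ties.

```python
from statistics import mean, stdev

ICONS_PER_FUNCTION = 7  # seven candidate icons per function name, as in the study


def is_valid_ranking(ranking):
    """A ranking is valid if it uses each rank 1..7 exactly once."""
    return sorted(ranking) == list(range(1, ICONS_PER_FUNCTION + 1))


def summarize_rankings(rankings):
    """Return (icon_index, mean_rank, sd_rank) tuples sorted by mean rank.

    rankings[p][i] is the rank participant p gave to icon i
    (1 = most preferred, 7 = least preferred).
    """
    for ranking in rankings:
        if not is_valid_ranking(ranking):
            raise ValueError(f"invalid ranking: {ranking}")
    summary = []
    for icon in range(ICONS_PER_FUNCTION):
        ranks = [ranking[icon] for ranking in rankings]
        summary.append((icon, mean(ranks), stdev(ranks)))
    # A lower mean rank means a more preferred icon.
    return sorted(summary, key=lambda row: row[1])


# Hypothetical responses from three participants for one function name.
example = [
    [1, 2, 3, 4, 5, 6, 7],
    [2, 1, 3, 4, 5, 6, 7],
    [1, 3, 2, 4, 5, 6, 7],
]
for icon, m, sd in summarize_rankings(example):
    print(f"icon {icon}: mean rank {m:.2f}, SD {sd:.2f}")
```

Sorting by mean rank gives the preference order within a function name, and the two icons each participant ranked 1 and 2 are the ones carried into the subjective rating test.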



Subjective Rating Test
According to Liu and Ho [35], subjective rating features are reliable in determining the performance of icons based on recognition accuracy. Additionally, subjective scales are easy to administer and are more sensitive than objective measurements [36]. Therefore, in this phase of the experiment, participants were asked to rate their top two icons from the ranking test (i.e., the icons ranked first and second for each function) on the basis of subjective design features such as perceived icon quality, communicativeness [21], layout [21], and complexity and semantic distance [4,19,20], as defined in Table 4. Following Chi et al. [37], semantic scales were then assigned for each of the subjective design features (Table 5).
The respondents' top two icons were shown one by one, and they were instructed to evaluate the appearance of each icon according to the semantic scales (Figure 3). They were made aware that on the 7-point Likert scale, the closer their choice is to the left or right end of a semantic scale, the better they think the term at that end describes the displayed icon. However, if they choose the middle of the scale, their opinion of the icon fits both ends of the semantic scale. Similar to the ranking test, the test on subjective design features was also developed using JavaScript and PHP and conducted online.
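The reading of a 7-point response described above can be sketched as follows. This is a hypothetical helper, not part of the study's software; it assumes that 1 is the leftmost point, 7 is the rightmost point, and 4 is the midpoint.

```python
def interpret_rating(left, right, score):
    """Describe a 7-point response on a bipolar semantic scale.

    Assumption: 1 = fully the left term, 7 = fully the right term,
    4 = midpoint (the icon fits both terms equally).
    """
    if not 1 <= score <= 7:
        raise ValueError("score must be between 1 and 7")
    if score == 4:
        return f"neutral between {left} and {right}"
    side = left if score < 4 else right
    strength = abs(score - 4)  # 1 (slightly) to 3 (strongly)
    return f"{side} (strength {strength} of 3)"


print(interpret_rating("vague", "clear", 6))
```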

Table 4. Definitions of the subjective design features.
Perceived Icon Quality: One of the most critical aspects of icon development that defines a successful design [38].
Communicativeness: Refers to how the icon expresses its intended meaning [21].
Complexity: Pertains to the intricacy of the details in the icon [19].
Layout: How carefully the features of an icon are arranged [21].
Semantic Distance: The measure of the closeness of what is illustrated in the icon to its true intended meaning [20].

Table 5. Semantic scales and their corresponding subjective design feature.



Statistical Methods
Spearman's correlation analysis was used to identify possible correlations between the ranking test results and the semantic scales. The users' ranking results were dummy coded as 1 for the top ranking and 2 for the second ranking. We hypothesized negative correlations between the ranking test and the semantic scales, since more positive semantic scale ratings would lead to a better ranking (rank 1 instead of 2). p < 0.05 (two-tailed) was set as the threshold for this statistical analysis.
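To make the expected sign concrete, the sketch below computes Spearman's rho between dummy-coded ranks and a semantic scale rating using only the Python standard library. The data are invented for illustration, the helper names are hypothetical, and the significance test is omitted; it is not the authors' analysis script.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: preference rank (1 or 2) and a vague-clear rating (1..7).
preference_rank = [1, 1, 1, 2, 2, 2]
clarity_rating = [7, 6, 7, 5, 6, 4]
print(round(spearman_rho(preference_rank, clarity_rating), 3))
```

In this toy data the rank-1 icons receive the higher clarity ratings, so rho is negative, which is the direction hypothesized above.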
Further detailed analyses for each of the 19 functions were conducted using Friedman's test. Friedman's test was performed to determine whether ranking 1 and ranking 2 truly differed in terms of each semantic scale, so that the data could be interpreted meaningfully. It also helps scholars in designing or choosing pertinent communication icons related to COVID-19 prevention measures. As an example, the true difference between ranking 1 and ranking 2 in terms of semantic scale number 1 (unlikable-likable) for function number 8 (cover when coughing or sneezing) can be evaluated by the following hypotheses:
Hypothesis 0 (H0). There is no true difference between ranking 1 and ranking 2 in terms of "unlikable-likable" for the function "cover when coughing or sneezing".
Hypothesis 1 (H1). There is a true difference between ranking 1 and ranking 2 in terms of "unlikable-likable" for the function "cover when coughing or sneezing".
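A minimal sketch of how such a test could be computed for one function and one semantic scale is given below. It assumes that each participant contributes one pair of ratings (rank-1 icon, rank-2 icon), uses the standard tie-corrected Friedman statistic, and obtains the p-value from the chi-square distribution with one degree of freedom, which applies only to two conditions. The function names and data are hypothetical.

```python
from collections import Counter
from math import erfc, sqrt


def rank_within_block(scores):
    """Average ranks within one participant's scores (1 = lowest score)."""
    ranks = []
    for s in scores:
        lower = sum(1 for other in scores if other < s)
        equal = sum(1 for other in scores if other == s)
        ranks.append(lower + (equal + 1) / 2)
    return ranks


def friedman_test(blocks):
    """Return (chi_square, mean_ranks, p_value) for a list of rating tuples.

    Each block holds one participant's ratings, one per condition.
    The p-value is computed only for two conditions (df = 1); otherwise None.
    """
    if not blocks:
        raise ValueError("at least one block is required")
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for block in blocks:
        if len(block) != k:
            raise ValueError("all blocks must have the same length")
        for j, r in enumerate(rank_within_block(block)):
            rank_sums[j] += r
        tie_term += sum(t ** 3 - t for t in Counter(block).values())
    correction = 1 - tie_term / (n * k * (k * k - 1))
    if correction == 0:
        raise ValueError("every block is fully tied; the statistic is undefined")
    raw = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    chi_square = max(raw / correction, 0.0)  # guard against rounding below zero
    mean_ranks = [r / n for r in rank_sums]
    # For df = 1, the chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi_square / 2)) if k == 2 else None
    return chi_square, mean_ranks, p_value


# Hypothetical unlikable-likable ratings: (rank-1 icon, rank-2 icon) per participant.
ratings = [(7, 5), (6, 4), (7, 6), (5, 6)]
chi_square, mean_ranks, p_value = friedman_test(ratings)
print(chi_square, mean_ranks, p_value)
```

H0 would be rejected when the p-value falls below the chosen threshold; the mean ranks indicate which of the two icons tends to receive the higher rating.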
These two hypotheses were applied to every possible condition, i.e., the difference between ranking 1 and ranking 2 in terms of each semantic scale for each of the tested functions.

Results
Table 3 shows the ranking of the icons per function name based on the responses of the participants. The icons were tabulated with their corresponding mean ranking score and its standard deviation. The list of icons in Table 3 was sorted based on the mean ranking values. As presented in Table 3, the image-related and combined icon design formats were preferred by the majority of participants, with the image-related format being ranked first on eight of the nineteen function names, and the combined format also obtaining the first rank on another eight function names. For the remaining three function names, the semi-abstract format was ranked first.
Table 6 shows the descriptive statistics of the semantic scales for all the tested icons. On average, all the tested icons were rated around a score of 6 (on a 1-7 scale) for all the subjective design features. These results indicated that all the tested icons are sufficiently recognizable, compatible, organized, simple, familiar, effective, concrete, likeable, clear, uncluttered, strong, beautiful, and colorful.
Table 6. Descriptive statistics: semantic scales.

Table 7 shows the result of the Spearman's correlation analysis (two-tailed) between pairs of the semantic scales and the ranks for all functions. The analysis showed that the 13 semantic scales were significantly intercorrelated. Although all semantic scales were also significantly correlated with the ranks, vague-clear, weak-strong, and incompatible-compatible had the highest correlation coefficients, i.e., −0.206, −0.205, and −0.200, respectively.
Table 8 presents the Friedman's test result for determining the true difference between ranking 1 and ranking 2 in terms of "unlikable-likable" for the function "cover when coughing or sneezing". Based on this table, the mean rank for unlikable-likable rank 1 was 1.7, while the mean rank for unlikable-likable rank 2 was 1.3. The chi-square statistic indicated that there was a significant difference between ranking 1 and ranking 2 in terms of "unlikable-likable" (Table 8). In other words, H0 was rejected in favor of H1.
Table 9 summarizes all the Friedman's tests on each semantic scale for all functions. Based on this table, there were some significant differences between ranking 1 and ranking 2 in different semantic scales. Similar results were obtained from Spearman's and Friedman's tests (see Table 9), particularly for weak-strong and incompatible-compatible, where these two semantic scales showed a significant difference between ranking 1 and ranking 2 in 11 out of 19 functions. On the other hand, Friedman's tests showed that for Function 1 (Shortness of Breath), Function 5 (Wash Hands), and Function 16 (Wash Clothes Properly), none of the semantic scales tested in this study differed significantly between ranking 1 and ranking 2.

Discussion
From the ranking test results, it can be concluded that icon users prefer the icons to be in image-related formats. However, for some function names (e.g., shortness of breath and difficulty in breathing), even though the icons are in image-related format, they were still ranked low if the images in the icon were drawn as silhouette-like illustrations, indicating that the users probably like image-related icons only if they are illustrated realistically or in a more concrete way. This may be because the concreteness of the icons improves their ability to convey their meaning, as concrete symbols tend to be more visually obvious since they depict objects, places, and people that are already familiar to us in the real world [41,42]. Therefore, the more concrete the icon is, the closer its semantic distance is, and this results in its users being able to react quickly and accurately to it [12,28]. The combined format, which is a combination of icons and textual labels [10], was on par with the image-related format in terms of the number of function names for which it was ranked first. While image-related icons are known to be superior for fast and accurate recognition because of the obvious mapping between the icon and its referent [10][11][12], textual icons, on the other hand, have been shown to be better for reaction time [13]. Since the combined format is the combination of the icons and textual labels [10], the reaction and recognition accuracy of the users to these icons could be significantly increased. In the current study, users noticeably favored image-related icons that incorporated textual labels, i.e., the combined format.
Furthermore, despite the guideline given by the International Standards Organization [29] that the use of abstract symbols should be avoided, results of this study demonstrated that semi-abstract icons, which combine image-related icons (a concrete representation of an action or object) with concept-related or arbitrary icons (an abstract representation of an action or object) [8], are still the most preferred on three function names, a result that is similar to that of Chi et al. [5]. On the functions "avoid touching face" and "avoid travelling to places with known cases", the red circle around the icons and the cross marks might have helped in relating them to their correct function names.
Concept-related and arbitrary icon design formats had consistently low ranking scores. This is understandable considering that concept-related icons outline concepts that are close but are still not the exact concrete image of the action or object, and arbitrary icons have no clear reference to their intended meaning and can only be meaningful and understood through education [4][5][6][7]. As a result, compared to the formats that obtained good ranking scores, the icons designed in these two formats do not have obvious mappings to their referent. Therefore, the connection of the icons designed in these formats to their corresponding function names is harder to distinguish, making them the least favored formats for the users.
The results of Spearman's correlation and Friedman's tests imply that compatibility and clarity of icon elements are the main factors determining a particular icon's preferability. Moreover, the alternative icons with stronger communicativeness in delivering the message should be prioritized for implementation. Furthermore, the high and significant intercorrelations in Spearman's test (see Table 7) between weak-strong, ineffective-effective, and vague-clear also suggest the importance of selecting icons with better clarity to effectively and powerfully deliver the messages related to COVID-19 prevention measures. However, this study also reveals that for specific functions, there is no significant relationship between preference and the tested semantic scales, suggesting that other semantic scales could be included in future studies.
Despite the clear contributions of the study to design guidelines for COVID-19 prevention measure icons, the researchers would like to acknowledge several limitations of this study. First, because the study was conducted in the middle of a pandemic, the authors resorted to gathering the data online, and the online experiment was answered by a total of 57 participants. To produce more comprehensive results, future researchers may consider increasing the number of respondents. It is also recommended to broaden the scope of the current study to come up with more thorough and specific design references for COVID-19 preventive measure icons. Second, this study only focused on understanding the preference of the icon users through the ranking test and the subjective rating test. Aside from considering how the icons satisfy the subjective preferences of the users, future research should also measure how effectively users can accurately interpret the icons' corresponding function names. Finally, future studies should utilize an eye tracker [43] to find the relationship between the results of the ranking test and eye movement behavior to provide more meaningful findings.

Conclusions
Icons have been widely utilized as a tool to promote COVID-19 prevention measures. The purpose of this study was to analyze 133 existing icons of COVID-19 prevention measures published by the health and medical organizations of different countries.
A rank ordering test was conducted for the seven icons representing each function name, followed by a subjective rating test for each respondent's top two icons from the ranking test. Generally, findings from the current study showed that the image presented in the icon is the key point on which the perceived quality of the icon depends [44,45], and the preference of users for the icon may rely on this. In this case, designers may consider the cognitive features of an icon such as its familiarity, its concreteness, the complexity of its design, its meaningfulness, and its semantic distance or its closeness to its intended meaning [20,46,47].
The current study also further supports the view that familiarity and semantic distance should be of primary importance when it comes to selecting icons [22,39]. Interestingly, Spearman's correlation analysis between ranking and semantic scales showed that incompatible-compatible, vague-clear, weak-strong, and abstract-concrete were the four semantic scales most strongly correlated with the preference ranking. In addition, Friedman's tests indicated that compatibility and clarity of icon elements are the main factors determining a particular icon's preferability. These suggest that designers should choose images that are realistic and as closely related as possible to the function represented by the icon. The images should also be simple and straightforward to reduce complexity. Adding elements to graphical or image-related icons, whether textual or arbitrary symbols, is recommended since they may improve users' comprehension of the icons and therefore make them preferable. Icon design formats having less connection to what they actually depict, such as the concept-related and arbitrary formats, should be avoided since they are more challenging to comprehend and are probably not preferred. Another observable result of the study is that color also affects how well liked the icons are. Black and white and grayscale icons obtained low ranking scores, even though they concretely represent their function name. This leads to the conclusion that designers should also consider making the icons colorful so that they may be more visually appealing and likeable.
This study is the first comprehensive study to evaluate the icons associated with the COVID-19 prevention measures. The findings of this study can be utilized as the basis for redesigning icons, particularly for icons related to COVID-19 prevention measures [48]. Furthermore, the approach can also be applied and extended for evaluating other medical icons [49,50], safety icons, disaster-related prevention icons [51], transportation-related icons [52,53], and even entertainment-related icons [54][55][56][57][58].

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: The data presented in this study are available on request from the corresponding author.