On the nature of thought processes and their relationship to the accumulation of knowledge, Part XVI—The process of making a diagnosis

The process of making a diagnosis is a problem-solving activity carried out in the human brain by thinking recursively using a strategy of “explore and exploit” under the condition of uncertainty that arises from multiple sources. Required for success are a knowledge base, a set of reasoning skills, and the ability to obtain the appropriate data in the case of a specific patient. Context and the judicious use of perspective are the keys to minimizing the anxiety that can overcome us in the cognitive gap between information and knowledge. Failures at diagnosis are due to systematic error; they are predictable in nature, if not in time, and potentially avoidable if one understands the process.

Introduction
man is ill and his father has accompanied him, providing moral support. Now imagine you are downtown shopping.
Many people are on the street, going into and coming out of stores. Who has a kidney transplant? Who takes medication to treat high blood pressure? Who has high cholesterol? Most of the attributes we use to identify illness are not directly visible. Even people who have diseases of the skin generally wear clothing that prevents others from noticing a lesion.
Basically, we must decide which attribute is important to notice and how we might best notice the attribute. In medicine, the vast majority of attributes are noticed by indirect means.
We order a blood test or we do a biopsy and stain the fixed tissue to highlight a potential attribute to best advantage.
In addition to the fact that attributes associated with disease may not be readily visible, members of the population, including providers of health care, do not agree on whether an attribute identified is actually within the spectrum of "health" or "disease." Consider the following scenarios.
Consider the other end of the spectrum. Ross Upshur, in "Looking for Rules in a World of Exceptions" [3], writes the following about one of his patients, "Consider the following patient. Mrs G. is 82 years old. When I assumed her care six years ago, she was given a prognosis of six months to live from severe congestive heart failure. Mrs. G has lived beyond her original six-month prognosis. Would one consider her in good health? I don't know. To consider her healthy is not in any way correct. To call her unhealthy is also seemingly inappropriate. I believe she is in equilibrium. [Upshur catalogs Mrs. G's experiences with (both the diseases themselves and
Is a person who has never had a complaint of ill health and who feels well an hour before s/he drops dead from a heart attack healthy? The point I wish to make is that the same definition of health does not apply in every situation. A determination of health depends very much on context and perspective. The context and perspective of the patient, practitioner, and society all interweave to decide who is healthy and who is ill.
Before we proceed to the process of making a diagnosis, we will start with an ideal situation to which we can compare our reality. The ideal-by which we are able to make a correct diagnosis every time
different "facts"). Also, in conjunction with the two previous paragraphs, specific and parsimonious criteria would be defined for each disease and these criteria would be agreed upon by all to be sufficient to make a specific diagnosis such that disease A, and only disease A, is defined by the chosen subset of data selected from all of the data known about disease A.
This item addresses the topic in information theory of "data compression" and relates to the desirability of making a diagnosis with a minimum of ancillary testing (in order to save time and money within the health care system). Additionally, the inner workings of each diagnostician would be logically consistent. Each diagnostician could assume confidently that, as each attribute of the patient is learned, all relevant data from memory would "pop" into mind and allow him/her to follow the algorithm of the process of diagnosis to the diagnosis, which would be correct in every case. Obviously, in the system as a whole, we are nowhere near to our ideal and, in fact, we never can be since the universe, as discussed in earlier essays in this series, is non-deterministic.

Types of uncertainty
There are multiple types of uncertainty and each type has different implications for us in our task of making a diagnosis. Two types of uncertainty are trivial. First, we might have known some fact and forgotten it. Second, perhaps we have not learned a fact yet, but someone else knows. For these two types of uncertainty, all we must do is look up the answer. A third type of uncertainty is real and unresolvable.
It arises from the necessity of learning about populations by sampling. As we all know, when a new ancillary test comes out to aid in the diagnosis of a disease, it is assigned a "sensitivity" and "specificity" based on the performance of the test in the original study population. As we will discuss later in this essay, the attributes of the patient sitting before us do not match exactly the attributes of the study population.
In fact, no two people share identical attributes (not even identical twins). To confound the issue even further, the new test is compared to a "gold standard" test-ideally a test the results of which differentiate perfectly all people who have the disease in question from all people who do not have the disease. Of course, we realize that no existing "gold standard" is ideal.
Our entire system of making diagnoses and utilizing ancillary testing is rooted in the concept that someone knows, somehow, who really has a disease and who does not. This is simply not true. All we have is a group of people; each person of the group has numerous attributes (some of which they share in common with other members of the group and some of which they do not share). We hope we have understood causally the disease in question well enough that a core subset of attributes is shared by group members with the disease and that a similar, but not identical, core subset of attributes is shared by the group members without the disease. We then ask the question, "Does this new test differentiate reliably between the two groups?" Can we use this new test to diagnose reliably a new patient who is not a member of the original study group? Since, by definition, we will use the new test on a patient outside the original test group, uncertainty related to the incomplete overlap of attributes necessarily introduces uncertainty into the diagnosis of every patient we see. This type of uncertainty can be lessened somewhat by ensuring that relevant attributes of the study population are well-known and that the patient on whom we are using the test actually shares the relevant attributes.
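One quantifiable facet of the mismatch between a study population and the patient before us is the pretest probability of disease. The following is a minimal sketch in Python, not a clinical tool: the study counts and the three prevalence values are hypothetical numbers chosen for illustration, and the function names are my own. It shows how the same "sensitivity" and "specificity" yield very different positive predictive values when the test is applied in groups where the disease is common or rare.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of people with the disease whom the test flags as positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of people without the disease whom the test reports as negative."""
    return true_neg / (true_neg + false_pos)


def positive_predictive_value(sens, spec, prevalence):
    """Probability of disease given a positive result (Bayes' theorem)."""
    true_pos_rate = sens * prevalence
    false_pos_rate = (1 - spec) * (1 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)


# Hypothetical study population: 100 people with the disease, 100 without.
sens = sensitivity(true_pos=90, false_neg=10)
spec = specificity(true_neg=90, false_pos=10)

# The same test applied where the disease is common, uncommon, and rare.
for prevalence in (0.50, 0.10, 0.01):
    ppv = positive_predictive_value(sens, spec, prevalence)
    print(f"prevalence {prevalence:.2f} -> PPV {ppv:.3f}")
```

Under these assumed numbers, a positive result means disease nine times out of ten when half of the group is ill, but fewer than one time in ten when only one person in a hundred is ill.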
A fourth type of uncertainty is related to the philosophical concept of "vagueness." Just where do we draw the line?
In an earlier essay in this series we discussed the work of Bart Kosko, in Fuzzy Thinking [4], about assigning in a dichotomous manner attributes that actually occur on a continuum.
In his example of apples, if we have a hundred apples and try to put them into two groups, one of red apples and one of green apples, the color of some apples will be clearly overwhelmingly red or green, but many apples will have both colors and will be more difficult to assign. Examples of this "vagueness" type of uncertainty are encountered in the practice of medicine on a daily basis. For patients with shortness of breath, one must consider whether the problem is more likely to be cardiac or pulmonary in origin. We have available in our diagnostic armamentarium B-type natriuretic peptide (BNP), which is released when the heart muscle is stretched during heart failure. Using BNP as a diagnostic aid works great, doesn't it? If the patient is short of breath and his BNP is less than 100 units, the patient can be safely classified as a pulmonary patient. If the patient is short of breath and her BNP is more than 500 units, the patient can be safely classified as a cardiac patient. Pretty nifty! But what about the patient who is short of breath and has a BNP of 300 units (halfway between 100 and 500)? In this case we simply cannot use this test to make our decision; we must search for other attributes that will help us. And similar examples abound in our daily practice. The uncertainty of "vagueness," however, can be tamed somewhat by defining more carefully the context of the patient. All we must do is identify other attributes that alter the context of the patient and make some of the attributes we have identified more helpful (more dichotomous) towards making a decision. In the new context, attribute A now argues unequivocally either for or against a diagnostic possibility. More about context later.
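The BNP example can be written down as a three-way rule rather than a two-way one. The sketch below is illustrative only: the cutoffs of 100 and 500 units are the ones used in this essay, not clinical guidance, and the function name and return labels are my own.

```python
def classify_bnp(bnp, low_cutoff=100, high_cutoff=500):
    """Sort a BNP value into the essay's three zones.

    Values between the cutoffs (inclusive) fall in the vague middle,
    where this single attribute cannot decide the question.
    """
    if bnp < low_cutoff:
        return "pulmonary origin more likely"
    if bnp > high_cutoff:
        return "cardiac origin more likely"
    return "indeterminate: search for other attributes"


for value in (60, 300, 750):
    print(value, "->", classify_bnp(value))
```

Making the middle zone an explicit third answer, instead of forcing every value to one side of a single line, is one simple way to keep the vagueness visible to the person making the decision.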
Another type of uncertainty is that which arises in the context of experience (or lack thereof) and is the result of making decisions based on "explicit" or "implicit" knowledge. Luchins [5] avers that "explicit" knowledge is analogous to reading written directions to perform some action, while "implicit" knowledge is analogous to how the experience of actually performing the written instructions changes how one performs the action over time (with practice).
An expert (one who has performed many times the action described in the written directions), for example, executes the written directions differently from someone who is following the instructions for the first time. The expert, via feedback gleaned while watching interim events during his/her multiple attempts performing the task, alters slightly his/her interpretation of the instructions and performs the task differently, paying particular attention to one facet or another along the way. Importantly, the results of this feedback are not usually written into a new version of the instructions. In fact, because of the ambiguity of language (as discussed in the essay in this series on Language), it is probably not even possible to write reliably implicit knowledge into written instructions. Each performer of the task learns nuances from repeated performance and, over time, his/her performance improves with continued iterations (the so-called "learning curve"). This fifth type of uncertainty, then, is the difference between the written instructions themselves and the unwritten "value added" to the performance resulting from experience. See below the discussion of heuristics.
Yet another source of uncertainty arises from the fact that all testing is indirect. We often know what we want to know, but we cannot look for it directly. We perform a test and from the results of that test we make inferences about what we really want to know. For example, we often want to know how well the tissues of a patient are oxygenating.
Tests that are used to assess this include a hemoglobin or hematocrit to assess oxygen carrying capacity and the partial pressure of oxygen in the blood. When a patient is relatively healthy, that is to say when most of the patient's physiologic systems are working well, inferences made from indirect evidence work admirably. Suppose a patient comes to the doctor's office complaining of shortness of breath on exertion. If we find that the patient has a lower than normal hemoglobin/hematocrit, we likely assume that is the reason for the symptoms, prescribe an appropriate hematinic agent, and have the patient return in a few weeks to see if the symptoms have improved and the hemoglobin/hematocrit have returned to the normal range. But many of us have likely stood at the bedside of a gravely ill patient with advanced sepsis syndrome. The skin appears dusky or pasty or gray and the hemoglobin is a little low (not low enough to explain the clinical appearance) and the pressure of oxygen is likely adequate. We have obtained our tests to try and see how well the patient is oxygenating, but his appearance itself tells us-he is not doing well. In fact, there is no ancillary test that truly assesses directly how oxygen is being utilized at the cellular level.
And this problem is repeated throughout our practices, day in and day out. Radiology sees "shadows," not tumors; levels of any analyte obtained from a blood sample test the level on a "well-mixed" sample (from a fairly large peripheral vein or artery), making it more difficult to assess a focal process. Additionally, a sample is analyzed in such a way that the analyte usually reacts in a "test system," making the analyte more visible (perhaps with a monoclonal antibody and/or a chromogen). Even when we look histologically at structure, we see artifact induced by us: the process of biopsy wrenches the tissue from the rest of the body, thereby severing its ability to receive messages that direct its function; furthermore, we place it in fixative to ensure it does not deteriorate, and then we slice it thinly and stain it in a variety of ways, using chemical properties to view indirectly one facet or another. Eosinophils, for example, are named after their staining properties, not after any sort of function they may have. Even our own senses "process" raw data and present it to our brains in a different format than received at the receptors of the energy we sense. Then we "recognize" the data after our brain has processed it and sent a conclusion to our consciousness somewhere. We can "work around" the uncertainty arising as a result of vagaries associated with the "indirectness" of testing by performing an additional (indirect) test that examines a different aspect of the problem, thereby obtaining "convergent" evidence.
Convergent evidence comes about when results of tests looking at a problem from different perspectives all support the same hypothesis.
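One common way to express convergence numerically is to update the odds of a hypothesis with a likelihood ratio for each test result. The sketch below is a hedged illustration, not a method taken from the sources cited in this essay: the prior of 0.20 and the likelihood ratios of 3 are hypothetical, the function name is my own, and the calculation assumes the tests are conditionally independent, which tests examining genuinely different aspects of a problem only approximate.

```python
def update_probability(prior, likelihood_ratios):
    """Apply a sequence of likelihood ratios to a prior probability.

    Assumes the test results are conditionally independent given
    the presence or absence of the disease.
    """
    odds = prior / (1 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1 + odds)


prior = 0.20  # hypothetical pretest probability
# Three hypothetical tests, each only modestly supportive on its own.
for count in range(4):
    posterior = update_probability(prior, [3.0] * count)
    print(f"{count} supportive tests -> probability {posterior:.3f}")
```

No single result in this sketch is persuasive by itself, yet three modest results pointing the same way move the probability from 0.20 to well above one half.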
Another type of uncertainty is a type we can only know in hindsight, the so-called "unknown unknowns." For this type of uncertainty we may not even know what questions to ask or how to ask a question. An example of this type of uncertainty occurred relative to subclassifying types of leukemia. Circa 1970, Acute Lymphoblastic Leukemia (ALL) was known and treatable; most children fared pretty well, but a small percentage did not respond as expected to therapy. At the time, the diagnosis was made by observing cells from bone marrow or peripheral blood smeared on a glass slide stained with Wright-Giemsa stain. Shortly thereafter, histochemical staining techniques were developed that could differentiate B lymphocytes from T lymphocytes. It turned out that the patients with B cell ALL responded to therapy much better than the patients with T cell ALL. Before 1970, differences existed between B cells and T cells, but we could not tell the difference. Today we have flow cytometry and cluster of differentiation (CD) markers, and many subsets of lymphocytes can be detected; therapies for each subtype of leukemia or lymphoma have been developed.
Yet another type of uncertainty arises as a result of the ambiguity of language. This topic is addressed more fully in earlier essays in this series on "Interpretation" and "Language." This type of uncertainty can be minimized by paying careful attention to the context of the situation in which words are used and by using standard definitions of the words, appropriate to the context. Context will be discussed in more detail later in this essay.

Can we draw any valid conclusions at all?
Good heavens! If everything we "know" is a derivation of something else, and an inexact datum to boot, how can we make any progress? An important point to make here is that uncertainty is not the same as randomness. While we may not be able to pinpoint exactly, and while we therefore feel uncertain, about some aspect of our work, we can be confident that the "true" answer lies somewhere between a set of limits; thus, the result is not random and completely unpredictable. The most important thing we can do is to make careful observations, check the validity of those observations with others (test inter-observer agreement), consider possible patterns and/or develop hypotheses of "causation" about the set of observations, and then test the competing hypotheses.
That is to say, we should evaluate observations via the process we know as science. Important factors in reasoning are context and perspective. A scientifically-minded human will construct carefully a context and consider different perspectives, reasoning through the data from each of the perspectives, trying to find a "truth." It is also important that multiple people evaluate the same data. The importance of collective efforts is that different people will likely have different perspectives.
Even if two people are considering what they think is the same perspective, the differing prior experiences of the two "thinkers" will likely lead them to a slightly different view of, and conclusion from, the data, and ensuing discussion will likely further expand the joint thinking process. If different people can confirm data or reaffirm proofs of conclusions drawn, it is more likely that the data and conclusions are "true," that is to say conform with principles of Universal Law. It is most important that we take pains to ensure that we are describing a consistent system. Bronowski [6] relates that, as Kurt Gödel and Bertrand Russell have reminded us, in a consistent system there are true things that cannot be proved (Gödel's Incompleteness Theorem), but (Russell) in an inconsistent system one can "prove" anything!
The root of much of the uncertainty we face arises from the situation that we are faced, as a direct result of the nature of the universe in which we find ourselves, with infinite variation around a number of common themes. Human beings are very much alike (each of us shares a large number of the attributes of "humanness"), but additionally, each of us is different in some ways from all others of the set of human beings (the entire set of attributes that describes each of us is different from the entire set of attributes that describes each and every other member of the set "human beings").
Psoriasis, for example, has multiple presentations and features, but all features and presentations share commonalities that, when present, allow us to make the diagnosis and to have certain expectations about treatments that will be efficacious. The task we face is to recognize which subset of attributes represents the commonalities required to make a specific diagnosis among the entire set of attributes that comprise the "infinite variation." Human senses (sight, hearing, touch, etc.) work by synthesizing many bits of data almost instantaneously; in fact, we are not even sure as individuals how we accomplish this feat and are not even aware consciously of many of the bits of energy impinging on our senses. Consider the problem of recognition of human faces by computers. Computers have not done very well, especially in the earliest attempts at programming computers to recognize faces, although progress toward recognition has been made. We humans, on the other hand, usually have little trouble recognizing faces that we have seen previously, even if the face belongs to someone we do not know well. Also, we usually have little trouble recognizing a high school classmate 20 years later at a reunion, even though many features have changed (such as sagging jowls, wrinkles, gray hair, and the like).
Parenthetically, Sacks [7] relates that a small percentage of people have prosopagnosia, the inability to recognize faces. The ability to recognize faces has been very important to humans throughout our evolution because we need to remember who has treated us fairly or unfairly so we can behave appropriately at a future encounter. Good face recognition skills serve a survival advantage.
When computers proved miserable in their first attempts, the human programmers began considering just which features humans consider important. Programs started with photographs, but it turned out that computers could not recognize a recently photographed person who was now tired, for example, because eyelids were puffier or darker than the original photo for comparison. Over time, and with adjustments to programming made possible by careful study by humans of which features are more important than others, computer face recognition has become better. We seem to "know" inherently what features of human faces are important, but we may not be able to articulate what those features are. Our ability to recognize faces represents a heuristic-a "short cut" we use to make decisions.
Much study has been done about heuristics. Are they intuition that can never be defined? If someone says they have a "gut feeling" or instinct, should we trust them? I think it likely that what we call instincts or heuristics can be ultimately defined, if we deem them important enough to study and solve. An elegant example is given by Gerd Gigerenzer in Gut Feelings [8]. He describes the "gaze heuristic." A friend of his played baseball and was very good at catching fly balls. The player's coach thought he was lazy because he would sometimes just trot slowly to catch the ball, and the coach thought he should run as fast as he could to where he had calculated the ball would land. When the player did this, he missed more balls than if he used his usual technique. An assumption, prior to discovering the heuristic, was that play- Obviously, one does not want to "catch" an airplane, as one does want to catch a fly ball.
Gigerenzer further avers that "a simple rule is less prone to estimation and calculation error and is intuitively transparent." That is to say, it is preferable to use heuristics in many situations. However, one should recognize the heuristic as a heuristic and know the mechanism by which it works. Daniel Kahneman, in Thinking, Fast and Slow [9], given the expert access to information stored in memory, and the information provides the answer. Intuition is nothing more and nothing less than recognition.' . . . Valid intuitions develop when experts have learned to recognize familiar elements in a new situation and to act in a manner that is appropriate to it." Thus, expert and accurate intuition is the ability to recognize common themes in the new situation arising in an instance of (infinite) variation. And we must study the behaviors of experts to learn the mechanisms of their heuristics so that they can be shared, thereby, in keeping with the topic of this essay, improving the overall performance of diagnosticians in the healthcare system. If we want to understand how experts think, we must understand how the human mind works.

The workings of the human mind
Basically, we can only think about what pops into our minds. Furthermore, there is a limit to how many items we can ponder simultaneously. What pops into our minds depends on association-that is to say, when we try to recall something, we try to draw an association with something else that helps us remember that item. Items that pop into our minds are likely to arrive there by "similarity matching" ("it looks like . . ."), "frequency gambling" ("I have seen that often lately"), and "recency" ("I just saw that"). That item we are considering reminds us of something else, so we consider whether the new item belongs to the same class that we are reminded of.
We have seen a number of one class in particular, so if we see something that shares (reminds us of) an attribute with something we see often, surely this new item is also of the frequently noted class. Gary Marcus, in Kluge [10], reminds us that we have evolved the processes of thought that we now possess, so those processes must be good enough to enable us to survive long enough to reproduce others of our kind. We can learn and memorize various facts, but we must be able to recall items when we need to. Whether we recall what we need often depends on the context of the situation in which we are trying to think and how closely that context relates to the one in which we learned the fact we now need to recall.
Kahneman
• Detect that an object is more distant than another
• Orient to the source of a sudden sound
• Complete a common phrase, such as "bread and . . ."
• Respond to a horrible picture by making a "disgust face"
• Detect hostility in a voice
Another facet of our minds is that we see what we want or expect to see. In the essay in this series on Patterns, we discussed the work of Erich Harth [11]. He described what occurs in one's brain as one is walking along a beach. One's eye catches sight of a round, shiny object. One's brain tries to make it into a coin, but one's senses can save the day by
Kahneman also points out that System 2 can only work on one problem at a time. Recall from above that maintaining a higher than usual walking speed is a System 2 activity.
Kahneman performed some studies in which, while walking at a fast pace with a test subject, the test subject would be asked to perform a complex multiplication task, such as multiply 17 by 24. Each time, the test subject would slow down to complete the multiplication task. The implication of this is that humans really cannot multitask two System 2 activities at the same time. We might be able to perform a System 1 activity concurrently with a System 2 activity, but not two activities that each require our attention (a defining aspect of a System 2 activity).
Another thing about the way humans think: we tend to think automatically that somebody must know the answer to any question that arises. We may admit, reluctantly, that perhaps we ourselves do not know the answer to some question, but we assume that somebody knows. This is natural in a way. When we are children, our parents or guardians teach us about the world around us. Whenever we do not know something, we ask them and almost always an answer is forthcoming. If they do not know, we are referred to references (dictionaries, encyclopedias and the like) and the answer is there. Even when we have difficulty finding an answer, we assume the answer must be out there somewhere.
It takes a long time, but eventually, especially when we get to graduate school, we begin to learn that some questions do not have satisfactory answers. We detect inconsistencies between the "answer" to this question and the "answer" to another question. We recognize that both answers cannot be true at the same time, and we can find no ready resolution to the dilemma.

Thinking recursively
Another thing about how humans think: we think recursively. Whenever we have a thought, we tend to modify that thought by something else that pops into our minds.
James Reason, in Human Error [12], posits that people solve problems in three basic ways: skill-based, rule-based, and knowledge-based. Skill-based is used most often and relies almost entirely on the automaticity of "System 1." Examples include tying one's shoes, answering a telephone, brushing one's teeth, or riding a bicycle. Once we learn the activity, we hardly think about it. The action just seems to occur without much conscious thought, once we decide to initiate the action. Rule-based refers to following specific rules to an end. Algorithms are a good example of rule-based actions, and we are encouraged often to use algorithms when we practice medicine. For both skill-based and rule-based activities we are familiar with the situation and we know what to expect. The only problems with execution of these activities arise when we are distracted during skill-based actions or when we misidentify the problem and choose an inappropriate rule to execute for a rule-based problem.
We use knowledge-based techniques when we face problems that are new. We have never seen anything quite like the situation we find ourselves faced with. As a result we have to figure out what to do as we go along. At each stage, as we are trying to solve the problem, we ask ourselves, "Are we any closer to the answer?" Interestingly, we solve these problems by imagining a desired result and then trying to get to that result. We say to ourselves, "It looks a little like that other problem I had, so I'll try this maneuver that worked back then." After that step, we reassess and decide whether we seem to be closer to our imagined goal. If so, we continue. If not, we take a step back and try another tack.
It is the process of assessment and reassessment, using the feedback we receive from observing the status of events at each step of our progress and then deciding on the next maneuver based on the information gleaned, that is the recursive process.
Interestingly, this process that we have evolved to use is modeled in manufacturing endeavors as "Good Manufacturing Practice," or GMP. By following GMP and checking the interim product after each step, the firm has an opportunity to make alterations and save a batch of product that, if not manufactured according to GMP, might otherwise be lost.
The point I am trying to make about thinking recursively is that even though thoughts occur to us in succession, we
A basic premise underlying the project was that the strategy used in solving new problems is one of "Explore and Exploit." One thing Mitchell realized was that all possibilities must be potentially available, but they cannot be equally available. For example, counterintuitive possibilities must be potentially available, but must require a cogent reason to be considered strongly enough to warrant committing significant resources for adequate exploration of that possibility. She also realized the importance of keeping a balance between exploration and exploitation. "When promising possibilities are identified, they should be exploited at a rate and intensity related to their estimated promise, which is continually being updated. [recursive evaluation] But at all times exploration for new possibilities should continue. The problem is how to allocate limited resources-. . . be they lymphocytes, enzymes, or thoughts-to different possibilities in a dynamic way that takes new information into account as it is obtained." Mitchell's goal was to write a computer program called "Copycat" (because a premise of the project was that "analogy-making is a subtle form of imitation"). The goal of the program was to start with the example of two given strings of letters, similar but with an alteration, and then to give the problem of a "test" string of letters for the computer to come up with an altered string that was analogous to the example.
One given alteration was "abc morphs to abd." The test was to be open-minded, but the territory is too vast to explore everything; you need to use probabilities in order for exploration to be fair. In Copycat's biologically inspired strategy, early on there is little information, resulting in high temperature and a high degree of randomness, with lots of parallel explorations. As more and more information is obtained and fitting concepts are found, the temperature falls, and exploration becomes more deterministic and more serial as certain concepts come to dominate. The overall result is that the system gradually changes from a mostly random, parallel, bottom-up mode of processing to a deterministic, serial, focused mode in which a coherent perception of the situation at hand is gradually discovered and gradually 'frozen in.'"
It seems to me that Mitchell's program serves as a good analogy for the process we use for making a diagnosis, or for that matter for any problem-solving activity. And because we face infinite variation around a number of common themes, we must use a "knowledge-based" approach to problem-solving more often than we would like to, even if we narrow somewhat early in the process our exploration by recognizing an attribute, or group of attributes, that seem to suggest to us a specific common theme. However, as admonished by Mitchell and her program, we must still "explore," to a small degree, less likely possibilities. After all, if we do not consider, however briefly, a diagnosis, we will never make that diagnosis.
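The role that "temperature" plays in balancing exploration and exploitation can be illustrated with a standard softmax selection rule. This is my own minimal sketch, not Mitchell's Copycat code: the three "promise" scores and the temperature values are hypothetical, and the function name is an assumption made for illustration.

```python
import math


def choice_probabilities(promise, temperature):
    """Softmax rule: turn estimated promise into selection probabilities.

    High temperature gives nearly equal probabilities (exploration);
    low temperature concentrates probability on the most promising
    possibility (exploitation), yet no possibility ever reaches zero.
    """
    weights = [math.exp(score / temperature) for score in promise]
    total = sum(weights)
    return [weight / total for weight in weights]


promise = [1.0, 2.0, 3.0]  # hypothetical estimated promise of three possibilities
for temperature in (10.0, 1.0, 0.1):
    probabilities = choice_probabilities(promise, temperature)
    print(temperature, [round(p, 3) for p in probabilities])
```

At the highest temperature the three possibilities are chosen almost equally often; at the lowest, nearly all attention goes to the most promising one, while the others remain available with a small but nonzero probability, which is the property the essay asks of a diagnostician.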

Dealing with large amounts of data
How do we humans deal with large amounts of data? Is more data always better? We have already learned from Kahneman's work that our minds are lazy. We know there is no way we can learn and use efficiently vast amounts of data. We need shortcuts of some sort. A common thing that we humans seem to want is some sort of "unifying theory." We think in our heart of hearts that if we have the rule, or a small and easily remembered set of rules, we can figure out anything and we will not have to memorize so much and work so hard to make progress. In making a diagnosis, context is created by looking for multiple attributes that make up the set of a specific disease.
A chief complaint can mean many conditions, but by adding attributes, gradually other conditions on the list of differential diagnoses are eliminated. A context is created such that the chief complaint comes to mean only one disease. For the example of substernal chest pain, if we add the attributes of "pressure," radiation to the left arm, no change with breathing cycle, lasts about 20 minutes and goes away, relieved by sublingual nitroglycerin, and aggravated by exercise and exposure to cold, we can exclude the possibilities of musculoskeletal pain, pleurisy, and dissecting aortic aneurysm, but we are still left with the possibilities of cardiac angina, hypertrophic cardiomyopathy, aortic stenosis, and esophageal spasm. If we then add to the context the item of systolic ejection murmur, we narrow the problem down to aortic stenosis.
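The narrowing described in the chest pain example can be written as a simple filter. The sketch below encodes only what the paragraph states (which candidates survive each added group of attributes); the group labels and the function name are hypothetical, and the table is not meant as a clinical rule.

```python
# For each candidate, the attribute groups from the example that it remains
# compatible with, exactly as the paragraph describes the narrowing.
# "anginal pattern" stands for the first group of attributes (pressure,
# radiation to the left arm, relief with nitroglycerin, and so on).
DIFFERENTIAL = {
    "musculoskeletal pain": set(),
    "pleurisy": set(),
    "dissecting aortic aneurysm": set(),
    "cardiac angina": {"anginal pattern"},
    "hypertrophic cardiomyopathy": {"anginal pattern"},
    "esophageal spasm": {"anginal pattern"},
    "aortic stenosis": {"anginal pattern", "systolic ejection murmur"},
}


def narrow(candidates, observed):
    """Keep only candidates compatible with every observed attribute group."""
    return sorted(
        name for name, compatible in candidates.items() if observed <= compatible
    )


step0 = narrow(DIFFERENTIAL, set())
step1 = narrow(DIFFERENTIAL, {"anginal pattern"})
step2 = narrow(DIFFERENTIAL, {"anginal pattern", "systolic ejection murmur"})
```

Each added attribute group shrinks the list: seven candidates, then four, then one. The context built from the attributes, not the chief complaint alone, determines the final answer.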

Data and information; context and perspective
I have used frequently the terms "data," "information," "context," and "perspective." How are the meanings of these terms related? A "datum" is a fact that has not yet registered in a human brain. Once a datum registers in a human brain (once a person is actually paying attention to a datum), it becomes "information." James Gleick, in The Information [15], mentions the lamentation of Heinz von Foerster during an early cybernetics conference, who complained " . . . that information theory was merely about 'beep beeps,' saying that only when understanding begins, in the human brain, 'then information is born-it's not the beeps.'" Thus, a datum becomes information when the human mind, using System 1, begins to associate the datum with other information/data stored in the human brain. "Context" is the "system" of interacting data/information, being considered recursively by the thinking human.
"Perspective" is a sort of "lens" through which a thinking human considers the data/information and context. Perspective can be purposely altered to a certain extent, although as mentioned in the earlier essay in this series on Interpretation some of our bedrock beliefs are so ingrained in our worldview that we are not consciously aware of how we came to believe them and we may not be able to consider some perspectives that would require considering those beliefs to be false. An example of a datum not registering in the brain of a diagnostician, and thus not becoming information, might be examining the results of a complete blood count. The diagnostician glances at the entire sheet of data, but perhaps only registers the hemoglobin, hematocrit, total white blood cell count, and platelet count, paying no attention to the mean corpuscular volume, mean corpuscular hemoglobin, red cell distribution width, or mean platelet volume. All those data values are reported by the lab on the report, but diagnosticians may not pay attention to them in a specific patient case.

Making a diagnosis
So, if we cannot make a diagnosis under ideal conditions, how do we do it? The ground rules still apply. That is to say we still require a knowledge base, a set of reasoning skills, and the ability to acquire necessary data in the case of an individual patient.
Considering how the human mind works, we must wait until some possible disease state pops into our minds. We gerer with a complaint that seems improbable in the context of additional data), that diagnostician begins using System 1 and starts to make associations and draw data from stored memory into working memory.
In addition to the fact that we must wait for some idea to pop into our mind based on what we see, there is the fact that "we notice what we notice" and nothing more. Ian Stewart, in The Mathematics of Life [16], discusses the work of taxonomists. States Stewart, "Taxonomists quickly learned that the most important features for classification were seldom those that immediately attracted the attention of the human observer. . . . Which characters are best suited for classifying organisms? Tigers and Zebras are both striped, but that doesn't imply that they are closely related. In fact, tigers and zebras do not belong to the same genus, to the same family or even to the same order. Tigers are in the order Carnivora Still explaining taxonomy, Stewart also reminds us, "One of the first steps in the development of any branch of science is to find a way to organize the wealth of observations that nature presents to us, and this is especially necessary in biology, because of the vast diversity of life." Stewart describes the use by taxonomists of cladograms, diagrams that relate branch points and their timing and that describe shared attributes and the time during evolution that the attributes split or diverged (were no longer shared by subsequent [new] groups). Each "clade" represents an ancestral organism with all of its evolutionary descendants.
Stewart mentions that constructing a clade "involves three steps: collect data on the organisms concerned, think about suitable cladograms, and choose the best of these." From the collected data, a set of characters is selected and the candidate organisms are assigned a value for having (1) or not having (0) each attribute. Then the data are assessed as to how many organisms have the highest percentage of attributes, how many a smaller percentage, and so on. Organisms more closely related share more attributes and those sharing fewer attributes are less closely related. The data are then fed into a computer and the computer generates possible cladograms. The computer then analyzes the data statistically to see which cladogram is the best fit. Then, starting with the values generated, the computer re-runs the data multiple times until there is no significant difference between the previously run cladogram and the subsequent one. The process is re-run, using different attributes. The goal, says Stewart, is to find convergent evidence: "We can be very confident if different data, analyzed by different methods, lead to similar results." I believe this is similar to the process of figuring out which attributes associate to define a disease. If we consider the example above of substernal chest pain, we can see that a certain number of attributes are shared by the different disease entities that make up our differential diagnosis, but as the problem is considered in the context of different data (new attributes added to the mix), some possibilities become less likely (less closely related to the definition of the disease). Also, consider how the understanding of various disease processes has evolved over time and how new attributes are added to the armamentarium in order to better classify a disease. We have iterated the process of defining a disease over decades, each new study helping to find a better definition of disease, a definition that will hopefully differentiate that disease from all others.
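The scoring step in Stewart's description (assign 1 or 0 for each character, then compare how many attributes organisms share) can be sketched as follows. The characters and values are a toy illustration invented here, not data from Stewart's book, and the sketch covers only the scoring step, not the generation or statistical comparison of cladograms.

```python
from itertools import combinations

# Toy character matrix (an assumption for illustration): 1 = has the
# character, 0 = does not. Column order follows CHARACTERS.
CHARACTERS = ["striped", "hooved", "odd_toed", "retractable_claws"]
ORGANISMS = {
    "tiger": [1, 0, 0, 1],
    "zebra": [1, 1, 1, 0],
    "horse": [0, 1, 1, 0],
}


def shared_count(a, b):
    """Count the characters that both organisms possess."""
    return sum(x & y for x, y in zip(a, b))


def pairwise_shared(organisms):
    """Map each pair of organisms to the number of characters they share."""
    return {
        (first, second): shared_count(organisms[first], organisms[second])
        for first, second in combinations(sorted(organisms), 2)
    }


shared = pairwise_shared(ORGANISMS)
```

In this toy matrix the tiger and the zebra share only the eye-catching character (stripes), while the zebra and the horse share two less conspicuous ones, which echoes the point that the most noticeable feature is seldom the most useful for classification.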
Of course, contrary to the ideal conditions for making a diagnosis, the knowledge base of each of us is limited, the knowledge base itself has items lacking because many diseases are not fully understood, and some of our current "knowledge" will prove incorrect, perhaps because data are missing or because we have made incorrect inferences about the data we have.
The best we can do, really, is to consider a differential diagnosis based on the presenting situation, either chief complaint or observation of some aspect of the patient. The practice of medicine is a "team sport" (not necessarily a team working at the same time and on the same patient, but collectively and over time we physicians share knowledge about a population of patients), so we had best consider a differential diagnosis that is listed by an authority for the presenting situation. An authority might be a text or consensus statement from a professional society, for example. Then, in order to figure out the most important additional data to obtain, we must understand the clinical definitions of the disease entities on our list of differential diagnoses. This is in a way analogous to using a stratagem in solving a Sudoku When considering the items on our list of differential diagnoses and when looking for data that answers the question, "Does this patient have the features required to diagnose clinically Disease 'A,' Disease 'B,' or Disease 'C'?" we must ask the question from the perspective of each disease on the list of differential diagnoses. We must say to ourselves, "Feature One is a sign (favors the diagnosis) for Disease 'A,' but a countersign (argues against the diagnosis) for Disease 'B,'" and so forth, going through each datum from the perspective of each disease. Then the disease that has the fewest countersigns is the most likely diagnosis. Of course, any very important countersigns may cause us to broaden our list of differential diagnoses, considering a less common disease as the cause. With the data we have, we engage in an episode of recursive thinking, going back and forth in our minds between possibilities until we finally decide on one: the diagnosis.
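The sign and countersign bookkeeping described above can be sketched in a few lines. The diseases, features, and labels below are placeholders in the spirit of "Disease 'A'" and "Feature One"; breaking ties by the larger number of signs is an assumption added here, since the text ranks only by the fewest countersigns.

```python
from collections import Counter

# Hypothetical profiles: how each feature looks from the perspective of
# each disease on the differential list.
PROFILES = {
    "Disease A": {"feature_one": "sign", "feature_two": "sign", "feature_three": "countersign"},
    "Disease B": {"feature_one": "countersign", "feature_two": "countersign", "feature_three": "sign"},
    "Disease C": {"feature_one": "sign", "feature_two": "sign", "feature_three": "sign"},
}


def tally(findings, profile):
    """Count signs and countersigns among the patient's findings for one disease."""
    counts = Counter(profile.get(finding, "neutral") for finding in findings)
    return counts["sign"], counts["countersign"]


def rank_differential(findings, profiles):
    """Order diseases by fewest countersigns; ties go to more signs (assumption)."""
    scored = []
    for disease, profile in profiles.items():
        signs, countersigns = tally(findings, profile)
        scored.append((countersigns, -signs, disease))
    scored.sort()
    return [disease for _, _, disease in scored]


ranking = rank_differential(["feature_one", "feature_two", "feature_three"], PROFILES)
```

The same findings are evaluated once per disease, which is the point of asking the question "from the perspective of each disease" rather than collecting only confirmatory evidence for a favorite.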
There are two very important points to keep in the back of our minds when we are making a diagnosis. First, disease entities do not have differential diagnoses. Disease entities share attributes and groups of attributes. Only the attributes and groups of attributes themselves have differential diagnoses.
This may seem like a nit-picking distinction, but it is crucial. Think about times you have read an article about one disease or another in the context of thinking about a particular patient. What happens when you get to the list of supposed differential diagnoses for that disease? You probably think to yourself, "I would never have considered that disease in this patient." And then what? If you believe that diseases have differential diagnoses, just because that disease was on the list of differential diagnoses you might order another test or two to rule out the disease that you were not even considering for that patient. But if you look at it from the perspective of the previous paragraph and think about which attribute or attributes the patient has that led you to read the article to begin with, you can then decide rationally whether the disease on the list you were not considering initially should be considered (shares the attribute(s) with the disease described in the article and with your patient) or whether the disease described in the article shares other attributes with the disease listed as a differential diagnosis, but other attributes that your patient does not exhibit; that is to say, you should not consider that disease on the differential list as a possibility for your patient. Disease states can be considered sets of attributes. Each disease-set has many attributes and only a subset (of the disease-set as a whole) of the attributes serve to define the disease for diagnostic purposes. For example, fever is a common symptom and is an attribute shared by many diseases, but usually we recognize that fever means "the patient is ill" and we look for other attributes that define more closely the nature of the disease. 
Periodicity of the patient's fever may suggest malaria, for example, or presence of a rash accompanying the fever may suggest measles as another example; and usually we order an ancillary test, perhaps a Wright-Giemsa stained smear of peripheral blood to look for malarial parasites.
When considering the concept of disease-sets when making a diagnosis, it can be useful to consider Venn diagrams, whereby one looks for areas of overlap between sets. Looking for attributes in the area of overlap, that is to say looking for shared attributes, does not help us differentiate between disease-sets. We must look for attributes outside the area of overlap to differentiate between two (or more) diseases.
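The Venn diagram idea maps directly onto set operations. The sketch below uses the fever example from the preceding paragraphs; the two attribute sets are deliberately minimal illustrations and are not clinical definitions of either disease.

```python
# Minimal illustrative disease-sets drawn from the fever example.
malaria = {"fever", "periodic fever pattern"}
measles = {"fever", "rash"}

# Shared attributes lie in the overlap and cannot separate the two diseases.
overlap = malaria & measles

# Attributes outside the overlap are the ones worth looking for.
discriminating = malaria ^ measles
only_malaria = malaria - measles
only_measles = measles - malaria
```

Here the overlap contains only fever, which tells us the patient is ill but not which disease is present; the attributes in the symmetric difference are the ones that differentiate the two sets.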
The second very important point is that there is one inviolable rule that applies to the process of making a diagnosis. items are class C, if one pulls one item "out of the hat (the hat representing the attribute or group of attributes in question)" at random, the likelihood is much higher that the item will be of class A than of any other class. Items of class A are ten times more likely than class B and 100 times more likely than class C.
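The ratios quoted above can be turned into probabilities with a short calculation. The sketch assumes that classes A, B, and C are the only items in the hat and uses only the stated ratios (A is ten times more likely than B and 100 times more likely than C); the variable names are hypothetical.

```python
from fractions import Fraction

# Relative weights implied by the stated ratios: A : B : C = 100 : 10 : 1.
# Assumption: no other classes are in the hat.
weights = {"A": 100, "B": 10, "C": 1}
total = sum(weights.values())

# Probability of drawing each class at random.
probabilities = {name: Fraction(weight, total) for name, weight in weights.items()}
p_a = float(probabilities["A"])
```

Under these assumptions a random draw is of class A about 90 percent of the time (100/111), class B about 9 percent (10/111), and class C under 1 percent (1/111).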
As we add attributes to the set that represents the attributes of our patient, some differential diagnostic possibilities drop out of contention because those deleted diseases have attributes that exclude them from consideration or because they do not have an attribute or attributes required to make a diagnosis, leaving fewer and fewer possibilities. For a disease entity defined precisely, when the attributes of the patient match the set of attributes that make up that disease, the prevalence of the disease in a population of patients sharing all the attributes of our patient approaches 100 percent (of course there will always be some disease not yet discovered that cannot be excluded or some disease we failed to consider; thus the prevalence cannot reach 100%).
The problem, of course, in using this inviolable rule arises from trying to assign the attribute to begin with, particularly if that attribute is "a matter of degrees," like the red and green apples or patients with shortness of breath with a measurement of B-type Natriuretic Peptide level. Attributes that lie on a continuum can only be assigned as "yes/no" in context, and by a process of recursive analysis. We must continuously "shuffle" or "juggle" the data, considering slightly new perspectives with each additional datum thrown into the mix until any change in the "apparent answer" becomes insignificant. When the degree of change has become insignificant, then we have "homed in" on a "common theme," the insignificant change representing part of the infinite variation that is an integral part of the system in which we live and work.

Failure in making a diagnosis
Using the analogy of Explore and Exploit, we can fail at a number of places on the road to making a correct diagnosis.
We can fail to recognize an attribute and not even realize a diagnosis should be made. We can "home in" on an attribute that is not important, or fail to follow up by looking for more important attributes (from the standpoint of accurate classification) and end up making a wrong diagnosis. We can look for "confirmatory" attributes only (from the perspective of only one potential diagnosis on our list of differential diagnoses), failing to rule out other items on the list of differential diagnoses. We must remember that we diagnosticians are human beings first and diagnosticians second. We think just like all other human beings. We must remember to engage System 2 to ensure we are not making a careless mistake.
In the end, every human being classifies items many times a day, but as diagnosticians, our classifications affect another human life and we must take care when we perform our task that we have taken reasonable precaution to avoid the errors common to the process of classification, errors that are most often due to "signing off" too soon on the work of System 1.

Conclusion
The process of making a diagnosis is analogous to a complex adaptive system and, therefore, principles that apply to complex adaptive systems apply to the process of making a diagnosis. Complex Adaptive Systems are composed of interacting elements, each part doing a different job, but each part integral to the outcome. While we can observe and study elements, we can not observe and study interactions in the same way. We can only observe outcomes, that is to say changes to an element, that result from the interaction(s). Furthermore, no one element controls the system; however, any one element can affect all other elements. Some of the elements that make up the system that is the process of making a diagnosis include the subsystems of the human mind (perception, interpretation, imagination, classification, attempts to construct coherence between items of information, knowledge base, ability to recall stored items into working memory, and in general all the items attributed by Kahneman to System 1 and System 2) and the relationship of each human's mind to culture and society (including our ability collectively to study disease processes and our collective understanding of the concepts of health and disease). Any one of these elements can, and does, influence the ultimate diagnosis of a given patient at a given time.
Complex systems, by definition, are not completely predictable. To review from Edelman and Tononi [17], "Only something that appears to be both orderly and disorderly, regular and irregular, variant and invariant, constant and changing, stable and unstable deserves to be called complex." That interactions themselves are not observable directly explains in large part the unpredictability that occurs in complex systems. Systematic errors occur in all systems and these errors are predictable, but the timing of these errors is not predictable (although systematic errors do not necessarily occur frequently; errors occur secondary to interactions that stress the system in some way). If the process of thinking is a system, there are predictable systematic errors that occur during processes of human thought, such as confirmation bias, failure to consider viable potential diagnoses (premature closure), and the like. Life itself is a complex adaptive system of which each of us is a member, and as a result we must expect rules to change. Some rules, basic rules, do not change: the mechanism of the hydrogen bond, for example.
But strategic rules can change and we must be on our guard, continually looking and assessing "outcomes" to see if we expect them or whether we ourselves should change a strategic rule. It has been said we live in the Age of Information.
We all think we know what that means: that there is "data, data everywhere." We think that too much information is Gleick continues, "Another way to speak of anxiety is in terms of the gap between information and knowledge. A barrage of data so often fails to tell us what we need to know. Knowledge, in turn, does not guarantee enlightenment or wisdom . . . It is an ancient observation, but one that seemed to bear restating when information became plentiful-particularly in a world where all bits are created equal and information is divorced from meaning." It seems that the only way left to us to make progress and to diminish the anxiety (felt by all diagnosticians) associated with the glut of information is to leave the Age of Information behind us and to enter the Age of Context and Perspective. By entering the Age of Context and Perspective, "all bits" will no longer be equal and "information" will no longer be "divorced from meaning." Each of the Ages of Mankind has formed the foundation of the next age. Principles learned in the Stone Age persisted in the Iron Age and on up through the ages of Agriculture and Industry. Information will not disappear if we leave the Age of Information. On the contrary, "information" can only gain meaning and actually inform us if we use that information "in context" and look at the information "from a variety of perspectives." By doing so, we can make a decision that makes the most sense in the context of what proves to be the best perspective.
As Minsky has reminded us, things that are easy for humans to do are sometimes difficult to study because they are so easy and it is not clear to us what we actually do, but by perseverance, we can make progress. A computer program (Copycat) can make an analogy, one of the most basic cognitive tasks humans do. Scientists have figured out the mechanisms of some heuristics, reliable shortcuts to making correct decisions quickly. Although we may not know how or why we know something, Simon avers that a mechanism exists, if only we look for it, and the technique can be taught.
We can put our minds to the problem of using context and perspective more often and more appropriately, thereby improving our efforts at diagnosis. While Uncertainty cannot be banished from our existence, we can ease the anxiety resulting from the gap between information and knowledge.

Summary
The process of making a diagnosis is a problem-solving activity carried out in the human brain by thinking recursively using a strategy of "explore and exploit" under the condition of Uncertainty that arises from multiple sources. Required for success are a knowledge base, a set of reasoning skills, and the ability to obtain the appropriate data in the case of a specific patient. Context and the use judiciously of perspective are the keys to minimizing the anxiety that can overcome us in the cognitive gap between information and knowledge. Failures at diagnosis are due to systematic error and are both predictable in nature, if not in time, and avoidable potentially if one understands the process.