Planning a clinical research study

In planning any research protocol, we should consider two questions: 1. Is there a real need for the trial? 2. Is the study design and methodology robust? We focus on the second issue: study validity.

A randomized controlled trial (RCT) is the most valid of the clinical research designs. It is a prospective study where allocation to the treatment groups is random. Recently, RCTs have become widespread in the medical literature. In 1998, more than 12000 RCTs were being published each year, more than double the annual publication rate of just a decade previously. 1 This growth can be traced to the growing acceptance of RCTs as the most reliable experimental design for investigating therapeutic interventions. 2 Although preferred, RCTs are just one of many research designs [Table 1].

While outside factors such as cost or time may influence the choice of design, the most suitable research design is dictated by the research question being asked. 3 For example, it would be unethical to randomize patients to an exposure suspected of being harmful. A cohort study would be an appropriate and ethical design to answer such a question. Nonetheless, for questions of therapy, RCTs have moved to the top of what is known as the therapeutic hierarchy [Table 2]. The validity of the evidence is highest for a single, large randomized trial. 4 Randomization limits bias and controls for unknown prognostic variables. 5 Careful deliberation of some simple questions can help to ensure the validity and applicability of the results [Table 3].

HOW WILL POTENTIAL SOURCES OF BIAS BE AVOIDED?

Bias is "a systematic tendency to produce an outcome that differs from the underlying truth". Bias in clinical trials falls into four categories: selection bias, performance bias, detection bias and attrition bias [Table 4].

Selection bias
The goal when enrolling patients is to create comparison groups that are similar with respect to all known or unknown confounding factors. This is accomplished by randomizing patients. Reviews comparing randomized with observational studies have found that a lack of randomization can lead to both underestimation and overestimation of the treatment effect. 8 The process of randomization depends on two procedures: generation of an allocation sequence and allocation concealment [Table 5].

Randomization
Fundamental to RCTs is the random allocation of patients to comparison groups. 9 Nonrandom methods of allocation subvert the whole purpose of an RCT. Some methods, described as "pseudorandomization", 10 allocate patients by chart number, date of presentation or alternating assignment. These carry the risk of introducing bias into your study. As an example, in some populations the day of the week on which a child is born is not a completely random event. 11 There is also the risk of compromising allocation concealment if your allocation sequence is predictable.
While there are complex methods of generating an adequate allocation sequence, the most elegant and simple designs are underused. These include a table of random numbers or a computer-generated sequence.
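As a minimal sketch, a computer-generated sequence might look like the following (Python, with illustrative arm names; a real trial would generate and hold the sequence centrally):

```python
import random

ARMS = ("treatment", "control")  # illustrative arm names

def allocation_sequence(n_patients, seed=2024):
    """Computer-generated allocation sequence via simple randomization.

    Seeding the generator leaves an auditable record: unlike coin-tossing,
    the exact sequence can be regenerated and checked at a later date.
    """
    rng = random.Random(seed)
    return [rng.choice(ARMS) for _ in range(n_patients)]

print(allocation_sequence(8))
```

Because the sequence is fully determined by the seed, an auditor can reproduce it exactly, which addresses the paper-trail problem of manual methods.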
Groups are more likely to be balanced as the sample size increases when using a random number generator. For example, with a sample size of 20 patients, investigators should expect that roughly 10% of the sequences generated via simple randomization would yield a ratio imbalance of three to seven or worse. 12 Manual methods of randomization such as coin-tossing or dice are technically acceptable, but they leave no paper trail and so cannot be checked at a later date.
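The roughly 10% figure can be checked by simulation; this sketch counts how often coin-flip randomization of 20 patients produces a split of 6 vs 14 or worse (the parameters and function name are illustrative):

```python
import random

def imbalance_fraction(n=20, trials=20_000, seed=42):
    """Estimate how often simple (coin-flip) randomization of n patients
    produces a group ratio of 3:7 or worse (for n=20, a 6 vs 14 split)."""
    rng = random.Random(seed)
    lo = round(0.3 * n)  # the smaller group in a 3:7 split, i.e. 6 of 20
    worse = 0
    for _ in range(trials):
        n_treatment = sum(rng.random() < 0.5 for _ in range(n))
        if n_treatment <= lo or n_treatment >= n - lo:
            worse += 1
    return worse / trials

print(imbalance_fraction())  # close to the exact binomial value of about 0.115
```

The exact two-sided binomial probability is about 11.5%, consistent with the "roughly 10%" cited above.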

Concealment of allocation
A proper allocation concealment scheme keeps investigators and patients unaware of upcoming assignments. In an ideal world, allocation concealment would be unnecessary and patients would enter into the trial groups to which they were originally assigned. It is important to realize, however, that the process of randomization often frustrates clinical inclinations. In cases of poor allocation concealment (for example, posting of the allocation sequence), knowledge of upcoming assignments could lead to the exclusion of patients or could compromise adherence to an RCT protocol. In these cases, even good attempts at allocation concealment may be subverted, as was the case in one study where residents held envelopes up to bright light to decipher upcoming assignments, to avoid hassling their attendings with the more involved treatment late at night. 13 The importance of allocation concealment in protecting against bias has been shown in a study that found greater heterogeneity in trials with improperly concealed allocation. 14

Development of a robust method of allocation concealment requires thought and effort. In addition to the demands of day-to-day medicine, which frequently trump the desire to maintain good research methodology, one must also contend with human nature and the natural inclination of some to decipher the concealed allocation for curiosity's sake alone.
When designing a trial, the use of additional elements to ensure that your concealment scheme is tamper-proof is advised [Table 6]. 15
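One common tamper-resistant approach is central randomization, where assignments are released only after enrolment is recorded. A minimal sketch of the idea (class and field names are illustrative, not from the article):

```python
import random

class CentralRandomizer:
    """Sketch of central randomization supporting allocation concealment.

    The sequence is generated up front but kept hidden; an assignment is
    revealed only after a patient has been irreversibly enrolled, so
    knowledge of upcoming assignments cannot influence recruitment.
    """

    def __init__(self, n_patients, arms=("treatment", "control"), seed=7):
        rng = random.Random(seed)
        self._sequence = [rng.choice(arms) for _ in range(n_patients)]
        self._log = []  # audit trail of (patient_id, arm)

    def enrol(self, patient_id):
        """Record enrolment first, then reveal the next assignment."""
        arm = self._sequence[len(self._log)]
        self._log.append((patient_id, arm))
        return arm

trial = CentralRandomizer(4)
print(trial.enrol("pt-001"))
print(trial.enrol("pt-002"))
```

The design choice is that recruiters never see the sequence itself, only the assignment for a patient who is already in the trial.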

Performance bias and detection bias
Performance bias arises when the treatment assignment is known to patients or caregivers; detection bias arises when outcome assessors or data analysts are similarly aware. They can be considered together, since the solution for both is the same. Blinding is the process of ensuring that such parties are kept unaware of whether patients have been assigned to a treatment or a control group. Without blinding securely in place, an RCT is vulnerable to bias from a number of sources [Table 7]. 16

Blinding to prevent personal bias from clouding judgment is especially important when assessing subjective outcomes. One study has shown that nonblinded assessors were more likely to see benefit from an intervention than blinded assessors. 17 Blinding of certain parties may be impossible in some trials; for example, it may not be possible to blind caregivers or outcome assessors in surgical trials. The absence of blinding does not preclude a methodologically strong RCT. Use of objective outcome measures, or assessment by a third party not involved with the RCT, are viable methods to avoid bias when blinding of outcome assessors is not possible.

Sometimes the administration of a noneffective treatment can have a positive effect on outcomes because the patient believes it will work. This phenomenon is known as the placebo effect. Aside from helping to compensate for the placebo effect, use of a placebo in the control group is an important aspect of blinding. Patients and physicians would quickly discern allocation assignments if the treatments given to the comparison groups were readily observed to be different. Whenever possible, an inert but otherwise identical placebo should be used.

Attrition bias
Throughout the course of a trial, there will be participants who deviate from the study protocol and those who drop out and refuse any further participation. This population of patients may differ in a relevant and systematic way from the patients who have adhered to the trial protocol. As an example, patients may have dropped out and become unavailable for further follow-up due to acute exacerbations of their illnesses. 18 Likewise, it would not be surprising if those patients who suffered the most serious side-effects were those who chose to deviate from the study protocol. For these reasons, the analysis should include all randomized patients, not just those who adhered to the treatment protocol. In addition, all patients should be analyzed according to the groups to which they were originally allocated, regardless of what treatment they actually received. This type of analysis is known as intention-to-treat and guards against the introduction of attrition bias. 19 However, exclusion from the analysis is sometimes unpreventable; this occurs if some participants become lost to follow-up before outcomes can be recorded. In such circumstances, it is important to report explicitly the number of subjects excluded and to discuss the possibility of attrition bias in the written report. Strategies to maximize patient follow-up are presented in Table 8. 19 Tips for avoiding bias in a clinical trial are presented in Table 9.
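As a minimal sketch of how intention-to-treat differs from analyzing patients by the treatment actually received (the patient records and field names here are hypothetical):

```python
# Hypothetical trial records: each patient has the arm they were assigned,
# the treatment actually received, and an outcome (1 = improved).
patients = [
    {"id": 1, "assigned": "treatment", "received": "treatment", "improved": 1},
    {"id": 2, "assigned": "treatment", "received": "control",   "improved": 0},  # crossed over
    {"id": 3, "assigned": "control",   "received": "control",   "improved": 0},
    {"id": 4, "assigned": "control",   "received": "treatment", "improved": 1},  # crossed over
]

def improvement_rate(records, arm, by="assigned"):
    """Proportion improved in an arm, grouping by 'assigned'
    (intention-to-treat) or by 'received' (as-treated, vulnerable to bias)."""
    group = [r for r in records if r[by] == arm]
    return sum(r["improved"] for r in group) / len(group)

# Intention-to-treat: analyze by the originally allocated group.
print(improvement_rate(patients, "treatment", by="assigned"))  # 0.5
print(improvement_rate(patients, "treatment", by="received"))  # 1.0
```

In this toy data the two analyses disagree sharply, which illustrates why deviations from protocol can distort an as-treated comparison.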
SAMPLE SIZE, HYPOTHESIS-TESTING AND STUDY POWER

The goal of any RCT design is to use the smallest sample size necessary to attain a prespecified level of power to detect an effect of interest. 20 Power is just one factor to consider when determining sample size. It is not the intent of this article to show how sample size calculations are derived; the focus will instead be on the four key factors that must be considered in all sample size formulae [Table 10]. 21

When testing a hypothesis, we risk making two types of fundamental errors [Table 11]. 22,24 Type I errors occur when we conclude that the treatment had an effect when it in fact did not. The probability of making a Type I error is known as the significance level of the test and is denoted α. Type II errors occur when we conclude that the treatment had no effect when in fact it did. The probability of a Type II error is denoted β. Power is 1-β and represents the probability of avoiding a false-negative conclusion.

Typically, α is set at 0.05 and β at 0.20, giving rise to a power of 0.80. Stated in words, this means that we're willing to accept a 5% chance of making a false-positive conclusion and that we have an 80% chance of detecting a difference between comparison groups, if a true difference exists.

Because the significance level (α) and power (1-β) of the test are conventionally fixed at α = 0.05 and 1-β = 0.80 respectively, our influence on the sample size comes from our estimations of variance and effect size. Variance will depend upon the population under investigation and the reliability of the tool being used to measure outcomes.
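How the four factors interact can be sketched with the standard normal-approximation formula for comparing two means, n = 2(z(1-α/2) + z(power))² σ² / δ² per group; the function name and default values below are illustrative:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size, sd, alpha=0.05, power=0.80):
    """Sample size per group for comparing two means (normal approximation).

    Shows how the four ingredients (significance level, power, variance,
    effect size) drive the required sample size.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / effect_size ** 2
    return ceil(n)

# Halving the detectable difference quadruples the required sample size:
print(n_per_group(effect_size=0.5, sd=1.0))   # 63 per group
print(n_per_group(effect_size=0.25, sd=1.0))  # 252 per group
```

Tightening α or raising the target power likewise inflates n, which is why these two are usually fixed by convention while variance and effect size are estimated.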
Variance and effect size have opposite effects on sample size. As the variance increases, the necessary sample size increases as well. This can be illustrated by imagining a population where the variance was zero, which is to say that each member of the population was identical. In this case, the sample size could be very small and still be a good representation of the population. Conversely, as the effect size (the magnitude of the difference between comparison groups for any characteristic) increases, the necessary sample size decreases. The larger the effect size, the more easily it would be detected, so it makes sense intuitively that fewer subjects would be required.

Estimations of both variance and effect size can come from historical data and from examination of similar populations. While much subjective judgment is involved, it is important to temper optimism when making these estimations. Overestimation of the effect size will result in too few subjects and an RCT that is under-powered. 23 It may be worthwhile to undertake a pilot study to ensure that your estimations of variance and effect size are realistic. A pilot study may also help in predicting the anticipated rates of noncompliance and loss to follow-up. Again, failure to account for these factors will lead to a decrease in the effective sample size. The resulting study would then lack the power to impact clinical practice and research in a meaningful way. 24,25

WILL THE RESULTS BE APPLICABLE?
The second half of this article deals with the issues of applicability and clinical utility. A study is said to have good external validity if its results will generalize to the larger population.
Has sufficient account been taken within the study design of the issues of generalizability and representativeness?
The trial setting is often a source of concern regarding generalizability. Physicians in primary care often wrestle with the applicability of RCT results obtained in tertiary and secondary centers. 26 Often, primary care patients suffer numerous comorbidities that would have been exclusion criteria in the very studies that examine the efficacy of the therapies relevant to them. 27

The differences between countries with regard to their demographics and healthcare systems can also affect external validity. Racial differences can affect the natural history of, or susceptibility to, a disease. 28 Regional practices in the diagnosis and treatment of the same disease may be strikingly different. This can lead to differences in the use of adjuvant, nontrial treatments. For example, in an international RCT of aspirin and heparin for acute ischemic stroke, glycerol was used in 50% of the 1473 patients in Italy versus 3% elsewhere. 29 In addition to adjuvant therapies, consideration should also be given to the generalizability of the entire treatment protocol. To have broad applicability, the RCT protocol should diagnose and manage patients pretrial and posttrial in a manner that mirrors actual clinical practice. 30

Is the trial population reflective of the target population so that the results will have meaning?
To maintain external validity, it is important that the sample population be representative of the whole. For many reasons, this may not be the case. To begin with, recruiting for trials is often undertaken by specialists in tertiary care centers. From the outset, this group of patients will differ from those being managed in the community by primary care physicians. Often, this threat to validity can never fully be eliminated, since a certain proportion of the population never presents at a location or time that is conducive to entry into a trial. However, attempts to rectify it can be made by sampling before other selection pressures impose themselves. A trial's eligibility criteria are then applied to arrive at an even more selective group. Attempts to remove confounding factors and diagnoses can lead to stringent eligibility criteria and very high exclusion rates: an average exclusion rate of 73% was found in a review of 41 US National Institutes of Health RCTs. 31 Strict eligibility criteria create a sample that is again less representative of the population, which limits external validity. This is compounded by the fact that participating clinicians may apply additional selection criteria beyond the eligibility criteria. While usually done with altruistic intentions (clinicians seek to enroll those they feel will do well in the trial), this practice further deteriorates external validity.

Have the outcome measures been well chosen and adequately defined?
As noted previously in this paper, we typically accept a 5% probability of obtaining a false-positive when testing a hypothesis. For this reason, it is important to limit the number of investigated outcomes. The more outcomes evaluated, the greater the chance of obtaining a false-positive result.

The applicability of an RCT also depends on the clinical relevance of the measured outcomes. There has been a shift towards the use of simple, clinically relevant outcomes and away from surrogate outcomes. 32 Surrogate outcomes are often misleading. Observational studies may show correlation between a surrogate outcome and a relevant clinical outcome, and a treatment may show a positive effect on that same surrogate outcome, yet the treatment may still be ineffective or harmful. Antiarrhythmic drugs used to be prescribed postmyocardial infarction to reduce ECG abnormalities (the surrogate outcome). This ceased to be the standard of care when RCTs showed increased mortality (the clinically relevant outcome) due to this treatment. 33

The use of inappropriate scales or composite scores is also harmful to external validity. Unvalidated scales have been found to be more likely to show significant treatment effects than validated scales. 34 In addition, the clinical relevance of an apparent treatment effect (i.e. a 5-point mean reduction on a 50-point outcome scale made up of various signs and symptoms) is impossible to determine. 30

Trials can gain statistical power by combining multiple outcomes to form a composite outcome. Unfortunately, composite outcomes can hurt the applicability of an RCT result. The treatment may affect each individual outcome in different ways.
The results of an RCT reporting a composite outcome may not be applicable to a patient who is particularly predisposed to developing one of the specific outcomes. Another danger is when outcomes of varying severities are combined. Less serious outcomes often occur more frequently. In this case, the least clinically significant outcome would have an inordinate impact on treatment effects.
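The earlier point about limiting the number of investigated outcomes can be quantified: under the simplifying assumption of independent outcomes, each tested at α, the chance of at least one false-positive grows quickly.

```python
def familywise_error(n_outcomes, alpha=0.05):
    """Probability of at least one false-positive across n independent
    outcome comparisons, each tested at significance level alpha."""
    return 1 - (1 - alpha) ** n_outcomes

for k in (1, 5, 10):
    print(k, round(familywise_error(k), 3))
# 1 outcome  -> 0.05
# 5 outcomes -> 0.226
# 10 outcomes -> 0.401
```

This is why trials prespecify a primary outcome, and why Bonferroni-type corrections divide α across the number of comparisons when several outcomes must be tested.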
Careful consideration should also be given to the patient and disease process. Patients typically prioritize quality of life issues more than clinicians, who tend to focus on the physical aspects of a disease. Since the final goal is to uncover therapies that improve things for patients, it makes sense to adopt patient-centered outcomes. The RCTs investigating chronic diseases have often suffered