Proofreading through spatial gradients.

Key enzymatic processes use the nonequilibrium error correction mechanism called kinetic proofreading to enhance their specificity. The applicability of traditional proofreading schemes, however, is limited since they typically require dedicated structural features in the enzyme, such as a nucleotide hydrolysis site or multiple intermediate conformations. Here, we explore an alternative conceptual mechanism that achieves error correction by having substrate binding and subsequent product formation occur at distinct physical locations. The time taken by the enzyme-substrate complex to diffuse from one location to another is leveraged to discard wrong substrates. This mechanism does not have the typical structural requirements, making it easier to overlook in experiments. We discuss how the length scales of molecular gradients dictate proofreading performance, and quantify the limitations imposed by realistic diffusion and reaction rates. Our work broadens the applicability of kinetic proofreading and sets the stage for studying spatial gradients as a possible route to specificity.


I. INTRODUCTION
The nonequilibrium mechanism called kinetic proofreading [1,2] is used for reducing the error rates of many biochemical processes important for cell function (e.g., DNA replication [3], transcription [4], translation [5,6], signal transduction [7], or pathogen recognition [8][9][10]). Proofreading mechanisms operate by inducing a delay between substrate binding and product formation via intermediate states for the enzyme-substrate complex. Such a delay gives the enzyme multiple chances to release the wrong substrate after initial binding, allowing far lower error rates than what one would expect solely from the binding energy difference between right and wrong substrates.
Traditional proofreading schemes require dedicated molecular features such as an exonuclease pocket in DNA polymerases [3] or multiple phosphorylation sites on Tcell receptors [8,9]; such features create intermediate states that delay product formation (Fig. 1a) and thus allow proofreading. Additionally, since proofreading is an active nonequilibrium process often involving nearirreversible reactions, the enzyme typically needs to have an ATP or GTP hydrolysis site to enable the use of energy supplies of the cell [5,11]. Due to such stringent structural requirements, the number of confirmed proofreading enzymes is relatively small. Furthermore, generic enzymes without such dedicated features are assumed to not have active error correction available to them.
In this work, we propose an alternative scheme where the delay between initial substrate binding and product formation steps is achieved by separating these events in * amurugan@uchicago.edu † phillips@pboc.caltech.edu The T-cell activation mechanism with successive phosphorylation events is used for demonstration [8,10]. (b) The spatial proofreading scheme where the delay between binding and catalysis is created by constraining these events to distinct physical locations. The wavy arrows stand for the diffusive motion of the complex. Binding events primarily take place on the length scale λ S of substrate localization.
space. If substrates are spatially localized and product formation is favorable only in a region of low substrate concentration where an activating effector is present, then the time taken by the enzyme-substrate complex to travel from one location to the other can be used to arXiv:2005.11615v1 [q-bio.MN] 23 May 2020 discard the wrong substrates (Fig. 1b). When this delay is longer than substrate unbinding time scales, very low error rates of product formation can be achieved, allowing this spatial proofreading scheme to outperform biochemical mechanisms with a finite number of proofreading steps.
The nonequilibrium mechanism here does not require any direct energy consumption by the enzyme or substrate itself (e.g., through ATP hydrolysis). Instead, the mechanism relies on energy investment to actively maintain spatial concentration gradients of substrates (or alternatively, the enzyme). Such gradients of different proteins in the cell have been measured in several contexts (e.g., near the plasma membrane, the Golgi apparatus, the endoplasmic reticulum (ER), kinetochores, microtubules [12][13][14]) and several gradient-forming mechanisms have been discussed in the literature [14][15][16]. In this way, energy consumption for proofreading can be outsourced from the enzyme and substrate to the gradient maintaining mechanism.
The scheme proposed here does not rely on any proofreading-specific structural features in the enzyme; indeed, any 'equilibrium' enzyme with a localized effector can proofread using our scheme if appropriate concentration gradients of the substrates or enzymes can be set up. As a result, spatial proofreading is easy to overlook in experiments and suggests another explanation for why reconstitution of reactions in vitro can be of lower fidelity than in vivo.
Further, the lack of reliance on structure makes spatial proofreading more adaptable. We study how tuning the length scale of concentration gradients can trade off error rate against speed and energy consumption on the fly. In contrast, traditional proofreading schemes rely on nucleotide chemical potentials, e.g., the out of equilibrium [ATP]/[ADP] ratio in the cell, and cannot modulate their operation without broader physiological disruptions. We conclude by quantifying the limitations of our proposed scheme by accounting for realistic reaction rates and spatial gradients known to be maintained in the cell. Our work motivates a detailed investigation of spatial structures and compartmentalization in living cells as possible delay mechanisms for proofreading enzymatic reactions.

A. Slow Transport of Enzymatic Complex Enables Proofreading
Our proposed scheme is based on spatially separating substrate binding and product formation events for the enzyme (Fig. 1b). Such a setting arises naturally if substrates are spatially localized by having concentration gradients in a cellular compartment. Similarly, an effector needed for product formation (e.g., through allosteric activation) may have a spatial concentration gradient localized elsewhere in that compartment. To keep our model simple, we assume that the right (R) and wrong (W) substrates have identical concentration gradients of length scale λ S but that the effector is entirely localized to one end of the compartment, e.g., via membrane tethering. We model our system using coupled reaction-diffusion equations for the substrate-bound ("ES" with S = R, W) and free ("E") enzyme densities, namely, Here, D is the enzyme diffusion constant, k on and k S off (with k W off > k R off ) are the substrate binding and unbinding rates, respectively, and ρ S (x) ∼ e −x/λ S is the spatially localized substrate concentration profile which we take to be exponentially decaying, which is often the case for profiles created by cellular gradient formation mechanisms [17,18]. We limit our discussion to this one-dimensional setting of the system, though our treatment can be generalized to two and three dimensions in a straightforward way.
The above model does not explicitly account for several effects relevant to living cells, such as depletion of substrates or distinct diffusion rates for the free and substrate-bound enzymes. More importantly, it does not account for the mechanism of substrate gradient formation. We analyze a biochemically detailed model with this latter feature and experimentally constrained parameters later in the paper. Here, we proceed with the minimal model above for explanatory purposes. To identify the key determinants of the model's performance, we assume throughout our analysis that the amount of substrates is sufficiently low that the enzymes are mostly free with a roughly uniform profile (i.e., ρ E ≈ constant). This assumption makes Eqs. (1)-(3) linear and allows us to solve them analytically at steady state. We demonstrate in Appendix C that proofreading is, in fact, most effective under this assumption and discuss the consequences of having high substrate amounts on the performance of the scheme.
In our simplified picture, enzyme activation and catalysis take place upon reaching the right boundary at a rate r that is identical for both substrates. Therefore, the density of substrate-bound enzymes at the right boundary can be taken as a proxy for the rate of product formation v S , since where L is the size of the compartment.
To demonstrate the proofreading capacity of the model, we first analyze the limiting case where substrates are highly localized to the left end of the compartment  (λ S L). In this limit, the fidelity η, defined as the number of right products formed per single wrong product, becomes where η eq = k W off /k R off is the equilibrium fidelity, and τ D = L 2 /D is the characteristic time scale of diffusion across the compartment (see Appendix A for the derivation).
Eq. 5 is plotted in Fig. 2 for a family of different parameter values. As can be seen, when diffusion is fast (small τ D ), fidelity converges to its equilibrium value and proofreading is lost (η ≈ √ η eq × τ D k W off /τ D k R off = η eq ). Conversely, when diffusion is slow (large τ D ), the enzyme undergoes multiple rounds of binding and unbinding before diffusing across the compartment and forming a product -'futile cycles' that endow the system with proofreading. In this regime, fidelity scales as To get further insights, we introduce an effective number of extra biochemical intermediates (n) that a traditional proofreading scheme would need to have in order to yield the same fidelity, i.e., η/η eq = η n eq . We calculate this number as (see Appendix A) Notably, since τ D ∼ L 2 , the result above suggests a linear relationship between the effective number of proofreading realizations and the compartment size (n ∼ L). In addition, because the right-hand side of Eq. 7 is an increasing function of k W off , the proofreading efficiency of the scheme rises with larger differences in substrate offrates (Fig. 2) -a feature that 'hard-wired' traditional proofreading schemes lack.
Navigating the Speed-Fidelity Trade-Off As is inherent to all proofreading schemes, the fidelity enhancement described earlier comes at a cost of reduced product formation speed. This reduction, in our case, happens because of increased delays in diffusive transport. Here, we explore the resulting speed-fidelity tradeoff and its different regimes by varying two of the model parameters: diffusion time scale τ D and the substrate localization length scale λ S .
Speed and fidelity for different sampled values of τ D and λ S are depicted in Fig. 3a. As can be seen, for a fixed τ D , the reduction of λ S can trade off fidelity against speed. This trade-off is intuitive; with tighter substrate localization, the complexes are formed closer to the left boundary. Hence, a smaller fraction of complexes reach the activation region, reducing reaction speed. The Pareto-optimal front of the trade-off over the whole parameter space, shown as a red curve on the plot, is reached in the limit of ideal sequestration. Varying the diffusion time scale allows one to navigate this optimal trade-off curve and access different performance regimes.
Specifically, if the diffusion time scale is fast compared with the time scales of substrate unbinding (i.e., τ D 1/k R off , 1/k W off ), then both right and wrong complexes that form near the left boundary arrive at the activation region with high probability, resulting in high speeds, though at the expense of error-prone product formation (Fig. 3b, top). In the opposite limit of slow diffusion, both types of complexes have exponentially low densities at the activation region, but due to the difference in substrate off-rates, production is highly accurate (Fig. 3b, bottom). There also exists an intermediate regime where a significant fraction of right complexes reach the activation region while the vast majority of wrong complexes do not (Fig. 3b, middle). As a result, an advantagenous trade-off is achieved where a moderate decrease in the production rate yields high fidelity enhancement -a feature that was also identified in multi-step traditional proofreading models [19].
As we saw in Fig. 3a, in the case of ideal sequestration, the slowdown of diffusive transport necessarily reduced the production rate and increased the fidelity. The latter part of this statement, however, breaks down when substrate gradients are weak. Indeed, fidelity exhibits a non-monotonic response to tuning τ D when the substrate gradient length scale λ S is non-zero (Fig. 3c). The reason for the eventual decay in fidelity is the fact that with slower diffusion (larger τ D ), substrate binding and unbinding events take place more locally and therefore, the right and wrong complex profiles start to resemble  the substrate profile itself, which does not discriminate between the two substrate kinds.
Not surprisingly, the error-correcting capacity of the scheme improves with better substrate localization (lower λ S ). For a fixed τ D , the bulk of this improvement takes place when L/λ S is tuned in a range set by the two key dimensionless numbers of the model, namely, τ D k R off and τ D k W off (Fig. 3c, inset). In Appendix A, we provide an analytical justification for this result. Taken together, these parametric studies uncover the operational principles of the spatial proofreading scheme and demonstrate how the speed-fidelity trade-off could be dynamically navigated as needed by tuning the key time and length scales of the model.

Energy Dissipation and Limits of Proofreading Performance
A hallmark signature of proofreading is that it is a nonequilibrium mechanism with an associated free energy cost. In our scheme, the enzyme itself is not directly involved in any energy-consuming reactions, such as hydrolysis. Instead, the free energy cost comes from maintaining the spatial gradient of substrates, which the enzymatic reaction tends to homogenize by releasing bound substrates in regions of low substrate concentration.
While mechanisms of gradient maintenance may differ in their energetic efficiency, there exists a thermodynamically dictated minimum energy that any such mechanism must dissipate per unit time. We calculate this minimum power P as Here j S (x) = k on ρ S (x)ρ E −k S off ρ ES (x) is the net local binding flux of substrate "S", and µ(x) is the local chemical potential (see Appendix B1 for details). For substrates with an exponentially decaying profile considered here, the chemical potential is given by where k B T is the thermal energy scale. Notably, the chemical potential difference across the compartment, which serves as an effective driving force for the scheme, is set by the inverse of the nondimensionalized substrate localization length scale, namely, where β −1 = k B T . This driving force is zero for a uniform substrate profile (λ S → ∞) and increases with tighter localization (lower λ S ), as intuitively expected. We used Eq. 8 to study the relationship between dissipation and fidelity enhancement as we tuned ∆µ for different choices of the diffusion time scale τ D . As can be seen in Fig. 4, power rises with increasing fidelity, diverging when fidelity reaches its asymptotic maximum given by Eq. 5 in the large ∆µ limit. For the bulk of each curve, power scales as the logarithm of fidelity, suggesting that a linear increase in dissipation can yield an exponential reduction in error. Notably, such a scaling relationship has also been proposed for a general class of biochemical processes involving quality control [20]. This logarithmic scaling is achieved in our model when the driving force is in a range where most of the fidelity enhancement takes place, namely, Beyond this range, additional error correction is attained at an increasingly higher cost. Note that the power computed here does not include the baseline cost of creating the substrate gradient, which, for instance, would depend on the substrate diffusion constant. We only account for the additional cost to be paid due to the operation of the proofreading scheme which works to homogenize this substrate gradient. The baseline cost in our case is analogous to the work that ATP synthase needs to perform to maintain a nonequilibrium [ATP]/[ADP] ratio in the cell, whereas our calculated power is analogous to the rate of ATP hydrolysis by a traditional proofreading enzyme. We discuss the comparison between these two classes of dissipation in greater detail in Appendix B3.
Just as the cellular chemical potential of ATP or GTP imposes a thermodynamic upper bound on the fidelity enhancement by any proofreading mechanism [21], the effective driving force ∆µ imposes a similar constraint for the spatial proofreading model. This thermodynamic limit depends only on the available chemical potential and is equal to e β∆µ . This limit can be approached very closely by our model, which for ∆µ 1 achieves the exponential enhancement with an additional linear prefactor, namely, (η/η eq ) max ≈ e β∆µ /β∆µ (see Appendix B2). Such scaling behavior was theoretically accessible only to infinite-state traditional proofreading schemes [21,22]. This offers a view of spatial proofreading as a procession of the enzyme through an infinite series of spatial filters and suggests that, from the perspective of peak error reduction capacity, our model outperforms the finite-state schemes.

Proofreading by Biochemically Plausible Intracellular Gradients
Our discussion of the minimal model thus far was not aimed at a particular biochemical system and thus did not involve the use of realistic reaction rates and diffusion constants typically seen in living cells. Furthermore, we did not account for the possibility of substrate diffusion, as well as for the homogenization of substrate concentration gradients due to enzymatic reactions, and have thereby abstracted away the gradient maintaining mechanism. The quantitative inspection of such mechanisms is important for understanding the constraints on spatial proofreading in realistic settings.
Here, we investigate proofreading based on a widely applicable mechanism for creating gradients by the spatial separation of two opposing enzymes [12,18,23]. Consider a protein S that is phosphorylated by a membranebound kinase and dephosphorylated by a delocalized cytoplasmic phosphatase, as shown in Fig. 5a. This setup will naturally create a gradient of the active form of protein (S * ), with the gradient length scale controlled by the rate of phosphatase activity k p (S * kp −→ S). Such mechanisms are known to create gradients of the active forms of MEK and ERK [14], of GTPases such as Ran (with GEF and GAP [24] playing the role of kinase and phosphatase, respectively), of cAMP [14] and of stathmin oncoprotein 18 (Op18) [25,26] near the plasma membrane, the Golgi apparatus, the ER, kinetochores and other places.
We test the proofreading power of such gradients, assuming experimentally constrained biophysical parameters for the gradient forming mechanism. Specifically, we consider an enzyme E that acts on active forms of cognate (R * ) and non-cognate (W * ) substrates which have off-rates 0.1 s −1 and 1 s −1 , respectively (hence, η eq = 10). These off-rates are consistent with typical values for substrates proofread by cellular signalling systems [10,27]. We assume that both R * and W * have identical spatial gradients due to the kinase/phosphatase setup shown in Fig. 5a (i.e., S represents both R and W ). We then consider a dephosphorylation rate constant k p = 5 s −1 that falls in the range 0.1−100 s −1 reported for different phosphatases [18,28,29], and a cytosolic diffusion constant D = 1 μm 2 /s for all proteins in this model. With this setup, exponential gradients of length scale ∼ 0.5 μm are formed for R * and W * (see Appendix D for details).
As expected, proofreading by these gradients is most effective when the enzyme-substrate binding is very slow, in which case the exponential substrate profile is maintained and the system attains the fidelity predicted by our earlier explanatory model (Fig. 5b). The system's FIG. 5. Proofreading based on substrate gradients formed by spatially separated kinases and phosphatases. (a) The active form S * of many proteins exhibits gradients because kinases that phosphorylate S are anchored to a membrane while phosphatases can diffuse in the cytoplasm [14]. An enzyme can exploit the resulting spatial gradient for proofreading. (b) At low enzyme activity (i.e., low konρ E ), the gradient of S * is successfully maintained, allowing for proofreading. At high enzyme activity (large konρ E ), the dephosphorylation with rate kp = 5 s −1 is no longer sufficient to maintain the gradient and proofreading is lost. proofreading capacity is retained if the first-order onrate is raised up to k on ρ E ∼ 10 s −1 , where around 10-fold increase in fidelity is still possible. If the binding rate constant (k on ) or the enzyme's expression level (ρ E ) is any higher, then enzymatic reactions overwhelm the ability of the kinase/phosphatase system to keep the active forms of substrates sufficiently localized (Fig. 5c) and proofreading is lost. Overall, this model suggests that enzymes can work at reasonable binding rates and still proofread, when accounting for an experimentally characterized gradient maintaining mechanism. DISCUSSION We have outlined a way for enzymatic reactions to proofread and improve specificity by exploiting spatial concentration gradients of substrates. Like the classic model, our proposed spatial proofreading scheme is based on a time delay; but unlike the classic model, here the delay is due to spatial transport rather than transitions through biochemical intermediates. Consequently, the enzyme is liberated from the stringent structural requirements imposed by traditional proofreading, such as multiple intermediate conformations and hydrolysis sites for energy coupling. Instead, our scheme exploits the free energy supplied by active mechanisms that maintain spatial structures.
The decoupling of the two crucial features of proofreading -time delay and free energy dissipation -allows the cell to tune proofreading on the fly. For instance, all proofreading schemes offer fidelity at the expense of reaction speed and energy. For traditional schemes, navigating this trade-off is not always feasible, as it needs to involve structural changes via mutations or modulation of the [ATP]/[ADP] ratio which can cause collateral effects on the rest of the cell. In contrast, the spatial proofreading scheme is more adaptable to the changing conditions and needs of the cell. The scheme can prioritize speed in one context, and fidelity in another, simply by tuning the length scale of intracellular gradients (e.g., through the regulation of the phosphotase or free enzyme concentration in the scheme discussed earlier).
On the other hand, this modular decoupling can complicate the experimental identification of proofreading enzymes and the interpretation of their fidelity. Here, the enzymes need not be endowed with the structural and biochemical properties typically sought for in a proofreading enzyme. At the same time, any attempt to reconstitute enzymatic activity in a well-mixed, in vitro assay, will show poor fidelity compared to in vivo measurements, even when all necessary molecular players are present in vitro. Therefore, more care is required in studies of cellular information processing mechanisms that hijack a distant source of free energy compared to the case where the relevant energy consumption is local and easier to link causally to function.
While we focused on spatially localized substrates and delocalized enzymes, our framework would apply equally well to other scenarios, e.g., a spatially localized enzyme (or its active form [24,30]) and effector with delocalized substrates. Our framework can also be extended to signaling cascades, where slightly different phosphatase activities can result in magnified concentration ratios of two competing signaling molecules at the spatial location of the next cascade step [14,31,32].
The spatial gradients needed for the operation of our model can be created and maintained through multiple mechanisms in the cell, ranging from the kinase/phosphatase system modeled here, to the passive diffusion of substrates/ligands combined with active degradation (e.g., Bicoid and other developmental morphogens), to active transport processes combined with diffusion. A particularly simple implementation of our scheme is via compartmentalization -substrates and effectors need to be localized in two spatially separated compartments with the enzyme-substrate complex having to travel from one to another to complete the reaction. As specificity is known to be a critical problem in secretory pathways involving the naturally compartmentalized parts of the cell, e.g., the ER, the Golgi apparatus with its distinct cisternae, endosomes and the plasma membrane [33, 34], they are potential candidates for the implementation of spatial proofreading. Experimental investigations of these compartmentalized structures in light of our work will reveal the extent to which spatial transport promotes specificity.
In conclusion, we have analyzed the role played by spatial structures in endowing enzymatic reactions with kinetic proofreading. Simply by spatially segregating substrate binding from catalysis, enzymes can enhance their specificity. This suggests that enzymatic reactions may acquire de-novo proofreading capabilities by coupling to pre-existing spatial gradients in the cell.

ACKNOWLEDGMENTS
We thank Anatoly Kolomeisky and Erik Winfree for insightful discussions, and Soichi Hirokawa for providing useful feedback on the manuscript. We also thank Alexander Grosberg who's idea of a compartmentalized 'rotary demon' motivated the development of our model. This work was supported by the NIH Grant 1R35 GM118043-01, the John Templeton Foundation organization of the mitotic spindle," Trends Cell Biol. 16, 125-134 (2006 Appendix C: Studies on the validity of the uniform free enzyme profile assumption S10 1. Effects that relaxing the ρ E (x) ≈ constant assumption has on the Pareto front S10 2. Effects that relaxing the ρ E (x) ≈ constant assumption has on fidelity in a weak substrate gradient setting S14 Appendix D: Proofreading on a kinase/phosphatase-induced gradient S15 1. Setup and estimation of fidelity S15 2. Energy dissipation S16 References S17

APPENDIX A: ANALYTICAL CALCULATIONS OF THE COMPLEX DENSITY PROFILE AND FIDELITY
We begin this section by deriving an analytical expression for the density profile of substrate-bound enzymes (ρ ES (x)) in the case where the ρ(x) ≈ constant assumption holds. Based on this result, we then obtain expressions for fidelity in low, high, and intermediate substrate localization regimes. We reserve the studies of speed and fidelity in the general case of a nonuniform free enzyme profile to Appendix C.

Derivation of the complex density profile ρ ES (x)
The ordinary differential equation (ODE) that defines the steady state profile of substrate-bound enzymes is Here ρ S (0) is the substrate density at the leftmost boundary, whose value can be calculated from the condition that the total number of free substrates is S total , namely, In the limit of low substrate amounts where the approximation ρ E (x) ≈ constant is valid, Eq. S1 represents a linear nonhomogeneous ODE. Hence, its solution can be written as where ρ (h) ES (x) is the general solution to the corresponding homogeneous equation, while ρ (p) ES (x) is a particular solution. Looking for solutions of the form Ce −x/λ for the homogeneous part, we find The two possible roots for λ are ± D/k S off . Calling the positive root λ ES , which represents the mean distance traveled by the substrate-bound enzyme before releasing the substrate, we can write the general solution to the homogeneous part of Eq. S1 as where C 1 and C 2 are constants which will be determined from the boundary conditions. Since the nonhomogeneous part of Eq. S1 is a scaled exponential, we look for a particular solution of the same functional form, namely, ρ (p) ES (x) = C p e −x/λ S . Substituting this form into the ODE, we obtain The constant coefficient C p can then be found as where we have used the equality λ ES = D/k S off . Now, to find the unknown coefficients C 1 and C 2 , we impose the no-flux boundary conditions for the density ρ ES (x) at the left and right boundaries of the compartment, namely, Note that we did not take into account the product formation flux at the rightmost boundary when writing Eq. S10 in order to simplify our calculations. This is justified in the limit of slow catalysis -an assumption that we make in our treatment. The above system of two equations can then be solved for C 1 and C 2 , yielding With the constant coefficients known, we obtain the general solution for the complex profile as If substrate localization is very poor (λ S L), the substrate distribution will be uniform (ρ S (x) =ρ S = S total /L), resulting in a similarly flat profile of enzyme-substrate complexes with their density ρ ∞ ES given by This is the expected equilibrium result where the complex concentration is inversely proportional to the dissociation constant (k S off /k on ). In the opposite limit where the substrates are highly localized (λ S λ ES , L and ρ S (0) ≈ S total /λ S from Eq. S3), the complex density profile simplifies into The x-dependence through the cosh(·) function suggests that the complex density is the highest at the leftmost boundary and lowest at the rightmost boundary, with the degree of complex localization dictated by the length scale parameter λ ES . Notably, this localization of complexes does not alter their total number, since the average complex density is conserved, that is, Eq. S15 for the complex profile can be alternatively written in terms of the diffusion time scale τ D = L 2 /D and the substrate off-rate k S off . Noting that L/λ ES = L 2 k S off /D = τ D k S off and introducing a dimensionless coordinatẽ x = x/L, we find The above equation is what was used for generating the plots in Fig. 3b of the main text for different choices of the diffusion time scale.

Fidelity in low and high substrate localization regimes
Let us now evaluate the fidelity of the model in the two limiting regimes discussed earlier. In the poor substrate localization case, which corresponds to an equilibrium setting, the fidelity can be found from Eq. S14 as

S4
where we have employed the assumption about the right and wrong substrates having identical density profiles. This is the expected result for equilibrium discrimination where no advantage is taken of the system's spatial structure.
In the regime with high substrate localization, the enzyme-substrate complexes have a nonuniform spatial distribution. What matters for product formation is the complex density at the rightmost boundary (x = 1), which we obtain from Eq. S17 as (S19) Substituting the above expression written for right and wrong complexes into the definition of fidelity, we find This is the result reported in Eq. 5 of the main text. To gain more intuition about it and draw parallels with traditional kinetic proofreading, let us consider the limit of long diffusion time scales where proofreading is the most effective. In this limit, the hyperbolic sine functions above can be approximated as sinh where we have used the definition of equilibrium fidelity (Eq. S18). In traditional proofreading, a scheme with n proofreading realizations can yield a maximum fidelity of η/η eq = η n eq . The value of n for the original Hopfield model, for instance, is 1. It would be informative to also know the effective parameter n for the spatial proofreading model. Dividing Eq. S21 by η eq , we find This exact result can be simplified into an approximate form when diffusion is slow and η eq 1, yielding the expression reported in Eq. 7 of the main text, namely,

Fidelity in an intermediate substrate localization regime
The generic expression for complex density at the rightmost boundary (x = L) can be written using Eq. S13 as For the system to proofread, substrates need to be sufficiently localized (λ S < L) and diffusion needs to be sufficiently slow (τ D k S off > 1 or, λ ES < L). Under these conditions, the substrate profile can be approximated using Eq. S3 as ρ S (x) ≈ λ −1 S S total e −x/λ S , while the hyperbolic sine and cosine functions used above can be approximated as sinh(L/λ ES ) ≈ cosh(L/λ ES ) ≈ 0.5 e L/λ ES . With these approximations, the complex density expression simplifies into Now, depending on how λ S compares with λ ES , there can be two qualitatively different regimes for the complex density, namely, where we used the equilibrium complex density ρ ∞ ES defined in Eq. S14. Notably, the first regime effectively corresponds to the case of ideal sequestration where complex density is independent from the precise value of λ S . The dimensionless number τ D k S off sets the scale for the minimum L/λ S value beyond which ideal sequestration can be assumed. Conversely, the second regime corresponds to the case where the distance traveled by a complex before dissociating is so short that the complex profile is dictated by the substrate profile itself. Because of that, the complex density reduction from its equilibrium limit is independent from the precise values of τ D and k S off , as long as the condition λ ES λ S is met. The scheme yields its highest fidelity when both right and wrong complex densities are in the first regime (ideal sequestration). When both densities are in the second regime, fidelity is reduced down to its equilibrium value η eq (Table S1). The transition between these two extremes happens when the density profiles of right and wrong complexes fall under different regimes. Fidelity can be navigated in the transition zone by tuning the substrate gradient length scale λ S . This is demonstrated in Fig. S1 for three different choices of η eq . In all three cases, the dimensionless numbers τ D k R off and τ D k W off set the approximate range in which the bulk of fidelity enhancement occurs, as stated in the main text.  We start this section by deriving an analytical expression for the minimum dissipated power, which was used in making Fig. 4 of the main text. Then, we calculate the upper limit on fidelity enhancement available to our model for a finite substrate gradient length scale and compare this limit with the fundamental thermodynamic bound. We end the section by providing an estimate for the baseline cost of setting up gradients and compare this cost with the maintenance cost reported in the main text. Similar to our treatment of Appendix A, here too our calculations are based on the ρ E ≈ constant assumption to allow for intuitive analytical results.

Derivation of dissipated power
As stated in the main text, we calculate the minimum rate of energy dissipation necessary for maintaining the substrate profiles as is the net local substrate binding flux and µ(x) = µ(0) − k B T · ln(x/λ S ) is the local chemical potential. Substituting the analytical expression for ρ ES (x) found earlier (Eq. S13) into j S (x) and performing a somewhat cumbersome integral, we obtain where β −1 = k B T , and J bind = k on S total ρ E is the net binding rate of each substrate. Fig. 4 in the main text was made using this expression for power.
To get additional insights about this result, let us consider the case where substrates are highly localized (λ S L) and diffusion is slow (λ ES L) -conditions needed for effective proofreading. Under these conditions, the hyperbolic tangent terms become 1 and the expression for the power expenditure simplifies into The monotonic increase of power with λ ES suggests that energy is primarily spent on maintaining the concentration gradient of right substrates. This is not surprising, since typically right complexes travel a much greater distance into the low concentration region of the compartment before releasing the bound substrate (i.e., λ ER λ EW ). Therefore, neglecting the contribution from wrong substrates and considering the range of λ S values where the bulk of powerfidelity trade-off takes place (λ ER > λ S > λ EW ), we further simplify the power expression into where we used the identities β∆µ = L/λ S and λ ER = L/ τ D k R off . This simple linear relation suggests that in order to maintain the exponential substrate profile, the minimum energy spent per substrate binding event should be at least

Limits on fidelity enhancement
The error reduction capacity of the spatial proofreading scheme improves with a greater difference in substrate off-rates, as was demonstrated in Fig. 2 of the main text. At the same time, Fig. 3c showed that the finite length scale of substrate localization (or, finite driving force) sets an upper limit on fidelity enhancement for substrates with fixed off-rates. It is therefore of interest to consider these two features together to find the absolute limit on fidelity enhancement available to our model and then compare it with the fundamental bound set by thermodynamics.

S8
Intuitively, fidelity will be enhanced the most if the density of right complexes does not decay across the compartment, while that of wrong complexes decays maximally. The first condition can be met if diffusion is fast or if the unbinding rate of right substrates is low, in which case we have where ρ ∞ ER is the equilibrium density of right complexes. Conversely, when the unbinding rate of wrong substrates is very large, the density of wrong complexes is maximally reduced at the rightmost boundary and can be obtained from Eq. S24 by taking the λ ES → 0 limit, namely, Here ρ ∞ EW is the equilibrium density of wrong complexes, and β∆µ = L/λ S is the effective driving force of the scheme. Taking the ratio of Eqs. S31 and S32, we obtain the largest fidelity enhancement of the scheme for the given driving force, namely, When β∆µ 1 (or, λ S L), the limit above gets further simplified into (η/η eq ) max ≈ e β∆µ /β∆µ.
Now, thermodynamics imposes an upper bound on fidelity enhancement by any proofreading scheme operating with a finite chemical potential ∆µ. This bound is equal to e β∆µ and is reached when the entire chemical potential is used to increase the free energy difference between right and wrong substrates [1]. Comparing it with the result in Eq. S35, we can see that fidelity enhancement in the spatial proofreading model has the same exponential scaling term, but with an additional linear factor. Since the dominant contribution comes from the exponential term (as captured also in Fig. S2), we can claim that our proposed model can operate very close to the fundamental thermodynamic limit. Earlier in the section, we calculated the rate at which energy needs to be dissipated to counteract the homogenizing effect that enzyme activity has on the substrate gradient. In addition to this cost, however, there is also a baseline cost for setting up a gradient in the absence of any enzyme. Here, we calculate this cost in the case where the gradient formation mechanism needs to work against diffusion that tends to flatten the substrate profile.
As before, we consider an exponentially decaying substrate gradient with a decay length scale λ S and a total number of substrates S total . We write the minimum power P D required for counteracting the diffusion of substrates as where J D = −D S ∇ρ S (x) is the diffusive flux, with D S being the substrate diffusion constant. The rationale for writing this form is that diffusion moves substrates from a higher chemical potential region into a neighboring lower chemical potential region. The gradient maintaining mechanism would need to spend at least this chemical potential difference (δµ = −µ (x)δx) per each substrate diffusing a distance δx down the chemical potential gradient. Adding up the contribution from all local neighborhoods with a local diffusive flux J D (x) results in Eq. S36. Now, substituting ρ S (x) ∼ e −x/λ S for the substrate profile and µ(x) = µ(0) + k B T ln ρ S (x) /ρ S (0) for the chemical potential, we obtain where in the third step we used the relation ρ S (x) = −ρ S (x)/λ S . This suggests that the minimum dissipated power required for setting up an exponential gradient increases quadratically with decreasing localization length scale λ S . It is informative to also make a comparison between this result and the earlier calculated minimum dissipation needed to counteract the enzyme's homogenizing activity. Recall that when substrates were sufficiently localized and when diffusion was sufficiently slow, proofreading power could be approximated as (Eq. S29) where J bind = k on S total ρ E is the total substrate binding flux. Using the identities λ ES = D/k S off and K S d = k S off /k on , we can calculate the ratio of the proofreading power to baseline power as Presuming for simplicity that the enzyme and substrate diffusion constants are the same, we see that two factors determine the power ratio: 1) the amount of free enzyme in the system (ρ E /K S d ), and 2) the substrate localization length scale relative to the characteristic length scale of complex diffusion (λ S /λ ES ). Now, recall that the enzymatic activity on right substrates dominates the proofreading cost (Appendix B1) and that the bulk of fidelity enhancement takes place when λ S λ ER (Appendix A4). Therefore, when tuning λ S down, initially the power ratio would only depend on the amount of free enzyme in the system (ρ E /K S d ) and then, with tighter substrate localization, the relative contribution of the proofreading power would start to decrease.
In the end, we would like to note that spatial gradients can also be set up using an external potential without a continuous dissipation of energy. In an in vivo setting, gravity can give rise to spatial structures in oocytes [2], while in an in vitro setting, electric fields can create gradients and power the transport of the complex [3]. We leave the investigations of such alternative strategies to future work. S10 APPENDIX C: STUDIES ON THE VALIDITY OF THE UNIFORM FREE ENZYME PROFILE ASSUMPTION In our treatment of the model so far, we have assumed for mathematical convenience that free enzymes are in excess, which suggested the approximation ρ E (x) ≈ constant. Example enzyme density profiles shown in Fig. S3, however, demonstrate that this assumption does not hold in general. Specifically, there is a depletion of free enzymes near the substrate localization site and abundance near the catalysis site. Because of this depletion at the leftmost edge, we expect a reduction in speed in comparison with our earlier treatment where a flat profile was assumed. In addition, if substrates have a weak gradient, we expect the fidelity to also be reduced, since more enzymes will bind substrates at intermediate positions, reducing the average travel distance to the catalytic site. In what follows, we discuss in greater detail the consequences of having a nonuniform free enzyme distribution on the model performance. 1. Effects that relaxing the ρ E (x) ≈ constant assumption has on the Pareto front We begin by studying the effects of relaxing the uniform free enzyme profile assumption on the Pareto front of the speed-fidelity trade-off (Fig. 3a of the main text). This front is reached in the ideal sequestration limit (λ S → 0). Though in general enzyme profiles need to be obtained using numerical methods due to the nonlinearity of reactiondiffusion equations, in this particular limit (λ S → 0) an analytical solution is available. To obtain it, we write the reaction-diffusion equations in the bulk region of space as Substrate binding reactions did not enter the above equations, as they occur at the leftmost boundary only. They are instead accounted for via boundary conditions, which read where S total is the total amount of free substrate of each kind concentrated at x = 0.

S11
Relating local enzyme concentrations. Considering the system at steady state, we add Eqs. S40-S42 and obtain Diving by D and integrating once, we find This above relation must hold for arbitrary position x. Choosing x = 0 and noting that from Eqs. S43-S45 the sum of fluxes should be zero, we can claim that C 1 = 0. Integrating for the second time, we obtain where C 2 is now a different constant. To find it, we perform an integral for the last time across the entire compartment, namely, Here we introduced the parameter E total as the total number of enzymes in the system (in free or bound forms). The constant C 2 , which we will rename into ρ 0 , is then the average enzyme density, i.e., Substituting this result into Eq. S48, we find an insightful relation between free and bound enzyme densities at an arbitrary position, namely, This relation suggests that whenever the local concentration of bound enzymes is high, the local concentration of free enzymes should be correspondingly low, as we see reflected in the profiles of Fig. S3.
Deriving the fidelity expression. Next, we consider Eqs. S40 and S41 separately at steady state, written in the form The general solution to this ODE reads where λ ES = D/k S off , and C S 1 and C S 2 (S = R,W) are constants which are different for right and wrong complexes. The no-flux boundary condition at x = L can be used to relate these constants and simplify the complex profile expression, namely, whereC S 1 = 2C S 1 e −L/λ ES is a new constant coefficient introduced for convenience.

S12
Now, the fidelity of the scheme is the ratio of right and wrong complex densities at x = L. Using the result above, the fidelity can be written as The ratio of these constant coefficients can be obtained by noting that the diffusive fluxes of right and wrong complexes at x = 0 are identical (from Eqs. S43 and S44), that is, Substituting this result into Eq. S57, and recalling the equality L/λ ES = τ D k S off , we obtain This expression is identical to that in Eq. S20 which was derived under the ρ E (x) ≈ constant assumption, suggesting that when substrates are highly localized, the shape of the free enzyme profile does not dictate the fidelity.
Deriving the speed expression. To keep the expression of speed compact while still illustrating the key consequences of relaxing the ρ(x) ≈ constant assumption, we will assume moving forward that the density of wrong complexes is much lower than that of the right complexes, i.e., ρ EW (x) ρ ER (x). This allows us to approximate the free enzyme density from Eq. S51 as ρ E (x) ≈ ρ 0 − ρ ER (x).
The specification of the right complex density profile requires the knowledge of the unknown coefficientC R 1 . To find this coefficient, we use the boundary condition in Eq. S43 and the approximation ρ E (x) ≈ ρ 0 − ρ ER (x) to write With the constant coefficient known, the right complex density then becomes where we used the definitions of the mean substrate densityρ S = S total /L and the dissociation constant To enable a direct parallel between this general treatment and the earlier one with the ρ E (x) ≈ constant approximation, let us introduce ρ ∞ ER as the uniform right complex density when diffusion is very fast (λ ER L) and calculate it from Eq. S64 as Now, using the ρ ∞ ER expression, we rewrite Eq. S64 as where ρ const ER (x) is the complex density obtained under the ρ E (x) ≈ constant assumption (Eq. S15). The extra factor that appears on front does not exceed 1 since γ ≥ 1, indicating a reduction in speed, as we anticipated in our more qualitative discussion at the beginning of the section. The presence of the extra factor suggests two possibilities for the approximation to hold true; first, γ ≈ 1 which happens when λ ER L or when the right complex does not decay noticeably across the compartment, and second, when γ > 1 andρ S γ −1 K R d , which is when right complexes do decay but their fraction is low compared with free enzymes because of low substrate concentration.
Pareto front shift. The previous calculations showed that in the ideal substrate sequestration limit relaxing the ρ(x) ≈ constant assumption keeps the fidelity the same while the speed gets reduced. We therefore expect a shift in the Pareto front which is illustrated in Fig. S4a. To get more intuition about the effect of this shift caused by tuning the amount of substrates, we consider the effective number of proofreading realizations at half-maximum speed (n 50 ) and study how this number changes as a function of the fraction of enzymes bound (p bound ≈ E −1 total ρ ER (x) dx). Fig. S4b shows this dependence. As can be seen, n 50 reduces roughly linearly with p bound ; e.g., if 10% of the enzymes are bound, then a 10% reduction in n 50 is expected. This suggests that as long as the fraction of bound enzymes is low, our findings related to the Pareto front made under the ρ E ≈ constant assumption will generally hold true. In this section, we study how accounting for the spatial distribution of free enzymes affects our results on the model's fidelity in the setting where substrates have a finite localization length scale λ S . In this setting, Eqs. (1)-(3) (in the main text) describing the system's dynamics become a system of nonlinear equations, which we solve at steady state using numerical methods.
An example curve of how fidelity changes with tuning diffusion time scale in a finite λ S setting is shown in Fig. S5. As expected, the nonuniform free enzyme profile leads to a reduction in fidelity. This reduction is not significant when diffusion is relatively fast as in that case the free enzyme profile manages to flatten out rapidly. The reduction is not significant also in the very slow diffusion limit where binding events that lead to production primarily take place in the proximity of the activation region and hence, the nonuniform profile of free enzymes across the compartment has little impact on fidelity. The greatest reduction happens at intermediate diffusion time scales; in particular, when the system achieves its peak fidelity. To quantify the extent of this highest reduction, we calculated the peak value of the effective number of proofreading realizations (n max ) for different free substrate amounts which regulate the fraction of bound enzymes (p bound ). The results obtained for different choices of λ S are summarized in Fig. S6. As can be seen, for the high substrate localization case (λ S /L = 0.04), there is a roughly linear dependence between n max and p bound . The initial decrease in n max with growing p bound is even slower when substrates are less tightly localized (λ S /L = 0.10, 0.30). Taken together, these results suggest that if the substrate concentration is low enough to leave most of the enzymes unbound, then our proposed scheme will proofread efficiently. And this requirement on substrate amount will be further relaxed if diffusion is fast, or if substrates are not very tightly localized. In this section, we introduce the mathematical modeling setup for the kinase/phosphatase-based gradient formation scheme and describe how its fidelity is calculated numerically. In the end, we discuss the energetics of setting up the substrate concentration gradient and link our calculations to the lower bound on energy cost obtained earlier in Appendix B.

Setup and estimation of fidelity
In the analysis thus far, we have imposed a gradient of free substrates and analyzed the proofreading capability of an enzyme acting on this gradient. In a living cell, gradients themselves are maintained by active cellular processes. However, the action of the enzyme -that is, binding a substrate in one spatial location, diffusing away, and releasing the substrate elsewhere -can destroy the gradient, and thereby lead to a loss of proofreading. Here, we analyze the consequences of free substrate depletion and gradient flattening caused by the enzyme.
We model the formation of a substrate gradient by a combination of localized activation and delocalized deactivation. We suppose that substrates can exist in phosphorylated or dephosphorylated forms, and that only the phosphorylated form is capable of binding to the enzyme. The substrates are phosphorylated by a kinase with rate k kin = 0.2 s −1 , and dephosphorylated by a phosphatase with rate k p = 5 s −1 . Crucially, we assume that phosphatases are found everywhere in the domain of size L ∼ 10 μm (a typical length scale in a eukaryotic cell), while kinases are localized to one end of the domain (at x = 0), as may occur naturally if kinases are bound to one of the membranes enclosing the domain.
The minimal dynamics of phosphorylated substrates and enzyme-substrate complexes is then given by Here, we have supposed that the densities of free enzymes, dephosphorylated substrates, and phosphatases are fixed and uniform, and have absorbed them into the relevant rate constants (k b = k on ρ E , k kin , and k p , respectively). For simplicity, we have also assumed that the free substrates and enzyme-substrate complexes have the same diffusion coefficient D = 1 μm 2 /s. We numerically solve Eqs. S67 and S68 at steady state. First, the equations of dynamics are made dimensionless by settings units of length and time by L (x = x/L) and τ D ≡ L 2 /D (t = t/τ D ), respectively. At steady state, the dimensionless equations read∇ with boundary conditions∇ρ where concentrations have been rescaled asρ = ρL, and kinetic rates ask = k τ D . We discretize the steady state equations on a grid with spacing ∆x = 0.01, approximating the second derivative as This is ill-defined at the boundariesx = 0 andx = 1, which is addressed by incorporating the boundary conditions. For illustration, consider the left boundary,x = 0, and suppose that our domain included also a point atx = −∆x.

S16
Then, we could approximate the boundary condition∇ρ S |x =0 = −k kin by a centred difference scheme, and solve out for the fictional point atx = −∆x, namely, ρ S (∆x) −ρ S (−∆x) = −k kin ⇒ρ S (−∆x) =ρ S (∆x) + 2∆xk kin , which, when inserted into Eq. S71, specifies∇ 2ρ S atx = 0, i.e., Similar considerations apply for the boundary at the right (x = 1) and for the boundary conditions ofρ ES . After discretizing, Eq. S69 can be written in a matrix form as where ρ S , ρ ES are column vectors of the nondimensionalized concentration profiles evaluated at the spatial grid points, i.e., [ρ(0),ρ(∆x), · · · ] T . Solving these matrix equations yields We compute Eqs. S74 numerically for two substrates: a cognate ('R') and a non-cognate ('W'), which differ in their off-rates (k R off = 0.1 s −1 and k W off = 1 s −1 , respectively). Having the density profiles, the fidelity of the model becomes η ≈ρ ER (x = 1)/ρ EW (x = 1). We calculate the fidelity for different choices of the first-order rate of enzyme-substrate binding (k b = k on ρ E ); this may be thought of as varying the concentration of free enzyme in the cell. The results are shown in Fig. 5 of the main text.

Energy dissipation
In Appendix B3, we estimated the minimum power that a gradient maintaining mechanism would need to dissipate in order to set up an exponentially decaying profile of diffusing substrates. Here, we calculate this power for the kinase/phosphatase-based mechanism and compare it with the lower bound estimated earlier.
Let us assume that phosphorylation and dephosphorylation reactions by kinases and phosphatases are nearly irreversible with associated free energy costs of ∆ε kin and ∆ε phosph per reaction, respectively. The net rate at which active substrates get dephosphorylated is k p S total and it needs to be identical to the net phosphorylation rate of inactive substrates in order for S total to remain constant. With the costs of each reaction known, we can write the rate of energy dissipation P k/p as P k/p = k p S total (∆ε kin + ∆ε phosph ).
(S75) S17 Now, when the enzyme activity is very low, the kinase/phosphatase mechanism will create an exponential profile of active substrates with a decay length scale λ S = D S /k p . Expressing the rate of phosphorylation in terms of λ S and D S (i.e., k p = D S /λ 2 S ), and substituting it into Eq. S75, we obtain P k/p = D S S total λ 2 S (∆ε kin + ∆ε phosph ).
Comparing this result with the lower bound found earlier (Eq. S37), we can note the presence of an extra factor (∆ε kin + ∆ε phosph ). Since the free energy consumption during ATP hydrolysis is ∼ 10 k B T , we can say that the dissipated power of the kinase/phosphatase system for setting up an exponential gradient surpasses the lower limit roughly by an order of magnitude.