Boolean Models of Genomic Regulatory Networks: Reduction Mappings, Inference, and External Control

Computational modeling of genomic regulation has become an important focus of systems biology and genomic signal processing for the past several years. It holds the promise to uncover both the structure and dynamical properties of the complex gene, protein or metabolic networks responsible for the cell functioning in various contexts and regimes. This, in turn, will lead to the development of optimal intervention strategies for prevention and control of disease. At the same time, constructing such computational models faces several challenges. High complexity is one of the major impediments for the practical applications of the models. Thus, reducing the size/complexity of a model becomes a critical issue in problems such as model selection, construction of tractable subnetwork models, and control of its dynamical behavior. We focus on the reduction problem in the context of two specific models of genomic regulation: Boolean networks with perturbation (BNP) and probabilistic Boolean networks (PBN). We also compare and draw a parallel between the reduction problem and two other important problems of computational modeling of genomic networks: the problem of network inference and the problem of designing external control policies for intervention/altering the dynamics of the model.


INTRODUCTION
One can think of a Gene Regulatory Network (GRN) as a network of relations among strands of DNA (genes) and the regulatory activities associated with those genes [1]. This general definition allows for many mathematical (usually dynamical) systems to be called GRNs. The goodness of such models is evaluated using several important criteria: the level of description of the biochemical reactions involved, the complexity of the model, the model parameter estimation, and its predictive power. There have been many attempts to model the structure and dynamical behavior of GRNs, ranging from deterministic with discrete time space to fully stochastic with continuous time space [2]. The well known central 'dogma' of molecular biology implies that genes communicate via the proteins they encode [3]. Both stages of protein production, transcription and translation, are controlled by a multitude of biochemical reactions, and are influenced by both internal and external to the cell factors. This perspective suggests that the expression of a given gene i , i.e. the quantity of either protein or messenger RNA, should be considered as a random function ) (t X i of the cell's internal and external environments. Thus, if one wants to study the dynamical behavior of a GRN, one must design a mathematical model for the gene-expression vector )) ( ),..., ( ), for the n genes that form the network. The stochastic differential equation model appears to provide the most detailed description of the dynamics of ) (t X . In principle, it could include all of the information where data come from cells operating in different contexts or include noisy observations, which implies that the model has to account for that randomness. The dynamics of a PBN can be studied in the context of Markov chains which allows for the development of control theory for the purposes of intervention. Being a collection of Boolean networks with a probability structure, the PBN model could be viewed as a minimal extension of the Boolean network which allows for modeling of the stochastic nature of complex systems with latent variables and random experimental effects. However, even such a minimal extension of the deterministic model exhibits high complexity which impedes its practical applications to model GRN of more than 40 genes. Hence, there is a need for constructing size-reducing mappings that produce new and more tractable models that share some of the biologically meaningful properties of the larger-scale models.

DEFINITIONS OF BN P AND PBN. INFERENCE FROM DATA AND EXTERNAL CONTROL
The initial application of Boolean networks as a model of genomic regulation was to study the evolution of ensembles of networks which were restricted to a specific type of fitness landscape [5,9]. Here we provide the definition of a Boolean network and briefly discuss the ensemble approach. if there is a transition from the state j i s s ! in S or 0 otherwise. Given an initial state, the network will eventually enter a set of states in G through which it will repeatedly cycle forever. Each such set is called an attractor cycle, and a singleton attractor is an attractor cycle of length 1. The network attractors induce a partition of state space S where the subset of states that belong to the same equivalence class is called the basin of the corresponding attractor cycle. The attractors of a Boolean network represent a type of memory of the dynamical system [10].
Originally, [11], analytical results and numerical simulations based on ensembles of randomly generated Boolean nets focused on the relationships between the structural gene interdependencies and dynamical behavior of the ensembles. Those studies provided insights into the general characteristics of large GRNs and the related evolutionary principles. 'Tuning up' of ensemble parameters such as the average connectivity K and the predictor functions' bias p can be used to study the operating regimes of the networks. The average connectivity is defined as the average size of the predictor sets W i , and the bias p is defined as the probability of a given predictor function to assume a value of 1. Depending on the values of K and p there are two main modes of operation of a BN : ordered and chaotic. In the ordered regime most of the system components/nodes are frozen at either 1 or 0 value, and the transfer of information is impeded by those large frozen islands of genes. In the chaotic regime, the system is very sensitive to small perturbations where a change of the value of one node can propagate to many others in an avalanchelike manner. The phase transition boundary between the ordered and the chaotic regimes is called the complex regime or critical phase. It has been shown that Boolean networks in that regime are the most evolvable and Kauffman [11] argues that life must exist on that edge between order and chaos: "a living system must first strike an internal compromise between malleability and stability. To survive in a variable environment, it must be stable to be sure, but not so stable that it remains forever static". Structural stability is one of the central concepts in the theory of dynamical systems. It describes persistent behavior that cannot be destroyed by small changes to the system. As real GRNs are capable of maintaining metabolic homeostasis and stable developmental program in the face of a changing environment, they certainly possess structural stability. The Boolean network model naturally captures this phenomenon because the network 'flows' back to one of its attractors after a small gene perturbation. Following this line of reasoning, Kauffman [11] suggests that the attractors in a BN correspond to cellular types. Another interpretation of the attractors of a BN is that they represent cellular states, such as proliferation (cell cycle), apoptosis (programmed cell death), and differentiation (execution of cell-specific tasks) [12,13]. For example, if a structural perturbation (mutation) happens which moves the network from the basin of the apoptotic attractor, the cells could exhibit uncontrolled growth or hyper proliferation, typical of tumorigenisis. The two interpretations of the attractors in the Boolean network model are complementary to each other: for a given cell type, different functional states exist and are determined by the collective gene activity. Thus, a particular cell type can encompass several attractor cycles each one corresponding to different cellular functional states. We refer the reader to [11,14,15] for a detailed treatment and additional references to results about the interplay between the average connectivity and the bias of the predictor functions in a BN and how that impacts the dynamical behavior of the network. An important implication from the body of work on the effects of these local parameters on the network is that if one wants to model GRNs with Boolean networks or their generalizations one should constraint the network connectivity in order to keep the model on the edge of chaos and closer to the ordered regime. For example, in the case of unbiased, 0.5 = p , predictor functions the networks with 2 > K operate mostly in the chaotic regime which renders such models incompatible with the real GRNs which are clearly non-chaotic systems. Although the ensemble studies can provide important insights into some general properties of the Boolean network models, a single Boolean network itself is not capable of capturing the effects of latent variables or random gene perturbations. Moreover, the ensemble approach does not provide a way of explicitly inferring the specific BN structure from data, e.g. cDNA microarray gene expression. Inferring the BN structure from data has the potential to reveal how to design therapeutic intervention for GRNs which show a specific disease phenotype. It should be pointed out that the data used for network inference exhibits uncertainty on various levels. First, due to biological variability, gene expression is inherently stochastic. Second, the complex measurement process, the microarray preparation, image acquisition and processing create experimental noise that has to be taken into account during the inference of the network. All of this combined with the presence of latent or unobservable variables such as proteins or environmental conditions present us with the problem to infer deterministic predictor functions under uncertainty. To solve such a problem one needs to reliably estimate the uncertainty. Without such estimation one cannot be sure how the designed predictor function will perform when presented with new data.
One possible way to approach this problem was proposed by Shmulevich et al. [7], and Brun et al. [8]. Keeping in mind that the predictor functions cannot be reliably estimated from the limited amount of data relative to the number of genes on a microarray slide, one can infer a number of simple predictor functions, each of which performs relatively well in predicting the target gene. Here, simpler is understood as having predictor sets i W of smaller size. After producing such predictor functions, one has to combine them together accounting for the uncertainty at the same time. This 'probabilistic' approach to synthesize 'good' predictor functions leads to the PBN model of genomic regulatory networks.
A binary PBN ) , , ( the PBN is said to be context-sensitive [8]. The context-sensitive PBNs allow for the interpretation of data obtained from distinct sources, each representing a specific cell context. Thus, one interprets data as obtained from a family of deterministic BN , and the PBN is viewed as a collection of Boolean networks in which one constituent network governs the gene activity for a random period of time before another randomly selected deterministic BN takes over which might be in response to external stimulate or activity of latent variables. In addition to the network switching and selection in the PBN model, there is mechanism which models random gene mutations, i.e. at each time point t there is a probability p of any gene changing its value uniformly randomly. Thus, the PBN model can account for the uncertainties in both data and model selection. The PBN A shares the same state space S with its realizations, and the state transition diagrams j ! of the individual j B 's combine naturally into a stochastic state transition diagram ! representing the dynamics of A . As in the case of deterministic Boolean networks, ! can be identified with a stochastic n n 2 2 ! matrix P , also known as transition probability matrix, with non-negative entries The synchronicity requirement for the state transitions in a PBN is an oversimplification of the real interactions that take place during genomic regulation. While it is not difficult to extend the PBN model into an asynchronous one [16], we do not discuss such extensions here. There are two reasons for focusing our attention to the synchronous case of PBN only. First, the model estimation from data is a much harder problem for asynchronous compare to the case of synchronous PBNs. Second, the synchronous PBN framework facilitates a simpler and clearer treatment of the problems about complexity-reducing mappings.
One can also view a given context-sensitive PBN A as a collection of r Boolean networks with perturbation It is obvious that a Boolean network with a perturbation is a special case of a PBN with just one context, 1 = r . Just as in the general case of a PBN, the dynamics of a Computing the elements of P is straightforward and we elect to present it here because of its importance in the subsequent considerations. When computing the transition probabilities for a p BN one has to realize that at every time The interpretation of a PBN as a collection of r Boolean networks with a perturbation allows to view the Markov chain representing the dynamics of the PBN as a collection of r Markov chains with a switching mechanism between them. The switching rules are defined by the PBN switching probability q and the set of selection probabilities C . This kind of interpretation of the dynamics of a PBN is advantageous when one considers problems related to control and reduction of the model's complexity.
One of the main objectives of developing mathematical models of genomic regulation is the identification of potential targets for therapeutic intervention [10]. For example, the abundance of mRNA for the gene WINT5A has been shown to discriminate well between cells' low or high metastatic competence [17]. This suggests that altering the expression of WINT5A could be perceived as a goal of a possible therapeutic intervention [18]. The PBN model of genomic regulation provides an appropriate setting for studying optimal regulatory intervention. The question about control and intervention can be posed in terms of the dynamics of the underlying Markov chain [7]. There are two different types of effects of external/control variables on the dynamical evolution of the network: either the finite-time or the infinite-time ones. The short-term control policies have been shown to affect the dynamics of the model over a small number of stages; however they do not always achieve the desired change in the long-run network behavior [19,20]. The infinite-horizon intervention strategies have been studied using stochastic control combined with dynamic programming algorithms. This approach has led to finding of stationary control policies that affect the steady-state distribution of a given PBN? Another important problem in the study of the infinite-horizon control is the identification of the best intervening gene.
The direct approach of solving the optimal control problem for each gene in the model and comparing the performance of the respective control policies is a computationally expensive procedure because the complexity of the dynamic programming algorithms increases exponentially with n -the number of genes [22]. Thus, there is a need to develop less complex algorithms for designing of sub-optimal intervention policies. Vahedi et al. [23] used a biology motivated approach to find such policies based on the mean first-passage times (MFPT) of the states S ! s [10,24]. Instead of formulating the problem about designing the optimal control in its full generality we elect to discuss a simpler version of the MFPT control policy algorithm based on a single control gene g because it facilitates the analysis of the interplay between reduction mappings and control policies for the PBN model. A control for each time step t , [21]. The values 0/1 are interpreted as off/on for the application of the control. The MFPT algorithm is based on the comparison between the MFPT of a state s and its flipped with respect to g state g s , i.e. the state that differs from s only in the value of g . When considering therapeutic interventions the state space S can be partitioned into desirable D and undesirable U states according to the expression values of a given set W of genes. For simplicity we will assume that } { = x W . The intuition behind the MFPT algorithm is that given the control gene g , when a desirable state s reaches U on average faster than g s , it is reasonable to apply control and start the next network transition from g s .
Using this representation one can compute the mean firstpassage times U K and D K by solving the following system of linear equations [25].
where e are unit vectors of the appropriate length. The vectors U K and D K contain the MFPTs from each state in D to the set U , and from each state in U to the set D respectively. The MFPT algorithm designs stationary control policies ,...} , ! to ! -a tuning parameter. The parameter ! is set to a higher value when the ratio of the cost of control to the cost of the undesirable states is higher, the intent being to apply the control less frequently. It is important to notice that while the MFPT control policy ! " , g is a sub-optimal one it can approximate well the optimal control policy which being a solution to the Bellman optimality equation is also a stationary one [23,26,27].

REDUCTION MAPPINGS FOR PBN
High complexity, both model-wise and computational, is a major impediment for the practical applications of mathematical models of genomic regulation. Hence, there is a need for size reducing mappings producing new and more tractable models that share some, preferably all, of the biologically meaningful properties of the larger-scale models. The most common approach for reducing the complexity of a network model of genomic regulation is by 'deleting' a gene from the model. One of the first such mappings, the projection mapping, was proposed in ? as an attempt to reduce the complexity of an independent instantaneously random PBN A while maintaining consistency with its original probability structure. Here, following [28], we provide the definition of the projection mapping for the general case of a context-sensitive PBN. The basic projection i ! is a mapping that transforms the given PBN A into a new one with the same parameters q and p , and such that the number of genes is reduced by one, i.e. the gene i x in the original network is 'deleted'. Without loss of generality one may assume that the deleted gene is ,.., ( {0,1},   instantaneously random PBN is given in [29], and the construction carries on with no changes for the case of a general PBN. One can immediately notice that the reduction mapping reduces the complexity of the original PBN not only by 'deleting' one gene but also by not increasing the number of BNs that comprise the reduced PBN. Here, we want to point out the difference between the reduction and the projection mappings. While the projection is based on the probability distribution of a single gene, the reduction mapping is defined using the probability distribution of the entire collection of states of the given PBN which allows for the optimization procedure given in [29]. In both cases though, there is no control over the changes in the dynamics/state transition diagrams j ! or the gene dependencies digraphs j G of the BNs comprising the  Fig. (1), and applies the reduction mapping 3 ! that removes the rightmost gene from the network, one gets a PBN ,2 , p q A that has only 2 contexts as shown on Fig. (2). Thus, some of the important structure associated with the contexts  The state transition diagram ! of a PBN represents the dynamics of the network and has been related to both cellular types [11] and cellular states [12,13]. Thus, it is desirable not to introduce significant changes in the structure of the attractor cycles and the sizes of the basins of attraction of ! when reducing the size of the model. This kind of considerations led to the development the of Dynamics Induced Reduction (DIRE) algorithm in [28].
DIRE performs reduction of a PBN by deleting genes from the network while maximally preserving the dynamics of the constituent BNs and keeping their number unchanged. The importance of the notion of inconsistency point is illustrated by examining the state transition diagram of the first context 1 BN of the PBN depicted on Fig. (1).
Suppose that the gene d corresponding to the rightmost digit in the binary representation of the states is to be deleted. If one tries to collapse the state transition diagram, one should notice that, with respect to the attractor structure of the original BN, merging the leaf node (001) and the attractor node (000) can be done in two very different ways: either the merging happens towards the attractor state or it happens towards the leaf state. In the first case, the attractor state is preserved in the reduced BN as (00), and the basin of attraction of the attractor (111) in the original network looses one leaf. In the second case, the attractor structure of the reduced network differs significantly from that of the original BN, the only remaining attractor being the reduced state (11). Thus, if one considers the attractor structure of a BN as a representation of important biological characteristics of the real genomic regulatory system, then the merging of the states (000)  DIRE has several advantages compared to the reduction and the projection mappings. First, it optimally preserves the dynamical structure of the original PBN by controlling the damage to the attractor structure of its constituent BNs, by not increasing the maximum number of transitions required for states in a given basin of attraction to reach their attractor, and by preserving the number of the BNs that form the PBN, Fig. (3). Not only the attractor structure of the constituent BNs is preserved but there are no spurious attractors being generated in the reduced network. This is not the case for the projection and reduction mappings, as examples from [28] show. Second, the reduction mapping induced by the algorithm does not introduce changes in the number of the attractors nor in the length of the attractor cycles, unless there are points of inconsistency s that together with their flipped with respect to the 'deleted' gene d states d s are also attractor states as well. In addition, DIRE ensures that there will be a very little change, on average, of the relative sizes of the basins of the attraction. All of this together with the observation that the number r of the constituent BNs, their selection probabilities, and the parameters p and q remain the same for the original and the reduced PBN implies that, with the exception of the degenerate cases where there are relatively large number of attractor states that are points of inconsistency and whose flipped with respect to the 'deleted' gene d states are also Fig. (3). State transition diagrams of the contexts of the reduced model obtained after applying DIRE with respect to the third gene to the PBN presented on Fig. (1). attractor states, the steady-state distribution of the reduced PBN produced by the mapping will match closely the steady-state distribution of the original PBN. Moreover, the new algorithm does not require any prior information for the steady-state distribution of the original PBN as in the case of the projection and the reduction mappings. One can find a study of the DIRE performance on synthetically generated data: constrained PBN generated using the algorithm proposed in [30] and randomly generated PBN, on the complementary web site http://gsp.tamu.edu/Publications/ dire.htm. The algorithm was also applied to a PBN model of a real gene regulatory network inferred from 31 malignant melanoma samples [17]. The 7-gene PBN 4 0.01,0.01, A with four contexts was built using the genes WNT5A, Pirin, S100P, RET1, MART1, HADHB, and STC2 [28]. The network model was reduced by "deleting" WNT5A using both the reduction mapping defined in Sec. III and DIRE algorithm. The steady-state distributions of the reduced networks were compared to the probability distribution * P resulting from the collapsing procedure, Sec. III, which treats WNT5A as a latent variable. That comparison is shown on Fig. (4). The MSE differences between the steadystate distributions of the two reduced networks and * P confirm that by controlling the damage on the state transition diagrams of the PBN's contexts DIRE can produce networks with steady-state distributions very similar to the original one [28]. Given the important biological interpretations of the steady-state distribution of the PBN model, DIRE reduction produces smaller and less complex networks which can be used to model the same phenomena as the larger ones.
Finally, the notion of point of inconsistency creates the opportunity for evaluating the importance of a particular gene for the network in question. This could lead to new methods for gene ranking, as well as options for applying control-based optimization which produces a favorable shift in the steady-state distribution of a given PBN and thus, could impact the design of therapeutic interventions.

THE PROBLEMS OF REDUCTION, INFERENCE AND CONTROL
The examples of reduction mappings in the previous section show that one should take into consideration both the dependencies among the genes in a GRN and the networks dynamics when designing reduction mappings. Thus, 'good' reduction mappings should preserve as much as possible the biologically meaningful properties of the original network model while taking care of its two major components: the digraph representing the dependencies among the genes and the set of functional relations that determine how the expression profile of each gene is predicted by the expression profiles of other genes. In this section we consider reduction mappings based on 'deletion' of one gene Fig. (4). Steady-state distributions for the reduced networks that were obtained by applying the reduction algorithm and DIRE on the 7-gene melanoma related PBN. at a time from the network. We state the general reduction problem for PBNs and discuss different types of constraints. We also compare that general reduction problem to the problems of inference of the PBN model from data and to the problem of optimal control.

The Reduction Problem
Because every PBN is a collection of individual BNs endowed with a probability structure and gene mutation probability, we focus on reduction mappings for Boolean networks. Consider the space n M of all BNs on n genes.
Then, having in mind the example of the multivalued projection mapping from the previous section, a reduction mapping can be defined as any set valued mapping It is important to point out that the constraints ! could be internal with respect to the model, i.e. related to the dynamical or static structure of B (the graphs ! or G ) or B (the graphs ! or G ), or external. For example, ! could be related to qualitative knowledge/description of the biological phenomena being modeled. variables. The set of constraints ! determines some of the entries in those truth tables which could significantly reduce the size of the search space. This interpretation of the reduction problem allows for algorithms that are used to infer BNs from data, e.g. [30], to be used in determining the set • Given a set of constraints ! and a reduction mapping ! there exits a maximal, with respect to the partial order induced by set inclusion, subset such that the same reduction mapping ! solves the reduction problem for every • There is a partial order induced by set inclusion for the sets of constraints. If ! is a solution to the reduction problem for a given 1 ! and n B M ! , then ! might not be a solution to the reduction problem for 2 ! and B if 2 1 ! " ! . Thus, one can look for the maximal, with respect to this partial order, set of constraints ! , ! " ! 1 , so that ! solves the reduction problem for ! and B .
• Because one of the main reasons for constructing reduction mappings is to reduce the complexity of a network model, one can see that the choice of constraints ! has a significant impact on achieving this goal. The cardinality of the set ) (B ! could be so big that the mapping ! leads to an increase of the model complexity, as the example of the projection mapping shows. Moreover, the verification if a BN ) (B ! " satisfies ! can be computationally intensive for some sets of constraints. For example, if { = ! a BN with singleton attractors only} then such a verification might require finding all of the attractor cycles in the state-transition diagrams ! of the BNs ) (B ! " -a problem known to be a NP-complete. The next example illustrates the trade-off between the size of ) (B ! " and the cost to verify that all of the BNs ) (B ! " satisfy the constraint ! .

Example 1
The basic projection mapping n ! is a solution of the reduction problem for the set of constraints

Example 2
Consider the reduction problem for the constraint This example points out to the importance of constraints related to the global properties of ! . At the same time, one has to be careful using properties of ! as constraints when solving the reduction problem. For example, DIRE algorithm uses the the entire state-transition diagram as a constraint but applying it to the network B from the above example produces B that does not possess the dependency 2 2 x x ! that is present in the gene dependencies digraph G of B . Thus, in some cases, the entire state-transition diagram appears to be too strong of a constraint when solving the reduction problem.

Reduction and Inference
The interpretation of the reduction problem as a search problem points out to its similarity to the problem about PBN inference from data. Constraints that are used when one designs network models from data could be also used when solving the reduction problem. For example, it was shown that genes which predict each other have a profound effect on the attractor structure of a BN [31]. The next definition specifies the notion of such gene interdependencies. It is common to assume that in those cases data come from the steady-state of the genomic regulatory system which in its turn implies that the majority of data points represent attractors of the modeled system. The attractor cycles in BNs that model biological networks are typically associated with phenotypes and tend to be short [11], with biological state stability contributing to singleton attractors [32]. The results presented in [31] show that bidirectional gene relationships in a Boolean network are a common cause for the presence of non-singleton attractors in its state-transition diagram. One might guess that the creation of bidirectional relationships by the reduction mapping under the given set of constraints contributes to the spurious attractor cycles in most of the networks from ) (B ! . This points out to the importance of tricycles The similarity between the problems of reduction and inference of PBNs could be explored in the other possible direction: using the knowledge about reduction mappings to infer the model from data. Constraints used by the DIRE algorithm for reducing BNs could be applied during an inference procedure [34]. Because attractor cycles are processed independently of their basins and special care is taken when a point of inconsistency is encounter during reduction one can use DIRE as a constraint in designing PBNs from data, especially when data are believed to come from the steady-state of the underlying genomic regulatory system. DIRE mapping induces a partial order in every subset of data points that is considered to be part of set of fixed points of the regulatory system. The algorithm given in [34] uses this partial order to infer a PBN from data so that the model's attractor structure is stable with respect to the DIRE reduction. A real data example using a melanoma data set [17], could be found on http://gsp.tamu.edu/Publications/ BNs/dire_ranking.pdf.

Reduction and Control
Probabilistic Boolean networks have been the model of choice in studying optimal regulatory intervention. The reason for that lies in the well developed theory of Markov chains and the associated transition probability matrices. To address the issue of changing the long-run behavior, stochastic control has been employed to find stationary control policies that affect the steady-state distribution of a PBN. The algorithms used to find these solutions have complexity which increases exponentially with the number of the genes in the network. Hence, there is a need for sizereducing mappings producing new and more tractable models whose stationary control policies induce sub-optimal stationary control policies on the larger PBN. This subsection focuses on a specific stationary control policy, Mean-First-Passage-Time control policy, and reviews the two major issues that link the reduction problem to the problem about designing the MFPT control policy. The first problem concerns the effects of the reduction mappings introduced in [ Recently, Ghaffari et al. [36] considered this inverse problem for the reduction mapping defined in [29]. That reduction mapping is induced by a selection policy that is determined by the steady-state distribution of the p BN that is being reduced. There are some obvious constraints on the possible ways to extend genes. Each set shared a common set of attractor states with no restrictions on the way they formed attractor cycles. In addition, the cardinalities i W # of the gene predictor sets were restricted to no be no larger than 3 to keep the networks from being chaotic. Furthermore, the two parameters 1 ! and ! were set equal because in practical applications one would often assume similar costs for the original and the reduced networks. The study showed that one can expect relatively significant differences for the two control policies only for very small values of ! , Fig. (5). For 2 > ! the MFPT control policy on the original networks differed less than 3% from the stationary control policy that was induced by the MFPT control policy on the reduced networks. Thus, one can use the reduced network to accurately estimate the MFPT control policy for a large interval of the parameter ! that is associated with a relative high cost of control. The difference in the average behavior of the two sets of networks presented on Fig. (5) suggests that the attractor structure of the models plays an important role in solving this inverse problem.

CONCLUSIONS
High complexity is a major impediment when computational models of genomic regulation are used to design optimal strategies for therapeutic intervention and control of disease. Thus, mappings which reduce the size/complexity of the model while preserving its important structural and dynamical characteristics become indispensable for its successful applications. The reduction problem in its very general formulation emphasizes the role of constraints in the process of designing reduction mappings for probabilistic Boolean networks. It also provides the basis for the comparison drawn between the problems of reduction and inference of PBNs from gene expression data. The similarity between the two problems allows for using reduction mappings in the process of network inference, and the application of known inference algorithms in designing reduction mappings. There is also a similarity between the problems of reduction and control when MFPT stationary control policy and selection policy-induced reduction mappings are considered. This provides the basis for investigating the question about the effects of reduction mappings on the MFPT control policy for the reduced network, and also the question of the possibility to use the MFPT control policy for the reduced network to induce a stationary control policy for the original PBN that approximates its original MFPT control policy. To date, very little is known about the cost of applying reduction mappings. A result of preliminary nature is presented in [37] where for the first time the stochastic complexity was used for that purpose. Estimating the reduction cost is important because it is not advantageous to produce less complex Fig. (5). Average effects of reduction on the control policy. model by a highly complex/expensive mapping. Furthermore, little is known about estimates of the computational savings when a reduced version of a large network is used. Clearly, the reduced models have their state space exponentially smaller compare to those of the larger PBNs. Carefully designed large scale simulation studies could provide hints about how the computational burden of using large network models of genomic regulation compares to their reduced under different sets of constraints versions.