A tractable genotype–phenotype map modelling the self-assembly of protein quaternary structure

The mapping between biological genotypes and phenotypes is central to the study of biological evolution. Here, we introduce a rich, intuitive and biologically realistic genotype–phenotype (GP) map that serves as a model of self-assembling biological structures, such as protein complexes, and remains computationally and analytically tractable. Our GP map arises naturally from the self-assembly of polyomino structures on a two-dimensional lattice and exhibits a number of properties: redundancy (genotypes vastly outnumber phenotypes), phenotype bias (genotypic redundancy varies greatly between phenotypes), genotype component disconnectivity (phenotypes consist of disconnected mutational networks) and shape space covering (most phenotypes can be reached in a small number of mutations). We also show that the mutational robustness of phenotypes scales very roughly logarithmically with phenotype redundancy and is positively correlated with phenotypic evolvability. Although our GP map describes the assembly of disconnected objects, it shares many properties with other popular GP maps for connected units, such as models for RNA secondary structure or the hydrophobic-polar (HP) lattice model for protein tertiary structure. The remarkable fact that these important properties similarly emerge from such different models suggests the possibility that universal features underlie a much wider class of biologically realistic GP maps.


Introduction
Evolution is one of the most fundamental principles in biology. While organismal genotypes are becoming accessible owing to rapid advances in sequencing technology, further understanding of the complicated mapping from sequence to phenotype is necessary for a richer understanding of evolutionary dynamics [1][2][3][4]. While the terms genotype and phenotype can be flexibly assigned in a biological system, genotypes are generally defined as the genetic material upon which mutations act and phenotypes capture the properties of the organism on which selection can differentiate. As such, the mapping from genotype to phenotype-the GP map-links mutations to potentially selectable variation and is therefore of critical importance in understanding evolutionary systems. GP maps also provide a basis for understanding important biological concepts such as mutational robustness and evolvability, which may profoundly affect evolutionary dynamics, and help determine the fundamental topologies of the landscapes upon which evolutionary processes occur [5].
In general, it is intractable to directly model the details of even small parts of a whole organismal GP map, owing to both the very large numbers of genotypes, and a lack of knowledge of all possible phenotypes. Advances have been made in recent years, however, with the use of simplified models. Three particular systems have been modelled with notable success. Firstly, genetic regulatory networks have been approximated using a variety of abstract models, including Boolean networks [6,7]. Despite their simplicity, Boolean networks have demonstrated a remarkable ability to produce biologically realistic results. they have been shown to reproduce key aspects of the yeast cell cycle [7]. Secondly, RNA secondary structure can be routinely and accurately predicted via a host of different methods [8], as a result of which it has become one of the best-known GP maps [4,9], particularly for the study of evolutionary dynamics [9,10]. Finally, the complex problem of a protein folding into a well-defined tertiary structure has been investigated using various models including the highly simplified HP lattice model, where folds are represented as self-avoiding walks on a lattice, and the full sequence is reduced to binary alphabet (H stands for hydrophobic and P for polar amino acids) [11]. Despite its heavily coarse-grained nature, this model has produced important biological insights [12] and has been shown to accurately model known features of protein tertiary structure [13]. While these models have been successful for specific biological examples, another very important advantage of their tractability is the potential for extracting more general underlying principles of GP maps, thus increasing our understanding of how evolution shapes the natural world [2,4].
In this paper, we extend this family of coarse-grained models for biological structure, exploring a very recently introduced model for tile self-assembly [14,15] which can be viewed as a highly coarse-grained GP map for protein quaternary structure (protein complexes). Understanding the formation of protein complexes is important as demonstrated by the fact that most proteins form complexes in the cell (around 80% in yeast [16]) and the function of these complexes is often strongly linked to their physical form. Protein complexes are formed by the interaction of multiple individual protein chains to form larger structures. The interaction between two chains is predominantly mediated by hydrophobic forces acting to pack together non-polar amino acids to provide fewer energetically unfavourable interactions with water [17]. Invaluable resources for the study of protein complexes are the Protein Data Bank, providing a database containing experimentally known protein quaternary structures [18], and the three-dimensional complex database [16] categorizing PDB structures with a graph representation of the interactions between different subunits and a characterization of the symmetry in a complex.
The relationship between the topology of a protein complex and the individual amino acids that make up its protein chains is highly complex owing to the multiple functionalities encoded in the protein sequence. For example, correct tertiary structure, folding pathways and other inter-protein interactions are all potential requirements for a single protein chain. Given these complexities, a direct and complete GP map of protein quaternary structure is intractable as including all required functionality would be unfeasible. Instead, in the spirit of the highly simplified HP model for protein tertiary structure, we represent the proteins as square tiles on a lattice [14,15]. Interactions between tiles model the protein-protein interactions that lead to self-assembly. In proteins, these patches can be made up of a number of different amino acids; therefore, our model provides a coarse-grained representation of the key interactions between proteins. In the model, a genotype is a sequence of characters describing interactions on the edges of the tiles, which, when combined with a selfassembly process on a two-dimensional square lattice, leads to the formation of phenotypes comprising different square tile building blocks conjoined along interacting edges. These square tile structures are known as polyominoes and thus, we refer to the model as the Polyomino model and the resulting GP map as the Polyomino GP map. They are closely related to a wider class of lattice tiling models that have a long and important history in mathematics and computer science (which we discuss further in §2).
Despite these great simplifications, and in analogy with the highly schematic HP model, RNA secondary structure models and Boolean network models of GRNs, we expect the Polyomino model to provide insight into the general structure of the full GP map for the formation of protein complexes from folded proteins. In fact, our earlier work [15] has already discussed the evolutionary dynamics of the Polyomino model, demonstrating that it can be used to rationalize the preference of dihedral over cyclic symmetries in homomeric protein tetramers [19,20].
In figure 1, we compare coarse-grained representations for RNA secondary structure, protein tertiary structure and protein complexes. On the left-hand side, the biological representation is shown and on the right, a given model representation. Through this figure, we wish to highlight two points. Firstly, that each of the model systems is a dramatically simplified version of the corresponding biological system and that the Polyomino model is of a similar order of coarse-graining to previous models. And secondly, that the Polyomino model can provide a concise representation of real protein complexes by capturing features such as the symmetry of the subunit arrangements (C 4 in this case).
In contrast to RNA secondary structure or protein tertiary structure, where structure forms through the folding of a connected string of individual entities (nucleotides or amino acids), protein complexes are built by joining separate individual entities ( protein chains). Furthermore, while for string-like self-assembly the final structure size is constrained, proteins can form unbounded structures that can be ordered or disordered. In our work here, we focus on the subset of deterministic, finite-sized structures to model protein quaternary structure. But, in principle, the Polyomino model can also capture the more general phenomenology of unbounded and non-deterministic assembly [14,15].
As an example of the link between protein quaternary structure and the structures supported by our model, in figure 2 (top) we show the haemoglobin protein complex (Hb A, [25]) and the mutant that causes sickle cell anaemia (Hb S, [26]), alongside a polyomino representation of the wild-type and mutant phenotype (figure 2, bottom). In both cases, we show that a single mutation to the respective wild-types results in the production of extended unbound structures: Hb S forms long fibre aggregates, while the polyomino case forms unbound tiling of the plane (further details in figure). While the exact geometrical results in this analogy are different, the qualitative behaviour that results from a single patch changing is the same, highlighting the utility of our model as a GP map modelling protein quaternary structure. In this paper, we do not study extended structures, but rather structures that reversibly self-assemble to the same finite structure each time.
Finally, tiling models have been used to study synthetic self-assembling systems [28], and a better understanding of the design space of polyominoes may aid in the design of these artificial systems.
This article proceeds as follows. We first describe in detail the Polyomino model and some of its fundamental properties ( §2). We then analyse a wide range of properties of the resultant GP map and compare these properties to those described, for example in [29], for the HP model and the RNA secondary structure map. In particular, we show in §3.1-3.3 that the mapping from sequences to phenotypes rsif.royalsocietypublishing.org J. R. Soc. Interface 11: 20140249 for all three models share the following general properties: redundancy (there are many more genotypes than phenotypes) leading to large neutral sets (the collection of all genotypes that map to a given phenotype) and phenotype bias (some phenotypes are associated with many more genotypes than others). A more fine-grained analysis shows that the neutral sets also exhibit component disconnectivity (not all genomes in a neutral set can be linked with single mutational steps). We proceed with a more detailed comparison of the Polyomino and RNA systems, through considering shape space covering (most phenotypes can be reached from any other phenotype with just a small number of mutations), before showing the mean mutational robustness of a phenotype (the phenotypic robustness) scales very roughly logarithmically with the redundancy of a phenotype, and finally that it is positively correlated with the evolvability (defined here as the number of other phenotypes potentially accessible from a phenotype), as postulated to hold more generally by Wagner [30]. Finally, in §4 we discuss some implications of the remarkable agreement we find between the structure of our Polyomino GP map and those of the better studied RNA secondary structure and HP maps.
2. The Polyomino self-assembly model and its associated genotype-phenotype map The process of tiling and its connection with computer science was first developed by Wang [31]. Since then, tiling models have been shown to be capable of computation and, in particular, Turing-universal computation under the condition that cooperative binding is allowed between tiles, demonstrating the ability of two-dimensional tiling systems to model computational as well as structure-forming processes [32]. Rothemund & Winfree [33] studied the program size complexity necessary to build a structure of a given size. More general considerations of the complexity of tiled structures have since been discussed in [34], with a more biological slant given by Ahnert et al. [14] and applications to artificial biological systems discussed by Rothemund et al. [28].
Here, rather than focusing on these tiles as potential computing devices or as models for complexity, we explore how they can be used to understand the GP map of a specific biological system, namely the self-assembly of finite-sized protein structures. Nevertheless, we are aware that some of our conclusions may have applications for a wider class of systems.
We now proceed with a more detailed description of the Polyomino model as a GP map.

Summary of the Polyomino genotypephenotype map
The genotype is modelled as a character string representation of a set of N t tiles which make up an assembly kit. The edges of each tile in the assembly kit are given a number which represents the interface type. Interactions between interface types are defined via an interface interaction matrix A ij . In our work  Figure 1. A comparison of how three different biological structures may be represented in a corresponding model system. Row (a) compares a version of the iconic hammerhead ribozyme (PDB reference 1RMN [21]) with its secondary structure representation (showing the bonding pattern of nucleotides) produced by the Vienna RNA package [22] shown alongside. The orange, blue and green colours in each part represent the bonded stems in the structure. Row (b) depicts a cartoon of the tertiary structure of a single chain of length 21 (chain A) from an insulin protein (PDB reference 1APH [23]) which is compared to a schematic HP lattice protein interpretation on the right. The orange and blue colours are used to demonstrate the structural feature of a-helices in the pictures. Finally in row (c), we show a protein complex (PDB reference 1BKD [24]) alongside a polyomino representation. The orange and blue colours represent the different subunits involved in the protein. The ability of the polyomino representation to capture the C 4 symmetry of 1BKD is apparent from the rotation of the labelling on the subunits.
here, we only consider one such interaction matrix type, with a total of c interface types. There are no self-interacting interface types and interface types interact in unique pairs (1 $ 2, 3 $ 4, . . . ), with the only fully neutral types being 0 and c 2 1. Defining interface interactions in this way allows the single parameter c to control the number of potential unique bond types. As such, the two parameters that describe a given parametrization of the Polyomino GP map are the tile number N t and the total number of interface types c, allowing a particular Polyomino GP map to be labelled as S Nt , c.
Below we discuss the genotype, phenotype and map used in the Polyomino GP map.

Genotype
The genotype for the set of building blocks in the assembly kit is written as a character string of length L in base K. The base we use in this paper is the total number of interface types c used in a given GP map. This allows each base to mutate to any other base at each site. The procedure of converting a genotype string into the assembly kit is part of the map.

Phenotype
In the context of the self-assembly mapping, there are several ways of classifying a polyomino structure. These may include criteria based on its overall shape and the individual tile types making up the final polyomino structure, as well as individual tile orientations. These different possibilities are discussed in [15].
Here, we will classify the phenotype according to the overall shape of the polyomino independent of origin translation and C 4 (908) rotations. Note that chiral counterparts of polyominoes represent distinct phenotypes.

Map
The map has two stages: conversion of the genotype into the assembly kit, followed by assembly of a polyomino from an assembly process involving the assembly kit and the interface interaction matrix. A diagram representing this process is shown in figure 3.

Genotype to assembly kit
The characters of the genotype are read from left to right along the string. Characters are assigned to the next blank edge of a square tile in the assembly kit. The edges are taken in clockwise order (from the top side) with all edges being written before moving on to a new block. The total number of tiles, in terms of the genotype string length, can be expressed as N t ¼ L/4.

Assembly kit to phenotype
The assembly of the two-dimensional polyomino takes place on a square lattice where individual tiles from the assembly kit are placed. The interface types on the edges of tiles can form an attractive interaction as determined by the binary interface interaction matrix A ij , with 1 denoting attraction and 0 neutrality. A bond may form between tiles if two adjacent (interacting) edges have interface types which attract, as defined by the interaction matrix. The assembly process is initialized by placing (seeding) a single tile on the lattice. We will consider only GP maps, in which the seed tile corresponds to the first tile described in the genotype. A different protocol where any tile may be used to seed the assembly is also possible and does not significantly affect the results presented here (see appendix A). The assembly then proceeds as follows: (1) Available sites on the lattice are identified. These are places on the lattice where a tile may be placed in a particular orientation such that it will form a bond to an adjacent tile that has already been placed. In the assembly algorithm, a list is kept of the position, tile type and orientation for each possible tile placement. A potentially infinite number of tiles, in equal proportions, are available. (2) A random available site on the lattice is chosen.
(3) The chosen site is filled with the associated tile and with the corresponding orientation.
These steps are repeated until either: -There are no available sites for bonding.
-The structure grows beyond a certain width or height D max , which is taken as a proxy for unbounded assembly, so that the resulting phenotype is described as unbound.
We set D max ¼ 16 here, but our results are not sensitive to this cut-off as, for the polyomino systems we study here, there are no bounded structures larger than this.
At this point, the assembly process is terminated and the structure produced is recorded. No re-seeding takes place as we only consider the assembly of a single connected structure. To test whether the structure is deterministic, the assembly process is repeated k times, with each C 4 rotation of the final structure checked against the recorded structure. If there are any differences between any of the k assemblies, the phenotype is classified as non-deterministic. Phenotypes that are classified as unbound or non-deterministic structures are represented by a single phenotype, which we refer to as the undetermined (UND) phenotype and is assumed not to be biologically relevant in this context. A more detailed discussion of the classification of polyomino structures is given in [15].

Methods for RNA and hydrophobic-polar lattice protein models
RNA secondary structure was modelled with the Vienna package [22] with the default parameters. Single isolated base pairs were permitted (see [35] for further discussion of this choice). The HP lattice model data were taken from the enumeration performed by Irbäck & Troein [36] over the set of non-compact folds for L ¼ 25 and reproduced results discussed in their original work as well as that of Ferrada & Wagner [29].

Properties of the Polyomino genotypephenotype map
In this section, we analyse the Polyomino GP map by making measurements of redundancy, phenotype bias and component In the genotype, the characters are underlined with the colour of the block they are assigned to. The interface interaction matrix is not shown explicitly, but throughout our work we assume the convention of interface types interacting in pairs (1 $ 2, 3 $ 4 . . . ), with 0 and c 2 1 being neutral. The interaction matrix together with the assembly kit are passed to the assembly algorithm, which is used to produce the phenotype. (b) An example of the assembly process is shown, which proceeds in discrete time steps. The first tile in the assembly kit is used to seed the assembly. Further time steps follow, with the identification of available sites (depicted in light green and blue in the second picture from left), followed by the random choice of an available site and placement of the corresponding tile (a blue tile is placed on the blue available site in this example). Assembly is terminated when there are either no more available sites (as above), or if the structure is larger than D max (number of blocks) in width or height after which we assume the structure will no longer terminate but grow indefinitely. rsif.royalsocietypublishing.org J. R. Soc. Interface 11: 20140249 numbers, before moving on to analyse the properties of shape space covering, robustness and the relationship between robustness and evolvability. For each measurement, comparisons are made with the RNA secondary structure model and, for the first three of these properties, a HP lattice protein model.

Redundancy
It is now well established that many different sequences can generate similar protein or RNA phenotypes [37]. This largescale redundancy is the basis, for example, of the molecular clock hypothesis, which is widely used for inferring phylogenetic relationships [38]. It is therefore not surprising that such redundancy (also known in the literature on GP maps as degeneracy/designability) has been observed in model RNA [9], HP lattice protein [13] and genetic regulatory [39] GP maps, as well as in more general model systems such as signalling networks [40]. As such, redundancy is expected to be a typical feature of GP maps.
In table 1, we show the number of genotypes (N G ) and phenotypes (N P ) for different polyomino, RNA and HP GP maps. As expected, all three models show significant redundancy. The Polyomino model displays large-scale redundancy shown by the vastly fewer phenotypes in comparison to genotypes for each of the GP maps presented. As can be seen in table 1, this is of a similar order to the RNA GP maps. For example, for RNA L ¼ 12 and Polyomino S 2,8 (which both have 1.7 Â 10 7 genotypes), there are 57 phenotypes for the former and 13 for the latter.

Phenotype bias
On top of the measure of redundancy in GP maps, it is also possible to consider phenotype bias, the idea that some phenotypes are much more redundant than others. Examples of bias can again be seen in RNA [9,35] and the HP model [12,13,29]. We demonstrate bias in the GP maps in two ways. In table 1, we present the fraction of phenotypes (N cov /N P , where N cov is the number of phenotypes) that cover 95% of the genotype space. This statistic shows that in all three GP maps (RNA, HP and Polyomino), of the deterministic/non-trivial space, a smaller fraction of phenotypes is required to cover a given amount of the genotype space, indicating a bias in the number of genotypes assigned to each phenotype. In figure 4, we plot the frequency (fractional coverage of genotype space) of each phenotype against its frequency rank in three different Polyomino GP maps, as well as for the RNA L ¼ 12 GP map. In each of the Polyomino GP maps, we see an approximately linear trend between logfrequency and log-rank, although there are not enough data for this to be considered a power law. The data from the RNA GP map also show a distinct bias, although the relationship is more complicated than the more apparent linear form seen in the Polyomino GP maps. Taken together these statistics show a significant amount of phenotype bias in the Polyomino GP map along with the RNA and HP GP maps.

Component disconnectivity
Further to the consideration of redundancy and bias of phenotypes in genotype space, we can ask how many components a Table 1. Redundancy, phenotype bias and components in Polyomino, RNA and HP GP maps. Comparing the number of phenotypes (N P ) to the number of genotypes (N G ) for each GP map highlights large-scale redundancy present. Phenotype bias is demonstrated in each map with the measure of the fraction of phenotypes that covers 95% of the genotypes (N cov is the number of phenotypes that covers the 95% of genotypes). In all cases the fraction of phenotypes is significantly smaller than the fraction of genotypes being covered, indicating the presence of a strong phenotype bias. The final column is the total number of genotype components (N C ) in each GP map. In all cases (non-computable values left out), the number of components is larger than the number of phenotypes, indicating phenotypes tend to be spread out over multiple disconnected components. RNA data for L ¼ 12 were computed from the Vienna package [22] and taken from [41] for L ¼ 15, 20. The value of N p for the Polyomino S 4,16 GP map is approximate as it is found from large-scale sampling of the GP map over multiple runs of the algorithm presented in [42]. All other Polyomino results were found by exhaustive enumeration. The HP results were calculated from the data made available by Irbäck & Troein [36]. Non-deterministic phenotypes and the trivial structure in RNA are excluded from the statistics. given phenotype forms in a particular GP map. In graph theory, a component is a set of nodes that are connected, and thus when applied to genotype spaces, is a set of connected genotypes with the same phenotype. Here, we consider only point mutations. In RNA, simple biophysical considerations show that two point mutations are needed to turn a purinepyramidine bond into a pyramidine-purine bond. This neutral reciprocal sign epistasis leads to many individual components [43,44]. We expect similar behaviour in polyominoes, where there may be several ways of encoding a given phenotype that are not connected through neutral point mutations owing to there being multiple interface types.
In the final column of table 1, we show the number of genotype components (N C ) for four different GP maps: Polyomino S 2,8 , RNA L ¼ 12, RNA L ¼ 15 and HP L ¼ 25 (Polyomino S 3,8 and S 4,16 and RNA L ¼ 20 GP maps were unobtainable owing to computational expense). In all four, we see a substantially larger number of components than phenotypes, indicating that phenotypes tend to form disconnected components. Although many components may be small, it is still the case that whole-phenotype properties need to be considered carefully in the context of evolutionary dynamics, as it is not typically possible to access every genotype of a given phenotype through neutral point mutations.

Shape space covering
GP maps typically exist in high-dimensional genotype (sequence) spaces, a property with counterintuitive consequences and a topic of current biological research for GP maps [45,46]. For example, for an alphabet of K letters (e.g. K ¼ 4 for RNA or K ¼ 20 for proteins) the number of genotypes scales exponentially as K L , while the number of mutations needed to reach any two genotypes is at most L, and so scales linearly. In practice, considerably fewer than L mutations are needed to connect any two phenotypes. This property has been studied most extensively for RNA GP maps, and in that context Schuster et al. [9], borrowed the term shape space covering from its original use in immunology [47]. The HP model has also been considered with respect to shape space covering [13,29] with seemingly less rapid space covering.
We explore shape space covering in a similar way to Ferrada & Wagner [29], who compared RNA and HP models, building on work of earlier investigators. Shape space covering is measured by counting the average fraction of phenotypes found when applying a given number of independent mutations (the radius, n) to a genotype in the GP map. This process involves picking a sample of genotypes and then measuring the number of phenotypes found in the ball of genotypes n-mutations around each of them. Ferrada & Wagner [29] found that in both the RNA secondary structure and HP lattice protein GP maps, the fraction of phenotypes discovered increases in a roughly sigmoidal fashion with the increasing number of independent mutations applied to the source genotype and at a greater rate for RNA than the HP GP map.
In our work, we measure shape space covering in a similar manner: we picked a sample of 1000 genotypes uniformly across all phenotypes and measured the phenotype of genotypes at each given radius (n). In figure 5, we plot the average fraction of phenotypes found for the Polyomino S 2,8 , S 3,8 and RNA L ¼ 12 GP maps. The general behaviour of both the Polyomino and RNA GP maps is similar. Phenotypes are almost completely covered within half the sequence length, and at a slightly higher rate in the Polyomino GP maps. This provides clear evidence to suggest that the Polyomino GP map has shape space covering-most phenotypes can be found within a ball containing many fewer genotypes than the entire space.
Ferrada & Wagner [29] found that shape space covering was dependent on the exact group of phenotypes under consideration-exploring around genotype networks of the most frequent phenotypes, resulted in a more rapid shape space covering than from random genotypes. In appendix A, we provide additional discussion of phenotype parametrizations in the Polyomino GP map. From the general lack of variation in Polyomino GP map properties such as redundancy and phenotype bias discussed there, we would expect shape space covering to exhibit similar behaviour for other conceivable deterministic, bound phenotype parametrizations.

Robustness
Biological systems need to be robust and consistently produce the same phenotype in response to environmental perturbations or to genetic mutations [2]. Here, we consider only the phenotypic effects of mutations to the genotype. Mutational robustness describes the invariance of a phenotype as a result of mutations to the genotype. We focus here on single point mutations such that the fraction of 1-mutants that have the same phenotype as the original genotype under examination is defined as the genotypic robustness. For a genotype g, with phenotype p, the genotype robustness can thus be expressed as follows:  where r g is the genotypic robustness of g, n p,g is the number of 1-mutant neighbours of g with phenotype p, K is the base and L is the sequence length (there are a total of (K 2 1)L 1-mutants for any genotype). The robustness of the phenotype can then be considered as the average of this quantity over all genotypes with phenotype p, resulting in where r p is the phenotypic robustness and P is the set of genotypes with phenotype p.
In figure 6, we plot the phenotypic robustness r p against the frequency of each phenotype f p for the two Polyomino GP maps S 2,8 and S 3,8 and the RNA L ¼ 12 GP map. Corroborating previous results [41,43], it can be seen that the RNA phenotypic robustness scales logarithmically with phenotype frequency-the exact coefficient of scaling being phenotype dependent [43]. This linear trend with log-frequency is very approximate for the Polyomino GP maps, although the robustness can still be seen to strongly scale with phenotype frequency. Both Polyomino GP maps exhibit a slightly smaller phenotypic robustness at a given frequency when compared with the RNA GP map. Nonetheless, despite these two observations, the clear indication here is that Polyomino GP maps have a phenotype robustness that scales similar to phenotypes in the RNA GP map.

Robustness versus evolvability
Robustness and evolvability are two evolutionary properties that have received much recent attention in the literature [2,30,48 -51]. As discussed in §3.5, robustness is the ability of an organism to maintain its phenotype if its genotype is mutated. Evolvability on the other hand is considered as the capacity for producing phenotypic variation [1,30]. One might expect that for an individual to be robust it would have to compromise its ability to produce variation (to be evolvable). Using the RNA GP map, Wagner [30] demonstrated that this expected trade-off exists for individual genotypes, but that the two properties are in fact positively correlated at the level of phenotypes. A possible geometric explanation is that phenotypes can become more mutationally robust as a result of increased redundancy (more neutral neighbours). This in turn leads to connected components spreading out further across genotype space and possessing a greater 'surface', thus allowing a greater number of phenotypes to be accessible through neutral mutations.
Wagner's argument is really about the static structure of genotype space and does not account for any dynamical evolutionary effects. Other authors have considered further static properties, including a different notion of phenotypic evolvability (diversity evolvability) [52] defined as being the probability that two non-neutral mutations lead to different phenotypes. Such a definition of evolvability was found not to be correlated with robustness in the RNA GP map. A dynamical study by Draghi et al. [53] demonstrated that evolvability could depend non-monotonically on robustness. In our work here, we simply consider whether the static properties of the Polyomino GP map behave in a similar manner to those of the RNA GP map as demonstrated by Wagner [30] without here commenting on the wider debate of how robustness and evolvability relate.
In the left-hand plots of figure 7, we show fractional evolvability versus fractional robustness, for genotypes and phenotypes in both the RNA L ¼ 12 and Polyomino S 3,8 GP maps. The genotype plot is a binned version of 100 randomly sampled genotypes for each phenotype (apart from the non-deterministic/trivial structure) in each GP map. The fraction of all phenotypes that can be produced in any 1-mutation to each sampled genotype (genotypic evolvability, e g ) is plotted against the fraction of those 1-mutations that produce the same phenotype as that of the genotype being tested (genotypic robustness, equation (3.1)). For both polyominoes and RNA, we see a significant negative correlation. This expected negative correlation simply represents the trade-off between genotypic evolvability and robustness at the level of the individual genotype.
In the right-hand plots of figure 7, we plot phenotypic evolvability against phenotypic robustness. The phenotypic evolvability e p of phenotype p is defined here as the fraction of all phenotypes that can be reached in a single mutation from any genotype with phenotype p. We also refer to this as the potential evolvability because it represents the potential number of phenotypes that could be reached. Whether they can be reached depends, of course, on details of the evolutionary dynamics. For example, if only single mutations are available then e p should really be defined with respect to the relevant component [44]. Phenotypic robustness is defined in the same way as in §3.5, as the average genotypic robustness over all genotypes with phenotype p (equation (3.1)). In the plots, we see strong positive correlations for both the RNA L ¼ 12 and the Polyomino S 3,8 GP maps, as expected. In other words, for both maps, phenotypes with a greater redundancy can be generated by a larger number of genotypes and are therefore, on average, more likely to be mutationally robust. Furthermore, such phenotypes are also likely to have a larger set of genotypes, and therefore a greater diversity of other phenotypes potentially log 10 frequency, log 10 f p phenotype robustness, r p Figure 6. Robustness in GP maps. Phenotypic robustness is plotted against frequency (log scale) for the Polyomino S 2,8 (green), S 3,8 (blue) and RNA L ¼ 12 (yellow) GP maps. The top three points in the right-hand side of the plot are the UND/trivial structure in the respective Polyomino and RNA systems. In the RNA case, robustness scales linearly with log-frequency, with the exact coefficient of scaling being phenotype dependent [43]. While the trend is very roughly linear for the Polyomino GP maps, there is still a strong positive scaling relationship, demonstrating that polyomino phenotypes exhibit phenotypic robustness in a similar manner to the RNA GP map. rsif.royalsocietypublishing.org J. R. Soc. Interface 11: 20140249 accessible from this set. Thus, phenotypic robustness and potential evolvability are positively correlated.

Discussion
In this paper, we have explored the properties of a GP map for biological self-assembly of the kind exhibited by protein quaternary structure based on a recently introduced Polyomino model for tile assembly [14,15]. We compared its properties to models of RNA secondary structure and the HP model for protein tertiary structure. As is the case for these two well-studied GP maps, we argue that even though our Polyomino model is highly schematic and thus misses many details of protein quaternary structure, it may nevertheless provide important biological insight into the structure of the design space for protein complexes.
Despite the great complexity in potential phenotypes, the polyomino model remains tractable as demonstrated in this paper by the ability to perform a wide variety of useful measurements on the GP map. Our main results are: firstly, the Polyomino model exhibits large-scale redundancy, a strong phenotype bias and the presence of disconnected genotype components across several parametrizations of the Polyomino GP map ( §3.1 -3.3). Secondly, that shape space may be covered in only a fraction of mutations-that is, all phenotypes are a significantly smaller number of mutations away from each other than the total sequence length ( §3.4). Thirdly, phenotypic robustness scales very roughly with the logarithm of the phenotype frequency ( §3.5). And finally, genotypic robustness and genotypic evolvability are negatively correlated, while phenotypic robustness and phenotypic ( potential) evolvability are positively correlated ( §3.6). The Polyomino model describes the self-assembly of disconnected units (proteins) into finite-sized structures ( protein clusters) that can vary in size. By contrast, for RNA secondary structure and the HP model for protein tertiary structure, strings of connected units (nucleotides or amino acids) assemble into shapes of a fixed size. Given the substantially different class of the phenotypes in our model, it is remarkable that the measured properties of these GP maps turn out to be so similar. This begs the question of whether what we observe is in fact a more general property of selfassembling systems, or even broader, whether a wider class of GP maps will share these properties. This question can be sharpened by looking at the different properties separately. Redundancy should be widely shared across GP maps. Phenotype bias has been observed in models for gene-regulatory networks [54] and developmental networks [55]. Could it be a more general property of GP maps? Disconnected components have also been observed in a Boolean threshold model for gene regulation [56]. To the best of our knowledge, shape space covering has not been studied for other GP maps, but general considerations based on the high-dimensionality of genetic space suggest that something like this may be rsif.royalsocietypublishing.org J. R. Soc. Interface 11: 20140249 more widely relevant [57]. Finally, the correlation between phenotypic robustness r p and potential evolvability e p is deeply connected to the geometry of neutral sets and so is likely to be a much more general property of GP maps. How this correlation plays out for realistic evolutionary dynamics is, of course, a much more complicated question.
We are hopeful that more complete answers will be derived through further analysis and comparisons of different model GP maps in a similar manner to the work here or, for example, in [29] or [2]. An important related question is whether model GP maps for parts of systems can be combined to achieve a more complete understanding of the evolution of phenotypic traits in the full organismal GP maps [37].
The Polyomino model can easily be adapted to study unbounded assembly or the assembly of synthetically produced objects like DNA tiles or patchy colloids [28]. Thus, the perspective gained from viewing polyominoes as a GP map may also shed light on the artificial design process for these systems.
Finally, although we introduce the Polyomino model as a coarse-grained model for protein quaternary structure, it is clear that the model is not capable of modelling particular intricacies of some individual proteins. For example, F1-ATPase is a protein whose subunit structure could not be accurately represented with a square tile model. We argue that the Polyomino model nevertheless provides biological insight into questions about the global nature of the GP map that would not be computationally accessible using more complex models with greater biological detail.
However, further work is needed to assess how well this new GP map performs in this respect.
Appendix A. Methods: interface types, assembly and phenotype In this section, we discuss how variation in number of interface types, phenotype definition and assembly process can affect properties of the GP map produced. We performed GP map analysis of S 2,4 , S 2,6 , S 2,8 , S (any tile may seed the assembly process) and in figure 8 show how redundancy, phenotype bias and robustness vary.

A.1. Interface types
First, we discuss the impact of varying interface type numbers, c. As the number of interface types is increased (c ¼ 4, 6,8), we see that the number of phenotypes (N p ) increases from 8 before saturating at 13 (S 2,c!6 ). For the adjacent integer interface interactions, we have defined for these GP maps, an increase in N p before saturating for a certain value of c is expected to be a general behaviour due to the following rationale: there are 4N t edges in the assembly kit and thus for c ! 4N t , every edge may be given a unique interface, resulting in no further new interactions being able to be introduced to the assembly kit by further increasing c. In general,   (any tile may seed the assembly process). In particular, as number of interface types is increased, N p increases up to saturation where no more new phenotypes may be constructed. The addition of a self-interacting interface type (S 0 2,8 ) allows a further increase in the number of phenotypes. Defining phenotype identity for chiral counterparts only removes a single phenotype in S c 2,8 . Allowing the assembly process to be seeded by any tile in the assembly kit in S g 2,8 reduces N p by two phenotypes as further specificity is required in the assembly kit to ensure deterministic assembly. The overall properties of phenotype bias and robustness scaling with log-frequency are qualitatively invariant. rsif.royalsocietypublishing.org J. R. Soc. Interface 11: 20140249 we expect N p to saturate for smaller values of c due to the deterministic assembly criterion.
We also consider the effect of introducing a self-interacting interface type (7 $ 7) to the standard S 2,8 GP map. In figure 8, we see what nine more phenotypes are now produced, although the overall number of phenotypes remains of the same order of magnitude.

A.2. Phenotype definition
As discussed in §2, there are several ways of defining a polyomino phenotype and, as well as being biologically more or less relevant, can possibly affect the GP map properties discussed in this paper. The approach taken in our work has focused on treating the self-assembly process in two dimensions, leading to tiles with distinct chirality being used in the assembly process, as well as distinct chirality being applied to the overall phenotypes. In this section, we discuss one particular alternative GP map where the chiral counterparts of a given polyomino are treated as the same phenotype with all other aspects of the GP map definition remaining the same (S c Nt,c ). In figure 8, we depict the effect on phenotype numbers in the column S c 2,8 with N p only reduced from 13 to 12 (a reduction of N p by 7.14%) as there is only a single phenotype with a distinct chiral counterpart. The change in N p becomes more pronounced for larger systems-in S c 3,8 a 30.41% reduction and in S c 4,16 43.03% reduction. We note that the change in N p can at most be 50% when every phenotype has a chiral counterpart, and owing to the symmetries of the way assembly kits are constructed in the Polyomino GP map, chiral counterparts would have precisely the same frequencies, component numbers and robustness as each other. With these considerations, we can see that chirality is a property that will not dramatically affect any of the overall GP map properties discussed here, but may be important when considering particular biological situations.
In some GP maps, changes in phenotype definition can have a more dramatic effect. For example, in the HP lattice model, considering phenotypes only from the subset of maximally compact structures greatly affects the number of potential phenotypes. For those of length L ¼ 25 the size of the non-compact set (that may be produced by at least a single sequence from the 2 25 ) is 107 336 [36], in comparison to the compact set of 549 [58], a difference of several orders of magnitude. The strong effect here is due to genotypes that have multiple minimum energy structures in the non-compact case being reduced to a unique structure in the compact case.
An analogous definition for the Polyomino GP maps would be a phenotype reclassification that resulted in members from the set of non-deterministic or unbound phenotypes giving rise to large numbers of new meaningful phenotypes. Such ideas are certainly conceivable-unbound behaviour may be deterministic and thus may be categorized as individual phenotypes in their own right by a description of a repeating unit in the structure. However, detailed categorization of such formulations may be regarded as less biologically relevant-most biological functions performed by protein complexes rely on deterministic bound assemblies. However, we note that the Polyomino GP map may be parametrized to consider such cases, as shown in the haemoglobin example in figure 2.

A.3. General tile seeding
A final variation of the GP map considered is the seeding of the assembly process itself. In the S Nt , c GP maps tested here, it is always the first tile in the assembly kit which is used to begin the assembly process. In figure 8, we demonstrate the effect on N p , phenotype bias and robustness, in allowing any randomly chosen tile to seed the assembly (general tile seeding, first discussed by Ahnert et al. [14]). We denote this assembly protocol as S g Nt,c . As can be seen in figure 8 For GP maps enumerated exhaustively, the enumeration is performed in two steps: first the space is sampled randomly with large values for k (repeat assemblies) and D max , to check for the largest structures that may be produced deterministically. As mentioned in §2, D max ¼ 16 was found to be sufficient for all GP maps studied here. With regard to determinism checking, for low values of k some genotypes will be mistakenly classified as deterministic, while if they were assembled a greater number of times, a different polyomino would eventually be produced. To reduce the chance of error to essentially 0 (it can never be fully 0 as the process is stochastic), in the S 2,8 GP map we used values of k ¼ 1000. For S 3,8 , computationally more challenging owing to the much greater number of genotypes, we were only able to set k ¼ 200 for the full enumeration, although from sampling at k ¼ 5000, we found the exact number of phenotypes, as well as finding the D max ¼ 16 value sufficient. Furthermore, from analyses of genotype sampling for values of k . 200, we saw only rare misclassification, leading to negligible error in this regard.
Owing to the much greater size of the S 4,16 GP, this could only be enumerated through a sampling approach and we made use of the algorithm developed by Jö rg et al. [42]. We used smaller values of k ¼ 20 to aid the speed of the algorithm, with any error due to misclassified deterministic assembly being absorbed into the error predicted by the algorithm, which is shown to be small in our work. The algorithm of Jö rg et al. [42] relies upon a distance measure between phenotypes A and B. We constructed such a measure between polyominoes based upon the number of blocks that would need to be moved to produce B from A. If there is a difference in sizes of A and B, the number of blocks required to be added or removed is also included in the distance. This simple measure allowed an effective implementation of the algorithm.