Chromosomal Mcm2-7 distribution and the genome replication program in species from yeast to humans

The spatio-temporal program of genome replication across eukaryotes is thought to be driven both by the uneven loading of pre-replication complexes (pre-RCs) across the genome at the onset of S-phase, and by differences in the timing of activation of these complexes during S phase. To determine the degree to which distribution of pre-RC loading alone could account for chromosomal replication patterns, we mapped the binding sites of the Mcm2-7 helicase complex (MCM) in budding yeast, fission yeast, mouse and humans. We observed similar individual MCM double-hexamer (DH) footprints across the species, but notable differences in their distribution: Footprints in budding yeast were more sharply focused compared to the other three organisms, consistent with the relative sequence specificity of replication origins in S. cerevisiae. Nonetheless, with some clear exceptions, most notably the inactive X-chromosome, much of the fluctuation in replication timing along the chromosomes in all four organisms reflected uneven chromosomal distribution of pre-replication complexes.


Did the authors themselves perform the control experiments in conditions matched to those of the ChEC-Seq experiments? Several recent papers have debated the merits of ChEC-Seq [1][2][3][4], and it seems clear that the MNase controls should be closely matched to the experiment performed [5]. It is not clear from the details provided by the authors if that is the case here. For HeLa cells, the authors used published data. It seems unlikely that this is a sufficiently well matched control.
Indeed, the biggest difference in some cases appears to be the unusually low background in the ChEC-Seq experiments. Since the authors have declined my request to provide numerical metrics of read density in the heatmaps (a decision I still disagree with), it is impossible to tell whether the differences between the MNase coverage and ChEC-Seq are simply the result of coverage imbalances. I attempted to get this information from one of the supplementary tables; however, the MNase experiments are not listed in the table. Thus, I cannot conclude much from these data as presented. Controls should be presented far more clearly and unambiguously.
3. The authors state that: "Of particular note is our observation that they do not show the preferential association with replication initiation sites over replication termination sites that we see with Mcm-ChEC in S pombe (S10 Fig) and HeLa cells (S16 Fig)." I would caution the authors that while replication initiates at relatively well defined loci, termination seems likely to be more stochastic, spread across a far wider region than initiation is in most cases. Thus, these side-by-side comparisons of initiation vs termination regions should be taken with this precaution in mind and are not suitable as the primary evidence that their experiments are working.
4. The authors provide numerous scatterplots to compare ChEC-Seq to free MNase. These comparisons raise multiple questions. First, the authors should explicitly explain what exactly they are plotting. Is it all reads, or reads of a particular size? In S. cerevisiae (Fig. S2), the scatterplots are generated using 100 bp windows, which is reasonable, as it reflects the scale of the expected ChEC-Seq signal. However, in other organisms, the authors have performed these comparisons in 10 kb bins across the entire genome. This is the wrong scale for this comparison, and much of the correlation (or lack thereof) is potentially driven by reads derived from the genomic background (S13, S17). Finally, the correlations for ChEC-Seq are all a little unusual because they appear to lack any stochastic genomic background. Is this the result of some data processing not described in the figure legend? Such background appears to be absent in the heatmaps too, but this is not explained by the authors. I raised this point in my initial review, but it has not been addressed. Importantly, it is not clear how much the correlation with MNase-Seq would improve if their analyses were restricted to just the places where MCM-ChEC-Seq shows a "peak" or signal of some sort. This should be explored in depth.
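To illustrate the bin-size concern in point 4, here is a minimal sketch on purely synthetic values (not the authors' data, and not their pipeline). It assumes two hypothetical tracks that share a slowly varying background but have sharp peaks at different positions. The helper names `rebin` and `pearson` are illustrative only.

```python
import math

def rebin(counts, factor):
    """Sum consecutive groups of `factor` fine bins; a trailing partial group is dropped."""
    n_coarse = len(counts) // factor
    return [sum(counts[k * factor:(k + 1) * factor]) for k in range(n_coarse)]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Synthetic toy tracks: 4000 fine bins, each standing in for a 100 bp window.
N_FINE = 4000
# Shared, slowly varying background (assumed identical in both tracks).
background = [10 + 5 * math.sin(2 * math.pi * i / 2000) for i in range(N_FINE)]
# Sharp peaks once per 100 fine bins, at DIFFERENT offsets in the two tracks.
track_a = [b + (100 if i % 100 == 10 else 0) for i, b in enumerate(background)]
track_b = [b + (100 if i % 100 == 60 else 0) for i, b in enumerate(background)]

# Correlation at the fine scale (100 bp) and after pooling 100 bins (10 kb).
r_fine = pearson(track_a, track_b)
r_coarse = pearson(rebin(track_a, 100), rebin(track_b, 100))

print(f"100 bp bins: r = {r_fine:.2f}")
print(f"10 kb bins:  r = {r_coarse:.2f}")
```

In this toy case the peaks never coincide, so the fine-scale correlation is weak, yet the 10 kb correlation is essentially perfect because each coarse bin pools one peak from each track and only the shared background varies. A high genome-wide correlation at 10 kb therefore says little about agreement at the scale of the ChEC-Seq signal.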
5. The authors do not show any data for replicates at high resolution. This is essential to validate the consistency of their technique.

Figure S3:
The data for individual panels are not the same as the grouped panel. In particular, the MCM6 data are shifted.

Reviewer 1; point 6:
Figure S12B: The authors provide a new figure that helps to better infer the MCM2 ChEC-Seq fragment length in different species. This figure corroborates the issue I raised in my review but the differences between species are still not discussed by the authors.

Reviewer 1; point 7:
The authors agree that their method cannot distinguish between double MCM peaks and multiple binding sites for single MCMs. Nonetheless, they persist in stating that an advantage of ChEC-Seq is precisely that they can do this analysis, both in the legend of S6:

"indicates that the two MCM DH footprints at this origin mostly reflect signals that emanate from two populations of cells, each with a single MCM DH"
and in the introduction:

"Mcm ChIP, while adequate for determining the relative MCM abundance among the origins, lacks the necessary resolution to unambiguously resolve the exact number of loaded MCM complexes. The high resolution of the ChEC technique overcomes this barrier and thus enabled us to address the question directly"
Similar comments are also made in the discussion and should be rectified.

Reviewer 1; point 11:
The authors have not performed this analysis and have ignored this suggestion. Instead, and despite their acknowledgement that ChEC-Seq cannot make inferences about binding multiplicity, the authors persist in claiming that:

"Our results contradict the previous conclusions by Das et al., and we now note this in the Discussion."
Das et al. showed that multiple MCMs appear to load at yeast origins. At one origin they studied in detail (ARS1; aka ARS416), they observed that three MCM complexes load at this locus on each plasmid in their system. A quick glance at Figure S4 shows multiple (2-4) peaks at this locus using ChEC-Seq. Taking these data together, I would suggest that, in contrast to the assertions of the authors, the ChEC-Seq experiment may instead be showing exactly the pattern expected if multiple MCMs load at the same time.

Reviewer 1; point 12:
(Fig 6) The authors have inexplicably provided variance estimates for the figure as a separate table instead of as error bars. I understand that the standard deviations are huge relative to the mean, but that is exactly why they should be shown on the graph! A boxplot would be a better solution and this should be remedied. Also, it is not clear why the inclusion of other chromosomes would make "computational analyses simpler and less error-prone". Although this is commendably honest, the authors should explain what "errors" arise when using other chromosomes. Finally, the numerical values on the y-axis still do not match the (revised) numbers in the figure legend.
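As a sketch of what a boxplot would convey that a mean with a separate table of standard deviations does not, the following computes the usual box-and-whisker summary. The input values are invented toy numbers, not taken from Fig 6. The function name `box_summary` and the 1.5 x IQR whisker rule are assumptions chosen for illustration.

```python
import statistics

def box_summary(values):
    """Quartiles, whisker ends and outliers using the common 1.5 * IQR rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),    # lowest value still inside the fences
        "whisker_high": max(inside),   # highest value still inside the fences
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }

# Hypothetical per-bin signal values: mostly small, with one extreme bin.
toy_values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 40]

mean = statistics.mean(toy_values)
sd = statistics.stdev(toy_values)
summary = box_summary(toy_values)

print(f"mean = {mean:.1f}, sd = {sd:.1f}")
print(summary)
```

For these toy numbers the standard deviation exceeds the mean, while the quartiles show that most values lie in a narrow range and a single outlier accounts for the spread. That distinction is what the figure itself should make visible.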

Reviewer 1; point 15:
This point was modified in one place in the text but not others. It should be checked throughout.

Reviewer 1; point 18:
The authors have neither addressed my request to examine MCM2 ChEC-Seq at high resolution in human cells, nor explained why they have ignored this request. High-resolution proxies for origins of replication in human cells include ORC2 ChIP-Seq [6], Ini-Seq [7] and SNS-Seq in HeLa cells [8] (GSE134988). SNS-Seq data from mouse are also available (Cayrou et al.

Reviewer 1; point 19:
In human cells, although some initiation appears to occur in zones spanning tens of kb, experiments such as those mentioned above also identify discrete sites, <1 kb in width, that represent putative origins of replication. These could be assessed.
I also disagree that showing mouse data in Figure 2A is redundant. There is no figure that shows high-resolution mouse data, analogous to Figure 2A (top). Given the high-resolution nature of the ChEC-Seq protocol, this is essential.

Reviewer 1; point 22:
The authors have not responded to my questions.

Reviewer 1; point 23:
This point still requires clarification.