Quantifying individual influence in leading-following behavior of Bechstein’s bats

Leading-following behavior as a way of transferring information about the location of resources is widespread in different animal societies. However, it cannot always be observed directly. Here, we develop a general method to infer leading-following events from observational data if only the discrete appearance of individuals is recorded. Our method further allows us to distinguish such events from local enhancement at the resource, such as swarming behavior in the case of bats, which is another widespread way of transferring information among animals. To test our methodology, we analyze longitudinal data about the roosting behavior of Bechstein’s bats from two different colonies and different years. The detection of leading-following events allows us, in a second step, to construct social networks in which nodes represent individual bats and directed and weighted links represent the leading-following events. We analyze the topology of these networks on the level of the colony, to see whether all individuals participate in leading-following behavior. Further, based on the leading-following network we measure the importance of individuals in this leading-following behavior by calculating different centrality measures. We find that individuals can be consistently ranked regarding their influence on others. Moreover, we identify a small set of individuals that play a central role in leading other bats to roosts. Our methodology can be used to understand the leading-following behavior and the individual impact of group members on the spread of information in animal groups in general.


Introduction
Leading-following behavior is prominent in different species as a means of transferring information from informed to naïve individuals (Franks et al., 2002; Reebs, 2000; Kerth and Reckardt, 2003; Biro et al., 2006; Strandburg-Peshkin et al., 2015). Individuals who actively explore their environment gather private information about the availability or the location of a certain resource and subsequently lead naïve individuals to these resources (Franklin and Franks, 2012). By following a leader, naïve individuals gather information socially and become informed without having to spend prior search effort (Giraldeau and Caraco, 2000). When grouping at the resource is beneficial, e.g. during communal roosting, informed individuals benefit from leading naïve individuals as this increases the likelihood of conspecifics being present at the resource (Richner and Heeb, 1996).
This raises the question of how individuals assume their roles as leaders or followers. Studies in collective motion have already reported that distinct leadership roles can emerge if some individuals are more active or better informed than others (Reebs, 2000; Pettit et al., 2013) or stand to gain more from imposing their preferences (Conradt and List, 2009; Rands et al., 2003). The presence of a small fraction of informed leaders has also been shown to be sufficient in guiding the movement of large groups with great accuracy in both human and animal societies (Couzin et al., 2005; Dyer et al., 2008). Some animal studies have even suggested that, in addition to immediate costs and benefits, leadership is a personality trait independent of differences in information or knowledge of the environment (see Johnstone and Manica (2011) and references therein).
However, answering such primary questions becomes complicated when observations do not continuously track the information transfer through an animal system, but rather contain isolated individual measurements, e.g. discrete records of animal occurrences at measurement sites. In such cases, any leading-following behavior must first be reconstructed from the available data, for which one needs a sound methodology. It is one of the aims of this paper to provide this methodology, to (i) infer leading-following events from observational data and (ii) distinguish such events from local enhancement at the resource. An example of such local enhancement is swarming behavior at potential day roosts in the morning before bats collectively choose where to roost communally.
The second aim is to identify those individuals that play an important role in such leading-following behavior, by recruiting many other naïve individuals. This has as a precondition the reliable reconstruction of leading-following events from data. But it further needs an appropriate representation of the consecutive interactions between individuals and in particular a suitable measure to quantify importance, i.e. the influence on naïve individuals in leading-following behavior.
To reach this second aim, we build on the established methodology of social network analysis (Wasserman and Faust, 1994). Social network theory has transcended the human domain and has become widely accepted as an important conceptual framework for studying social interactions in animal groups (Croft et al., 2008; Wey et al., 2008; Pinter-Wollman et al., 2013). Its level of abstraction, where individuals become nodes and their interactions become links, allows us to quantitatively analyze social organisation in animal groups at all levels (individual, group, community, population, etc.) across a wide range of interaction types (recruitment, friendship, conflict, communication, etc.) (Krause et al., 2009).
As social structures in vertebrate animal systems are founded on behavioural interactions among individuals (Whitehead, 2008), social network analysis can be applied for studying social organisation in these systems as well. In this paper, we focus on Bechstein's bats (Myotis bechsteinii), a forest-living, European bat species. During summer, females form colonies that switch between many different communal day roosts in tree cavities and bat boxes (Kerth and Reckardt, 2003; Kerth et al., 2006; Fleischmann et al., 2013). In Bechstein's bats, social network theory has unveiled the presence of long-term social relationships despite the high fission-fusion dynamics of the colonies, thereby imparting novel insights on the relation between cognitive abilities and social complexity (Kerth et al., 2011).
Specifically, in this paper we analyze the leading-following behavior of these bats to potential day roosts (bat boxes). After inferring such leading-following events from observational data, we construct a social network in which individuals are represented as nodes and their leading-following events as directed and weighted links, where the weights indicate the frequency of such events. This abstraction allows us to further analyze topological characteristics of such networks.
On the level of the animal group (here, bat colony), this includes features such as connectedness, i.e. whether all individuals are part of the network. On the individual level, it allows us to calculate centralities to infer the importance of the nodes, which translates to the influence of specific bats in this leading-following behavior.
To demonstrate the applicability of our methods, we analyze data sets from two different colonies of Bechstein's bats and from five different years. This has implications for a better understanding of the collective behavior and information transfer about novel roosts in Bechstein's bats. As we point out in the concluding discussions, we see the potential for a much broader application of our methodology to the leading-following behavior in different species.
[...] Kerth and König (1999)). Such maternity colonies comprise 10-50 individuals, have a very stable individual composition, and are highly heterogeneous with respect to the age, reproductive status and the degree of relatedness among colony members (Kerth et al., 2002, 2011). Colonies switch communal roosts (tree cavities and bat boxes) almost daily and regularly split into several subgroups that use separate day roosts (Kerth and König, 1999; Kerth et al., 2011). Communal roosting provides the females and their offspring with grouping benefits, such as energetic advantages through clustering (e.g. social thermoregulation; Pretzlaff, Kerth and Dausmann, 2010; Kuepper, Melber and Kerth, 2016).
At the same time, the frequent roost switching forces the female Bechstein's bats to regularly explore new potential roosts during their nightly foraging trips and to coordinate their movements among day roosts in order to avoid permanent fission of the colony (Kerth and Reckardt, 2003; Kerth et al., 2006; Fleischmann et al., 2013). Experienced individuals, who have discovered the locations of suitable roosts through independent exploration, transfer their private knowledge to naïve conspecifics by leading them to these locations. Such leading-following events take place when one or several experienced bats arrive together with one or several naïve bats at a box at night. Information transfer about suitable roosts provides benefits to both the leading and the following bat. By leading conspecifics to potential roosts, an experienced individual increases the likelihood of communally roosting with conspecifics. At the same time, by following experienced individuals, naïve bats gather information socially without the need to spend prior search effort.

Field data collection
From 2007 to 2011, we studied two colonies (BS and GB2) of Bechstein's bats within their home ranges located in two forests near Würzburg, Germany (Figure S1, left). Since 1996, all adult female bats in both colonies have been individually marked with RFID-tags in their first year of life (Kerth and van Schaik, 2012). Each RFID-tag is programmed with a unique 10-digit ID that can be identified and recorded by automatic reading devices (Kerth and Reckardt, 2003). The study period in each year was between the beginning of May and the end of September.
In that time, the colonies' home ranges were equipped with about 20-30 experimental bat boxes per year in addition to a large number of already existing boxes (about 100; Fleischmann et al., 2013; Figure S1, right). These boxes were to serve as day roosts, similar to natural roosts in tree cavities, in which the Bechstein's bats spend the day. All experimental boxes were equipped with RFID-loggers that recorded the bats' nightly visits (Kerth and Reckardt, 2003; Fleischmann et al., 2013). In this way, every time a bat passed the entrance of an experimental box, its unique ID was read and stored by the reading device without disturbance to the individual.
At the beginning of the study period in each year, the experimental boxes were placed within the home ranges, and thus their locations were unknown to the bats until the first colony members discovered them through private information gathering. Importantly, not all experimental boxes were discovered by the colony in a given year. Moreover, not all discovered and visited experimental boxes were subsequently used as day roosts.
P. Mavrodiev, D. Fleischmann, G. Kerth, F. Schweitzer: Quantifying individual influence in leading-following behavior of Bechstein's bats. Submitted for publication.
Our datasets, thus, consist of the yearly recordings of the reading devices from all experimental boxes for each of the two colonies in each of the five years. Each recording contains a timestamp and the unique 10-digit ID of the bat that activated the reading device. An example dataset is shown in Table S1 in the supplemental material. Table 1 shows a summary of the total number of readings and the number of installed, discovered and occupied experimental roosts, for each colony throughout the years.
[...] and Reckardt (2003). An individual bat is said to be naïve at time $t_1$ regarding a given box if it has not been recorded by the reading device in that box at any time $t < t_1$. Similarly, an individual bat is considered experienced at time $t_2$ regarding a given box if it has been recorded in that box at any previous time $t < t_2$. We define a leading-following (L/F) event to a given box at time $t_3$ as the joint visit of two individuals, one naïve and one experienced at time $t_3$.
In case more than two bats arrive jointly, we form all possible L/F pairs consisting of one naïve follower and one experienced leader.
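The pairing rule above can be illustrated with a minimal sketch. It assumes that the set of bats previously recorded at the box (`seen_before`) and the list of jointly arriving bats (`visitors`) are already known; both names, and the helper functions, are hypothetical and not taken from our analysis code.

```python
from itertools import product


def classify_visitors(visitors, seen_before):
    """Split jointly arriving bats into experienced and naive for one box.

    A bat is experienced if it was recorded at this box at any earlier time,
    and naive otherwise.
    """
    experienced = [bat for bat in visitors if bat in seen_before]
    naive = [bat for bat in visitors if bat not in seen_before]
    return experienced, naive


def lf_pairs(visitors, seen_before):
    """Return all (follower, leader) pairs for one joint visit."""
    experienced, naive = classify_visitors(visitors, seen_before)
    # Every naive bat is paired with every experienced bat.
    return list(product(naive, experienced))


# Hypothetical joint visit: "A" and "B" were recorded at this box earlier,
# "C" and "D" were not.
pairs = lf_pairs(["A", "C", "B", "D"], seen_before={"A", "B"})
print(pairs)
```

A joint visit with no naïve bat, or with no experienced bat, yields no pairs under this rule.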
With this definition of L/F events, the actual inference of L/F event patterns from the data relies [...]
Constructing leading-following networks. Following the above procedure, we identified all L/F events in each of our datasets. We then constructed directed and weighted leading-following (L/F) networks, aggregated over the duration of the study period. In these networks, a node represents an individual bat and a link between two nodes indicates their involvement in a leading-following event. More specifically, links are directed. A directed link from node A to node B, denoted as A → B, means that individual A followed individual B to a given experimental box. The weight of this directed link is the number of times that A followed B (to different experimental boxes) during the study period.
We also compute the number of weakly connected components (WCC) and strongly connected components (SCC). A WCC of a network is a sub-network in which any node can be reached from any other node, either by a link between these two nodes, or by following a sequence of links through other nodes, regardless of the direction of these links. Similarly, an SCC is a WCC with the additional restriction that the direction of the links must be respected when connecting any two nodes. As we explain in the next section, these two measures are particularly important for judging the extent to which information can spread in a network.
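As an illustration, the following sketch aggregates a list of (follower, leader) events into link weights and counts the weakly and strongly connected components. It is a generic implementation (depth-first search for WCC, Kosaraju's algorithm for SCC), not the code used for our analysis, and the toy events are hypothetical.

```python
from collections import Counter, defaultdict


def build_network(events):
    """Aggregate (follower, leader) events into weighted directed links."""
    return Counter(events)  # key: (follower, leader), value: number of events


def count_wcc(nodes, links):
    """Count weakly connected components, ignoring link direction."""
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, count = set(), 0
    for start in nodes:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            for v in adj[stack.pop()]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return count


def count_scc(nodes, links):
    """Count strongly connected components with Kosaraju's algorithm."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for a, b in links:
        fwd[a].append(b)
        rev[b].append(a)
    # First pass: record nodes in order of completion of the forward search.
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(fwd[start]))]
        while stack:
            node, neighbours = stack[-1]
            for v in neighbours:
                if v not in seen:
                    seen.add(v)
                    stack.append((v, iter(fwd[v])))
                    break
            else:
                order.append(node)
                stack.pop()
    # Second pass: search the reversed network in reverse completion order.
    seen, count = set(), 0
    for start in reversed(order):
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            for v in rev[stack.pop()]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return count


# Hypothetical events: A and B follow each other once, C follows A twice.
events = [("A", "B"), ("B", "A"), ("C", "A"), ("C", "A")]
weights = build_network(events)
nodes = sorted({bat for link in weights for bat in link})
n_wcc = count_wcc(nodes, weights)
n_scc = count_scc(nodes, weights)
```

In this toy network all three bats form one WCC, but C is never followed, so it forms an SCC of its own next to the SCC containing A and B.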

Social Network Analysis
Quantifying individual influence. Social network analysis builds on the existence of a social network that can be analyzed. Such a network has been constructed in the previous step, where directed links represent leading-following events between individual bats. We can now use the topology of the network, i.e. the relation between nodes expressed by their links, to characterize the position of individuals in such a network.
Our aim is to identify those nodes, i.e. individual bats, that are most influential in leading other bats. In social network analysis, the importance, or influence, of a node in a certain dynamical process flowing through the network is referred to as centrality. There are various centrality measures in use, and each makes certain implicit assumptions about the dynamical process flowing through the network (Borgatti, 2005). Choosing a centrality measure is, thus, context-dependent (see Figure 1).
[Figure 1 caption, fragment] ... degree centrality with α = 0.5. Individual 4 has a higher centrality than her in-degree score, as we account for the indirect contribution of individual 1 (3 + 0.5 × 1 = 3.5). However, 5 is now more important than 1, because 4 contributes to 5 indirectly (1 + 0.5 × 3 = 2.5).
In-degree, eigenvector and second-degree centrality. In our case, an appropriate centrality measure must reflect the notion of individual importance in spreading information about suitable roosts. The simplest possible measure is the in-degree centrality (Figure 1a), which defines individual importance as the total number of bats that an experienced bat spread information to directly. In-degree centrality is, thus, calculated as the weighted sum of all directed links that point to a given experienced individual.
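A minimal sketch of the weighted in-degree, assuming the network is stored as a mapping from (follower, leader) links to their weights; this data layout and the toy values are illustrative assumptions.

```python
from collections import Counter


def in_degree(weights):
    """Weighted in-degree: total number of times each bat led another bat."""
    score = Counter()
    for (follower, leader), w in weights.items():
        score[leader] += w
        score[follower] += 0  # bats that never led still appear, with score 0
    return dict(score)


# Hypothetical links: C followed A twice, B followed A once, A followed B once.
toy_weights = {("C", "A"): 2, ("B", "A"): 1, ("A", "B"): 1}
toy_in_degree = in_degree(toy_weights)
print(toy_in_degree)
```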
In-degree centrality measures the total number of leadings, i.e. direct influence, without considering how the information distributed by a leader to its followers propagates further through the colony. To also account for such indirect effects, an alternative centrality measure is eigenvector centrality (Figure 1b). In a social network, a node has high eigenvector centrality if it is pointed to by nodes that themselves have high eigenvector centralities. In other words, an experienced bat leading a few bats who themselves lead a lot can be more influential than a bat leading many other bats who never lead. The computation of eigenvector centralities is presented in Section S.5 of the Supplementary Material.
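For orientation only, eigenvector centrality can be approximated by power iteration. The sketch below is a generic textbook variant and not necessarily the computation of Section S.5; the shifted update (adding the previous vector at each step) is an assumption made here to avoid oscillation, and the toy network is hypothetical.

```python
import math


def eigenvector_centrality(weights, iterations=200):
    """Approximate eigenvector centrality on follower -> leader links.

    A leader's score is proportional to the weighted sum of the scores of
    the bats that followed it.
    """
    nodes = sorted({bat for link in weights for bat in link})
    x = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        new = dict(x)  # shifted update: x <- x + A^T x
        for (follower, leader), w in weights.items():
            new[leader] += w * x[follower]
        norm = math.sqrt(sum(v * v for v in new.values()))
        x = {n: v / norm for n, v in new.items()}
    return x


# Hypothetical network: B and C each follow A once, A follows B and C once.
toy_links = {("B", "A"): 1, ("C", "A"): 1, ("A", "B"): 1, ("A", "C"): 1}
toy_centrality = eigenvector_centrality(toy_links)
```

In this toy network A receives links from both other bats and therefore obtains the highest score, while B and C are tied. Convergence of the iteration is only assured for strongly connected networks.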
The in-degree and eigenvector centralities represent two extremes, the former measuring exclusively direct influence, and the latter additionally measuring all possible indirect ways in which information can flow from one individual to all the rest. Eigenvector centrality, however, considers all chains to be of equal importance. Hence, this metric will grow with the length of the chain and individuals who are part of longer chains will tend to be quantified as more influential. This influence, however, does not reflect genuine information spreading, as it is quite likely that beyond length two, the target roost of the L/F events further down the chain changes.

To address this issue with eigenvector centrality, we define a new metric, second-degree centrality (Figure 1c), which computes centrality as the in-degree of the focal individual plus the sum of the in-degrees of its followers, weighted by a factor α (in that sense, the followers of one's followers are its second-degree followers). This reflects our observation that chains of length up to two constitute the majority in all datasets. We, thus, use second-degree centrality as the main measure for quantifying individual influence.
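The definition can be sketched as follows. The toy network is hypothetical and chosen only so that it reproduces the arithmetic quoted for Figure 1 (3 + 0.5 × 1 = 3.5 and 1 + 0.5 × 3 = 2.5); it is not the network shown in the figure. The sketch further assumes that each distinct follower contributes its weighted in-degree once, regardless of how often it followed the focal individual.

```python
def second_degree_centrality(weights, alpha=0.5):
    """In-degree of a bat plus alpha times the in-degrees of its followers."""
    in_deg, followers = {}, {}
    for (follower, leader), w in weights.items():
        in_deg[leader] = in_deg.get(leader, 0) + w
        in_deg.setdefault(follower, 0)
        followers.setdefault(leader, set()).add(follower)
        followers.setdefault(follower, set())
    return {
        bat: in_deg[bat] + alpha * sum(in_deg[f] for f in followers[bat])
        for bat in in_deg
    }


# Hypothetical links (follower, leader): bats 1, 2 and 3 follow bat 4,
# bat 6 follows bat 1, and bat 4 follows bat 5.
toy = {(1, 4): 1, (2, 4): 1, (3, 4): 1, (6, 1): 1, (4, 5): 1}
scores = second_degree_centrality(toy, alpha=0.5)
print(scores[4], scores[5], scores[1])
```

Bats 1 and 5 have the same in-degree in this example, but bat 5 ranks higher under second-degree centrality because its follower, bat 4, is itself a frequent leader.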

Chains of L/F events
Using the above definition of L/F events, we first determined the three relevant parameters for detecting L/F events in the data as (1) 5 minutes for the maximum allowed time difference, [...]
Identifying all L/F events allows us to construct the respective network in the following. Before doing so, however, we are interested in the occurrence of chains of L/F events of a certain length, through which information about a fixed roost is spread. For example, two L/F events, A → B and C → A, constitute a chain of length two (in addition to forming two separate chains of length one), provided both were to the same roost. In other words, we assume that B spread the information to A, and A, in turn, transferred it further to C. Therefore, B ought to obtain direct importance from having led A, but also an indirect contribution, for were it not for B, A would not have learned about this box and thus could not have led C to it. This assumption is not entirely correct, however, since it is possible, though unknowable, that A would have found the roost by its own exploration, or that A forgot the information obtained from B and re-visited the box before leading C. The latter issue is exacerbated with the length of the event chains we consider. Figure 2 shows the relative frequency, aggregated over all datasets, of observing chains of L/F events. This frequency can be interpreted as the probability of finding chains of a given length.
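One way to count such chains is sketched below, under the assumption that every directed path of L/F events to the same roost counts as one chain, so that A → B and C → A give two chains of length one and one chain of length two. The temporal order of events is not checked, and the event tuples and roost labels are hypothetical.

```python
from collections import Counter, defaultdict


def count_chains(events):
    """Count chains of L/F events by length, separately for each roost.

    events: iterable of (follower, leader, roost) tuples.
    Returns a Counter mapping chain length to the number of chains.
    """
    by_roost = defaultdict(lambda: defaultdict(set))
    for follower, leader, roost in events:
        by_roost[roost][follower].add(leader)
    totals = Counter()
    for leaders_of in by_roost.values():
        nodes = set(leaders_of)
        for leaders in leaders_of.values():
            nodes |= leaders
        # paths[bat] = number of chains of the current length starting at bat
        paths = {bat: 1 for bat in nodes}
        length = 0
        while True:
            length += 1
            paths = {
                bat: sum(paths[l] for l in leaders_of.get(bat, ()))
                for bat in nodes
            }
            total = sum(paths.values())
            # Stop when no longer chain exists; the length guard protects
            # against cyclic input, which should not occur for a single roost.
            if total == 0 or length > len(nodes):
                break
            totals[length] += total
    return totals


# Hypothetical events: A followed B and C followed A to box1; D followed A to box2.
toy_events = [("A", "B", "box1"), ("C", "A", "box1"), ("D", "A", "box2")]
chain_counts = count_chains(toy_events)
n_chains = sum(chain_counts.values())
relative_frequency = {k: v / n_chains for k, v in chain_counts.items()}
```

Because D followed A to a different roost, the events D → A and A → B do not form a chain in this example.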
As the inset in Figure 2 demonstrates, the probability distribution resembles an exponential distribution. The plot further indicates that chains longer than 16 did not occur in any of the datasets we have. More importantly, event chains of length up to two constitute about 80% of all lengths observed, and the probability of longer chains decreases drastically. We, thus, argue that the long L/F chains we observe in the L/F networks likely do not represent information spread about the same roost, and should therefore be discounted by any influence measure.
[...] Figure 3, we realize that individuals differ remarkably with respect to their importance, as reflected both by their in-degree centrality (size of the nodes) and their eigenvector centrality (node color). It is also evident that there is a correlation between in-degree and eigenvector centrality, as visible for the four individuals in the center. Table 2 presents salient network characteristics regarding the degree of connectedness of the L/F networks in all datasets. Network density is defined as the fraction of inferred L/F events out of the maximum possible number of L/F events for that network. For example, the L/F network for the GB2 colony in 2007 consists of 31 individuals, hence the maximum possible number of L/F events is 31 × 30 = 930, which yields a network density of 0.06.
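The density calculation can be written out as a short sketch. The number of links used in the example (56) is a hypothetical value chosen to be consistent with the reported density of 0.06; the actual count is given in Table 2.

```python
def max_possible_links(n_individuals):
    """Maximum number of directed links between distinct individuals."""
    return n_individuals * (n_individuals - 1)


def network_density(n_individuals, n_links):
    """Fraction of observed directed links out of all possible ones."""
    return n_links / max_possible_links(n_individuals)


# 31 individuals as in GB2 2007; 56 links is a hypothetical count.
example_density = network_density(31, 56)
print(max_possible_links(31), round(example_density, 2))
```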
We see that the two colonies differ in this respect through the years. While the L/F networks for the BS colony display high density and connectivity for all study years, the L/F networks for the GB2 colony in the years 2007, 2009 and 2010 have low density, consistent with the fewer L/F events observed. Therefore, to calculate the importance of each individual, we use only the cyan-coloured datasets in Table 2, as they provide the most reliable sample sizes of detected L/F events for statistical analysis.
If we focus only on these datasets, we find that their respective L/F networks are weakly connected, i.e. there is only one weakly connected component, which means that all individuals participated in L/F events. Moreover, these networks consist of only a few (1-3) strongly connected components (SCC). Within an SCC, each individual can be reached from any other individual by following (a chain of) directed links. In most of the chosen cases, the size of the largest SCC is similar to the total number of nodes, which means that the vast majority of individuals participated as both leaders and followers. Otherwise, one could reach a given individual through a directed chain, but would not be able to connect from this individual back to the network via a directed chain. Hence, such individuals would be part of a weakly connected component (WCC) because they are either followers or leaders, but they would not be part of an SCC.
[Figure 3 caption, fragment] ... leading-following between the same leader and follower, but to different roosts, are omitted to maintain the readability of the graph. Total number of unique L/F events is 262, while the total number of L/F events, including multiple leading-following between the same individuals, is 321 (Table 2).

Quantifying individual influence
The construction of the different L/F networks as described above now allows us to quantify the importance of individuals in these networks. For this, we use the three different centrality measures introduced in Section 3.1, i.e. in-degree centrality, eigenvector centrality and second-degree centrality. Figure S2 in the Supplementary Material shows the results of each of these measures separately for the colony GB2 for the year 2008. If we compare the absolute values of the centralities, we find that influence scores are heterogeneous, with a majority of individuals exerting low to mid influence and a minority having high influence. This result holds regardless of the centrality metric used to quantify influence. We note that the visualization of the L/F network in Figure 3 already uses the information of centrality values.
We can use the absolute values to determine the relative importance, by ranking individuals according to their second-degree centrality. The results are shown in Figure 4, where the diagonal indicates increasing rank, i.e. decreasing importance. In order to determine whether these results are robust if instead of second-degree centrality the other two measures are used for the ranking, we have provided the respective ranks in the same plot. As Figure 4 shows, the three proposed centrality measures produce a highly consistent ranking of individual influence.
[Figure 4 caption, fragment; legend: second-degree centrality, in-degree centrality, eigenvector centrality] ... to the second-degree centrality (square symbols) in increasing rank order (rank 1 = highest centrality). For each bat we additionally plot its rank when importance is quantified as in-degree (circle symbols) and eigenvector centrality (cross symbols). Overlap of the three symbols indicates that the given individual has the same rank, regardless of the centrality measure used. For the individual centrality values see Figure S2.
To verify this finding, we have extended the above analysis to all datasets indicated in Table 2. For each dataset, we have then calculated the Pearson correlation between the rankings obtained from the three centrality measures. The results are given in Table S10 in the Supplementary Material. We find that for all datasets the Pearson correlation is very high for all combinations. That means that ranking individuals according to any of the measures leads to a consistent rank of influence.
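The comparison of rankings can be sketched as follows. The treatment of ties (tied individuals share their average rank) is an assumption made for this illustration, and the centrality scores are hypothetical toy values rather than results from our datasets.

```python
import math


def ranks(scores):
    """Rank 1 = highest score; tied individuals share their average rank."""
    ordered = sorted(scores, key=lambda bat: -scores[bat])
    result, i = {}, 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[ordered[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def rank_correlation(scores_a, scores_b):
    """Pearson correlation between the rankings implied by two score sets."""
    rank_a, rank_b = ranks(scores_a), ranks(scores_b)
    bats = sorted(rank_a)
    return pearson([rank_a[b] for b in bats], [rank_b[b] for b in bats])


# Hypothetical centrality scores for six bats.
toy_in_degree = {1: 1, 2: 0, 3: 0, 4: 3, 5: 1, 6: 0}
toy_second_degree = {1: 1.0, 2: 0.0, 3: 0.0, 4: 3.5, 5: 2.5, 6: 0.0}
r = rank_correlation(toy_in_degree, toy_second_degree)
```

Computing the Pearson correlation on ranks in this way is equivalent to Spearman's rank correlation.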

Discussion
This paper provides a general methodology for inferring interaction networks from proximal data.
We use rich longitudinal data sets of joint visits of Bechstein's bats in potential day roosts. While proximal networks do not always correlate well with interaction networks (Castles et al., 2014), it has been argued (Farine, 2015) that proximity is a good proxy for interactions in fission-fusion societies such as Bechstein's bats.

Below we summarize our approach as it differs from common techniques (Farine and Whitehead, 2015) to study animal association patterns via social networks. Typically, when social networks are used, the observed interaction strength between two individuals is either thresholded, sampled or used as a link weight, to calculate various association indexes (Franks et al., 2010; Croft et al., 2008; Hoppitt and Farine, 2018). In line with Farine and Whitehead (2015), we do not threshold our networks to avoid dubious statistical biases. Instead, we include all of the observed individuals and their recorded activity and analyze the full scale of inferred interactions.
Moreover, we also go beyond calculating association indexes and the corresponding Mantel tests.
Association indexes are local measures in that they only reflect dyadic relations between any two individuals. To quantify the systemic influence of individuals, we need to provide measures that also capture their proclivity to act as social hubs, as recognized already by Farine and Whitehead (2015) and Brent (2015). Therefore, we have proposed a novel centrality measure, second-degree centrality.
Our methodology builds on raw data that contains only the recordings of single bats entering a given roost site at a particular time. Such data per se does not contain any information about importance, or influence. We focus on a specific type of influence, namely that an experienced individual leads an inexperienced, i.e. naïve, individual to a particular roost. Therefore, the first challenge is to identify leading-following (L/F) events from this data and to construct a social network from all these L/F events, and the second challenge is to appropriately quantify the importance of individuals in this leading-following network.
Regarding the first challenge, we note that most field experiments, including ours, are limited by the state-of-the-art passive RFID-tagging, which only records presence data. There is a more advanced technique (Ripperger et al., 2019) that uses a proximity sensor system to continuously track the leading-following behaviour between female bats and their juveniles to suitable roosts. However, such technology is still in its nascent stage and not widely used in field experiments, as with this battery-powered system small bat species cannot be tagged at present and it is not possible to follow many individuals over an extended period of time.
Our methodological contribution can also be adopted for other species where leading-following behavior plays a role and only recordings of individual positions are available. This includes, for example, automatic RFID-tag recordings at feeding stations (Farine and Whitehead, 2015) and other resources where different group members meet, such as burrows in rodents (König et al., 2015). As we demonstrate, such recordings can be systematically analyzed by comparing (statistically) the distributions of L/F time differences, to infer genuine L/F events.
A major contribution of our analysis is a thorough investigation of the parameters that allow us to distinguish an L/F event from other types of encounters (e.g. local enhancement) at a given box.
We recall that there is no ground truth available that tells us about the correct identification of L/F events from the data. We argued that the time differences of L/F events can be used [...]
Regarding the second challenge, we have proposed a new measure of individual influence that can be derived from these L/F events. Obviously, there is no natural distinction between only leaders and only followers in the observed bat colonies. Instead, almost all individuals are both leaders and followers, but at different times and, importantly, to a different degree. For comparison, in African elephants, a single matriarch leads a group and the group members profit from following her as she has long-term experience (McComb et al., 2001). In primates, the individual influence of group members can depend on the context, and may range from a single dominant individual who influences where a group moves to, to a more widely distributed influence on travel destinations among group members (King et al., 2008; Stueckle and Zinner, 2008). To quantify individual influence, we have constructed a social network in which nodes represent individual bats, directed links indicate a leading-following event and the weight of the links considers the frequency of such events. Analyzing the topology of these social networks already allows us to draw several conclusions about the information sharing in the respective colonies.
First, note that we focus our analysis on dense networks (see Table 2). We found that these networks have only one weakly connected component (WCC) which contains most of the individuals. Density is a proxy for the intensity of leading-following behaviour, while the presence of one large WCC indicates that the majority of the colony partook in leading-following. Moreover, we also found that in most cases there are only very few (1-3) strongly connected components (SCC) of different size in the network (see Table 2). Hence, we can conclude that individuals in the same SCC participated both as leaders and as followers in different events. This tells us that information about suitable roosts is not concentrated in only a few important individuals, but is spread across the whole colony.
At the same time, we could also detect that not all individuals play an equal role as leaders or followers. Instead, their influence, measured by leading inexperienced bats, differs considerably.
To quantify these differences, we used different centrality measures as proxies of importance.
Two of these, in-degree and eigenvector centrality, are established measures, while the third one, second-degree centrality is a new measure introduced by us. As explained in Section 3.2, it cures certain shortcomings of the other two centrality measures if applied to L/F networks.
When considering aggregated measures, such as rankings, second-degree centrality is correlated to in-degree and eigenvector centrality, because it is derived from them. However, second-degree centrality differs on the individual level, as it more accurately reflects the genuine information spreading observed in the data.
Computing the different centralities for each individual, we could identify that there are only a few important individuals that lead most of the other bats. These individuals stand out regardless of the centrality measure used. In particular, we also calculated that there are significant correlations between the rankings obtained by using the different centrality measures. We emphasize that measuring influence by means of centralities cannot be simply reduced to comparing numbers of leading events. The latter would not allow us to distinguish whether individuals always lead the same or diverse followers, or whether such followers are of less or equal importance in comparison to the leader.
We believe that our results can guide future empirical and theoretical studies in two ways.
First of all, we should realize that the constructed L/F networks do not already tell us about the mechanisms by which pairs of leaders and followers are formed. This process, known as recruitment, can be revealed by testing different recruitment rules in computer simulations, to check whether they result in the importance scores obtained from the empirical networks. In essence, this entails the development of various null models. Null models are recognized as useful tools to test the viability of these recruitment rules in the presence of inherently non-independent behavioral data (Farine, 2017). We investigate a variety of such null models about recruitment behavior in Bechstein's bats in a subsequent paper (Mavrodiev et al., 2019).
Secondly, additional field work needs to be devoted to studying the behavioural variability of individuals in playing their role as leaders or followers. For example, demographic, health or genetic characteristics can influence such roles (Fischhoff et al., 2007; Keiser et al., 2016; McComb et al., 2011). With our study, however, we have already identified those individual bats that are prominent in these roles. This allows future experiments to be targeted particularly toward individuals with very high or very low influence, to find out how different characteristics impact their leading-following behavior.

Electronic Supplementary Information
S.1 Illustration of the raw recordings in our datasets

S.2 Inferring L/F events
Recall that an L/F event is defined as the joint visit of an experienced and a naïve individual at a given box. Furthermore, we associate with each L/F event the experimental box in which it was detected, and the times at which the leader and the follower were recorded by the reading device in the box. Note that it is not necessary for the leader to enter the box before the follower; often it is the latter who is registered first. In case the leader and the follower were recorded multiple times, we take those times that minimize the difference between their appearances in the dataset (see Table S2 and associated explanation). Finally, we refer to the time_difference of an L/F event as the absolute difference between the recording times of the leader and the follower.
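The selection of recording times can be sketched as follows, assuming that the readings of each individual at a box are available as lists of timestamps. The function name `closest_recordings` and the use of seconds since midnight are illustrative assumptions, not part of the original analysis.

```python
from itertools import product

def closest_recordings(leader_times, follower_times):
    """Return (leader_time, follower_time, time_difference) for the pair of
    recordings with the smallest absolute time difference."""
    t_leader, t_follower = min(
        product(leader_times, follower_times),
        key=lambda pair: abs(pair[0] - pair[1]),
    )
    return t_leader, t_follower, abs(t_leader - t_follower)

# Hypothetical readings in seconds since midnight: the leader is recorded
# once at 01:00:00, the follower at 01:00:20 and again at 01:01:00.
pair = closest_recordings([3600], [3620, 3660])

# The order of arrival does not matter: here the follower is registered first.
pair_reversed = closest_recordings([3650], [3620, 3660])
```

Because the absolute difference is used, the same function covers the case in which the follower is registered before the leader.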
The actual inference of L/F events from the definition above depends on three parameters. The first parameter is the maximum time difference allowed between consecutive recordings of a leader and a follower, regardless of order. We refer to it as lf_delay. The lf_delay is important in determining which patterns constitute a joint visit of two individuals, as bats do not enter a box immediately upon arriving: females returning at night to a day roost usually encircle it several times before entering (Kerth and Reckardt, 2003; Schöner et al., 2010). Therefore, lf_delay limits the sheer number of L/F events we detect, since the higher the limit, the more likely it is to find an experienced and a naïve individual recorded within lf_delay of each other. In the limit of lf_delay → ∞, we would detect the maximum number of L/F events, many of which would be false positives, as bats recorded days apart would still be assumed to have jointly arrived at a box.
The second parameter represents the minimum time a follower in an L/F event needs to potentially become a leader, i.e. the time needed to find, recruit, and lead other followers. We denote it as turnaround_time. The importance of this parameter becomes apparent in Table S2, which shows a frequently occurring recording pattern. Assume that, for this box, individual 00065db1f6 is experienced at time 01:00:00 (line 1), individual 00068e1ac4 is naïve at 01:00:20 (line 2), and individual 00065ded81 is naïve at 01:01:10 (line 4). Taking lf_delay=3 minutes (a good rule of thumb; Kerth and Reckardt, 2003), we can deduce that individual 00068e1ac4 followed individual 00065db1f6 to that box, i.e. 00068e1ac4→00065db1f6. More precisely, we infer an L/F event to this box with the leader recorded at 01:00:00 and the follower at 01:00:20. The time difference of this event is 20 seconds.
Let us further assume that 00068e1ac4 liked the box she was just led to, and in turn would like to show it to other individuals. Her second recording in this dataset is on the third line, 40 seconds after her first appearance as a follower. If we assume that turnaround_time < 40 seconds, then we also have to assume that 00068e1ac4 would have had enough time to fly within her home range, meet other individuals, recruit them and ultimately lead them back to this box. In this example, she would have led individual 00065ded81, who appeared within lf_delay of her, i.e. we then also have to infer the L/F event 00065ded81→00068e1ac4. In addition, however, we see that 00065db1f6 and 00065ded81 appear within lf_delay of each other, hence we must also form the L/F pair 00065ded81→00065db1f6. Evidently this last L/F event contradicts 00065ded81→00068e1ac4. Hence, turnaround_time < 40 seconds is a wrong assumption.
The issue is that, in reality, the 40-second delay between the two readings of 00068e1ac4 is most likely not due to her having led another individual to the box. Instead, it is highly likely that either (i) the first reading showed the bat entering the box and then leaving it again shortly thereafter, or (ii) the bat was simply encircling the box for 40 seconds, and then triggered the reading device a second time upon re-entry. The proper distinction between actual recruitment and such behavioural variability is the role of the parameter turnaround_time. In the toy example from Table S2, a more realistic interpretation is that 00065db1f6 led both 00068e1ac4 and 00065ded81, i.e. we would only infer two L/F events. Note that since 00068e1ac4 appears twice, we associate the time of her first recording (01:00:20) with the L/F event 00068e1ac4→00065db1f6, since it minimizes the time difference to the recording of the leader.
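The effect of turnaround_time on the toy pattern can be sketched in code. The function below is a simplified illustration rather than the procedure used for the analysis: the name `candidate_leaders`, the data layout, the rule that a follower counts as a potential leader only from turnaround_time after her first matched reading, and the placeholder date are all assumptions. The four readings reproduce the times quoted from Table S2.

```python
from datetime import datetime, timedelta

def candidate_leaders(recordings, experienced, lf_delay, turnaround_time):
    """Return {follower: {leader: smallest time difference}} for one box.

    recordings: list of (bat_id, datetime), sorted by time.
    experienced: ids that already knew the box before the first reading.
    """
    # Time from which a bat's readings count as those of a potential leader.
    leads_from = {bat: datetime.min for bat in experienced}
    result = {}
    for follower, t_f in recordings:
        if follower in leads_from:
            continue  # experienced, or already matched as a follower
        found = {}
        for leader, t_l in recordings:
            if leader in leads_from and t_l >= leads_from[leader]:
                diff = abs(t_f - t_l)  # order of arrival does not matter
                if diff <= lf_delay:
                    found[leader] = min(diff, found.get(leader, diff))
        if found:
            result[follower] = found
            # The follower may lead others only after turnaround_time.
            leads_from[follower] = t_f + turnaround_time
    return result

def at(hour, minute, second):
    # The calendar date is an arbitrary placeholder.
    return datetime(2000, 1, 1, hour, minute, second)

table_s2 = [
    ("00065db1f6", at(1, 0, 0)),
    ("00068e1ac4", at(1, 0, 20)),
    ("00068e1ac4", at(1, 1, 0)),
    ("00065ded81", at(1, 1, 10)),
]
experienced = {"00065db1f6"}
lf_delay = timedelta(minutes=3)

# turnaround_time below 40 seconds: 00065ded81 gets two competing leaders.
short = candidate_leaders(table_s2, experienced, lf_delay, timedelta(seconds=30))
# turnaround_time of 3 minutes: 00065db1f6 is the only leader of both followers.
long_ = candidate_leaders(table_s2, experienced, lf_delay, timedelta(minutes=3))
```

With the short turnaround_time, 00065ded81 is matched to both 00065db1f6 and 00068e1ac4, which is the contradiction described above. With the longer value, only the two events led by 00065db1f6 remain.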
The third parameter is the hour in the morning, on the day of a box occupation, after which subsequent recordings from this box are ignored. The necessity to ignore some recordings comes from the need to distinguish between genuine information exchange about suitable roosts (in terms of leading-following) and pre-occupation behaviour. Before the occupation of a given box, experienced individuals who have decided to roost there fly around the box and emit echolocation calls that attract naïve individuals to the same box (O'Shea and Vaughan, 1977; Schöner et al., 2010). It has been suggested that this broadcasted information is used by naïve bats (especially juveniles) to learn the location of suitable roosts from experienced conspecifics. The result is that occupation is preceded by a growing group of individuals (experienced and naïve) flying around, or swarming, the roost for several hours. In our data, this is reflected by readings of naïve individuals, which appear shortly after each other in a long sequence, together with the readings of experienced bats. As a result, additional L/F events will be identified with time differences close to the allowable limit of lf_delay (see Section S.4 for illustration). These L/F events do not constitute genuine recruitment, in the sense that naïve individuals were led to a roost, but rather reflect the swarming phenomenon (local enhancement).
Therefore, we define the parameter occupation_deadline as the temporal deadline on the day of a box occupation, after which subsequent readings in this box are attributed to swarming, and thus ignored.
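A minimal sketch of this filter is given below. It assumes that the readings of a box are stored as (bat_id, timestamp) pairs and that the day of occupation is known; the function name, the identifiers and the dates are hypothetical.

```python
from datetime import datetime, date, time

def drop_after_deadline(recordings, occupation_day, occupation_deadline):
    """Keep only the readings made up to the deadline on the day of occupation."""
    cutoff = datetime.combine(occupation_day, occupation_deadline)
    return [(bat, t) for bat, t in recordings if t <= cutoff]

# Hypothetical readings for a box occupied on 2 June.
readings = [
    ("bat_a", datetime(2008, 6, 1, 23, 30)),  # night before the occupation
    ("bat_b", datetime(2008, 6, 2, 2, 45)),   # before a 3am deadline
    ("bat_c", datetime(2008, 6, 2, 4, 10)),   # after the deadline: attributed to swarming
]
kept = drop_after_deadline(readings, date(2008, 6, 2), time(3, 0))
```

Only the first two readings are passed on to the inference of L/F events; the third is discarded as likely swarming.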

S.3 Selecting parameter values
As illustrated in Section S.2, each of the three parameters affects the inference of L/F events differently. Therefore, it is important to choose proper values that allow us to identify an adequate number of genuine leading-following events for statistical analysis. Empirical research in the field of information transfer in Bechstein's bats has suggested 3 minutes for lf_delay and 3am for occupation_deadline as a reasonable rule of thumb (Kerth and Reckardt, 2003). We build upon these heuristics by comparing the distributions of time differences of all L/F events, fixing lf_delay and varying the other two within a reasonable range (see Figure S1).
To generate sufficient sample sizes for the comparison, the dataset we chose to analyze was the GB2 colony in 2008 (Table 1 in [...]), which has the highest number of discovered and occupied boxes, the second largest colony size, and a large amount of individual readings. Therefore, we expected to identify the largest number of L/F events from this dataset. Note that any combination of the three parameters is a 3-tuple, which generates a set of L/F time differences from all identified L/F events in the dataset. An example is presented in Figure S1, where we show histograms of the L/F time differences for lf_delay=turnaround_time=3 minutes, and occupation_deadline=2am (left) and occupation_deadline=3am (right). Figure S1 also illustrates why we focus on the distributions of L/F time differences to select the values of the three parameters. As there is no objective method† to quantify the behaviour underlying each of the parameters, we argue that L/F time differences best capture the effect that varying the parameters has on the L/F events we identify.
For example, a visual inspection of Figure S1 suggests that increasing occupation_deadline from 2am to 3am does not change the time difference distributions. This implies that swarming has not yet set in (otherwise, we would expect more events with longer time differences), and the additional L/F events on the right-hand side are genuine. Consequently, we would prefer occupation_deadline=3am, as it increases our sample size. Table S3 formalizes this argument.

† Objective, as in best reflection of reality. Indeed, one cannot ask a bat how much time she needs for recruitment or how far away she travels from a follower.

[...] for the two-sided and one-sided test, respectively.

As an example, fixing turnaround_time = 2 minutes, we see that the distribution of L/F time differences for occupation_deadline at 2am is not statistically different from the distribution with occupation_deadline at 3am (p-value = 0.602). This is an indication that the nature of the identified L/F events is invariant to the later deadline, hence it is unlikely that we have inadvertently included swarming effects. Further inspection of the table reveals that qualitative changes in L/F time differences occur when occupation_deadline=8am, but not for the other pair-wise comparisons. The one-sided test indicates the type of these changes, namely that L/F events inferred up to 8am on the day of occupation tend to have larger time differences compared to earlier occupation deadlines. This is in line with the reasoning in Appendix S.4 and implies the presence of swarming effects. Therefore, occupation_deadline=8am is likely too late.
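The logic of such a pairwise comparison can be illustrated with a short sketch. The statistical test behind Table S3 is not named in the surviving text, so the code below substitutes a simple permutation test on the difference of means; this is a stand-in chosen for illustration, not the test used for the table. The function name and the two samples of time differences (in seconds) are invented.

```python
import random

def permutation_test(sample_a, sample_b, n_perm=10000, seed=0):
    """Permutation test on the difference of means, mean(b) - mean(a).

    Returns (p_two_sided, p_one_sided); the one-sided alternative is that
    sample_b tends to have larger values than sample_a.
    """
    rng = random.Random(seed)
    n_a = len(sample_a)
    pooled = list(sample_a) + list(sample_b)

    def mean_diff(values):
        a, b = values[:n_a], values[n_a:]
        return sum(b) / len(b) - sum(a) / len(a)

    observed = mean_diff(pooled)
    two_sided = one_sided = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of events to the two groups
        d = mean_diff(pooled)
        two_sided += abs(d) >= abs(observed)
        one_sided += d >= observed
    # The add-one correction keeps both p-values strictly positive.
    return (two_sided + 1) / (n_perm + 1), (one_sided + 1) / (n_perm + 1)

# Invented L/F time differences (seconds) for an early and a late deadline.
early_deadline = [10, 20, 25, 30, 40, 45, 60, 70, 80, 95]
late_deadline = [100, 120, 130, 140, 150, 155, 160, 165, 170, 175]
p_two_sided, p_one_sided = permutation_test(early_deadline, late_deadline)
```

A small two-sided value signals that the two sets of time differences differ, and a small one-sided value signals the direction of the change, here larger time differences under the later deadline.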
Moreover, this conclusion holds when varying turnaround_time, as well. The impact of this parameter on the L/F time differences seems to be small, in the range considered, except for values smaller than 5 minutes when comparing occupation_deadline = 5am vs. occupation_deadline = 8am. In these cases, too many events with small time differences are identified, which conceals the swarming events. The effect of turnaround_time is primarily on the number of identified L/F events, as assuming larger recruitment delays excludes events where the leader found a follower relatively quickly (Table S5).
Based on these arguments, for a fixed lf_delay=3 minutes, we would choose turnaround_time=3 minutes and occupation_deadline=5am on the day of occupation.
In Table S4 we apply the same comparison procedure, but this time we fix lf_delay=5 minutes. Again, occupation_deadline=8am produced consistently larger time differences that are not present when comparing all other occupation_deadline pairs. Additionally, the effect of turnaround_time is again small. Considering that a higher lf_delay further increases our sample of identified L/F events (Table S5), we fix lf_delay=5 minutes.

[...] The mean time difference of the events is 2.4 minutes, and the minimum is strictly above 1 minute. We argue that this characteristic is more consistent with swarming behaviour, in which a few experienced individuals attract naïve conspecifics by circling around the roost and emitting echolocation calls. Since experienced and naïve individuals are not grouped together as in genuine leading-following pairs, it takes time for a naïve individual to respond to the calls and fly to the roost. As a result, most L/F events identified in this way tend to have larger time differences closer to the allowable limit of lf_delay. We use precisely this observation when fine-tuning the occupation_deadline parameter.
As a comparison, consider the recording pattern in Table S9 from another box close to 2am on the day of its occupation. The L/F events corresponding to this pattern are shown in Table S8. The mean time difference is 1.5 minutes and the minimum is zero, as individual arrivals exceeded the time resolution of the reading device. This indicates that an experienced individual did appear close together with a designated follower. As for the couple of events with large time differences, they are more likely due to naïve individuals remaining at the entrance of the box, thereby triggering the reading device repeatedly, than to swarming. As seen from [...]

An alternative representation of this network is through its so-called adjacency matrix, $A$, which indicates which two nodes are adjacent to each other. The elements $a_{i,j}$ ($i$ and $j$ index rows and columns, respectively) in this matrix are 1 if a directed link exists between nodes $j$ and $i$. In other words, $a_{i,j} = 1$ if $j$ followed $i$. Otherwise, $a_{i,j} = 0$. For example, the first column $a_{i,1}$ gives all nodes that node 1 follows. We see that $a_{2,1} = a_{3,1} = 1$, so 1 has followed both 2 and 3.
The main idea behind eigenvector centrality is that the centrality of a node $i$, $c_i$, is proportional to the sum of the centralities of all nodes that follow it. Staying with node 1, it is followed only by node 2 (the first row of $A$ has the single non-zero entry $a_{1,2}$), so its centrality is $c_1 = \frac{1}{\lambda} c_2$, or $\lambda \cdot c_1 = c_2$, for some proportionality constant $\lambda$. In this way, we can express the centralities of all nodes and write them as a system of equations:

$$
\begin{aligned}
\lambda \cdot c_1 &= 0 \cdot c_1 + 1 \cdot c_2 + 0 \cdot c_3 + 0 \cdot c_4 \\
\lambda \cdot c_2 &= 1 \cdot c_1 + 0 \cdot c_2 + 0 \cdot c_3 + 0 \cdot c_4 \\
\lambda \cdot c_3 &= 1 \cdot c_1 + 1 \cdot c_2 + 0 \cdot c_3 + 1 \cdot c_4 \\
\lambda \cdot c_4 &= 0 \cdot c_1 + 0 \cdot c_2 + 0 \cdot c_3 + 0 \cdot c_4
\end{aligned}
$$

In matrix form the above system can be rewritten as:

$$
\lambda
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix}
=
\begin{pmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
1 & 1 & 0 & 1 \\
0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix}
$$

or in vector notation: $\lambda \cdot c = A \cdot c$. This is the familiar eigenvector problem. We need to find a vector $c$ such that upon applying matrix $A$ to it, the result is a scaled version of $c$ with a scaling factor $\lambda$. The unknown vector $c$ is called an eigenvector of the matrix $A$, and $\lambda$ is referred to as the eigenvalue that corresponds to that eigenvector. Solving the system of equations yields $c = \{0.408, 0.408, 0.816, 0\}$ (normalized to unit length) and $\lambda = 1$. Therefore node 3 is most central, since it is followed by all other nodes. Nodes 1 and 2 follow each other, so they boost their own centrality, and node 4 is not followed by anyone, so its centrality is 0.

[Figure: second-degree centrality per individual; the axis tick labels are the individual IDs.]

Table S10: Correlations between the rankings produced by the three different centrality measures.
S stands for ranking using second-degree centrality, E stands for ranking using eigenvector centrality, and D stands for ranking using in-degree centrality. The first column lists the different datasets as described in Table 2.
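The eigenvector calculation for the four-node example above can be checked numerically. The sketch below assumes the toy network in which nodes 1 and 2 follow each other and nodes 1, 2 and 4 follow node 3, which is the network consistent with the reported solution. The function name and the shifted power iteration are illustrative choices, not the software used for the analysis.

```python
import math

def eigenvector_centrality(adj, max_iter=1000, tol=1e-12):
    """Leading eigenvector (unit length) and eigenvalue of a non-negative
    adjacency matrix, computed by power iteration on A + I."""
    n = len(adj)
    c = [1.0 / math.sqrt(n)] * n
    for _ in range(max_iter):
        # Multiplying by (A + I) instead of A avoids oscillation when A has
        # eigenvalues of equal magnitude and opposite sign.
        nxt = [c[i] + sum(adj[i][j] * c[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        nxt = [v / norm for v in nxt]
        converged = max(abs(a - b) for a, b in zip(nxt, c)) < tol
        c = nxt
        if converged:
            break
    a_c = [sum(adj[i][j] * c[j] for j in range(n)) for i in range(n)]
    lam = sum(x * y for x, y in zip(a_c, c))  # Rayleigh quotient, |c| = 1
    return c, lam

# Assumed (follower, leader) pairs of the toy network.
events = [(1, 2), (2, 1), (1, 3), (2, 3), (4, 3)]
n = 4
A = [[0] * n for _ in range(n)]
for follower, leader in events:
    A[leader - 1][follower - 1] = 1  # a_{i,j} = 1 if j followed i

c, lam = eigenvector_centrality(A)
print([round(v, 3) for v in c], round(lam, 3))
# [0.408, 0.408, 0.816, 0.0] 1.0
```

The iteration is applied to $A + I$ because this toy matrix has the eigenvalues 1 and -1, for which plain power iteration can oscillate; the shift leaves the eigenvectors unchanged.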