Learning to play the piano with the Supernumerary Robotic 3rd Thumb

We wanted to study the ability of our brains and bodies to be augmented by supernumerary robot limbs, here extra fingers. We developed a mechanically highly functional supernumerary robotic 3rd thumb actuator, the SR3T, and interfaced it with human users enabling them to play the piano with 11 fingers. We devised a set of measurement protocols and behavioural “biomarkers”, the Human Augmentation Motor Coordination Assessment (HAMCA), which allowed us a priori to predict how well each individual human user could, after training, play the piano with a two-thumbs-hand. To evaluate augmented music playing ability we devised a simple musical score, as well as metrics for assessing the accuracy of playing the score. We evaluated the SR3T (supernumerary robotic 3rd thumb) on 12 human subjects including 6 naïve and 6 experienced piano players. We demonstrated that humans can learn to play the piano with a 6-fingered hand within one hour of training. For each subject we could predict individually, based solely on their HAMCA performance before training, how well they were able to perform with the extra robotic thumb, after training (training end-point performance). Our work demonstrates the feasibility of robotic human augmentation with supernumerary robotic limbs within short time scales. We show how linking the neuroscience of motor learning with dexterous robotics and human-robot interfacing can be used to inform a priori how far individual motor impaired patients or healthy manual workers could benefit from robotic augmentation solutions.


Introduction
From ancient myths, such as the many-armed goddess Shiva to modern comic book characters, augmentation with supernumerary (i.e. extra) limbs has captured our common imagination. In real life, Human Augmentation is emerging as the result of the confluence of robotics and neurotechnology. We are mechatronically able to augment the human body; from the first myoelectric prosthetic hand developed in the 1940s (Reiter, 1948) to the mechanical design, control and feedback interfaces of modern bionic prosthetic hands (e.g. Zollo et al., 2007;Farina et al., 2014;Bensmaia, 2015;Godfrey et al., 2018). Robots have been used to augment the bodies of disabled humans, restoring some of their original capabilities (e.g. Aszmann et al., 2016;Hussain et al., 2016;Benabid et al., 2019;Shafti et al., 2019b). Similar setups can augment healthy users beyond their capabilities, e.g. augmenting workers in industrial settings through intelligent collaboration (e.g. Haddadin and Croft, 2016;Maurtua et al., 2016;Shafti et al., 2019a), or equipping them with additional arms to perform several tasks concurrently (e.g. Dziemian et al., 2016;Parietti and Asada, 2016). The latter fits within a particular area of human augmentation robotics which is referred to as supernumerary robotics. These are robotic systems, typically worn by the user, to extend their body and its physical capabilities. However, a major question is, to what extent do the human brain and body have the capability of adapting and learning to use such technologies efficiently (Makin et al., 2017). The supernumerary and augmentative nature of this area of research presents an interesting challenge on how to map human motor commands to robot control. Supernumerary robotic limbs (e.g. Llorens-Bonilla et al., 2012) are envisioned to assist human factory workers, and adapted for different types of applications (e.g. Parietti and Asada, 2016). 
The introduction of supernumerary robotic fingers (SRF) developed as a grasp support device (Wu and Asada, 2014) led to further exploration of optimal materials and mechanical designs for supernumerary robotics (e.g. Hussain et al., 2016, 2019; Tiziani et al., 2017). SRF have been particularly envisioned for, and successful in, grasp restoration for stroke patients (e.g. Wu and Asada, 2015; Hussain et al., 2016, 2017). However, regardless of the mechanism, material, and use case, given the presence of a human within these devices' control loops, the control interface is of essential importance. It was recently shown that for polydactyly subjects who possess six fingers on their hands, the control interface involves a cortical representation of the supernumerary finger (Mehring et al., 2019). Unlike in polydactyly, supernumerary robotic limbs and fingers must utilize indirect control interfaces to achieve the same goal: enabling more complex movements and better task performance. Abdi et al. investigated the feasibility of controlling a supernumerary robotic hand with the foot (Abdi et al., 2015). Others have focused on electromyography (EMG) as the control interface, both in SRF (e.g. Hussain et al., 2016) and in supernumerary robotic limbs (e.g. Parietti and Asada, 2017). Other interfaces used for supernumerary robotics include inertial measurement units (e.g. Wu and Asada, 2015), voice (e.g. Liang et al., 2019), pushbuttons (e.g. Hussain et al., 2019), and graphical user interfaces (e.g. Al-Sada et al., 2019). Researchers have also explored indirect control interfaces, e.g. using the concept of grasp synergies (Santello et al., 2016) to assume that the SRF posture will be highly correlated with that of natural fingers during manipulation, allowing SRF control through natural movement of existing fingers (Wu and Asada, 2016). Importantly, all these studies focus on the interface and not the user.
While extensive research has been conducted on the mechanical design, interface, and control of supernumerary robotics, there is a gap in understanding the role of human motor control in the success and adoption of these robotic human augmentation systems. In the rapid development of human augmentation, little attention is devoted to how humans interact with the technology and learn to control it (Makin et al., 2017). Learning to control a supernumerary robotic limb or finger is a complex process which involves learning to utilize one movement (a set of muscle activations) to perform a new movement. The field of motor neuroscience has extensively studied the control mechanisms and learning processes of perturbed movements, where we utilize arm movement in one direction to move a cursor on the screen in a different direction, accounting for a rotation perturbation (e.g. Krakauer et al., 2000; Taylor et al., 2014; Haar et al., 2015; Bromberg et al., 2019) or a mirror reversal perturbation (e.g. Telgen et al., 2014; Wilterson and Taylor, 2019; Yang et al., 2020). In these settings, one can predict subjects' learning from the task-relevant variability in their unperturbed movements (e.g. Van Der Vliet et al., 2018). Nevertheless, these studies were done on simplistic lab-based tasks, and only recently has the field started to address the complexities of real-world movement and to ask to what extent those lab-based findings generalize to real-world motor control and learning (e.g. Maselli et al., 2017; Haar et al., 2019). While the relationship between task-relevant variability and learning performance seems to generalize to real-world tasks, defining task relevance is less trivial (e.g. Haar et al., 2019) and the learning mechanism can differ between users (e.g. Haar and Faisal, 2020).
In the case of augmentation technology, the relevant features can be either those related to performing the task itself without the augmented device, or features related to the control interface of the augmented device.
In human performance research, such as sports science and rehabilitation, there are significant efforts to predict future performance. In sports science there are attempts to predict athletes' future success for talent identification purposes. Motor coordination and motor learning are often used as predictors (e.g. Lopes et al., 2011; Vandorpe et al., 2012; Di Cagno et al., 2014; Johnston et al., 2018). Similar approaches are used in rehabilitation to predict skill learning capacity following traumatic brain injury, stroke, or neurodegenerative disease (e.g. Wadden et al., 2017; Olivier et al., 2019). Here we look into the predictability of future performance with augmentation technology, using a set of motor coordination tests that we developed, the Human Augmentation Motor Coordination Assessment (HAMCA). We specifically ask which aspect of motor coordination is a better predictor of performance with the device: performance in task-related tests or performance with the control interface.
To address this, we have created an experimental setup to study how different parameters within the remit of human motor control contribute to successful control, coordination and usage of a human augmentative robotic system. We have created a 2 degrees of freedom (DoF) robotic finger, worn on the side of the hand, to augment the human finger count to 11, effectively giving the user a 3rd thumb. We call this the supernumerary robotic 3rd thumb (SR3T; device previously described in Cunningham et al., 2018), and we study its usage in a skilled human task: playing the piano. The piano is a setting which involves the use of all fingers of the hand, and hence a good environment for testing the augmentation of fingers. Furthermore, piano playing is structured in both spatial and temporal dimensions, allowing for quantification of the performance in both aspects.
[Figure caption fragment: Piano key sequences to be played with individual fingers to capture hand and finger positional acuity. Performance is assessed by timing and key press errors. Movement between 3 keys (spaced 1 octave apart) labelled "Left", "Middle" and "Right".]

Results
We developed a mechanically powerful supernumerary robotic 3rd thumb, SR3T, and means for interfacing it with human users. We then devised protocols and behavioural markers for motor coordination assessment (HAMCA), and developed a music playing task as well as measures for assessing the quality of playing. Finally, we evaluated the SR3T on human subjects, and predicted how well subjects would be able to perform in playing the piano with an augmented additional finger, based on the basic motor assay from the HAMCA. Twelve right-handed participants (6 experienced pianists and 6 naïve players) attended 2 experimental sessions held on separate days in the lab. In the first session they performed the HAMCA set of 8 tasks to assess their hand and foot dexterity. This set was developed to enable a priori prediction of how well each individual human user can learn to use an augmented device. From the HAMCA we extracted 8 scores as measures of hand and foot motor-coordination (see Methods). In the second session the subjects learned to play a piece on the piano and then repeated it with our human augmentation device, the supernumerary robotic 3rd thumb (SR3T), operated through foot motions as the interface (Figure 1).
The hand and foot motor-coordination scores from the HAMCA, recorded during the first experimental session, showed moderate differences between the pianists and the naïve players (Figure 3A). Remarkably, there were no significant group differences in any of the piano tasks (Piano Position, Piano Timing, and Piano Loudness). The only measure on which the pianists performed significantly better than the naïve players was Hand Dimensionality (p = 0.049), which is based on the assembly of toys (see Methods). On the other hand, in the Foot Up-Down Spatial measure the naïve players showed higher scores than the pianists (p = 0.012). In both groups the inter-subject variabilities were relatively evenly distributed, except for one pianist who was an outlier showing poor performance on the Foot Balance, Foot Tracking, and Piano Loudness tasks.

The correlation matrix between the subjects' motor-coordination scores (Figure 3B) suggests relatively weak dependencies, i.e. a subject who showed high dexterity in one task did not necessarily show high dexterity in any other task. There were no significant dependencies within the foot measures, and the only dependency within the hand measures was between the Piano Position and Piano Timing scores (r = 0.64, p = 0.026). There were a few intriguing correlations between foot and hand measures. First, the foot and hand timing scores were highly correlated (Foot Up-Down Temporal and Piano Timing: r = 0.68, p = 0.015). Second, the Foot Up-Down Temporal score also correlated with the Piano Position score (r = 0.66, p = 0.020). This was expected considering the correlation between the Piano Position and Piano Timing scores. These three tasks are metronome based, thus measuring rhythmic coordination. Lastly, we found a significant correlation between Foot Balance and Piano Loudness scores (r = 0.61, p = 0.035), but this was driven by the pianist who was an outlier in both tasks.
In the second experimental session, all subjects performed 10 trials of the Piano Playing task, using their left-hand index finger (LHIF) to play notes further to the right of their right hand. This was followed by an additional 10 trials of Piano Playing with the SR3T. In both tasks (playing with the LHIF and with the SR3T) subjects showed improvement over the first 5 trials, after which they plateaued (Figure 4). Thus, for all subsequent analyses we averaged over trials 5 to 10 to obtain a single piano playing score for each of these tasks. Here as well, there were no significant differences between the pianists and the naïve players in any individual trial played with their LHIF (t-test p > 0.12) or with the SR3T (t-test p > 0.32). Testing over all plateau trials (5-10), pianists were significantly better in playing with their LHIF (t-test p = 0.017), but there were no group differences in playing with the SR3T (t-test p = 0.9). Therefore, we merged the two groups and all further analysis was done on all 12 subjects together.
The Piano Playing with SR3T score is our metric for performance with the human augmentation device, and the fundamental question is to what extent it can be explained by motor-coordination measures. The correlations between the subjects' scores in the Piano Playing tasks and the motor-coordination measures suggest different dependencies for playing with and without the SR3T (Figure 5 & Supp Fig 5). The scores in the Piano Playing task with the LHIF, which required no foot interface, were significantly correlated with Foot Tracking and Foot Up-Down Temporal scores (r = 0.66, p = 0.019 and r = 0.64, p = 0.026, respectively). The scores in Piano Playing with SR3T were significantly correlated only with the Piano Loudness scores (r = 0.59, p = 0.044).

Figure 4.
(A) The note sequence played for the piano playing task. Notes exclusively played with the right hand, and those played with the SR3T or LHIF, are marked. (B) Visualisation of how each individual note is scored linearly based on delay from the beat. Incorrect notes and skipped notes are assigned a score of 0; the full sequence score is the average of all individual scores. (C) Accuracy over trials with the SR3T (orange) and without, using the LHIF (blue).
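The per-note scoring described for the piano playing task (a score that falls linearly with the delay from the beat, zero for incorrect or skipped notes, and a sequence score equal to the mean of the note scores) can be sketched as follows. This is a minimal illustration rather than the scoring code used in the study: the function names and the cut-off `max_delay_s`, the delay at which the linear score reaches zero, are assumptions, since the text does not state that value.

```python
from typing import Optional, Sequence, Tuple


def note_score(delay_s: Optional[float], correct: bool,
               max_delay_s: float = 0.5) -> float:
    """Score one note in [0, 1].

    delay_s is the delay from the beat in seconds, or None if the note
    was skipped. max_delay_s is a hypothetical cut-off (not given in the
    text) at which the linearly decreasing score reaches zero.
    """
    if delay_s is None or not correct:
        return 0.0  # skipped or incorrect notes score zero
    return max(0.0, 1.0 - abs(delay_s) / max_delay_s)


def sequence_score(notes: Sequence[Tuple[Optional[float], bool]]) -> float:
    """Mean of the per-note scores; notes holds (delay_s, correct) pairs."""
    if not notes:
        raise ValueError("sequence must contain at least one note")
    return sum(note_score(d, c) for d, c in notes) / len(notes)


# Illustrative trial: on the beat, late, skipped, and far too late.
trial = [(0.0, True), (0.25, True), (None, False), (1.0, True)]
print(sequence_score(trial))
```

Averaging such sequence scores over the plateau trials would then give a single accuracy value per subject and condition.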
The pianist who was an outlier in a few motor-coordination measures was also an outlier in the Piano Playing with the SR3T score (but not in Piano Playing without it) and is driving the correlation with Piano Loudness. Thus, we further investigated the correlations between Piano Playing with SR3T and the motor-coordination scores using Spearman rank correlation (Figure 5). Foot Up-Down Temporal was the only measure which showed a significant Spearman correlation with the Piano Playing with SR3T score (r(Spearman) = 0.67, p = 0.02). The Piano Playing scores without and with the SR3T were highly correlated even with the outlier (r = 0.63, p = 0.028), and even better correlated under Spearman rank correlation (r(Spearman) = 0.71, p = 0.012).
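To illustrate why a rank correlation is the more robust choice when a single subject is an outlier on both measures, the following sketch compares Pearson and Spearman coefficients on a small synthetic data set. The values are invented for illustration and are not data from the study; the helper names are likewise hypothetical.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Synthetic scores: five similar subjects plus one who is low on both measures.
score_a = [0.80, 0.82, 0.78, 0.81, 0.79, 0.20]
score_b = [0.61, 0.58, 0.60, 0.57, 0.62, 0.10]
print("Pearson :", round(pearson(score_a, score_b), 3))
print("Spearman:", round(spearman(score_a, score_b), 3))
```

In this toy example the single low-scoring subject produces a Pearson coefficient close to 1, while the Spearman coefficient stays near zero, because ranks limit the leverage of any one extreme value.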

Figure 5.
Correlations between accuracies. The first eight panels show correlations between accuracies in piano playing with the SR3T and in the HAMCA tasks. The ninth panel shows correlations between accuracies in piano playing with and without the SR3T. Naïve subjects marked as grey and experienced subjects as black dots.
Since none of the motor-coordination scores explained the Piano Playing with SR3T score well, we asked whether a combination of motor-coordination scores can explain it. We specifically asked which combination could better explain it: that of the hand measures, which includes piano tasks as well as the only score where the pianists were significantly better than the naïve players (Hand Dimensionality); or that of the foot measures, the foot being the control interface of the SR3T. Generalized linear models (GLMs) were fitted to the hand (Figure 6A) and foot (Figure 6B) measures to explain the Piano Playing with SR3T score. To account for the impact of the outlier subject, we removed them and fitted the GLMs to the remaining 11 subjects. While both models could explain most of the variance in the SR3T Piano Playing score, the GLM based on foot measures showed a much better fit than the GLM based on the hand measures (r = 0.79 and r = 0.55, respectively) and was the only significant fit (p = 0.0035 and p = 0.081, respectively). Moreover, while in the foot model all scores had positive contributions of similar magnitudes, the hand model was dominated by Piano Position, while Piano Timing had a negative contribution and Hand Dimensionality and Piano Loudness had none. We then fitted GLMs in which the LHIF Piano Playing score (without SR3T) was added to the hand and foot models (Figure 6C&D). While adding this score improved both models (r = 0.87 and r = 0.92, respectively, p < 0.005), the foot measures still contributed more than the hand measures, and the Foot+LHIF model thus performed better.
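The model comparison described above can be illustrated with a minimal sketch. The text does not state the link function or error distribution of the GLMs, so the sketch assumes the simplest case, an identity link with Gaussian errors, which reduces to ordinary least squares. The function names and all numbers below are hypothetical, synthetic values chosen for illustration; they are not the study's data or the authors' analysis code.

```python
import math


def fit_linear_model(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    # Augmented normal-equation system [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(k):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]  # [intercept, coefficients...]


def predict(beta, X):
    """Fitted values for each row of predictor scores."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], row)) for row in X]


def pearson_r(a, b):
    """Pearson correlation between fitted and observed scores."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    return sab / math.sqrt(sum((u - ma) ** 2 for u in a)
                           * sum((v - mb) ** 2 for v in b))


# Synthetic example: 11 subjects x 4 interface-related scores (invented values).
foot_scores = [
    [0.62, 0.71, 0.55, 0.80], [0.48, 0.66, 0.61, 0.74], [0.75, 0.59, 0.70, 0.83],
    [0.53, 0.80, 0.47, 0.69], [0.81, 0.73, 0.66, 0.90], [0.44, 0.52, 0.58, 0.71],
    [0.69, 0.77, 0.52, 0.86], [0.57, 0.61, 0.73, 0.78], [0.72, 0.68, 0.49, 0.81],
    [0.50, 0.57, 0.64, 0.66], [0.66, 0.83, 0.60, 0.88],
]
sr3t_scores = [0.41, 0.35, 0.47, 0.36, 0.55, 0.28, 0.46, 0.42, 0.43, 0.30, 0.52]

beta = fit_linear_model(foot_scores, sr3t_scores)
r = pearson_r(predict(beta, foot_scores), sr3t_scores)
print("coefficients:", [round(b, 3) for b in beta[1:]])
print("fit r:", round(r, 3))
```

With only 11 subjects and 4 predictors, an in-sample r of this kind will tend to be optimistic, so a cross-validated estimate would be a natural complement to such a fit.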
Lastly, we fitted all models to all 12 subjects, including the outlier subject who drives the correlations (Supp Fig 6). While both the hand and foot models could now significantly explain most of the variance in the SR3T Piano Playing score (p < 0.01), the GLM based on foot measures showed a better fit than the GLM based on the hand measures (r = 0.92 and r = 0.71, respectively). Here, even the GLM with the LHIF Piano Playing score added to the hand model fell short relative to the foot-only model (r = 0.78).

Discussion
In this study we addressed a gap in our understanding of human augmentation technology which is how human-in-the-loop interaction with an augmentative device is learned and performed by the human brain. We focused on how different parameters within the remit of human motor control contribute to successful control of an SRF. For that purpose, we developed the HAMCA set and created a supernumerary robotic 3rd thumb (SR3T), controlled by an IMU placed on the subjects' foot, to perform a piano playing task. Our findings suggest that it is not your expertise in the task you perform with the SRF (i.e. piano playing expertise), nor your task-space coordination (i.e. hand dexterity), but your interface-space coordination (i.e. foot dexterity) that can predict your level of task performance with the augmented device.
While half of our subjects were experienced pianists and the other half naïve, there were not many significant differences between the two groups. The only motor-coordination score in the HAMCA in which the experienced pianists performed significantly better than the naïve players was Hand Dimensionality (HD, see Figure 3A). The notion that, as a skill evolves into expertise, one learns to use more degrees of freedom in the movement has been known since the pioneering work of Nikolai Bernstein. Bernstein found that professional blacksmiths use high variability in their joint angles across repetitive trials to achieve low variability in their hammer end-point trajectory (Bernstein, 1967). Pianists need to move their hands into postures with independent control of digits, which are not common in daily life. Thus, they should be able to control more degrees of freedom in their hand movement. We would have expected to see significant differences on the piano tasks as well as on timing-based tasks with the foot, given the pianists' expertise. This is, however, not the case in our performance results (see Figure 3A).
Looking at correlations between the different motor coordination scores (Figure 3B), we see high and significant correlations between Piano Position (PP), Piano Timing (PT) and Foot Up-Down Temporal (FUDT). These are the only three tasks relying on a timing-based measure, with a metronome-controlled beat. Therefore, they measure rhythmic coordination, and presumably rely on a common timing mechanism (Ivry and Hazeltine, 1995) and a common coordination-dependent timing network (Jantzen et al., 2005). The strong correlations between these measures suggest that rhythmic coordination is a personal trait that is expressed similarly in the fingers (for PP and PT) and the foot (FUDT).
In the piano playing task, all subjects (naïve players and experienced pianists alike) initially showed improvement in accuracy from trial to trial (i.e. learning). This was a short learning process which plateaued quickly after 5 trials. For all subjects the plateau with the SR3T was significantly lower than with their LHIF. This fast learning within a session and low plateau (which leaves much room for improvement in future sessions) are hallmarks of early motor skill learning. This is in line with ample evidence of multiple time scales in skill learning, where fast improvement in performance occurs in the initial training and plateaus within a session, and slow improvement develops across sessions (e.g. Karni et al., 1998; Costa et al., 2004; Dayan and Cohen, 2011; Papale and Hooks, 2018). Accordingly, learning to play the piano augmented with the SR3T seems to be a novel motor skill learning task. Further support can be found in the group differences while playing the piano with and without the SR3T. When subjects played with their own LHIF, across all plateau trials pianists performed better than naïve players, as expected based on their piano experience. Surprisingly, however, there were no significant group differences on a trial-by-trial basis, neither during learning nor during the plateau. When subjects played with the SR3T there were also no significant group differences on a trial-by-trial basis, and across all plateau trials pianists again did not perform better than naïve players. This suggests that playing augmented with the SR3T is not a trivial extension of the regular piano playing task with one's own finger, but a novel skill that the subjects need to learn.
The correlations of the SR3T piano playing score with all motor coordination measures (Figure 5) suggest no one-to-one mapping. While most measures showed some positive correlation trend, hand dimensionality, the best metric for piano playing expertise, showed no correlation, and even a slightly negative trend. This is in line with the lack of difference between experienced pianists and naïve players in performance with the SR3T. The two measures that showed the strongest correlations with the SR3T score were Piano Loudness and Foot Up-Down Temporal. Yet the Piano Loudness correlation was driven by the outlier, and Foot Up-Down Temporal showed a significant rank correlation but no Pearson or robust correlation. Overall, no motor coordination measure can predict the SR3T piano playing score by itself. The only measure that showed a high correlation with the SR3T score was the LHIF piano playing score. Thus, while piano playing experience showed no significant contribution to performance in the SR3T task, performance in the same task without the SR3T is a good predictor of performance with the SR3T. Interestingly, looking at correlations between the coordination measures and the LHIF piano playing score, we see Foot Tracking showing a significant and robust correlation with the LHIF score, rather than any of the hand-related scores.
Next we asked if a combination of motor coordination measures from HAMCA can predict performance with the SR3T, and if so, which set of measures would be a better predictor -that of the hand coordination measures, directly linked to playing the piano; or that of the foot coordination measures, which are linked to the control mechanism of the SR3T. Our results suggest that the set of foot coordination measures is a good predictor of performance with the SR3T (Figure 6). The regression coefficients of the four measures are within the same range, suggesting a balanced contribution of these different measures. The model based on hand coordination measures does not perform as well in prediction. Furthermore, the contributions of the measures are unbalanced relative to how foot measures contributed to the foot-based model. Hand dimensionality which was the best metric to distinguish pianists from naïve subjects shows a minimal contribution to the model. Piano position and timing measures are showing reverse contributions, even though they are highly and significantly correlated (see Figure 3).
Our results suggest that the human motor coordination skill in using the control interface of the robotic augmentation device (in our case, the foot) is the best predictor of how well the augmented human performs with the robotic system. Interestingly, skills otherwise relating to the actual task do not serve as good predictors, i.e. in the case of piano playing, the hand-related motor coordination measures from HAMCA are not good predictors of how well the human will perform, even though the piano task relies heavily on those skills. Previous work on interfaces for supernumerary robotics has shown that the foot can generally serve as a good interface for robotic limbs working collaboratively with the user's hands (e.g. Carrozza et al., 2007; Abdi et al., 2015; Dougherty and Winck, 2019). Abdi et al. (2015) study the control of a third robotic hand via the foot in virtual reality, for robotic surgery applications, showing learning trends similar to those we observe here. Results obtained with adaptive foot interfaces for robot control (e.g. Huang et al., 2020), where data-driven approaches are used to create subject-specific motion mapping, are in line with our findings. Huang et al. (2020) report that inter-subject variability decreases once a subject-specific motion mapping is enabled. This confirms the dependency of robot control performance on metrics inherent to each subject, which we show here to be motor coordination skills within the limb used on the control interface.
Our work shows the possibility of humans being able to quickly acquire a skilled behaviour, such as piano playing, with a human augmentative robotic system. Both naïve piano players (i.e. without prior experience) and piano playing experts demonstrated the same ability to integrate the supernumerary robotic limb: We saw no difference in the performance with the SR3T, suggesting that integrating robotic augmentation is primarily driven by a priori motor coordination skills and not affected significantly by expert motor domain knowledge. It is important to consider the meaning of these results in the context of prosthetics, and human augmentation in general. Prosthetics replace a limb that was lost whereas with the SR3T, and with supernumerary robotics in general, the human is operating a new, additional limb -in our case a thumb. Our augmentation is done through substitution, i.e. we use an existing limb to operate an additional one. We show here that this makes the system reliant on human motor skills, specifically in controlling the interface-space (i.e. foot dexterity). Thus, people with higher motor skills will be neurobiologically better suited to use augmentation robots.
We also demonstrate the possibility of applying substitution across different levels of the biomechanical hierarchy. The foot, which in terms of the biomechanical hierarchy is equivalent to the entire hand, is used here as the interface-space controlling a thumb, which is further down the biomechanical hierarchy. These results sit at one end of the spectrum of solutions for controlling an augmentative device, which goes from substitution all the way to direct augmentation via higher level control, either brain-machine interfacing or cognitive interfaces such as eye gaze decoding. We previously showed that the end-point of visual attention (where one looks) can control the spatial end-point of a robotic actuator with centimetre-level precision (Tostado et al., 2016; Maimon-Mor et al., 2017; Shafti et al., 2019b). This direct control modality is more effective from a user perspective than voice or neuromuscular signals as a natural control interface (Noronha et al., 2017). We showed that such direct augmentation can be used to control a supernumerary robotic arm to draw or paint, freeing up the two natural arms to do other activities such as eating and drinking at the same time (Dziemian et al., 2016). But such direct augmentation has to date not achieved the augmentation of fine motor skills such as playing the piano, as playing this instrument requires more than the execution of a note: it is not a simple button-press exercise, but requires fine-grained expression of temporal and spatial motor coordination across robotic and natural fingers. Thus, using the operational definition of Makin et al. (2017) for embodiment of robotic augmentation as the ability to use extra limbs in natural tasks, we can predict the degree to which subjects can integrate supernumerary limbs into their natural body movements, as a function of their basic motor skills.
Thus, our work shows that we can achieve but also predict the capability of individuals to embody supernumerary robotic limbs in real-world tasks, which has impact for robotic augmentation from healthcare to agriculture and industrial assembly e.g. in the aerospace industry.

Methods
Subjects. Twelve right-handed participants took part in our experiments (mean age 23.3+/-2.8 years). Six of the participants had played the piano for several years (pianists group) and the other six did not have any piano playing experience (naïve group). All of the participants from the pianists group had at least 5 years of piano training (range 5-21 years, mean 10.6+/-5.4 years). Two participants of the naïve group had over 5 years' experience of guitar playing. All participants gave informed consent prior to participating in the study and all experimental procedures were approved by Imperial College Research Ethics Committee and performed in accordance with the declaration of Helsinki.
Setup. We created an experimental setup to investigate how individual motor skills contribute to the performance of a human user of a supernumerary robotic thumb; i.e. a robotic augmented human. To this end, we have created a 2 degrees of freedom (DoF) robotic finger that users can wear on the side of their hand, effectively augmenting them with a third thumb. The design, creation and testing of the supernumerary robotic third thumb (SR3T) was reported in (Cunningham et al., 2018). The design specifications for the SR3T were derived from the design requirements of a fully spherical operating thumb (Konnaris et al., 2016a) and the natural eigenmotions of human thumbs in daily life activities (Konnaris et al., 2016b). The SR3T is attached to the user's right hand and is controlled through the user's right foot. The motion of the foot was measured using an accelerometer in our previous implementation (Cunningham et al., 2018), but here we used a 9DoF inertial measurement unit (IMU -Bosch BNO055, breakout board by Adafruit) for increased stability. The unit can provide absolute orientation measurements (with respect to the earth's magnetic field) thanks to an onboard sensor fusion algorithm. Absolute orientation can then be extracted as Euler vectors, at 100Hz. The SR3T's two DoFs correspond to horizontal and vertical movements of the robotic fingertip. These are mapped to horizontal and vertical movements of the foot, i.e. yaw and pitch, respectively. Once the subject is wearing the SR3T on their hand, and the IMU on their foot, they are asked to sit with their foot on the ground facing the piano. The SR3T is moved horizontally for the fingertip to face the forward position as well. The values read by the IMU for the orientation of the foot, and by the motor encoders for the position of the SR3T are recorded. 
The subject is then asked to rotate their foot clockwise, with the heel as the centre of rotation, to their maximum comfortable reach (typically 45 degrees from the forward-facing pose). The SR3T fingertip is also moved accordingly, to the maximum horizontal position on the right, and values recorded as before. These are used to map the horizontal motion of the foot to that of the SR3T, with a similar process for vertical motions. The setup can be seen in Figure 1.
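The two-pose calibration described above can be illustrated with a minimal sketch. It assumes that the mapping between foot angle and SR3T position is linear between the two recorded poses and that commands are clamped to the calibrated range; neither detail is stated in the text. The function name and the numerical calibration values are hypothetical.

```python
def make_linear_map(foot_start, thumb_start, foot_end, thumb_end):
    """Build a foot-angle -> thumb-position map from two calibration poses."""
    if foot_end == foot_start:
        raise ValueError("calibration poses must differ")
    lo, hi = sorted((thumb_start, thumb_end))

    def foot_to_thumb(foot_angle):
        # Fraction of the calibrated foot range covered so far.
        fraction = (foot_angle - foot_start) / (foot_end - foot_start)
        target = thumb_start + fraction * (thumb_end - thumb_start)
        # Clamp (an assumption) so commands stay inside the calibrated range.
        return min(max(target, lo), hi)

    return foot_to_thumb


# Hypothetical calibration values: yaw 0..45 degrees, encoder 512..812 ticks.
yaw_to_thumb = make_linear_map(0.0, 512.0, 45.0, 812.0)
print(yaw_to_thumb(22.5))  # halfway through the calibrated foot range
```

The same construction would be repeated with the pitch calibration poses for the vertical DoF.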
For the piano playing tasks and piano related hand dexterity tasks we used a digital piano (Roland RP501R-CB, Roland Corp., Osaka, Japan). The piano was connected to a PC with a MATLAB script establishing communication through its MIDI interface. Each keystroke on the piano was received by the MATLAB script as a MIDI message which comprised data regarding the note played, time of keystroke (with a 1ms resolution) and the keystroke velocity, which leads to proportional loudness of the note played.
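As an illustration of the information carried by each keystroke, the sketch below decodes a raw MIDI note-on message into the note, velocity and timestamp. It is written in Python rather than MATLAB and is not the script used in the study; the `Keystroke` type and the `decode_note_on` function are hypothetical names, and the millisecond timestamp is assumed to be supplied by the receiving program.

```python
from typing import NamedTuple, Optional


class Keystroke(NamedTuple):
    note: int      # MIDI note number, 0-127
    velocity: int  # keystroke velocity, 1-127
    time_ms: int   # arrival time in milliseconds (assumed supplied by caller)


def decode_note_on(status: int, data1: int, data2: int,
                   time_ms: int) -> Optional[Keystroke]:
    """Return a Keystroke for a note-on message, or None otherwise."""
    # Note-on messages have the upper status nibble 0x9 (any channel).
    is_note_on = (status & 0xF0) == 0x90
    # By MIDI convention, a note-on with velocity 0 acts as a note-off.
    if not is_note_on or data2 == 0:
        return None
    return Keystroke(note=data1 & 0x7F, velocity=data2 & 0x7F, time_ms=time_ms)
```

For example, `decode_note_on(0x90, 60, 100, 1234)` describes middle C pressed with velocity 100 at 1234 ms.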
Experimental design. We developed a set of measurement protocols and behavioural biomarkers, the Human Augmentation Motor Coordination Assessment (HAMCA), and ran this set of tests to assess hand and foot dexterity (the foot being the control interface) and piano-related skills. The HAMCA set includes both spatial and temporal evaluations. From these dexterity tasks we extracted 8 hand and foot motor-coordination scores. Finally, the participants were given a specific melody to play on the piano with and without the SR3T. The melody was designed to require 6 fingers, forcing the participant to use either their left-hand index finger (LHIF) or, if they were wearing it, the SR3T. Table 1 summarises the experimental procedures and how they map to the results.

Foot balance task
A Wii Balance Board (Nintendo Co. Ltd., Kyoto, Japan) together with the BrainBloX software (Cooper et al., 2014) was used. The board (Figure 2A) is made of four pressure plates and the software interface displays the real-time centre of pressure computed by the Wii Balance Board across all four plates, relative to the board's coordinate system. Weight plates (70 N) were placed on the left side of the board, moving the centre of pressure away from the system origin. Subjects then had to move the centre of pressure back towards the origin by applying pressure on the right side of the board with their right foot. The plates were placed in three positions (Figure 2A), with five trials per position, resulting in a total of 15 trials, performed in random order. Before the beginning of each trial, participants were asked to place the centre of pressure as close as possible to the origin. Once they stated their readiness, and after a 3-second countdown, a 15-second recording was started. Samples were recorded at 85 Hz. The resulting motor-coordination score is computed according to equation (1):
$$\text{score} = 1 - \frac{\text{error}}{\text{maximum error}} \qquad (1)$$
where error corresponds to the mean Euclidean distance of the centre of pressure from the origin of the coordinate system across all recorded samples. The maximum error corresponds to the error computed if the subject was not acting on the platform.
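A minimal sketch of the foot balance score follows, assuming that equation (1) takes the form score = 1 − error / maximum error, as the surrounding definitions suggest. The function names and sample values are illustrative only.

```python
import math


def mean_distance_from_origin(samples):
    """Mean Euclidean distance of centre-of-pressure samples from (0, 0)."""
    return sum(math.hypot(x, y) for x, y in samples) / len(samples)


def coordination_score(error, max_error):
    """Assumed form of equation (1): 1 for no error, 0 at the maximum error."""
    return 1.0 - error / max_error


# Hypothetical centre-of-pressure samples (board coordinates, arbitrary units).
samples = [(3.0, 4.0), (0.0, 0.0)]
error = mean_distance_from_origin(samples)
print(coordination_score(error, max_error=10.0))
```

Here `max_error` stands in for the error that would be recorded with the weights in place and no corrective action by the subject.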

Foot up-down task
The same setup as in the Foot Balance Task was used, without the weights. A steady beat was played with a metronome, which the subjects had to match when moving their foot from a toe-lifted (dorsiflexion) to a heel-lifted (plantarflexion) pose and vice versa (see Figure 2C). The pressure exerted by the foot had to match an upper and a lower target value marked on the screen. Ideally, the output should resemble a square signal with a period equivalent to that of the beat on the metronome. Subjects performed 15 trials in random order, five at each of the selected tempi: 40bpm, 60bpm and 80bpm.
Two types of motor-coordination scores are computed from this task, spatial and temporal, both using equation (1). For the spatial measure, the error is calculated as the absolute distance between the target pressure position and the measured position of the centre of pressure. The maximum error corresponds to the total distance between the upper and lower pressure targets. The temporal measure's error is based on how precisely in time the change between target positions occurs. This is measured at the time of zero-crossing, relative to the beats of the metronome. The maximum error is the time corresponding to one full period. Both the spatial and temporal absolute errors had skewed distributions; therefore, the median of the error was used instead of the mean.

Foot tracking task
The subjects controlled the 2D position of a dot on a screen through rotations of their ankle, captured with an inertial measurement unit (IMU) attached to their shoe (see Figure 2B); this is the same setup used as the control interface of the SR3T. The subjects were directed to use ankle rotations only, the result of which they could see as a blue dot on a screen. They had to make the blue dot follow the position of a red dot moving along a figure-of-eight path, as shown in Figure 2B. The path and the red and blue dots were shown to subjects on a monitor in front of them. Each trial was composed of 6 laps around the figure-of-eight path, lasting 35 seconds in total. The subjects sat at a height that allowed their foot to move freely in space (see Figure 2B). The motor-coordination score for this task follows equation (1), with the error defined as the Euclidean distance between the blue and red dots. The maximum error is taken as the maximum recorded error across all time points in all trials of all subjects. Once again, due to the skewness in the absolute error distribution, the median of the error was used in the accuracy calculation.
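The median-based variant of the score used for the tracking task can be sketched as follows. The sketch assumes that the score has the form 1 − median error / maximum error and that the cursor and target trajectories are sampled at the same time points; the function name and sample data are illustrative.

```python
import math
from statistics import median


def tracking_score(cursor, target, max_error):
    """1 - median(Euclidean cursor-target distance) / max_error."""
    errors = [math.dist(c, t) for c, t in zip(cursor, target)]
    return 1.0 - median(errors) / max_error


# Hypothetical blue-dot (cursor) and red-dot (target) positions.
cursor = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
target = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
print(tracking_score(cursor, target, max_error=4.0))
```

Using the median rather than the mean keeps a few large excursions from dominating the score when the error distribution is skewed.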

Hand dimensionality
Subjects performed two toy assembly tasks while wearing a Cyberglove II (CyberGlove Systems LLC, San Jose, CA) to capture the motion of their hand and fingers, with 22 degrees of freedom. The tasks involved assembling a LEGO DUPLO toy train (LEGO 10874), and assembling a toy car (Take Apart, F1 Racing Car Kit) using a toy drill and screws (see Figure 2, D and E). To ensure the appropriate fit of Cyberglove II we made sure all participants had a minimum hand length of 18 cm, measured from the wrist to the tip of the middle finger.
Principal Component Analysis (PCA) was performed on the collected data. We take a greater number of principal components needed to explain the variance of the motion as an indication of greater hand dexterity (Belic and Faisal, 2015). The resulting motor-coordination score is defined as the number of principal components required to explain 99% of the recorded motion's variance, normalised by the number of degrees of freedom recorded (22).
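The hand dimensionality score can be sketched as below. The sketch assumes that the per-component variances (the eigenvalues of the joint-angle covariance matrix) have already been obtained from any PCA routine; the function name and the example eigenvalues are hypothetical.

```python
def dimensionality_score(eigenvalues, n_dof=22, threshold=0.99):
    """Number of components needed to reach `threshold` of the variance,
    divided by the number of recorded degrees of freedom."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k / n_dof
    # Reached only through rounding; all components are then needed.
    return len(ordered) / n_dof


# Hypothetical variance spectrum: three components explain 99.5% of variance.
print(dimensionality_score([80.0, 15.0, 4.5, 0.5]))
```

With this definition a score of 1 would mean that all 22 recorded degrees of freedom are needed to explain the motion.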

Piano timing
The subjects used their right-hand index finger to press the same piano key at varying tempi (40bpm, 60bpm, 80bpm, 100bpm and 120bpm) played by a metronome. In total, subjects performed 25 trials in random order (5 at each tempo), each composed of 10 keystrokes.
The relevant motor-coordination score follows the same concept as equation (1); for clarity, we present it in more detail in equation (2). The normalised error is the absolute time deviation from the correct tempo divided by its period; that is, the absolute difference between the time between keystrokes (inter-onset interval, IOI) and the period of each tempo in seconds, divided by that period:
$$\text{score} = 1 - \frac{\left|\text{IOI} - 60/\text{tempo}\right|}{60/\text{tempo}} \qquad (2)$$
where tempo is the beats per minute value, hence making 60/tempo the beat period in seconds. Nine samples were generated in each trial (given that nine IOIs are generated by ten keystrokes); hence, there were 45 samples generated at each tempo, which had a skewed distribution. The median of these values was taken as the score at each tempo, and the five tempi's scores were then averaged to obtain a single motor-coordination score for the task.
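The aggregation described above (IOI scores pooled per tempo, median per tempo, mean across tempi) can be sketched as follows, assuming equation (2) has the form score = 1 − |IOI − 60/tempo| / (60/tempo). Function names and onset times are illustrative.

```python
from statistics import mean, median


def ioi_scores(onsets_s, tempo_bpm):
    """Score each inter-onset interval of one trial against the beat period."""
    period = 60.0 / tempo_bpm
    iois = [b - a for a, b in zip(onsets_s, onsets_s[1:])]
    return [1.0 - abs(ioi - period) / period for ioi in iois]


def timing_score(trials_by_tempo):
    """Median of pooled IOI scores per tempo, then mean across tempi."""
    per_tempo = []
    for tempo, trials in trials_by_tempo.items():
        pooled = [s for onsets in trials for s in ioi_scores(onsets, tempo)]
        per_tempo.append(median(pooled))
    return mean(per_tempo)


# Hypothetical keystroke onset times in seconds, keyed by tempo in bpm.
trials = {60: [[0.0, 1.0, 2.0, 3.5]], 120: [[0.0, 0.5, 1.0]]}
print(timing_score(trials))
```

In the study each tempo would contribute 5 trials of 10 keystrokes, giving 45 pooled IOI scores per tempo.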

Piano positioning
The right-hand index finger was used to move back and forth between two keys and press them at a rate given by the metronome (fixed tempo of 60bpm). Three piano keys were selected, one positioned in the middle of the piano and the other two spaced 7 whole notes to the left and right of it. Three combinations of two keys were used: left and middle, middle and right, left and right (see Figure 2F). In total, subjects performed 15 trials in random order (5 at each key combination), each composed of 12 keystrokes. The relevant motor-coordination score is defined in the same way as in the piano timing task. Timings are measured between two consecutive, correct key presses; timings relating to incorrect key presses were discarded. This is done automatically, as the note values of incorrect key presses differ from those expected. To account for cases where participants pressed the incorrect key or missed a beat, we consider a window the size of the beat period (1 second) centred on the correct beat time. If a note is played outside this window, we assume that the first keystroke of the interval is a wrong one. As the incorrect notes have already been removed, a time before the window means that the same key was pressed twice consecutively, and a time after it means that a keystroke was missed. Most of the subjects had no misses or 1 miss out of 165 samples.

Piano loudness
On the digital piano, the loudness of a note depends on the velocity with which the relevant key is pressed: a fast press produces a louder sound and a slow press a quieter one. Subjects were instructed to press a single key at a target level of loudness, with both the target level and the level at which they pressed shown to them visually. Before starting the experiment, participants were instructed on how to set their minimum (0%) and maximum (100%) keystroke loudness values. The piano's recorded loudness values range from 0 to 127. A very slow key press corresponds to values around 2-8, whereas fast presses fall within values of 120-127. Participants first familiarised themselves with the visual interface by watching the experimenter perform one block of trials. They were then given up to 5 unrecorded trials to familiarise themselves with how the key presses relate to numerical values, and for the experimenter to ensure that they covered the full range of values in their key presses. Each participant was then asked to define their own range by pressing the key at 0% and 100%. These values were recorded and used to define their range for the experiment. The loudness values for levels 25%, 50% and 75% were obtained by linear interpolation between the 0% and 100% values defined for each participant.
In total, subjects performed 25 trials of 10 keystrokes each, 5 at each loudness level (randomised): 0%, 25%, 50%, 75% and 100%. The motor-coordination score is calculated following equation (1), with the error defined as the deviation from the target values (in percentage loudness) and the maximum error as the maximum error committed across all trials of all participants at each loudness level (Level 0: 34.0833, Level 25: 42.6250, Level 50: 34.6667, Level 75: 28.4583 and Level 100: 24.4167). After analysing the results, the average motor-coordination score was calculated using only the results at the 25%, 50% and 75% loudness levels, given that these targets required more skilled velocity control than the 0% and 100% levels. Their use would therefore enhance individual differences between participants.
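The per-participant loudness scale and the resulting score can be sketched as follows. The maximum errors are those reported above; the per-keystroke form of the score (1 − error / maximum error) and all function names are assumptions, since the text does not state how individual keystroke errors were aggregated.

```python
# Maximum errors (percentage loudness) reported for the three scored levels.
MAX_ERROR = {25: 42.6250, 50: 34.6667, 75: 28.4583}


def target_velocity(level_pct, v_min, v_max):
    """Linear interpolation between a participant's 0% and 100% velocities."""
    return v_min + (v_max - v_min) * level_pct / 100.0


def loudness_pct(velocity, v_min, v_max):
    """Express a recorded MIDI velocity on the participant's 0-100% scale."""
    return 100.0 * (velocity - v_min) / (v_max - v_min)


def keystroke_score(velocity, level_pct, v_min, v_max):
    """Assumed per-keystroke score: 1 - |deviation| / maximum error."""
    error = abs(loudness_pct(velocity, v_min, v_max) - level_pct)
    return 1.0 - error / MAX_ERROR[level_pct]


# Hypothetical participant range: velocity 5 at 0%, velocity 125 at 100%.
print(target_velocity(50, 5, 125))
print(keystroke_score(77, 50, 5, 125))
```

The 0% and 100% levels are omitted from `MAX_ERROR` because only the 25%, 50% and 75% levels enter the final score.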

Piano playing
To assess the participants' performance on actual piano playing, a sequence with 38 notes played at a constant tempo (isochronous) of 80bpm was devised. Subjects were able to learn and follow the sequence while playing, aided by the software Synthesia (Synthesia LLC). Synthesia showed the notes of the sequence as coloured blocks scrolling on-screen. The participants had to press the keys corresponding to the positions on the keyboard with which the Synthesia blocks were aligned in time to the music in order to score points (see Figure 1). The sequence of notes was designed to be played mainly with the right hand, plus one finger for notes that were too far to the right side of the right hand. These notes could then be reached either using the index finger of the left hand, or, if wearing the SR3T, by activating the robotic finger. On Synthesia, the notes to be played by the right-hand fingers were coloured in green, and the notes to be reached with the extra finger were coloured blue. Similarly, the relevant keys on the keyboard were marked with the same colours (see Figure 2F).
Subjects played the sequence first without and then with the SR3T, for 15 trials in each block. The first five trials were treated as practice trials and were not recorded. The next ten were recorded, but only the last 5 were used to compute the mean piano playing score per subject, because subjects, especially those with no piano playing experience, were still learning the sequence in the earlier trials. For the first block of trials, without the SR3T, subjects played using their right hand for green coloured notes while blue coloured ones were played with the left-hand index finger. To achieve this, subjects had to cross their left hand over the right one. For the second block of trials, the left index finger was replaced by the SR3T. We score each individual keypress's timing as follows:
$$\text{score} = 1 - \frac{\Delta T}{\text{IOI}/2}$$
where ΔT is the absolute time difference between the keypress and the metronome beat, and IOI is the corresponding beat period. Participants therefore receive a full score for each correct keypress at exactly the correct time, with the score decreasing linearly for time deviations of up to half the beat period on either side, at which point the score is 0. Incorrect notes within this window are also scored as 0. We then average the note scores over the entire sequence.
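The keypress scoring rule described in words above can be sketched as follows, assuming the linear form score = 1 − ΔT / (IOI / 2), floored at 0. The function names are illustrative; at 80bpm the beat period is 0.75 s.

```python
def note_score(delta_t, ioi, correct_note=True):
    """Timing score for one keypress.

    delta_t: time difference from the metronome beat (seconds).
    ioi: beat period (seconds).
    Incorrect notes score 0; correct notes lose score linearly and reach 0
    at half a beat period from the beat.
    """
    if not correct_note:
        return 0.0
    return max(0.0, 1.0 - abs(delta_t) / (ioi / 2.0))


def sequence_score(note_scores):
    """Mean of the per-note scores over the whole sequence."""
    return sum(note_scores) / len(note_scores)


ioi = 60.0 / 80.0  # beat period at 80 bpm
scores = [note_score(0.0, ioi), note_score(0.1875, ioi),
          note_score(0.05, ioi, correct_note=False)]
print(sequence_score(scores))
```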
In some trials, recordings were stopped prematurely due to technical errors. In all such cases, the score is calculated with respect to the recorded section only. However, if fewer than 50% of the notes were recorded, the trial was discarded. This happened in only two trials, both from the same subject, which were removed. No other trials for any subject had missed recordings. There were also cases where participants missed one initial beat, leading to them being off-beat for the entire sequence. To adjust for this, we calculate the scores for the sequence as originally timed, as well as if it had been started one beat early or one beat late. We then take the highest of the three scores as the piano playing score for that trial. This only occurred twice.
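The off-beat correction and the 50% rule can be sketched as below. The sketch makes several simplifying assumptions that are not stated in the text: one keystroke per beat, time zero aligned with the first beat, the linear timing score floored at 0, and no check of note correctness. The function name is hypothetical.

```python
def trial_score(onsets_s, beat_period_s, n_notes_expected):
    """Best mean timing score over beat-grid shifts of -1, 0 and +1 beats.

    Returns None (trial discarded) when fewer than half of the expected
    notes were recorded.
    """
    if not onsets_s or len(onsets_s) < 0.5 * n_notes_expected:
        return None

    def mean_score(shift_beats):
        total = 0.0
        for i, onset in enumerate(onsets_s):
            # Expected time of the i-th note on the shifted beat grid.
            expected = (i + shift_beats) * beat_period_s
            delta_t = abs(onset - expected)
            total += max(0.0, 1.0 - delta_t / (beat_period_s / 2.0))
        return total / len(onsets_s)

    return max(mean_score(shift) for shift in (-1, 0, 1))


# A hypothetical performance that is consistently one beat late at 80 bpm.
print(trial_score([0.75, 1.5, 2.25], beat_period_s=0.75, n_notes_expected=3))
```

Taking the maximum over the three shifts means that a player who is perfectly regular but entered one beat late is not penalised on every note.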