Skeleton Plays Piano: Online Generation of Pianist Body Movements from MIDI Performance
Bochen Li, Akira Maezawa, and Zhiyao Duan

This project is in collaboration with the Yamaha Corporation.
This project is partially supported by the National Science Foundation under grant No. 1741472, titled "BIGDATA: F: Audio-Visual Scene Understanding".
Publication
What is the problem?
We aim to train a system that generates a virtual pianist animation with expressive performance motions, given symbolic music in MIDI format.
- Input: a live data stream of key depression actions and the corresponding metric structure (optional)
- Output: a time sequence of body joint coordinates
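
To make the input and output concrete, here is a minimal sketch of one way the data could be represented. The 88-key range, the 30 fps frame rate, the 8-joint 2-D skeleton, and the names `NoteOn`, `events_to_frame`, and `rest_pose` are illustrative assumptions, not details taken from the project.

```python
from dataclasses import dataclass
from typing import List, Tuple

# All constants below are assumptions made for illustration.
NUM_KEYS = 88        # standard piano range
LOWEST_PITCH = 21    # MIDI pitch of the lowest piano key (A0)
FRAME_RATE = 30      # skeleton frames per second
NUM_JOINTS = 8       # number of upper-body joints

# One output frame: an (x, y) coordinate per joint.
SkeletonFrame = List[Tuple[float, float]]


@dataclass
class NoteOn:
    """A single key depression from the live MIDI stream."""
    time: float     # seconds since the stream started
    pitch: int      # MIDI pitch number
    velocity: int   # MIDI velocity, 1..127


def events_to_frame(events: List[NoteOn], frame_index: int) -> List[float]:
    """Collect the key depressions that fall inside one frame.

    Returns an 88-dimensional vector holding the normalized velocity of
    each key pressed during the frame, and 0.0 for all other keys.
    """
    start = frame_index / FRAME_RATE
    end = (frame_index + 1) / FRAME_RATE
    frame = [0.0] * NUM_KEYS
    for ev in events:
        key = ev.pitch - LOWEST_PITCH
        if start <= ev.time < end and 0 <= key < NUM_KEYS:
            frame[key] = ev.velocity / 127.0
    return frame


def rest_pose() -> SkeletonFrame:
    """A placeholder output frame with every joint at the origin."""
    return [(0.0, 0.0)] * NUM_JOINTS
```

Under these assumptions, a performance becomes one 88-dimensional input vector per frame, and the system answers with one `SkeletonFrame` per frame.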

Motivation
- Generating expressive body movement is important for music interactions
- Most existing frameworks cannot incorporate music context information for whole-body expressive movement generation
Applications
- Demonstration for music learners by replicating a musician's body interpretations of music
- More immersive music enjoyment experience
- Visual interactions in automatic computer accompaniment

What is our approach?
We first use two CNNs to parse the raw input, one for the MIDI note stream and one for the metric structure. We then feed the extracted feature representations to an LSTM network, which generates the body movements as a sequence of upper-body joint coordinates forming a skeleton.
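
The data flow described above can be sketched in plain Python as follows. This is an untrained toy with random weights, not the project's model: the layer sizes, the kernel widths, the global max-pooling, the 3-element metric vector, and every name in the code (`SkeletonGenerator`, `conv_features`, `lstm_step`, and so on) are assumptions chosen only to show how two convolutional feature extractors can feed a recurrent cell that emits one skeleton frame per input frame.

```python
import math
import random


def conv_features(x, kernels):
    """Slide each kernel over x, apply ReLU, and keep the strongest response."""
    feats = []
    for k in kernels:
        responses = [
            sum(k[j] * x[i + j] for j in range(len(k)))
            for i in range(len(x) - len(k) + 1)
        ]
        feats.append(max(0.0, max(responses)))
    return feats


def affine(weights, bias, vec):
    """Compute weights @ vec + bias with nested lists."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(p, x, h, c):
    """One LSTM time step; every gate sees the concatenation [x, h]."""
    z = x + h
    i = [sigmoid(v) for v in affine(p["Wi"], p["bi"], z)]    # input gate
    f = [sigmoid(v) for v in affine(p["Wf"], p["bf"], z)]    # forget gate
    o = [sigmoid(v) for v in affine(p["Wo"], p["bo"], z)]    # output gate
    g = [math.tanh(v) for v in affine(p["Wg"], p["bg"], z)]  # candidate
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class SkeletonGenerator:
    """Toy online generator: two conv extractors -> LSTM cell -> joints."""

    def __init__(self, num_joints=8, hidden=16, seed=0):
        rng = random.Random(seed)
        self.note_kernels = rand_matrix(rng, 4, 12)   # assumed: one octave wide
        self.metric_kernels = rand_matrix(rng, 2, 2)  # assumed: width 2
        in_dim = 4 + 2 + hidden
        self.lstm = {}
        for gate in "ifog":
            self.lstm["W" + gate] = rand_matrix(rng, hidden, in_dim)
            self.lstm["b" + gate] = [0.0] * hidden
        self.w_out = rand_matrix(rng, 2 * num_joints, hidden)
        self.b_out = [0.0] * (2 * num_joints)
        self.h = [0.0] * hidden   # recurrent state carried across frames
        self.c = [0.0] * hidden

    def step(self, note_frame, metric_frame):
        """Consume one input frame and return one (x, y) pair per joint."""
        feats = (conv_features(note_frame, self.note_kernels)
                 + conv_features(metric_frame, self.metric_kernels))
        self.h, self.c = lstm_step(self.lstm, feats, self.h, self.c)
        flat = affine(self.w_out, self.b_out, self.h)
        return [(flat[2 * j], flat[2 * j + 1]) for j in range(len(flat) // 2)]
```

Calling `step` once per incoming frame keeps the generation online: each output depends only on the current input and the state accumulated from earlier frames, never on future notes.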

Our Results
Subjective Evaluations
We conduct subjective evaluations to rate the expressiveness and naturalness of the generated skeleton movements compared with those extracted from real human players. Specifically, we recruit 18 subjects from Yamaha to watch 32 ten-second video excerpts of "skeleton plays piano": 16 generated and 16 real. The ratings are plotted in the following figure, where tracks with significantly different ratings are marked with "*".

Demo Videos
The generated skeleton movements for all 16 tracks, each compared with the real human performance, are listed here:
Visit the YouTube playlist for the above 16 demo videos <here>
Visit the YouTube playlist for demo videos without the real-human comparison <here>