Here we show a few sample motions from our dataset in an interactive 3D visualizer. Please download the dataset for the full recordings of all 153 pieces.
Results from our method. We train policies in physics simulation to play new music scores that do not appear in the dataset.
Our markerless motion capture system first reconstructs initial 3D motions by triangulating 2D keypoints from multi-view videos. The motions are then regularized by fitting MANO hand parameters, and finally refined via inverse kinematics using the MIDI key-press data recorded by specialized sensors.
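To make the triangulation step concrete, below is a minimal sketch of multi-view DLT triangulation for hand keypoints, assuming calibrated 3×4 camera projection matrices and 21 detected keypoints per view. The function names, array shapes, and the plain SVD solver are illustrative assumptions, not our capture code, which additionally regularizes with MANO fitting and refines with MIDI-driven inverse kinematics.

```python
import numpy as np

def triangulate_point(proj_mats, points_2d):
    """Triangulate one 3D point from its 2D observations in several views
    using the standard Direct Linear Transform (DLT).

    proj_mats : list of (3, 4) camera projection matrices (calibrated).
    points_2d : list of (u, v) pixel coordinates, one per view.
    """
    rows = []
    for P, (u, v) in zip(proj_mats, points_2d):
        # Each view contributes two linear constraints on the homogeneous point X.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # Least-squares solution: the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def triangulate_hand(proj_mats, keypoints_2d):
    """keypoints_2d: (num_views, 21, 2) array of per-view 2D hand keypoints."""
    num_joints = keypoints_2d.shape[1]
    return np.stack([triangulate_point(proj_mats, keypoints_2d[:, j])
                     for j in range(num_joints)])
```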
Reconstructed motion (shown as white hands) accurately matches the original video from multiple camera angles.
Given a new music score in MIDI format, we first create a reference motion ensemble of 3D hand motions using a trained diffusion model together with motion retrieval from the dataset. We then train a control policy, along with two discriminators, to imitate the reference motion ensemble while achieving musical accuracy in physics simulation.
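As a rough illustration of the imitation component, the sketch below shows one common way a motion discriminator can be turned into a style reward for the policy, using an AMP-style least-squares formulation. The network sizes, the reward shaping, and how the two discriminators split the observations are assumptions for illustration, not the exact training setup; in practice this style reward would be combined with a task reward for pressing the correct keys at the correct times.

```python
import torch
import torch.nn as nn

class MotionDiscriminator(nn.Module):
    """Scores short windows of hand motion: high for windows resembling the
    reference motion ensemble, low for the policy's own rollouts."""
    def __init__(self, obs_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def discriminator_loss(disc: MotionDiscriminator,
                       ref_windows: torch.Tensor,
                       policy_windows: torch.Tensor) -> torch.Tensor:
    # Least-squares GAN objective: reference windows are pushed toward +1,
    # policy-generated windows toward -1.
    ref_loss = ((disc(ref_windows) - 1.0) ** 2).mean()
    pol_loss = ((disc(policy_windows) + 1.0) ** 2).mean()
    return ref_loss + pol_loss

def style_reward(disc: MotionDiscriminator,
                 policy_windows: torch.Tensor) -> torch.Tensor:
    # Bounded imitation reward in [0, 1], added to the task (key-press) reward.
    with torch.no_grad():
        score = disc(policy_windows)
        return torch.clamp(1.0 - 0.25 * (score - 1.0) ** 2, min=0.0).squeeze(-1)
```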
We thank Yifeng Jiang and Jiaman Li for providing detailed feedback on the paper. This work was supported in part by the Wu Tsai Human Performance Alliance, the Stanford Institute for Human-Centered Artificial Intelligence, and Roblox. We also thank the 15 pianist volunteers for their essential contributions to this study; to protect their privacy, they remain unnamed, but their participation was invaluable to our research.
@inproceedings{wang2024piano,
title = {FürElise: Capturing and Physically Synthesizing Hand Motions of Piano Performance},
author = {Ruocheng Wang and Pei Xu and Haochen Shi and Elizabeth Schumann and C. Karen Liu},
booktitle = {SIGGRAPH Asia 2024},
year = {2024}
}
Website template adapted from DexCap.