Nvidia has recently announced progress on several fronts in artificial intelligence (AI) research. Earlier this month, for example, the company partnered with Hackster to launch the AI at the Edge Challenge, in which participants use the Jetson Nano Developer Kit to build new models based on neural networks. Nvidia also released Jarvis, a multimodal AI software development kit that fuses input from multiple sensors into one system, in November. In addition, the company has designed a prototype algorithm intended to help robots pick up arbitrary objects.
(Instagram via Neowin)
This article, however, is about a new deep learning model that Nvidia introduced at NeurIPS 2019. It automatically generates dance moves that fit the music it is given.
The software, developed in collaboration with the University of California, Merced, is also known as AI Choreographer.
Although the task may not seem difficult on the surface, the team notes that measuring the precise correlation between music and dance means accounting for a number of variables, such as the beat and style of the music.
To this end, the team collected clips from three representative dance categories: ballet, Zumba, and hip-hop. The researchers then used 361,000 dance clips to train a generative adversarial network (GAN).
The GAN is the core component of the decomposition-to-composition framework; its structure is shown in the figure above (source: GitHub).
In the top-down decomposition phase, the team used a motion beat detector to cut real dance sequences into units, normalized those units, and then trained DU-VAE to model them.
In the bottom-up composition phase, the system is given paired music and dance data, and MM-GAN learns how to arrange the dance units to match the given music.
At test time, the researchers extract the style and beats from the input music, generate a series of dance units one after another in a recurrent manner, and finally apply a beat warper to align the output dance with the music.
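To make the idea of cutting a dance into units more concrete, here is a minimal sketch in Python. It is not the team's detector: it assumes, purely for illustration, that a motion beat can be approximated by a local minimum in overall joint speed, and the function names (`joint_speed`, `detect_motion_beats`, `split_into_units`) are hypothetical.

```python
import math

# Illustrative heuristic only: the function names and the "local minimum
# of speed" rule are assumptions, not the detector used by the researchers.

def joint_speed(frames):
    """Mean per-joint displacement between consecutive frames.

    frames: list of poses, each pose a list of (x, y) joint coordinates.
    Returns a list with len(frames) - 1 entries.
    """
    speeds = []
    for prev, curr in zip(frames, frames[1:]):
        total = sum(math.dist(p, c) for p, c in zip(prev, curr))
        speeds.append(total / len(curr))
    return speeds


def detect_motion_beats(frames):
    """Return frame indices where overall joint speed is a local minimum."""
    speeds = joint_speed(frames)
    beats = []
    # speeds[i] measures motion from frame i to frame i + 1; report frame i.
    for i in range(1, len(speeds) - 1):
        if speeds[i] < speeds[i - 1] and speeds[i] <= speeds[i + 1]:
            beats.append(i)
    return beats


def split_into_units(frames, beats):
    """Cut the sequence at each beat index, dropping empty pieces."""
    bounds = [0] + list(beats) + [len(frames)]
    return [frames[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]
```

Each resulting unit is a short run of poses between two detected beats, which is the kind of segment a model such as DU-VAE could then be trained on.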
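The final alignment step can be pictured as a time warp. The sketch below is a simplified stand-in rather than the researchers' implementation: it assumes poses are flat lists of floats, that motion beats and music beats are given as matching frame indices, and that piecewise-linear resampling is good enough. The function name `warp_to_beats` is hypothetical.

```python
# Illustrative sketch: `warp_to_beats` and the piecewise-linear resampling
# are assumptions, not the beat warper described by the researchers.

def warp_to_beats(frames, motion_beats, music_beats):
    """Resample `frames` so each motion beat lands on its music beat.

    frames: list of poses, each a flat list of floats.
    motion_beats, music_beats: strictly increasing frame indices of equal
    length, all strictly between 0 and len(frames) - 1.
    """
    if len(motion_beats) != len(music_beats):
        raise ValueError("beat lists must have the same length")
    last = len(frames) - 1
    src = [0] + list(motion_beats) + [last]  # anchor times in the input
    dst = [0] + list(music_beats) + [last]   # anchor times in the output
    out = []
    seg = 0
    for t in range(last + 1):
        # Advance to the output segment that contains frame t.
        while seg < len(dst) - 2 and t > dst[seg + 1]:
            seg += 1
        frac = (t - dst[seg]) / (dst[seg + 1] - dst[seg])
        s = src[seg] + frac * (src[seg + 1] - src[seg])  # source time
        lo = int(s)
        hi = min(lo + 1, last)
        w = s - lo
        # Linearly interpolate between the two nearest source poses.
        out.append([(1 - w) * a + w * b
                    for a, b in zip(frames[lo], frames[hi])])
    return out
```

For example, with a nine-frame sequence whose motion beat falls on frame 2 while the music beat falls on frame 4, the first part of the dance is slowed down and the rest is sped up so that the two beats coincide.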
To train the model, the team used the PyTorch deep learning framework and Nvidia Tesla V100 GPUs, with OpenPose handling pose estimation.
OpenPose is a real-time multi-person system that jointly detects key points of the human body, hands, face, and feet in a single image.
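For readers curious about what such key points look like in practice, the following sketch reads body key points from OpenPose's JSON output, where each person carries a flat `pose_keypoints_2d` list of x, y, confidence triples. The helper name `parse_pose_keypoints` and the confidence threshold are assumptions made for illustration, and the article does not say how the team consumed OpenPose's output.

```python
import json

# Illustrative helper: the function name and the 0.1 threshold are
# assumptions; only the JSON field names come from OpenPose's output format.

def parse_pose_keypoints(json_text, min_confidence=0.1):
    """Return, per detected person, a list of (x, y) tuples or None.

    A joint is None when its detection confidence is below the threshold.
    """
    data = json.loads(json_text)
    people = []
    for person in data.get("people", []):
        flat = person.get("pose_keypoints_2d", [])
        joints = []
        # Values come in consecutive (x, y, confidence) triples.
        for i in range(0, len(flat) - 2, 3):
            x, y, conf = flat[i:i + 3]
            joints.append((x, y) if conf >= min_confidence else None)
        people.append(joints)
    return people
```

Running this over the JSON produced for every frame of a video yields a per-frame pose sequence for each dancer.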
Looking ahead, Nvidia plans to extend the approach to other dance styles, such as pop dance and partner dance. After the NeurIPS conference, the source code and trained model will be published on GitHub.