STAT946F17/ Dance Dance Convolution


Revision as of 23:24, 22 November 2017

Introduction

Dance Dance Revolution (DDR)

Dance Dance Revolution (DDR) is a rhythm-based video game. Players perform steps on a dance platform in synchronization with music as directed by on-screen step charts. The dance pad contains up, down, left, and right arrows, each of which can be in one of four states: on, off, hold, or release. There are $4^4 = 256$ possible step combinations at any instant since the four arrows can be in any of the four states independently.
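The count above can be checked by direct enumeration. The following is a minimal sketch; the arrow ordering and the state labels used as Python strings are illustrative assumptions, not an encoding taken from the paper or from StepMania.

```python
from itertools import product

# Illustrative labels (assumed ordering; only the counts matter here).
ARROWS = ("left", "down", "up", "right")
STATES = ("off", "on", "hold", "release")

# Each combination assigns one state to each arrow independently,
# so the total is len(STATES) ** len(ARROWS).
combinations = list(product(STATES, repeat=len(ARROWS)))

print(len(combinations))  # 256
```

Each tuple in `combinations` lists the state of the four arrows in the order given by `ARROWS`.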

Step charts exhibit complex semantics and tend to mirror musical structure: particular sequences of steps correspond to different motifs and recur as sections of the song are repeated (Figure 1). The DDR community uses simulators, such as the open-source StepMania, which allow fans to create their own charts. Typically, packs contain one chart per song for each of five difficulty levels.

Learning to Choreograph

Figure 2. Proposed learning to choreograph pipeline for four seconds of the song Knife Party feat. Mistajam - Sleaze. The pipeline ingests audio features (Bottom) and produces a playable DDR choreography (Top) corresponding to the audio.

Players may grow tired of the existing charts in standardized packs, and creating new charts by hand is time-consuming. The authors therefore introduce the task of learning to choreograph: producing new step charts from raw audio tracks. Although the task has previously been approached with ad-hoc methods, this is the first time it has been cast as a learning task that mimics the semantics of human-generated charts. The learning task is broken into two subtasks. First, step placement decides when to place steps, conditioned on the difficulty level. Second, step selection decides which steps to select at each timestamp (Figure 2).
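The two-subtask decomposition can be illustrated with a toy sketch. This is not the authors' model: `place_steps` and `select_steps` are hypothetical names, the per-frame scores are assumed to come from some upstream audio model, thresholding stands in for the placement model, and uniform random choice stands in for the selection model. It only shows how the output of placement (timestamps) feeds selection (one step per timestamp).

```python
import random
from typing import List, Sequence, Tuple


def place_steps(scores: Sequence[float], threshold: float,
                frame_rate: float) -> List[float]:
    """Stand-in for step placement: keep frames whose score reaches the threshold.

    Returns timestamps in seconds. A lower threshold yields more steps, which
    is one assumed way a difficulty level could be expressed.
    """
    return [i / frame_rate for i, s in enumerate(scores) if s >= threshold]


def select_steps(timestamps: Sequence[float], vocabulary: Sequence[str],
                 rng: random.Random) -> List[Tuple[float, str]]:
    """Stand-in for step selection: pick one step per placed timestamp."""
    return [(t, rng.choice(vocabulary)) for t in timestamps]


# Hypothetical per-frame scores at 100 frames per second.
scores = [0.1, 0.9, 0.4, 0.8]
vocabulary = ["left", "down", "up", "right"]  # toy vocabulary of single arrows

timestamps = place_steps(scores, threshold=0.5, frame_rate=100.0)
chart = select_steps(timestamps, vocabulary, random.Random(0))
print(chart)
```

Keeping the two stages as separate functions mirrors the split described above: the placement stage never needs to know which arrows exist, and the selection stage never sees the audio scores.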

Music Information Retrieval (MIR)

Learning to choreograph is closely tied to music information retrieval (MIR) in two ways. 1) The step placement task closely resembles onset detection, a well-studied MIR problem. Onset detection identifies the times of all musically salient events, such as melody notes or drum strikes, and each DDR step corresponds to an onset. 2) DDR packs specify a metronome click track for each song. For songs with changing tempos, the location of each change and the new tempo are annotated. This click data could help with beat tracking and tempo detection.
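To make the click-track idea concrete, the sketch below converts tempo annotations into click times. The annotation format is an assumption made for illustration: a list of `(beat_index, bpm)` pairs sorted by beat index, starting at beat 0, with changes falling on whole beats. The function name `click_times` is hypothetical and does not reflect the actual pack file format.

```python
from typing import List, Sequence, Tuple


def click_times(tempo_changes: Sequence[Tuple[int, float]],
                num_beats: int) -> List[float]:
    """Return the time in seconds of each of the first `num_beats` clicks.

    `tempo_changes` is assumed to be sorted by beat index, to start at
    beat 0, and to place every tempo change on a whole beat.
    """
    times = []
    t = 0.0
    seg = 0
    for beat in range(num_beats):
        # Move to the next tempo segment once its starting beat is reached.
        while seg + 1 < len(tempo_changes) and tempo_changes[seg + 1][0] <= beat:
            seg += 1
        times.append(t)
        # One beat lasts 60 / bpm seconds at the current tempo.
        t += 60.0 / tempo_changes[seg][1]
    return times


# 120 BPM for four beats, then 60 BPM.
clicks = click_times([(0, 120.0), (4, 60.0)], num_beats=6)
print(clicks)  # [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
```

Click times produced this way could serve as reference labels when evaluating a beat tracker or tempo estimator against the annotated charts.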

Contributions

  • Defining learning to choreograph, a new task with real-world usefulness and strong connections to fundamental problems in MIR.
  • Introducing two large, curated datasets for benchmarking DDR choreography algorithms.
  • Presenting an effective pipeline for learning to choreograph with deep neural networks.

Data

Basic statistics of the two datasets are shown in Table 1.

  • Note:

1) The Fraxtil dataset has a single author, whereas the In The Groove (ITG) dataset has multiple authors.

2) Each dataset contains five charts per song, corresponding to increasing difficulty levels, except that the ITG dataset has 13 songs that lack charts for the highest difficulty.

3) The two datasets have similar vocabulary sizes (81 and 88 distinct step combinations, respectively).

4) The datasets contain around 35 hours of annotations and 350,000 steps considering all charts across all songs.
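Two rough figures follow from these notes and can be checked with a short calculation: the share of the 256 possible step combinations that each dataset actually uses, and the average step density. The sketch assumes that the 35 hours and 350,000 steps are combined totals over both datasets and that both are approximate, so the results are order-of-magnitude estimates only.

```python
# Figures quoted in the text above.
POSSIBLE_COMBINATIONS = 4 ** 4        # 256 step combinations
VOCAB_SIZES = (81, 88)                # distinct combinations, in the order listed
TOTAL_HOURS = 35                      # approximate, assumed combined total
TOTAL_STEPS = 350_000                 # approximate, assumed combined total

# Fraction of the possible combinations that appear in each dataset.
coverage = [size / POSSIBLE_COMBINATIONS for size in VOCAB_SIZES]

# Average number of steps per second of annotated chart time.
steps_per_second = TOTAL_STEPS / (TOTAL_HOURS * 3600)

print([f"{c:.1%}" for c in coverage])   # ['31.6%', '34.4%']
print(f"{steps_per_second:.2f}")        # 2.78
```

So each dataset uses roughly a third of the possible step combinations, and charts average a little under three steps per second.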