Matt Prockup

Music, Machine Learning, Interactive Systems

I am currently a scientist at Pandora working on methods and tools for Music Information Retrieval at scale. I received my Ph.D. in Electrical Engineering from Drexel University. My research interests span audio signal processing, machine learning, and human-computer interaction. I am also an avid percussionist and composer, having performed in and composed for ensembles large and small. I’ve also studied floral design and wheel-thrown ceramics.

LiveNote

Many people enjoy live orchestral performances, but those without musical training may find it hard to relate to the music. We have developed a system that guides users through the performance in real time using a handheld application. Using chroma features and dynamic time warping, we attempt to align the live performance audio with that of a previously annotated reference recording. The aligned position is transmitted to users’ handheld devices, and pre-annotated information about the piece is displayed synchronously.

Acoustic Features: Chroma  

Frequencies are grouped into 12 pitch classes corresponding to the 12 notes of the Western chromatic scale (A, A#, etc.). The chromagram reflects the intensity of each pitch class over time, regardless of octave.
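As a rough illustration of the octave folding described above, the sketch below maps a frequency to its nearest pitch class and sums one frame's spectral energy into 12 bins. The function names, the 440 Hz reference for A, and the use of squared magnitude as "intensity" are assumptions made for this example, not details of the LiveNote implementation.

```python
import math

# Pitch classes indexed from A, matching the (A, A#, ...) ordering in the text.
PITCH_CLASSES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]


def pitch_class(freq_hz, ref_a_hz=440.0):
    """Return the index (0 = A) of the pitch class nearest to freq_hz."""
    semitones_from_a = 12.0 * math.log2(freq_hz / ref_a_hz)
    # The modulo discards the octave, so 220 Hz, 440 Hz and 880 Hz all map to A.
    return round(semitones_from_a) % 12


def chroma_vector(magnitudes, sample_rate, ref_a_hz=440.0):
    """Fold one frame's magnitude spectrum (bins 0..N/2) into 12 energies."""
    n_fft = 2 * (len(magnitudes) - 1)
    chroma = [0.0] * 12
    for k, mag in enumerate(magnitudes):
        if k == 0:
            continue  # skip the DC bin, which has no pitch
        freq_hz = k * sample_rate / n_fft
        chroma[pitch_class(freq_hz, ref_a_hz)] += mag * mag
    return chroma
```

Stacking one such vector per analysis frame gives a chromagram. A practical extractor would typically also restrict the frequency range and normalize each frame.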

 

Alignment Using Dynamic Time Warping

Dynamic Time Warping (DTW) is a dynamic programming technique designed to compare and align two signals that unfold at different timescales. It is used here to determine the time in a reference recording that corresponds to the music currently unfolding in the live performance.

  1. Chroma are extracted from the live audio and compared with chroma from a reference recording.
  2. A pairwise distance matrix and an accumulated cost matrix are computed between the two chroma sequences.
  3. The DTW algorithm aligns the two sequences along the path where the accumulated cost is minimized.
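
The three steps above can be sketched with a textbook offline DTW. This is a minimal illustration, not the tracking system itself: the cosine distance, the three allowed step directions, and the names `cosine_distance` and `dtw_align` are assumptions, and a live tracker would need an incremental variant that does not wait for the full performance.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two chroma vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # treat a silent frame as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def dtw_align(live, ref, dist=cosine_distance):
    """Return (total_cost, path) aligning live frames to reference frames."""
    n, m = len(live), len(ref)
    # Step 2a: pairwise distance matrix between every live and reference frame.
    d = [[dist(live[i], ref[j]) for j in range(m)] for i in range(n)]
    # Step 2b: accumulated cost matrix, padded with an infinite border.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = d[i - 1][j - 1] + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )
    # Step 3: backtrack from the end along the cheapest predecessors.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path
```

Each `(i, j)` pair in the returned path says that live frame `i` corresponds to reference frame `j`, so the last pair gives the reference position for the most recent live frame.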

The Tracking System

The Application

Application Use  

With an iPhone or iPod Touch, audience members use an application that helps guide them through the performance.

  • The server pushes the live position to the clients. Users view time-relevant information based on the current position in real time.
  • Users can look ahead or back up.
  • Multiple information tracks convey different subject matter.
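
One way a client could pick what to display for a given position is a sorted lookup per information track. The sketch below is only an assumption about how this might work: the `AnnotationTrack` class, the use of seconds in the reference recording as the position unit, and the sample annotations in the usage note are all hypothetical.

```python
import bisect


class AnnotationTrack:
    """One information track: annotations keyed by start time in seconds."""

    def __init__(self, name, annotations):
        # annotations: iterable of (start_seconds, text) pairs (hypothetical format)
        self.name = name
        self._annotations = sorted(annotations)
        self._starts = [start for start, _ in self._annotations]

    def at(self, position_seconds):
        """Return the annotation active at the given position, or None."""
        # Index of the last annotation starting at or before the position.
        idx = bisect.bisect_right(self._starts, position_seconds) - 1
        if idx < 0:
            return None
        return self._annotations[idx][1]
```

For example, with a track built from `[(0.0, "Opening"), (30.0, "First theme")]`, calling `at(45.0)` returns `"First theme"`. Looking ahead or backing up is then the same lookup with a position other than the live one.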

Application Content

Annotation content is generated through collaborations with Philadelphia universities and organizations. In addition to user feedback, the musicologists’ feedback helps to tailor the application to the information that they want to convey to their audience.