Automatic Music Summarization via Similarity Analysis

 Matthew Cooper and Jonathan Foote

{cooper, foote}@fxpal.com

ABSTRACT

We present methods for automatically producing summary excerpts or thumbnails of music. To find the most representative excerpt, we maximize the average segment similarity to the entire work. After window-based audio parameterization, a quantitative similarity measure is calculated between every pair of windows, and the results are embedded in a 2-D similarity matrix. Summing the similarity matrix over the support of a segment results in a measure of how similar that segment is to the whole. This can be maximized to find the segment that best represents the entire work. We discuss variations on the method, and present experimental results for orchestral music, popular songs, and jazz. These results demonstrate that the method finds significantly representative excerpts, using very few assumptions about the source audio.
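The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the audio has already been reduced to one feature vector per window, and it uses cosine similarity as the pairwise measure, since the abstract does not name one. The function names `similarity_matrix` and `best_summary` are hypothetical.

```python
import numpy as np


def similarity_matrix(features):
    """Pairwise cosine similarity between window feature vectors (one per row).

    Cosine similarity is an assumption here; the abstract only says
    "a quantitative similarity measure".
    """
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T


def best_summary(S, length):
    """Return (start, score) of the segment most similar to the whole work.

    The score of a segment starting at window q is the average of
    S[q:q+length, :], i.e. the similarity matrix summed over the
    segment's support and normalized by the number of entries.
    """
    n = S.shape[0]
    if not 1 <= length <= n:
        raise ValueError("length must be between 1 and the number of windows")
    row_sums = S.sum(axis=1)  # similarity of each window to the entire work
    csum = np.concatenate(([0.0], np.cumsum(row_sums)))
    scores = (csum[length:] - csum[:-length]) / (n * length)
    start = int(np.argmax(scores))  # first maximum if there are ties
    return start, float(scores[start])
```

Given an array `features` with one row per window, `best_summary(similarity_matrix(features), length)` returns the starting window of the highest-scoring excerpt of `length` windows. The cumulative sum evaluates every candidate start in a single pass over the row sums.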

To appear in Proc. Third International Symposium on Musical Information Retrieval, September 2002, Paris.

