| |
| disclaimer |
This page contains digital audio data that is reproduced here under
the "Fair Use" clause of the 1976 Copyright Act (17
USCS §107). |
|
| introduction |
| In this experiment, we compute summaries of pop and rock
songs. Our aim is to construct summaries for use in database
applications, both to help users browse individual audio files and to
serve as proxies for indexing, searching, and retrieval. The
summarization is based on a complete structural characterization of the
piece. We extract the two segment clusters which are most
frequently repeated in the song. In contrast to existing methods,
the measure of repetition is not dependent on segment
length. |
|
| results |
The table below shows results for a few songs.
The columns show the summary elements. Segments I and
II are representative segments for the two dominant clusters in the
song, selected based on global similarity to the song's
segments. They are ordered by occurrence in the piece, so
that Segment I is the predicted verse segment and Segment II is the
predicted chorus segment, assuming that the verse and chorus segments
are the dominant segment clusters detected by the automatic
algorithm. The 2 Segment Summary is the most globally similar
contiguous combination of a verse-cluster segment and a chorus-cluster
segment. If one or more segments (such as a lead-in segment) fall
between the verse and chorus in every verse/chorus occurrence in the
piece, they are included to provide a contiguous summary. |
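The selection of the 2 Segment Summary described above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `two_segment_summary`, the per-segment `scores` (standing in for global similarity), and the rule of scoring a span by the sum of its verse and chorus scores are all hypothetical.

```python
def two_segment_summary(clusters, scores, verse, chorus):
    """Return (first, last) segment indices of the best verse...chorus span.

    clusters[i] is the cluster label of segment i, in time order.
    scores[i] is an ASSUMED per-segment global-similarity score.
    Returns None if no verse is ever followed by a chorus.
    """
    best = None
    for i, label in enumerate(clusters):
        if label != verse:
            continue
        # Walk forward to the next chorus; give up if another verse starts first.
        for j in range(i + 1, len(clusters)):
            if clusters[j] == verse:
                break
            if clusters[j] == chorus:
                # Assumed scoring rule: sum of the two endpoint scores,
                # with shorter spans preferred on ties.
                key = (scores[i] + scores[j], -(j - i))
                if best is None or key > best[0]:
                    best = (key, (i, j))
                break
    return None if best is None else best[1]


# Toy song structure; labels and scores are made up for illustration.
clusters = ["intro", "verse", "lead", "chorus",
            "verse", "lead", "chorus", "bridge", "chorus"]
scores = [0.2, 0.7, 0.4, 0.9, 0.8, 0.5, 0.85, 0.3, 0.6]

span = two_segment_summary(clusters, scores, "verse", "chorus")
print(span)
```

The returned pair is inclusive, so any lead-in segment lying between the chosen verse and chorus is played as part of the summary, which keeps it contiguous.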
| Song - Title / Artist | Entire Song | Segment I | Segment II | 2 Segment Summary |
| Wild Honey / U2 | MP3 | MP3 | MP3 | MP3 |
| Lucy in the Sky with Diamonds / The Beatles | MP3 | MP3 | MP3 | MP3 |
| The Magical Mystery Tour / The Beatles | MP3 | MP3 | MP3 | MP3 |
| Optimistic / Radiohead | MP3 | MP3 | MP3 | MP3 |
| Hash Pipe / Weezer | MP3 | MP3 | MP3 | MP3 |
| Bohemian Like You / The Dandy Warhols | MP3 | MP3 | MP3 | MP3 |
| Tahitian Moon / Porno for Pyros | MP3 | MP3 | MP3 | MP3 |
| The Zephyr Song / The Red Hot Chili Peppers | MP3 | MP3 | MP3 | MP3 |
| I Did It / The Dave Matthews Band | MP3 | MP3 | MP3 | MP3 |
To review segmentation results for a subset of the above songs
(included in [2]), see this page.
For comparison, the 30-second summary from Amazon.com: Tahitian Moon
For comparison, the 30-second summary from Amazon.com: Hash Pipe
For comparison, the 30-second summary from Amazon.com: I Did It
|
| technical details |
| The
approach is fully documented in the papers below.
A basic flowchart appears below. The first step is to segment the
digital audio into its major components. For pop music, these are
typically the verse, chorus, bridge, and so on. The segments are then
statistically clustered using spectral methods, and the dominant
segment clusters are determined. Representative segments are
selected from the dominant segment clusters to make up the
summary. Using the time-ordering of the segments, the verse and
chorus clusters are predicted. A final summary consisting of
adjacent "verse/chorus" segments is also provided. In the event
that the verse and chorus are not adjacent anywhere in the piece, we
include the intermediate segments. Because each segment is assigned to a
cluster, many other forms of summary can be provided, according to
the application context or bandwidth constraints. |
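The steps that follow segmentation and clustering can be sketched roughly in Python. The sketch assumes that cluster labels and a segment-by-segment similarity matrix are already available; it does not reproduce the segmentation or the spectral clustering from the papers. The names `Segment`, `dominant_clusters`, `representative`, and `summarize` are hypothetical, and the toy similarity is derived from made-up scalar features purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float   # seconds
    end: float     # seconds
    cluster: int   # label from the clustering step (assumed given)


def dominant_clusters(segments, k=2):
    """Return the k cluster labels that occur most often.

    Occurrences are counted rather than durations summed, so the
    measure of repetition does not depend on segment length.
    """
    counts = Counter(seg.cluster for seg in segments)
    return [label for label, _ in counts.most_common(k)]


def representative(segments, similarity, label):
    """Index of the segment in `label` most similar to the song overall."""
    members = [i for i, seg in enumerate(segments) if seg.cluster == label]
    return max(members, key=lambda i: sum(similarity[i]))


def summarize(segments, similarity):
    """Return [predicted verse index, predicted chorus index]."""
    labels = dominant_clusters(segments)
    # The dominant cluster that appears first in time is taken as the verse.
    labels.sort(key=lambda lab: min(s.start for s in segments if s.cluster == lab))
    return [representative(segments, similarity, lab) for lab in labels]


# Toy song: intro, verse, chorus, verse, chorus, bridge, chorus.
segments = [
    Segment(0, 10, 0), Segment(10, 40, 1), Segment(40, 60, 2),
    Segment(60, 90, 1), Segment(90, 110, 2), Segment(110, 150, 3),
    Segment(150, 170, 2),
]
# Made-up scalar features; similarity decays with feature distance.
features = [0.0, 1.0, 2.0, 1.1, 2.1, 3.0, 1.9]
similarity = [[1.0 / (1.0 + abs(a - b)) for b in features] for a in features]

verse_idx, chorus_idx = summarize(segments, similarity)
print(segments[verse_idx], segments[chorus_idx])
```

Because every segment carries a cluster label, other summary forms (for example, one representative per cluster, or only the shortest representative under a bandwidth limit) could be built from the same data by changing `summarize`.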
 |
|
|
| references |
| [1] J. Foote and M. Cooper. Media Segmentation using Self-Similarity
Decomposition. Proc. SPIE, 5021:167-175,
2003. |
| This paper (above) provides a description of the
approach with a single example for exposition. |
|
|
| [2] M. Cooper and J. Foote. Summarizing Popular Music via Structural
Similarity Analysis. Proc. IEEE Workshop on Applications of
Signal Processing to Audio and Acoustics, 2003. |
| This paper provides an overview of the approach and
more complete experimental results. |