MIR – The Phrase Stealing Music Polyphonic Reduction Algorithm

Related work: Lui, S., Horner, A., and Ayers, L. 2006. “An Intelligent SP-MIDI Polyphonic Reduction Algorithm”, IEEE Transactions on Multimedia, Volume 13, Issue 2, pp.52-59.


MIDI (Musical Instrument Digital Interface) is a popular music format used in multimedia messaging service (MMS) applications such as mobile phone ringtones. Scalable Polyphony MIDI (SP-MIDI) is an enhanced format that allows composers to specify how MIDI data should be performed by hardware devices with different numbers of polyphonic voices. I-Melody is a popular standard file format for simple melodies, and has been adopted as a monophonic ringtone format. Most current mobile phones support only I-Melody ringtones or SP-MIDI ringtones with specific polyphonic limits (for example, 1 for monophonic, 2, 4, 8, or 16). Since most MIDI files are composed without regard to polyphonic limits, a common problem in the mobile phone industry is conversion from MIDI to SP-MIDI. However, simple MIDI to SP-MIDI reduction algorithms, such as note stealing, may lose or interrupt important musical information. This paper presents a phrase stealing algorithm that drops the perceptually least important notes when reducing a MIDI file to SP-MIDI and preserves the most important phrases. The phrase stealing algorithm produces SP-MIDI files with an average phrase length of 10 notes, in contrast to the note stealing algorithm, which disrupts perceptually important melodic phrases. Formal listening test results show that listeners found the phrase stealing reduction very similar to the original, a substantial improvement over note stealing, which listeners found only somewhat similar to the original.

Brief Introduction

We reduce the polyphony of a MIDI file while preserving its perceptually important components, such as the bass line and the melody. The resulting file sounds very similar to the original. For example, a Bach partita with 16-note polyphony can be reduced to 4-note polyphony with 60% of its notes truncated and still sound very similar to the original. We propose a phrase stealing algorithm that preserves musical phrases in order to maintain perceptual smoothness. It is more effective than the traditional approach, which truncates notes in First-In-First-Out (FIFO) order.
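To make the contrast concrete, here is a minimal sketch of the traditional FIFO note-stealing baseline only. It is not the phrase stealing algorithm, whose details are given in the paper. The `Note` class, the function names `max_polyphony` and `fifo_note_stealing`, and the use of beats as the time unit are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    """Hypothetical note representation (not taken from the paper)."""
    onset: float   # start time in beats
    offset: float  # end time in beats
    pitch: int     # MIDI note number


def max_polyphony(notes):
    """Return the largest number of simultaneously sounding notes."""
    events = []
    for n in notes:
        events.append((n.onset, 1))
        events.append((n.offset, -1))
    # At equal times, process note-offs (-1) before note-ons (+1) so that
    # back-to-back notes are not counted as overlapping.
    events.sort()
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


def fifo_note_stealing(notes, limit):
    """Baseline reduction: when a new note would exceed `limit` voices,
    cut off the oldest sounding note (assumes limit >= 1)."""
    result = []
    active = []  # currently sounding notes, oldest first
    for note in sorted(notes, key=lambda n: (n.onset, n.pitch)):
        # Retire notes that have already ended by the new onset.
        still_sounding = []
        for a in active:
            if a.offset <= note.onset:
                result.append(a)
            else:
                still_sounding.append(a)
        active = still_sounding
        if len(active) >= limit:
            stolen = active.pop(0)
            # Keep the audible part of the stolen note; drop it entirely
            # if it would have zero length.
            if note.onset > stolen.onset:
                result.append(replace(stolen, offset=note.onset))
        active.append(note)
    result.extend(active)
    return sorted(result, key=lambda n: (n.onset, n.pitch))


# Illustrative input: a sustained bass note, a sustained inner voice,
# and a three-note melody, all starting at beat 0.
example = [
    Note(0, 4, 36),
    Note(0, 4, 60),
    Note(0, 1, 72),
    Note(1, 2, 74),
    Note(2, 3, 76),
]
reduced = fifo_note_stealing(example, limit=2)
print(max_polyphony(example), max_polyphony(reduced))
print([n.pitch for n in reduced])
```

In this toy input the bass note (pitch 36) is the oldest sounding voice when the limit is first exceeded, so FIFO removes it. This is the kind of loss of a perceptually important line that motivates choosing what to drop at the phrase level instead of the note level.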


Figure 1. The original music (left) and the same music with its polyphony reduced by the phrase stealing algorithm (right). The two versions sound perceptually similar.