Piano Transcription with Convolutional Sparse Lateral Inhibition

Cite Details

Andrea Cogliati, Zhiyao Duan and Brendt Wohlberg, "Piano Transcription with Convolutional Sparse Lateral Inhibition", IEEE Signal Processing Letters, vol. 24, no. 4, pp. 392–396, Apr 2017, doi:10.1109/LSP.2017.2666183

Abstract

This paper extends our prior work on context-dependent piano transcription to estimate the length of the notes in addition to their pitch and onset. This approach employs convolutional sparse coding along with lateral inhibition constraints to approximate a musical signal as the sum of piano note waveforms (dictionary elements) convolved with their temporal activations. The waveforms are pre-recorded for the specific piano to be transcribed in the specific environment. A dictionary containing multiple waveforms per pitch is generated by truncating a long waveform for each pitch to different lengths. During transcription, the dictionary elements are fixed and their temporal activations are estimated and post-processed to obtain the pitch, onset and note length estimation. A sparsity penalty promotes globally sparse activations of the dictionary elements, and a lateral inhibition term penalizes concurrent activations of different waveforms corresponding to the same pitch within a temporal neighborhood, to achieve note length estimation. Experiments on the MAPS dataset show that the proposed approach significantly outperforms a state-of-the-art music transcription method trained in the same context-dependent setting in transcription accuracy.
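The model described in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' implementation: the function names, the toy decaying-sinusoid "notes", the penalty weights `lam` and `gamma`, the neighborhood `radius`, and the exact form of the lateral inhibition term (a product of activation magnitudes within a temporal window) are all assumptions chosen for illustration. It only evaluates an objective of the kind the abstract describes (reconstruction error plus a sparsity penalty plus a same-pitch inhibition penalty) and does not solve the optimization.

```python
import numpy as np

def build_dictionary(long_waveforms, lengths):
    """Truncate each pitch's long waveform to several lengths.

    Returns a list of (pitch_index, atom) pairs, one per (pitch, length).
    """
    atoms = []
    for pitch, waveform in enumerate(long_waveforms):
        for n in lengths:
            atoms.append((pitch, waveform[:n].copy()))
    return atoms

def reconstruct(atoms, activations, signal_len):
    """Sum of dictionary atoms convolved with their temporal activations."""
    signal = np.zeros(signal_len)
    for (_, atom), x in zip(atoms, activations):
        signal += np.convolve(x, atom)[:signal_len]
    return signal

def lateral_inhibition(atoms, activations, radius):
    """Assumed penalty: co-activation of *different* atoms of the *same*
    pitch within +/- radius samples of each other."""
    kernel = np.ones(2 * radius + 1)
    total = 0.0
    for i, (pitch_i, _) in enumerate(atoms):
        for j, (pitch_j, _) in enumerate(atoms):
            if i != j and pitch_i == pitch_j:
                # Activation mass of atom j near each time sample.
                nearby = np.convolve(np.abs(activations[j]), kernel, mode="same")
                total += float(np.dot(np.abs(activations[i]), nearby))
    return total

def objective(signal, atoms, activations, lam, gamma, radius):
    """Reconstruction error + l1 sparsity + lateral inhibition (assumed form)."""
    residual = signal - reconstruct(atoms, activations, len(signal))
    data_term = 0.5 * float(np.sum(residual ** 2))
    sparsity = lam * sum(float(np.abs(x).sum()) for x in activations)
    inhibition = gamma * lateral_inhibition(atoms, activations, radius)
    return data_term + sparsity + inhibition

# Toy data: two "pitches" as decaying sinusoids, truncated to two lengths each.
t = np.arange(200)
waves = [np.exp(-t / 80.0) * np.sin(2 * np.pi * f * t) for f in (0.05, 0.08)]
atoms = build_dictionary(waves, lengths=[50, 100])

# A single long note of pitch 0 starting at sample 10.
signal_len = 300
clean = [np.zeros(signal_len) for _ in atoms]
clean[1][10] = 1.0
signal = reconstruct(atoms, clean, signal_len)

# A competing explanation that also fires the short atom of the same pitch.
competing = [x.copy() for x in clean]
competing[0][12] = 1.0

cost_clean = objective(signal, atoms, clean, lam=0.1, gamma=1.0, radius=5)
cost_competing = objective(signal, atoms, competing, lam=0.1, gamma=1.0, radius=5)
```

In this toy setup the competing explanation costs more than the clean one, partly because of the inhibition term. Once activations are estimated, the length of the selected atom for a pitch would serve as the note length estimate, which is the role the abstract assigns to the multi-length dictionary.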

BibTeX Entry

@article{cogliati-2017-piano,
  author = {Andrea Cogliati and Zhiyao Duan and Brendt Wohlberg},
  title = {Piano Transcription with Convolutional Sparse Lateral Inhibition},
  year = {2017},
  month = Apr,
  urlpdf = {http://brendt.wohlberg.net/publications/pdf/cogliati-2017-piano.pdf},
  journal = {IEEE Signal Processing Letters},
  volume = {24},
  number = {4},
  doi = {10.1109/LSP.2017.2666183},
  pages = {392--396}
}