Justin Salamon

Best Student Paper Award at 2017 AES International Conference on Semantic Audio

23/6/2017

I'm excited to report that our paper "Pitch Contours as a Mid-Level Representation for Music Informatics" has won the Best Student Paper Award at the 2017 AES International Conference on Semantic Audio. The paper, led and presented by my colleague Rachel Bittner, proposes a factored architecture for a variety of pitch-informed MIR tasks, such as predominant and multiple f0 estimation and genre, gender and singing style classification, with pitch contours serving as a powerful and semantically rich mid-level representation.

So... should all machine learning for music be end-to-end? See what we found in the full paper:

Pitch Contours as a Mid-Level Representation for Music Informatics
R. M. Bittner, J. Salamon, J. J. Bosch, and J. P. Bello.
In AES Conference on Semantic Audio, Erlangen, Germany, Jun. 2017.
[PDF]

Pitch Contours as a Mid-Level Representation for Music Informatics

14/4/2017

Content-based Music Informatics includes tasks that involve estimating the pitched content of music, such as the main melody or the bass line. To date, the field lacks a good machine representation that models the human perception of pitch, with each task using specific, tailored representations. This paper proposes factoring pitch estimation problems into two stages, where the output of the first stage for all tasks is a multipitch contour representation. Further, we propose the adoption of pitch contours as a unit of pitch organization. We give a review of the existing work on contour extraction and characterization and present experiments that demonstrate the discriminability of pitch contours.

Agree? Disagree? Get the full details here:

Pitch Contours as a Mid-Level Representation for Music Informatics
R. M. Bittner, J. Salamon, J. J. Bosch, and J. P. Bello.
In AES Conference on Semantic Audio, Erlangen, Germany, Jun. 2017.
[PDF]


Pitch Analysis for Active Music Discovery @ ICML 2016

23/6/2016

Today I'll be giving an invited talk at the Machine Learning for Music Discovery Workshop as part of the ICML 2016 conference.

The talk is about Pitch Analysis for Active Music Discovery:

A significant proportion of commercial music is comprised of pitched content: a melody, a bass line, a famous guitar solo, etc. Consequently, algorithms that are capable of extracting and understanding this type of pitched content open up numerous opportunities for active music discovery, ranging from query-by-humming to musical-feature-based exploration of Indian art music or recommendation based on singing style. In this talk I will describe some of my work on algorithms for pitch content analysis of music audio signals and their application to music discovery, the role of machine learning in these algorithms, and the challenge posed by the scarcity of labeled data and how we may address it.

And here's the extended abstract:

Pitch Analysis for Active Music Discovery
J. Salamon
Machine Learning for Music Discovery workshop, International Conference on Machine Learning (ICML), invited talk, New York City, NY, USA, June 2016.
[PDF]

The workshop has a great program lined up, so if you're attending ICML 2016 be sure to drop by!


A Comparison of Melody Extraction Methods Based on Source-Filter Modelling

26/5/2016

This work explores the use of source-filter models for pitch salience estimation and their combination with different pitch tracking and voicing estimation methods for automatic melody extraction. Source-filter models are used to create a mid-level representation of pitch that implicitly incorporates timbre information. The spectrogram of a musical audio signal is modelled as the sum of the leading voice (produced by human voice or pitched musical instruments) and accompaniment. The leading voice is then modelled with a Smoothed Instantaneous Mixture Model (SIMM) based on a source-filter model. The main advantage of such a pitch salience function is that it enhances the leading voice even without explicitly separating it from the rest of the signal. We show that this is beneficial for melody extraction, increasing pitch estimation accuracy and reducing octave errors in comparison with simpler pitch salience functions. The adequate combination with voicing detection techniques based on pitch contour characterisation leads to significant improvements over state-of-the-art methods, for both vocal and instrumental music.
For further details see our paper:

A Comparison of Melody Extraction Methods Based on Source-Filter Modelling
J. J. Bosch, R. M. Bittner, J. Salamon, and E. Gómez
Proc. 17th International Society for Music Information Retrieval Conference (ISMIR 2016), New York City, USA, Aug. 2016.

Melody Extraction in Python with Melodia

28/1/2016

Thanks to the great work of Chris Cannam and George Fazekas at the C4DM, it is now possible to run vamp plugins directly in python via the vamp module. This is fantastic news, allowing researchers (and everyone else) to integrate algorithms implemented as vamp plugins directly into their python processing pipeline. In this way it becomes much easier to build fully automated experimental (and application) pipelines purely in python, without the need to make external system calls or export the output of a vamp plugin to disk before importing it into python. w00t.

This is great news for Melodia, my melody extraction algorithm that's implemented as a vamp plugin. Now you can use Melodia to extract the pitch contour of a melody from a song directly in python and use the output for further processing, for example you could segment and quantize the contour into notes and export the melody as a MIDI file or a JAMS file.

If you're interested in using Melodia in python, I've created a short tutorial notebook. It should be easy enough to modify it for using other vamp plugins in python too.
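To give a flavour, here's a minimal sketch of what this looks like. The actual vamp call is shown in comments, since running it needs librosa, the vamp module and the Melodia plugin installed; the post-processing helper below is pure NumPy (`voiced_contour` is just a name I'm using for illustration, not part of any library), and relies on the fact that Melodia reports frames it considers unvoiced as negative frequency values:

```python
import numpy as np

# In a full pipeline you would obtain the pitch contour like this
# (requires librosa, the vamp module, and the Melodia vamp plugin):
#
#   import vamp, librosa
#   audio, sr = librosa.load("song.wav", sr=44100, mono=True)
#   data = vamp.collect(audio, sr, "mtg-melodia:melodia")
#   hop, pitch = data["vector"]
#
# Melodia marks unvoiced frames with negative frequency values,
# so a typical first step is to keep only the voiced frames.

def voiced_contour(pitch, hop_s, t0=0.0):
    """Return (times, frequencies) for voiced frames only."""
    pitch = np.asarray(pitch, dtype=float)
    times = t0 + hop_s * np.arange(len(pitch))
    voiced = pitch > 0
    return times[voiced], pitch[voiced]

# Toy example: two voiced frames surrounded by unvoiced ones
# (hop of 128/44100 ~= 0.0029 s, Melodia's default hop size).
times, freqs = voiced_contour([-220.0, 440.0, 441.0, -440.0], hop_s=0.0029)
```

From here the `(times, freqs)` pair can go straight into whatever python processing you have in mind.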

Convert audio to MIDI melody using Melodia

27/1/2016

The audio_to_midi_melodia python script allows you to extract the melody of a song and save it to a MIDI file. The script uses the Melodia algorithm to perform melody extraction, taking advantage of the new vamp module that allows running vamp plugins (like Melodia) directly in python.

Once the pitch contour of the melody is extracted, the next (non-trivial!) step is to segment it into notes and quantize the pitch of each note, producing a discrete series of notes that can then be exported to any symbolic format such as MIDI or JAMS.

Quantizing a continuous pitch sequence into a series of notes is an active area of research and remains an open problem. Still, we can obtain fairly decent results using a series of heuristics:

1. Convert the pitch sequence from Hertz to (fractional) MIDI note numbers
2. Round each value to the nearest integer MIDI note number
3. Optionally apply a median filter to smooth out short jumps in pitch (e.g. due to vibrato)
4. Iterate over the sequence and whenever the pitch changes start a new note
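The four steps above can be sketched in a few lines of numpy. This is a rough toy version under my own helper names (`hz_to_midi`, `median_smooth`, `segment_notes`), not the actual code of the script:

```python
import numpy as np

def hz_to_midi(f):
    # Step 1: Hertz -> fractional MIDI note numbers (A4 = 440 Hz = note 69)
    return 69.0 + 12.0 * np.log2(np.asarray(f, dtype=float) / 440.0)

def median_smooth(x, size=5):
    # Step 3 (optional): median filter to smooth out short jumps in pitch
    x = np.asarray(x)
    half = size // 2
    padded = np.pad(x, half, mode="edge")
    return np.array([np.median(padded[i:i + size]) for i in range(len(x))])

def segment_notes(midi_pitches, hop_s):
    """Steps 2 and 4: round to integer note numbers, then start a new
    note whenever the rounded pitch changes.
    Returns a list of (onset_s, duration_s, midi_note) tuples."""
    rounded = np.round(midi_pitches).astype(int)  # Step 2
    notes, start = [], 0
    for i in range(1, len(rounded) + 1):
        if i == len(rounded) or rounded[i] != rounded[start]:
            notes.append((start * hop_s, (i - start) * hop_s, int(rounded[start])))
            start = i
    return notes

# Example: a contour hovering around A4 (440 Hz) then jumping up to B4.
hz = [439.0, 441.0, 440.5, 494.0, 493.0]
notes = segment_notes(median_smooth(hz_to_midi(hz), size=3), hop_s=0.01)
```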

Here's an example:
Original audio
Continuous melody contour extracted by Melodia
Quantized MIDI melody obtained with audio_to_midi_melodia

Note that exporting to MIDI requires providing a BPM value. You could select a value arbitrarily, estimate it manually, or estimate it automatically using one of the tempo estimation algorithms included in Essentia, Librosa, or, if you'd like to stick to vamp plugins, QMVP. No BPM is required if you export to JAMS, which directly uses the note onset times estimated from the audio track.
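The reason MIDI needs a BPM at all is that MIDI files count time in ticks per quarter note rather than seconds, so the note times estimated from the audio have to be mapped onto a tempo grid. A tiny sketch of that conversion (assuming a standard PPQ resolution; `seconds_to_ticks` is my own illustrative helper):

```python
def seconds_to_ticks(t, bpm, ppq=480):
    """Convert wall-clock seconds to MIDI ticks.

    MIDI files store time in ticks per quarter note (PPQ), so a BPM
    value is needed to map note onsets/durations estimated from the
    audio onto the MIDI time grid.
    """
    quarter_s = 60.0 / bpm  # duration of one quarter note in seconds
    return int(round(t / quarter_s * ppq))

# At 120 BPM a quarter note lasts 0.5 s, so 1 s spans two quarters:
# 2 * 480 = 960 ticks.
ticks = seconds_to_ticks(1.0, bpm=120)
```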

The script is open source and available on GitHub: https://github.com/justinsalamon/audio_to_midi_melodia
(Figure: MIDI melody extracted using the audio_to_midi_melodia script and loaded into GarageBand.)
Once the melody is saved as MIDI it can be imported into any DAW that supports MIDI editing, like GarageBand.

Best Oral Presentation Award at ISMIR 2015

2/11/2015

Our paper "Melody Extraction by Contour Classification", presented by my colleague and first author Rachel Bittner, has won the Best Oral Presentation Award at the ISMIR 2015 conference!

A huge congratulations to my co-authors Rachel Bittner, Slim Essid and Juan Pablo Bello, and especially to Rachel for doing such an excellent job at presenting the paper at the conference!

R. Bittner, J. Salamon, S. Essid and J. P. Bello. "Melody Extraction by Contour Classification". Proc. 16th International Society for Music Information Retrieval Conference (ISMIR 2015), Malaga, Spain, Oct. 2015.
[ISMIR][PDF][BibTex]


Melody Extraction by Contour Classification

21/7/2015

Due to the scarcity of labeled data, most melody extraction algorithms do not rely on fully data-driven processing blocks but rather on careful engineering. For example, the Melodia melody extraction algorithm employs a pitch contour selection stage that relies on a number of heuristics for selecting the melodic output. In this paper we explore the use of a discriminative model to perform purely data-driven melodic contour selection. Specifically, a discriminative binary classifier is trained to distinguish melodic from non-melodic contours. This classifier is then used to predict likelihoods for a track’s extracted contours, and these scores are decoded to generate a single melody output. The results are compared with the Melodia algorithm and with a generative model used in a previous study. We show that the discriminative model outperforms the generative model in terms of contour classification accuracy, and the melody output from our proposed system performs comparatively to Melodia. The results are complemented with error analysis and avenues for future improvements.

For further details please see our paper:

R. Bittner, J. Salamon, S. Essid and J. P. Bello. "Melody Extraction by Contour Classification". Proc. 16th International Society for Music Information Retrieval Conference (ISMIR 2015), Malaga, Spain, Oct. 2015.
[ISMIR][PDF][BibTex]


The Neumerator

22/6/2015

The Neumerator will take any audio file and generate a medieval neume-style manuscript from it!
As you might have guessed, this is a hack... more specifically, the hack I worked on together with Kristin Olson and Tejaswinee Kelkar during this month's Monthly Music Hackathon NYC (where I also gave a talk about melody extraction).

How does The Neumerator work?

You start by choosing a music recording, for example "Ave Generosa" by Hildegard Von Bingen (to keep things simple we'll work just with the first 20 seconds or so):
The first step is to extract the pitch sequence of the melody, for which we use Melodia, the melody extraction algorithm I developed as part of my PhD. The result looks like this (pitch vs time):
Once we have the continuous pitch sequence, we need to discretize it into notes. Whilst we could load the sequence into a tool such as Tony to perform clever note segmentation, for our hack we wanted to keep things simple (and fully automated), so we implemented our own very simple note segmentation process. First, we quantize the pitch curve into semitones:
Then we smooth out very short glitches using a majority-vote sliding window:
Then we keep just the points where the pitch changes, and we're starting to get close to something that looks kinda "neumey":
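The quantize-smooth-keep-changes process described above can be sketched like so. This is a toy reimplementation under assumed helper names (`majority_smooth`, `change_points`), not the hack's actual code:

```python
import numpy as np
from collections import Counter

def majority_smooth(semitones, size=5):
    # Slide a window over the semitone-quantized pitch sequence and
    # replace each value with the most common value in its window
    # (majority vote), which removes very short glitches.
    half = size // 2
    padded = np.pad(np.asarray(semitones), half, mode="edge")
    return [Counter(padded[i:i + size].tolist()).most_common(1)[0][0]
            for i in range(len(semitones))]

def change_points(semitones):
    # Keep only the points where the pitch changes -- the "neumey" bits.
    return [(i, p) for i, p in enumerate(semitones)
            if i == 0 or p != semitones[i - 1]]

# A one-frame glitch (the lone 71) is voted away by the smoothing,
# leaving two change points: the opening note and the jump to 72.
smoothed = majority_smooth([69, 69, 71, 69, 69, 72, 72, 72], size=3)
points = change_points(smoothed)
```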
Finally, we run it through the neumerator manuscript generator, which applies our secret combination of magic, unicorns and Gregorian chant, and voila!
Being a hack and all, The Neumerator does have its limitations: currently we can only draw points (rather than connecting them into actual neumes), everything goes on a single four-line staff (regardless of the duration of the audio being processed), and of course the note quantization step is pretty basic. Oh, and the mapping of the estimated notes onto the actual pixel positions of the manuscript is a total hack. But hey, that's what future hackathons are for!

Want to take The Neumerator to the next level? It's all on GitHub:
github.com/justinsalamon/neumerator

Melody Extraction Talk at Spotify Monthly Music Hackathon

22/6/2015

On Saturday June 20th I gave a talk about melody extraction at the Monthly Music Hackathon NYC hosted by Spotify's NYC office. The talk, titled "Melody Extraction: Algorithms and Applications in Music Informatics", provided a bird's eye view of my work on melody extraction, including the Melodia algorithm and a number of applications such as query-by-humming, cover song ID, genre classification, automatic transcription and tonic identification in Indian classical music (all of which are mentioned on my melody extraction page and covered in greater detail in my PhD thesis).

In addition to my own talk, we had fantastic talks by NYU's Uri Nieto, Rachel Bittner, Eric Humphrey and Ethan Hein. A big thanks to Jonathan Marmor for organizing such an awesome day!

Oh, we also had a lot of fun working on our hack later on in the day: The Neumerator!