Category: Melodia

A Comparison of Melody Extraction Methods Based on Source-Filter Modelling

26/5/2016

This work explores the use of source-filter models for pitch salience estimation and their combination with different pitch tracking and voicing estimation methods for automatic melody extraction. Source-filter models are used to create a mid-level representation of pitch that implicitly incorporates timbre information. The spectrogram of a musical audio signal is modelled as the sum of the leading voice (produced by human voice or pitched musical instruments) and accompaniment. The leading voice is then modelled with a Smoothed Instantaneous Mixture Model (SIMM) based on a source-filter model. The main advantage of such a pitch salience function is that it enhances the leading voice even without explicitly separating it from the rest of the signal. We show that this is beneficial for melody extraction, increasing pitch estimation accuracy and reducing octave errors in comparison with simpler pitch salience functions. The adequate combination with voicing detection techniques based on pitch contour characterisation leads to significant improvements over state-of-the-art methods, for both vocal and instrumental music.

For further details see our paper:

A Comparison of Melody Extraction Methods Based on Source-Filter Modelling
J. J. Bosch, R. M. Bittner, J. Salamon, and E. Gómez
Proc. 17th International Society for Music Information Retrieval Conference (ISMIR 2016), New York City, USA, Aug. 2016.

Melody Extraction in Python with Melodia

28/1/2016

Thanks to the great work of Chris Cannam and George Fazekas at the C4DM, it is now possible to run vamp plugins directly in python via the vamp module. This is fantastic news, allowing researchers (and everyone else) to integrate algorithms implemented as vamp plugins directly into their python processing pipeline. In this way it becomes much easier to build fully automated experimental (and application) pipelines purely in python, without the need to make external system calls or export the output of a vamp plugin to disk before importing it into python. w00t.

This is great news for Melodia, my melody extraction algorithm that's implemented as a vamp plugin. Now you can use Melodia to extract the pitch contour of a melody from a song directly in python and use the output for further processing. For example, you could segment and quantize the contour into notes and export the melody as a MIDI file or a JAMS file.

If you're interested in using Melodia in python, I've created a short tutorial notebook. It should be easy enough to modify it for using other vamp plugins in python too.

For those looking to save a click, here's the (embedded) tutorial:

Convert audio to MIDI melody using Melodia

27/1/2016

The audio_to_midi_melodia python script allows you to extract the melody of a song and save it to a MIDI file. The script uses the Melodia algorithm to perform melody extraction, taking advantage of the new vamp module that allows running vamp plugins (like Melodia) directly in python.

Once the pitch contour of the melody is extracted, the next (non-trivial!) step is to segment it into notes and quantize the pitch of each note, producing a discrete series of notes that can then be exported into any symbolic format such as MIDI or JAMS.

Quantizing a continuous pitch sequence into a series of notes is an active area of research and remains an open problem. Still, we can obtain fairly decent results using a series of heuristics:

1. Convert the pitch sequence from Hertz to (fractional) MIDI note numbers
2. Round each value to the nearest integer MIDI note number
3. Optionally apply a median filter to smooth out short jumps in pitch (e.g. due to vibrato)
4. Iterate over the sequence and whenever the pitch changes start a new note
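
The four heuristics above can be sketched in a few lines of Python. This is a minimal illustration and not the actual code of audio_to_midi_melodia: the function names, the hop size argument and the choice to treat non-positive frequencies as unvoiced frames are assumptions made for the example.

```python
import numpy as np

def hz_to_midi(freqs_hz):
    """Convert Hz to fractional MIDI note numbers (A4 = 440 Hz = note 69).
    Non-positive values are assumed to mark unvoiced frames and map to 0."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    midi = np.zeros_like(freqs_hz)
    voiced = freqs_hz > 0
    midi[voiced] = 69.0 + 12.0 * np.log2(freqs_hz[voiced] / 440.0)
    return midi

def median_filter(values, width):
    """Sliding median with an odd window width; edges repeat the end values."""
    half = width // 2
    padded = np.pad(values, half, mode="edge")
    return np.array([np.median(padded[i:i + width]) for i in range(len(values))])

def contour_to_notes(freqs_hz, hop_seconds, smooth_width=None):
    """Return a list of (onset_seconds, duration_seconds, midi_pitch) tuples."""
    midi = np.round(hz_to_midi(freqs_hz))          # steps 1 and 2
    if smooth_width:
        midi = median_filter(midi, smooth_width)   # step 3 (optional)
    notes = []
    start = 0
    for i in range(1, len(midi) + 1):              # step 4
        # Close the current note when the pitch changes or the sequence ends.
        if i == len(midi) or midi[i] != midi[start]:
            if midi[start] > 0:                    # skip unvoiced segments
                notes.append((start * hop_seconds,
                              (i - start) * hop_seconds,
                              int(midi[start])))
            start = i
    return notes
```

For instance, `contour_to_notes([0, 440, 440, 440, 0, 0, 493.88, 493.88], hop_seconds=0.5)` yields two notes, an A4 followed by a B4, with the silent frames in between dropped.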

Here's an example:

Original audio

Continuous melody contour extracted by Melodia

Quantized MIDI melody obtained with audio_to_midi_melodia

Note that exporting to MIDI requires providing a BPM value. You could select a value arbitrarily, estimate it manually, or estimate it automatically using one of the tempo estimation algorithms included in Essentia, Librosa or, if you'd like to stick to vamp plugins, QMVP. No BPM is required if you export to JAMS, which directly uses the note onset times estimated from the audio track.
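
To see why the BPM matters: MIDI files express time in beats (subdivided into ticks) rather than seconds, so every onset time in seconds has to be converted using the tempo. A rough sketch of that conversion, assuming a resolution of 480 ticks per beat (a common choice, picked here only for illustration; the script may do this differently):

```python
def seconds_to_ticks(seconds, bpm, ticks_per_beat=480):
    """Convert a time in seconds to MIDI ticks for a given tempo.
    ticks_per_beat=480 is an assumed resolution, not taken from the script."""
    beats = seconds * bpm / 60.0
    return int(round(beats * ticks_per_beat))
```

The same onset lands on a different tick depending on the tempo you provide, which is why a poor BPM estimate gives a MIDI file that plays back correctly but looks misaligned with the bar lines in a DAW.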

The script is open source and available on GitHub: https://github.com/justinsalamon/audio_to_midi_melodia

MIDI melody extracted using the audio_to_midi_melodia script and loaded into GarageBand

Once the melody is saved as MIDI it can be imported into any DAW that supports MIDI editing, like GarageBand.

The Neumerator

22/6/2015

The Neumerator will take any audio file and generate a medieval neume-style manuscript from it!

As you might have guessed this is a hack... more specifically, the hack I worked on together with Kristin Olson and Tejaswinee Kelkar during this month's Monthly Music Hackathon NYC (where I also gave a talk about melody extraction).

How does The Neumerator work?

You start by choosing a music recording, for example "Ave Generosa" by Hildegard Von Bingen (to keep things simple we'll work just with the first 20 seconds or so):

The first step is to extract the pitch sequence of the melody, for which we use Melodia, the melody extraction algorithm I developed as part of my PhD. The result looks like this (pitch vs time):

Once we have the continuous pitch sequence, we need to discretize it into notes. Whilst we could load the sequence into a tool such as Tony to perform clever note segmentation, for our hack we wanted to keep things simple (and fully automated), so we implemented our own very simple note segmentation process. First, we quantize the pitch curve into semitones:

Then we smooth out very short glitches using a majority-vote sliding window:

Then we can keep just the points where the pitch changes, and we're starting to get close to something that looks kind of "neumey":
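
For the curious, the smoothing and change-point steps can be sketched roughly like this. It's an illustration of the idea rather than the code from the hack: the function names and the window width are made up for the example, and the input is assumed to be a list of pitches already quantized to semitones.

```python
from collections import Counter

def majority_vote_smooth(pitches, width=5):
    """Replace each value by the most common value in a window centred on it.
    The window is simply truncated at the edges of the sequence."""
    half = width // 2
    smoothed = []
    for i in range(len(pitches)):
        window = pitches[max(0, i - half): i + half + 1]
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed

def change_points(pitches):
    """Keep only the (frame_index, pitch) pairs where the pitch changes."""
    return [(i, p) for i, p in enumerate(pitches)
            if i == 0 or p != pitches[i - 1]]
```

With a width of 3, a sequence like `[60, 60, 60, 62, 60, 60, 64, 64, 64, 64]` loses its one-frame glitch at 62, and `change_points` then returns just two points: the start of the 60 and the start of the 64.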

Finally, we run it through the neumerator manuscript generator which applies our secret combination of magic, unicorns and gregorian chant, and voila!

Being a hack and all, The Neumerator does have its limitations - currently we can only draw points (rather than connecting them into actual neumes), everything goes on a single four-line staff (regardless of the duration of the audio being processed), and of course the note quantization step is pretty basic. Oh, and the mapping of the estimated notes onto the actual pixel positions of the manuscript is a total hack. But hey, that's what future hackathons are for!

Want to take The Neumerator to the next level? It's all on GitHub:
github.com/justinsalamon/neumerator

Melody Extraction Talk at Spotify Monthly Music Hackathon

22/6/2015

On Saturday June 20th I gave a talk about melody extraction at the Monthly Music Hackathon NYC hosted by Spotify's NYC office. The talk, titled "Melody Extraction: Algorithms and Applications in Music Informatics", provided a bird's eye view of my work on melody extraction, including the Melodia algorithm and a number of applications such as query-by-humming, cover song ID, genre classification, automatic transcription and tonic identification in Indian classical music (all of which are mentioned in my melody extraction page and in greater detail in my PhD thesis).

In addition to my own talk, we had fantastic talks by NYU's Uri Nieto, Rachel Bittner, Eric Humphrey and Ethan Hein. A big thanks to Jonathan Marmor for organizing such an awesome day!

Oh, we also had a lot of fun working on our hack later on in the day: The Neumerator!

Replace Your Favourite Singer With a Robot

12/6/2015

Melody extraction can be used for a number of cool applications including query-by-humming, automatic transcription, or computational musicology. But it can also be used to replace your favourite singer with a robot!

How? You start by choosing a track, for example this one:

Then you can use Melodia to estimate the pitch curve of the singer. If we synthesize the result with a sine wave, it would sound like this:

But wait! We can load the pitch curve into Vocaloid and synthesize a pitch-accurate rendition with whichever voice we want, like this one:

Finally, we can mix our new rendition with the original accompaniment to produce our very own robot-remix! Here it is:

TADA!

Here's another example, this time with opera! Here's the original track:

And here's the Melodia+Vocaloid version:

The opportunities for creating new songs (or just wreaking havoc with existing ones) are limitless!

Melodia featured in Coursera MOOC on Audio DSP

12/11/2014

Audio Signal Processing for Music Applications

Melodia, the melody extraction algorithm I worked on for my PhD thesis, has been included in the Coursera MOOC on Audio Signal Processing for Music Applications run by Prof. Xavier Serra (UPF) and Prof. Julius O Smith III (Stanford). If you're signed up for the course, you can see the lecture here (Melodia is discussed about halfway into the lecture).

It's very exciting to have Melodia mentioned in the context of this popular course by two leading members of the audio processing community!

Since its release in 2012, Melodia has been downloaded almost 7000 times by researchers, educators, artists and hobbyists. Here's a list of scientific works citing the article describing the algorithm.

Disclosure: Prof. Xavier Serra was the co-supervisor of my PhD thesis together with Dr. Emilia Gómez.

MeloSynth

16/10/2014

Since we released the MELODIA vamp plugin implementing our melody extraction algorithm, I've been contacted a number of times by people interested in synthesizing the pitch sequences estimated by MELODIA, like the examples provided on my melody extraction and PhD thesis pages.

To this end, I've written a small python script, MeloSynth, to do just that:
www.github.com/justinsalamon/melosynth

MeloSynth is written in Python, is open source, and requires Python and NumPy. It's designed to be as simple as possible to use, no programming/python knowledge required. Given a txt or csv file with two columns [timestamps, frequency], the default behavior is to synthesize a wav file using a single sinusoid. The script also has options for setting the sampling frequency, adding more harmonics, changing the waveform, synthesizing negative values (which are used to indicate the absence of pitch by convention) and batch processing all files in a folder.
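
To give a feel for what the default behavior involves, here's a rough sketch of synthesizing a single sinusoid from [timestamps, frequency] columns. It is not MeloSynth's actual code: the function name, the default sampling rate and the choice to hold each frequency value until the next timestamp (rather than interpolating) are assumptions for the example, and negative or zero values are simply rendered as silence.

```python
import numpy as np

def synthesize_sine(times, freqs, fs=16000):
    """Render a pitch sequence as a sine wave sampled at fs Hz.
    Frames with non-positive frequency are treated as unvoiced (silent)."""
    times = np.asarray(times, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    n_samples = int(np.ceil(times[-1] * fs))
    sample_times = np.arange(n_samples) / fs
    # For each audio sample, find the most recent frame and hold its value.
    idx = np.searchsorted(times, sample_times, side="right") - 1
    idx = np.clip(idx, 0, len(freqs) - 1)
    f = freqs[idx]
    voiced = f > 0
    # Accumulate phase so the waveform stays continuous when the pitch changes.
    phase = 2.0 * np.pi * np.cumsum(np.where(voiced, f, 0.0)) / fs
    return np.sin(phase) * voiced
```

Accumulating the phase sample by sample, instead of computing sin(2πft) directly, avoids clicks at the points where the frequency jumps from one value to the next.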

MeloSynth can of course also be used to synthesize pitch estimates from other algorithms, as long as the output is provided in the expected double column format.

Give it a spin and let me know what you think :)

3 papers to make MIR a better place

1/9/2014

This year I've collaborated on 3 papers for the ISMIR 2014 conference, and they are all about making MIR a more reproducible, transparent, and reliable field of research. In a nutshell, they're about making MIR a better place :)

The first, led by Rachel Bittner (MARL @ NYU), describes MedleyDB, a new dataset of multitrack recordings we have compiled and annotated, primarily for melody extraction evaluation. Unlike previous datasets, it contains over 100 songs, most of which are full-length (rather than excerpts), in a variety of musical genres, and of professional quality (not only in the recording, but also in the content):

R. Bittner, J. Salamon, M. Tierney, M. Mauch, C. Cannam and J. P. Bello. "MedleyDB: A Multitrack Dataset for Annotation-Intensive MIR Research", in Proc. 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan, October 2014.

We hope this new dataset will help shed light on the remaining challenges in melody extraction (we have identified a few ourselves in the paper), and allow researchers to evaluate their algorithms on a more realistic dataset. The dataset can also be used for research in musical instrument identification, source separation, multiple f0 tracking, and any other MIR task that benefits from the availability of multitrack audio data. Congratulations to my co-authors Rachel, Mike, Matthias, Chris and Juan!

The second paper, led by Eric Humphrey (MARL @ NYU), introduces JAMS, a new specification we've been working on for representing MIR annotations. JAMS = JSON Annotated Music Specification, and as you can imagine, is JSON based:

E. J. Humphrey, J. Salamon, O. Nieto, J. Forsyth, R. M. Bittner and J. P. Bello. "JAMS: A JSON Annotated Music Specification for Reproducible MIR Research", in Proc. 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan, October 2014.

The three main concepts behind JAMS are:

Comprehensive annotation: moving away from lab files, a JAMS file can store comprehensive annotation data and annotation metadata in a structured way that can be easily loaded from and saved to disk.
Multiple annotations: sometimes an annotation should be considered more of a reference than a ground truth, in that different annotators may produce different references (e.g. chord annotations). JAMS allows storing multiple annotations for the same recording in a single file.
Multiple tasks: traditionally, the annotation for each MIR task (e.g. melody extraction, chord recognition, genre identification, etc.) is stored in a separate file. JAMS allows storing the annotations of different tasks for the same recording in a single JAMS file which, in addition to keeping things tidy, facilitates the development/evaluation of algorithms that use / extract multiple musical facets at once.

As with all new specifications / protocols / conventions, the real success of JAMS depends on its adoption by the community. We are fully aware that this is but a proposal, a first step, and hope to develop / improve JAMS by actively discussing it with the MIR community. To ease adoption, we're providing a python library for loading / saving / manipulating JAMS files, and have ported the annotations of several of the most commonly used corpora in MIR into JAMS. Congratulations to my co-authors Eric, Uri (Oriol), Jon, Rachel and Juan!

The third paper, led by Colin Raffel (LabROSA @ Columbia), describes mir_eval, an open-source python library that implements the most common evaluation measures for a large selection of MIREX tasks including melody extraction, chord recognition, beat detection, onset detection, structural segmentation and source separation:

C. Raffel, B. McFee, E. J. Humphrey, J. Salamon, O. Nieto, D. Liang and D. P. W. Ellis. "mir_eval: A Transparent Implementation of Common MIR Metrics", Proc. 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan, October 2014.

We hope this (a) makes the life of MIR researchers easier, providing an easy-to-use MIR DIY library and more importantly (b) promotes transparency and reproducibility in MIR research by ensuring researchers use the same evaluation code (as opposed to every researcher re-implementing their own eval code as is the case right now) and making that code available online for inspection. Congratulations to my co-authors Colin, Brian, Eric, Uri (Oriol), Dawen and Dan!

Looking forward to discussing these papers and ideas with everyone at ISMIR 2014! See you in Taipei ^_^

IEEE SPM Melody Extraction Review published online

16/2/2014

Our review article on melody extraction algorithms for the IEEE Signal Processing Magazine is finally available online! The printed edition will be coming out in March 2014:

J. Salamon, E. Gómez, D. P. W. Ellis and G. Richard, "Melody Extraction from Polyphonic Music Signals: Approaches, Applications and Challenges", IEEE Signal Processing Magazine, 31(2):118-134, Mar. 2014.

Abstract—Melody extraction algorithms aim to produce a sequence of frequency values corresponding to the pitch of the dominant melody from a musical recording. Over the past decade melody extraction has emerged as an active research topic, comprising a large variety of proposed algorithms spanning a wide range of techniques. This article provides an overview of these techniques, the applications for which melody extraction is useful, and the challenges that remain. We start with a discussion of ‘melody’ from both musical and signal processing perspectives, and provide a case study which interprets the output of a melody extraction algorithm for specific excerpts. We then provide a comprehensive comparative analysis of melody extraction algorithms based on the results of an international evaluation campaign. We discuss issues of algorithm design, evaluation and applications which build upon melody extraction. Finally, we discuss some of the remaining challenges in melody extraction research in terms of algorithmic performance, development, and evaluation methodology.

For further information about this article please visit my Research page.
