Justin Salamon
  • Home
  • News
  • Research
  • Publications
  • Code/Data
  • Melody Extraction
  • PhD Thesis
  • Contact
    • Music
    • Music Technology

New Tools & Data for Soundscape Synthesis and Online Audio Annotation

10/10/2017

We're glad to announce the release of two open-source tools and a new dataset, all developed as part of the SONYC project, which we hope will be of use to the community:

Scaper: a library for soundscape synthesis and augmentation
  • Automatically synthesize soundscapes with corresponding ground truth annotations 
  • Useful for running controlled ML experiments (ASR, sound event detection, bioacoustic species recognition, etc.)
  • Useful for running controlled experiments to assess human annotation performance
  • Potentially useful for generating data for source separation experiments (might require some extra code)
  • Potentially useful for generating ambisonic soundscapes (definitely requires some extra code)
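
To give a feel for what "synthesizing a soundscape with ground truth" means, here is a toy Python sketch. It is not scaper's API: the function name `sample_soundscape`, the value ranges, and the dictionary keys are all assumptions made for illustration, and the audio mixing itself is left out. It only samples event labels, onsets, durations, and SNRs at random, and returns the annotations that would accompany the mixed audio.

```python
import random


def sample_soundscape(labels, duration=10.0, n_events=3, seed=0):
    """Sample hypothetical event specs and return them sorted by onset.

    Illustrative only: the ranges and keys below are assumptions,
    not scaper's actual parameters.
    """
    rng = random.Random(seed)  # seeded so the same soundscape can be regenerated
    events = []
    for _ in range(n_events):
        event_duration = rng.uniform(0.5, 4.0)
        # Pick an onset that keeps the whole event inside the soundscape.
        onset = rng.uniform(0.0, duration - event_duration)
        events.append({
            "label": rng.choice(labels),
            "onset": round(onset, 3),
            "offset": round(onset + event_duration, 3),
            "snr_db": round(rng.uniform(6.0, 30.0), 1),
        })
    return sorted(events, key=lambda event: event["onset"])


annotations = sample_soundscape(["siren", "dog_bark", "car_horn"])
for event in annotations:
    print(event)
```

Because every parameter is sampled by the program, the annotation is exact by construction, which is what makes this kind of data handy for controlled experiments.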

AudioAnnotator: a JavaScript web interface for annotating audio data
  • Developed in collaboration with Edith Law and her students at the University of Waterloo's HCI Lab
  • A web interface that allows users to annotate audio recordings
  • Supports 3 types of visualization (waveform, spectrogram, invisible)
  • Useful for crowdsourcing audio labels and running controlled experiments on crowdsourcing audio labels
  • Supports feedback mechanisms for providing real-time feedback to the user based on their annotations

URBAN-SED dataset: a new dataset for sound event detection
  • Includes 10,000 soundscapes with strongly labeled sound events generated using scaper
  • Totals almost 30 hours and includes close to 50,000 annotated sound events
  • Baseline convnet results on URBAN-SED are included in the scaper paper.

Further information about scaper, the AudioAnnotator and the URBAN-SED dataset, including controlled experiments on the quality of crowdsourced human annotations as a function of visualization and soundscape complexity, is provided in the following papers:

Seeing sound: Investigating the effects of visualizations and complexity on crowdsourced audio annotations
M. Cartwright, A. Seals, J. Salamon, A. Williams, S. Mikloska, D. MacConnell, E. Law, J. Bello, and O. Nov.
Proceedings of the ACM on Human-Computer Interaction, 1(2), 2017.

Scaper: A Library for Soundscape Synthesis and Augmentation
J. Salamon, D. MacConnell, M. Cartwright, P. Li, and J. P. Bello.
In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, Oct. 2017.

We hope you find these tools useful and look forward to receiving your feedback (and pull requests!).

Cheers, on behalf of the entire team,
Justin Salamon & Mark Cartwright.

mir_eval 0.3 released!

26/2/2016

We're pleased to announce that mir_eval 0.3 has been released. Since our ISMIR paper, we have
  • Added transcription, key, tempo, and hierarchical segmentation evaluation
  • Improved documentation and the sonify submodule
  • Made mir_eval Python 3-compatible
  • Added continuous integration and near-100% coverage

You can grab the latest version from pip or from our GitHub repository. Or, if you are allergic to Python, our mir_eval web service has been updated to 0.3 too, so you can run all of your eval without installing anything or writing any code.

We'd like to thank new contributors Nils Werner, Rachel Bittner, Stefan Balke, and Fabian-Robert Stöter. On that note, and while I have your attention, I'd like to emphasize our goal of making mir_eval a community endeavor: if you disagree with how a metric has been implemented or think you have found a bug, please open an issue, and if you have a new feature to add, pull requests are welcome!

See you all in NYC in August!

(Based on official announcement sent to the ISMIR community mailing list)

MeloSynth

16/10/2014

Since we released the MELODIA Vamp plugin implementing our melody extraction algorithm, I've been contacted a number of times by people interested in synthesizing the pitch sequences estimated by MELODIA, like the examples provided on my melody extraction and PhD thesis pages.

To this end, I've written a small Python script, MeloSynth, to do just that:
www.github.com/justinsalamon/melosynth

MeloSynth is open source and requires only Python and NumPy. It's designed to be as simple as possible to use; no programming or Python knowledge is required. Given a txt or csv file with two columns [timestamp, frequency], the default behavior is to synthesize a wav file using a single sinusoid. The script also has options for setting the sampling frequency, adding more harmonics, changing the waveform, synthesizing negative values (which by convention indicate the absence of pitch), and batch processing all files in a folder.

MeloSynth can of course also be used to synthesize pitch estimates from other algorithms, as long as the output is provided in the expected two-column format.
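
For the curious, the basic idea of turning a [timestamp, frequency] sequence into audio can be sketched in a few lines of plain Python. This is a simplified illustration, not MeloSynth's actual code: the function names `synthesize` and `write_wav`, the fixed 0.5 amplitude, and the choice to hold each frequency until the next timestamp (rather than interpolate) are all my assumptions here. Zero or negative frequencies are rendered as silence.

```python
import math
import struct
import wave


def synthesize(times, freqs, fs=16000):
    """Return float samples of a single sinusoid following (times, freqs).

    Simplified sketch: each frequency is held until the next timestamp,
    and the last timestamp marks the end of the signal.
    """
    n_samples = int(round(times[-1] * fs))
    samples = []
    phase = 0.0
    frame = 0
    for n in range(n_samples):
        t = n / fs
        # Advance to the latest frame whose timestamp is <= t.
        while frame + 1 < len(times) and times[frame + 1] <= t:
            frame += 1
        f = freqs[frame]
        if f > 0:
            # Accumulate phase so frequency changes do not cause clicks.
            phase += 2.0 * math.pi * f / fs
            samples.append(0.5 * math.sin(phase))
        else:
            samples.append(0.0)  # zero or negative frequency: no pitch
    return samples


def write_wav(path, samples, fs=16000):
    """Write float samples in [-1, 1] as a 16-bit mono wav file."""
    pcm = struct.pack("<%dh" % len(samples),
                      *(int(s * 32767) for s in samples))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(fs)
        wav_file.writeframes(pcm)


# Half a second of A3, then half a second marked as unvoiced.
track_times = [0.0, 0.5, 1.0]
track_freqs = [220.0, -220.0, 0.0]
audio = synthesize(track_times, track_freqs)
print(len(audio))
```

Calling `write_wav("melody.wav", audio)` would then save the result to disk (the file name is just an example).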

Give it a spin and let me know what you think :)


3 papers to make MIR a better place

1/9/2014

 
This year I've collaborated on 3 papers for the ISMIR 2014 conference, and they are all about making MIR a more reproducible, transparent, and reliable field of research. In a nutshell, they're about making MIR a better place :)

The first, led by Rachel Bittner (MARL @ NYU), describes MedleyDB, a new dataset of multitrack recordings we have compiled and annotated, primarily for melody extraction evaluation. Unlike previous datasets, it contains over 100 songs, most of which are full-length (rather than excerpts), in a variety of musical genres, and of professional quality (not only in the recording, but also in the content):

  • R. Bittner, J. Salamon, M. Tierney, M. Mauch, C. Cannam and J. P. Bello. "MedleyDB: A Multitrack Dataset for Annotation-Intensive MIR Research", in Proc. 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan, October 2014.

We hope this new dataset will help shed light on the remaining challenges in melody extraction (we have identified a few ourselves in the paper), and allow researchers to evaluate their algorithms on a more realistic dataset. The dataset can also be used for research in musical instrument identification, source separation, multiple f0 tracking, and any other MIR task that benefits from the availability of multitrack audio data. Congratulations to my co-authors Rachel, Mike, Matthias, Chris and Juan!


The second paper, led by Eric Humphrey (MARL @ NYU), introduces JAMS, a new specification we've been working on for representing MIR annotations. JAMS stands for JSON Annotated Music Specification and, as you can imagine, is JSON-based:

  • E. J. Humphrey, J. Salamon, O. Nieto, J. Forsyth, R. M. Bittner and J. P. Bello. "JAMS: A JSON Annotated Music Specification for Reproducible MIR Research", in Proc. 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan, October 2014.

The three main concepts behind JAMS are:

  1. Comprehensive annotation: moving away from lab files, a JAMS file can store comprehensive annotation data and annotation metadata in a structured way that can be easily loaded from and saved to disk.
  2. Multiple annotations: sometimes an annotation should be considered more of a reference than a ground truth, in that different annotators may produce different references (e.g. chord annotations). JAMS allows multiple annotations for the same recording to be stored in a single file.
  3. Multiple tasks: traditionally, the annotation for each MIR task (e.g. melody extraction, chord recognition, genre identification, etc.) is stored in a separate file. JAMS allows the annotations of different tasks for the same recording to be stored in a single JAMS file which, in addition to keeping things tidy, facilitates the development and evaluation of algorithms that use or extract multiple musical facets at once.
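
To make these three concepts a bit more concrete, here is a rough Python sketch of what a single JSON file holding several annotations for one recording could look like. The field names (`file_metadata`, `annotations`, `task`, `annotator`, `data`) and the helper `annotations_for` are assumptions chosen for illustration; they are not the actual JAMS schema, which is defined in the paper.

```python
import json

# One recording, two chord annotations by different annotators,
# plus a melody annotation, all in a single structure.
jam = {
    "file_metadata": {"title": "Example Track", "duration": 180.0},
    "annotations": [
        {"task": "chord", "annotator": "annotator_1",
         "data": [{"time": 0.0, "duration": 2.0, "value": "C:maj"}]},
        {"task": "chord", "annotator": "annotator_2",
         "data": [{"time": 0.0, "duration": 2.0, "value": "C:maj7"}]},
        {"task": "melody", "annotator": "annotator_1",
         "data": [{"time": 0.0, "duration": 0.01, "value": 261.6}]},
    ],
}


def annotations_for(jam, task):
    """Return every annotation stored for the given task."""
    return [a for a in jam["annotations"] if a["task"] == task]


# Round-trip through JSON text, as would happen when saving and loading.
restored = json.loads(json.dumps(jam))
print(len(annotations_for(restored, "chord")))  # prints 2
```

The two chord entries illustrate the "reference rather than ground truth" idea: neither annotator is treated as the single correct answer, and both travel with the recording.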

As with all new specifications / protocols / conventions, the real success of JAMS depends on its adoption by the community. We are fully aware that this is but a proposal, a first step, and hope to develop / improve JAMS by actively discussing it with the MIR community. To ease adoption, we're providing a Python library for loading / saving / manipulating JAMS files, and have ported the annotations of several of the most commonly used corpora in MIR into JAMS. Congratulations to my co-authors Eric, Uri (Oriol), Jon, Rachel and Juan!

The third paper, led by Colin Raffel (LabROSA @ Columbia), describes mir_eval, an open-source Python library that implements the most common evaluation measures for a large selection of MIREX tasks, including melody extraction, chord recognition, beat detection, onset detection, structural segmentation and source separation:

  • C. Raffel, B. McFee, E. J. Humphrey, J. Salamon, O. Nieto, D. Liang and D. P. W. Ellis. "mir_eval: A Transparent Implementation of Common MIR Metrics", in Proc. 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan, October 2014.

We hope this (a) makes the life of MIR researchers easier by providing an easy-to-use MIR DIY library and, more importantly, (b) promotes transparency and reproducibility in MIR research by ensuring researchers use the same evaluation code (as opposed to every researcher re-implementing their own eval code, as is the case right now) and making that code available online for inspection. Congratulations to my co-authors Colin, Brian, Eric, Uri (Oriol), Dawen and Dan!
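
To illustrate why shared evaluation code matters, here is a deliberately simplified Python sketch of one common melody extraction measure, raw pitch accuracy. This is not mir_eval's code: the function below assumes both sequences already share the same time grid, treats non-positive frequencies as unvoiced, and uses a 50-cent tolerance, all of which are choices that two independent re-implementations could easily make differently.

```python
import math


def raw_pitch_accuracy(ref_freqs, est_freqs, cent_tolerance=50.0):
    """Fraction of voiced reference frames whose estimate is close enough.

    Simplified sketch: frames are assumed to be aligned one-to-one,
    and a frequency <= 0 means "no pitch".
    """
    voiced = [(ref, est) for ref, est in zip(ref_freqs, est_freqs) if ref > 0]
    if not voiced:
        return 0.0
    correct = 0
    for ref, est in voiced:
        # Distance in cents: 1200 cents per octave.
        if est > 0 and abs(1200.0 * math.log2(est / ref)) <= cent_tolerance:
            correct += 1
    return correct / len(voiced)


reference = [220.0, 220.0, 0.0, 440.0]
estimate = [221.0, 330.0, 0.0, 0.0]
# Three voiced reference frames; only the first estimate is within 50 cents.
print(raw_pitch_accuracy(reference, estimate))
```

Even in this tiny example there are several decisions (what to return when nothing is voiced, whether the tolerance is inclusive) that change the reported number, which is exactly the kind of discrepancy a common library helps avoid.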

Looking forward to discussing these papers and ideas with everyone at ISMIR 2014! See you in Taipei ^_^

ESSENTIA wins ACM Multimedia '13 Best Open Source Software Award

31/12/2013

 
ESSENTIA is an audio analysis software library developed at the MTG over the past eight years, to which I am proud to have made my small contribution too (through the great effort of Dmitry Bogdanov). 

Recently ESSENTIA was released as open source software, and shortly after won the ACM Multimedia 2013 Best Open Source Software Award! A massive congratulations to everyone at the MTG who has worked on the library over the years, and especially to Dmitry Bogdanov and Nicolas Wack.

ESSENTIA's first open source release is accompanied by two papers:
  • D. Bogdanov, N. Wack, E. Gómez, S. Gulati, P. Herrera, O. Mayor, G. Roma, J. Salamon, J. Zapata and X. Serra, "ESSENTIA: an Audio Analysis Library for Music Information Retrieval", in Proc. 14th International Society for Music Information Retrieval Conference (ISMIR 2013), Curitiba, Brazil, November 2013.
  • D. Bogdanov, N. Wack, E. Gómez, S. Gulati, P. Herrera, O. Mayor, G. Roma, J. Salamon, J. Zapata and X. Serra, "ESSENTIA: an Open-Source Library for Sound and Music Analysis", in 21st ACM Int. Conf. on Multimedia, Barcelona, Spain, Oct. 2013.
