Justin Salamon

Time Lattice: A Data Structure for the Interactive Visual Analysis of Large Time Series

19/7/2018

Advances in technology coupled with the availability of low-cost sensors have resulted in the continuous generation of large time series from several sources. In order to visually explore and compare these time series at different scales, analysts need to execute online analytical processing (OLAP) queries that include constraints and group-by's at multiple temporal hierarchies. Effective visual analysis requires these queries to be interactive. However, while existing OLAP cube-based structures can support interactive query rates, the exponential memory requirement to materialize the data cube is often unsuitable for large data sets. Moreover, none of the recent space-efficient cube data structures allow for updates. Thus, the cube must be re-computed whenever there is new data, making them impractical in a streaming scenario. We propose Time Lattice, a memory-efficient data structure that makes use of the implicit temporal hierarchy to enable interactive OLAP queries over large time series. Time Lattice is a subset of a fully materialized cube and is designed to handle fast updates and streaming data. We perform an experimental evaluation which shows that the space efficiency of the data structure does not hamper its performance when compared to the state of the art. In collaboration with signal processing and acoustics research scientists, we use the Time Lattice data structure to design the Noise Profiler, a web-based visualization framework that supports the analysis of noise from cities. We demonstrate the utility of Noise Profiler through a set of case studies.
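To make the core idea concrete, here is a toy Python sketch of a lattice of pre-aggregated tables over the implicit temporal hierarchy: a streaming insert touches one bucket per level, and a query at a chosen resolution reads only that level's buckets. The class, granularities and aggregates below are illustrative assumptions, not the paper's actual implementation:

```python
# Illustrative sketch only: one pre-aggregated table per level of the
# implicit temporal hierarchy. Names and granularities are hypothetical.
from collections import defaultdict

LEVELS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}

class TinyTimeLattice:
    def __init__(self):
        # One aggregate table per hierarchy level, keyed by bucket index.
        self.sums = {lvl: defaultdict(float) for lvl in LEVELS}
        self.counts = {lvl: defaultdict(int) for lvl in LEVELS}

    def insert(self, timestamp, value):
        # Streaming update: O(#levels) work per sample, no cube rebuild.
        for lvl, width in LEVELS.items():
            bucket = int(timestamp) // width
            self.sums[lvl][bucket] += value
            self.counts[lvl][bucket] += 1

    def query_mean(self, level, t_start, t_end):
        # Aggregate query at a chosen temporal resolution, answered from
        # the pre-computed buckets of that level only.
        width = LEVELS[level]
        total = n = 0
        for b in range(int(t_start) // width, int(t_end) // width + 1):
            total += self.sums[level][b]
            n += self.counts[level][b]
        return total / n if n else None
```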

For example, we used the Noise Profiler to rapidly explore and visualize noise patterns in NYC during weekdays versus weekends across multiple locations, using time series data from SONYC noise sensors:
[Figure: Noise patterns on weekdays vs. weekends from a variety of locations in NYC. Time series data from SONYC noise sensors, explored and visualized using the Noise Profiler tool built with Time Lattice.]

For further details see our paper:

Time Lattice: A Data Structure for the Interactive Visual Analysis of Large Time Series
F. Miranda, M. Lage, H. Doraiswamy, C. Mydlarz, J. Salamon, Y. Lockerman, J. Freire, C. Silva
Computer Graphics Forum (EuroVis '18), 37(3), 2018, 13-22
[Wiley][PDF][BibTeX]

Deep Convolutional Neural Networks and Data Augmentation For Environmental Sound Classification

20/1/2017

The ability of deep convolutional neural networks (CNNs) to learn discriminative spectro-temporal patterns makes them well suited to environmental sound classification. However, the relative scarcity of labeled data has impeded the exploitation of this family of high-capacity models. This study has two primary contributions: First, we propose a deep convolutional neural network architecture for environmental sound classification. Second, we propose the use of audio data augmentation for overcoming the problem of data scarcity and explore the influence of different augmentations on the performance of the proposed CNN architecture. Combined with data augmentation, the proposed model produces state-of-the-art results for environmental sound classification. We show that the improved performance stems from the combination of a deep, high-capacity model and an augmented training set: this combination outperforms both the proposed CNN without augmentation and a “shallow” dictionary learning model with augmentation. Finally, we examine the influence of each augmentation on the model’s classification accuracy for each class, and observe that the accuracy for each class is influenced differently by each augmentation, suggesting that the performance of the model could be improved further by applying class-conditional data augmentation.
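For a flavor of what such augmentations look like in practice, here is a minimal sketch using librosa; the input file and parameter values are illustrative, not the settings used in the study:

```python
# A minimal sketch of audio augmentations of the kind explored in the paper
# (time stretching, pitch shifting, added noise). Parameters are illustrative.
import numpy as np
import librosa

y, sr = librosa.load("siren.wav", sr=44100)  # hypothetical input clip

augmented = {
    # Slow down the clip without changing its pitch.
    "stretch": librosa.effects.time_stretch(y, rate=0.9),
    # Shift pitch up two semitones without changing duration.
    "shift": librosa.effects.pitch_shift(y, sr=sr, n_steps=2),
    # Mix in low-level Gaussian noise as a stand-in for the background
    # recordings used in the actual study.
    "noise": y + 0.005 * np.random.randn(len(y)),
}
```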

For further details see our paper:

Deep Convolutional Neural Networks and Data Augmentation For Environmental Sound Classification
J. Salamon and J. P. Bello
IEEE Signal Processing Letters, In Press, 2017.
[IEEE][PDF][BibTeX][Copyright]


Towards the Automatic Classification of Avian Flight Calls for Bioacoustic Monitoring

23/11/2016

[Image: A white-throated sparrow, one of the species targeted in the study. Image by Simon Pierre Barrette, license CC-BY-SA 3.0.]
Automatic classification of animal vocalizations has great potential to enhance the monitoring of species movements and behaviors. This is particularly true for monitoring nocturnal bird migration, where automated classification of migrants’ flight calls could yield new biological insights and conservation applications for birds that vocalize during migration. In this paper we investigate the automatic classification of bird species from flight calls, and in particular the relationship between two different problem formulations commonly found in the literature: classifying a short clip containing one of a fixed set of known species (N-class problem) and the continuous monitoring problem, the latter of which is relevant to migration monitoring. We implemented a state-of-the-art audio classification model based on unsupervised feature learning and evaluated it on three novel datasets, one for studying the N-class problem including over 5000 flight calls from 43 different species, and two realistic datasets for studying the monitoring scenario comprising hundreds of thousands of audio clips that were compiled by means of remote acoustic sensors deployed in the field during two migration seasons. We show that the model achieves high accuracy when classifying a clip to one of N known species, even for a large number of species. In contrast, the model does not perform as well in the continuous monitoring case. Through a detailed error analysis (that included full expert review of false positives and negatives) we show the model is confounded by varying background noise conditions and previously unseen vocalizations. We also show that the model needs to be parameterized and benchmarked differently for the continuous monitoring scenario. Finally, we show that despite the reduced performance, given the right conditions the model can still characterize the migration pattern of a specific species. The paper concludes with directions for future research.
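The difference between the two formulations can be illustrated with a small sketch; the species list, probabilities and threshold below are hypothetical placeholders, not the paper's actual model or 43-species setup:

```python
# Hypothetical placeholders standing in for the paper's actual system.
import numpy as np

SPECIES = ["white-throated sparrow", "savannah sparrow", "chipping sparrow"]

def classify_clip(probs):
    # N-class formulation: every clip is assumed to contain one of the
    # known species, so the most probable class is always returned.
    return SPECIES[int(np.argmax(probs))]

def monitor_clip(probs, threshold=0.8):
    # Continuous monitoring: most clips contain background noise or unseen
    # vocalizations, so a confidence threshold is needed to reject them --
    # the scenario the paper shows requires different parameterization.
    i = int(np.argmax(probs))
    return SPECIES[i] if probs[i] >= threshold else None  # None = no detection

print(classify_clip(np.array([0.5, 0.3, 0.2])))  # always names a species
print(monitor_clip(np.array([0.5, 0.3, 0.2])))   # rejected: low confidence
```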

The full article is available freely (open access) on PLOS ONE:


Towards the Automatic Classification of Avian Flight Calls for Bioacoustic Monitoring
J. Salamon, J. P. Bello, A. Farnsworth, M. Robbins, S. Keen, H. Klinck and S. Kelling
PLOS ONE 11(11): e0166866, 2016. doi: 10.1371/journal.pone.0166866.
[PLOS ONE][PDF][BibTeX]

Along with this study, we have also published the three new bioacoustic machine learning datasets that were compiled for it.


The Implementation of Low-cost Urban Acoustic Monitoring Devices

16/6/2016


The urban sound environment of New York City (NYC) can be, amongst other things: loud, intrusive, exciting and dynamic. As indicated by the large majority of noise complaints registered with the NYC 311 information/complaints line, the urban sound environment has a profound effect on the quality of life of the city’s inhabitants. To monitor and ultimately understand these sonic environments, a process of long-term acoustic measurement and analysis is required. The traditional method of environmental acoustic monitoring utilizes short-term measurement periods using expensive equipment, set up and operated by experienced and costly personnel. In this paper a different approach to this application is proposed, which implements a smart, low-cost, static, acoustic sensing device based around consumer hardware. These devices can be deployed in numerous and varied urban locations for long periods of time, allowing for the collection of longitudinal urban acoustic data. The varied environmental conditions of urban settings make for a challenge in gathering calibrated sound pressure level data for prospective stakeholders. This paper details the sensors’ design, development and potential future applications, with a focus on the calibration of the devices’ microelectromechanical systems (MEMS) microphone in order to generate reliable decibel levels at the type/class 2 level.
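As a rough illustration of the calibration idea, the sketch below derives calibrated decibel levels from digital samples using a single offset measured against a reference source; the function names and values are assumptions, and the procedure described in the paper is considerably more involved:

```python
# Illustrative only: calibrate a MEMS microphone's digital output against a
# reference source (e.g. a 94 dB SPL calibrator tone at 1 kHz), then reuse
# the measured offset to report dB SPL for later measurement blocks.
import numpy as np

def rms_dbfs(samples):
    # RMS level of a block of samples, in dB relative to digital full scale.
    rms = np.sqrt(np.mean(np.square(samples)))
    return 20.0 * np.log10(rms + 1e-12)

# One-time calibration step: record the reference tone, solve for the offset.
REF_SPL = 94.0  # calibrator output, dB SPL at 1 kHz
calibration_tone = 0.05 * np.sin(2 * np.pi * 1000 * np.arange(44100) / 44100)
offset = REF_SPL - rms_dbfs(calibration_tone)  # hypothetical recording

def spl(samples):
    # Calibrated sound pressure level for any subsequent measurement block.
    return rms_dbfs(samples) + offset
```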

For further details see our paper:

The Implementation of Low-cost Urban Acoustic Monitoring Devices
C. Mydlarz, J. Salamon and J. P. Bello
Applied Acoustics, special issue on Acoustics for Smart Cities, 2016.
[Elsevier][PDF]

This paper is part of the SONYC project.


JNMR review of tonic ID algorithms for Indian classical music published

1/4/2014

 
Our article reviewing and comparing tonic identification algorithms for Indian classical music has just been published in the Journal of New Music Research.

Abstract: The tonic is a fundamental concept in Indian art music. It is the base pitch, which an artist chooses in order to construct the melodies during a rag(a) rendition, and all accompanying instruments are tuned using the tonic pitch. Consequently, tonic identification is a fundamental task for most computational analyses of Indian art music, such as intonation analysis, melodic motif analysis and rag recognition. In this paper we review existing approaches for tonic identification in Indian art music and evaluate them on six diverse datasets for a thorough comparison and analysis. We study the performance of each method in different contexts such as the presence/absence of additional metadata, the quality of audio data, the duration of audio data, music tradition (Hindustani/Carnatic) and the gender of the singer (male/female). We show that the approaches that combine multi-pitch analysis with machine learning provide the best performance in most cases (90% identification accuracy on average), and are robust across the aforementioned contexts compared to the approaches based on expert knowledge. In addition, we also show that the performance of the latter can be improved when additional metadata is available to further constrain the problem. Finally, we present a detailed error analysis of each method, providing further insights into the advantages and limitations of the methods.
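As a toy illustration of the multi-pitch histogram idea behind the best-performing approaches, the sketch below accumulates a pitch histogram and returns its most prominent peaks as tonic candidates; a trained classifier would normally rank these, and all names and ranges here are illustrative:

```python
# Toy sketch: histogram of pitch values across a recording, with the most
# prominent peaks treated as tonic candidates. Ranges are illustrative.
import numpy as np

def tonic_candidates(pitch_values_hz, n_bins=120):
    # Fold pitches into a histogram over a plausible tonic range (100-260 Hz).
    lo, hi = 100.0, 260.0
    in_range = [p for p in pitch_values_hz if lo <= p < hi]
    hist, edges = np.histogram(in_range, bins=n_bins, range=(lo, hi))
    # Return bin centres sorted by prominence; the top peak is the naive guess.
    centres = (edges[:-1] + edges[1:]) / 2
    order = np.argsort(hist)[::-1]
    return centres[order][:5]  # top-5 tonic candidates, in Hz
```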

  • S. Gulati, A. Bellur, J. Salamon, H. G. Ranjani, V. Ishwar, H. A. Murthy, and X. Serra, "Automatic Tonic Identification in Indian Art Music: Approaches and Evaluation", J. of New Music Research, 43(1):53–71, Mar. 2014.
        [Taylor & Francis][DOI][PDF][BibTeX]

Congratulations to all of the authors, and in particular to Sankalp Gulati for all the effort he put into this paper.

IEEE SPM Melody Extraction Review published online

16/2/2014

 
[Image: IEEE SPM cover]
Our review article on melody extraction algorithms for the IEEE Signal Processing Magazine is finally available online! The printed edition will be coming out in March 2014:

J. Salamon, E. Gómez, D. P. W. Ellis and G. Richard, "Melody Extraction from Polyphonic Music Signals: Approaches, Applications and Challenges", IEEE Signal Processing Magazine, 31(2):118-134, Mar. 2014.

Abstract—Melody extraction algorithms aim to produce a sequence of frequency values corresponding to the pitch of the dominant melody from a musical recording. Over the past decade melody extraction has emerged as an active research topic, comprising a large variety of proposed algorithms spanning a wide range of techniques. This article provides an overview of these techniques, the applications for which melody extraction is useful, and the challenges that remain. We start with a discussion of ‘melody’ from both musical and signal processing perspectives, and provide a case study which interprets the output of a melody extraction algorithm for specific excerpts. We then provide a comprehensive comparative analysis of melody extraction algorithms based on the results of an international evaluation campaign. We discuss issues of algorithm design, evaluation and applications which build upon melody extraction. Finally, we discuss some of the remaining challenges in melody extraction research in terms of algorithmic performance, development, and evaluation methodology.
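For readers curious what the salience-based design discussed in the article can look like in code, here is a heavily simplified sketch of harmonic-summation salience with per-frame peak picking; real systems (e.g. MELODIA) add contour tracking and voicing detection on top, and the file name and candidate grid are illustrative:

```python
# Heavily simplified melody-extraction sketch: harmonic-summation salience
# plus per-frame peak picking (slow but clear; illustration only).
import numpy as np
import librosa

y, sr = librosa.load("song.wav")  # hypothetical polyphonic excerpt
S = np.abs(librosa.stft(y))       # magnitude spectrogram
freqs = librosa.fft_frequencies(sr=sr)

def salience(f0, frame, n_harmonics=5):
    # Harmonic summation: add up spectral magnitude at multiples of f0.
    bins = [np.argmin(np.abs(freqs - h * f0)) for h in range(1, n_harmonics + 1)]
    return frame[bins].sum()

candidates = np.arange(55.0, 1760.0, 5.0)  # coarse f0 grid in Hz
melody = [candidates[np.argmax([salience(f0, S[:, t]) for f0 in candidates])]
          for t in range(S.shape[1])]      # one f0 estimate per frame
```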

For further information about this article please visit my Research page.

Melody Extraction Review Published in IEEE Signal Processing Magazine 

6/7/2013

 
Our review article on melody extraction algorithms has been accepted for publication in the IEEE Signal Processing Magazine!

Here are the full details (including a link to a preprint of the article):

J. Salamon, E. Gómez, D. P. W. Ellis and G. Richard, "Melody Extraction from Polyphonic Music Signals: Approaches, Applications and Challenges", IEEE Signal Processing Magazine, In Press (2013).
[IEEE][DOI][PDF][BibTeX][Copyright]

The paper provides a detailed review of the current state of the art in melody extraction. For a slightly longer description, here's the abstract:

Melody extraction algorithms aim to produce a sequence of frequency values corresponding to the pitch of the dominant melody from a musical recording. Over the past decade melody extraction has emerged as an active research topic, comprising a large variety of proposed algorithms spanning a wide range of techniques. This article provides an overview of these techniques, the applications for which melody extraction is useful, and the challenges that remain. We start with a discussion of `melody' from both musical and signal processing perspectives, and provide a case study which interprets the output of a melody extraction algorithm for specific excerpts. We then provide a comprehensive comparative analysis of melody extraction algorithms based on the results of an international evaluation campaign. We discuss issues of algorithm design, evaluation and applications which build upon melody extraction. Finally, we discuss some of the remaining challenges in melody extraction research in terms of algorithmic performance, development, and evaluation methodology.
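As a small illustration of how melody extraction output is typically scored in comparative evaluations like the one in the article, here is a sketch using the mir_eval library; the reference and estimate sequences below are made up:

```python
# Illustrative scoring of a melody extraction output with mir_eval:
# times in seconds, frequencies in Hz, 0 Hz marking unvoiced frames.
import numpy as np
import mir_eval

ref_time = np.arange(0, 3.0, 0.01)               # hypothetical reference
ref_freq = np.where(ref_time < 1.5, 220.0, 0.0)  # voiced, then unvoiced
est_time = ref_time.copy()
est_freq = np.where(est_time < 1.4, 221.0, 0.0)  # hypothetical estimate

scores = mir_eval.melody.evaluate(ref_time, ref_freq, est_time, est_freq)
print(scores["Raw Pitch Accuracy"], scores["Overall Accuracy"])
```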
A special thanks to the co-authors of the article: Emilia Gómez, Dan Ellis and Gaël Richard!

Paper on version identification and query-by-humming published

15/11/2012

Our paper:

J. Salamon, J. Serrà and E. Gómez, "Tonal Representations for Music Retrieval: From Version Identification to Query-by-Humming", International Journal of Multimedia Information Retrieval, special issue on Hybrid Music Information Retrieval, In Press.

has now been officially accepted for publication. The paper compares different tonal representations (melody, bass line and harmony) for version identification (automatically detecting cover songs). We also show how our approach for version ID can be easily adapted for query-by-humming (QBH, i.e. searching for a song stuck in your head by singing or humming part of the melody), and since both the melody extraction (using MELODIA) and the matching are fully automatic, this is a fully automatic audio-to-audio QBH system prototype!
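To give a flavor of the matching step, here is a minimal sketch that scores a sung query against a candidate song by aligning their extracted pitch sequences; the paper's matching is based on a local alignment measure over tonal sequences, and plain dynamic time warping is shown here only as a simpler stand-in:

```python
# Simpler stand-in for the paper's alignment: classic DTW between two 1-D
# pitch sequences (e.g. in semitones), used to rank candidate songs.
import numpy as np

def dtw_cost(query, candidate):
    n, m = len(query), len(candidate)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - candidate[j - 1])
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)  # length-normalized alignment cost

# Rank songs by ascending cost: the lowest-cost song is the retrieval result.
```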

We're also planning to put all the queries we recorded for the experiments online, together with a list of the songs in the music collections we used for evaluation (unfortunately we can't share the songs themselves because they are protected by copyright law). I'll write a new post once the files are up.

I'd like to thank my co-authors Joan Serrà and Emilia Gómez for their excellent work. Hope you find the article interesting!
