AI in Audio

Part 2: Music Information Retrieval
Spring 2019 - Audio Tech Talk Series
March 26, 2019

What is MIR?

"extending the understanding and usefulness of music data, through the research, development and application of computational approaches and tools"

"use of theories, concepts and techniques from music, computer science, signal processing and cognition"

Introduction to MIR

J. P. Bello

Applications

Music recommendation

Source separation

Auto tagging

Instrument recognition

Music transcription

Traditional approaches

Hand-crafted Feature Extractors

Centroid

Rolloff

Flux

Zero Crossings

Low Energy

Automatic Musical Genre Classification of Audio Signals

G. Tzanetakis, G. Essl, P. Cook 2001
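The five features listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not the cited paper's implementation: the function names, the Hann window, the 85% rolloff threshold, and the 440 Hz test tone are all assumptions chosen for the example.

```python
import numpy as np

def spectral_features(frame, prev_frame, sr, rolloff_pct=0.85):
    """Centroid (Hz), rolloff (Hz), flux, and zero-crossing count for one frame."""
    window = np.hanning(len(frame))
    mag = np.abs(np.fft.rfft(frame * window))
    prev_mag = np.abs(np.fft.rfft(prev_frame * window))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)

    # Centroid: magnitude-weighted mean frequency ("brightness").
    total = mag.sum() + 1e-12
    centroid = float((freqs * mag).sum() / total)

    # Rolloff: lowest frequency below which rolloff_pct of the magnitude lies.
    cumulative = np.cumsum(mag)
    rolloff_bin = int(np.searchsorted(cumulative, rolloff_pct * cumulative[-1]))
    rolloff = float(freqs[min(rolloff_bin, len(freqs) - 1)])

    # Flux: squared difference between successive normalized spectra.
    norm = mag / total
    prev_norm = prev_mag / (prev_mag.sum() + 1e-12)
    flux = float(np.sum((norm - prev_norm) ** 2))

    # Zero crossings: number of sign changes in the time-domain frame.
    signs = np.signbit(frame)
    zero_crossings = int(np.count_nonzero(signs[1:] != signs[:-1]))
    return centroid, rolloff, flux, zero_crossings

def low_energy(frames):
    """Fraction of frames whose RMS energy is below the mean RMS of all frames."""
    rms = np.sqrt(np.mean(np.square(frames), axis=1))
    return float(np.mean(rms < rms.mean()))

# Assumed test signal: one 1024-sample frame of a 440 Hz sine at 22.05 kHz.
sr = 22050
t = np.arange(1024) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
centroid, rolloff, flux, zero_crossings = spectral_features(tone, tone, sr)
print(centroid, rolloff, flux, zero_crossings)
```

For a pure tone the centroid and rolloff both sit close to the tone's frequency, and the flux between two identical frames is zero. Real music spreads energy across the spectrum, which is what makes these features discriminative.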

Classical Machine Learning

Gaussian Classifiers

Support Vector Machines

Random Forests

Clustering
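As a rough sketch of the first method in this list, a Gaussian classifier fits one Gaussian per class to the feature vectors and picks the class with the highest likelihood. The class name, the diagonal-covariance simplification, and the toy two-dimensional features below are assumptions made for illustration only.

```python
import numpy as np

class GaussianClassifier:
    """One diagonal-covariance Gaussian per class; predicts the most likely class."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # A small floor keeps every variance strictly positive.
        self.vars_ = np.array([X[y == c].var(axis=0) + 1e-6 for c in self.classes_])
        self.log_priors_ = np.log(np.array([np.mean(y == c) for c in self.classes_]))
        return self

    def predict(self, X):
        # Log-likelihood of each sample under each class: shape (n_samples, n_classes).
        diff = X[:, None, :] - self.means_[None, :, :]
        log_lik = -0.5 * np.sum(
            diff ** 2 / self.vars_ + np.log(2 * np.pi * self.vars_), axis=2
        )
        return self.classes_[np.argmax(log_lik + self.log_priors_, axis=1)]

# Synthetic toy features (made-up numbers, not measured from real audio).
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, -0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y = np.array(["classical", "classical", "classical", "rap", "rap", "rap"])

model = GaussianClassifier().fit(X, y)
predictions = model.predict(np.array([[0.1, 0.0], [5.0, 5.0]]))
print(predictions)
```

In practice the rows of X would be hand-crafted feature vectors such as centroid, rolloff, and flux statistics, one row per track.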

Enter deep learning...

Applications

Music recommendation

Methods

Collaborative Filtering

Natural Language Processing

Content-based audio analysis with CNNs
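Collaborative filtering, the first method above, can be illustrated with item-to-item cosine similarity on a play-count matrix. This is only one simple variant, and the `recommend` function and the play counts are hypothetical, invented for this sketch.

```python
import numpy as np

def recommend(plays, user, top_k=2):
    """Rank tracks the user has not played by similarity-weighted play counts."""
    norms = np.linalg.norm(plays, axis=0, keepdims=True)
    unit = plays / np.maximum(norms, 1e-12)
    item_sim = unit.T @ unit             # (n_tracks, n_tracks) cosine similarity
    np.fill_diagonal(item_sim, 0.0)      # a track should not recommend itself
    scores = item_sim @ plays[user]
    scores[plays[user] > 0] = -np.inf    # hide tracks the user already knows
    ranked = np.argsort(scores)[::-1]
    return [int(i) for i in ranked[:top_k] if np.isfinite(scores[i])]

# Hypothetical play counts: rows are users, columns are tracks.
plays = np.array([
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 5, 4],
    [5, 0, 0, 0],
], dtype=float)

top = recommend(plays, user=3, top_k=1)
print(top)  # [1]
```

User 3 has only played track 0, and the users who play track 0 also play track 1, so track 1 ranks first. No audio is analyzed here, which is why a new track with no play history cannot be recommended this way and content-based methods are needed.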

Source Separation

Deep Convolutional Neural Networks for Musical Source Separation

M. Miron, P. Chandna, G. Erruz, and H. Martel 2016

The Sound of Pixels

H. Zhao, C. Gan, A. Rouditchenko, C. Vondrick, J. McDermott, A. Torralba 2018

Auto Tagging

J.S. Bach - Aria (Vergnügte Ruh, Beliebte Seelenlust)

Top 10: Human-labels

female vocals, triple meter, acoustic, classical music, baroque period, lead vocals, string ensemble, major, compositional dominance of: lead vocals and melody

Top 10: Deep Learning

acoustic, string ensemble, classical music, period baroque, major, compositional dominance of: the arrangement, form, performance, rhythm and lead vocals.

Kendrick Lamar - Complexion (A Zulu Love)

Top 10: Human-labels

English, male vocals, rap, East Coast, breathy vocal, joyful lyrics and compositional dominance of: lyrics, melody, rhythm, accompanying vocals.

Top 10: Deep Learning

English, lead vocals, male vocals, rap, accompanying vocals, danceable and compositional dominance of: accompanying vocals, lead vocals, rhythm, lyrics.

End-to-end learning for music audio tagging at scale

J. Pons, O. Nieto, M. Prockup, E. Schmidt, A. Ehmann and X. Serra 2017

Instrument Recognition

Neutron Track Assistant

Gordon Wichern, iZotope 2017

Music Transcription

Onsets and Frames: Dual-Objective Piano Transcription

C. Hawthorne, E. Elsen, J. Song, A. Roberts, I. Simon, C. Raffel, J. Engel, S. Oore, D. Eck 2018

Next Talk - April 16

Building Audio Plugins

A look under the hood and getting started