Slide1
Digital Music Audio Processing
Grows out of DSP and speech recognition research
Feature detection mostly from Fast Fourier Transforms (FFT) and Mel Frequency Cepstral Coefficients (MFCC)
Slide2
Music Digital Audio
http://en.wikipedia.org/wiki/Digital_audio
Slide3
Audio: Two Domain Problem
Frequency domain
Time domain
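To make the two domains concrete, here is a minimal sketch (not from the original slides) that builds a signal in the time domain and views it in the frequency domain with NumPy's FFT. The sample rate, duration and tone frequencies are assumed values chosen purely for illustration.

```python
import numpy as np

sample_rate = 8000   # samples per second (assumed value)
duration = 1.0       # seconds (assumed value)
t = np.arange(int(sample_rate * duration)) / sample_rate

# Time domain: a 440 Hz tone plus a quieter 1000 Hz tone.
signal = 1.0 * np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

# Frequency domain: magnitude of the real-input FFT and the frequency of each bin.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

# The two strongest bins should sit at the two tone frequencies.
strongest = np.sort(freqs[np.argsort(spectrum)[-2:]])
print(strongest)
```

The same samples carry both views: the time-domain array says when the sound is loud, the spectrum says which frequencies are present.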
Slide4
Our Hero
https://upload.wikimedia.org/wikipedia/commons/a/aa/Fourier2.jpg
By User:Bunzil at en.wikipedia [Public domain], from Wikimedia Commons
Jean-Baptiste Joseph Fourier
Mathematician and physicist
Born: 21 March 1768
Died: 16 May 1830
Most famous for his spiffy “Fourier Transform” and related “Fourier’s Law”
Also noted for early “greenhouse effect” work!
Slide5
From Wave to Data
http://en.wikipedia.org/wiki/User:LucasVB/Gallery
Slide6
What do we mean by “audio feature”?
Ideal: TRUE MEANING extracted from the audio signal
Slide7
What do we mean by “audio feature”?
Ideal: TRUE MEANING extracted from the audio signal
Slide8
What do we mean by “audio feature”?
Reality: something we can squint at & interpret a bit
Slide9
“Low-level” and “high-level” features
Low-level: “mechanically recovered” from the audio
e.g. amplitude, timbral descriptors, spectral features
High-level: usually obtained from low-level features + lots of context (template matching, machine-learning, domain knowledge)
e.g. key, pitch, tempo, notes, phrases, similarity
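As an illustration of what “mechanically recovered” can mean, the sketch below computes two common low-level features from a single frame of samples: RMS amplitude and spectral centroid. The function names, frame length and test tones are assumptions made for this example; they are not taken from the slides or from any particular plugin.

```python
import numpy as np

def rms_amplitude(frame):
    """Root-mean-square level of one frame of samples."""
    return float(np.sqrt(np.mean(np.square(frame))))

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    magnitudes = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = magnitudes.sum()
    return float((freqs * magnitudes).sum() / total) if total > 0 else 0.0

sample_rate = 8000                      # assumed value
t = np.arange(1024) / sample_rate       # one 1024-sample frame
low_tone = np.sin(2 * np.pi * 250 * t)
high_tone = np.sin(2 * np.pi * 2000 * t)

print(rms_amplitude(low_tone))
print(spectral_centroid(low_tone, sample_rate))
print(spectral_centroid(high_tone, sample_rate))
```

Neither number needs any musical context to compute. A high-level feature such as key or tempo would typically be estimated from many frames of values like these plus a model of musical structure.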
Slide10
Vamp plugins
Small files you can install that add new feature extractors.
Once installed, can be used with several different “hosts”:
Sonic Visualiser
Audacity audio editor (simple feature extractors only)
Sonic Annotator – batch audio feature extraction program
Python Vamp host – use with scientific coding packages for analysis, search, plotting etc
Slide11
Vamp plugins and audio features
Slide12
What does a Vamp feature consist of?
Slide13
Example: Chromagram
Somewhat representative of time-varying harmonic content
Made by “wrapping around” time-frequency spectrogram into a single octave
Various ways to do this → lots of different chromagram plugins
Good example of an almost intuitively meaningful feature
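The “wrapping around” step can be sketched as follows. This is only one of the various ways mentioned above, not the method of any specific chromagram plugin: it assumes equal temperament with A4 = 440 Hz, assigns each FFT bin to its nearest pitch class, and sums the energy per class. The function name `chroma_from_frame` and the frequency limits are hypothetical choices for this example.

```python
import numpy as np

def chroma_from_frame(frame, sample_rate, min_freq=55.0, max_freq=2000.0):
    """Fold one frame's spectrum into 12 pitch classes (index 0 = C)."""
    window = np.hanning(len(frame))
    magnitudes = np.abs(np.fft.rfft(frame * window))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)

    # Keep only bins in a musically useful range (assumed limits).
    in_range = (freqs >= min_freq) & (freqs <= max_freq)

    # Fractional MIDI note number of each bin, assuming A4 = 440 Hz = MIDI 69.
    midi = 69 + 12 * np.log2(freqs[in_range] / 440.0)
    pitch_classes = np.round(midi).astype(int) % 12

    # Accumulate energy per pitch class, discarding the octave.
    chroma = np.zeros(12)
    np.add.at(chroma, pitch_classes, magnitudes[in_range] ** 2)
    total = chroma.sum()
    return chroma / total if total > 0 else chroma

sample_rate = 8000                      # assumed value
t = np.arange(4096) / sample_rate
# A3 (220 Hz) and A4 (440 Hz) are an octave apart, so both fold onto pitch class A.
frame = np.sin(2 * np.pi * 220 * t) + np.sin(2 * np.pi * 440 * t)
chroma = chroma_from_frame(frame, sample_rate)

names = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
print(names[int(np.argmax(chroma))])
```

Running this per frame over a whole recording and stacking the 12-element vectors over time would give a simple chromagram.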
Slide14
Chromagram
Motivation
Reduce spectrogram in a way informed by musical structure
Limitations
Time/frequency resolution tradeoff
Misleading outcome of harmonic folding (different approaches to this)
Intrinsic difficulties, e.g. with temperament
Applications
Chord and key estimation
“Harmonic feature” for search, retrieval & similarity tasks
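The time/frequency resolution tradeoff listed under Limitations can be shown with simple arithmetic. The sketch below assumes 44100 Hz audio and a plain FFT-based spectrogram. It compares the FFT bin spacing for a few window sizes with the gap between neighbouring semitones at a low pitch (55 Hz, A1 in equal temperament). All names and values are illustrative assumptions.

```python
sample_rate = 44100  # Hz, assumed CD-quality audio

def bin_spacing_hz(window_size):
    """Frequency distance between adjacent FFT bins."""
    return sample_rate / window_size

def window_duration_s(window_size):
    """Length of audio covered by one analysis window."""
    return window_size / sample_rate

def semitone_gap_hz(freq):
    """Distance from freq to the next equal-tempered semitone above it."""
    return freq * (2 ** (1 / 12) - 1)

for window_size in (1024, 4096, 16384):
    print(
        window_size,
        round(bin_spacing_hz(window_size), 2),
        round(window_duration_s(window_size), 3),
        bin_spacing_hz(window_size) < semitone_gap_hz(55.0),
    )
```

Bin spacing and window duration are reciprocals, so finer pitch resolution always costs time resolution. Under these assumptions, only the longest window has bins narrower than a semitone at 55 Hz, and it spans over a third of a second of audio.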