Slide1
Speech & Audio Processing - Part-II
Digital Audio Signal Processing
Marc Moonen
Dept. E.E./ESAT-STADIUS, KU Leuven
marc.moonen@esat.kuleuven.be
homes.esat.kuleuven.be/~moonen/
Slide2
Speech & Audio Processing
Part-I (H. Van hamme):
speech recognition
speech coding (+audio coding)
speech synthesis (TTS)
Part-II (M. Moonen): Digital Audio Signal Processing
microphone array processing
noise cancellation
acoustic echo cancellation
acoustic feedback cancellation
active noise control
3D audio
PS: selection of topics
Slide3
Digital Audio Signal Processing
Aims/scope
Case study: Hearing instruments
Overview
Prerequisites
Lectures/course material/literature
Exercise sessions/project
Exam
Slide4
Aims/Scope
Aim is 2-fold :
Speech & audio per se
S & A industry in Belgium/Europe/…
Basic signal processing theory/principles :
Optimal filters
Adaptive filter algorithms (Filtered-X LMS, ...)
Kalman filters
etc...
Slide5
Hearing
Outer ear/middle ear/inner ear
Tonotopy of inner ear: spatial arrangement of where sounds of different frequency are processed
= Cochlea
[Figure: low-freq tone and high-freq tone, with neural activity for low-freq tone and neural activity for high-freq tone; © www.cm.be]
Case Study: Hearing Instruments 1/14
Slide6
Hearing loss types:
conductive
sensorineural
mixed
One in six adults (Europe)
…and still increasing
Typical causes:
aging
exposure to loud sounds
…
Case Study: Hearing Instruments 2/14
[Source: Lapperre]
Slide7
Hearing impairment :
Dynamic range & audibility
[Figure: level axis from 0dB to 100dB, comparing normal hearing subjects and hearing impaired subjects]
Case Study: Hearing Instruments 3/14
Slide8
Hearing impairment :
Dynamic range & audibility
Dynamic range compression (DRC)
(…rather than `amplification’)
Case Study: Hearing Instruments 4/14
[Figure: level axis from 0dB to 100dB, and DRC characteristic of output level (dB) versus input level (dB), both axes from 0dB to 100dB]
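The input/output level characteristic of a compressor can be sketched as a static curve. The following is a minimal single-band sketch, not the design used in any particular hearing instrument: the function name `drc_output_level` and the threshold, ratio and gain values are assumptions chosen only for illustration, and attack/release behaviour is left out.

```python
def drc_output_level(input_db, threshold_db=50.0, ratio=3.0, gain_db=30.0):
    """Static DRC curve: map an input level (dB) to an output level (dB).

    All parameter values are illustrative assumptions, not taken from the slides.
    """
    if input_db <= threshold_db:
        # Below the threshold: plain linear amplification.
        return input_db + gain_db
    # Above the threshold: level increases are divided by the compression ratio.
    return threshold_db + gain_db + (input_db - threshold_db) / ratio


levels_in = [0, 20, 50, 80, 100]
levels_out = [drc_output_level(level) for level in levels_in]
for level_in, level_out in zip(levels_in, levels_out):
    print(f"{level_in:5.1f} dB in -> {level_out:5.1f} dB out")
```

With these assumed settings, soft sounds receive the full gain while loud sounds receive less, so a 100dB input range is squeezed into a smaller output range.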
Design: multiband DRC, attack time, release time, …
Slide9
Hearing impairment :
Audibility vs speech intelligibility
Audibility does not imply intelligibility
Hearing impaired subjects need 5..10dB larger signal-to-noise ratio (SNR) for speech understanding in noisy environments
Need for noise reduction (=speech enhancement) algorithms:
State-of-the-art: monaural 2-microphone adaptive noise reduction
Near future: binaural noise reduction (see below)
Not-so-near future: multi-node noise reduction (see below)
Case Study: Hearing Instruments 5/14
[Figure: SNR (0dB to 20dB) versus hearing loss (dB, 3-freq-average; 30 to 90)]
Slide10
[Figure: 1921; 2007 (Oticon)]
Case Study: Hearing Instruments 6/14
Hearing Aids (HAs)
Audio input/audio output (`microphone-processing-loudspeaker’)
‘Amplifier’, but so much more than an amplifier!!
History:
Horns/trumpets/…
`Desktop’ HAs (1900)
Wearable HAs (1930)
Digital HAs (1980)
State-of-the-art:
MHz clock speeds
Millions of arithmetic operations/sec, …
Multiple microphones
Slide11
Case Study: Hearing Instruments 7/14
Cochlear Implants (CIs)
Audio input/electrode stimulation output
Stimulation strategy + preprocessing similar to HAs
History:
Volta’s experiment…
First implants (1960)
Commercial CIs (1970-1980)
Digital CIs (1980)
State-of-the-art:
MHz clock speeds, Mops/sec, …
Multiple microphones
Other: Bone anchored HAs, middle ear implants, …
[Figure: Alessandro Volta 1745-1827; intra-cochlear electrode, with electrical stimulation for low frequency and electrical stimulation for high frequency; © Cochlear Ltd]
Slide12
External Processor
Digital/analog conversion
Digital processing & filterbank
Etc.
Coil
Inductive/magnetic coupling
Implant
Electrode array
PS: number of CI implantees worldwide approx. 200,000
PS: 1 CI is approx. 25 kEUR, plus surgery, rehabilitation, ...
PS: 3 companies (Cochlear Ltd, Med-El, Advanced Bionics)
© Cochlear Ltd
Case Study: Hearing Instruments 8/14
Slide13
Technology challenges in hearing instruments
Small form factor (cfr. user acceptance)
Low power: 1…5mW (cfr. battery lifetime ≈ 1 week)
Low processing delay: 10msec (cfr. synchronization with lip reading)
DSP challenges in hearing instruments
Dynamic range compression (cfr. supra)
Dereverberation: undo filtering (`echo-ing’) by room acoustics
Feedback cancellation
Noise reduction
Case Study: Hearing Instruments 9/14
Slide14
DSP Challenges: Feedback Cancellation
Problem statement: Loudspeaker signal is fed back into microphone, then amplified and played back again
Closed loop system may become unstable (howling)
Similar to feedback problem in public address systems (for the musicians amongst you)
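The howling condition can be illustrated with a toy closed loop. This is a rough sketch under strong assumptions that are not from the slides: the amplifier is a plain gain, the acoustic feedback path is a single scaled delay, and the names `simulate_loop`, `gain`, `feedback` and `delay` are hypothetical. In this simplified model the loop is stable only while |gain * feedback| < 1.

```python
def simulate_loop(gain, feedback, delay=8, n_samples=400):
    """Return the peak loudspeaker level over the final 50 samples.

    Toy model (assumption): microphone = external sound + feedback * delayed
    loudspeaker signal, and loudspeaker = gain * microphone.
    """
    loudspeaker = [0.0] * n_samples
    for k in range(n_samples):
        source = 1.0 if k == 0 else 0.0  # a single click as external sound
        fed_back = feedback * loudspeaker[k - delay] if k >= delay else 0.0
        loudspeaker[k] = gain * (source + fed_back)
    return max(abs(v) for v in loudspeaker[-50:])


stable_peak = simulate_loop(gain=5.0, feedback=0.1)     # loop gain 0.5
unstable_peak = simulate_loop(gain=20.0, feedback=0.1)  # loop gain 2.0
print("loop gain 0.5 -> late peak", stable_peak)
print("loop gain 2.0 -> late peak", unstable_peak)
```

In the first run the click dies away after a few round trips; in the second it grows on every round trip, which is the howling described above.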
Case Study: Hearing Instruments 10/14
[Figure: model with feedback path F]
Similar to echo cancellation in GSM handsets, Skype,… but more difficult due to signal correlation
Slide15
DSP Challenges: Noise reduction
Multimicrophone ‘beamforming’, typically with 2 microphones, e.g. ‘directional’ front microphone and ‘omnidirectional’ back microphone
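A two-microphone beamformer of this kind filters each microphone signal and sums the results. The sketch below is a minimal illustration, assuming short FIR filters and a source whose wavefront reaches the second microphone exactly one sample after the first; the helper names `fir_filter` and `filter_and_sum` and all signal values are hypothetical.

```python
def fir_filter(signal, taps):
    """Causal FIR filtering of one microphone signal."""
    out = []
    for k in range(len(signal)):
        acc = 0.0
        for i, tap in enumerate(taps):
            if k - i >= 0:
                acc += tap * signal[k - i]
        out.append(acc)
    return out


def filter_and_sum(mic_signals, filters):
    """Filter every microphone signal with its own FIR filter, then add them up."""
    filtered = [fir_filter(x, w) for x, w in zip(mic_signals, filters)]
    return [sum(samples) for samples in zip(*filtered)]


source = [0.0, 1.0, 2.0, 3.0, 0.0, 0.0]
mic1 = source                # wavefront reaches microphone 1 first
mic2 = [0.0] + source[:-1]   # and microphone 2 one sample later (assumption)

# Delay microphone 1 by one sample so both channels line up, then average.
output = filter_and_sum([mic1, mic2], [[0.0, 0.5], [0.5]])
print(output)
```

Here the filters reduce to pure delays and scalings, so the two aligned copies of the source add coherently; sound from other directions would not line up and would be attenuated.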
Case Study: Hearing Instruments 11/14
“Filter-and-sum” the microphone signals
Slide16
Binaural hearing: Binaural auditory cues
ITD (interaural time difference)
ILD (interaural level difference)
Binaural cues (ITD: f < 1500Hz, ILD: f > 2000Hz) used for
Sound localization
Noise reduction = `Binaural unmasking’ (‘cocktail party’ effect) 0-5dB
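To give a feeling for the size of the ITD cue, the sketch below uses a simple free-field approximation, ITD = d*sin(θ)/c. This formula ignores the acoustic shadow of the head, and the ear spacing of 0.17 m and the speed of sound of 343 m/s are assumed values, so the numbers are only indicative.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed value for air at room temperature
EAR_SPACING = 0.17       # m, assumed distance between the two ears


def itd_seconds(azimuth_deg, spacing=EAR_SPACING, c=SPEED_OF_SOUND):
    """Free-field ITD for a far-field source at the given azimuth (0 = straight ahead)."""
    return spacing * math.sin(math.radians(azimuth_deg)) / c


for azimuth in (0, 30, 60, 90):
    print(f"azimuth {azimuth:2d} deg -> ITD {itd_seconds(azimuth) * 1e6:6.1f} microseconds")
```

Under these assumptions the ITD stays below about half a millisecond, even for a source directly to one side.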
Case Study: Hearing Instruments 12/14
[Figure: signal with ITD and ILD indicated]
Slide17
Binaural hearing aids
Two hearing aids (L&R) with wireless link & cooperation
Opportunities:
More signals (e.g. 2*2 microphones)
Better sensor spacing (17cm instead of 1cm)
Constraints: power/bandwidth/delay of wireless link
..10kBit/s: coordinate program settings, parameters,…
..300kBit/s: exchange 1 or more (compressed) audio signals
Challenges:
Improved localization through cue preservation
Improved noise reduction + benefit from binaural unmasking
Signal selection/filtering, audio coding, synchronisation, …
Case Study: Hearing Instruments 13/14
Slide18
Future: Multi-node noise reduction – sensor networks
Case Study: Hearing Instruments 14/14
Slide19
Overview : Lecture-2
Microphone Array Processing
Referred to as ‘spatial filtering’ (similar to ‘spectral filtering’) or ‘beamforming’
Fixed vs. adaptive beamforming
Application: hearing aids
Filter-and-sum beamformer
Slide20
Overview : Lecture-3
Noise Reduction
`microphone_signal[k] = speech[k] + noise[k]’
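Under the additive model above, spectral subtraction methods attenuate each frequency bin according to how much of its power is estimated to be noise. The sketch below shows one common variant (power spectral subtraction with a spectral floor); it is an assumption for illustration rather than the specific method of the lecture, and the function name, the floor value and the example powers are hypothetical.

```python
import math


def spectral_subtraction_gain(noisy_power, noise_power, floor=0.1):
    """Per-bin gain from power spectral subtraction, limited from below by a floor.

    noisy_power: power of the microphone signal in one frequency bin
    noise_power: estimated noise power in the same bin
    """
    if noisy_power <= 0.0:
        return floor
    gain_sq = 1.0 - noise_power / noisy_power
    return max(math.sqrt(max(gain_sq, 0.0)), floor)


noisy_bins = [10.0, 2.0, 1.0, 0.5]   # example noisy powers (made up)
noise_estimate = 1.0                 # example noise power, same in every bin
gains = [spectral_subtraction_gain(p, noise_estimate) for p in noisy_bins]
print([round(g, 3) for g in gains])
```

Bins dominated by speech keep a gain close to 1, while bins at or below the noise estimate are pushed down to the floor instead of to zero, a usual precaution against audible artifacts.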
Single-microphone noise reduction
Spectral Subtraction Methods (spectral filtering)
Iterative methods based on speech modeling (Wiener & Kalman Filters)
Multi-microphone noise reduction
Beamforming revisited
Optimal filtering approach : spectral+spatial filtering
Slide21
Overview : Lecture-4
Guest Lecture
Prof. Tom Francart, KU Leuven, ExpORL
‘Evaluation of Audio/Speech Signal Processing Algorithms’
Speech intelligibility in noise
Instrumental measures
Behavioral measures
Slide22
Adaptive Filters for Acoustic Echo and Feedback Cancellation
Adaptive filtering problem:
non-stationary/wideband/… speech signals
non-stationary/long/… acoustic channels
Adaptive filtering algorithms
AEC Control
AEC Post-processing
Stereo AEC
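As a rough illustration of an adaptive filtering algorithm for echo cancellation, the sketch below uses normalized LMS (NLMS). NLMS is picked here as one common choice and is an assumption, since the slide does not name an algorithm; the 3-tap echo path, the white far-end signal and the absence of near-end speech are likewise simplifications made for the example.

```python
import random


def nlms_echo_canceller(far_end, microphone, n_taps, step=0.5, eps=1e-8):
    """Adapt an FIR estimate of the echo path; return (weights, error signal)."""
    weights = [0.0] * n_taps
    buffer = [0.0] * n_taps            # most recent far-end samples, newest first
    errors = []
    for x, d in zip(far_end, microphone):
        buffer = [x] + buffer[:-1]
        echo_estimate = sum(w * b for w, b in zip(weights, buffer))
        e = d - echo_estimate          # echo-cancelled output
        norm = sum(b * b for b in buffer) + eps
        weights = [w + step * e * b / norm for w, b in zip(weights, buffer)]
        errors.append(e)
    return weights, errors


random.seed(0)
echo_path = [0.6, -0.3, 0.1]           # hypothetical loudspeaker-to-microphone path
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
microphone = [
    sum(h * far_end[k - i] for i, h in enumerate(echo_path) if k - i >= 0)
    for k in range(len(far_end))
]
weights, errors = nlms_echo_canceller(far_end, microphone, n_taps=3)
print([round(w, 4) for w in weights])
```

Real acoustic channels are far longer than three taps and the far-end signal is speech rather than white noise, which is where the difficulties listed above come from.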
Overview : Lecture-5
Slide23
Overview : Lecture-5
Adaptive Filters for Acoustic Echo and Feedback Cancellation (continued)
Hearing aids, public address (PA) systems
correlation between filter input (`x’) and near-end signal (‘n’)
fixes : noise injection, pitch shifting, notch filtering, …
[Figure: amplifier]
Slide24
Overview : Lecture-6
Kalman Filters for Acoustic Echo and Feedback Cancellation
‘Generalizes’ Wiener Filter..
..based on model for time-evolution of filter coefficients
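The idea of a model for the time-evolution of filter coefficients can be sketched with a single coefficient. The example assumes a random-walk model, w[k] = w[k-1] + process noise, and an observation d[k] = x[k]*w[k] + measurement noise; this model choice, the function name `kalman_track_coefficient` and all numerical values are assumptions for illustration, not the formulation used in the lecture.

```python
import random


def kalman_track_coefficient(inputs, observations, process_var, noise_var):
    """Scalar Kalman filter tracking one slowly varying filter coefficient."""
    estimate, variance = 0.0, 1.0        # initial guess and its uncertainty
    history = []
    for x, d in zip(inputs, observations):
        variance += process_var                           # time update (random walk)
        gain = variance * x / (x * x * variance + noise_var)
        estimate += gain * (d - x * estimate)             # measurement update
        variance *= (1.0 - gain * x)
        history.append(estimate)
    return history


random.seed(1)
process_var, noise_var = 1e-6, 1e-4      # assumed noise variances
true_w, inputs, observations, truth = 0.5, [], [], []
for _ in range(1000):
    true_w += random.gauss(0.0, process_var ** 0.5)       # coefficient drifts slowly
    x = random.uniform(-1.0, 1.0)
    inputs.append(x)
    observations.append(x * true_w + random.gauss(0.0, noise_var ** 0.5))
    truth.append(true_w)

estimates = kalman_track_coefficient(inputs, observations, process_var, noise_var)
print("final true value:", round(truth[-1], 4), "estimate:", round(estimates[-1], 4))
```

Setting the process variance to zero would turn this into a recursive estimate of a fixed coefficient; the random-walk term is what lets the filter follow a changing acoustic path.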
[Figure: amplifier]
Slide25
Overview : Lecture-7
Active Noise Control
Solution based on `filtered-X LMS’
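A minimal filtered-X LMS loop might look as follows. The sketch assumes FIR models for the primary path (noise source to error microphone) and the secondary path (loudspeaker to error microphone), and it assumes the secondary path is known exactly; the path coefficients, step size and function names are hypothetical. The defining step is that the reference signal is filtered by the secondary-path model before it enters the LMS update.

```python
import random


def fir_output(taps, history):
    """Inner product of FIR taps with a newest-first sample history."""
    return sum(t * h for t, h in zip(taps, history))


def fxlms(reference, primary, secondary, n_taps=8, step=0.05):
    """Adapt a control filter so that the anti-noise cancels the disturbance."""
    weights = [0.0] * n_taps
    x_hist = [0.0] * max(n_taps, len(primary), len(secondary))
    y_hist = [0.0] * len(secondary)
    fx_hist = [0.0] * n_taps
    errors = []
    for x in reference:
        x_hist = [x] + x_hist[:-1]
        disturbance = fir_output(primary, x_hist)        # noise at the error mic
        y = fir_output(weights, x_hist)                  # loudspeaker signal
        y_hist = [y] + y_hist[:-1]
        error = disturbance + fir_output(secondary, y_hist)
        filtered_x = fir_output(secondary, x_hist)       # the 'filtered-X' signal
        fx_hist = [filtered_x] + fx_hist[:-1]
        weights = [w - step * error * fx for w, fx in zip(weights, fx_hist)]
        errors.append(error)
    return weights, errors


random.seed(0)
reference = [random.uniform(-1.0, 1.0) for _ in range(5000)]
primary = [0.0, 0.0, 0.8, 0.4]     # hypothetical primary path
secondary = [0.0, 1.0, 0.5]        # hypothetical secondary path, assumed known
weights, errors = fxlms(reference, primary, secondary)
early = sum(e * e for e in errors[:200]) / 200
late = sum(e * e for e in errors[-200:]) / 200
print("residual power, first 200 samples:", early, "last 200 samples:", late)
```

The paths were chosen so that an exact causal solution exists (a delayed gain of -0.8), which lets the residual fall to essentially zero; with realistic paths and an imperfect secondary-path estimate some residual noise remains.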
Application : active headsets/ear defenders
Slide26
Overview : Lecture-7
3D Audio & Loudspeaker Arrays
Binaural synthesis
…with headphones
head related transfer functions (HRTF)
…with 2+ loudspeakers (`sweet spot’)
crosstalk cancellation
Slide27
Overview : Lecture-8
Guest Lecture
Dr. Enzo De Sena, KU Leuven, ESAT/STADIUS
‘Auralization for Architectural Acoustics, Virtual Reality and Computer Games - from Physical to Perceptual Rendering of Dynamic Sound Scenes’
Slide28
Aims/Scope (revisited)
Aim is 2-fold :
Speech & audio per se
Basic signal processing theory/principles :
Optimal filtering / Kalman filters (linear/nonlinear)
here : echo cancellation, speech enhancement
other : automatic control, spectral estimation, ...
Advanced adaptive filter algorithms
here : acoustic echo cancellation
other : digital communications, ...
Filtered-X LMS
here : 3D audio
other : active noise/vibration control
Slide29
Lectures
Lectures: 1 Intro + 7 Lectures
PS: Time budget = 1*(2hrs)*2 + 7*(2hrs)*4 = 60 hrs
Course Material: Slides
Use version 2015-2016 !
Download from DASP webpage homes.esat.kuleuven.be/~dspuser/dasp/
Slide30
Prerequisites
H197 Signals & Systems (JVDW)
HJ09 Digital Signal Processing (I) (PW)
signal transforms, sampling, multi-rate, DFT, …
HC63 DSP-CIS (MM)
filter design, filter banks, optimal & adaptive filters
Slide31
Literature
Literature (general) (available in DSP-CIS library)
Simon Haykin `Adaptive Filter Theory’ (Prentice Hall 1996)
P.P. Vaidyanathan `Multirate Systems and Filter Banks’ (Prentice Hall 1993)
Literature (specialized) (available in DSP-CIS library)
S.L. Gay & J. Benesty `Acoustic Signal Processing for Telecommunication’ (Kluwer 2000)
M. Kahrs & K. Brandenburg (Eds) `Applications of Digital Signal Processing to Audio and Acoustics’ (Kluwer 1998)
B. Gold & N. Morgan `Speech and Audio Signal Processing’ (Wiley 2000)
Slide32
Exercise Sessions/Project
Acoustic source localization
Direction-of-arrival estimation
Noise reduction
Synthesis
[Figure: simulated set-up with direction-of-arrival θ]
Slide33
Runs over 4 weeks (non-consecutive)
Each week
1 PC/Matlab session (supervised, 2.5hrs)
2 ‘Homework’ sessions (unsupervised, 2*2.5hrs)
PS: Time budget = 4*(2.5hrs+5hrs) = 30 hrs
‘Deliverables’ after week 2 & 4
Grading: based on deliverables, evaluated during sessions
TAs: guiliano.bernardi@esat (English+Italian)
PS: groups of 2
Acoustic Source Localization Project
Slide34
Work Plan
Week 1: Matlab acoustic simulation environment
Week 2: Direction-of-arrival (DoA) estimation based on the ‘MUSIC’ algorithm *deliverable*
Week 3: DoA estimation + noise reduction (‘DoA informed beamforming’)
Week 4: Binaural synthesis and 3D audio *deliverable*
Acoustic Source Localization Project
..be there !
Slide35
Oral exam, with preparation time
Open book
Grading
7 for question-1
7 for question-2
+6 for project
___
= 20
Exam
Slide36
Oral exam, with preparation time
Open book
Grading
7 for question-1
7 for question-2
+6 for question-3 (related to project work)
___
= 20
September Retake Exam
Slide37
Website
TOLEDO
http://homes.esat.kuleuven.be/~dspuser/dasp/
Contact: guiliano.bernardi@esat
Slides (use `version 2015-2016’ !!)
Schedule
DSP-library
FAQs (send questions to marc.moonen@esat)
Slide38
Questions?
1) Ask teaching assistant (during exercise sessions)
2) E-mail questions to teaching assistant or marc.moonen@esat
3) Make appointment marc.moonen@esat ESAT Room B.00.14