Presentation Transcript

Slide1

M. Tech. project, EE Dept., IIT Bombay, Jun. 2013.

Real-time Enhancement of Noisy Speech Using Spectral Subtraction

Santosh K. Waddi (10307932)

Supervisor: Prof. P. C. Pandey

IIT Bombay

June 2013

Slide2

Overview

1. Introduction
2. Speech Enhancement Using Spectral Subtraction
3. Investigations Using Offline Processing
4. Implementation for Real-time Processing
5. Summary & Conclusions

Slide3

1. Introduction

Sensorineural hearing loss
- Increased hearing thresholds and high-frequency loss
- Decreased dynamic range & abnormal loudness growth
- Reduced speech perception due to increased spectral & temporal masking
→ Decreased speech intelligibility in noisy environments

Signal processing in hearing aids
- Frequency-selective amplification
- Automatic volume control
- Multichannel dynamic range compression (settable attack time, release time, and compression ratios)

Processing for reducing the effect of increased spectral masking in sensorineural loss
- Binaural dichotic presentation (Lunner et al. 1993, Kulkarni et al. 2012)
- Spectral contrast enhancement (Yang et al. 2003)
- Multiband frequency compression (Arai et al. 2004, Kulkarni et al. 2012)

Slide4

Techniques for reducing the background noise
- Directional microphone
- Adaptive filtering (a second microphone needed for noise reference)
- Single-channel noise suppression using spectral subtraction (Boll 1979, Berouti et al. 1979, Martin 1994, Kamath & Loizou 2002, Loizou 2007, Lu & Loizou 2008, Paliwal et al. 2010)

Processing steps

Dynamic estimation of non-stationary noise spectrum
- During non-speech segments using voice activity detection
- Continuously using statistical techniques

Estimation of noise-free speech spectrum
- Spectral noise subtraction
- Multiplication by noise suppression function

Speech resynthesis (using enhanced magnitude and noisy phase)

Slide5

Research objective

Real-time single-input speech enhancement for use in hearing aids and other sensory aids (cochlear prostheses, etc.) for hearing-impaired listeners and in communication devices

Main challenges
- Noise estimation without voice activity detection to avoid errors under low SNR & during long speech segments
- Low signal delay (algorithmic + computational) for real-time application
- Low computational complexity & memory requirement for implementation on a low-power processor

Slide6

2. Speech Enhancement Using Spectral Subtraction
- Dynamic estimation of non-stationary noise spectrum
- Estimation of noise-free speech spectrum
- Speech resynthesis

Slide7

Generalized spectral subtraction (Berouti et al. 1979)

Power subtraction
- Windowed speech spectrum = $X_n(k)$
- Estimated noise mag. spectrum = $D_n(k)$
- Estimated speech spectrum:

$$Y_n(k) = \left[\,|X_n(k)|^2 - (D_n(k))^2\,\right]^{0.5} e^{j\angle X_n(k)}$$

Problems: residual noise due to under-subtraction, distortion in the form of musical noise & clipping due to over-subtraction.

Generalized subtraction

$$|Y_n(k)| = \begin{cases} \left[\,|X_n(k)|^{\gamma} - \alpha\,(D_n(k))^{\gamma}\,\right]^{1/\gamma}, & \text{if } |X_n(k)| > (\alpha + \beta)^{1/\gamma}\, D_n(k) \\ \beta^{1/\gamma}\, D_n(k), & \text{otherwise} \end{cases}$$

- $\gamma$ = exponent factor (2: power subtraction, 1: magnitude subtraction)
- $\alpha$ = over-subtraction factor (for limiting the effect of short-term variations in noise spectrum)
- $\beta$ = floor factor to mask the musical noise due to over-subtraction

Re-synthesis with noisy phase without explicit phase calculation

$$Y_n(k) = |Y_n(k)|\,\frac{X_n(k)}{|X_n(k)|}$$
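The subtraction rule and the phase-free resynthesis on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the project's code: the function name `spectral_subtract` is hypothetical, the default values of α, β and γ are taken from the offline investigations reported later in the slides, and the handling of a zero-magnitude bin is an assumption.

```python
def spectral_subtract(X, D, alpha=2.0, beta=0.01, gamma=1.0):
    """Apply generalized spectral subtraction to one frame.

    X: list of complex spectral values X_n(k) of the noisy frame.
    D: list of non-negative estimated noise magnitudes D_n(k).
    Returns the list of complex enhanced values Y_n(k).
    """
    Y = []
    for x, d in zip(X, D):
        mag = abs(x)
        if mag > (alpha + beta) ** (1.0 / gamma) * d:
            # Subtract the scaled noise estimate in the gamma domain.
            y_mag = (mag ** gamma - alpha * d ** gamma) ** (1.0 / gamma)
        else:
            # Spectral floor: keep a small fraction of the noise estimate.
            y_mag = beta ** (1.0 / gamma) * d
        if mag > 0.0:
            # Noisy phase is reused by scaling X_n(k); no angle is computed.
            Y.append(y_mag * x / mag)
        else:
            # Assumption: a zero-magnitude bin has no phase, so output zero.
            Y.append(0j)
    return Y
```

With γ = 1 this is magnitude subtraction, and with γ = 2 it is power subtraction. Scaling `x` by `y_mag / mag` is what avoids an explicit phase calculation.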

Slide8

Multi-band spectral subtraction (Kamath & Loizou 2002)
- Noise does not affect the spectrum uniformly
- Speech spectrum divided into B non-overlapping bands; spectral subtraction is performed independently in each band
- Test material: 10 sentences from HINT database; noise: speech-shaped noise; noisy speech: 0 dB and 5 dB SNR
- Evaluation: Itakura-Saito (IS) distance as an objective measure
- Improvement over conventional power spectral subtraction, with very little trace of musical noise

Geometric approach to spectral subtraction (Lu & Loizou 2008)
- Without assuming the cross-terms as zero
- Test material: NOIZEUS database; noise: babble, street, car, white; noisy speech: 0 dB, 5 dB and 15 dB SNR
- Evaluation: mean square error (MSE), PESQ, log likelihood ratio
- Cross terms can be ignored at very low and high SNRs but not near 0 dB
- Proc. output: no audible musical noise, smooth and pleasant residual noise
- Performed significantly better than power spectral subtraction in all conditions
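The cross terms mentioned for the geometric approach come from expanding the power of the noisy spectrum. As a reminder, and using symbols that do not appear on the slides ($S_n(k)$ for the clean speech spectrum, $\tilde{D}_n(k)$ for the true complex noise spectrum, and $\theta_S$, $\theta_D$ for their phases), additive noise $X_n(k) = S_n(k) + \tilde{D}_n(k)$ gives:

```latex
|X_n(k)|^2 = |S_n(k)|^2 + |\tilde{D}_n(k)|^2
           + 2\,|S_n(k)|\,|\tilde{D}_n(k)|\,\cos\!\big(\theta_S(k) - \theta_D(k)\big)
```

Power subtraction amounts to dropping the last term. That term is small compared with the other two when one of the magnitudes dominates, which is consistent with the observation that the cross terms matter most near 0 dB SNR.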

Slide9

Noise estimation

Minimal-tracking algorithms
- Minimum statistics (Martin 1994): tracks the noise as minima of past frames
- Minimum tracking (Doblinger 1995): smoothing of noisy speech power spectra in each frequency bin using non-linear smoothing

Time-recursive averaging algorithms
- SNR-dependent recursive averaging (Lin et al. 2003): noisy speech decomposed into sub-band signals; noisy signal power is smoothed and noise is estimated adaptively; smoothing parameter: function of estimated SNR
- Weighted spectral averaging (Hirsch & Ehrlicher 1995): first-order recursive weighted average of past spectral magnitude values over 400 ms which are below an adaptive threshold

Slide10

Improved minima-controlled recursive averaging (Cohen 2003)
- Two iterations of smoothing and tracking
- First iteration: rough voice activity detection is provided in each frequency band
- Second iteration: smoothing excludes strong speech components, making the minimum tracking robust during speech activity
- Smoothing parameter: frequency-dependent & dynamically adjusted by signal presence probability
- Lower estimation error than minimum statistics
- Method is combined with log-spectral amplitude estimator: higher segmental SNR improvement than minimum statistics

Histogram-based technique (Hirsch & Ehrlicher 1995)
- Histogram: noisy speech over 400 ms
- Noise estimated: maximum of distribution in each sub-band
- Avoid spikes: estimated values smoothed along time axis
- Objective evaluation: relative error
- Relative error is low compared to weighted spectral average method (Hirsch & Ehrlicher 1995)

Slide11

Cascaded-median based estimation (Basha & Pandey 2012)
- Moving median approximated by p-point q-stage cascaded median, with a saving in memory & computation for real-time implementation

Quantile-based noise estimation (Stahl et al. 2000)
- Speech signal energy: low in most of the frames, high in only 10 – 20% of the frames
- Noise estimation: selecting a certain quantile value from previous frames of the noisy speech spectrum
- Quantile selection can be frequency-dependent and SNR-dependent

Median-based noise estimation works well in a robust manner
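Quantile-based estimation, with the median as the 0.5 quantile, can be sketched as below. This is a minimal illustration under stated assumptions, not the published algorithm. The class name `QuantileNoiseEstimator` is hypothetical, the history is re-sorted on every frame for clarity rather than efficiency, and the index rule (lower nearest rank) is one of several possible quantile definitions. The default of 81 frames follows the value reported later in the slides.

```python
from collections import deque


class QuantileNoiseEstimator:
    """Per-bin noise magnitude estimate as a quantile of recent frames."""

    def __init__(self, num_bins, num_frames=81, quantile=0.5):
        self.quantile = quantile
        # One bounded history of magnitudes for every frequency bin.
        self.history = [deque(maxlen=num_frames) for _ in range(num_bins)]

    def update(self, magnitudes):
        """Add one frame of |X_n(k)| and return the current estimate D_n(k)."""
        estimate = []
        for hist, mag in zip(self.history, magnitudes):
            hist.append(mag)  # the oldest value drops out once the deque is full
            ordered = sorted(hist)
            # Lower nearest-rank index into the sorted history.
            idx = int(self.quantile * (len(ordered) - 1))
            estimate.append(ordered[idx])
        return estimate
```

Because speech energy is high in only a minority of frames, a short burst of large magnitudes in the history leaves the median unchanged. This is why no voice activity detector is needed.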

Slide12

Project objective

Implementation of generalized spectral subtraction along with cascaded-median based noise estimation for real-time processing using a low-power DSP
- Selection of optimal set of processing steps and parameters, using offline processing
- Implementation on a DSP board with a 16-bit fixed-point processor & evaluation

MBNE vs CMBNE comparison

| Median | Storage per freq. bin | No. of sortings per frame per freq. bin |
|---|---|---|
| M-point | 2M | (M–1)/2 |
| p-point q-stage (M = p^q) | pq | p(p–1)/2 |

Condition for reducing sorting operations and storage: low p, q ≈ ln(M)

p = 3 → code simplification for sorting operations
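The storage and sorting counts in the MBNE vs CMBNE comparison can be checked with a small sketch, together with one possible realization of a p-point q-stage cascaded median. The names `CascadedMedian` and `costs` are hypothetical. The update schedule, in which each stage emits one median after every p inputs and then clears its buffer, is an assumption: the slides give only the counts, not the schedule. M = p^q is inferred from the 3-point 4-stage case covering 81 frames.

```python
def median_of(values):
    """Median of a short list (odd length expected)."""
    return sorted(values)[len(values) // 2]


class CascadedMedian:
    """p-point q-stage cascaded median for one frequency bin."""

    def __init__(self, p=3, q=4, initial=0.0):
        self.p = p
        self.buffers = [[] for _ in range(q)]  # at most p values per stage
        self.estimate = initial

    def update(self, value):
        """Feed one new magnitude; return the latest last-stage median."""
        for stage, buf in enumerate(self.buffers):
            buf.append(value)
            if len(buf) < self.p:
                break  # this stage is still filling; later stages wait
            value = median_of(buf)
            buf.clear()
            if stage == len(self.buffers) - 1:
                self.estimate = value
        return self.estimate


def costs(p, q):
    """Storage and sorting counts using the formulas from the comparison table."""
    m = p ** q
    return {
        "M": m,
        "mbne_storage": 2 * m,
        "mbne_sortings": (m - 1) // 2,
        "cmbne_storage": p * q,
        "cmbne_sortings": p * (p - 1) // 2,
    }
```

For p = 3 and q = 4, `costs` gives M = 81, storage of 162 versus 12 values, and 40 versus 3 sortings per frame per frequency bin, which matches the reductions quoted in the slides.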

Slide13

3. Investigations Using Offline Processing

Test material
- Speech material 1: recording with three isolated vowels, a Hindi sentence, and an English sentence (-/a/-/i/-/u/– “aayiye aap kaa naam kyaa hai?” – “Where were you a year ago?”) from a male speaker, referred to as "VHSES"
- Speech material 2: six sentences from NOIZEUS database of one male speaker
- Noise: white, pink, street, babble, car, and train noises
- SNR: ∞, 18, 15, 12, 9, 6, 3, 0, -3, -6 dB

Evaluation methods
- Informal listening
- Objective evaluation using PESQ measure (0 – 4.5)

Investigations ($f_s$ = 10 kHz)
- Overlap of 50% & 75%: indistinguishable outputs
- γ = 1 (magnitude subtraction): higher tolerance to variation in α, β values
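Test signals at the SNR values listed under the test material can be produced by scaling the noise before adding it to the clean speech. The sketch below shows one common way to do this. The function name `mix_at_snr` is hypothetical, and measuring the SNR from average power over the whole utterance is an assumption, since the slides do not state how the SNR was set. The ∞ dB condition is simply the clean speech with no noise added.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Return speech + scaled noise so that the mixture has the given SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(v * v for v in noise) / len(noise)
    # Gain g such that 10*log10(p_speech / (g^2 * p_noise)) == snr_db.
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * v for s, v in zip(speech, noise)]
```

For example, `mix_at_snr(speech, noise, 0)` scales the noise to the same average power as the speech.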

Slide14

Investigation on noise estimation

Figure: scatter plots for magnitude spectra. Speech material: VHSES. Panels: (a) clean speech signal, (b) white noise, (c) noisy speech: white noise, 3 dB SNR, (d) noisy speech: white noise, 0 dB SNR

Slide15

Figure: scatter plots for magnitude spectra. Speech material: NOIZEUS. Panels: (a) clean speech signal, (b) white noise, (c) noisy speech: white noise, 3 dB SNR, (d) noisy speech: white noise, 0 dB SNR

Slide16

Figure: mean, median and minimum of magnitude spectra of clean speech signal, noise and noisy speech (white, SNR: 0 dB). Speech material: VHSES. Panels: (a) mean, (b) median, (c) minimum

Observation: the noisy-signal median tracks the noise median, and the noisy-signal minimum tracks the noise minimum, at almost all frequencies

Slide17

Figure: mean, median and minimum of magnitude spectra of clean speech signal, noise and noisy speech (white, SNR: 0 dB). Speech material: NOIZEUS. Panels: (a) mean, (b) median, (c) minimum

Slide18

Objective evaluation of the accuracy of noise estimation

Figure: relative RMS error (dB). Panels: (a) speech material: VHSES, (b) speech material: NOIZEUS

Observation: relative RMS error (dB) decreases as SNR decreases

Slide19

Effect of window length and noise estimation duration
- Processing: magnitude spectral subtraction with median-based noise estimation
- High PESQ score: noise estimation across 81 past frames & 20 – 40 ms window length
- 30 ms window length was chosen (noise estimation duration of approximately 1.2 s)

Figure panels: (a) speech: VHSES, noise: white, SNR: 0 dB, (b) speech: NOIZEUS, noise: white, SNR: 0 dB
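The figure of approximately 1.2 s is consistent with 81 frames at a 15 ms frame shift. The sketch below works through that arithmetic. It assumes the 10 kHz sampling rate and 50% overlap stated elsewhere in the slides, and it assumes the duration is counted as the number of frames times the frame shift. The function name `frame_timing` is hypothetical.

```python
def frame_timing(fs_hz=10_000, window_ms=30, overlap=0.5, num_frames=81):
    """Derive frame sizes and the span covered by the noise-estimation history."""
    window_len = fs_hz * window_ms // 1000       # samples per window
    shift = int(window_len * (1.0 - overlap))    # samples between frame starts
    shift_ms = 1000.0 * shift / fs_hz            # frame shift in milliseconds
    span_s = num_frames * shift_ms / 1000.0      # history span in seconds
    return window_len, shift, shift_ms, span_s
```

With the defaults this gives a 300-sample window, a 150-sample (15 ms) shift, and a history span of about 1.2 s.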

Slide20

Comparison of enhanced speech using MBNE and CMBNE
- MBNE requires large memory & is computation intensive
- 3-point 4-stage cascaded median significantly reduces memory requirement & computations
- Reduction in storage requirement per freq. bin: from 162 to 12 samples
- Reduction in number of sorting operations per frame per freq. bin: from 40 to 3
- Informal listening: perceptually the same
- Objective evaluation: almost the same in most cases, with a maximum difference of 0.06

PESQ score of the enhanced speech. Speech: NOIZEUS, SNR: 0 dB

| Noise type | Unproc. | Proc. using MBNE | Proc. using CMBNE |
|---|---|---|---|
| white | 1.55 | 1.84 | 1.84 |
| babble | 1.75 | 1.80 | 1.81 |
| street | 1.83 | 2.08 | 2.04 |
| pink | 1.60 | 2.00 | 1.98 |
| train | 2.05 | 2.40 | 2.35 |
| car | 1.72 | 1.95 | 1.89 |

Slide21

Effect of spectral subtraction parameters
- Processing: magnitude spectral subtraction using 3-point 4-stage CMBNE
- Analysis-synthesis: 30 ms window length & 50% overlap
- Spectral floor factor β: 0.01 appropriate for all the cases
- Subtraction factor α: in 2 – 2.5 for VHSES and in 1.2 – 1.4 for NOIZEUS speech material

Phase estimation for spectral subtraction
- Processing: magnitude spectral subtraction using 3-point 4-stage CMBNE
- Phase: zero, cepstrum-based (Rabiner & Schafer 1978), Quatieri & Oppenheim 1981, Nawab et al. 1983
- Analysis-synthesis: 50% overlap rect. win., 75% overlap rect. win., Griffin-Lim method (Griffin & Lim 1984)
- Informal listening: no improvement from using estimated phase over using noisy phase
- Objective evaluation: signal estimated using noisy phase has higher PESQ score

Slide22

Comparison of proposed method with other methods
- Proposed method: magnitude spectral subtraction with cascaded-median based noise estimation; analysis-synthesis with 30 ms window and 50% overlap
- Comparison: spectral-subtractive, statistical-model based, and subspace algorithms (implementations available on CD accompanying Loizou 2007: specsub, mband, ga, wiener_iter, wiener_as, wiener_wt, mt_mask, audnoise, mmse, logmmse, logmmse_spu, stsa_weuclid, stsa_wcosh, stsa_mis, klt, pklt)

Comparison of PESQ scores for VHSES speech material, 0 dB SNR

| Enhancement method | white | babble | street | pink | train | car |
|---|---|---|---|---|---|---|
| unproc. | 1.54 | 1.73 | 1.78 | 1.59 | 2.00 | 1.67 |
| specsub | 1.78 | 1.74 | 1.85 | 2.01 | 2.33 | 1.91 |
| mband | 1.43 | 1.88 | 2.06 | 1.72 | 2.61 | 2.09 |
| ga | 1.82 | 2.05 | 2.42 | 2.11 | 2.67 | 2.26 |
| mmse | 1.95 | 1.86 | 1.99 | 2.22 | 2.69 | 2.24 |
| klt | 2.51 | 1.89 | 1.84 | 2.27 | 2.32 | 1.91 |
| Proposed method | 2.14 | 1.93 | 2.16 | 2.31 | 2.63 | 2.15 |

Slide23

Comparison of PESQ scores for NOIZEUS speech material, 0 dB SNR

| Enhancement method | white | babble | street | pink | train | car |
|---|---|---|---|---|---|---|
| unproc. | 1.55 | 1.75 | 1.83 | 1.60 | 2.05 | 1.72 |
| specsub | 1.63 | 1.58 | 1.81 | 1.78 | 2.04 | 1.77 |
| mband | 1.59 | 1.84 | 2.02 | 1.73 | 2.30 | 1.91 |
| ga | 1.61 | 1.82 | 2.07 | 1.86 | 2.35 | 1.96 |
| mmse | 1.82 | 1.78 | 2.02 | 2.01 | 2.43 | 1.99 |
| klt | 1.89 | 1.71 | 1.87 | 1.97 | 2.07 | 1.78 |
| Proposed method | 1.83 | 1.81 | 2.03 | 1.95 | 2.33 | 1.90 |

Observation: the proposed method is comparable to the best of the other methods

Slide24

Discussion
- FFT length N = 512 & higher: indistinguishable outputs
- Processing: magnitude spectral subtraction with 3-point 4-stage CMBNE, analysis-synthesis with 30 ms window length & 50% overlap
- Informal listening: significant enhancement for all noises at different SNRs
- Spectral subtraction parameters: β = 0.01 appropriate for all the cases, α in 2 – 2.5 for VHSES and in 1.2 – 1.4 for NOIZEUS speech material
- SNR advantage: 4 – 13 dB for VHSES & 2 – 7 dB for NOIZEUS speech material

| Noise type | VHSES: SNR advantage | VHSES: optimal α | NOIZEUS: SNR advantage | NOIZEUS: optimal α |
|---|---|---|---|---|
| white | 13 dB | 2.0 | 7 dB | 1.4 |
| babble | 4 dB | 2.0 | 2 dB | 1.2 |
| street | 5 dB | 2.5 | 4 dB | 1.4 |
| pink | 11 dB | 2.0 | 7 dB | 1.4 |
| train | 7 dB | 2.0 | 5 dB | 1.4 |
| car | 6 dB | 2.0 | 5 dB | 1.4 |

Slide25

4. Implementation for Real-time Processing

16-bit fixed-point DSP: TI/TMS320C5515
- 16 MB memory space: 320 KB on-chip RAM with 64 KB dual-access RAM, 128 KB on-chip ROM
- Three 32-bit programmable timers, 4 DMA controllers each with 4 channels
- FFT hardware accelerator (8 to 1024-point FFT)
- Max. clock speed: 120 MHz

DSP board: eZdsp
- 4 MB on-board NOR flash for user program
- Codec TLV320AIC3204: stereo ADC & DAC, 16/20/24/32-bit quantization, 8 – 192 kHz sampling

Development environment for C: TI's 'CCStudio, ver. 4.0'

Slide26

Implementation
- One codec channel (ADC and DAC) with 16-bit quantization
- Sampling frequency: 10 kHz
- Window length of 30 ms (L = 300) with 50% overlap, FFT length N = 512
- Storage of input samples, spectral values, processed samples: 16-bit real & 16-bit imaginary parts
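The framing implied by these parameters (L = 300, 50% overlap) can be illustrated with a floating-point overlap-add sketch. The slide does not name the analysis window, so the periodic Hann window used here is an assumption, chosen because its half-shifted copies sum to one. The function names are hypothetical, and the `process` hook stands in for the zero-padding to N = 512, FFT, spectral subtraction, and inverse FFT, which are omitted. The actual implementation is in fixed-point C on the DSP and is not shown in the slides.

```python
import math

L = 300      # window length: 30 ms at 10 kHz (from the slide)
S = L // 2   # frame shift for 50% overlap


def hann(length):
    """Periodic Hann window; copies shifted by length/2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def analysis_synthesis(signal, process=lambda frame: frame):
    """Window, process, and overlap-add frames of length L with shift S."""
    window = hann(L)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - L + 1, S):
        frame = [signal[start + n] * window[n] for n in range(L)]
        frame = process(frame)  # spectral modification would happen here
        for n in range(L):
            out[start + n] += frame[n]
    return out
```

With the identity `process`, the output matches the input everywhere except the first and last half-frames, where only one window contributes.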

Slide27

Data transfers and buffering operations (S = L/2)

DMA cyclic buffers (each block with S samples)
- 3-block input buffer
- 2-block output buffer

Pointers (incremented cyclically on DMA interrupt)
- current input block
- just-filled input block
- current output block
- write-to output block

Signal delay
- Algorithmic: 1 frame (30 ms)
- Computational ≤ frame shift (15 ms)
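The pointer scheme above can be modelled as four block indices that advance together on every DMA interrupt. The sketch below is one plausible reading of the slide, not the project's code. The class name `BlockPointers`, the initial index values, and the rule that a frame consists of the previous and the just-filled input blocks are all assumptions.

```python
class BlockPointers:
    """Cyclic block indices for a 3-block input and a 2-block output buffer."""

    def __init__(self):
        self.current_in = 0    # input block the DMA is filling now
        self.just_filled = 2   # input block completed most recently
        self.current_out = 0   # output block the DMA is sending to the DAC
        self.write_to = 1      # output block the processor may write next

    def on_dma_interrupt(self):
        """Advance every pointer by one block, wrapping at the buffer size."""
        self.current_in = (self.current_in + 1) % 3
        self.just_filled = (self.just_filled + 1) % 3
        self.current_out = (self.current_out + 1) % 2
        self.write_to = (self.write_to + 1) % 2

    def frame_blocks(self):
        """Input blocks forming one L-sample frame: previous and just-filled."""
        return ((self.just_filled - 1) % 3, self.just_filled)
```

Under this reading, three input blocks are enough because the two blocks being processed never coincide with the block the DMA is filling. Likewise, two output blocks keep the block being written separate from the block being played.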

Slide28

Results

Figure: PESQ score vs SNR for noisy and enhanced speech using offline and real-time processing. Panels: (a) speech: VHSES, (b) speech: NOIZEUS
- Offline proc. improvement: 0.57 – 0.80 for VHSES & 0.28 – 0.44 for NOIZEUS
- Real-time proc. improvement: 0.39 – 0.71 for VHSES & 0.22 – 0.32 for NOIZEUS

Slide29

Example of processing: -/a/-/i/-/u/– “aayiye aap kaa naam kyaa hai?” – “Where were you a year ago?”, with white noise at 3 dB SNR

Figure panels: (a) clean speech, (b) noisy speech, (c) offline processed, (d) real-time processed

Slide30

Comparison of enhanced speech between offline and real-time processing. Speech: VHSES, SNR: 0 dB

| Noise type | Unproc. | Offline proc. | Real-time proc. |
|---|---|---|---|
| white | 1.54 | 2.14 | 2.10 |
| babble | 1.73 | 1.93 | 1.87 |
| street | 1.78 | 2.16 | 1.92 |
| pink | 1.59 | 2.31 | 2.20 |
| train | 2.00 | 2.63 | 2.45 |
| car | 1.67 | 2.15 | 2.09 |

- Real-time processing tested using white, babble, car, pink, train noises: real-time processed output perceptually similar to the offline processed output
- Signal delay = 48 ms
- Lowest clock for satisfactory operation = 16.4 MHz
- Processing capacity used ≈ 1/7 of the capacity with highest clock (120 MHz)
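The capacity and delay figures can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the slides, and the function names are hypothetical. The delay bound adds the algorithmic delay (one 30 ms frame) to the largest allowed computational delay (one 15 ms frame shift).

```python
def capacity_fraction(min_clock_mhz=16.4, max_clock_mhz=120.0):
    """Fraction of the processor's peak clock needed for real-time operation."""
    return min_clock_mhz / max_clock_mhz


def delay_bound_ms(window_ms=30.0, shift_ms=15.0):
    """Algorithmic delay plus the largest allowed computational delay."""
    return window_ms + shift_ms
```

The fraction comes to roughly 0.137, or about 1/7.3, in line with the stated one-seventh. The bound is 45 ms, slightly below the reported 48 ms signal delay. The slides do not itemize the remaining few milliseconds.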

Slide31

5. Summary & Conclusions
- Investigation & implementation of spectral subtraction for real-time operation: magnitude spectrum subtraction and resynthesis using noisy phase, along with cascaded-median based dynamic noise estimation for reducing computation and memory requirement
- Enhancement of speech with different types of additive stationary and non-stationary noise: SNR advantage: 4 – 13 dB for VHSES & 2 – 7 dB for NOIZEUS
- Implementation for real-time operation using 16-bit fixed-point processor TI/TMS320C5515: implementation with 10 kHz sampling using 1/7 of processing capacity, signal delay = 48 ms

Further work
- Frequency & a posteriori SNR-dependent subtraction & spectral floor factors
- Combination of speech enhancement technique with other processing techniques in the sensory aids
- Implementation using other processors
- Subjective evaluation of intelligibility and quality of enhanced speech

Slide32

Thank You

Slide33

Abstract

Sensorineural loss is generally associated with increased spectral masking due to widened auditory filters, and listeners with this kind of hearing impairment often experience great difficulty when the speech is contaminated by noise. This thesis presents investigations for real-time enhancement of noisy speech using spectral subtraction for suppressing the external noise. Investigation using offline processing for enhancing the noisy speech with different types of noise and SNR values is carried out to select the optimal set of steps and parameters for real-time processing. PESQ score is used for objective comparison of quality of the enhanced speech. Results show that median-based noise estimation is effective in estimating noise from noisy speech without a voice activity detector, for a wide variety of stationary and non-stationary noises and range of SNR values, and that a cascaded median can be used as an approximation to the median for significantly reducing the computation and memory requirement, without adversely affecting the noise estimation. Speech enhancement using magnitude spectrum subtraction with 3-point 4-stage cascaded median for noise estimation and resynthesis using noisy phase resulted in improvements in PESQ scores in the range 0.28 – 0.44 for speech material from NOIZEUS database with added white noise. Resynthesis using phase estimated from the enhanced magnitude spectrum did not result in any further improvement in the scores.

The processing technique is implemented and tested for satisfactory operation, with sampling frequency of 10 kHz and a 30 ms analysis window with 50% overlap, using a DSP board based on the 16-bit fixed-point DSP processor TMS320C5515 with on-chip FFT hardware. The implementation uses data transfer and buffering operations devised for an efficient realization of analysis-synthesis, and codec and DMA for acquisition of the input signal and outputting of the processed output signal. The real-time operation is achieved with signal delay of approximately 48 ms and using about one-seventh of the computing capacity of the processor.

Slide34

References

[1] H. Levitt, J. M. Pickett, and R. A. Houde, Eds., Sensory Aids for the Hearing Impaired. New York: IEEE Press, 1980, pp. 3–10.

[2] J. M. Pickett, The Acoustics of Speech Communication: Fundamentals, Speech Perception Theory, and Technology. Boston, Mass.: Allyn Bacon, 1999, pp. 289–323.

[3] H. Dillon, Hearing Aids. New York: Thieme Medical, 2001.

[4] B. C. J. Moore, An Introduction to the Psychology of Hearing. London, UK: Academic, 1997, pp. 66–107.

[5] T. Lunner, S. Arlinger, and J. Hellgren, “8-channel digital filter bank for hearing aid use: preliminary results in monaural, diotic, and dichotic modes,” Scand. Audiol. Suppl., vol. 38, pp. 75–81, 1993.

[6] P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti, “Binaural dichotic presentation to reduce the effects of spectral masking in moderate bilateral sensorineural hearing loss,” Int. J. Audiol., vol. 51, no. 4, pp. 334–344, 2012.

[7] J. Yang, F. Luo, and A. Nehorai, “Spectral contrast enhancement: Algorithms and comparisons,” Speech Commun., vol. 39, no. 1–2, pp. 33–46, 2003.

[8] T. Arai, K. Yasu, and N. Hodoshima, “Effective speech processing for various impaired listeners,” in Proc. 18th Int. Cong. Acoust. (ICA 2004), Kyoto, Japan, 2004, pp. 1389–1392.

[9] P. N. Kulkarni, P. C. Pandey, and D. S. Jangamashetti, “Multi-band frequency compression for improving speech perception by listeners with moderate sensorineural hearing loss,” Speech Commun., vol. 54, no. 3, pp. 341–350, 2012.

[10] P. C. Loizou, Speech Enhancement: Theory and Practice. New York: CRC, 2007.

[11] S. F. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Trans. Acoust., Speech, Signal Process., vol. 27, no. 2, pp. 113–120, 1979.

[12] M. Berouti, R. Schwartz, and J. Makhoul, “Enhancement of speech corrupted by acoustic noise,” in Proc. IEEE ICASSP, 1979, Washington, DC, pp. 208–211.

[13] S. Kamath and P. Loizou, “A multi-band spectral subtraction method for enhancing speech corrupted by colored noise,” in Proc. IEEE ICASSP, 2002, Orlando, Florida, vol. 4, pp. IV–4164.

[14] Y. Lu and P. C. Loizou, “A geometric approach to spectral subtraction,” Speech Commun., vol. 50, no. 6, pp. 453–466, 2008.

Slide35

[15] K. Paliwal, K. Wojcicki, and B. Schwerin, “Single-channel speech enhancement using spectral subtraction in the short-time modulation domain,” Speech Commun., vol. 52, no. 5, pp. 450–475, 2010.

[16] R. Martin, “Spectral subtraction based on minimum statistics,” in Proc. Eur. Signal Process. Conf., 1994, pp. 1182–1185.

[17] I. Cohen, “Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging,” IEEE Trans. Speech Audio Process., vol. 11, no. 5, pp. 466–475, 2003.

[18] H. Hirsch and C. Ehrlicher, “Noise estimation techniques for robust speech recognition,” in Proc. IEEE ICASSP, 1995, Detroit, MI, pp. 153–156.

[19] V. Stahl, A. Fisher, and R. Bipus, “Quantile based noise estimation for spectral subtraction and Wiener filtering,” in Proc. IEEE ICASSP, 2000, Istanbul, Turkey, pp. 1875–1878.

[20] G. Doblinger, “Computationally efficient speech enhancement by spectral minima tracking in subbands,” in Proc. 4th Eur. Conf. Speech Commun. and Technology (EUROSPEECH’95), Madrid, Spain, 1995, pp. 1513–1516.

[21] L. Lin, W. H. Holmes, and E. Ambikairajah, “Adaptive noise estimation algorithm for speech enhancement,” Electronics Letters, vol. 39, no. 9, pp. 754–755, 2003.

[22] C. Ris and S. Dupont, “Assessing local noise level estimation methods: application to noise robust ASR,” Speech Commun., vol. 34, no. 1–2, pp. 141–158, 2001.

[23] S. K. Basha and P. C. Pandey, “Real-time enhancement of electrolaryngeal speech by spectral subtraction,” in Proc. Nat. Conf. on Commun. 2012 (NCC 2012), Kharagpur, India, 2012, pp. 516–520.

[24] S. K. Waddi, P. C. Pandey, and N. Tiwari, “Speech enhancement using spectral subtraction and cascaded-median based noise estimation for hearing impaired listeners,” in Proc. Nat. Conf. Commun. (NCC 2013), Delhi, India, 2013, paper no. 1569696063.

[25] ITU, “Perceptual evaluation of speech quality (PESQ): an objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs,” ITU-T Rec. P.862, 2001.

[26] Y. Hu and P. C. Loizou, “Subjective evaluation and comparison of speech enhancement algorithms,” Speech Commun., vol. 49, pp. 588–601, 2007.

[27] T. F. Quatieri and A. V. Oppenheim, “Iterative techniques for minimum phase signal reconstruction from phase or magnitude,” IEEE Trans. Acoust., Speech, Signal Process., vol. 29, no. 6, pp. 1187–1193, 1981.

[28] S. H. Nawab, T. F. Quatieri, and J. S. Lim, “Signal-reconstruction from short time Fourier transform magnitude,” IEEE Trans. Acoust., Speech, Signal Process., vol. 31, no. 4, pp. 986–998, 1983.

Slide36

[29] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals. Englewood Cliffs, New Jersey: Prentice Hall, 1978, pp. 356–362.

[30] D. W. Griffin and J. S. Lim, “Signal estimation from modified short-time Fourier transform,” IEEE Trans. Acoust., Speech, Signal Process., vol. 32, no. 2, pp. 236–243, 1984.

[31] Spectrum Digital, Inc. (2010) TMS320C5515 eZdsp USB Stick Technical Reference. [online]. Available: support.spectrumdigital.com/boards/usbstk5515/reva/files/usbstk5515_TechRef_RevA.pdf

[32] Texas Instruments, Inc. (2011) TMS320C5515 Fixed-Point Digital Signal Processor. [online]. Available: focus.ti.com/lit/ds/symlink/tms320c5515.pdf

[33] Texas Instruments, Inc. (2008) TLV320AIC3204 Ultra Low Power Stereo Audio Codec. [online]. Available: focus.ti.com/lit/ds/symlink/tlv320aic3204.pdf