Uploaded On 2017-06-18

Presentation Transcript

Slide1

User Manual of Mining Mouse Vocalizations

Prepared by Jesin Zakaria and Eamonn Keogh

Slide2

CREATE SPECTROGRAM

Run the code createSpectro.m to

1. create a spectrogram from a .wav file
2. idealize the spectrogram
3. extract candidate syllables from the idealized spectrogram

Try the following example. Set:

rec = '..\031611KOKO02MATED.wav';    % put the address and name of the wav file
D = '...\031611KOKO02MATEDspectro\'; % location of the folder that will contain syllables

Depending on the size of main memory and the length of the recording, set the range of the for loop. In each iteration we create the spectrogram of two minutes of the recording; this value can be changed to create the spectrogram of a longer section of the recording.
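As a rough illustration of how the range of such a loop could be set, the sketch below splits a recording into two-minute sample ranges. The names FS, totalSamples, chunkSec and ranges are hypothetical, and the 250 kHz sampling rate and seven-minute length are assumptions made for the example; none of this is taken from createSpectro.m.

```matlab
% Sketch only: all names and values here are assumptions for illustration.
FS = 250000;                  % assumed sampling rate in Hz
totalSamples = 7 * 60 * FS;   % assumed length: a seven-minute recording
chunkSec = 120;               % two minutes per iteration, as in the manual
chunkLen = chunkSec * FS;     % samples per chunk
nChunks = ceil(totalSamples / chunkLen);
ranges = zeros(nChunks, 2);   % one [first sample, last sample] row per chunk
for k = 1:nChunks
    t1 = (k - 1) * chunkLen + 1;
    t2 = min(k * chunkLen, totalSamples);   % the last chunk may be shorter
    ranges(k, :) = [t1, t2];
    % here one would read samples t1..t2 and build their spectrogram
end
disp(ranges);
```

A larger chunkSec means fewer iterations but more memory per iteration, which is the trade-off the paragraph above describes.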

RUNNING TIME: Since the running time is faster than real time, we did not include a running time analysis in our paper. For example:

1. It took on average (12.95 + 12.81 + 12.67)/3 = 12.81 seconds to create the spectrogram of a two-minute-long recording.
2. It took 85.7 seconds to extract connected components from the idealized spectrogram of a six-minute-long recording.

Slide3

CREATE SPECTROGRAM

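rec = 'C:\Users\Jesin\Desktop\temp\031611KOKO02MATED.wav';
t1 = 124000*250;   % first sample to read (124 s, assuming a 250 kHz sampling rate)
t2 = 125000*250;   % last sample to read (125 s)
[Y, FS] = audioread(rec, [t1, t2]);   % audioread replaces wavread, which newer MATLAB releases no longer provide
[y, F, T, P] = spectrogram(Y, 512, 256, 512, FS, 'yaxis');
C = -10*log10(P);
C(C < 35) = 0;     % discard values below 35
C(C > 80) = 0;     % discard values above 80
C(C ~= 0) = 1;     % binarize whatever remains
imshow(~C);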
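The thresholding step can be tried without a recording or any toolbox. The sketch below applies the same three assignments to a small made-up matrix standing in for C; the values in C0 are invented for illustration, and the variable mask is a hypothetical name for an equivalent one-line formulation.

```matlab
% Made-up stand-in for C = -10*log10(P); not data from a real recording.
C0 = [20   40   90;
      35   79   80;
      34.9 80.1 50];
C = C0;
C(C < 35) = 0;    % values below the lower threshold become background
C(C > 80) = 0;    % values above the upper threshold become background
C(C ~= 0) = 1;    % everything that survived becomes foreground
% Equivalent formulation for data like this: keep values in [35, 80].
mask = (C0 >= 35) & (C0 <= 80);
disp(C);
```

Note that both thresholds are inclusive: a value of exactly 35 or exactly 80 is kept.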

Figure 1: Use the code above to create the idealized spectrogram. [Axes: Time (second), 124 to 125; kHz, 40 to 100. Label: laboratory mice.]

Slide4

EXTRACT CANDIDATE SYLLABLES

In createSpectro.m we marked the part of the code that extracts candidate syllables.
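As a hedged illustration of the duration and frequency filter listed on this slide (duration >10 and <300, frequency range 30 to 110 kHz), the sketch below filters a table of connected-component descriptors. The matrix comp and its column layout are hypothetical, its values are made up, and treating the frequency limits as inclusive is an assumption; the marked code in createSpectro.m may do this differently.

```matlab
% Hypothetical descriptors, one row per connected component:
% [duration, lowest frequency in kHz, highest frequency in kHz]
comp = [  5  45  60;    % too short
         50  40  90;    % kept
        120  20  70;    % reaches below 30 kHz
        400  50  80;    % too long
        299  30 110];   % kept (frequency limits assumed inclusive)
dur   = comp(:, 1);
fLow  = comp(:, 2);
fHigh = comp(:, 3);
keep = dur > 10 & dur < 300 & fLow >= 30 & fHigh <= 110;
kept = comp(keep, :);
disp(kept);
```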

Results of all filtering steps are included in the extractcandidatesyllable.zip folder.

1. The folder …\031611KOKO02MATEDspectro contains all connected components with duration >10 and <300 and within the frequency range 30 to 110 kHz.
2. The folder …\031611KOKO02MATED contains all candidate syllables after filtering out some noise and excluding all but one of the syllables that appear at the same timestamp.
3. The folder …\sametime contains the syllables that were excluded for appearing at the same timestamp.

Slide5

CLASSIFY CANDIDATE SYLLABLES

Run the code classifySyllables.m

Require:

1. labelGrndTruth.txt contains labels of the ground truth
2. theta.txt contains thresholds for each class. The mean, sigma, mean+sigma and mean+2*sigma for each class of syllables in the ground truth are included in columns 1, 2, 4 and 5 of theta.txt
3. Normalized ground truth
4. Candidate syllables bitmaps
5. List of candidate syllables in sorted order

Result: For our sample example,

1. ‘dis031611KOKO02MATED.txt’ contains the distances of the candidate syllables to GroundTruth
2. ‘label 031611KOKO02MATED.txt’ contains the labels of all the candidate syllables
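To make the role of the per-class thresholds concrete, here is a minimal sketch of one plausible decision rule: label each candidate with the class of its nearest ground-truth syllable, and mark it unknown when that distance exceeds the class threshold. This rule is an assumption, not a description of classifySyllables.m. The names dist, gtLabels and theta and all numbers are hypothetical, and the GHT distance computation itself is not shown.

```matlab
% Hypothetical inputs; none of these values come from the released files.
gtLabels = [1; 1; 3];          % class of each ground-truth syllable
theta = [0.5; 0; 0.4];         % theta(c): assumed distance threshold for class c
dist = [0.2 0.6  0.9;          % dist(i, j): distance from candidate i
        0.7 0.8  0.3;          % to ground-truth syllable j
        0.9 0.95 0.8];
[dmin, idx] = min(dist, [], 2);     % nearest ground-truth syllable per candidate
labels = gtLabels(idx);             % provisional class labels
labels(dmin > theta(labels)) = 0;   % 0 marks "unknown class"
disp(labels');
```

Here the third candidate is farther than the class 3 threshold from every ground-truth syllable, so it is left unlabeled rather than forced into a class.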

If you want to see the class distribution, uncomment the code for class distribution in classifySyllables.m.

Slide6

CLASSIFY CANDIDATE SYLLABLES

Normalization method

In our paper we said that all the candidate syllables and the ground truth are normalized before computing the GHT distance between them. For brevity, however, we neither included details about our normalization method nor validated it. The next slides present the details of our normalization method.

Slide7

CLASSIFY CANDIDATE SYLLABLES

Normalization method

Set: 16 syllables of class 1, 3, 4 and 11 (non-confusing classes)

Syllables that are not clustered correctly are marked with a red circle.

GHT is calculated without normalizing the syllables.

Slide8

CLASSIFY CANDIDATE SYLLABLES

Normalization method

Set: 16 syllables of class 1, 3, 4 and 11 (non-confusing classes)

There are still some syllables that are not clustered correctly, as is evident from the following figure.

GHT is calculated after normalizing the syllables by dividing x and y by the larger dimension (row or column).

Same set of syllables after normalization

Slide9

CLASSIFY CANDIDATE SYLLABLES

Normalization method (we used in our paper)

Set: 16 syllables of class 1, 3, 4 and 11 (non-confusing classes)

All the syllables except one (marked with an arrow) are clustered correctly, as is evident from the following figure.

GHT is calculated after normalizing the syllables by dividing x and y by the size of row and column respectively.
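The two normalization variants can be compared on a toy bitmap. In the sketch below, B is a made-up 4-by-8 bitmap with two foreground pixels. Mapping x to the column index and y to the row index is an assumption, as are all variable names; the authors' code may define the coordinates differently.

```matlab
% Made-up 4-by-8 syllable bitmap with two foreground pixels.
B = zeros(4, 8);
B(2, 3) = 1;
B(4, 8) = 1;
[r, c] = find(B);            % row and column indices of foreground pixels
[nRows, nCols] = size(B);
% Variant from the previous slide: divide both coordinates by the larger dimension.
m = max(nRows, nCols);
xLarge = c / m;
yLarge = r / m;
% Variant used in the paper: divide each coordinate by its own dimension.
xNorm = c / nCols;           % assumes x runs along the columns
yNorm = r / nRows;           % assumes y runs along the rows
disp([xLarge yLarge xNorm yNorm]);
```

With per-dimension scaling both coordinates span the full range up to 1 regardless of the bitmap's aspect ratio, whereas scaling by the larger dimension leaves the shorter axis compressed.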

Same set of syllables after normalization

Slide10

CLASSIFY CANDIDATE SYLLABLES

Normalization method (we used in our paper)

Set: 16 syllables of class 1 and 27 syllables of class 9 (confusing classes)

GHT is calculated after normalizing the syllables by dividing x and y by the size of row and column respectively.

Same set of syllables after normalization

Slide11

EDITING GROUND TRUTH

[Figure: Classification Accuracy (0 to 1) against Adding more instances (0 to 700), with one curve for the edited ground truth and one for all the labeled syllables.]

Run accuracyGrndTrth.m to generate the plot. It requires:

1. editMatrix.txt
2. dis692.txt
3. label692.txt

DESCRIPTION OF THE FILES

In our paper we mentioned the 692 syllables annotated by the domain expert. Instead of using those 692 syllables as ground truth, we used a data editing technique, which resulted in a set of 108 syllables that we used as GROUNDTRUTH for our experiments.

1. editMatrix.txt contains the result of editing the 692 annotated syllables. Columns 2, 3, 4 and 5 represent the number of syllable added to the ground truth, the class label of the syllable, the total number of syllables classified using the edited ground truth, and the accuracy rate.

2. dis692.txt contains the GHT distances of the 692 annotated syllables.

3. label692.txt contains the class labels of the 692 syllables.
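A hedged sketch of how the columns described above could be turned into an accuracy curve follows. The matrix here is a made-up stand-in with the same column layout; column 1 is not described in this manual, so it is filled with zeros. The variable names and the load call mentioned in the comment are assumptions, and accuracyGrndTrth.m may work differently.

```matlab
% Made-up stand-in for editMatrix.txt; the values are invented.
% With the real file one might try: editMatrix = load('editMatrix.txt');
editMatrix = [0 1  1 100 0.40;
              0 2  3 250 0.65;
              0 3  4 400 0.80;
              0 4 11 520 0.90];
nAdded   = editMatrix(:, 2);   % column 2: number of syllable added
accuracy = editMatrix(:, 5);   % column 5: accuracy rate
plot(nAdded, accuracy, '-o');
xlabel('Adding more instances');
ylabel('Classification Accuracy');
```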

groundtruth.zip contains the set of 692 syllables and the set of 108 syllables that we mentioned in our paper.

Slide12

MOTIF DISCOVERY

Run findMotif.m to find motifs from a vocalization.

[Figure labels: 944.7 – 945.2 sec; 194.8 – 195.2 sec]

Instruction: In findMotif.m you need to change the locations of the folders that will contain the motifs, the .wav file, the list of syllables, and the labels of the syllables. Also create folders, e.g. …/motif/6 and …/motif/7, before running the code.
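The output folders can be created by hand, or with a few lines such as the sketch below. The base location baseDir and the choice of lengths 6 and 7 are placeholders for illustration; substitute the folder that findMotif.m is configured to use.

```matlab
% baseDir is a hypothetical location chosen for this example only.
baseDir = fullfile(tempdir, 'mouse_motif_demo');
motifLengths = 6:7;             % one folder per motif length
for len = motifLengths
    d = fullfile(baseDir, 'motif', num2str(len));
    if ~exist(d, 'dir')
        mkdir(d);               % also creates missing parent folders
    end
end
```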

These folders will contain motifs of length 6, 7, etc.

motif.zip contains motifs from the attached .wav file.

Slide13

Clustering mice vocalizations

Run clusterMtf.m to cluster motifs from mice vocalizations.

The folder ‘dendo_mice’ contains all the required files used to generate the dendrograms of figure 12 and figure 13.

Slide14

Similarity search / Query by content

QUERY: ddqd (‘q’ means unknown class)

Some additional results are attached here. 10 NN from four vocalizations are presented.

Slide15

Similarity search / Query by content

QUERY: qaiaiacia (‘q’ means unknown class)

Some additional results are attached here. 10 NN from four vocalizations are presented.

Slide16

Motif Significance

Run mtfSgnfnc.m to assess the significance of motifs based on their z-score.
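For readers unfamiliar with z-scores, the sketch below shows the basic computation on made-up numbers. Comparing an observed motif count against counts from shuffled label sequences is an assumption made for illustration, as are the names observed and nullCounts; this manual does not state how mtfSgnfnc.m builds its reference distribution.

```matlab
% Made-up numbers for illustration only.
observed = 12;                     % times the motif occurs in the vocalization
nullCounts = [3 5 4 6 2 4 5 3];    % assumed counts in shuffled label sequences
mu = mean(nullCounts);             % mean of the reference counts
sigma = std(nullCounts);           % sample standard deviation
z = (observed - mu) / sigma;       % standard deviations above the mean
fprintf('z-score: %.2f\n', z);
```

A large positive z indicates that the motif occurs far more often than the reference counts would suggest.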

The folder ‘../mtfSgnfcn’ contains all the required files used to generate the plot of figure 17.

Slide17

Contrast sets

createContrastset.m is used to create the contrast sets.

contratset.m is used to extract the patterns in contrast sets from a vocalization.

The folder ‘../contrastSet’ contains some examples of the contrast sets that we mentioned in our paper. It also contains the files needed by createContrastset.m.

‘contrastset.txt’ contains the list of substrings sorted in descending order of their information gain.

Slide18

Questions or comments? Email jzaka001@cs.ucr.edu