
Slide1

Anomaly detection for ECAL DQM

Nabarun Dev¹, Colin Jessop¹, Nancy Marinelli¹, Maurizio Pierini²

¹University of Notre Dame, ²CERN

DQM-ML meeting, 08/11/2017

Slide2

Outline

Introduction

Dataset

Models

Preliminary results

Slide3

Introduction

The DQM (data quality monitoring) system is an important tool for ensuring high-quality data-taking for analysis purposes. It is used both online and offline.

In real time, data quality is currently assessed by looking at a dashboard containing a set of histograms, which are compared to a reference set of histograms according to a certain set of instructions.

By spotting anomalies in these images (plots/histograms) it is possible to identify problems that appear in the detector, flag poor-quality data, and/or take steps towards fixing these issues.

Slide4

ECAL DQM

The ECAL DQM consists of a set of several histograms that help us monitor the running condition of the ECAL subdetector (DQM GUI shown below).

The histograms are usually redrawn every lumisection. By monitoring these, the shifter is able to spot problems in real time.

Slide5

Avenues of improvements

Although the DQM system has been performing well over the years, there are areas which can potentially be improved:

The number of plots to monitor can be overwhelmingly large, which can delay the spotting of a problem or cause a transient problem to be overlooked.

A lot of manpower is needed to constantly monitor these systems during data-taking.

Monitoring decisions can vary from shifter to shifter.

Machine learning techniques can help reduce these issues. The aim of this project is to study the feasibility of automating the ECAL DQM system. This work falls under the umbrella of the ML-for-DQM effort for the entire CMS detector.

Slide6

DISCLAIMER: WORK IN PROGRESS. The following is very much a work in progress, and all models and strategies discussed are preliminary. Kindly chime in with suggestions.

Slide7

Current Strategy

Assume that normal instances occur much more frequently than anomalous instances.

Use only normal instances (semi-supervised learning) to train an autoencoder (input is mapped to output; the system learns to reconstruct the input with minimum loss) [1].

The reconstruction loss then also serves as a metric (see the sketch after the references):

Good instances should be reconstructed with low loss.

Bad instances should be reconstructed with higher loss.

[1]: http://ufldl.stanford.edu/tutorial/unsupervised/Autoencoders/ and http://www.deeplearningbook.org/contents/autoencoders.html
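To make the strategy concrete, here is a minimal sketch of loss-as-metric scoring, assuming the model reconstructs normalized images in [0, 1] and using per-pixel binary cross-entropy; the quantile-based threshold rule is an illustrative choice, not something specified in the slides.

```python
import numpy as np

def reconstruction_loss(model, images, eps=1e-7):
    """Per-sample binary cross-entropy between input and reconstruction."""
    recon = np.clip(model.predict(images), eps, 1.0 - eps)
    bce = -(images * np.log(recon) + (1.0 - images) * np.log(1.0 - recon))
    # Average over pixels, keeping one loss value per sample.
    return bce.reshape(len(images), -1).mean(axis=1)

def flag_anomalies(losses, good_losses, quantile=0.999):
    """Flag samples whose loss exceeds a high quantile of the loss
    distribution observed on known-good data (assumed threshold rule)."""
    return losses > np.quantile(good_losses, quantile)
```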

Slide8

Dataset

The current dataset consists of ~40,000 samples from 2016 data, taken from lumisections marked as good (2016 golden JSON, CMSSW_9_2_11). [Thanks to Tanmay & Michael from the ECAL DQM team.]

The dataset (SingleElectron/RAW) is processed to emulate the online DQM running conditions and produce one sample (set of images) per lumisection. (Most of the RAW data required for this has been moved to TAPE; only 40k images could be acquired. The plan is to add more images using 2017 data in the near future.)

The current image set per sample consists of rechit occupancy and timing plots: one for the barrel and one for each of the two endcaps.

Slide9

Dataset

The following study uses only the rechit occupancy images for the barrel.

PREPROCESSING: The only preprocessing applied was to normalize the histograms to their integral.

The holes in the plot are usually due to permanently masked channels/towers, and the network can be expected to learn that they are OK.
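A minimal sketch of this preprocessing step, assuming each occupancy histogram is available as a 2D numpy array; the guard against empty maps is an added assumption.

```python
import numpy as np

def normalize_to_integral(occupancy_map):
    """Scale a 2D occupancy histogram so its entries sum to 1."""
    total = occupancy_map.sum()
    if total > 0:  # skip empty maps (assumption; slides don't cover this case)
        return occupancy_map / total
    return occupancy_map
```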

Slide10

Model 0

Framework used: Keras library with TensorFlow backend.

Autoencoder with convolutional layers:

Conv2D (8 channels, (3x3) patches)

MaxPooling2D (2, 2)

Conv2D (8, (3x3))

MaxPooling2D (5, 5)

Conv2D (8, (3x3))

UpSampling2D (5, 5)

Conv2D (8, (3x3))

UpSampling2D (2, 2)

Conv2D (8, (3x3))

All Conv layers use 'same' padding to keep the output size the same as the input.
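A sketch of this architecture in Keras. The 170x360 input (ECAL barrel crystal granularity), the ReLU activations, and the final one-channel sigmoid reconstruction layer are assumptions; the slide lists only the 8-channel Conv2D and pooling/upsampling layers.

```python
from tensorflow.keras import layers, models

def build_model0(input_shape=(170, 360, 1)):
    """Convolutional autoencoder following the layer list above."""
    return models.Sequential([
        layers.Conv2D(8, (3, 3), padding="same", activation="relu",
                      input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),   # 170x360 -> 85x180
        layers.Conv2D(8, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((5, 5)),   # 85x180 -> 17x36
        layers.Conv2D(8, (3, 3), padding="same", activation="relu"),
        layers.UpSampling2D((5, 5)),   # 17x36 -> 85x180
        layers.Conv2D(8, (3, 3), padding="same", activation="relu"),
        layers.UpSampling2D((2, 2)),   # 85x180 -> 170x360
        layers.Conv2D(8, (3, 3), padding="same", activation="relu"),
        # Reconstruction layer (assumed): one output channel with sigmoid,
        # matching the binary cross-entropy loss used for training.
        layers.Conv2D(1, (3, 3), padding="same", activation="sigmoid"),
    ])
```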

Slide11

Training performance

Trained on GPU with a batch size of 30.

60:20:20 split of train:test:validation.

Patience = 5 (number of epochs to wait, during which the validation loss does not decrease by a minimum threshold (0.05%), before stopping training).

Optimizer: gradient descent (learning rate 0.01). Loss function: binary cross-entropy.
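A minimal sketch of this training setup, assuming x_train and x_val come from the 60:20:20 split; interpreting the 0.05% improvement threshold as EarlyStopping's min_delta, and the epoch cap, are assumptions.

```python
from tensorflow.keras import callbacks, optimizers

model = build_model0()
model.compile(optimizer=optimizers.SGD(learning_rate=0.01),
              loss="binary_crossentropy")

# Stop once val_loss fails to improve by the minimum threshold for 5 epochs.
early_stop = callbacks.EarlyStopping(monitor="val_loss",
                                     min_delta=5e-4,  # "0.05%" (interpretation assumed)
                                     patience=5,
                                     restore_best_weights=True)

# Autoencoder training: the target is the input itself.
history = model.fit(x_train, x_train,
                    batch_size=30,
                    epochs=100,  # cap is an assumption
                    validation_data=(x_val, x_val),
                    callbacks=[early_stop])
```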

Slide12

Loss as metric

Evaluate the trained model over the training and test sets (each sample) and histogram the reconstruction loss.

Training and test sets show similar performance.
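Using the reconstruction_loss helper sketched earlier, the comparison could look as follows; the binning and density normalization are illustrative choices.

```python
import matplotlib.pyplot as plt

train_losses = reconstruction_loss(model, x_train)
test_losses = reconstruction_loss(model, x_test)

# Overlay the per-sample loss distributions for the train and test sets.
plt.hist(train_losses, bins=50, alpha=0.5, density=True, label="train")
plt.hist(test_losses, bins=50, alpha=0.5, density=True, label="test")
plt.xlabel("reconstruction loss (binary cross-entropy)")
plt.legend()
plt.show()
```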

Slide13

Choosing a better optimizer

Tried several other optimizers available within the Keras library: Adam, Adam with Nesterov momentum (Nadam), RMSprop, Adadelta, etc.

Chose the Adam optimizer based on validation loss; RMSprop has similar performance.

Training and test sets have similar performance.
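A sketch of how such an optimizer scan could be set up, assuming the build_model0 helper from above and default optimizer settings; the slide does not specify the exact configurations that were tried.

```python
from tensorflow.keras import callbacks, optimizers

candidates = {
    "adam": optimizers.Adam,
    "nadam": optimizers.Nadam,       # Adam with Nesterov momentum
    "rmsprop": optimizers.RMSprop,
    "adadelta": optimizers.Adadelta,
}

best_val_loss = {}
for name, opt_cls in candidates.items():
    model = build_model0()
    model.compile(optimizer=opt_cls(), loss="binary_crossentropy")
    early_stop = callbacks.EarlyStopping(monitor="val_loss",
                                         min_delta=5e-4, patience=5)
    history = model.fit(x_train, x_train, batch_size=30, epochs=100,
                        validation_data=(x_val, x_val),
                        callbacks=[early_stop], verbose=0)
    best_val_loss[name] = min(history.history["val_loss"])

# Rank optimizers by their best validation loss.
print(sorted(best_val_loss.items(), key=lambda kv: kv[1]))
```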

Slide14

Adding more layers: model 1

Had to decrease the batch size to 20; was hitting a GPU memory bottleneck (I think).

No decrease in loss; trains faster, probably due to the smaller batch size.

Slide15

Even more layers, less pooling: model 2

The final train/val loss is worse with respect to model version 1 with the same patience. Increasing the patience (from 5 to 20) helps decrease the loss to a value similar to model v1.

[Plots: training curves with patience = 5 vs. patience = 20]

THIS IS WHERE I AM AT RIGHT NOW

Slide16

Next Steps

Gather some anomalous examples and evaluate the model on them; compare their loss spectrum to that of normal examples.

Increase the training set size.

Try more sophisticated networks: bigger autoencoders with sparsity constraints, etc.

Use other images besides occupancy (e.g. timing) as input.

Try other (supervised) learning techniques (e.g. SVMs) and compare performance.

Slide17


BACK UP
