PHMM Applications

Uploaded On 2017-11-30


Presentation Transcript

Slide1

PHMM Applications

Mark Stamp

Slide2

Applications

We consider 2 applications of PHMMs to problems in information security
Masquerade detection
Malware detection
Both show some strengths of PHMMs
Both are somewhat unique
PHMMs not always a first choice…

Slide3

PHMM for Masquerade Detection

Lin Huang
Mark Stamp

Slide4

Masquerader?

Masquerader makes unauthorized use of another user’s account
Masquerader tries to evade detection by pretending to be the other user
Can we detect masquerader?
Intrusion Detection System (IDS)
We consider special case where such an IDS is based on UNIX commands

Slide5

Schonlau Dataset

Collection of UNIX commands, 50 users
5k training commands per user, plus…
…10k “attack” commands per user
Also, a key to tell which blocks are attack and which belong to same user
Nominally, 100 blocks, 100 commands each
No real session start/end info provided
This could be an issue
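The block layout described on this slide can be sketched as follows. This is a minimal illustration, assuming one user's commands are already loaded into a Python list; the function name `split_user_commands` and the toy data are hypothetical and are not part of the Schonlau distribution.

```python
def split_user_commands(commands, n_train=5000, block_size=100):
    """Split one user's command list into training data and fixed-size test blocks."""
    train = commands[:n_train]
    rest = commands[n_train:]
    # Keep only complete blocks; a trailing partial block is dropped.
    blocks = [rest[i:i + block_size]
              for i in range(0, len(rest) - block_size + 1, block_size)]
    return train, blocks


# Toy stand-in for one user's 15k commands (hypothetical data).
toy = ["cmd%d" % (i % 7) for i in range(15000)]
train, blocks = split_user_commands(toy)
print(len(train), len(blocks), len(blocks[0]))
```

Note that the blocks are cut at fixed offsets, not at session boundaries, which is exactly the limitation the slide points out.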

Slide6

Previous Work

Lots of papers use “Schonlau dataset”
Types of methods that have been used
Information theoretic
Text mining
Hidden Markov Model
Naïve Bayes
Sequences and bioinformatics
SVM, and Other

Slide7

Information Theoretic

Schonlau originally used compression-based scheme
The theory is that commands by same user should compress more
By subsequent standards, poor results
Some other similar work, but…
…no strong results based on compression
Compression for malware detection?
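The compression idea can be sketched roughly as below. This is only an illustration of the principle, not the scheme Schonlau used: it assumes `zlib` as the compressor and scores a block by how many extra compressed bytes it costs when appended to the user's training data. The name `compression_score` is hypothetical.

```python
import zlib


def compression_score(train, block):
    """Extra compressed bytes needed to encode `block` after `train`.

    A low score means the block compresses well together with the
    training data, i.e., it looks like the same user.
    """
    base = len(zlib.compress(" ".join(train).encode(), 9))
    both = len(zlib.compress(" ".join(train + block).encode(), 9))
    return both - base


train = ["ls", "cd", "cat", "vi"] * 200
same_user = ["ls", "cd", "cat", "vi"] * 25
other_user = ["tool%03d" % (i * 37 % 1000) for i in range(100)]
print(compression_score(train, same_user), compression_score(train, other_user))
```

A threshold on this score would then separate user blocks from attack blocks; where to set it is left open here.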

Slide8

Text Mining

Look for repetitive sequences
Can be used to detect particular user
Almost like a signature
PCA has also been used here
Repetitive sequences, i.e., patterns
PCA can find such structure
Training cost considered high
Other ways to do “text mining”?

Slide9

Hidden Markov Model

Need we say more?
HMM is one of the most popular detection strategies in this field
Results are good
Serves as benchmark in many (most) studies of other techniques
We implement HMM detector and compare to PHMM
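For reference, scoring a block of commands against a trained HMM is usually done with the forward algorithm. The sketch below is a generic scaled forward pass in plain Python, not the detector implemented for these experiments. It assumes commands are already mapped to integer symbols and that the observation sequence has nonzero probability under the model; the emission matrix is called `B` here as an assumption.

```python
import math


def hmm_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: return log P(obs | model).

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability that state i emits symbol k
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        # Rescale at every step to avoid underflow on long sequences.
        scale = sum(alpha)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
print(hmm_log_likelihood([0, 1, 0], pi, A, B))
```

Dividing the result by the block length gives a per-command score that can be compared across blocks of different lengths.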

Slide10

Naïve Bayes

Naïve Bayes (NB) relies on frequencies
No sequential info used
Very simple
Efficient training & scoring
Discuss naïve Bayes in later chapter
Close connection between HMM and NB
So, not too surprising that this works
But, surprising that it works so well
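A frequency-only score of this kind might look like the sketch below. It is a minimal illustration, not the method from any particular paper: the smoothing constant `alpha` and the function names are assumptions, and every command in a scored block must appear in `vocab`.

```python
import math
from collections import Counter


def nb_train(train_cmds, vocab, alpha=0.01):
    """Estimate smoothed per-command probabilities from training data."""
    counts = Counter(train_cmds)
    total = len(train_cmds) + alpha * len(vocab)
    return {c: (counts[c] + alpha) / total for c in vocab}


def nb_score(block, probs):
    """Log-probability of a block, treating commands as independent."""
    return sum(math.log(probs[c]) for c in block)


vocab = ["ls", "cd", "vi", "gcc"]
probs = nb_train(["ls"] * 50 + ["cd"] * 50, vocab)
print(nb_score(["ls", "cd"], probs), nb_score(["vi", "gcc"], probs))
```

Because the score is a sum over commands, reordering the block does not change it, which is the sense in which no sequential information is used.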

Slide11

Sequences and Bioinformatics

n-gram approaches very popular
Like HMM, also used as benchmark
Sequence alignment has been used
Based on Smith-Waterman algorithm
Like constructing MSA in PHMM
Closest previous work to PHMM
We’ll compare our PHMM results to both n-gram and HMM
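As a reminder of what Smith-Waterman computes, here is a minimal local-alignment scorer over command sequences. The scoring values (`match=2`, `mismatch=-1`, `gap=-1`) and the linear gap penalty are assumptions chosen for illustration; the previous work mentioned above may use different parameters.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


print(smith_waterman_score(["ls", "cd", "vi", "make"],
                           ["ls", "cd", "pwd", "vi", "make"]))
```

The gap handling is what ties this to the MSA step of a PHMM: an inserted command costs one gap penalty and does not break the alignment.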

Slide12

Support Vector Machines

Several previous studies use SVM
SVM has nice geometric interpretation
SVMs very popular in machine learning
For masquerade detection, SVM results are about same as NB
Claimed that SVM is more efficient, as compared to naïve Bayes
But, naïve Bayes is very efficient

Slide13

Other

Frequent and/or infrequent commands
Neither seems to perform well
“Hybrid Bayes one step Markov” and “hybrid multistep Markov”
Nice names, but not so good results
“Non-negative matrix factorization”
Good results
Ensemble (combination) approaches
Seem to offer slight improvement

Slide14

Experimental Results

Again, we compare HMM and n-grams to several PHMM models
All are tested on Schonlau dataset
Then we generate a simulated dataset
All tested again on simulated data
Why simulated data?
Schonlau data has limitations wrt PHMM
This will be explained later

Slide15

HMM & n-Gram ROC Curves

First, compare HMM and n-grams

Slide16

HMM and n-Gram AUC

For ROC curves on previous slide…
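The AUC values themselves are not reproduced in this transcript, but the statistic can be computed directly from scores. The sketch below uses the standard pairwise interpretation of AUC; it assumes that a higher score means "more like the legitimate user", and the function name `auc` is chosen for illustration.

```python
def auc(user_scores, attack_scores):
    """Probability that a random user block outscores a random attack block.

    Ties count as half a win, which matches the area under the ROC curve.
    """
    wins = 0.0
    for u in user_scores:
        for a in attack_scores:
            if u > a:
                wins += 1.0
            elif u == a:
                wins += 0.5
    return wins / (len(user_scores) * len(attack_scores))


print(auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.1]))
```

An AUC of 1.0 means perfect separation, while 0.5 is no better than guessing.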

Slide17

Training PHMM

How many sequences to use?
More sequences, better for E matrix…
…but worse for gaps
Length of each sequence?
For Schonlau dataset, we have 5k training commands per user
Where to begin/end sequences?
No good answers for Schonlau dataset

Slide18

PHMM Sequences

Note that all 5k commands used in each case
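One simple way to use all 5k commands for any choice of sequence count is to cut the training data into consecutive, equal-length pieces. The sketch below assumes that convention, which the slide does not confirm, and the function name `split_into_sequences` is hypothetical.

```python
def split_into_sequences(commands, n_seqs):
    """Cut a command list into n_seqs consecutive sequences of equal length.

    If len(commands) is not a multiple of n_seqs, the leftover commands
    at the end are dropped.
    """
    length = len(commands) // n_seqs
    return [commands[i * length:(i + 1) * length] for i in range(n_seqs)]


training = ["cmd%d" % (i % 11) for i in range(5000)]  # toy stand-in data
for n in (5, 10, 20):
    seqs = split_into_sequences(training, n)
    print(n, len(seqs[0]))
```

The cut points are arbitrary here because the dataset has no session start/end information.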

Slide19

PHMM ROC Curves

ROC curves for each PHMM case
Any trend?

Slide20

PHMM AUC

AUC for each PHMM case
5, 10, and 20 sequences are best cases

Slide21

HMM, n-Gram, and PHMM

Again, for Schonlau dataset
Which method is better?

Slide22

HMM vs PHMM

HMM and PHMM give similar results on Schonlau dataset
Surprising that PHMM does so well
Why? No begin/end sequence info!
What if we had “better” sequences?
PHMM could certainly do better and maybe much, much better
But how to get a better dataset?

Slide23

Simulated Dataset

Generate Markov model for each user
Based on monograph & digraph stats
Like matrices π and A of an HMM
Now we can generate sequences
Use matrix π to select initial element
Then use matrix A to generate sequence
HMM must do well on this data (why?)
PHMM might do well, or not
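The generation procedure on this slide can be sketched as follows. This is a minimal illustration under stated assumptions: π is estimated from single-command (monograph) frequencies and A from consecutive-pair (digraph) frequencies, both stored as dictionaries. The function names and the fallback to π for a command with no observed successor are choices made here, not details taken from the slides.

```python
import random
from collections import Counter, defaultdict


def build_markov_model(commands):
    """Estimate pi (monograph stats) and A (digraph stats) from one user's commands."""
    mono = Counter(commands)
    pi = {c: n / len(commands) for c, n in mono.items()}
    digraphs = defaultdict(Counter)
    for prev, nxt in zip(commands, commands[1:]):
        digraphs[prev][nxt] += 1
    A = {p: {c: n / sum(row.values()) for c, n in row.items()}
         for p, row in digraphs.items()}
    return pi, A


def generate_sequence(pi, A, length, rng):
    """Draw the first command from pi, then each next command from A."""
    def draw(dist):
        items = list(dist.items())
        return rng.choices([c for c, _ in items],
                           weights=[w for _, w in items])[0]

    seq = [draw(pi)]
    while len(seq) < length:
        # Fall back to pi if the last command was never followed by anything.
        seq.append(draw(A.get(seq[-1], pi)))
    return seq


pi, A = build_markov_model(["ls", "cd", "ls", "vi", "ls", "cd"])
print(generate_sequence(pi, A, 10, random.Random(0)))
```

Since the data is produced by a first-order Markov process, it has exactly the structure an HMM is built to capture, which is one way to read the "(why?)" on the slide.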

Slide24

ROC Curves Simulated Data

HMM vs PHMM
Based on 5k training commands

Slide25

AUC for Simulated Data

Again, based on 5k training commands

Slide26

Real World Problem

Masquerade detection in real world
At first, we have little training data
Can’t protect user until we train a model
So, we want to train as soon as possible
Minimum training data needed to obtain a useful model?
We compare HMM and PHMM with 200, 400, and 800 training commands

Slide27

Limited Training Data

Simulated data
HMM vs PHMM
Big difference when very little training data available

Slide28

Limited Training Data

PHMM most impressive with very little data (especially wrt AUC 0.1)
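The slide does not define "AUC 0.1". Assuming it denotes the partial area under the ROC curve for false positive rates up to 0.1, it could be computed as in the sketch below. Two further assumptions are made here: a block is flagged as an attack when its score is at or below a threshold, and the area is normalized by 0.1 so that a perfect detector scores 1.0. Function names are hypothetical.

```python
def roc_points(user_scores, attack_scores):
    """ROC points (fpr, tpr), flagging a block as an attack when score <= threshold."""
    thresholds = sorted(set(user_scores) | set(attack_scores))
    points = [(0.0, 0.0)]
    for th in thresholds:
        fpr = sum(s <= th for s in user_scores) / len(user_scores)
        tpr = sum(s <= th for s in attack_scores) / len(attack_scores)
        points.append((fpr, tpr))
    return points


def partial_auc(points, max_fpr=0.1):
    """Trapezoidal area under the ROC curve for fpr in [0, max_fpr], divided by max_fpr."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Interpolate the curve at the cutoff.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area / max_fpr


pts = roc_points([0.9, 0.8, 0.7, 0.2], [0.3, 0.1, 0.05])
print(partial_auc(pts))
```

This statistic focuses on the low false-alarm region, which is the operating range that matters most for a practical IDS.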

Slide29

Limited Training Data

Same results as previous slide

Slide30

Optimal Masquerade Detection Strategy?

Obtain 200 commands, train PHMM
Use this PHMM model until a reliable set of 800+ commands is available
Then train HMM on 800+ commands
Use HMM from then on
Gives us a reliable model with limited data, and best model with more data
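The switching rule on this slide amounts to a small decision function. The sketch below only encodes the thresholds stated above (200 and 800 commands); the function name `choose_model` and the string return values are hypothetical.

```python
def choose_model(n_commands, phmm_min=200, hmm_min=800):
    """Pick which model to use, given how many reliable commands are available."""
    if n_commands >= hmm_min:
        return "HMM"
    if n_commands >= phmm_min:
        return "PHMM"
    return None  # Not enough data to train either model yet.


for n in (50, 200, 799, 800, 5000):
    print(n, choose_model(n))
```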

Slide31

Another PHMM Advantage?

PHMM might be better when attacker hijacks ongoing session
Masquerader mimics average behavior
This is what is modeled by HMM
Harder to mimic sequential behavior
As modeled by PHMM
Depends on position in the sequence
This should be investigated further

Slide32

PHMM for Malware Detection

Swapna Vemparala
Mark Stamp

Slide33

Malware Detection

In previous work, PHMM tested for metamorphic detection
Based on extracted opcodes
Results were generally not impressive
MSA has many gaps and PHMM is weak
Code transposition causes problems
And code transposition common in malware
Opcode sequence not strong wrt PHMM

Slide34

Malware Detection 2.0

Here, again apply PHMM to malware
But what to use as features?
Want feature(s) where…
Sequence/order is critical
And, difficult for malware writer to modify sequential information
What feature(s) to use?
(Static) opcodes not good in PHMM

Slide35

Software Birthmarks

Birthmark is inherent feature of code
In contrast to a watermark
We consider both static and dynamic birthmarks
Static: collected without executing
Dynamic: execution/emulation
Examples of each?
Advantages/disadvantages of each?

Slide36

This Research

Consider opcodes
Static feature, extracted by disassembly
Also consider API calls
Dynamic, use Buster Sandbox Analyzer
Compare HMM and PHMM for both
Then 3 cases for each malware family
Static and dynamic HMM
Dynamic PHMM

Slide37

Data

Malware data from Malicia Project
Benign set of 20 Windows applications

Slide38

HMM & Opcode Sequences

Scatterplots and ROC curves for Security Shield

Slide39

HMM Results

Results for all families, static and dynamic birthmarks

Slide40

PHMM

Dynamic birthmarks, i.e., API calls

Slide41

Results

Static and dynamic HMM
And dynamic PHMM

Slide42

Bottom Line

In these cases, dynamic data gives better results
API calls better than (static) opcodes
HMM does very well on API calls…
…but PHMM can do even better
Sequential info matters in API calls!
Is PHMM really worth it?

Slide43

References

Masquerade detection
L. Huang and M. Stamp, Masquerade detection using profile hidden Markov models, Computers & Security, 30(8):732-747, November 2011
Malware detection
S. Vemparala, et al, Malware detection using dynamic birthmarks, 2nd International Workshop on Security & Privacy Analytics (IWSPA 2016), co-located with ACM CODASPY 2016, March 9-11, 2016