Presentation Transcript

Slide1

Polymorphic Malware Detection

Connor Schnaith, Taiyo Sogawa

9 April 2012

Slide2

Motivation

“5,000 new malware samples per day”

--David Perry of Trend Micro

Large variance between attacks

Polymorphic attacks

Perform the same function

Altered immediate values or addressing

Added extraneous instructions

Current detection methods insufficient

Signature-based matching not accurate

Behavioral-based detection requires human analysis and engineering

Slide3

Malware Families

Classified into related clusters (families)

Tracking of development

Correlating information

Identifying new variants

Based on similarity of code

Koobface

Bredolab

PoisonIvy

Conficker (7 mil. infected)

Source: Carrera, Ero, and Peter Silberman. "State of Malware: Family Ties." Media.blackhat.com. 2010. Web. 7 Apr. 2012. <https://media.blackhat.com/bh-eu-10/presentations/Carrera_Silberman/BlackHat-EU-2010-Carrera-Silberman-State-of-Malware-slides.pdf>.

Slide4

~300 samples of malware with 60% similarity threshold

Slide5

Current Research

Techniques for identifying malicious behavior

Mining and clustering

Building behavior trees

Industry

ThreatFire and Sana Security developing behavioral-based malware detection

Slide6

Design challenges

Discerning malicious portions of code

Dynamic program slicing

accounting for control flow dependencies

Reliable automation

Must be reliable without human intervention

Minimal false positives

Slide7

Holmes: Main Ideas

Two major tasks

Mining significant behaviors from a set of samples

Synthesizing an optimally discriminative specification from multiple sets of samples

Key distinction in approach

"positive" set - malicious"negative" set - benignMalware: fully described in the positive set, while not fully described in the negative setSlide8

Main Ideas: behavior mining

Extracts portions of the dependence graphs of programs from the positive set that correspond to behaviors that are significant to the programs’ intent.

The algorithm determines what behaviors are significant (next slide)

Can be thought of as contrasting the graphs of positive programs against the graphs of negative programs, and extracting the subgraphs that provide the best contrast.

Slide9

Main ideas: behavior mining

A "behavior" is a data dependence graph

G = (V, E, a, B)

V is the set of vertices that correspond to operations (system calls)

E is the edges of the graph and correspond to dependencies between operations

a is the labeling function that associates nodes with the operations they representB is the labeling function that associates the edges with the logic that represents the dependenciesSlide10
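The graph definition above can be sketched concretely. A minimal sketch follows; the dict encoding and the system-call names are illustrative choices, not taken from the paper's implementation:

```python
# Illustrative encoding of a behavior graph G = (V, E, a, B):
# nodes are labeled by system calls, edges by dependency constraints.
behavior = {
    "V": {1, 2},
    "E": {(1, 2)},
    "a": {1: "NtOpenKey", 2: "NtSetValueKey"},   # a: node -> operation
    "B": {(1, 2): "ret(1) == arg0(2)"},          # B: edge -> logic formula
}

def operations(g):
    """Set of system calls a behavior graph mentions."""
    return set(g["a"].values())

assert operations(behavior) == {"NtOpenKey", "NtSetValueKey"}
```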

Main ideas: behavior mining

A program P exhibits a behavior G if it can produce an execution trace T with the following properties:

Every operation in the behavior corresponds to an operation invocation in the trace, and its arguments satisfy certain logical constraints

The logic formula on each edge connecting behavior operations is satisfied by a corresponding pair of operation invocations in the trace

Must capture information flow in dependence graphs

Two key characteristics:

The path taken by the data in the program

Security labels assigned to the data source and the data sink

Slide11
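The matching criterion above can be sketched in Python. This is a brute-force sketch, assuming a behavior graph stored as a dict with keys "V", "E", "a", "B" and a caller-supplied `satisfies` predicate standing in for the edge logic formulas (all names are illustrative):

```python
from itertools import product

def exhibits(trace, g, satisfies):
    """Check whether an execution trace exhibits behavior graph g.

    trace: list of (call_name, args) invocations
    g: behavior graph as a dict with keys "V", "E", "a", "B"
    satisfies(formula, inv_u, inv_v): hypothetical predicate deciding
    whether a pair of invocations meets the formula on an edge
    """
    nodes = sorted(g["V"])
    # candidate invocations for each behavior node, matched by call name
    cands = [[inv for inv in trace if inv[0] == g["a"][v]] for v in nodes]
    if any(not c for c in cands):
        return False
    # try every assignment of behavior nodes to trace invocations
    for choice in product(*cands):
        binding = dict(zip(nodes, choice))
        if all(satisfies(g["B"][e], binding[e[0]], binding[e[1]])
               for e in g["E"]):
            return True
    return False
```

With a trivial `satisfies` that always accepts, a trace containing both calls matches, and a trace missing one does not.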

Security Label | Description

NameOfSelf | The name of the currently executing program

IsRegistryKeyForBootList | A Windows registry key listing software set to start on boot

IsRegistryKeyForWindows | A registry key that contains configuration settings for the operating system

IsSystemDirectory | The Windows system directory

IsRegistryKeyForBugfix | The Windows registry key containing the list of installed bugfixes and patches

IsRegistryKeyForWindowsShell | The Windows registry key controlling the shell

IsDevice | A named kernel device

IsExecutableFile | An executable file

Slide12

Main ideas: behavior mining

Information gain is used to determine whether a behavior is significant. A behavior that is not significant is ignored when constructing the dependency graph

Information gain is defined in terms of Shannon entropy: gaining information increases the accuracy of deciding whether a graph G belongs to G+ or G-

Shannon entropy

H(G+ U G-) corresponds to the uncertainty that a graph G belongs to G+ or G-

Partition G+ and G- into smaller subsets to decrease that uncertainty

Candidate subgraphs are tested via a process called subgraph isomorphism

Slide13
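The entropy and information-gain computation just described can be sketched directly. This is a generic sketch, not the paper's code; `contains` is a stand-in for the subgraph-isomorphism test:

```python
import math

def entropy(n_pos, n_neg):
    """Shannon entropy H over a mix of positive and negative graphs."""
    total = n_pos + n_neg
    h = 0.0
    for n in (n_pos, n_neg):
        if n:
            p = n / total
            h -= p * math.log2(p)
    return h

def info_gain(pos, neg, contains):
    """Information gain from splitting G+ U G- on whether each graph
    contains a candidate subgraph g (contains stands in for a
    subgraph-isomorphism check)."""
    total = len(pos) + len(neg)
    in_p = sum(1 for g in pos if contains(g))
    in_n = sum(1 for g in neg if contains(g))
    out_p, out_n = len(pos) - in_p, len(neg) - in_n
    remainder = ((in_p + in_n) / total) * entropy(in_p, in_n) \
              + ((out_p + out_n) / total) * entropy(out_p, out_n)
    return entropy(len(pos), len(neg)) - remainder
```

A subgraph present in every positive graph and no negative graph removes all uncertainty, so its gain equals the full entropy of the mixed set.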

Main ideas: behavior mining

A significant behavior g is a subgraph of a dependence graph in G+ such that:

Gain(G+ U G-, g) is maximized

Information gain is used as the quality measure to guide the behavior mining process

Some non-significant actions can get passed as significant

These actions may or may not throw off the algorithm that determines whether the program is malicious

Slide14

Main ideas: behavior mining

Significant behaviors mined from malware Ldpinch

Leaking bugfix information over the network

Adding a new entry to the system autostart list

Bypassing firewall to allow for malicious traffic

Could say any program that exhibits all three of these behaviors should be flagged malicious

This is too specific a statement

Doesn't account for variations within a family

It is known that smaller subsets of behaviors that include only one of these actions could still be malicious

Need discriminative specifications

Slide15

Main ideas: discriminative specifications

Creates clusters of behaviors that can be classified into characteristic subsets

Program matches specification if it matches all of the behaviors in a subset

"Discriminative" in that it matches the malicious but not the benign programsSlide16

Main ideas: discriminative specifications

Each subset of behaviors induces a cluster of samples

Mined malicious and benign samples are organized into these clusters

Goal: find an optimal clustering technique that organizes the malicious samples into the positive subset and the benign samples into the negative subset

Slide17

Main ideas: discriminative specifications

Three part algorithm

Formal concept analysis

Simulated annealing

Constructing optimal specifications

Formal concept analysis

O is a cluster of samples

A is the set of mined behaviors in O

A concept is the pair (A, O)

Set of concepts: {c1, c2, c3, ... , cN}

Behavior specification: S(c1, c2, c3, ... , cN)

Slide18

Main ideas: discriminative specifications

Formal Concept Analysis (continued)

Begins by constructing all concepts and computing pairwise intersections of the intent sets of these concepts

Repeated until a fixpoint is reached and no new concepts can be constructed

When the algorithm terminates, we are left with an explicit listing of all the sample clusters that can be specified in terms of one or more mined behaviors

Goal is to find {c1, c2, c3, ... , cN} such that S(c1, c2, c3, ... , cN) is optimal (based on a threshold)

Slide19
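The fixpoint construction described above can be sketched as follows. This is a generic sketch of closing intent sets under pairwise intersection, not the paper's implementation:

```python
def all_concepts(samples):
    """samples: dict sample_id -> frozenset of mined behaviors.

    Start from each sample's behavior set and close the collection of
    intents under pairwise intersection until a fixpoint is reached
    (no new intents can be constructed)."""
    intents = {frozenset(bs) for bs in samples.values()}
    changed = True
    while changed:                      # repeat until fixpoint
        changed = False
        for x in list(intents):
            for y in list(intents):
                z = x & y
                if z and z not in intents:
                    intents.add(z)
                    changed = True
    # each concept pairs an intent with the cluster of samples exhibiting it
    return {intent: frozenset(s for s, bs in samples.items() if intent <= bs)
            for intent in intents}
```

For two samples with behavior sets {a, b} and {b, c}, the intersection {b} emerges as a new intent shared by both samples.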

Main ideas: discriminative specifications

Simulated annealing

Probabilistic technique for finding an approximate solution to a global optimization problem

At each step, a candidate solution i is examined and one of its neighbors j is selected for comparison

The algorithm moves to j with some probability

A cooling parameter T is reduced throughout the process; when it reaches a minimum, the process stops

Slide20
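The loop just described can be sketched generically. This is a textbook-style sketch of simulated annealing, not the Holmes implementation; the `neighbor` and `score` callables are supplied by the caller:

```python
import math
import random

def anneal(initial, neighbor, score, t0=1.0, t_min=1e-3, cooling=0.95):
    """Generic simulated annealing sketch: examine a neighbor j of the
    current solution i, move to it with a probability depending on the
    score change and the temperature T, and cool T until it hits t_min."""
    current = best = initial
    t = t0
    while t > t_min:
        cand = neighbor(current)
        delta = score(cand) - score(current)
        # always accept improvements; accept worsenings with prob e^(delta/T)
        if delta >= 0 or random.random() < math.exp(delta / t):
            current = cand
            if score(current) > score(best):
                best = current
        t *= cooling              # reduce the cooling parameter T
    return best
```

For example, maximizing `-(x - 3) ** 2` over integers with a random +-1 neighbor tends to drive the solution toward x = 3.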

Main ideas: discriminative specifications

Constructing Optimal Specifications

Inputs: a threshold t, a set containing positive and negative samples, and a set of behaviors mined with the previous process

The algorithm, called SpecSynth:

Constructs the full set of concepts

Removes redundant concepts

Runs simulated annealing until convergence, then returns the best solution

Slide21

Holmes: Mining and Clustering

Slide22

Evaluation and Results: Holmes

Used six malware families to develop specifications

Tested final product against 19 malware families

Collected 912 malware samples and 49 benign samples

Slide23

Holmes Continued

Experiments carried out over varying threshold values (t)

System accuracy is highly sensitive to the threshold

Perhaps only effective for a specific subset of malware

Slide24

Holmes Scalability

Worst-case complexity is exponential

Behaviors of repeated executions (Stration and Delf) took 12-48 hours to analyze

Scalability for Holmes is a nightmare!

“scary and scaled”

Slide25

USENIX

The Advanced Computing Systems Association

(Unix Users Group)

2009 article: automatic behavior matching

Behavior graphs (slices)

Tracking data and control dependencies

Matching functions

Performance evaluations

Source: Kolbitsch, Clemens. "Effective and Efficient Malware Detection at the End Host." USENIX Security Symposium (2009). Web. 8 Apr. 2012. <http://www.iseclab.org/papers/usenix_sec09_slicing.pdf>.

Slide26

USENIX: Producing Behavior Graphs

Instruction log

Trace instruction dependencies

Slicing doesn't reflect stack manipulation

Memory log

Access memory locations

Partial behavior graph of Netsky (Kolbitsch et al.)

Slide27

USENIX: Behavior Slices to Functions

Use instruction and memory log to determine input arguments

Identify repeated instructions as loops

Include memory read functions

We can now compare to known malware

Slide28

Evaluation

Six families used for development (mostly mass-mailing worms)

Expanded test set

Slide29

Performance Evaluation

Installed Internet Explorer, Firefox, Thunderbird, Putty, and Notepad on Windows XP test machine

Single-core 1.8 GHz Pentium 4 processor, 1 GB RAM

Slide30

USENIX Limitations

Evading the system emulator

The USENIX detector uses the Qemu emulator

Delays

Time-triggered behavior

Command-and-control mechanisms

Modifying the algorithm's behavior

A more fundamental change, but cannot be detected using the same signatures

End-host-based system

Cannot track network activity

Slide31

Questions/Discussion