Slide1
Polymorphic Malware Detection
Connor Schnaith, Taiyo Sogawa
9 April 2012
Slide2
Motivation
“5,000 new malware samples per day”
--David Perry of Trend Micro
Large variance between attacks
Polymorphic attacks
Perform the same function
Altered immediate values or addressing
Added extraneous instructions
Current detection methods insufficient
Signature-based matching not accurate
Behavioral-based detection requires human analysis and engineering
Slide3
Malware Families
Classified into related clusters (families)
Tracking of development
Correlating information
Identifying new variants
Based on similarity of code
Koobface
Bredolab
PoisonIvy
Conficker (7 mil. infected)
Source: Carrera, Ero, and Peter Silberman. "State of Malware: Family Ties." Media.blackhat.com. 2010. Web. 7 Apr. 2012. <https://media.blackhat.com/bh-eu-10/presentations/Carrera_Silberman/BlackHat-EU-2010-Carrera-Silberman-State-of-Malware-slides.pdf>.
Slide4
~300 samples of malware with 60% similarity threshold
Slide5
Current Research
Techniques for identifying malicious behavior
Mining and clustering
Building behavior trees
Industry
ThreatFire and Sana Security developing behavioral-based malware detection
Slide6
Design challenges
Discerning malicious portions of code
Dynamic program slicing
accounting for control flow dependencies
Reliable automation
Must be reliable without human intervention
Minimal false positives
Slide7
Holmes: Main Ideas
Two major tasks
Mining significant behaviors from a set of samples
Synthesizing an optimally discriminative specification from multiple sets of samples
Key distinction in approach
"positive" set - malicious
"negative" set - benign
Malware: fully described in the positive set, while not fully described in the negative set
Slide8
Main Ideas: behavior mining
Extracts portions of the dependence graphs of programs in the positive set that correspond to behaviors that are significant to the programs’ intent.
The algorithm determines what behaviors are significant (next slide)
Can be thought of as contrasting the graphs of positive programs against the graphs of negative programs, and extracting the subgraphs that provide the best contrast.
Slide9
Main ideas: behavior mining
A "behavior" is a data dependence graph
G = (V, E, a, B)
V is the set of vertices, which correspond to operations (system calls)
E is the set of edges, which correspond to dependencies between operations
a is the labeling function that associates nodes with the operations they represent
B is the labeling function that associates edges with the logic formulas that represent the dependencies
Slide10
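The graph definition above can be sketched as a small data structure. This is a minimal illustration, not the paper's implementation; the syscall names and the string constraint are assumptions chosen for the example.

```python
from dataclasses import dataclass, field

# Sketch of the behavior graph G = (V, E, a, B) described above:
# nodes maps vertex ids to operation labels (the function a),
# edges maps vertex pairs to dependency constraints (the function B).

@dataclass
class BehaviorGraph:
    nodes: dict = field(default_factory=dict)   # vertex id -> operation label
    edges: dict = field(default_factory=dict)   # (src, dst) -> constraint formula

    def add_operation(self, vid, syscall):
        self.nodes[vid] = syscall

    def add_dependency(self, src, dst, constraint):
        self.edges[(src, dst)] = constraint

# Hypothetical example: data read from a bugfix registry key flows to a
# network send (the constraint is an illustrative string, not a real DSL).
g = BehaviorGraph()
g.add_operation(0, "RegQueryValue")
g.add_operation(1, "send")
g.add_dependency(0, 1, "IsRegistryKeyForBugfix(arg0) and flows_to(out0, in1)")
```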
Main ideas: behavior mining
A program P exhibits a behavior G if it can produce an execution trace T with the following properties
Every operation in the behavior corresponds to an operation invocation and its arguments satisfy certain logical constraints
the logic formula on edges connecting behavior operations is satisfied by a corresponding pair of operation invocations in the trace
Must capture information flow in dependence graphs
two key characteristics:
the path taken by the data in the program
security labels assigned to the data source and the data sink
Slide11
Security Label / Description
NameOfSelf: The name of the currently executing program
IsRegistryKeyForBootList: A Windows registry key listing software set to start on boot
IsRegistryKeyForWindows: A registry key that contains configuration settings for the operating system
IsSystemDirectory: The Windows system directory
IsRegistryKeyForBugfix: The Windows registry key containing the list of installed bugfixes and patches
IsRegistryKeyForWindowsShell: The Windows registry key controlling the shell
IsDevice: A named kernel device
IsExecutableFile: Executable file
Slide12
Main ideas: behavior mining
Information gain is used to determine whether a behavior is significant. A behavior that is not significant is ignored when constructing the specification.
Information gain is defined in terms of Shannon entropy: a behavior is informative if knowing whether a graph contains it increases the accuracy of determining whether that graph is in G+ or G-.
Shannon entropy:
H(G+ ∪ G-) corresponds to the uncertainty that a graph G belongs to G+ or G-
partition G+ ∪ G- into smaller subsets to decrease that uncertainty
membership in a partition is decided by a subgraph isomorphism test
Slide13
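The entropy/gain computation above can be sketched numerically. This is a generic information-gain calculation over set sizes, assuming a candidate behavior g splits the graphs into those that contain it and those that do not; it is an illustration, not the paper's code.

```python
import math

def entropy(pos, neg):
    """Shannon entropy of a set holding pos malicious and neg benign graphs."""
    total = pos + neg
    if total == 0 or pos == 0 or neg == 0:
        return 0.0
    p = pos / total
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def information_gain(pos, neg, pos_match, neg_match):
    """Gain from partitioning G+ ∪ G- by whether each graph contains
    the candidate subgraph g (pos_match/neg_match graphs contain it)."""
    total = pos + neg
    match = pos_match + neg_match
    nomatch = total - match
    remainder = ((match / total) * entropy(pos_match, neg_match)
                 + (nomatch / total) * entropy(pos - pos_match, neg - neg_match))
    return entropy(pos, neg) - remainder

# A behavior found in all 10 malicious graphs and no benign graph is a
# perfect discriminator: the gain equals the full initial entropy (1 bit).
print(information_gain(10, 10, 10, 0))
```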
Main ideas: behavior mining
A significant behavior g is a subgraph of a dependence graph in G+ such that:
Gain(G+ ∪ G-, g) is maximized
Information gain is used as the quality measure to guide the behavior mining process
Some non-significant actions can be passed as significant
these actions may or may not throw off the algorithm that determines whether the program is malicious
Slide14
Main ideas: behavior mining
Significant behaviors mined from malware Ldpinch
Leaking bugfix information over the network
Adding a new entry to the system autostart list
Bypassing firewall to allow for malicious traffic
Could say any program that exhibits all three of these behaviors should be flagged malicious
This is too specific a statement
Doesn't account for variations within a family
Smaller subsets of behaviors that include only one of these actions could still be malicious
Need discriminative specifications
Slide15
Main ideas: discriminative specifications
Creates clusters of behaviors that can be classified into a characteristic subset
Program matches specification if it matches all of the behaviors in a subset
"Discriminative" in that it matches the malicious but not the benign programs
Slide16
Main ideas: discriminative specifications
Each subset of behaviors induces a cluster of samples
Mined malicious and benign samples are organized into these clusters
Goal: find an optimal clustering that organizes the malicious samples into the positive subset and the benign samples into the negative subset
Slide17
Main ideas: discriminative specifications
Three part algorithm
Formal concept analysis
Simulated annealing
Constructing optimal specifications
Formal concept analysis:
O is a cluster of samples
A is the set of mined behaviors in O
A concept is the pair (A, O)
Set of concepts: {c1, c2, c3, ..., cN}
Behavior specification: S(c1, c2, c3, ..., cN)
Slide18
Main ideas: discriminative specifications
Formal Concept Analysis (continued)
Begins by constructing all concepts, then computes pairwise intersections of the intent sets of these concepts
Repeated until a fixpoint is reached and no new concepts can be constructed
When algorithm terminates, left with an explicit listing of all of the sample clusters that can be specified in terms of one or more mined behaviors
Goal is to find {c1, c2, c3, ..., cN} such that S(c1, c2, c3, ..., cN) is optimal (based on threshold)
Slide19
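The intersect-until-fixpoint construction above can be sketched as follows. This is a simplified illustration of the idea, assuming each sample maps to the set of behaviors it exhibits; redundancy removal and the real FCA algorithm are omitted.

```python
from itertools import combinations

def concepts(samples):
    """samples: dict sample_id -> frozenset of mined behavior ids.
    Closes the intent sets under pairwise intersection until a fixpoint,
    then pairs each intent with its extent (samples exhibiting it)."""
    intents = set(samples.values())
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(intents), 2):
            c = a & b
            if c and c not in intents:
                intents.add(c)
                changed = True
    # Each concept is (intent, extent): the behaviors and the cluster of
    # samples that exhibit all of those behaviors.
    return [(intent,
             frozenset(s for s, beh in samples.items() if intent <= beh))
            for intent in intents]

# Hypothetical samples: two malware variants sharing behavior "b".
sample_behaviors = {"m1": frozenset({"a", "b"}), "m2": frozenset({"b", "c"})}
for intent, extent in concepts(sample_behaviors):
    print(sorted(intent), "->", sorted(extent))
```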
Main ideas: discriminative specifications
Simulated annealing
Probabilistic technique for finding an approximate solution to a global optimization problem
At each step, a candidate solution i is examined and one of its neighbors j is selected for comparison
The algorithm moves to j with some probability
A cooling parameter T is reduced throughout the process; when it reaches a minimum, the process stops
Slide20
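The loop described above can be sketched generically. This is a standard simulated-annealing skeleton, not the paper's specific search over concept sets; the `neighbor` and `score` callbacks and the cooling schedule are assumptions for illustration.

```python
import math
import random

def simulated_annealing(initial, neighbor, score,
                        t_start=1.0, t_min=1e-3, cooling=0.95):
    """Maximize score(x). At each step a neighbor j of the current solution
    is examined; the search moves to j always when j is better, and with
    probability exp(delta / T) when it is worse. T shrinks each step and
    the loop stops when T drops below t_min."""
    current = best = initial
    t = t_start
    while t > t_min:
        cand = neighbor(current)
        delta = score(cand) - score(current)
        if delta > 0 or random.random() < math.exp(delta / t):
            current = cand
            if score(current) > score(best):
                best = current
        t *= cooling
    return best

# Toy usage: maximize -(x - 3)^2 starting from 0 with random-step neighbors.
random.seed(0)
best = simulated_annealing(0.0,
                           lambda x: x + random.uniform(-1, 1),
                           lambda x: -(x - 3) ** 2)
print(best)
```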
Main ideas: discriminative specifications
Constructing Optimal Specifications
Inputs: a threshold t, a set containing positive and negative samples, and a set of behaviors mined with the previous process
Called SpecSynth:
Constructs the full set of concepts
Removes redundant concepts
Runs simulated annealing until convergence, then returns the best solution
Slide21
Holmes: Mining and Clustering
Slide22
Evaluation and Results: Holmes
Used six malware families to develop specifications
Tested final product against 19 malware families
Collected 912 malware samples and 49 benign samples
Slide23
Holmes Continued
Experiments carried over varying threshold values (t)
System accuracy is highly sensitive to the threshold
Perhaps only efficient for a specific subset of malware
Slide24
Holmes Scalability
Worst-case complexity is exponential
Behaviors of repeated executions (Stration and Delf) took 12-48 hours to analyze
Scalability for Holmes is a nightmare!
“scary and scaled”
Slide25
USENIX
The Advanced Computing Systems Association
(Unix Users Group)
2009 article: automatic behavior matching
Behavior graphs (slices)
Tracking data and control dependencies
Matching functions
Performance evaluations
Source: Kolbitsch, Clemens, et al. "Effective and Efficient Malware Detection at the End Host." USENIX Security Symposium (2009). Web. 8 Apr. 2012. <http://www.iseclab.org/papers/usenix_sec09_slicing.pdf>.
Slide26
USENIX: Producing Behavior Graphs
Instruction log
Trace instruction dependencies
Slicing doesn't reflect stack manipulation
Memory log
Access memory locations
Partial behavior graph of Netsky (Kolbitsch et al.)
Slide27
USENIX: Behavior Slices to Functions
Use instruction and memory log to determine input arguments
Identify repeated instructions as loops
Include memory read functions
We can now compare to known malware
Slide28
Evaluation
Six families used for development (mostly mass-mailing worm)
Expanded test set
Slide29
Performance Evaluation
Installed Internet Explorer, Firefox, Thunderbird, Putty, and Notepad on Windows XP test machine
Single-core 1.8 GHz Pentium 4 processor, 1 GB RAM
Slide30
USENIX Limitations
Evading system emulator
USENIX detector uses Qemu emulator
delays
time-triggered behavior
command and control mechanisms
Modifying the algorithm's behavior
A more fundamental change, but cannot be detected using the same signatures
End-host based system
Cannot track network activity
Slide31
Questions/Discussion