Meltem Ozsoy, Caleb Donovick, Iakov Gorelik, Nael Abu-Ghazaleh, and Dmitry Ponomarev; Binghamton University and University of California, Riverside; HPCA 2015, San Francisco, CA
Slide1
Malware-Aware Processors: A Framework for Efficient Online Malware Detection
Meltem Ozsoy*, Caleb Donovick*, Iakov Gorelik*, Nael Abu-Ghazaleh** and Dmitry Ponomarev*
* Binghamton University, ** University of California, Riverside
HPCA 2015 - San Francisco, CA
Slide2
Malware Growth
Anti-virus software
OS Level Defenses
Execution Monitoring
AV Test Malware Statistics, 2014 (http://www.av-test.org/en/statistics/malware/)
Slide3
What This Work is All About
Comprehensive execution monitors are too heavy-weight to be always-on
  Performance loss
Low-level indicators were shown to be effective for classifying malware
  Demme et al. (ISCA 2013) proposed offline detection using performance counters
Our contribution: online detection in hardware
Hardware classifiers are not perfect, thus:
  Two Level Detection Framework: Use hardware-based detector to prioritize the work of heavy-weight software detector
Slide4
Two Level Detection Framework
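A minimal sketch of the two-level idea, under stated assumptions: each running program has a suspicion score from the hardware detector, and the heavy-weight software detector only has the budget to examine a few programs. The names `hw_scores`, `software_detector`, and `budget` are hypothetical and do not come from the presentation.

```python
from typing import Callable, Dict, List


def prioritize_for_software_scan(hw_scores: Dict[str, float]) -> List[str]:
    """Order programs so the ones the hardware detector finds most suspicious come first."""
    return sorted(hw_scores, key=hw_scores.get, reverse=True)


def two_level_scan(
    hw_scores: Dict[str, float],
    software_detector: Callable[[str], bool],
    budget: int,
) -> List[str]:
    """Run the expensive software detector on at most `budget` programs, in priority order.

    Returns the programs the software detector confirms as malware.
    """
    confirmed = []
    for program in prioritize_for_software_scan(hw_scores)[:budget]:
        if software_detector(program):  # second level: heavy-weight check
            confirmed.append(program)
    return confirmed


# Hypothetical scores from the first (hardware) level.
scores = {"editor.exe": 0.1, "dropper.exe": 0.9, "updater.exe": 0.5}
print(two_level_scan(scores, lambda p: p == "dropper.exe", budget=2))
```

Because the hardware score only sets the order of work, an imperfect hardware classifier delays detection of a misranked program rather than deciding the outcome on its own.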
Slide5
Malware Detection
Static Analysis
  Study program without execution
  Signature generation with byte/instruction sequences
  Using source code, CFG generation
Limitations of Static Analysis
  Requires source code, disassembly
  Metamorphic malware (Self Modifying Code)
  Polymorphic (encrypted) malware
  Non-deterministic inputs can change program flow
Slide6
Malware Detection
Dynamic Analysis
  System calls, function parameters, API calls, created processes/threads, etc. monitored
  Expensive, uses VM or emulator
Limitations of Dynamic Analysis
  Only effective against analyzed malware
  Advanced Persistent Threats (APTs) can bypass with zero-day exploits
Slide7
Execution Monitoring
System call Forwarding
  Proxos (OSDI’06)
VM Introspection, Isolated Monitoring
  Livewire (NDSS’03), Virtuoso (IEEE Security & Privacy’11)
Reference Monitoring
  PinOS (ACM VEE’07), Kernel DBT (ASPLOS’12)
Diagram labels: VM, Application, Modified Application, EM, Kernel
Slide8
Malware Detection at Low-level
Sub-semantic Monitoring
  Low-level indicators of a program, such as performance counters (Demme et al., ISCA’13), are monitored
Limitations
  Detection is after the fact
  Not real-time
  Features are limited to available performance counters
Slide9
Our Proposal: MAP
Malware Aware Processor (MAP)
  Use hardware for sub-semantic detection
  Train a simple machine learning algorithm
  Periodic checks during execution
  Perform online detection using time series analysis in hardware
  High overhead software analysis activated only for suspicious programs (Two Level Detection)
Slide10
MAP Design Overview
Block diagram labels: Instruction Fetch, Instruction Cache, Rename/Decode, Branch Prediction, Physical Register File, Issue, Functional Units, ROB & Architectural Register File, MMU, Data Cache, Exception Unit, MAP
Slide11
Sub-Semantic Feature Space
Architectural
  ARCH: Frequency of memory reads/writes, taken & immediate branches, and unaligned memory accesses
Memory Address
  MEM1: Frequency of memory address distance histogram
  MEM2: Memory address distance histogram mix
Instruction
  INS1: Frequency of instruction categories
  INS2: Difference between two most frequent opcodes
  INS3: Existence of categories
  INS4: Existence of opcodes
Slide12
Machine Learning Algorithms
Logistic Regression
  Hypothesis function (a*x1 + b*x2 + ... + c) is trained to determine the weights (a, b, c)
  Sigmoid function maps the hypothesis function to a value between 0 and 1
Neural Network (multi-layer perceptron)
  One hypothesis function trained for each layer
  Translation function is tanh
Slide13
Data Set & Data Collection
Family       Train  Test  Val  Extended Test  Total
Vundo           14     2    5             21     42
Emerleox        10     5    4             33     52
Virut            8     3    7             46     64
Sality          12     2    4             46     64
Ejik             7     6    4            101    118
Looper          10     3    6            145    164
AdRotator       14     1    2            119    136
PornDialer      11     6    4            196    217
Boaxxe          13     6    0            211    230
Total           99    34   36            918   1087
32-bit Windows 7 on VirtualBox
Windows Security Services disabled
Features collected through PIN during execution of malware
University of Mannheim dataset
Offensive Computing
VirusTotal
Slide14
Selecting Features for Classification
Offline detection performance
Low hardware implementation complexity
Used for hardware implementation
Slide15
Key Aspects of MAP Operation
Machine Learning model trained at design time
Weights for the model are loaded into MAP hardware
While the program executes, MAP hardware collects features at the instruction commit stage
For every 10K committed instructions, a binary decision (malware/regular) is made
Slide16
MAP Online Detection
Periodic binary signals created every 10K instructions during execution
Exponentially Weighted Moving Average (EWMA) is used for filtering out occasional false positives/negatives
Additional optimizations for efficient hardware implementation
  Fixed Point representation
  Sliding window of signals
Slide17
Hardware Implementation
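A rough sketch of how a logistic regression decision could be computed with integer-only arithmetic, in the spirit of the fixed-point representation the deck mentions. The Q-format (`FRAC_BITS = 8`), the 0.5 decision threshold, and all weights and feature counts below are assumptions for illustration, not values from the presentation. With a 0.5 threshold the sigmoid need not be evaluated, since sigmoid(z) >= 0.5 exactly when z >= 0.

```python
FRAC_BITS = 8            # assumed fixed-point format: 8 fractional bits
SCALE = 1 << FRAC_BITS   # 256


def to_fixed(x: float) -> int:
    """Convert a trained floating-point weight to the assumed fixed-point format."""
    return int(round(x * SCALE))


def lr_decision_fixed(features, weights, bias) -> bool:
    """Return True (malware) when the weighted sum is non-negative.

    features: integer feature counts for one window of committed instructions
    weights, bias: fixed-point integers produced by to_fixed()
    """
    acc = bias
    for count, weight in zip(features, weights):
        acc += count * weight  # integer count times fixed-point weight stays fixed-point
    return acc >= 0            # same outcome as sigmoid(z) >= 0.5


# Hypothetical two-feature model.
weights = [to_fixed(0.5), to_fixed(-0.25)]
bias = to_fixed(-1.0)
print(lr_decision_fixed([4, 2], weights, bias))
print(lr_decision_fixed([1, 4], weights, bias))
```

Only multiplies, adds, and a sign check remain, which is one reason a logistic regression classifier can be cheap to place in hardware.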
Figure panels: Logistic Regression, Neural Network
Slide18
MAP FPGA Implementation
Slide19
Example of EWMA
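A small worked sketch of EWMA filtering over a stream of per-window binary decisions (1 = malware, 0 = regular). The smoothing factor `alpha` and the alarm `threshold` are assumed values chosen for illustration; the presentation does not state the ones used in MAP.

```python
def ewma_filter(decisions, alpha=0.25, threshold=0.5):
    """Smooth 0/1 decisions with an exponentially weighted moving average.

    Returns (smoothed, alarms): the running average after each window and
    whether it has reached the alarm threshold.
    """
    smoothed, alarms = [], []
    s = 0.0  # running average starts at "regular"
    for d in decisions:
        s = alpha * d + (1 - alpha) * s  # the newest decision gets weight alpha
        smoothed.append(s)
        alarms.append(s >= threshold)
    return smoothed, alarms


# An isolated positive decision is filtered out ...
print(ewma_filter([0, 1, 0, 0])[1])
# ... while a sustained run of positive decisions raises an alarm.
print(ewma_filter([1, 1, 1, 1])[1])
```

With these assumed parameters a single spurious positive never reaches the threshold, while three consecutive positives do, which is the filtering behavior the EWMA is meant to provide.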
Figure panels: Logistic Regression, Neural Network
Slide20
Results
Slide21
Key Results of MAP
Best performing feature is based on instruction opcodes
MAP achieves 89% real-time detection with only 6% false positives with a simple LR prediction
Physical design overhead
  Cycle time 1.9% (LR), 5.5% (NN)
  Area 0.3% (LR), 5.7% (NN)
  Power 0.1% (LR), 1.7% (NN)
Slide22
Future Directions
MAP can be extended as a configurable malware detection engine
  Updating weights for new malware
  Configuring features
Integrated FPGAs in new CPU designs (Intel Xeon) can be used for MAP
Slide23
Thank You!
Questions?