1 March 2019 Michael D. Brown



Presentation Transcript

Improving Security Through Software Debloating
1 March 2019
Michael D. Brown, Research Scientist, GTRI

Agenda
- The Software Bloat Problem
- Software Debloating as a Solution
- Effects of Software Debloating on Security
- Code Reuse Attacks
- Security Improvement Measures
- How are these effects measured?
- Are these the right measures?

Introduction
Software security is a complex arms race:
- New attack techniques spur research into new defense mechanisms.
- New defenses require attackers to construct more complex methods to evade them.
This isn’t going away anytime soon; software security is an undecidable problem.
In this talk, we will explore a proactive method for improving security.

What is Software Bloat?
Modern software engineering practices favor software and systems that are:
- Modular
- Reusable
- Feature rich
This helps engineers rapidly develop complex and widely deployable software. However, it comes at a cost: when software is deployed and executed, it contains large portions of code that will never be used. This is software bloat.

Sources of Software Bloat - Vertical
Software bloat occurs vertically throughout the software stack, primarily at layers of abstraction.
Example: glibc exposes a large number of common C functions via an API. It is very unlikely that a program linking this library uses all of its functions, but the entire library is loaded at execution time.

Sources of Software Bloat - Lateral
Software bloat occurs laterally within a software package due to feature creep.
Example: cURL supports data transfer via 23 different protocols: DICT, FILE, FTP, FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, Telnet, and TFTP.
Typical users of cURL or libcurl are unlikely to fully utilize this level of feature variety.

Prevalence of Software Bloat
A recent study of vertical software bloat by Quach et al. [1] found:
- Across 12 commonly used programs such as Firefox, Chrome, VLC, and Sublime, only 65% of instructions and 32% of functions found in dependent libraries are actually required.
- Commonly used features in these programs require a relatively small portion of the total instructions present in the programs and libraries.
For example:
- Playing an audio file in VLC requires only 12% of the overall instructions.
- Creating, composing, and saving a file in Sublime requires only 27% of the overall instructions.
- Fetching and displaying 10 popular websites in Firefox requires only 29% of the overall instructions.

Negative Effects of Software Bloat
Security:
- Bloated software is harder to analyze with bug-finding tools.
- Bloat code may contain reachable attack vectors / vulnerabilities.
- Bloat code may increase the overhead of security defenses.
- Bloat code can potentially be used in a code reuse attack.
Performance:
- Increases code size / complexity.
- Increases memory footprint.
- Increases execution time (not always significant).
- Increases power consumption.

Background: Code Reuse Attacks
Code reuse attacks are a class of exploits that circumvent Write XOR Execute defenses by reusing existing executable code in the target program.
Early attacks exploited memory corruption bugs (such as buffer overflows) to redirect the execution of the program to useful functions in libc (return-into-libc).
These attacks were limited in expressivity: the attacker could not inject arbitrary code.

Gadget Based Code Reuse: Return Oriented Programming (ROP)
In 2007, Shacham [7] proposed the first gadget-based code reuse attack method, called Return-Oriented Programming (ROP).
ROP overcomes the expressivity limits of return-into-libc by chaining multiple snippets of code called “gadgets” together to construct a malicious payload.
Gadgets are used like complex instructions, and the total set of gadgets available to the attacker is their ISA.
Gadgets end in a return instruction, which the attacker can exploit to maintain control flow.


Gadget Based Code Reuse: Jump Oriented Programming (JOP)
After ROP appeared, researchers developed a large number of defenses against it, aimed primarily at defending the stack.
In response, JOP [8] was introduced. JOP uses gadgets ending in indirect jump and call instructions to maintain control flow instead of relying on return instructions and the stack.
JOP requires the use of a special purpose gadget, called a dispatcher, that handles control flow.
Call Oriented Programming (COP) is a call-only derivative of JOP.

Software Bloat and Gadgets
Bloat code, whether it comes from unnecessary library functions or extraneous program features, is loaded into program memory at run time.
This bloat code is useless to the user executing the program, but it contributes to the total set of gadgets available to the attacker.
The attacker can use these gadgets in the construction of their exploits.

Software Debloating
In response to this problem, researchers are developing ways to debloat software.
Software debloating is a type of software transformation, and can be generalized as follows:
P’ = Debloat(P, context)
Stated plainly, debloating takes as input a package P and the context in which P is intended to be deployed, and produces a variant P’ that contains just enough functionality to satisfy the specified context.
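To make the Debloat(P, context) signature concrete, here is a minimal Python sketch. It assumes a package can be modeled as a mapping from feature names to the functions that implement them, and a context as the set of features a deployment needs. The `Package` type, the `debloat` function, and the feature names are illustrative assumptions, not part of any tool discussed in this talk.

```python
from typing import Dict, Set

# Hypothetical model: a package maps each feature to the functions implementing it.
Package = Dict[str, Set[str]]


def debloat(package: Package, context: Set[str]) -> Package:
    """Return a variant P' that keeps only the features named in the context."""
    unknown = context.difference(package)
    if unknown:
        raise ValueError(f"context names features missing from the package: {sorted(unknown)}")
    return {feature: set(funcs) for feature, funcs in package.items() if feature in context}


# Made-up feature and function names, loosely inspired by a data transfer tool.
p = {
    "http": {"http_connect", "http_get"},
    "ftp": {"ftp_connect", "ftp_list"},
    "smtp": {"smtp_send"},
}
p_prime = debloat(p, {"http"})
print(sorted(p_prime))  # features kept in the variant
```

Real debloaters operate on source code, intermediate representations, or binaries rather than on a feature map; the sketch only captures the input/output shape of the transformation.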

Software Debloating
Software debloating can be performed at many different points in the software lifecycle.
Build and load pipeline: Original Source Code, Preprocessor, Compiler Front End, Middle End, Compiler Back End, Linker, Package Binary, Loader.
Debloating points along this pipeline:
- Debloat Source Code
- Debloat Intermediate Representation (IR)
- Debloat Binary via Rewriting
- Debloat Code in Memory

Overview of Software Debloating Techniques - CHISEL
CHISEL [2] is an automated debloating tool that targets unnecessary feature code.
It takes as input a test script written by the user that expresses the desired functionality of a debloated variant.
This test is used as part of a feedback-directed program reduction method based on delta debugging principles.
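The idea of using a test script as the oracle for program reduction can be sketched in a few lines of Python. This is a deliberately simplified greedy loop that removes one unit at a time; it is not CHISEL's algorithm and shares only the test-as-oracle idea. The function names and the unit list are hypothetical.

```python
from typing import Callable, List, Sequence


def reduce_program(units: Sequence[str],
                   passes_test: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop units for as long as the user's test still passes."""
    kept = list(units)
    if not passes_test(kept):
        raise ValueError("the original program must pass the test")
    changed = True
    while changed:
        changed = False
        for i in range(len(kept)):
            candidate = kept[:i] + kept[i + 1:]  # try the program without unit i
            if passes_test(candidate):
                kept = candidate
                changed = True
                break  # restart the scan on the smaller program
    return kept


# Hypothetical test script: the variant must still provide these two functions.
def desired_functionality(candidate: List[str]) -> bool:
    return {"parse_args", "http_get"} <= set(candidate)


program = ["parse_args", "http_get", "ftp_list", "smtp_send", "telnet_open"]
print(reduce_program(program, desired_functionality))
```

The loop also shows why the quality of the test matters: anything the test does not exercise is eligible for removal.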

Overview of Software Debloating Techniques - TRIMMER
TRIMMER [3] is an automated debloating tool that targets unnecessary feature code.
It takes as input a user-defined static package configuration that expresses the compile-time configuration of the package.
Configuration data is treated as a compile-time constant and is propagated throughout the program.
This is followed by custom, aggressive compiler optimizations to prune functionality from the program.

Overview of Software Debloating Techniques - PCL
Piece-wise Compilation and Loading (PCL) [5] is an automated debloating technique that targets software bloat originating from shared libraries (described as load-time dead code elimination).
During compilation, the piece-wise compiler generates an external dependency graph, which is used at run time to eliminate code bloat in memory.
Memory locations that contain bloat are marked non-executable, preventing them from being used in a code reuse attack.
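The dependency-graph idea can be illustrated with a small reachability computation in Python. This sketch assumes the graph is available as a mapping from each library function to the functions it calls; the graph contents and the `reachable_functions` helper are hypothetical, and the real technique operates on compiled libraries at load time rather than on a Python dictionary.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def reachable_functions(dep_graph: Dict[str, List[str]],
                        roots: Iterable[str]) -> Set[str]:
    """Breadth-first walk of the dependency graph from the functions the program uses."""
    seen: Set[str] = set()
    queue = deque(roots)
    while queue:
        func = queue.popleft()
        if func in seen:
            continue
        seen.add(func)
        queue.extend(dep_graph.get(func, []))
    return seen


# Hypothetical library dependency graph: caller -> callees.
library_graph = {
    "printf": ["vfprintf"],
    "vfprintf": ["write"],
    "system": ["fork", "execve"],
    "write": [],
    "fork": [],
    "execve": [],
}
needed = reachable_functions(library_graph, ["printf"])
bloat = set(library_graph) - needed  # candidates to mark non-executable
print(sorted(needed), sorted(bloat))
```

Everything outside the reachable set is bloat with respect to this program, which is the code a loader in this style would leave non-executable.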

Claims of Positive Effects of Software Debloating on Security
Reduces the number of code reuse gadgets:
- Less code means fewer pieces of code the attacker can stitch together in gadget-based code reuse attacks.
Moving target defense / software diversity:
- Debloating creates new variants of the original package.
- May render information about the original package useless.
Vulnerability elimination:
- If code that contains a reachable vulnerability is removed, the attacker cannot exploit it.
- May potentially remove zero-day threats.

Difficulties in Measuring Claims of Improved Security
Moving target defense / software diversity:
- Debloating intentionally leaves desired functionality untouched, so information the attacker has about the original program can still be used unless it pertains to a debloated segment.
- It doesn’t stop the attacker from reversing the variant as well.
- It is difficult to articulate how much this would inhibit an attacker in a practical scenario.
Vulnerability elimination:
- Who cares if debloating removes known vulnerabilities? You can’t always debloat vulnerable pieces of a package, but you can always patch them.
- How do we know whether we actually removed a zero day or not?

Measuring the Impact of Debloating on Gadgets
Measuring the impact of debloating on gadget reduction is simple.
Static analysis tools like ROPgadget [6] will scan a binary and count gadgets by type.
We can subtract the number of gadgets found in the debloated package from the number of gadgets found in the original and calculate the rate of gadget count reduction.
On average:
- CHISEL reduced total gadgets by 54.1%
- TRIMMER reduced total gadgets by 20%
- PCL reduced total gadgets by 71%
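The reduction rate described above is a one-line calculation. A minimal Python sketch follows; the function name and the counts are made up for illustration and are not results from any of the tools listed.

```python
def gadget_count_reduction(original_count: int, debloated_count: int) -> float:
    """Percentage of gadgets removed, relative to the original package."""
    if original_count <= 0:
        raise ValueError("original_count must be positive")
    return 100.0 * (original_count - debloated_count) / original_count


# Made-up counts for illustration only.
print(gadget_count_reduction(2000, 1400))
```

A negative result means the variant contains more gadgets than the original.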

So, reducing gadget counts in programs is good and improves security, right? Well... it’s complicated.

Shortcomings of Gadget Count Reduction
In our research, we discovered that gadget count reduction is too superficial to be an accurate metric for measuring security improvement.
We explored the effect debloating has on security using measures from the attacker’s perspective by asking the question: how does debloating make constructing a code reuse attack more difficult?
In our experiments, it was fairly common for software debloating to achieve good gadget count reduction, yet fail to improve security.
Worse yet, we observed instances where security was made worse through software debloating despite achieving positive gadget reduction.

Analysis Methodology
We debloated three software packages that varied in size, structure, and operational complexity:
- libmodbus v3.1.4 [27], a software library implementing a common industrial network protocol.
- Bftpd v4.9 [26], an FTP server utility program.
- libcurl v7.61.0 [9], a data transfer utility library.
We debloated each package at three different intensity levels:
- Conservative: Some peripheral features in the package are targeted for debloating.
- Moderate: Some peripheral features and some core features are targeted for debloating.
- Aggressive: All debloatable features except for a small set of core features are targeted for debloating.
We used our own debloater, which operates in a manner comparable to CHISEL and TRIMMER (as these debloaters were not available) and achieves gadget count reduction on par with them.
On average, our debloater reduced gadget counts by 30% for aggressive scenarios, 15% for moderate scenarios, and 8% for conservative scenarios.

Analysis Methodology
We focused our analysis on three areas:
- Overall composition of the gadgets in the original and variant package: Which gadgets were actually removed? Are there side effects of removing gadgets?
- Population of special purpose gadgets: Does debloating remove “dealbreaker” gadgets necessary to construct exploits?
- Expressivity of the gadgets in the package: How computationally powerful is the “instruction set” in the original and variant package?

Results - Gadget Composition
Analyzing the composition of the gadgets within a debloated variant led to an interesting discovery: debloating techniques that remove code from a program representation introduce new gadgets into the variant.
In some cases, gadget introduction actually caused the total gadget count to increase for some gadget types!

Gadget Introduction Mechanisms - Compiler Optimization
Removing code from a software package via debloating can have unpredictable effects on the optimization and code generation choices made by the compiler.
Some optimizations suppressed and/or triggered by debloating:
- Loop unrolling
- Function inlining
- Dead code elimination

Gadget Introduction Mechanisms - Unintended Gadgets
x86 and x86-64 have variable-length instructions, so it is possible to decode instructions from an offset other than the original instruction boundary.
Gadgets found using this method are called unintended gadgets.
Since debloating causes significant changes to the package’s final representation, the unintended gadgets in a package can vary greatly in a debloated variant.
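A small Python sketch can show where unintended gadgets come from. It scans raw bytes for the x86 return opcode (0xC3) and lists every short byte window ending there, regardless of the intended instruction boundaries. The `candidate_gadget_windows` helper is hypothetical, and each window is only a candidate: a real gadget finder would disassemble it to check that it decodes to valid instructions.

```python
from typing import List, Tuple

RET = 0xC3  # x86 near-return opcode


def candidate_gadget_windows(code: bytes, max_len: int = 3) -> List[Tuple[int, bytes]]:
    """List every byte window of up to max_len bytes that ends in a RET byte."""
    windows = []
    for end, byte in enumerate(code):
        if byte != RET:
            continue
        for start in range(max(0, end - max_len + 1), end + 1):
            windows.append((start, code[start:end + 1]))
    return windows


# "mov eax, 0xC358" encodes as B8 58 C3 00 00; the intended stream has no ret.
intended = bytes.fromhex("b858c30000")
for offset, window in candidate_gadget_windows(intended):
    print(offset, window.hex())
```

The window at offset 1 is the byte pair 58 C3, which decodes as a pop into the accumulator register followed by a return, even though the intended instruction stream contains only a single mov. Any change to the emitted bytes, including changes caused by debloating, can create or destroy windows like this.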

Results - Gadget Composition
We compared the sets of unique gadgets in our original package to each of its variants.
Interestingly, gadget introduction in our scenarios is not a rare event, and it happens at a rather high rate.
The prevalence of gadget introduction is not apparent using gadget count reduction. Gadget introduction can occur at high rates as a result of debloating, but be hidden under a reduction in total size.
It is impossible to know the attacker’s concrete aims in advance, which makes it difficult to know if the gadgets removed by debloating are more or less useful than those introduced.

Prevalence of Gadget Introduction
Introduction Rate = (Introduced Gadgets / Total Gadgets in Debloated Variant) * 100
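The introduction rate can be computed directly from the sets of unique gadgets. The Python sketch below assumes gadgets are represented as strings of disassembly text; the gadget sets are made up for illustration.

```python
from typing import Set


def introduction_rate(original: Set[str], variant: Set[str]) -> float:
    """Percentage of the variant's gadgets that do not appear in the original."""
    if not variant:
        return 0.0
    introduced = variant - original
    return 100.0 * len(introduced) / len(variant)


# Made-up gadget sets: the variant drops two gadgets and gains one new gadget.
original = {
    "pop rax ; ret", "pop rdi ; ret", "add rax, rbx ; ret",
    "xchg eax, esp ; ret", "mov rax, rdx ; ret",
}
variant = {
    "pop rax ; ret", "pop rdi ; ret", "add rax, rbx ; ret",
    "pop rsi ; ret",
}
print(introduction_rate(original, variant))
```

In this example the gadget count falls from five to four, a 20% reduction, while a quarter of the variant's gadgets are new. The count alone would not reveal that.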

Gadget Introduction and Gadget Quality
Like gadget count reduction, the introduction rate of gadgets via debloating doesn’t give us a complete picture. However, it does open up the possibility that debloating can actually make security worse.
We need a qualitative analysis of the gadget sets to determine whether debloating was successful. This is where things get tricky: like software diversity and vulnerability elimination, this is difficult to measure.
From an attacker’s perspective, the gadgets available must allow them to maintain exploit control flow (using special purpose gadgets) and provide sufficient computational power to express their desired exploit (using functional gadgets).
We explored the effects of debloating on these two aspects.

Individual Gadget Quality - Special Purpose Gadgets
We used a combination of automated static analysis and manual inspection to identify special purpose gadgets in the original packages and their debloated variants. (NOTE: not an exhaustive list of all special purpose gadgets.)

Individual Gadget Quality - Special Purpose Gadgets
Key observations:
- Debloating generally reduced the number of special purpose gadgets available. However, we observed no cases in which it eliminated all of the gadgets of a particular type.
- In three cases, gadget introduction increased the availability of gadgets of a particular type. (It should be noted that only one gadget was introduced in each instance.)
- In one case, a special purpose gadget type that was not previously available was introduced.

Gadget Set Quality - Expressivity of Set
Gadget expressivity is a measure of the computational power of a set of gadgets.
Recall that gadgets are used as complex instructions to “write” a malicious payload using only snippets of the compromised software package. These gadgets can be used to accomplish typical computational tasks such as arithmetic operations, memory load / store, conditional branching, etc.
The high bar for expressivity is Turing-completeness, which means a set of gadgets is computationally universal. Stated plainly, it means that the gadgets can be used to construct any program.
However, practical ROP exploits have been demonstrated that do not require a Turing-complete level of expressivity.

Gadget Set Quality - Expressivity of Set
Gadget-based code reuse techniques such as ROP and JOP have been shown to be capable of achieving Turing-completeness. This is usually shown academically using handpicked gadgets from a common system library such as libc.
However, Homescu et al. [9] proposed a method for classifying short gadgets into computational classes to determine if a given set of gadgets achieves an expressiveness level.
They show that if at least one gadget exists per class, it can be used for all computational needs relative to that class. If all classes are satisfied relative to a level of expressivity, then the set of gadgets has achieved that level of expressivity.
We used this gadget scanner to determine the expressivity of the gadgets in the original packages and their variants, which allows us to determine how debloating has affected expressivity.
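The "one gadget per class" rule lends itself to a short Python sketch. The class names, the level definitions in `LEVELS`, and the gadget classifications below are hypothetical placeholders chosen for illustration; they are not the classes or levels defined by Homescu et al. or used in this study.

```python
from typing import Dict, Set

# Hypothetical computational classes and expressivity levels, for illustration only.
LEVELS: Dict[str, Set[str]] = {
    "practical_exploit": {"load_constant", "memory_store", "system_call"},
    "turing_complete": {
        "load_constant", "memory_load", "memory_store",
        "arithmetic", "conditional_branch",
    },
}


def levels_reached(gadget_classes: Dict[str, str],
                   levels: Dict[str, Set[str]] = LEVELS) -> Dict[str, bool]:
    """Map each level to True when every class it requires has at least one gadget."""
    present = set(gadget_classes.values())  # one gadget is enough to satisfy a class
    return {name: required <= present for name, required in levels.items()}


# Made-up classification of gadgets (gadget text -> class).
original = {
    "pop rax ; ret": "load_constant",
    "mov rax, [rbx] ; ret": "memory_load",
    "mov [rbx], rax ; ret": "memory_store",
    "add rax, rcx ; ret": "arithmetic",
    "syscall ; ret": "system_call",
}
# A variant that lost its only memory_store gadget.
variant = {g: c for g, c in original.items() if c != "memory_store"}

print(levels_reached(original))
print(levels_reached(variant))
```

Because a single gadget satisfies a class, removing many gadgets may leave expressivity unchanged, while removing or introducing one gadget in a sparsely populated class can change it.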

Results
We measured expressivity across three levels: the minimum expressivity for a practical ROP exploit, the expressivity required for the same exploit in an ASLR environment, and Turing-completeness.

Results
Key observations:
- In three of nine scenarios, debloating reduced the expressivity of the set of gadgets.
- In three of nine scenarios, debloating resulted in some increases and some decreases.
- In two scenarios, debloating had no effect on expressivity.
- In one scenario, debloating increased the expressivity.

High Level Summary
Of the nine debloating scenarios we explored, only one scenario resulted in a clear improvement to security across all our analysis methods.
One scenario resulted in a clear negative impact on security.
In the other seven scenarios, the results were mixed or neutral.

Alternative to Gadget Count Reduction
In all of our scenarios, debloating resulted in a positive gadget count reduction. However, our deeper analysis showed a much grayer picture.
To fully understand how debloating a package has affected security, a deep and multi-faceted analysis is required. We propose that the following measures be considered:
- Gadget Count Reduction by Gadget Type
- Gadget Introduction Rate by Gadget Type
- Partial Expressivity of Functional Gadgets by Type
- Contributions by External Dependencies
We do not claim this is an exhaustive list; future work may identify other measures that are useful for this purpose.
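The first two proposed measures can be sketched together in Python. The sketch assumes each gadget is tagged with a type label such as "ROP" or "JOP"; the `measures_by_type` helper and the gadget sets are hypothetical and only illustrate how per-type figures can diverge from the total.

```python
from collections import Counter
from typing import Dict, Optional, Set, Tuple

Gadget = Tuple[str, str]  # (gadget type, gadget text)


def measures_by_type(original: Set[Gadget],
                     variant: Set[Gadget]) -> Dict[str, Dict[str, Optional[float]]]:
    """Per gadget type: count reduction (%) and introduction rate (%)."""
    before = Counter(kind for kind, _ in original)
    after = Counter(kind for kind, _ in variant)
    introduced = Counter(kind for kind, _ in variant - original)
    report: Dict[str, Dict[str, Optional[float]]] = {}
    for kind in sorted(set(before) | set(after)):
        b, a = before[kind], after[kind]
        report[kind] = {
            # Undefined (None) when the original had no gadgets of this type.
            "count_reduction_pct": 100.0 * (b - a) / b if b else None,
            "introduction_rate_pct": 100.0 * introduced[kind] / a if a else 0.0,
        }
    return report


# Made-up gadget sets for illustration.
original = {
    ("ROP", "pop rax ; ret"), ("ROP", "pop rdi ; ret"),
    ("ROP", "add rax, rbx ; ret"), ("ROP", "xchg eax, esp ; ret"),
    ("JOP", "jmp rax"), ("JOP", "jmp [rbx]"),
}
variant = {
    ("ROP", "pop rax ; ret"), ("ROP", "pop rdi ; ret"),
    ("JOP", "jmp rax"), ("JOP", "jmp rcx"),
}
for kind, row in measures_by_type(original, variant).items():
    print(kind, row)
```

In this made-up example the total gadget count drops from six to four, yet the JOP population is unchanged in size and half of it is newly introduced.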

How Can We Best Incorporate These Measures
As a consequence of the use of gadget reduction as a metric, recently proposed debloating methods treat debloating for security the same way they treat debloating for performance: more is better, so debloat as much as possible!
The effects of debloating on security are difficult to predict, and our research shows they are not monotonic. More isn’t necessarily better.
We propose an iterative, human-in-the-loop process for debloating that incorporates our proposed measures.
This model allows for corrections to the debloating specification based on the previous iteration’s results.


A Case Study - Max Debloating Does Not Mean Max Security Improvement
We took the scenario that produced negative results and attempted a second iteration.
In our second iteration, we chose to alter the specification to debloat three fewer features that were highly correlated with other retained features.
We repeated our analysis on the new variant. The end result was the elimination of the negative impacts, as well as a marginal security improvement with respect to expressivity.

Summary of Key Takeaways
The relationship between software debloating and software security is complicated.
- Debloating can fail to improve security, or even make it worse.
- Debloating for security is not like debloating for performance: debloating more does not necessarily produce better results.
Measuring the reduction in gadget count is insufficient to make claims of improved security.
- It can hide negative effects of debloating such as gadget introduction.
- It is not directly related to more important measures such as availability of special purpose gadgets and gadget set expressivity.
Debloating to improve security is possible, but not nearly as easy as suggested by recent work.
- It requires deep and multi-faceted analysis to determine the effect debloating had on security.
- It may require multiple iterations to get it right.

Questions?

References
[1] Quach, A., Erinfolami, R., Demicco, D., and Prakash, A. A multi-OS cross-layer study of bloating in user programs, kernel, and managed execution environments. In The 2017 Workshop on Forming an Ecosystem Around Software Transformation (FEAST) (2017).
[2] Lee, W., Heo, K., Pashakhanloo, P., and Naik, M. Effective Program Debloating via Reinforcement Learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security (CCS) (2018).
[3] Sharif, H., Abubakar, M., Gehani, A., and Zaffar, F. TRIMMER: Application specialization for code debloating. In Proceedings of the 2018 33rd ACM/IEEE International Conference on Automated Software Engineering (ASE) (2018).
[4] Chen, Y., Sun, S., Lan, T., and Venkataramani, G. TOSS: Tailoring online server systems through binary feature customization. In The 2018 Workshop on Forming an Ecosystem Around Software Transformation (FEAST) (2018).
[5] Quach, A., Prakash, A., and Yan, L. Debloating software through piece-wise compilation and loading. In Proceedings of the 27th USENIX Security Symposium (2018).
[6] Salwan, J. ROPgadget: Gadgets finder and auto-roper, 2011. http://shell-storm.org/project/ROPgadget/
[7] Shacham, H. The geometry of innocent flesh on the bone: return-into-libc without function calls (on the x86). In Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS) (2007).
[8] Bletsch, T., Jiang, X., Freeh, V.W., and Liang, Z. Jump-oriented programming: a new class of code-reuse attack. In Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security (ASIACCS) (2011).
[9] Homescu, A., Stewart, M., Larsen, P., Brunthaler, S., and Franz, M. Microgadgets: size does matter in Turing-complete return-oriented programming. In Proceedings of the 6th USENIX conference on offensive technologies (WOOT) (2012).