/
REVERSE ENGINEERING TECHNIQUES REVERSE ENGINEERING TECHNIQUES

REVERSE ENGINEERING TECHNIQUES - PowerPoint Presentation

stefany-barnette
stefany-barnette . @stefany-barnette
Follow
361 views
Uploaded On 2018-09-20

REVERSE ENGINEERING TECHNIQUES - PPT Presentation

Presented By Sachin Saboji 2sd06cs079 CONTENTS Introduction Front End Data Analysis Control Flow Analysis Back End Challenges Conclusion Bibliography Introduction The purpose of reverse engineering is to understand a software system in order to facilitate enhancement correction ID: 672259

flow code graph control code flow control graph reverse program language decompiler analysis engineering machine level data front phases instructions representation basic

Share:

Link:

Embed:

Download Presentation from below link

Download Presentation The PPT/PDF document "REVERSE ENGINEERING TECHNIQUES" is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.


Presentation Transcript

Slide1

REVERSE ENGINEERING TECHNIQUES

Presented By

Sachin Saboji

2sd06cs079Slide2

CONTENTS

Introduction

Front End

Data Analysis

Control Flow Analysis

Back End

Challenges

Conclusion

BibliographySlide3

Introduction

The purpose of reverse engineering is to understand a software system in order to facilitate enhancement, correction, documentation, redesign, or reprogramming in a different programming

language.Slide4

What is Reverse Engineering

Reverse engineering is the process of analyzing a subject system to identify the system’s components and their interrelationships and create representations of the system in another form or at a higher level of

abstraction.Slide5

Difference between Reverse Engineering and Forward Engineering.

REVERSE ENGINEERING

FORWARD ENGINEERING

Given an application, deduce tentative requirements.

Given requirements, develop an application.

Less certain. An implementation can yield different requirements, depending on the reverse engineer’s interpretation.

More certain. The developer has requirements and must deliver an application that implements them.

Adaptive. The reverse engineer must find out what the developer actually did.

Prescriptive. Developers are told how to work.

Less mature. Skilled staff sparse.

More mature. Skilled staff readily available.

Can be performed 10 to 100 times faster than forward engineering.(days to weeks of work).

Time consuming (months to years of work).

The model can be imperfect. Salvaging partial information is still useful.

The model must be correct and complete or the application will fail.Slide6

How Reverse Engineering can be achieved.

Decompilers

The Phases of a Decompilers.

The Grouping of Phases.Slide7

Decompilers

A decompiler is a program that reads a program written in a machine language and translates it into an equivalent program in a high level language.Slide8

The Phases of a Decompiler

A decompiler is structured in a similar way to a compiler, by a series of phases that transform the source machine program from one representation to another.

No lexical analysis or scanning phase in the decompiler.Slide9

The Grouping of Phases

The decompiler phases are normally grouped into three different modules

Front-end

UDM

Back-endSlide10

Front End

The front-end is a machine dependent module, which takes as input the binary source program, parses the program, and produces as output the control flow graph and intermediate language representation of the program.Slide11

Syntax Analysis

The syntax analyzer is the first phase of the decompiler.

This sequence of bytes is checked for syntactic structure.

Valid strings are represented by a parse tree, which is input into the next phase, the semantic analyzer. Slide12

Semantic Analysis

The semantic analysis phase determines the meaning of a group of machine instructions.

It collects information on the individual instructions of a subroutine.Slide13

Intermediate Code Generation

In a decompiler, the front-end translates the machine language source code into an intermediate representation, which is suitable for analysis by the Universal Decompiling Machine.

The low-level intermediate representation is implemented in quadruples

opcode dest src1 src2 Slide14

Control Flow Graph Generation

The control flow graph generation phase constructs a call graph of the source program, and a control flow graph of basic blocks for each subroutine of the program.

Basic Blocks

A basic block is a sequence of instructions that has one entry point and one exit point.Slide15

Control Flow Graphs

A control flow graph is a directed graph that represents the flow of control of a program.

The nodes of this graph represent basic blocks of the program, and the edges represent the flow of control between nodes.

Graph Optimizations

Flow-of-control optimization is the method by which redundant jumps are eliminated. Slide16

Universal Decompiling Machine

The UDM is an intermediate module that is completely machine and language independent

It performs the core of the Decompiling analysis.

Two phases included in this module are the data flow and the control flow analyzers. Slide17

 

Data Flow Analysis

The low-level intermediate code generated by the front-end is an assembler type representation that makes use of registers and condition codes.

Data flow analysis is the process of collecting information about the way variables are used in a program, and summarizing it in the form of sets

.Slide18

Code Improving Optimizations

This section presents the code-improving transformations used by a decompiler.

Dead-Register Elimination

Dead-Condition Code Elimination

Condition Code PropagationSlide19

Control Flow Analysis

The control flow graph constructed by the front-end has no information on high-level language control structures.

Such a graph can be converted into a structured high-level language graph by means of a structuring algorithm. Slide20

Structuring Algorithms

In Decompilation, the aim of a structuring algorithm is to determine the underlying control structures of an arbitrary graph, thus converting it into a functional and semantic equivalent graph.

Structuring Loops

In order to structure loops, a loop in terms of a graph representation needs to be defined. This representation must be able to not only determine the extent of a loop, but also provide a nesting order for the loops. Slide21

Finding the type of Loop

Finding the Loop Follow Node

Application Order

The structuring algorithms presented in the previous sections determine the entry and exit (i.e. header and follow) nodes of sub graph that represent high-level loops.Slide22

The Back-end

This module is composed

by the code generator.

The code generator generates code for a predefined target high-level language.

Generating Code for a Basic Block

For each basic block, the instructions in the basic block are mapped to an equivalent instruction of the target language.Slide23

Generating Code for asgn Instructions

Generating Code for call Instructions

Generating Code for ret Instructions

Generating Code from Control Flow GraphsSlide24

Challenges

The main difficulty of a decompiler parser is the separation of code from data, that is, the determination of which bytes in memory represent code and which ones represent data.

Less mature. Skilled staff sparse.

The model can be imperfect. Salvaging partial information is still useful.

Less certain. An implementation can yield different requirements, depending on the reverse engineer’s interpretation.

The determination of data types such as arrays, records, and pointers.Slide25

Conclusion

This seminar report has presented techniques for the reverse compilation or Decompilation of binary programs.

A decompiler for a different machine can be built by writing a new front-end for that machine, and a decompiler for a different target high-level language can be built by writing a new back-end for the target language. Slide26

Further work on Decompilation can be done in two areas: the separation of code and data, and the determination of data types such as arrays, records, and pointers. Slide27

Bibliography

A. V. Aho, R. Sethi, and J. D. Ullman. Principles of Compiler Design. Narosa Publication House, 1985.

Michael Blaha and James Rumbaugh. Object-Oriented Modeling and Design with UML. Second Edition, Pearson Education 2008.

Cristina Cifuentes. Reverse Compilation Techniques, PhD Thesis. Queensland University of Technology, 1994.

http://www.cc.gatech.edu/reverse.Slide28

THANK YOU