
Grade Projections - PowerPoint Presentation

stefany-barnette
Uploaded On 2016-06-25



Presentation Transcript

Slide1

Grade Projections

I’ve calculated two grades for everyone:

Realistic: assumes your performance in the course continues the same

Optimistic: assumes you get maximum scores for the rest of the course
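A minimal sketch of how the two projections could be computed, assuming the course grade is a percentage of total points. The slides do not give the actual grading formula, so the function names and point values below are hypothetical.

```python
def realistic(earned: float, possible_so_far: float) -> float:
    """Projected final percentage if the current rate of performance continues."""
    return 100.0 * earned / possible_so_far


def optimistic(earned: float, possible_so_far: float, total_points: float) -> float:
    """Projected final percentage if every remaining point is earned."""
    remaining = total_points - possible_so_far
    return 100.0 * (earned + remaining) / total_points


# Hypothetical student: 45 of 60 points so far, 100 points in the whole course.
print(realistic(45, 60))        # 75.0
print(optimistic(45, 60, 100))  # 85.0
```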

Some statistics:

Realistic: 1 A, 3 A-, 6 Bs, 4 Cs, 5 Ds, 1 F

Optimistic: 12 A, 7 A-, 1 B+, 1 B

There is still a lot of room to improve

Slide2

Dynamic Quarantine

Worms spread very fast (minutes, seconds)

Need automatic mitigation

If this is a new worm, no signature exists

Must apply behaviour-based anomaly detection

But this has a false-positive problem! We don’t want to drop legitimate connections!

Dynamic quarantine

“Assume guilty until proven innocent”

Forbid access to suspicious hosts for a short time

This significantly slows down the worm spread

C. C. Zou, W. Gong, and D. Towsley, "Worm Propagation Modeling and Analysis under Dynamic Quarantine Defense," ACM CCS Workshop on Rapid Malcode (WORM'03)

Slide3

Dynamic Quarantine

Behavior-based anomaly detection can point out suspicious hosts

Need a technique that slows down worm spread but doesn’t hurt legitimate traffic much

“Assume guilty until proven innocent” technique will briefly drop all outgoing connection attempts (for a specific service) from a suspicious host

After a while just assume that host is healthy, even if not proven so

This should slow down worms but cause only transient interruption of legitimate traffic

Slide4

Dynamic Quarantine

Assume we have some anomaly detection program that flags a host as suspicious

Quarantine this host

Release it after time T

The host may be quarantined multiple times if the anomaly detection raises an alarm

Since this doesn’t affect healthy hosts’ operation a lot, we can use a more sensitive anomaly detection technique

Slide5
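The quarantine-and-release policy described above can be sketched as a small table of release times. This is only an illustration: the class and method names are hypothetical, time is passed in explicitly as a number of seconds, and the anomaly detector itself is assumed to exist elsewhere.

```python
class DynamicQuarantine:
    """Blocks a suspicious host for `duration` seconds, then releases it unconditionally."""

    def __init__(self, duration: float):
        self.duration = duration   # quarantine time T
        self.release_at = {}       # host -> time at which it is released

    def alarm(self, host: str, now: float) -> None:
        """Called by the anomaly detector; starts (or restarts) the quarantine."""
        self.release_at[host] = now + self.duration

    def allows(self, host: str, now: float) -> bool:
        """True if an outgoing connection attempt from `host` may pass at time `now`."""
        release = self.release_at.get(host)
        if release is None:
            return True
        if now >= release:
            del self.release_at[host]   # assume healthy again, even if not proven so
            return True
        return False


dq = DynamicQuarantine(duration=10.0)
dq.alarm("10.0.0.5", now=0.0)
print(dq.allows("10.0.0.5", now=3.0))    # False: still quarantined
print(dq.allows("10.0.0.5", now=10.0))   # True: released after T
print(dq.allows("10.0.0.7", now=3.0))    # True: never flagged
```

Because a falsely flagged host is blocked only for T, the detector can afford to raise alarms more readily; a host that keeps triggering alarms is simply quarantined again.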

Dynamic Quarantine

An infectious host is quarantined after time units

A susceptible host is falsely quarantined after time units

Quarantine time is T, after that we release the host

A few new categories:

Quarantined infectious R(t)

Quarantined susceptible Q(t)

Slide6

Slammer With DQ

Slide7

DQ With Large T?

T=10sec

T=30sec

Slide8

DQ And Patching?

Slide9

Patch Only Quarantined Hosts

Cleaning I(t)

Cleaning R(t)

Slide10

DOMINO

The goal is to build an overlay network so that nodes cooperatively detect intrusion activity

Cooperation reduces the number of false positives

Overlay can be used for worm detection

The main feature is active-sink nodes that detect traffic to unused IP addresses

The reaction is to build blacklists of infected nodes

V. Yegneswaran, P. Barford, S. Jha, “Global Intrusion Detection in the DOMINO Overlay System,” NDSS 2004

Slide11

DOMINO Architecture

Slide12

DOMINO Architecture

Axis nodes collect, aggregate and share data

Nodes in large, trustworthy ISPs

Each node maintains a NIDS and an active sink over a large portion of unused IP space

Access points grant access to axis nodes after thorough administrative checks

Satellite nodes form trees below an axis node, collect information, deliver it to axis nodes and pull relevant information

Terrestrial nodes supply daily summaries of port scan data

Slide13

Information Sharing

Every axis node maintains a global and local view of intrusion activity

Periodically a node receives summaries from peers which are used to update global view

List of worst offenders grouped per port

Lists of top scanned ports

RSA is used to authenticate nodes and signed SHA digests are used to ensure message integrity and authenticity

Slide14

How Many Nodes We Need?

40 for port summaries

20 for worst offender list

Slide15

How Frequent Info Exchange?

Staleness doesn’t matter much but more frequent lists are better to catch worst offenders

Slide16

How Long Blacklists?

About 1000 IPs are enough

Slide17

How Close Monitoring Nodes?

Blacklists in same /16 space are similar

Satellites in /16 space should be grouped under the same axis node and sets of /16 spaces should be randomly distributed among different axis nodes

Slide18

Automatic Worm Signatures

Focus on TCP worms that propagate via scanning

Idea: vulnerability exploit is not easily mutable so worm packets should have some common signature

Step 1: Select suspicious TCP flows using heuristics

Step 2: Generate signatures using content prevalence analysis

Kim, H.-A. and Karp, B., Autograph: Toward Automated, Distributed Worm Signature Detection, in the Proceedings of the 13th Usenix Security Symposium (Security 2004), San Diego, CA, August, 2004.

Slide19

Suspicious Flows

Detect scanners as hosts that make many unsuccessful connection attempts (>2)

Select their successful flows as suspicious

Build suspicious flow pool

When there are enough flows inside, trigger the signature generation step

Slide20
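A minimal sketch of the flow-selection heuristic described above, assuming connection records arrive as `(source, succeeded, payload)` tuples in time order. The record format, the function name, and the pool-size trigger are assumptions for illustration; only the "more than 2 failed attempts" rule comes from the slide.

```python
from collections import defaultdict

FAILED_THRESHOLD = 2  # a source with more than 2 failed attempts is treated as a scanner
POOL_TRIGGER = 3      # hypothetical pool size that triggers signature generation


def select_suspicious(events):
    """events: iterable of (src_ip, succeeded, payload) connection records in time order.

    Returns the payloads of successful flows made by sources already seen scanning.
    """
    failures = defaultdict(int)
    pool = []
    for src, succeeded, payload in events:
        if not succeeded:
            failures[src] += 1
        elif failures[src] > FAILED_THRESHOLD:
            pool.append(payload)
    return pool


events = [
    ("1.2.3.4", False, b""), ("1.2.3.4", False, b""), ("1.2.3.4", False, b""),
    ("1.2.3.4", True, b"EXPLOIT-A"),
    ("5.6.7.8", True, b"GET /index.html"),
]
pool = select_suspicious(events)
print(pool)                        # [b'EXPLOIT-A']
print(len(pool) >= POOL_TRIGGER)   # False: not enough flows yet to generate signatures
```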

Signature Generation

Use most frequent byte sequences across flows as the signature

Naïve techniques fail at byte insertion, deletion, reordering

Content-based payload partitioning (COPP)

Partition if Rabin fingerprint of a sliding window matches breakmark = content blocks

Configurable parameters: window size, breakmark

Analyze which content blocks appear most frequently and what is the smallest set of those that covers most/all samples in suspicious flow pool

Slide21
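The partitioning idea above can be illustrated with a short sketch. It is not Autograph's implementation: a simple polynomial hash stands in for the Rabin fingerprint, and the window size, modulus, and breakmark values are hypothetical. The point it shows is that block boundaries depend only on nearby content, so bytes inserted early in a payload do not change the blocks found later.

```python
from collections import Counter

WINDOW = 4      # sliding-window size in bytes (hypothetical value)
MODULUS = 8     # roughly one window in MODULUS ends a block (hypothetical value)
BREAKMARK = 0   # a window whose hash equals this value (mod MODULUS) ends a block


def window_hash(window: bytes) -> int:
    """Simple polynomial hash; a stand-in for a Rabin fingerprint."""
    h = 0
    for byte in window:
        h = (h * 257 + byte) % 1_000_003
    return h


def partition(payload: bytes) -> list:
    """Split a payload into content blocks at content-determined boundaries."""
    blocks, start = [], 0
    for end in range(WINDOW, len(payload) + 1):
        if window_hash(payload[end - WINDOW:end]) % MODULUS == BREAKMARK:
            blocks.append(payload[start:end])
            start = end
    if start < len(payload):
        blocks.append(payload[start:])   # trailing block without a breakmark
    return blocks


def block_prevalence(flows) -> Counter:
    """Count in how many flows each content block appears."""
    counts = Counter()
    for payload in flows:
        counts.update(set(partition(payload)))
    return counts


flows = [
    b"GET /a HTTP/1.1 <exploit bytes here> tail-one",
    b"POST /bb HTTP/1.0 <exploit bytes here> tail-two",
]
for block, n in block_prevalence(flows).most_common(3):
    print(n, block)
```

The most prevalent blocks are the signature candidates; choosing the smallest set that covers most of the pool is a separate step not shown here.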

How Well Does it Work?

Tested on traces of HTTP traffic interlaced with known worms

For large block sizes and large coverage of suspicious flow pool (90-95%) Autograph performs very well

Low false positive and false negative rates

Slide22

Distributed Autograph

Would detect more scanners

Would produce more data for suspicious flow pool

Reduce false positives and false negatives

Slide23

Automatic Signatures (approach 2)

Detect content prevalence

Some content may vary but some portion of worm remains invariant

Detect address dispersion

Same content will be sent from many hosts to many destinations

Challenge: how to detect these efficiently (low cost = fast operation)

S. Singh, C. Estan, G. Varghese and S. Savage, “Automated Worm Fingerprinting,” OSDI 2004

Slide24

Content Prevalence Detection

Hash content + port + proto and use this as key to a table where counters are kept

Content hash is calculated over overlapping blocks of fixed size

Use Rabin fingerprint as hash function

Autograph calculates Rabin fingerprint over variable-length blocks that are non-overlapping

Rabin fingerprint is a hash function that is efficient to recalculate if a portion of the input changes

Slide25
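The counting scheme described above might look like the following sketch. SHA-1 from the standard library stands in for the Rabin fingerprint (a real implementation would update the fingerprint incrementally as the window slides), and the block length and packet tuples are hypothetical.

```python
import hashlib
from collections import Counter

BLOCK = 8  # fixed block length in bytes (hypothetical; the slides give no value)


def prevalence_keys(payload: bytes, port: int, proto: str):
    """Yield one table key per overlapping fixed-size block of the payload."""
    for i in range(len(payload) - BLOCK + 1):
        block = payload[i:i + BLOCK]
        # Key = hash(content + port + proto), as on the slide.
        yield hashlib.sha1(block + port.to_bytes(2, "big") + proto.encode()).hexdigest()


def count_prevalence(packets) -> Counter:
    """packets: iterable of (payload, dst_port, proto). Returns key -> occurrence count."""
    table = Counter()
    for payload, port, proto in packets:
        table.update(prevalence_keys(payload, port, proto))
    return table


packets = [
    (b"AAAA-WORMBODY-1111", 80, "tcp"),
    (b"BBBBBB-WORMBODY-22", 80, "tcp"),
    (b"harmless request..", 80, "tcp"),
]
table = count_prevalence(packets)
print(max(table.values()))  # 2: blocks inside "-WORMBODY-" occur in two packets
```

Because the blocks overlap, the shared content is found even though it sits at a different offset in each packet.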

Address Dispersion Detection

Remembering sources and destinations for each content would require too much memory

Scaled bitmap:

Sample down input space, e.g., hash into values 0-63 but only remember those values that hash into 0-31

Set the bit for the output value (out of 32 bits)

Increase sampling-down factor each time bitmap is full = constant space, flexible counting

Slide26
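A simplified sketch of the scaled-bitmap idea described above. It follows the slide's description (hash into a larger range, remember only the low 32 values, increase the sampling-down factor when the bitmap fills) but it is not the estimator from the paper: the reset-and-carry-a-floor step and the crude count are assumptions made to keep the example short.

```python
import hashlib


class ScaledBitmap:
    """Constant-space counter of distinct items (simplified illustration)."""

    def __init__(self, bits: int = 32):
        self.bits = bits
        self.bitmap = 0   # never grows beyond `bits` bits
        self.scale = 1    # sampling-down factor
        self.floor = 0    # conservative count carried over from earlier, finer epochs

    def add(self, item: str) -> None:
        h = int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")
        value = h % (self.bits * self.scale)   # e.g. 0-63 when scale is 2
        if value >= self.bits:
            return                             # only values 0-31 are remembered
        self.bitmap |= 1 << value
        if self.bitmap == (1 << self.bits) - 1:
            # Bitmap is full: remember a floor, sample more coarsely, start over.
            self.floor = self.bits * self.scale
            self.scale *= 2
            self.bitmap = 0

    def estimate(self) -> int:
        """Rough, conservative count: bits set, scaled up by the sampling factor."""
        return max(self.floor, bin(self.bitmap).count("1") * self.scale)


sb = ScaledBitmap()
for i in range(5000):
    sb.add(f"10.0.{i // 256}.{i % 256}")   # 5000 distinct hypothetical source addresses
print(sb.scale, sb.estimate())
```

Memory stays at 32 bits plus two integers no matter how many sources are seen; the price is a coarse estimate, which is acceptable when the goal is only to tell "a handful of hosts" from "very many hosts".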

How Well Does This Work?

Implemented and deployed at UCSD network

Slide27

How Well Does This Work?

Some false positives

Spam, common HTTP protocol headers .. (easily whitelisted)

Popular BitTorrent files (not easily whitelisted)

No false negatives

Detected each worm outbreak reported in news

Cross-checked with Snort’s signature detection

Slide28

Polymorphic Worm Signatures

Insight: multiple invariant substrings must be present in all variants of the worm for the exploit to work

Protocol framing (force the vulnerable code down the path where the vulnerability exists)

Return address

Substrings not enough = too short

Signature: multiple disjoint byte strings

Conjunction of byte strings

Token subsequences (must appear in order)

Bayes-scored substrings (score + threshold)

J. Newsome, B. Karp and D. Song, “Polygraph: Automatically Generating Signatures for Polymorphic Worms,” IEEE Security and Privacy Symposium, 2005

Slide29

Worm Code Structure

Invariant bytes: any change makes the worm fail

Wildcard bytes: any change has no effect

Code bytes: Can be changed using some polymorphic technique and worm will still work

E.g., encryption

Slide30

Polygraph Architecture

All traffic is seen, some is identified as part of suspicious flows and sent to suspicious traffic pool

May contain some good traffic

May contain multiple worms

Rest of traffic is sent to good traffic pool

Algorithm makes a single pass over pools and generates signatures

Slide31

Signature Detection

Extract tokens (variable length) that occur in at least K samples

Conjunction signature is this set of tokens

To find token-subsequence signatures, samples in the pool are aligned in different ways (shifted left or right) so that the maximum-length subsequences are identified

Contiguous tokens are preferred

For Bayes signatures, a probability is computed for each token that it is contained in a good or a suspicious flow, and this is used as a score

Set high value of threshold to avoid false positives

Slide32
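To make the three signature classes described above concrete, here is a sketch of how each one could be matched against a payload once the tokens are known. The token values, scores, and function names are hypothetical; token extraction and score computation are not shown.

```python
def matches_conjunction(payload: bytes, tokens) -> bool:
    """Conjunction signature: every token must appear somewhere, in any order."""
    return all(tok in payload for tok in tokens)


def matches_subsequence(payload: bytes, tokens) -> bool:
    """Token-subsequence signature: tokens must appear in the given order."""
    pos = 0
    for tok in tokens:
        idx = payload.find(tok, pos)
        if idx < 0:
            return False
        pos = idx + len(tok)   # the next token must start after this one ends
    return True


def matches_bayes(payload: bytes, scores: dict, threshold: float) -> bool:
    """Bayes signature: sum the scores of the tokens present, compare to a threshold."""
    return sum(s for tok, s in scores.items() if tok in payload) > threshold


tokens = [b"GET ", b"HTTP/1.1\r\n", b"\xff\xbf"]   # hypothetical tokens
worm = b"GET /x?" + b"\x90" * 4 + b" HTTP/1.1\r\nHost: a\r\n\r\n\xff\xbf"
reordered = b"\xff\xbf GET / HTTP/1.1\r\n"

print(matches_conjunction(worm, tokens), matches_subsequence(worm, tokens))            # True True
print(matches_conjunction(reordered, tokens), matches_subsequence(reordered, tokens))  # True False
print(matches_bayes(worm, {b"GET ": 0.1, b"\xff\xbf": 2.0}, threshold=1.5))            # True
```

The second line shows why ordering matters: the reordered payload contains every token and so satisfies the conjunction, but the stricter subsequence signature rejects it.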

How Well Does This Work?

Legitimate traffic traces: HTTP and DNS

Good traffic pool

Some of this traffic mixed with worm traffic to model imperfect separation

Worm traffic: Ideally-polymorphic worms generated from 3 known exploits

Various tests conducted

Slide33

How Well Does This Work?

When compared with single signature (longest substring) detection, all proposed signatures result in lower false positive rates

False negative rate is always zero if the suspicious pool has at least three samples

If some good traffic ends up in suspicious pool

False negative rate is still low

False positive rate is low until noise gets too big

If there are multiple worms in suspicious pool and noise

False positives and false negatives are still low

Slide34

Borrowed from Brent ByungHoon Kang, GMU

Botnets

Slide35

A Network of Compromised Computers on the Internet

IP locations of the Waledac botnet.

Borrowed from Brent ByungHoon Kang, GMU

Slide36

Botnets

Networks of compromised machines under the control of a hacker, the “bot-master”

Used for a variety of malicious purposes:

Sending Spam/Phishing Emails

Launching Denial of Service attacks

Hosting Servers (e.g., Malware download site)

Proxying Services (e.g., FastFlux network)

Information Harvesting (credit card, bank credentials, passwords, sensitive data)

Borrowed from Brent ByungHoon Kang, GMU

Slide37

Botnet with Central Control Server

After resolving the IP address for the IRC server, bot-infected machines CONNECT to the server, JOIN a channel, then wait for commands.

Borrowed from Brent ByungHoon Kang, GMU

Slide38

Botnet with Central Control Server

The botmaster sends a command to the channel. This will tell the bots to perform an action.

Borrowed from Brent ByungHoon Kang, GMU

Slide39

Botnet with Central Control Server

The IRC server sends (broadcasts) the message to bots listening on the channel.

Borrowed from Brent ByungHoon Kang, GMU

Slide40

Botnet with Central Control Server

The bots perform the command.

In this example: attacking / scanning CNN.COM.

Borrowed from Brent ByungHoon Kang, GMU

Slide41

Botnet Sophistication Fueled by Underground Economy

Unfortunately, the detection, analysis and mitigation of botnets have proven to be quite challenging

Supported by a thriving underground economy

Professional quality sophistication in creating malware code

Highly adaptive to existing mitigation efforts such as taking down of central control server.

Borrowed from Brent ByungHoon Kang, GMU

Slide42

Emerging Decentralized Peer to Peer Multi-layered Botnets

Traditional botnet communication

Central IRC server for Command & Control (C&C)

Single point of mitigation:

C&C Server can be taken down or blacklisted

Botnets with peer to peer C&C

No single point of failure.

E.g., Waledac, Storm, and Nugache

Multi-layered architecture to obfuscate and hide control servers in upper tiers.

Borrowed from Brent ByungHoon Kang, GMU

Slide43

Expected Use of DHT P2P Network

Publish and Search

Botmaster publishes commands under the key.

Bots are searching for this key periodically

Bots download the commands

=> Asynchronous C&C

Borrowed from Brent ByungHoon Kang, GMU

Slide44

Multi-Layered Command and Control Architecture Through P2P

Each Supernode (server) publishes its location (IP address) under the key 1 and key 2

Subcontrollers search for key 1

Subnodes (workers) search for key 2 to open connection to the Supernodes

=> Synchronous C&C

Borrowed from Brent ByungHoon Kang, GMU

Slide45

Current Approaches to Botnet

Virus Scanner at Local Host

Polymorphic binaries against signature scanning

Not installed even though it is almost free

Rootkit

Network Intrusion Detection Systems

Keeping states for network flows

Deep packet inspection is expensive

Deployed at LAN, and not scalable to ISP-level

Requires Well-Trained Net-Security SysAdmin

Borrowed from Brent ByungHoon Kang, GMU

Slide46


Conficker infections are still increasing after one year!!!

There are millions of computers on the Internet that have neither a virus scanner nor an IDS

Borrowed from Brent ByungHoon Kang, GMU

Slide47

Botnet Enumeration Approach

Used for spam blocking, firewall configuration, DNS rewriting, and alerting sys-admins regarding local infections.

Fundamentally differs from existing Intrusion Detection System (IDS) approaches

IDS protects local hosts within its perimeter (LAN)

An enumerator would identify both local and remote infections

Identifying remote infections is crucial

There are numerous computers on the Internet that are not under the protection of IDS-based systems.

Borrowed from Brent ByungHoon Kang, GMU

Slide48

How to Enumerate Botnet

Need to know the method and protocols for how a bot communicates with its peers

Using sand-box technique

Run bot binary in a controlled environment

Network behaviors are captured/analyzed

Investigating the binary code itself

Reversing the binary into high level codes

C&C Protocol knowledge and operation details can be accurately obtained

Borrowed from Brent ByungHoon Kang, GMU

Slide49

Simple Crawler Approach

Given network protocol knowledge, crawlers:

1. collect list of initial bootstrap peers into queue

2. choose a peer node from the queue

3. send look-up or get-peer requests to the node

4. add newly discovered peers to the queue

5. repeat steps 2-4 until there are no more peers to contact

Can’t enumerate a node behind NAT/Firewall

Would miss bot-infected hosts at home/office!

Borrowed from Brent ByungHoon Kang, GMU
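The crawler loop above amounts to a breadth-first search over the peer graph. The sketch below runs it against a small made-up overlay rather than a real protocol: `NEIGHBORS`, `REACHABLE`, and `get_peers` are hypothetical stand-ins for the look-up/get-peer exchange, and peers behind NAT are modeled as nodes that never answer.

```python
from collections import deque

# Hypothetical overlay: peer -> peers it returns for a get-peer request.
NEIGHBORS = {
    "A": ["B", "C"],
    "B": ["A", "D", "N1"],
    "C": ["D"],
    "D": ["N2"],
}
REACHABLE = {"A", "B", "C", "D"}   # N1 and N2 sit behind NAT and never answer


def get_peers(node):
    """Stand-in for the protocol's look-up/get-peer request; None means no reply."""
    return NEIGHBORS.get(node, []) if node in REACHABLE else None


def crawl(bootstrap):
    queue = deque(bootstrap)        # initial bootstrap peers
    seen = set(bootstrap)
    responded = set()
    while queue:                    # until no more peers to contact
        node = queue.popleft()      # choose a peer from the queue
        peers = get_peers(node)     # send the request
        if peers is None:
            continue                # no answer, e.g. node behind NAT/firewall
        responded.add(node)
        for peer in peers:          # add newly discovered peers
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return responded, seen - responded


confirmed, unconfirmed = crawl(["A"])
print(sorted(confirmed))     # ['A', 'B', 'C', 'D']
print(sorted(unconfirmed))   # ['N1', 'N2']
```

The crawler only learns about N1 and N2 second-hand and cannot contact them, which is the limitation the slide points out.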

Slide50

Passive P2P Monitor (PPM)

Given P2P protocol knowledge that bot uses

A collection of “routing-only” nodes that

Act as peer in the P2P network, but

Controlled by us, the defender

PPM nodes can observe the traffic from the peer infected hosts

PPM node can be contacted by the infected hosts behind NAT/Firewall

Borrowed from Brent ByungHoon Kang, GMU

Slide51

Crawler and Passive P2P Monitor (PPM)

[Figure: Crawler and PPM nodes]

Borrowed from Brent ByungHoon Kang, GMU

Slide52

Crawler vs. PPM: # of IPs found

Borrowed from Brent ByungHoon Kang, GMU

Slide53

Botnet Enumeration Challenges

DHCP

NATs

Non-uniform bot distribution

Churn

Most estimates put size of largest botnets at tens of millions of bots

Actual size may be much smaller if we account for all of the above

Slide54

Fast Flux

Botnets use a lot of newly-created domains for phishing and malware delivery

Fast flux: changing name-to-IP mapping very quickly, using various IPs to thwart defense attempts to bring down botnet

Single-flux: changing name-to-IP mapping for individual machines, e.g., a Web server

Double-flux: changing name-to-IP mapping for DNS nameserver too

Proxies on compromised nodes fetch content from backend servers

Slide55

Fast Flux

Advantages for the attacker:

Simplicity: only one back end server is needed to deliver content

Layers of protection through disposable proxy nodes

Very resilient to attempts for takedown

Slide56

Fast Flux Detection

Look for domain names where mapping to IP changes often

May be due to load balancing

May have other (non-botnet) cause, e.g., adult content delivery

Easy to fabricate domain names

Look for DNS records with short-lived domain names, with lots of A records, lots of NS records and diverse IP addresses (wrt AS and network access type)

Look for proxy nodes by poking them

Slide57
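The DNS indicators listed above can be combined into a simple score. This is a sketch only: the record structure, field names, and every threshold are hypothetical and would need tuning on real traffic, and a high score is a reason to look closer, not proof of a botnet.

```python
from dataclasses import dataclass


@dataclass
class DnsObservation:
    name: str
    ttl_seconds: int   # TTL of the A records
    a_records: list    # IP addresses returned
    ns_records: list   # nameservers returned
    asns: set          # autonomous systems the A-record IPs belong to


def flux_score(obs: DnsObservation) -> int:
    """Count how many fast-flux indicators an observation exhibits (0 to 4)."""
    return sum([
        obs.ttl_seconds <= 300,      # short-lived mapping
        len(obs.a_records) >= 5,     # lots of A records
        len(obs.ns_records) >= 4,    # lots of NS records
        len(obs.asns) >= 3,          # IPs spread over unrelated networks
    ])


cdn = DnsObservation("cdn.example", 60,
                     ["192.0.2.1", "192.0.2.2"],
                     ["ns1.example", "ns2.example"], {64500})
flux = DnsObservation("x.example", 120,
                      [f"198.51.100.{i}" for i in range(8)],
                      [f"ns{i}.x.example" for i in range(5)],
                      {64500, 64501, 64502, 64503})
print(flux_score(cdn))    # 1
print(flux_score(flux))   # 4
```

The load-balanced name still scores 1 on TTL alone, which reflects the caveat above that frequent mapping changes also have legitimate causes.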

Poking Botnets is Dangerous

They have been known to fight back

DDoS IPs that poke them (even if low workers are scanned)

They have been known to fabricate data for honeynets

Honeynet is a network of computers that sits in otherwise unused (dark) address space and is meant to be compromised by attackers

Slide58

Privacy

Slide59

What is Privacy?

Privacy is about PII

It is primarily a policy issue

Privacy is an issue of user education

Make sure users are aware of the potential use of the information they provide

Give the user control

Privacy is a security issue

Security is needed to implement the policy

Slide60

Security v. Privacy

Sometimes conflicting

Many security technologies depend on identification

Many approaches to privacy depend on hiding one’s identity

Sometimes supportive

Privacy depends on protecting PII (personally identifiable information)

Poor security makes it more difficult to protect such information

Slide61

Debate on Attribution

How much low level information should be kept to help track down cyber attacks

Such information can be used to breach privacy assurances

How long can such data be kept

Slide62

Privacy is Not the Only Concern

Business Concerns

Disclosing Information we think of as privacy-related can divulge business plans

Mergers

Product plans

Investigations

Some “private” information is used for authentication

SSN

Credit card numbers

Slide63

Aggregation of Data

Consider whether it is safe to release information in aggregate

Such information is presumably no longer personally identifiable

But given partial information, it is sometimes possible to derive other information by combining it with the aggregated data.

Slide64
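A small worked example of the risk described above, using made-up salary data. The names, values, and the `released_total` interface are hypothetical; the point is only that two aggregates which each look harmless can be combined to recover one person's value.

```python
# Hypothetical salary records; only aggregates are ever released.
salaries = {"alice": 82_000, "bob": 75_000, "carol": 91_000, "dave": 68_000}


def released_total(names) -> int:
    """An aggregate-only query interface: returns a sum, never an individual row."""
    return sum(salaries[n] for n in names)


everyone = released_total(salaries)   # total over all staff
all_but_carol = released_total(n for n in salaries if n != "carol")

# The two aggregates differ by exactly one person's value.
print(everyone - all_but_carol)   # 91000
```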

Anonymization of Data

Consider whether it is safe to release information that has been stripped of so called personal identifiers

Such information is presumably no longer personally identifiable

What is important is not just anonymity, but linkability

If I can link multiple queries, I might be able to infer the identity of the person issuing the query through one query, at which point, all anonymity is lost

Slide65

Traffic Analysis

Even when specifics of communication are hidden, the mere knowledge of communication between parties provides useful information to an adversary

E.g. pending mergers or acquisitions

Relationships between entities

Creates visibility into the structure of an organization

Allows some inference about interests

Slide66

Information for Traffic Analysis

Lists of the web sites you visit

Email logs

Phone records

Perhaps you expose the linkages through web sites like LinkedIn

Consider what information remains in the clear when you design security protocols

Slide67

Network Trace Sharing

Researchers need network data

To validate their solutions

To mine and understand trends

Sharing network data creates necessary diversity

Enables generalization of results

Creates a lot of privacy concerns

Very few public traffic trace archives (CAIDA, WIDE, LBNL, ITA, PREDICT, CRAWDAD, MIT DARPA)