Slide1
Grade Projections
I’ve calculated two grades for everyone:
Realistic: assumes your performance in the course continues the same
Optimistic: assumes you get maximum scores for the rest of the course
Some statistics:
Realistic: 1 A, 3 A-, 6 Bs, 4 Cs, 5 Ds, 1 F
Optimistic: 12 A, 7 A-, 1 B+, 1 B
There is still a lot of room to improve
Slide2
Dynamic Quarantine
Worms spread very fast (minutes, seconds)
Need automatic mitigation
If this is a new worm, no signature exists
Must apply behaviour-based anomaly detection
But this has a false-positive problem! We don’t want to drop legitimate connections!
Dynamic quarantine:
“Assume guilty until proven innocent”
Forbid access to suspicious hosts for a short time
This significantly slows down the worm spread
C. C. Zou, W. Gong, and D. Towsley. "Worm Propagation Modeling and Analysis under Dynamic Quarantine Defense," ACM CCS Workshop on Rapid Malcode (WORM'03)
Slide3
Dynamic Quarantine
Behavior-based anomaly detection can point out suspicious hosts
Need a technique that slows down worm spread but doesn’t hurt legitimate traffic much
The “assume guilty until proven innocent” technique briefly drops all outgoing connection attempts (for a specific service) from a suspicious host
After a while, just assume that the host is healthy, even if not proven so
This should slow down worms but cause only transient interruption of legitimate traffic
Slide4
Dynamic Quarantine
Assume we have some anomaly detection program that flags a host as suspicious
Quarantine this host
Release it after time T
The host may be quarantined multiple times if the anomaly detection raises an alarm
Since this doesn’t affect healthy hosts’ operation much, we can use a more sensitive anomaly detection technique
Slide5
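The quarantine-and-release cycle described above can be sketched in a few lines of Python. This is only an illustration: the class and method names are made up here, the anomaly detector is assumed to exist elsewhere, and restarting the timer on every new alarm is an assumption (the slides only say a host may be quarantined multiple times).

```python
class QuarantineTable:
    """Tracks which hosts are quarantined and when each is released."""

    def __init__(self, quarantine_time):
        self.quarantine_time = quarantine_time  # T from the slides
        self.release_at = {}                    # host -> release time

    def flag(self, host, now):
        """Called when the (assumed) anomaly detector raises an alarm."""
        # Assumption: every alarm starts a fresh quarantine period.
        self.release_at[host] = now + self.quarantine_time

    def allows(self, host, now):
        """Return True if an outgoing connection attempt may proceed."""
        release = self.release_at.get(host)
        if release is None:
            return True
        if now >= release:
            # Time T has elapsed: assume the host is healthy again.
            del self.release_at[host]
            return True
        return False
```

A host flagged at time 0 with T = 10 is blocked at time 5 and allowed again from time 10 on, whether or not it was ever proven healthy.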
Dynamic Quarantine
An infectious host is quarantined after time units
A susceptible host is falsely quarantined after time units
Quarantine time is T; after that we release the host
A few new categories:
Quarantined infectious R(t)
Quarantined susceptible Q(t)
Slide6
Slammer With DQ
Slide7
DQ With Large T?
T=10sec
T=30sec
Slide8
DQ And Patching?
Slide9
Patch Only Quarantined Hosts
Cleaning I(t)
Cleaning R(t)
Slide10
DOMINO
The goal is to build an overlay network so that nodes cooperatively detect intrusion activity
Cooperation reduces the number of false positives
Overlay can be used for worm detection
The main feature is active-sink nodes that detect traffic to unused IP addresses
The reaction is to build blacklists of infected nodes
V. Yegneswaran, P. Barford, S. Jha, “Global Intrusion Detection in the DOMINO Overlay System,” NDSS 2004
Slide11
DOMINO Architecture
Slide12
DOMINO Architecture
Axis nodes collect, aggregate and share data
Nodes in large, trustworthy ISPs
Each node maintains a NIDS and an active sink over a large portion of unused IP space
Access points grant access to axis nodes after thorough administrative checks
Satellite nodes form trees below an axis node, collect information, deliver it to axis nodes and pull relevant information
Terrestrial nodes supply daily summaries of port scan data
Slide13
Information Sharing
Every axis node maintains a global and local view of intrusion activity
Periodically a node receives summaries from peers which are used to update global view
List of worst offenders grouped per port
Lists of top scanned ports
RSA is used to authenticate nodes, and signed SHA digests are used to ensure message integrity and authenticity
Slide14
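As a rough illustration of how an axis node might fold peer summaries into its global view, the sketch below merges per-port worst-offender counts. The summary format (port mapped to a dictionary of source IP and scan count) and the function name are assumptions made for this example, not DOMINO's actual message format, and the RSA/SHA authentication step is left out.

```python
from collections import Counter, defaultdict


def merge_summaries(summaries, list_length=3):
    """Merge peer summaries into a global worst-offender list per port.

    Each summary is assumed to map port -> {source_ip: scan_count}.
    """
    totals = defaultdict(Counter)
    for summary in summaries:
        for port, offenders in summary.items():
            totals[port].update(offenders)  # adds the counts per source IP
    return {
        port: [ip for ip, _ in counts.most_common(list_length)]
        for port, counts in totals.items()
    }
```

With one peer reporting 5 scans from 10.0.0.1 on port 445 and another reporting 7 from 10.0.0.2, the merged list for port 445 ranks 10.0.0.2 first.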
How Many Nodes Do We Need?
40 for port summaries
20 for worst offender list
Slide15
How Frequent Should Info Exchange Be?
Staleness doesn’t matter much, but more frequent lists are better for catching worst offenders
Slide16
How Long Should Blacklists Be?
About 1000 IPs are enough
Slide17
How Close Should Monitoring Nodes Be?
Blacklists in the same /16 space are similar
Satellites in the same /16 space should be grouped under the same axis node, and sets of /16 spaces should be randomly distributed among different axis nodes
Slide18
Automatic Worm Signatures
Focus on TCP worms that propagate via scanning
Idea: vulnerability exploit is not easily mutable so worm packets should have some common signature
Step 1: Select suspicious TCP flows using heuristics
Step 2: Generate signatures using content prevalence analysis
H.-A. Kim and B. Karp, “Autograph: Toward Automated, Distributed Worm Signature Detection,” Proceedings of the 13th USENIX Security Symposium (Security 2004), San Diego, CA, August 2004
Slide19
Suspicious Flows
Detect scanners as hosts that make many unsuccessful connection attempts (>2)
Select their successful flows as suspicious
Build suspicious flow pool
When there are enough flows inside, trigger the signature generation step
Slide20
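The flow-selection heuristic above can be sketched as follows. The class name, the `observe` interface and the pool-size trigger are hypothetical; only the "more than 2 unsuccessful attempts" threshold comes from the slide.

```python
from collections import defaultdict


class SuspiciousFlowSelector:
    def __init__(self, failure_threshold=2, pool_trigger=3):
        self.failure_threshold = failure_threshold  # ">2" on the slide
        self.pool_trigger = pool_trigger            # hypothetical pool size
        self.failures = defaultdict(int)            # source -> failed attempts
        self.pool = []                              # suspicious flow payloads

    def observe(self, src, payload, succeeded):
        """Record one connection attempt.

        Returns True once the pool is large enough to trigger
        signature generation.
        """
        if not succeeded:
            self.failures[src] += 1
        elif self.failures[src] > self.failure_threshold:
            # Successful flow from a host already classified as a scanner.
            self.pool.append(payload)
        return len(self.pool) >= self.pool_trigger
```

Successful flows from hosts that never exceeded the failure threshold are ignored, so ordinary clients do not fill the pool.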
Signature Generation
Use most frequent byte sequences across flows as the signature
Naïve techniques fail at byte insertion, deletion, reordering
Content-based payload partitioning (COPP):
Partition wherever the Rabin fingerprint of a sliding window matches the breakmark; the pieces between partition points are the content blocks
Configurable parameters: window size, breakmark
Analyze which content blocks appear most frequently and what is the smallest set of those that covers most/all samples in the suspicious flow pool
Slide21
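A minimal sketch of content-based partitioning, to show why it tolerates byte insertion and deletion. It uses a simple polynomial rolling hash as a stand-in for a true Rabin fingerprint, and the default window size, modulus and breakmark are arbitrary values chosen for illustration rather than Autograph's settings.

```python
def copp_blocks(payload, window=4, modulus=16, breakmark=7):
    """Split payload (bytes) into blocks at content-determined boundaries."""
    base, prime = 257, 1_000_000_007
    top = pow(base, window - 1, prime)  # weight of the byte leaving the window
    blocks, start, h = [], 0, 0
    for i, byte in enumerate(payload):
        if i >= window:
            h = (h - payload[i - window] * top) % prime  # drop the oldest byte
        h = (h * base + byte) % prime                    # add the newest byte
        # Cut after byte i when the hash of the last `window` bytes matches.
        if i + 1 >= window and h % modulus == breakmark:
            blocks.append(payload[start:i + 1])
            start = i + 1
    if start < len(payload):
        blocks.append(payload[start:])  # trailing partial block
    return blocks
```

Because each boundary depends only on the bytes inside the window, inserting bytes at the front of a payload changes the first block but leaves every later block identical, which is what lets the same block be counted across slightly different flows.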
How Well Does it Work?
Tested on traces of HTTP traffic interlaced with known worms
For large block sizes and large coverage of suspicious flow pool (90-95%) Autograph performs very well
Low false positive and false negative rates
Slide22
Distributed Autograph
Would detect more scanners
Would produce more data for suspicious flow pool
Would reduce false positives and false negatives
Slide23
Automatic Signatures (approach 2)
Detect content prevalence
Some content may vary but some portion of worm remains invariant
Detect address dispersion
Same content will be sent from many hosts to many destinations
Challenge: how to detect these efficiently (low cost = fast operation)
S. Singh, C. Estan, G. Varghese and S. Savage, “Automated Worm Fingerprinting,” OSDI 2004
Slide24
Content Prevalence Detection
Hash content + port + proto and use this as key to a table where counters are kept
Content hash is calculated over overlapping blocks of fixed size
Use Rabin fingerprint as hash function
Autograph calculates Rabin fingerprint over variable-length blocks that are non-overlapping
Rabin fingerprint is a hash function that is efficient to recalculate if a portion of the input changes
Slide25
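The counter table described above might look roughly like this. The sketch hashes every overlapping fixed-size block and keys a counter on (hash, port, protocol). A truncated SHA-1 digest stands in for the Rabin fingerprint, so it shows the bookkeeping but not the incremental-update efficiency; the block size and function names are assumptions.

```python
import hashlib
from collections import Counter


def update_prevalence(table, payload, port, proto, block_size=8):
    """Count every overlapping fixed-size block of one packet payload."""
    for i in range(len(payload) - block_size + 1):
        block = payload[i:i + block_size]
        # Stand-in hash; a real system would use a rolling Rabin fingerprint.
        digest = hashlib.sha1(block).digest()[:8]
        table[(digest, port, proto)] += 1


def prevalent_keys(table, threshold):
    """Return the keys whose content has been seen at least `threshold` times."""
    return [key for key, count in table.items() if count >= threshold]
```

A caller would create one `Counter()`, feed it each packet, and pass the prevalent keys on to the address dispersion check.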
Address Dispersion Detection
Remembering sources and destinations for each content would require too much memory
Scaled bitmap:
Sample down input space, e.g., hash into values 0-63 but only remember those values that hash into 0-31
Set the bit for the output value (out of 32 bits)
Increase sampling-down factor each time bitmap is full = constant space, flexible counting
Slide26
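A simplified sketch of the scaled-bitmap idea, under several assumptions: SHA-1 is used as the hash, the bitmap counts as "full" only when all 32 bits are set, and on rescaling the bitmap is cleared while a conservative floor of 32 times the old factor is kept. The slides do not specify these details, so treat this as one possible reading rather than the deployed design.

```python
import hashlib


class ScaledBitmap:
    BITS = 32

    def __init__(self):
        self.scale = 1   # sampling-down factor
        self.bitmap = 0  # 32-bit bitmap held in an int
        self.floor = 0   # conservative lower bound kept across rescales

    def add(self, item):
        """Record one item (bytes), e.g. a source or destination address."""
        h = int.from_bytes(hashlib.sha1(item).digest()[:8], "big")
        value = h % (self.BITS * self.scale)  # 0-63 when scale == 2
        if value >= self.BITS:
            return                            # sampled out, not remembered
        self.bitmap |= 1 << value
        if self.bitmap == (1 << self.BITS) - 1:
            # Bitmap full: remember a floor, sample down further, start over.
            self.floor = self.BITS * self.scale
            self.scale *= 2
            self.bitmap = 0

    def estimate(self):
        """Rough count of distinct items; ignores hash collisions."""
        return max(self.floor, bin(self.bitmap).count("1") * self.scale)
```

Memory stays at 32 bits plus two small integers no matter how many distinct addresses arrive, at the cost of a coarser estimate as the factor grows.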
How Well Does This Work?
Implemented and deployed on the UCSD network
Slide27
How Well Does This Work?
Some false positives:
Spam, common HTTP protocol headers, etc. (easily whitelisted)
Popular BitTorrent files (not easily whitelisted)
No false negatives:
Detected each worm outbreak reported in the news
Cross-checked with Snort’s signature detection
Slide28
Polymorphic Worm Signatures
Insight: multiple invariant substrings must be present in all variants of the worm for the exploit to work
Protocol framing (force the vulnerable code down the path where the vulnerability exists)
Return address
Substrings not enough = too short
Signature: multiple disjoint byte strings
Conjunction of byte strings
Token subsequences (must appear in order)
Bayes-scored substrings (score + threshold)
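To make the three signature classes concrete, here is a minimal sketch of how each could be matched against a payload. The function names, the example tokens and the scores are invented for illustration; how Polygraph derives tokens and scores is a separate step not shown here.

```python
def matches_conjunction(payload, tokens):
    """All tokens must be present, in any order."""
    return all(tok in payload for tok in tokens)


def matches_subsequence(payload, tokens):
    """All tokens must be present, in the given order, without overlap."""
    pos = 0
    for tok in tokens:
        idx = payload.find(tok, pos)
        if idx < 0:
            return False
        pos = idx + len(tok)
    return True


def matches_bayes(payload, token_scores, threshold):
    """Sum the scores of the tokens present and compare to a threshold."""
    score = sum(s for tok, s in token_scores.items() if tok in payload)
    return score > threshold
```

A payload that contains every token but in the wrong order satisfies the conjunction signature and fails the token-subsequence one, which is why the ordered form is the stricter of the two.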
J. Newsome, B. Karp and D. Song, “Polygraph: Automatically Generating Signatures for Polymorphic Worms,” IEEE Security and Privacy Symposium, 2005
Slide29
Worm Code Structure
Invariant bytes: any change makes the worm fail
Wildcard bytes: any change has no effect
Code bytes: can be changed using some polymorphic technique and the worm will still work
E.g., encryption
Slide30
Polygraph Architecture
All traffic is seen, some is identified as part of suspicious flows and sent to suspicious traffic pool
May contain some good traffic
May contain multiple worms
Rest of traffic is sent to good traffic pool
Algorithm makes a single pass over pools and generates signatures
Slide31
Signature Detection
Extract tokens (variable length) that occur in at least K samples
Conjunction signature is this set of tokens
To find token-subsequence signatures, samples in the pool are aligned in different ways (shifted left or right) so that the maximum-length subsequences are identified
Contiguous tokens are preferred
For Bayes signatures, a probability is computed for each token that it is contained in a good or a suspicious flow; use this as a score
Set a high threshold value to avoid false positives
Slide32
How Well Does This Work?
Legitimate traffic traces: HTTP and DNS
Good traffic pool
Some of this traffic mixed with worm traffic to model imperfect separation
Worm traffic: ideally polymorphic worms generated from 3 known exploits
Various tests conducted
Slide33
How Well Does This Work?
When compared with single signature (longest substring) detection, all proposed signatures result in lower false positive rates
False negative rate is always zero if the suspicious pool has at least three samples
If some good traffic ends up in the suspicious pool:
False negative rate is still low
False positive rate is low until noise gets too big
If there are multiple worms in the suspicious pool and noise:
False positives and false negatives are still low
Slide34
Borrowed from Brent ByungHoon Kang, GMU
Botnets
Slide35
A Network of Compromised Computers on the Internet
IP locations of the Waledac botnet.
Borrowed from Brent ByungHoon Kang, GMU
Slide36
Botnets
Networks of compromised machines under the control of a hacker, the “bot-master”
Used for a variety of malicious purposes:
Sending spam/phishing emails
Launching denial-of-service attacks
Hosting servers (e.g., malware download site)
Proxying services (e.g., FastFlux network)
Information harvesting (credit card numbers, bank credentials, passwords, sensitive data)
Borrowed from Brent ByungHoon Kang, GMU
Slide37
Botnet with Central Control Server
After resolving the IP address for the IRC server, bot-infected machines CONNECT to the server, JOIN a channel, then wait for commands.
Borrowed from Brent ByungHoon Kang, GMU
Slide38
Botnet with Central Control Server
The botmaster sends a command to the channel. This tells the bots to perform an action.
Borrowed from Brent ByungHoon Kang, GMU
Slide39
Botnet with Central Control Server
The IRC server sends (broadcasts) the message to bots listening on the channel.
Borrowed from Brent ByungHoon Kang, GMU
Slide40
Botnet with Central Control Server
The bots perform the command.
In this example: attacking / scanning CNN.COM.
Borrowed from Brent ByungHoon Kang, GMU
Slide41
Botnet Sophistication Fueled by Underground Economy
Unfortunately, the detection, analysis and mitigation of botnets have proven to be quite challenging
Supported by a thriving underground economy
Professional-quality sophistication in creating malware code
Highly adaptive to existing mitigation efforts, such as taking down the central control server
Borrowed from Brent ByungHoon Kang, GMU
Slide42
Emerging Decentralized Peer to Peer Multi-layered Botnets
Traditional botnet communication:
Central IRC server for Command & Control (C&C)
Single point of mitigation: C&C server can be taken down or blacklisted
Botnets with peer to peer C&C:
No single point of failure
E.g., Waledac, Storm, and Nugache
Multi-layered architecture to obfuscate and hide control servers in upper tiers
Borrowed from Brent ByungHoon Kang, GMU
Slide43
Expected Use of DHT P2P Network
Publish and Search
Botmaster publishes commands under the key
Bots search for this key periodically
Bots download the commands
=> Asynchronous C&C
Borrowed from Brent ByungHoon Kang, GMU
Slide44
Multi-Layered Command and Control Architecture Through P2P
Each Supernode (server) publishes its location (IP address) under key 1 and key 2
Subcontrollers search for key 1
Subnodes (workers) search for key 2 to open a connection to the Supernodes
=> Synchronous C&C
Borrowed from Brent ByungHoon Kang, GMU
Slide45
Current Approaches to Botnets
Virus Scanner at Local Host:
Polymorphic binaries defeat signature scanning
Not installed even though it is almost free
Rootkit
Network Intrusion Detection Systems:
Keeping state for network flows
Deep packet inspection is expensive
Deployed at the LAN, and not scalable to ISP level
Requires Well-Trained Net-Security SysAdmin
Borrowed from Brent ByungHoon Kang, GMU
Slide46
Conficker infections are still increasing after one year!!!
There are millions of computers on the Internet that have neither a virus scanner nor an IDS
Borrowed from Brent ByungHoon Kang, GMU
Slide47
Botnet Enumeration Approach
Used for spam blocking, firewall configuration, DNS rewriting, and alerting sys-admins regarding local infections
Fundamentally differs from existing Intrusion Detection System (IDS) approaches:
IDS protects local hosts within its perimeter (LAN)
An enumerator would identify both local and remote infections
Identifying remote infections is crucial
There are numerous computers on the Internet that are not under the protection of IDS-based systems
Borrowed from Brent ByungHoon Kang, GMU
Slide48
How to Enumerate a Botnet
Need to know the method and protocols a bot uses to communicate with its peers
Using sand-box technique:
Run bot binary in a controlled environment
Network behaviors are captured/analyzed
Investigating the binary code itself:
Reversing the binary into high-level code
C&C protocol knowledge and operation details can be accurately obtained
Borrowed from Brent ByungHoon Kang, GMU
Slide49
Simple Crawler Approach
Given network protocol knowledge, crawlers:
1. collect list of initial bootstrap peers into queue
2. choose a peer node from the queue
3. send look-up or get-peer requests to the node
4. add newly discovered peers to the queue
5. repeat 2-4 until there are no more peers to contact
Can’t enumerate a node behind NAT/Firewall
Would miss bot-infected hosts at home/office!
Borrowed from Brent ByungHoon Kang, GMU
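The crawl loop above is a breadth-first traversal, sketched here. `get_peers` is a hypothetical callback standing in for the protocol-specific look-up or get-peer request; it is assumed to return an iterable of peer addresses and to raise `OSError` when a peer cannot be reached.

```python
from collections import deque


def crawl(bootstrap_peers, get_peers):
    """Enumerate peers breadth-first, starting from the bootstrap list.

    Returns (seen, responded): every address learned, and the subset
    that actually answered a request.
    """
    queue = deque(bootstrap_peers)
    seen = set(bootstrap_peers)
    responded = set()
    while queue:
        peer = queue.popleft()
        try:
            neighbours = get_peers(peer)
        except OSError:
            continue  # unreachable, e.g. behind NAT or a firewall
        responded.add(peer)
        for neighbour in neighbours:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen, responded
```

The gap between the two returned sets hints at the limitation noted above: a peer that never answers cannot be confirmed or asked for its own neighbours.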
Slide50
Passive P2P Monitor (PPM)
Given knowledge of the P2P protocol that the bot uses
A collection of “routing-only” nodes that:
Act as peers in the P2P network, but
Are controlled by us, the defender
PPM nodes can observe the traffic from peer infected hosts
A PPM node can be contacted by infected hosts behind NAT/Firewall
Borrowed from Brent ByungHoon Kang, GMU
Slide51
Crawler and Passive P2P Monitor (PPM)
Crawler
PPM
PPM
PPM
Borrowed from Brent ByungHoon Kang, GMU
Slide52
Crawler vs. PPM: # of IPs found
Borrowed from Brent ByungHoon Kang, GMU
Slide53
Botnet Enumeration Challenges
DHCP
NATs
Non-uniform bot distribution
Churn
Most estimates put the size of the largest botnets at tens of millions of bots
Actual size may be much smaller if we account for all of the above
Slide54
Fast Flux
Botnets use a lot of newly-created domains for phishing and malware delivery
Fast flux: changing name-to-IP mapping very quickly, using various IPs to thwart defense attempts to bring down the botnet
Single-flux: changing name-to-IP mapping for individual machines, e.g., a Web server
Double-flux: changing name-to-IP mapping for the DNS nameserver too
Proxies on compromised nodes fetch content from backend servers
Slide55
Fast Flux
Advantages for the attacker:
Simplicity: only one back end server is needed to deliver content
Layers of protection through disposable proxy nodes
Very resilient to takedown attempts
Slide56
Fast Flux Detection
Look for domain names where mapping to IP changes often
May be due to load balancing
May have other (non-botnet) cause, e.g., adult content delivery
Easy to fabricate domain names
Look for DNS records with short-lived domain names, with lots of A records, lots of NS records and diverse IP addresses (wrt AS and network access type)
Look for proxy nodes by poking them
Slide57
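The DNS indicators listed above could be combined into a simple check like the one below. Every threshold, the field names and the "at least three of four indicators" rule are assumptions picked for illustration; as the slide notes, load balancing and other legitimate setups can show some of the same traits, so a real detector would need tuning and whitelists.

```python
from dataclasses import dataclass


@dataclass
class DomainObservation:
    domain_age_days: int  # how long the domain name has existed
    a_records: set        # IP addresses seen in A records
    ns_records: set       # name servers seen in NS records
    asns: set             # autonomous systems those IPs belong to


def looks_fast_flux(obs, max_age_days=7, min_a=5, min_ns=3, min_asns=3):
    """Heuristic: flag the domain if at least three indicators are present."""
    indicators = [
        obs.domain_age_days <= max_age_days,  # short-lived domain name
        len(obs.a_records) >= min_a,          # lots of A records
        len(obs.ns_records) >= min_ns,        # lots of NS records
        len(obs.asns) >= min_asns,            # diverse IP addresses
    ]
    return sum(indicators) >= 3
```

A long-established site served from many IPs inside a single AS trips only the A-record indicator and is not flagged.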
Poking Botnets is Dangerous
They have been known to fight back
DDoS IPs that poke them (even if low workers are scanned)
They have been known to fabricate data for honeynets
A honeynet is a network of computers that sits in otherwise unused (dark) address space and is meant to be compromised by attackers
Slide58
Privacy
Slide59
What is Privacy?
Privacy is about PII
It is primarily a policy issue
Privacy is an issue of user education
Make sure users are aware of the potential use of the information they provide
Give the user control
Privacy is a security issue
Security is needed to implement the policy
Slide60
Security v. Privacy
Sometimes conflicting
Many security technologies depend on identification
Many approaches to privacy depend on hiding one’s identity
Sometimes supportive
Privacy depends on protecting PII (personally identifiable information)
Poor security makes it more difficult to protect such information
Slide61
Debate on Attribution
How much low-level information should be kept to help track down cyber attacks?
Such information can be used to breach privacy assurances
How long can such data be kept?
Slide62
Privacy is Not the Only Concern
Business Concerns
Disclosing information we think of as privacy-related can divulge business plans:
Mergers
Product plans
Investigations
Some “private” information is used for authentication:
SSN
Credit card numbers
Slide63
Aggregation of Data
Consider whether it is safe to release information in aggregate
Such information is presumably no longer personally identifiable
But given partial information, it is sometimes possible to derive other information by combining it with the aggregated data.
Slide64
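A toy illustration of the aggregation risk described above: two aggregate answers that each look harmless reveal one person's value when subtracted. The names and numbers are invented for the example.

```python
# Hypothetical private records; only aggregate totals are ever released.
salaries = {"alice": 70, "bob": 55, "carol": 90}


def total(names):
    """Aggregate query: sum of salaries over a group of people."""
    return sum(salaries[name] for name in names)


everyone = total(salaries)                 # released aggregate
without_carol = total(["alice", "bob"])    # second released aggregate
carol_salary = everyone - without_carol    # individual value recovered
```

Neither total names an individual, yet an observer who sees both, or who already knows Alice's and Bob's salaries, learns Carol's exactly.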
Anonymization of Data
Consider whether it is safe to release information that has been stripped of so called personal identifiers
Such information is presumably no longer personally identifiable
What is important is not just anonymity, but linkability
If I can link multiple queries, I might be able to infer the identity of the person issuing the queries through one query, at which point all anonymity is lost
Slide65
Traffic Analysis
Even when specifics of communication are hidden, the mere knowledge of communication between parties provides useful information to an adversary
E.g. pending mergers or acquisitions
Relationships between entities
Creates visibility into the structure of an organization
Allows some inference about interests
Slide66
Information for Traffic Analysis
Lists of the web sites you visit
Email logs
Phone records
Perhaps you expose the linkages through web sites like LinkedIn
Consider what information remains in the clear when you design security protocols
Slide67
Network Trace Sharing
Researchers need network data
To validate their solutions
To mine and understand trends
Sharing network data creates necessary diversity
Enables generalization of results
Creates a lot of privacy concerns
Very few public traffic trace archives (CAIDA, WIDE, LBNL, ITA, PREDICT, CRAWDAD, MIT DARPA)