Presentation Transcript

Firewalls and Intrusion Detection Systems
David Brumley, dbrumley@cmu.edu
Carnegie Mellon University

IDS and Firewall Goals
- Expressiveness: What kinds of policies can we write?
- Effectiveness: How well does it detect attacks while avoiding false positives?
- Efficiency: How many resources does it take, and how quickly does it decide?
- Ease of use: How much training is necessary? Can a non-security expert use it?
- Security: Can the system itself be attacked?
- Transparency: How intrusive is it to use?

Firewalls
Dimensions:
- Host vs. network
- Stateless vs. stateful
- Network layer at which it operates

Firewall Goals
Provide defense in depth by:
- Blocking attacks against hosts and services
- Controlling traffic between zones of trust

Logical Viewpoint
A firewall sits between the inside and the outside. For each message m, it either:
- Allows m, with or without modification
- Blocks m, by dropping it or sending a rejection notice
- Queues m

Placement
Host-based firewall: sits on the host itself, between the host and the outside.
- Faithful to the local configuration
- Travels with you
Network-based firewall: sits between the outside and a set of hosts (A, B, C).
- Protects the whole network
- Can make decisions on all of the traffic (e.g., traffic-based anomaly detection)

Parameters
Types of firewalls:
- Packet filtering
- Stateful inspection
- Application proxy
Policies:
- Default allow
- Default deny

Recall: Protocol Stack
- Application (e.g., SSL)
- Transport (e.g., TCP, UDP)
- Network (e.g., IP)
- Link layer (e.g., Ethernet)
- Physical
Each layer encapsulates the one above it: the application message (data) is wrapped with a TCP header, then an IP header, then an Ethernet header and trailer.

Stateless Firewall
Filters by packet header fields, at the network layer:
- IP fields (e.g., src, dst)
- Protocol (e.g., TCP, UDP, ...)
- Flags (e.g., SYN, ACK)
Example: only allow incoming DNS packets to nameserver A.A.A.A:
- Allow UDP port 53 to A.A.A.A
- Deny UDP port 53 to all
The trailing deny-all rule is fail-safe good practice.
Example implementation: ipchains in Linux 2.2.
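As a hedged illustration (not part of the original slides), here is a minimal sketch of how a stateless filter might evaluate the two DNS rules above. The packet representation, the rule-table layout, and the 10.0.0.53 address standing in for "A.A.A.A" are all illustrative assumptions.

    # Minimal sketch of a stateless packet filter (illustrative only).
    # Assumes packets are dicts with proto, dst_ip, and dst_port fields.
    RULES = [
        # (action, protocol, destination IP or None = any, destination port)
        ("allow", "udp", "10.0.0.53", 53),   # allow DNS to the nameserver ("A.A.A.A")
        ("deny",  "udp", None,        53),   # deny DNS to everything else (fail-safe)
    ]

    def filter_packet(pkt):
        """Return 'allow' or 'deny' based on the first matching rule."""
        for action, proto, dst_ip, dst_port in RULES:
            if pkt["proto"] != proto:
                continue
            if dst_ip is not None and pkt["dst_ip"] != dst_ip:
                continue
            if dst_port is not None and pkt["dst_port"] != dst_port:
                continue
            return action
        return "deny"  # default deny if no rule matches

    print(filter_packet({"proto": "udp", "dst_ip": "10.0.0.53", "dst_port": 53}))  # allow
    print(filter_packet({"proto": "udp", "dst_ip": "10.0.0.7",  "dst_port": 53}))  # deny

Note that the decision depends only on the current packet's headers; no memory of earlier packets is kept, which is exactly the limitation the next slides address.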

Need to Keep State
Example: TCP handshake. The client starts in Listening, picks SN_C = rand_C, AN_C = 0, and sends a SYN; the server replies with SYN/ACK (SN_S = rand_S, AN_S = SN_C); the client completes with ACK (SN = SN_C + 1, AN = SN_S) and the connection is Established.
Desired policy: every SYN/ACK must have been preceded by a SYN.
To enforce this, the firewall must store SN_C and SN_S, i.e., keep state.

Stateful Inspection Firewall
Adds state at the network layer, plus the obligation to manage it:
- Timeouts
- Size of the state table
Example implementation: iptables in Linux 2.4.

Stateful = More Expressive
Example: TCP handshake. When the firewall sees the outgoing SYN it records SN_C in its table; when the SYN/ACK arrives it verifies AN_S against the table.
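A hedged sketch (not from the slides) of what such SYN/SYN-ACK tracking could look like; the packet fields and the table layout are assumptions for illustration. Note that real TCP acknowledges SN_C + 1.

    # Minimal sketch of stateful SYN/ACK tracking (illustrative only).
    # Policy: a SYN/ACK from outside is allowed only if we saw the matching SYN.
    pending_syns = {}  # (client_ip, client_port, server_ip, server_port) -> client SN

    def outbound(pkt):
        """Called for packets leaving the inside network."""
        if pkt["flags"] == {"SYN"}:
            key = (pkt["src_ip"], pkt["src_port"], pkt["dst_ip"], pkt["dst_port"])
            pending_syns[key] = pkt["seq"]          # remember SN_C
        return "allow"

    def inbound(pkt):
        """Called for packets arriving from outside."""
        if pkt["flags"] == {"SYN", "ACK"}:
            key = (pkt["dst_ip"], pkt["dst_port"], pkt["src_ip"], pkt["src_port"])
            sn_c = pending_syns.get(key)
            if sn_c is None or pkt["ack"] != sn_c + 1:
                return "deny"                       # SYN/ACK with no preceding SYN
        return "allow"

    outbound({"src_ip": "10.0.0.5", "src_port": 4444, "dst_ip": "1.2.3.4",
              "dst_port": 80, "flags": {"SYN"}, "seq": 1000})
    print(inbound({"src_ip": "1.2.3.4", "src_port": 80, "dst_ip": "10.0.0.5",
                   "dst_port": 4444, "flags": {"SYN", "ACK"}, "ack": 1001}))  # allow
    print(inbound({"src_ip": "6.6.6.6", "src_port": 80, "dst_ip": "10.0.0.5",
                   "dst_port": 4444, "flags": {"SYN", "ACK"}, "ack": 1}))     # deny

A real implementation would also bound the table size and expire entries, which is exactly the exposure the state-holding attack on the next slide exploits.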

State Holding Attack
Assume a stateful TCP policy.
1. The attacker floods the firewall with SYNs (SYN flood).
2. The firewall's state table fills up, exhausting its resources.
3. The attacker sneaks a packet through.

Fragmentation
IP header fields relevant here: Version, IHL, TOS, Total Length; ID; the flags DF and MF; and the fragment offset.
- DF: Don't Fragment (0 = may fragment, 1 = don't fragment)
- MF: More Fragments (0 = last fragment, 1 = more fragments)
- Fragment offset: the octet number at which this fragment's data starts (carried in 8-byte units on the wire)
Example: data split into three fragments of n bytes each:
- Frag 1: DF=0, MF=1, offset 0
- Frag 2: DF=0, MF=1, offset n
- Frag 3: DF=1, MF=0, offset 2n

Reassembly
The receiver rebuilds the original data by placing each fragment at its offset: Frag 1 at byte 0, Frag 2 at byte n, Frag 3 at byte 2n.

Example
A 2,366-byte packet enters an Ethernet network with a default MTU of 1500.
Packet 1: 1500 bytes
- 20 bytes IP header
- 24 bytes TCP header
- 1456 bytes of data
- DF = 0 (may fragment), MF = 1 (more fragments)
- Fragment offset = 0
Packet 2: 910 bytes
- 20 bytes IP header
- 24 bytes TCP header
- 866 bytes of data
- DF = 0 (may fragment), MF = 0 (last fragment)
- Fragment offset = 182 (1456 bytes / 8)
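A small illustrative calculation of the offset arithmetic, using the byte counts above (this snippet is mine, not from the slides):

    # Reproduce the offset arithmetic from the example above (illustrative only).
    MTU = 1500
    IP_HDR = 20
    TCP_HDR = 24

    frag1_data = MTU - IP_HDR - TCP_HDR        # 1456 bytes of data in packet 1
    frag2_offset_field = frag1_data // 8       # offsets are stored in 8-byte units
    print(frag1_data, frag2_offset_field)      # 1456 182

(In real IP fragmentation only the first fragment carries the TCP header; the slide's example counts a TCP header in both fragments.)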

Overlapping Fragment Attack
Assume the firewall policy allows incoming port 80 (HTTP) and blocks incoming port 22 (SSH).
- Packet 1 (DF=1, MF=1, ID=0): a TCP header with src port 1234 and dst port 80, which the firewall allows.
- Packet 2 (DF=1, MF=1, ID=2): sent as "data", but its fragment overlaps the TCP header region and rewrites the destination port to 22.
When the end host reassembles the fragments, the connection is delivered to port 22, bypassing the policy.

Stateful Firewalls
Pros:
- More expressive
Cons:
- State-holding attacks
- Mismatch between the firewall's understanding of the protocol and the protected host's

Application Firewall
Checks protocol messages directly, with state at the application layer.
Examples:
- SMTP virus scanner
- Proxies
- Application-level callbacks

Firewall Placement

Demilitarized Zone (DMZ)
A single firewall separates the inside, the outside, and a DMZ hosting the externally reachable servers (WWW, NNTP, DNS, SMTP).

Dual Firewall
An exterior firewall sits between the outside and the DMZ (a hub), and an interior firewall sits between the DMZ and the inside.

Design Utilities
- Solsoft
- Securify

References
- Elizabeth D. Zwicky, Simon Cooper, D. Brent Chapman
- William R. Cheswick, Steven M. Bellovin, Aviel D. Rubin

Intrusion Detection and Prevention Systems

Logical Viewpoint
An IDS/IPS sits between the inside and the outside. For each message m, it either:
- Reports m (an IPS may also drop or log it)
- Allows m
- Queues m

Overview
- Approach: policy-based vs. anomaly-based
- Location: network vs. host
- Action: detect vs. prevent

Policy-Based IDS
Uses pre-determined rules to detect attacks.
Examples: regular expressions (Snort), cryptographic hashes (Tripwire, Snort).
Example Snort rules:
Detect any fragments less than 256 bytes:
    alert tcp any any -> any any (minfrag: 256; msg: "Tiny fragments detected, possible hostile activity";)
Detect an IMAP buffer overflow:
    alert tcp any any -> 192.168.1.0/24 143 (content: "|90C8 C0FF FFFF|/bin/sh"; msg: "IMAP buffer overflow!";)
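As a hedged illustration of how a content signature like the IMAP rule works, here is a minimal byte-pattern matcher; the payload and signature list are made up for illustration and are not Snort's actual engine.

    # Minimal sketch of content-signature matching, Snort-style (illustrative only).
    SIGNATURES = [
        (b"\x90\xc8\xc0\xff\xff\xff/bin/sh", "IMAP buffer overflow!"),
    ]

    def check_payload(payload: bytes):
        """Return the alert messages whose byte patterns appear in the payload."""
        return [msg for pattern, msg in SIGNATURES if pattern in payload]

    print(check_payload(b"A" * 64 + b"\x90\xc8\xc0\xff\xff\xff/bin/sh"))
    # ['IMAP buffer overflow!']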

Modeling System Calls [Wagner & Dean 2001]
From the program below, a static analysis builds an automaton over system calls, with states such as Entry(f), Entry(g), Exit(f), Exit(g) and transitions labeled open(), close(), exit(), getuid(), geteuid().

    f(int x) {
      if (x) { getuid(); } else { geteuid(); }
      x++;
    }
    g() {
      fd = open("foo", O_RDONLY);
      f(0);
      close(fd);
      f(1);
      exit(0);
    }

An execution whose system-call sequence is inconsistent with the automaton indicates an attack.
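A hedged sketch of the basic idea (this is not the authors' algorithm): accept only system-call traces that a hand-built approximation of g()'s model could produce. The state names and transition table below are illustrative assumptions.

    # Hand-built approximation of the syscall model for g() above (illustrative only).
    # States name "where we are" in g(); each call to f() may issue getuid or geteuid.
    TRANSITIONS = {
        ("start",       "open"):    "after_open",
        ("after_open",  "getuid"):  "after_f0",   # either branch of f() is allowed
        ("after_open",  "geteuid"): "after_f0",
        ("after_f0",    "close"):   "after_close",
        ("after_close", "getuid"):  "after_f1",
        ("after_close", "geteuid"): "after_f1",
        ("after_f1",    "exit"):    "done",
    }

    def trace_consistent(trace):
        """Return True if the observed syscall trace is accepted by the model."""
        state = "start"
        for call in trace:
            state = TRANSITIONS.get((state, call))
            if state is None:
                return False  # syscall not allowed here -> possible attack
        return state == "done"

    print(trace_consistent(["open", "geteuid", "close", "getuid", "exit"]))  # True
    print(trace_consistent(["open", "geteuid", "execve"]))                   # False (attack)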

Anomaly Detection
The IDS learns a distribution of "normal" events. A new event that falls outside the distribution is flagged as an attack; one that falls inside is considered safe.

Example: Working Sets
Over days 1 to 300, Alice's working set of hosts is {reddit, xkcd, slashdot, fark}. On day 300, a connection to a host outside this working set is flagged as anomalous.
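A hedged sketch of working-set anomaly detection along these lines; the host names, the per-day history, and the min_days threshold are illustrative assumptions.

    # Minimal working-set anomaly detector (illustrative only).
    from collections import Counter

    def build_working_set(history, min_days=3):
        """Hosts contacted on at least min_days distinct days form the working set."""
        days_seen = Counter(host for day in history for host in set(day))
        return {host for host, n in days_seen.items() if n >= min_days}

    history = [{"reddit", "xkcd"}, {"reddit", "slashdot"}, {"reddit", "xkcd", "fark"},
               {"xkcd", "slashdot", "fark"}, {"reddit", "fark", "slashdot", "xkcd"}]
    working_set = build_working_set(history)

    def is_anomalous(host):
        return host not in working_set

    print(is_anomalous("reddit"))        # False: inside the working set
    print(is_anomalous("evil.example"))  # True: outside the working set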

Anomaly Detection
Pros:
- Does not require pre-determining a policy (can catch an "unknown" threat)
Cons:
- Requires that attacks not be strongly similar to known (normal) traffic
- Learning distributions is hard

Automatically Inferring the Evolution of Malicious Activity on the Internet
David Brumley (Carnegie Mellon University), Shobha Venkataraman (AT&T Research), Oliver Spatscheck (AT&T Research), Subhabrata Sen (AT&T Research)

The input is a stream of labeled IPs, e.g., <ip1,+> <ip2,+> <ip3,+> <ip4,->, with labels taken from SpamAssassin, IDS logs, etc. Evil is constantly on the move across the network (spam havens, tier-1 and tier-2 providers, access networks).
Goal: characterize regions changing from bad to good (Δ-good) or good to bad (Δ-bad).

Research Questions
Given a sequence of labeled IPs:
- Can we identify the specific regions on the Internet that have changed in malice?
- Are there regions on the Internet that change their malicious activity more frequently than others?

Challenge: Infer the Right Granularity
Previous work uses a fixed granularity. One choice is per-IP granularity (e.g., Spamcop), but per-IP is often not interesting.

Another fixed choice is BGP granularity (e.g., network-aware clusters [KW'00]).

Idea: infer the granularity instead. Different regions warrant different granularities: coarse for some regions, medium for others, and fine granularity for a well-managed network.

A further challenge: we need online algorithms, since the classifier runs on a fixed-memory device watching a high-speed link.

Research Questions (revisited)
Given a sequence of labeled IPs:
- Can we identify the specific regions on the Internet that have changed in malice? (We present: Δ-Change)
- Are there regions on the Internet that change their malicious activity more frequently than others? (We present: Δ-Motion)

Background
- IP prefix trees
- The TrackIPTree algorithm

IP Prefixes
i/d denotes all IP addresses covered by the first d bits of i.
- Ex: 8.1.0.0/16 covers 8.1.0.0 - 8.1.255.255
- Ex: 1.2.3.4/32 is a single host (all 32 bits fixed)

An IP prefix tree is formed by masking each bit of an IP address: the root 0.0.0.0/0 is the whole net, its children are 0.0.0.0/1 and 128.0.0.0/1, the next level holds the /2 prefixes (0.0.0.0/2, 64.0.0.0/2, 128.0.0.0/2, 192.0.0.0/2), and so on down to /32 leaves, each a single host (e.g., 0.0.0.0/32, 0.0.0.1/32).

A k-IPTree classifier [VBSSS'09] is an IP prefix tree with at most k leaves, each leaf labeled good ("+") or bad ("-"). An IP is classified by the label of the leaf prefix covering it. In the slide's 6-IPTree, for example, 64.1.1.1 is bad and 1.1.1.1 is good.

TrackIPTree Algorithm [VBSSS'09]
In: a stream of labeled IPs, e.g., ... <ip4,+> <ip3,+> <ip2,+> <ip1,->
Out: a k-IPTree (in the slide, a small tree over /0, /1, /16, /17, /18 with leaves labeled +, -, -)
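A hedged sketch of how classification with such a tree might work. This is not TrackIPTree itself (which learns the tree online from the labeled stream); it is just longest-prefix lookup over a small hand-labeled toy tree.

    # Minimal k-IPTree-style lookup (illustrative only): classify an IP by the
    # label of the most specific labeled prefix covering it.
    import ipaddress

    LEAVES = {                      # hand-labeled leaves of a toy IPTree
        ipaddress.ip_network("0.0.0.0/2"):   "+",   # good
        ipaddress.ip_network("64.0.0.0/2"):  "-",   # bad
        ipaddress.ip_network("128.0.0.0/1"): "+",   # good
    }

    def classify(ip_str):
        ip = ipaddress.ip_address(ip_str)
        best = max((net for net in LEAVES if ip in net),
                   key=lambda net: net.prefixlen, default=None)
        return LEAVES[best] if best else "?"

    print(classify("1.1.1.1"))    # '+'  (covered by 0.0.0.0/2)
    print(classify("64.1.1.1"))   # '-'  (covered by 64.0.0.0/2)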

Δ-Change Algorithm
- Approach
- What doesn't work
- Intuition
- Our algorithm

Goal: identify online the specific regions on the Internet that have changed in malice.
Time is split into epochs: epoch 1 yields IP stream s1 and tree T1, epoch 2 yields stream s2 and tree T2, and so on. Comparing leaf labels across epochs:
- Δ-Bad: a change from good to bad
- Δ-Good: a change from bad to good

In this setting:
- False positive: misreporting that a change occurred
- False negative: missing a real change

A first idea: divide time into epochs and diff the trees.
- Use TrackIPTree on labeled IP stream s1 to learn T1
- Use TrackIPTree on labeled IP stream s2 to learn T2
- Diff T1 and T2 to find Δ-Good and Δ-Bad
This doesn't work: T1 and T2 generally end up at different granularities (e.g., T1 splits down to /18 while T2 stops at /16), so there is nothing to diff node-by-node. ✗

Δ-Change Algorithm, main idea: use the classification errors between T_{i-1} and T_i to infer Δ-Good and Δ-Bad.

Δ-Change Algorithm
- Learn T_{i-1} and T_i from streams S_{i-1} and S_i with TrackIPTree.
- Hold T_{i-2} fixed and annotate it with the classification error of S_{i-1} (giving T_old,i-1) and of S_i (giving T_old,i).
- Compare the (weighted) classification errors of T_old,i-1 and T_old,i; because both annotations live on the same tree, the comparison is well-defined and yields the Δ-Good and Δ-Bad regions.

Comparing (Weighted) Classification Error
Example subtree rooted at a /16:
- T_old,i-1: root 200 IPs at 40% accuracy; nodes below at 150 IPs/90%, 110 IPs/95%, 40 IPs/80%, 50 IPs/30%
- T_old,i: root 170 IPs at 13% accuracy; nodes below at 100 IPs/10%, 80 IPs/5%, 20 IPs/20%, 70 IPs/20%
The sharp drop in accuracy indicates a Δ-change somewhere in the subtree.

In the same trees, nodes whose accuracy does not drop enough are marked as insufficient change and are not reported.

Nodes that see too few IPs in an epoch are marked as insufficient traffic and are likewise excluded.

After pruning nodes with insufficient change or insufficient traffic, the remaining nodes localize the Δ-change.
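A hedged sketch of this comparison step, not the paper's algorithm: each node carries (IP count, accuracy) per epoch, and a node is flagged only when it has enough traffic and a large enough accuracy drop. The thresholds and node names are made-up illustrations.

    # Minimal sketch of comparing per-node (weighted) classification error
    # between two epochs on the same fixed tree (illustrative only).
    MIN_IPS = 50        # ignore nodes with insufficient traffic (assumed threshold)
    MIN_DROP = 0.25     # ignore nodes with insufficient change (assumed threshold)

    # node -> (ip_count, accuracy) for each epoch, taken from the example above
    epoch_prev = {"/16": (200, 0.40), "child_a": (150, 0.90), "child_b": (110, 0.95)}
    epoch_curr = {"/16": (170, 0.13), "child_a": (100, 0.10), "child_b": (80, 0.05)}

    def delta_changes(prev, curr):
        flagged = []
        for node in prev.keys() & curr.keys():
            ips_prev, acc_prev = prev[node]
            ips_curr, acc_curr = curr[node]
            if min(ips_prev, ips_curr) < MIN_IPS:
                continue                      # insufficient traffic
            if acc_prev - acc_curr < MIN_DROP:
                continue                      # insufficient change
            flagged.append(node)              # Δ-change localized here
        return flagged

    print(delta_changes(epoch_prev, epoch_curr))  # all three nodes show a large drop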

Evaluation
- What are the performance characteristics?
- Are we better than previous work?
- Do we find cool things?

Performance
In our experiments, we:
- let k = 100,000 (the k-IPTree size)
- processed 30-35 million IPs (one day's traffic)
- used a 2.4 GHz processor
Δ-Good and Δ-Bad regions were identified in under 22 minutes using under 3 MB of memory.

How do we compare to network-aware clusters (by prefix)?
We find 2.5x as many changes on average.

Spam: results around the Grum botnet takedown (chart omitted).

Botnets: 22.1 and 28.6 thousand new DNSChanger bots appeared; 38.6 thousand new Conficker and Sality bots.

Caveats and Future Work
"For any distribution on which an ML algorithm works well, there is another on which it works poorly." (the "No Free Lunch" theorem)
Our algorithm is efficient and works well in practice, but a very powerful adversary could fool it into having many false negatives. A formal characterization is future work.

Detection Theory
Base rates, fallacies, and detection systems

Let Ω be the set of all possible events. For example:
- Audit records produced on a host
- Network packets seen

Let I ⊆ Ω be the set of intrusion events. The intrusion rate is Pr[I] = |I| / |Ω|.
Example: the IDS received 1,000,000 packets, 20 of which corresponded to an intrusion. The intrusion rate is Pr[I] = 20/1,000,000 = 0.00002.

Let A ⊆ Ω be the set of alerts. The alert rate is Pr[A] = |A| / |Ω|.
Defn: an IDS is sound if every alert corresponds to an intrusion (A ⊆ I).

Defn: an IDS is complete if every intrusion raises an alert (I ⊆ A).

Defn: a false positive is an alert with no intrusion (A ∩ ¬I); a false negative is an intrusion with no alert (I ∩ ¬A); a true positive is an alert on an intrusion (A ∩ I); a true negative is neither an alert nor an intrusion (¬A ∩ ¬I).

Defn: the detection rate is Pr[A|I] = Pr[A ∩ I] / Pr[I].
Think of the detection rate as the set of intrusions raising an alert, normalized by the set of all intrusions.

Example (Venn diagram with region counts 18, 4, and 2): with |A ∩ I| = 18 and |I \ A| = 2, the detection rate is 18/20 = 90%.

Defn: the Bayesian detection rate is Pr[I|A] = Pr[I ∩ A] / Pr[A].
Think of the Bayesian detection rate as the set of intrusions raising an alert, normalized by the set of all alerts (vs. the detection rate, which normalizes by intrusions). This is the crux of IDS usefulness!

Example (Venn diagram with region counts 2, 4, and 18): with |A ∩ I| = 18 and |A \ I| = 4, there are 22 alerts, of which 4 are false. About 18% of all alerts are false positives!

ChallengeWe’re often given the detection rate and know the intrusion rate, and want to calculate the Bayesian detection rate 99% accurate medical test 99% accurate IDS 99% accurate test for deception...72

Fact (Bayes' theorem): Pr[I|A] = Pr[A|I] Pr[I] / Pr[A].
Proof: Pr[I|A] = Pr[I ∩ A] / Pr[A] and Pr[A|I] = Pr[I ∩ A] / Pr[I], so Pr[I ∩ A] = Pr[A|I] Pr[I]; substituting gives the result.

Calculating the Bayesian Detection Rate
Using the fact above, one way to calculate Pr[I|A] is to compute Pr[A ∩ I] and Pr[A] separately and take their ratio.

Example
There are 1,000 people in the city. 1 is a terrorist, and we have their picture, so the base rate of terrorists is 1/1000.
Suppose we have a new terrorist facial recognition system that is 99% accurate:
- 99 times out of 100, when someone is a terrorist, there is an alarm.
- For every 100 good guys, the alarm goes off only once.
An alarm went off. Is the suspect really a terrorist?

Example
A tempting answer: "The facial recognition system is 99% accurate, so there is only a 1% chance the suspect is not a terrorist."
Wrong!

Formalization
- 1 in 1,000 is a terrorist, and we have their picture, so the base rate is P[T] = 0.001.
- 99/100 times, when someone is a terrorist, there is an alarm: P[A|T] = 0.99.
- For every 100 good guys, the alarm goes off only once: P[A|not T] = 0.01.
- We want P[T|A].

Intuition: given 999 good guys and a 1% false-alarm rate, we expect 999 x 0.01 ≈ 9-10 false alarms, against at most one true alarm.
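A short worked computation of the answer (this calculation is mine, following the formulas above):

    # Bayes' theorem applied to the terrorist-detection example (illustrative only).
    p_t = 0.001              # base rate P[T]
    p_a_given_t = 0.99       # P[A | T]
    p_a_given_not_t = 0.01   # P[A | not T]

    p_a = p_a_given_t * p_t + p_a_given_not_t * (1 - p_t)   # total probability
    p_t_given_a = p_a_given_t * p_t / p_a                   # Bayes' theorem

    print(round(p_t_given_a, 3))  # ~0.09: only about a 9% chance the suspect is a terrorist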

To compute the Bayesian detection rate Pr[I|A] = Pr[A ∩ I] / Pr[A], we still need Pr[A ∩ I] and Pr[A], both of which are unknown so far.

Recall, to get Pr[A]:
Fact (law of total probability): Pr[A] = Pr[A|I] Pr[I] + Pr[A|not I] Pr[not I].
Proof: A = (A ∩ I) ∪ (A ∩ not I), and the two parts are disjoint, so Pr[A] = Pr[A ∩ I] + Pr[A ∩ not I]; rewriting each term with the definition of conditional probability gives the result.

...and to get Pr[A ∩ I]:
Fact: Pr[A ∩ I] = Pr[A|I] Pr[I].
Proof: immediate from the definition of conditional probability, Pr[A|I] = Pr[A ∩ I] / Pr[I].

With these two facts, both Pr[A] and Pr[A ∩ I] are now known, so the Bayesian detection rate Pr[I|A] can be computed from Pr[A|I], Pr[A|not I], and Pr[I].


Visualization: ROC (Receiver Operating Characteristic) Curve
Plot the true positive rate vs. the false positive rate for a binary classifier at various threshold settings.

For an IDS
Let I be an intrusion and A an alert from the IDS. Suppose:
- 1,000,000 messages per day are processed
- 2 attacks per day
- 10 messages per attack
Then (from Axelsson, RAID '99):
- 70% detection requires a false positive rate below 1/100,000
- 80% detection generates 40% false positives
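A hedged sketch of the underlying base-rate arithmetic for these IDS numbers; the exact curve in the slide comes from Axelsson's paper, and the calculation below just shows how the tiny base rate forces a tiny false positive rate.

    # Base-rate arithmetic for the IDS example (illustrative only).
    msgs_per_day = 1_000_000
    attacks_per_day = 2
    msgs_per_attack = 10

    p_i = attacks_per_day * msgs_per_attack / msgs_per_day   # Pr[I] = 2e-5

    def bayesian_detection_rate(detection_rate, false_positive_rate, p_i=p_i):
        """Pr[I|A] via Bayes' theorem from Pr[A|I], Pr[A|not I], and Pr[I]."""
        p_a = detection_rate * p_i + false_positive_rate * (1 - p_i)
        return detection_rate * p_i / p_a

    print(bayesian_detection_rate(0.70, 1e-5))   # ~0.58: most alerts are real
    print(bayesian_detection_rate(0.70, 1e-3))   # ~0.014: almost all alerts are false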

Why Is Anomaly Detection Hard?
Think in terms of ROC curves and the base rate fallacy.
- Are the real events (attacks) rare? If so, they are hard to learn.
- Are they common? If so, learning them is probably OK.

Conclusion
Firewalls:
- 3 types: packet filtering, stateful, and application
- Placement and DMZs
IDS:
- Anomaly-based vs. policy-based detection
- Detection theory and the base rate fallacy