Spring 2014
Jim Hogg - UW - CSE P501
W-1
CSE P501 – Compiler Construction
Conventional Heap Storage
Garbage Collection

Conventional Heap Storage
C:
    ...
    char* s = (char*) malloc(50);
    ...
    free(s);
C Runtime Heap Memory (diagram: heap with some blocks marked "In Use")
Developer must remember to free memory when no longer required
Eventual fragmentation => slow to malloc, slow to free

Heap Storage Fragmentation
C Runtime Heap Memory (diagram: heap with scattered blocks marked "In Use")
malloc: walk the freelist to find a slot big enough for the current request
free: adjust the freelist; coalesce contiguous free space
fragmentation: plenty of free chunks, but none big enough for the request
cannot compact the used space - it may contain pointers; it may be pointed-at
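
The malloc/free behavior described here can be sketched with a toy model. This is only an illustration under stated assumptions, not how any real C runtime is implemented: the names ToyHeap, allocate, release and total_free are hypothetical, the "heap" is just a list of blocks kept in address order, and the search policy is first-fit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a heap: blocks kept in address order, each used or free.
struct Block { std::size_t offset, size; bool used; };

struct ToyHeap {
    std::vector<Block> blocks;
    explicit ToyHeap(std::size_t bytes) { blocks.push_back({0, bytes, false}); }

    // "malloc": walk the blocks, take the first free one that is big enough.
    // Returns the block's offset, or -1 if no single free block fits.
    long allocate(std::size_t n) {
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            if (blocks[i].used || blocks[i].size < n) continue;
            std::size_t off = blocks[i].offset;
            std::size_t spare = blocks[i].size - n;
            blocks[i].size = n;
            blocks[i].used = true;
            if (spare > 0)  // split: the tail stays free
                blocks.insert(blocks.begin() + i + 1, Block{off + n, spare, false});
            return static_cast<long>(off);
        }
        return -1;
    }

    // "free": mark the block free, then coalesce with free neighbours.
    void release(long off) {
        for (std::size_t i = 0; i < blocks.size(); ++i) {
            if (blocks[i].offset != static_cast<std::size_t>(off)) continue;
            blocks[i].used = false;
            if (i + 1 < blocks.size() && !blocks[i + 1].used) {
                blocks[i].size += blocks[i + 1].size;
                blocks.erase(blocks.begin() + i + 1);
            }
            if (i > 0 && !blocks[i - 1].used) {
                blocks[i - 1].size += blocks[i].size;
                blocks.erase(blocks.begin() + i);
            }
            return;
        }
    }

    // Total free bytes, regardless of how they are scattered.
    std::size_t total_free() const {
        std::size_t sum = 0;
        for (const Block& b : blocks)
            if (!b.used) sum += b.size;
        return sum;
    }
};
```

With a 100-byte ToyHeap, allocating four 25-byte blocks and releasing the first and third leaves 50 bytes free, yet a 40-byte request fails because no single free chunk is big enough. Releasing the second block lets the three neighbours coalesce into one 75-byte chunk, and the same request then succeeds.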

Bugs
Forget to free => eventually run out of memory
  called a "memory leak"
Call free, but continue to use!
  called "use-after-free", or "dangling pointer"
  memory corruption - wrong answers; crash if lucky!
  major source of security issues
  detect via "pool poisoning"
Diagram: 2 pointers to one block; free via 1; the block is free, then reused by a later malloc; corruption!
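
"Pool poisoning" can be illustrated with a small sketch. Everything here is an assumption made for the example: the class name PoisonPool, the fill value 0xDD, and the slot sizes are hypothetical, and the pool owns its storage so that touching a released slot is well-defined in this toy (in a real program, use-after-free is undefined behavior).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

constexpr unsigned char kPoison = 0xDD;  // fill value, chosen for illustration
constexpr std::size_t kSlotSize = 16;
constexpr std::size_t kSlots = 4;

class PoisonPool {
public:
    PoisonPool() { storage_.fill(kPoison); in_use_.fill(false); }

    // Hand out the first free slot. If its poison pattern was disturbed,
    // something wrote through a dangling pointer while the slot was free.
    unsigned char* allocate() {
        for (std::size_t i = 0; i < kSlots; ++i) {
            if (in_use_[i]) continue;
            unsigned char* p = slot(i);
            if (!is_poisoned(p)) ++corruptions_detected;
            std::fill(p, p + kSlotSize, static_cast<unsigned char>(0));
            in_use_[i] = true;
            return p;
        }
        return nullptr;  // pool exhausted
    }

    // Return a slot to the pool and overwrite it with the poison pattern.
    void release(unsigned char* p) {
        std::size_t i = static_cast<std::size_t>(p - storage_.data()) / kSlotSize;
        assert(i < kSlots && in_use_[i]);
        std::fill(p, p + kSlotSize, kPoison);
        in_use_[i] = false;
    }

    // True if every byte of the slot still holds the poison value.
    static bool is_poisoned(const unsigned char* p) {
        return std::all_of(p, p + kSlotSize,
                           [](unsigned char b) { return b == kPoison; });
    }

    int corruptions_detected = 0;

private:
    unsigned char* slot(std::size_t i) { return storage_.data() + i * kSlotSize; }
    std::array<unsigned char, kSlotSize * kSlots> storage_;
    std::array<bool, kSlots> in_use_;
};
```

A reader that sees the poison value knows it is looking at freed memory, and the allocator notices a write-after-free when the pattern is no longer intact at the next allocation. Neither check catches a stale pointer that is used after the slot has been handed out again.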

Solving Memory Leaks
C++:
    int f() {
      ...
      Car c = Car("Chevy");
      ...
    }
object c is destructed [calls ~Car()] automatically when it falls out of scope
called RAII ("Resource Acquisition Is Initialization")
more general case still requires Developer to delete c when no longer needed
"smart pointers" can help
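
A minimal sketch of both ideas, assuming C++14 or later. The Car type here is a stand-in, and the counter g_live_cars and the helpers scoped_car and make_car are hypothetical names added only to make the object lifetimes observable.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

int g_live_cars = 0;  // number of Car objects currently alive (illustration only)

struct Car {
    std::string make;
    explicit Car(std::string m) : make(std::move(m)) { ++g_live_cars; }
    ~Car() { --g_live_cars; }
    Car(const Car&) = delete;             // no copies, so the count stays honest
    Car& operator=(const Car&) = delete;
};

// RAII: c is destroyed automatically at the closing brace.
int scoped_car() {
    Car c("Chevy");
    return g_live_cars;  // evaluated while c is still in scope
}

// Smart pointer: the heap-allocated Car is deleted when its owner goes away,
// so the Developer never writes delete.
std::unique_ptr<Car> make_car(const std::string& make) {
    return std::make_unique<Car>(make);
}
```

Calling scoped_car() reports one live Car during the call and none afterwards. A std::unique_ptr returned by make_car keeps its Car alive until the pointer is destroyed or reset, and moving the pointer transfers ownership without copying the Car.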

Solving Leaks and Use-After-Free
    public static int Main(String[] args) {
      ...
      if (...) {
        ArrayList cars = new ArrayList();
        for (int i = 0; i < 1000; i++) {
          cars.Add(new Car());
        }
        ...                       // A
      }
      ...
      Boat b = new Boat();        // B
      ...
    }
At point A I have 1000 Car objects allocated on the heap
At some later time (point B) I ask for more heap memory. May trigger a "garbage collection" - automatic, and silent
Compiler realizes that cars is not referenced later by the program; so it's "garbage"; GC recycles it for use by other objects
Magically solves leaks and use-after-free!

Garbage Collection
Diagram: heap snapshots, each with a "next" pointer marking where the next object will be allocated
Allocate an object; fast!
Allocate more objects; and one more, please?

Garbage Collection, 1
Diagram: heap snapshots with the "next" pointer and the "roots"
Allocate another object
Trace reachable objects from the "roots"
Compact: slide the reachable objects over the unreachable ones; update all pointers
GC does not find garbage: it finds live objects and ignores all other memory

Roots?
"Roots" are locations that hold a pointer to any object in the heap:
  Method-local variables
  Method Arguments
  Global variables
  Class-static fields
  Registers

GC Start
Diagram: object graph in the heap, with two roots

GC Mark Phase
Diagram: the same graph, with each object marked Reachable or Unreachable from the two roots

GC Sweep Phase
Diagram: only the Reachable objects remain
With memory free, now allocate space for object that provoked the GC
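
The mark and sweep phases can be sketched over a toy heap. This is a simplified, non-moving mark-sweep written for illustration, not the CLR's collector: objects are identified by index, and the names Obj, ToyGC, alloc, mark, sweep and collect are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy heap object: refs holds the indices of the objects it points to.
struct Obj {
    std::vector<std::size_t> refs;
    bool marked = false;
    bool live = true;  // false once the sweep has reclaimed the slot
};

struct ToyGC {
    std::vector<Obj> heap;
    std::vector<std::size_t> roots;

    std::size_t alloc() {
        heap.emplace_back();
        return heap.size() - 1;
    }

    // Mark phase: trace everything reachable from the roots.
    void mark() {
        std::vector<std::size_t> work(roots.begin(), roots.end());
        while (!work.empty()) {
            std::size_t i = work.back();
            work.pop_back();
            if (heap[i].marked) continue;  // already visited (handles cycles)
            heap[i].marked = true;
            for (std::size_t r : heap[i].refs) work.push_back(r);
        }
    }

    // Sweep phase: reclaim every unmarked object; reset marks for next time.
    std::size_t sweep() {
        std::size_t freed = 0;
        for (Obj& o : heap) {
            if (o.live && !o.marked) {
                o.live = false;
                o.refs.clear();
                ++freed;
            }
            o.marked = false;
        }
        return freed;
    }

    std::size_t collect() {
        mark();
        return sweep();
    }
};
```

Because the trace starts from the roots, two objects that point only at each other are reclaimed even though each is still referenced. The collector never looks for garbage directly; anything the mark phase did not reach is treated as free.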

No Bugs
Forget to free => eventually run out of memory
  called a "memory leak"
Call free, but continue to use!
  called "use-after-free"
GC removes control over freeing memory from the hands of the Developer
GC will find all live objects and reclaim all other memory each time it is invoked. So no "memory leaks"
An object is garbage-collected only if it is unreachable by the program; being unreachable, it cannot be used again, so GC removes "use-after-free" bugs
So what's a "memory leak" in C# or Java? - holding on to objects that really could be freed - eg, by setting the object-ref to null. Particularly troublesome in a Generational Garbage Collector

C# Finalizers
    class C {
      private StreamWriter sw;
      ...
      ~C() { sw.Close(); }   // flush output buffer
      ...
    }
Class C above defines a "Finalize" method. Syntax is ~C()
Syntactic sugar for: protected override void Finalize()
As each object of class C is created, the CLR links it onto a Finalization Queue
When the object is about to be reclaimed, it is moved onto the Freachable Queue; a Finalizer background thread will run that Finalize method later

Finalization
Diagram: two heap snapshots with roots; the first shows the Finalization Queue, the second shows the Finalization Queue and the Freachable Queue
Not live; not dead; zombies

Threads
Compiler needs to create GC info - at each point in program, where are current roots? - variables, registers
GC info is bulky - cannot store for every instruction!
So store GC info for selected points in the program - "GC-safepoints"
Typical GC-safepoint is a function return
Need to stop all threads to carry out a GC
For each thread stack-frame, stomp its return address; when it reaches there, the thread will return, not to its caller, but into GC code

Hijacking a Thread
Diagram: thread stack (Top of Stack marked); the saved return address (retaddr, into the caller) is replaced by the address of GC code (gc)
Save & Stomp retaddr
Thread continues to run
Eventually returns and hits the hijack

Threads Run to Their Safepoints
Diagram: threads run on until each reaches its safepoint; then "Start GC"
After each thread has hit its safepoint, GC can start its sweep phase

Object Age Distribution
Performing GC over large heap consumes lots of cpu time (and trashes the caches!)
Experiment reveals:
  Many objects live a very short time (high "infant mortality")
  Some objects live a very long time
  Bi-modal
Chart "Object Ages": axes labelled "number" and "age"

Generational Garbage Collection
Diagram: heap divided into Gen0 ('Nursery'), Gen1 and Gen2
Divide heap into 3 areas
Allocate new objects from Gen0
At next GC
  move all Gen0 survivors into Gen1
  move all Gen1 survivors into Gen2

Migration thru Generations
Diagram: successive snapshots of the Gen0, Gen1 and Gen2 areas, with Survivors migrating into the next older generation

Generational GC
Garbage-collecting entire heap is expensive
A process of diminishing returns
Instead, Garbage-Collect only the nursery (Gen0)
  Smaller, so faster
  Yields lots of free space (objects die young)
  Maximum 'bang for the buck'
Sometimes collect Gen0+Gen1
When the going gets tough, collect Gen0+Gen1+Gen2 ("full" collection)
Some old objects may become unreachable - just ignore them in a Gen0 collect
Above picture assumes no writes to old objects kept Gen0 objects reachable
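
The promotion policy can be sketched as follows. This is a toy model with loud assumptions: the reachable flag stands in for a real trace from the roots, and the names Obj, GenHeap, allocate and collect are hypothetical. It only shows which generations a collection examines and where survivors go.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Obj {
    int id;
    bool reachable;  // stand-in for the result of a real trace from the roots
};

struct GenHeap {
    std::array<std::vector<Obj>, 3> gen;  // gen[0] = nursery, gen[2] = oldest

    // New objects always come from Gen0.
    void allocate(int id) { gen[0].push_back({id, true}); }

    // Collect generations 0..max_gen. Unreachable objects there are dropped;
    // survivors move up one generation (Gen2 survivors stay in Gen2).
    // Older generations are not examined at all.
    std::size_t collect(std::size_t max_gen) {
        assert(max_gen <= 2);
        std::size_t freed = 0;
        for (std::size_t g = max_gen + 1; g-- > 0;) {  // oldest first
            std::vector<Obj> survivors;
            for (const Obj& o : gen[g]) {
                if (o.reachable) survivors.push_back(o);
                else ++freed;
            }
            gen[g].clear();
            std::vector<Obj>& dest = gen[g < 2 ? g + 1 : 2];
            dest.insert(dest.end(), survivors.begin(), survivors.end());
        }
        return freed;
    }
};
```

A collect(0) call never looks at Gen1 or Gen2, so an old object that has become unreachable simply stays put until a collect(1) or collect(2) examines its generation. That is the trade the slide describes: cheap, frequent nursery collections, with occasional wider ones.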

Card Table
In practice, older objects are written-to between GCs
Might update a pointer that results in keeping young object alive; missing this is a disaster
Diagram: Gen0, Gen1 and Gen2, with a Gen1 Card Table and a Gen2 Card Table
Card Tables: bitmap, with 1 bit per 128 Bytes of Gen2 heap
Create a "write barrier" - any write into Gen1 or Gen2 heap sets corresponding bit in Card Table
During GC sweep, check Card Tables for any "keep alive" pointers
In practice, Card Tables contain few 1 bits
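
A card table and its write barrier can be sketched as a bitmap plus one function that every pointer store must go through. This models only the bookkeeping, under stated assumptions: the old generation is addressed by byte offset, the 4096-byte heap size is arbitrary, and the names OldGen, write_pointer, dirty_cards and clear_cards are hypothetical. Only the 128-byte card size comes from the slide.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kCardBytes = 128;      // 1 bit per 128 bytes of old heap
constexpr std::size_t kOldHeapBytes = 4096;  // toy size, chosen for illustration

struct OldGen {
    std::bitset<kOldHeapBytes / kCardBytes> cards;  // the card table

    // Write barrier: every pointer store into the old generation goes through
    // here and sets the bit for the card covering the written address.
    void write_pointer(std::size_t offset) {
        assert(offset < kOldHeapBytes);
        cards.set(offset / kCardBytes);
    }

    // At a Gen0 collection only the dirty cards need to be scanned for
    // pointers that keep young objects alive.
    std::vector<std::size_t> dirty_cards() const {
        std::vector<std::size_t> out;
        for (std::size_t i = 0; i < cards.size(); ++i)
            if (cards.test(i)) out.push_back(i);
        return out;
    }

    // After the collection has processed the dirty cards, start afresh.
    void clear_cards() { cards.reset(); }
};
```

Writes at offsets 0 and 127 dirty the same card, while a write at offset 1000 dirties card 7, so the collector scans two 128-byte cards instead of the whole old generation. The cost is a little extra work on every pointer store.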

Debugging the GC
What if a GC goes wrong?
  Fails to free some memory that is unreachable => memory leak
  Frees some memory that is still reachable - impending disaster
  Frees some non-objects - disaster
Colloquially called a GC 'hole'
Causes
  Bug in GC - mark, sweep, compaction
  Bug in GC info
Symptoms
  Wrong answers
  Process Crash
  If we're 'lucky' failure comes soon after error
  Otherwise, failure may occur seconds, minutes or hours later
Testing
  GC-Stress
  eg: maximally randomize the heap; force frequent GCs; check results

Further Design Complexities
Objects with Finalizers:
  take 2 GCs to die
  extend the lifetime of other connected objects
  run at some later time => "non-deterministic" finalization
  no guarantee on the order in which finalizers run
  uncaught exceptions escape and disappear!
Strong refs
Weak refs - "short" and "long"
"Dispose" pattern
Resurrection
Large Object Heap (objects > 85 kB in size)
Concurrent Garbage Collection
Per-processor heaps; processor affinity
Fully-interruptible GC
Safe-point on back-edge of loop

References for CLR GC
Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework, by Jeffrey Richter
Garbage Collection, Part 2: Automatic Memory Management in the Microsoft .NET Framework, by Jeffrey Richter
Garbage Collector Basics and Performance Hints, by Rico Mariani
Using GC Efficiently, Part 1, by Maoni
Using GC Efficiently, Part 2, by Maoni
Using GC Efficiently, Part 3, by Maoni
Using GC Efficiently, Part 4, by Maoni
GC Performance Counters, by Maoni
Tools that help diagnose managed memory related issues, by Maoni
Clearing up some confusion over finalization and other areas in GC, by Maoni

And a bit of perspective…
Automatic GC has been around since LISP I in 1958
Ubiquitous in functional and object-oriented programming communities for decades
Mainstream since Java(?) (mid-90s)
Now conventional? - nope!
Specialized patterns of allocate/free are still better coded by-hand - eg "Arena" storage inside compiler optimization phases