Slide1
A Survey of Virtual Machine Research
Landon Cox
April 12, 2017
Slide2
How do we virtualize?
Key technique: “trap and emulate”
Untrusted/user code tries to do something it can’t
Transfer control to something that can do it
Evaluate whether thing is allowed
If so, do it and return control. Else, kill process or throw exception.
Where have we seen trap and emulate?
Virtual memory
Process tries to access non-resident memory
Trap to OS
OS makes virtual page resident
Retry instruction that caused fault
Generally useful technique, crucial for virtual machines
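The trap-and-emulate steps above can be sketched in a few lines of Python. This is a toy model, not how any real monitor is written: the `Trap` exception stands in for a hardware trap, and the names `Monitor`, `PRIVILEGED`, `execute`, and `run` are all hypothetical.

```python
class Trap(Exception):
    """Raised when unprivileged code attempts a privileged operation."""

    def __init__(self, op, arg):
        super().__init__(op)
        self.op = op
        self.arg = arg


# Hypothetical privileged operations for this toy machine.
PRIVILEGED = {"disable_interrupts", "set_page_table"}


class Monitor:
    """Toy monitor: decides whether a trapped operation is allowed."""

    def __init__(self, allowed):
        self.allowed = set(allowed)
        self.virtual_state = {}  # the guest's virtual view of privileged state

    def emulate(self, trap):
        if trap.op not in self.allowed:
            # "Else, kill process or throw exception."
            raise PermissionError(f"{trap.op} is not allowed")
        # "If so, do it": update the guest's virtual state, not the real hardware.
        self.virtual_state[trap.op] = trap.arg


def execute(op, arg, regs):
    """Stand-in for the hardware: runs safe ops, traps on privileged ones."""
    if op in PRIVILEGED:
        raise Trap(op, arg)
    if op == "add":
        regs["acc"] += arg
    else:
        raise ValueError(f"unknown op {op}")


def run(program, monitor):
    regs = {"acc": 0}
    for op, arg in program:
        try:
            execute(op, arg, regs)
        except Trap as trap:
            monitor.emulate(trap)  # control transfers to the monitor
        # control returns to the guest at the next instruction
    return regs


monitor = Monitor(allowed={"disable_interrupts"})
regs = run([("add", 2), ("disable_interrupts", True), ("add", 3)], monitor)
print(regs["acc"], monitor.virtual_state)
```

The guest never touches privileged state directly. Every privileged operation goes through `Monitor.emulate`, which is where the "is this allowed?" check lives.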
Ways to do this?
Rely on HW
Rewrite code
Slide3
Coarser abstraction: virtual machine
We’ve already seen a kind of virtual machine
OS gives processes virtual memory
Each process runs on a virtualized CPU
Virtual machine
An execution environment
May or may not correspond to physical reality
Emulate the parts that don’t correspond to reality
Slide4
Virtual machine options
Approaches to implementing VMs
Interpreted virtual machines
Translate every VM instruction
Kind of like on-the-fly compilation
VM instruction → HW instruction(s)
Direct execution
Execute instructions directly
Emulate the hard ones
Slide5
Interpreted virtual machines
Implement the machine in software
Must translate emulated to physical
Java: byte codes → x86, PPC, ARM, etc.
Software fetches/executes instructions
[Diagram: Program (foo.class), byte code; Interpreter (java); x86]
Looks like a dynamic virtual memory translator
Slide6
Java virtual machine
What is the low-level interface to programs?
Java byte-code (or Dalvik) instructions
What abstraction does this interface provide?
JVM: Stack-machine architecture
Dalvik (Android): Register-based architecture
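To make "software fetches/executes instructions" and "stack-machine architecture" concrete, here is a minimal interpreter sketch. The opcodes (`push`, `add`, `mul`, `halt`) are invented for illustration and are not real JVM byte codes. Only the overall shape (a fetch/decode/execute loop over an operand stack) is the point.

```python
def interpret(bytecode):
    """Fetch, decode, and execute each instruction of a toy stack machine."""
    stack = []
    pc = 0
    while pc < len(bytecode):
        op, *operands = bytecode[pc]  # fetch and decode
        pc += 1
        if op == "push":
            stack.append(operands[0])
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack


# (2 + 3) * 4, expressed with operands on the stack rather than in registers.
program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",), ("halt",)]
result = interpret(program)
print(result)
```

A register-based design such as Dalvik's would name source and destination registers in each instruction instead of popping and pushing. The interpreter loop itself would look much the same.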
The Java programming language
High-level language compiled into byte code
Library of services (kind of like a kernel)
Like C++/STL, C#
Slide7
Direct execution
What is the interface?
Hardware ISA (e.g., x86 instructions)
What is the abstraction?
Physical machine (e.g., x86 processor)
[Diagram: Program (XP kernel), x86; Monitor (VMware), x86; x86]
Slide8
Different techniques
Full emulation
Bochs, QEMU
Partial emulation
VMware
Para-virtualization
Xen
Dynamic recompilation
Virtual PC
Slide9
Views of the CPU
How is a process’s view of the CPU different than the OS’s?
Kernel mode
Access to physical memory
Manipulation of page tables
Other “privileged instructions”
Turn off interrupts
Traps
Keep these in mind when thinking about virtual machines
Slide10
Slide11
Traditional OS structure
[Diagram: Host Machine; Operating System (Ring 0); four Apps (Ring 3)]
Slide12
Virtual machine structure
[Diagram: Host Machine; Virtual Machine Monitor (Hypervisor); three Guest OSes; Guest Apps; ring labels: Ring 3, Ring 0, “Ring 3?”]
Slide13
Virtual machine structure
[Diagram: Host Machine; Virtual Machine Monitor (Hypervisor); three Guest OSes; Guest Apps; ring labels: Ring 3, Ring 0, Ring 1]
Slide14
Why are hypervisors useful?
Code reuse
Can run old operating systems + apps on new hardware
Original purpose of VMs by IBM in the 60s
Encapsulation
Can put entire state of an “application” in one thing
Move it, restore it, copy it, etc.
Isolation, security
All interactions with hardware are mediated
Hypervisor can keep one VM from affecting another
Hypervisor cannot be corrupted by guest operating systems
Slide15
Encapsulation
Say I want to suspend/restore an application
Write the process mem + regs to disk
Can I restart the process later?
Yes, this is just like switching threads or processes
Just restore address space and registers
Jump to saved PC
Slide16
Encapsulation
Say I want to suspend/restore an application
Write the process mem + regs to disk
What if I reboot my kernel and restart the process?
No, application state is spread out in many places
Application might depend on other processes
Applications have state in the kernel (lost on reboot)
(e.g., open files, locks, process ids, driver states, etc.)
Slide17
Encapsulation
Virtual machines capture all of this state
Can suspend/restore an application
On same machine between boots
On different machines
Very useful in server farms
We’ll talk more about this with Xen
Slide18
Security
Can user processes corrupt the kernel? Which ones?
Privileged user processes can (running as super user)
Can overwrite logs
Overwrite kernel file
Can boot a new kernel
Exploit a bug in the system call interface
Ok, so I’ll use a hypervisor. Is my data any less vulnerable?
All the state in the guest is still vulnerable (file systems, etc.)
So what’s the point?
Hypervisors can observe the guest OS
Security services in hypervisor are safe, makes detection easier
Slide19
Security
Hypervisors buggy too, why trust them more than kernels?
Narrower interface to malicious code (no system calls)
No way for kernel to call into hypervisor
Smaller, (hopefully) less complex codebase
Should be fewer bugs
Anything wrong with this argument?
Hypervisors are still complex
May be able to take over hypervisor via non-syscall interfaces
E.g., what if hypervisor is running IP-accessible services?
Para-virtualization (in Xen) may compromise this
Slide20
VMware architecture
[Diagram: Host Machine, Host OS, VM App, VM Driver, Host App, Virtual Machine Monitor, Target OS, two Target Apps; split into a VMM World and a Host World]
Slide21
SimOS (proto-VMware) arch.
[Diagram: Host Machine, Host OS, SimOS, Target OS, two Target Apps, two Host Apps]
Slide22
SimOS memory
[Diagram: Host Machine, Host OS, SimOS, Target OS, Target App, Mem File, SimDisk File, SimDisk, Virtual MMU; SimOS VMemory with SimOS code/data, TargOS code/data, TargApp code/data]
Slide23
SimOS page fault
[Diagram: Host Machine, Host OS, SimOS, Target OS, Target App, Mem File, SimDisk File, SimDisk, SimOS VMemory, Virtual MMU; labels: Unmapped addr, SimOS Fault handler, TargOS Fault handler]
What if I want to suspend and migrate the target OS?
Slide24
Full vs interpreted
Why would I use VMware instead of Java?
Support for legacy applications
Do not force users to use a particular language
Do not force users to use a particular OS
Why would I use Java instead of VMware?
Lighter weight
Nice properties of type-safe language
Can prove safety at compile time
Slide25
Full vs interpreted
What about protection?
What does Java use for protection? VMware?
Java relies on language features (cannot express unsafe data access)
VMware: hardware (like an OS) and binary rewriting (like link-loader)
What are the trade-offs? Which protection model is better?
Java gives you stronger (i.e., provable) safety guarantees
Hardware protection doesn’t constrain programming expressiveness
What about sharing (the opposite of protection)?
Sharing among components in Java is easy
(call a function, compiler makes sure it is safe)
Sharing between address spaces is more work, has higher overhead
(use sockets, have to context switch, flush TLB, etc.)
Slide26
Singularity (could try both)
Slide27
In 1974 …
Virtual machines have finally arrived!
(except not really)
Why did it take until the 2000s for VMs to actually arrive?
Data centers are the main reason for widespread adoption of VMs
Nice to run multiple OSes on your desktop
VMs allow infrastructure owners to safely rent their resources
Could just hand out accounts. Why are VMs easier?
VMs encapsulate all of an app’s dependencies
Includes kernel and libraries w/ correct versions
Can move VMs around
Can consolidate VMs on one server, and shut others down
Slide28
Sharing machines among users
When?
Scientific computing (testbeds, “the grid”)
Data centers (three-tier web applications)
“The Cloud”
Why would you want to do this?
Slide29
Sharing machines among users
Consolidate under-utilized servers to reduce CapEx and OpEx
Avoid downtime with relocation
Dynamically re-balance workload to guarantee application SLAs
Enforce security policy
Slide30
What should the interface be?
Slide31
Amazon EC2
Does anyone know what EC2 uses?
Xen hypervisor (para-virtualized Linux)
Slide32
Xen architecture
[Diagram: Host Machine, Xen, Domain 0, Host App, two Guest OSes, two Guest Apps]
Slide33
X86_32 address space
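[Diagram: address space from 0GB to 4GB (boundary marked at 3GB) with User, Kernel, and Xen regions; protection labels S, S, U; validity labels: all address spaces, all of a VM’s address spaces, each guest app]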
When does the hypervisor need to flush the TLB?
When a new guest VM or guest app needs to be run.
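A toy model of why the flush is needed, assuming an untagged TLB (entries carry no address-space ID) and assuming, as the diagram's "All address spaces" label suggests, that the Xen region is mapped identically in every address space. `ToyCPU`, `XEN_VPAGE`, and the page and frame numbers are all made up for illustration.

```python
XEN_VPAGE = 0xFFFF  # hypothetical virtual page inside the Xen region


class ToyCPU:
    """Toy CPU with an untagged TLB caching virtual page -> frame lookups."""

    def __init__(self):
        self.page_table = {}
        self.tlb = {}
        self.flushes = 0
        self.misses = 0

    def switch_address_space(self, page_table):
        # Cached translations belong to the old address space, so drop them all.
        self.page_table = page_table
        self.tlb.clear()
        self.flushes += 1

    def translate(self, vpage):
        if vpage not in self.tlb:
            self.misses += 1
            self.tlb[vpage] = self.page_table[vpage]
        return self.tlb[vpage]


# Two guest apps map virtual page 0 to different frames,
# but both include the same mapping for the Xen page.
app_a = {0: 100, XEN_VPAGE: 999}
app_b = {0: 200, XEN_VPAGE: 999}

cpu = ToyCPU()
cpu.switch_address_space(app_a)
frame_a = cpu.translate(0)
xen_frame = cpu.translate(XEN_VPAGE)  # entering Xen: same address space, no flush
cpu.translate(0)                      # still cached after the visit to Xen
cpu.switch_address_space(app_b)       # new guest app: flush
frame_b = cpu.translate(0)
print(frame_a, frame_b, cpu.flushes, cpu.misses)
```

Without the `tlb.clear()` in `switch_address_space`, the second app's lookup of page 0 would return the first app's stale frame.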
When is each set of virtual addresses valid?
Slide34
Xen physical memory
Allocated by hypervisor when VM is created
Why can’t we allow guests to update PTBR?
Might map virtual addrs to physical addrs they don’t own
VMware and Xen used to handle this differently
VMware maintained “shadow page tables”
Xen used “hypercalls”
(Xen and VMware support both mechanisms now)
Slide35
VMware guest page tables
[Diagram: Guest OS, VMM, and Hardware (MMU) layers; labels: Update PTE, Shadow page table (Virtual → Machine)]
How does VMM grab control when PTE is updated?
Marks PTE pages read-only, generates page fault.
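The shadow-page-table mechanism above can be sketched as follows. This is a simplified model under stated assumptions, not VMware's implementation: a Python exception stands in for the write-protection page fault, page tables are flat dictionaries, and `GuestPageTable`, `VMM`, and `WriteProtectFault` are hypothetical names.

```python
class WriteProtectFault(Exception):
    """Stands in for the page fault raised by a store to a read-only page."""

    def __init__(self, vpage, gpage):
        super().__init__(vpage)
        self.vpage = vpage
        self.gpage = gpage


class GuestPageTable:
    """Guest-visible table (virtual page -> guest 'physical' page)."""

    def __init__(self):
        self.entries = {}

    def write(self, vpage, gpage):
        # The VMM has marked the page-table pages read-only, so every
        # guest store to a PTE faults instead of taking effect.
        raise WriteProtectFault(vpage, gpage)


class VMM:
    def __init__(self, guest_to_machine):
        self.guest_to_machine = dict(guest_to_machine)  # pages this guest owns
        self.guest_pt = GuestPageTable()
        self.shadow_pt = {}  # virtual page -> machine frame; what the MMU uses

    def guest_update_pte(self, vpage, gpage):
        """Run the guest's PTE store and catch the resulting fault."""
        try:
            self.guest_pt.write(vpage, gpage)
        except WriteProtectFault as fault:
            self.handle_fault(fault)

    def handle_fault(self, fault):
        if fault.gpage not in self.guest_to_machine:
            raise PermissionError("guest does not own that page")
        # Emulate the store on the guest's table, then mirror it in the shadow.
        self.guest_pt.entries[fault.vpage] = fault.gpage
        self.shadow_pt[fault.vpage] = self.guest_to_machine[fault.gpage]


vmm = VMM(guest_to_machine={0: 7, 1: 9})
vmm.guest_update_pte(vpage=0x10, gpage=1)
print(vmm.guest_pt.entries, vmm.shadow_pt)
```

The guest sees the mapping it asked for (virtual page to guest page 1), while the table the hardware would actually walk holds the machine frame (9) chosen by the VMM.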
Slide36
Xen physical memory
Guest OSes allocate and manage own PTs
“Hypercall” to change PT base
Like a system call between guest OS and Xen
Xen must validate PT updates before use
What are the validation rules?
1. Guest may only map phys. pages it owns
2. PT pages may only be mapped RO
Slide37
Xen guest page tables
[Diagram: Guest OS, VMM, and Hardware (MMU) layers; page table maps Virtual → Machine; the guest’s “Update PTE” is a hypercall (like a syscall) to the VMM, which does 1) validation check, 2) perform update]
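A minimal sketch of the validate-then-update path, using the two validation rules stated for Xen (a guest may only map physical pages it owns, and page-table pages may only be mapped read-only). The `Domain` record and `hypercall_update_pte` function are hypothetical and do not reflect Xen's actual hypercall signatures.

```python
from dataclasses import dataclass, field


class HypercallError(Exception):
    """Raised when the hypervisor rejects a page-table update."""


@dataclass
class Domain:
    owned_frames: set        # machine frames allocated to this guest
    page_table_frames: set   # owned frames currently holding page tables
    page_table: dict = field(default_factory=dict)  # vpage -> (frame, writable)


def hypercall_update_pte(domain, vpage, frame, writable):
    """Hypothetical hypercall: 1) validation check, 2) perform update."""
    # Rule 1: guest may only map physical pages it owns.
    if frame not in domain.owned_frames:
        raise HypercallError(f"frame {frame} is not owned by this guest")
    # Rule 2: page-table pages may only be mapped read-only.
    if writable and frame in domain.page_table_frames:
        raise HypercallError(f"frame {frame} holds a page table; map it RO")
    domain.page_table[vpage] = (frame, writable)


guest = Domain(owned_frames={10, 11, 12}, page_table_frames={12})
hypercall_update_pte(guest, vpage=0x1, frame=10, writable=True)   # ordinary page
hypercall_update_pte(guest, vpage=0x2, frame=12, writable=False)  # PT page, RO
print(guest.page_table)
```

Rule 2 is what keeps the guest from bypassing the hypercall: if it could write its page-table pages directly, it could install an unvalidated mapping.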
Slide38
Para-virtualized CPU
Hypervisor runs at higher privilege than guest OS
Why is having only two levels a problem?
Guest OSes must be protected from guest applications
Hypervisor must be protected from guest OS
What do we do if we only have two privilege levels?
OS shares lower privilege level with guest applications
Run guest apps and guest OS in different address spaces
Why would this be slow?
VMM must flush the TLB on system calls, page faults
Slide39
Google App Engine
What is the interface?
Python and Java API and runtime