
Slide1

CS33: Introduction to Computer Organization

Week 8 – Discussion Section

Atefeh Sohrabizadeh

atefehsz@cs.ucla.edu

11/22/19

Slide2

Agenda

Virtual Memory

Threading and Basic Synchronization

Slide3

Virtual Memory

As demand on the CPU increases, processes slow down

But, if processes need too much memory, some of them may not be able to run

When a program is out of space, it can’t run

Solution: Virtual Memory (VM)

There is only one DRAM unit in the machine

Provide each process with a large, uniform, and private address space, so that each process thinks it has access to the entire DRAM

Main memory (DRAM) will be treated as a cache for an address space on disk

Memory management is simpler

Since each process has a uniform address space

Isolates address spaces

The address space of each process is protected from corruption by other processes

Slide4

Virtual Memory (cont'd)

Partition memory into fixed-size chunks: pages

Both VM and DRAM

Physical address space

Set of M = 2^m physical, linear addresses (DRAM)

Virtual address space

Set of N = 2^n virtual, linear addresses (disk)

VM is an array of N contiguous bytes stored on disk

The contents of disk are cached in physical/main memory (DRAM)

Each page is either:

Unallocated

Or cached

Or uncached

Slide5

DRAM Cache Organization

DRAM is about 10x slower than SRAM

Disk is 10,000x slower than DRAM

So, a DRAM "cache" miss has a huge penalty

This results in:

Large page size

Fully associative

Any virtual page (VP) can be placed in any physical page (PP)

But, with highly sophisticated replacement algorithms

Write-back (rather than write-through)

Slide6

Page Table

An array of page table entries (PTEs) that maps virtual pages to physical pages

Resides in DRAM

Page hit:

Reference to a VM word that is in DRAM

Page fault:

Reference to a VM word that is not in DRAM

Causes an exception

The page fault handler selects a victim page to be evicted

Re-executes the current instruction

Slide7

Using the Page Table

Split the virtual address into VPN and VPO

The VPN is used as an index into the page table

If the PTE is not valid: page fault

Split physical address into PPN and PPO

Slide8

Memory Management Unit (MMU)

Hardware in the CPU that performs address translation

Virtual address to physical address

Translation Lookaside Buffer (TLB)

A cache for the MMU

Caches VPN-to-PPN translations

Typically has high associativity (e.g. 4-way set associative)

Slide9

TLB Hit and Miss

(Figure: the address translation flow on a TLB hit vs. a TLB miss)

Slide10

Example

(Figure: the VPN bits of the virtual address are further split into a TLB tag (TLBT) and a TLB index (TLBI))

Slide11

Address Translation Example #1

Virtual Address: 0x03D4

(14-bit virtual address = 8-bit VPN + 6-bit VPO; the low 2 bits of the VPN are the TLBI, the high 6 bits the TLBT)

0x03D4 = 00 0011 1101 0100

VPN 0x0F   TLBI 0x3   TLBT 0x03   TLB Hit? Y   Page Fault? N   PPN: 0x0D

Physical Address (12 bits = 6-bit PPN + 6-bit PPO): 0x354 = 0011 0101 0100

CO 0x0   CI 0x5   CT 0x0D   Hit? Y   Byte: 0x36

Slide12

Address Translation Example #2

Virtual Address: 0x0020

0x0020 = 00 0000 0010 0000

VPN 0x00   TLBI 0x0   TLBT 0x00   TLB Hit? N   Page Fault? N   PPN: 0x28

Physical Address: 0xA20 = 1010 0010 0000

CO 0x0   CI 0x8   CT 0x28   Hit? N   Byte: Mem (must go to memory)

Slide13

Multi-Level Page Table

As the size of the address space increases, the page table size increases

For a 48-bit address space with 4 KB pages, a single flat table needs 2^(48-12) = 2^36 PTEs

But no program uses the whole address space

The address space that a program accesses is sparse

So, use multi-level page tables

Only the level-1 page table needs to stay in DRAM

The rest can be paged in/out when needed

Slide14

Sharing Revisited: Shared Objects

(Figure: process 1 and process 2 virtual address spaces both mapped to the same pages of physical memory)

Processes 1 and 2 map the shared object.

Notice how the virtual addresses can be different.

Slide15

Sharing Revisited: Private Copy-on-Write (COW) Objects

Imagine these are forked processes

Two processes mapping a private copy-on-write (COW) object

The area is flagged as private copy-on-write

PTEs in COW areas are flagged as read-only

(Figure: both processes' private COW areas initially point to the same physical pages)

Slide16

Sharing Revisited: Private Copy-on-Write (COW) Objects

An instruction writing to a COW page triggers a protection fault; the handler creates a new R/W copy of the page

The instruction restarts upon handler return

Copying is deferred as long as possible!

(Figure: after the write, the writing process's page in the private COW area points to its own copy in physical memory)

Slide17

Concurrency

Motivation:

Increase performance by running multiple processes concurrently

Approach: using processes

Pro:

Address spaces are not shared

Processes can't accidentally overwrite one another's virtual memory

Con:

Address spaces are not shared, so sharing information is hard

Tend to be slow because of the large overhead of process control

Alternative: using threads

Slide18

Thread-Based Concurrency

Thread: "A logical flow that runs in the context of a process"

Like processes, threads are scheduled automatically by the kernel

A single process can have multiple threads running concurrently

All threads running in a process share the entire virtual address space

Easier communication

Each thread has its own thread context:

TID, stack, stack pointer, program counter, general-purpose registers, and condition codes

A thread context is much smaller than a process context

Faster context switch: less overhead

Must access shared variables carefully

Slide19

Thread Creation

Create a thread using pthread_create

Use NULL for the attr argument

When it returns, tid has the ID of the newly created thread

A thread can determine its own ID using pthread_self

A thread routine takes one argument as void* and returns a void* result as well

Slide20

Thread Cleanup

Threads can be terminated by:

Calling one of the pthread_exit, exit, or pthread_cancel functions

Or when their top-level thread routine returns

Check the book for more details

Threads wait for other threads to terminate by calling pthread_join

The terminated thread is then reaped

Joinable threads can be reaped and killed by other threads

To avoid memory leaks, each joinable thread should be either explicitly reaped by another thread or detached

If you don't plan on joining (reaping), make your threads detached by calling pthread_detach

Detached threads cannot be reaped or killed by other threads

Slide21

Progress Graph

Models the execution of n concurrent threads as a trajectory through n-dimensional Cartesian space

Critical section:

Instructions that manipulate a shared variable

Unsafe region:

Intersection of the two critical sections

Safe trajectory:

A trajectory that skirts the unsafe region

Ensures mutually exclusive access to the shared variables

Must synchronize the threads so that they always follow a safe trajectory

Slide22

Semaphore

All of the process context excluding the thread context is shared

E.g., global and static local variables

Shared variables may cause synchronization errors

(One) solution: use semaphores

A global variable with a nonnegative integer value

A general semaphore is a counter that can take any nonnegative value

A mutex is a binary semaphore (its value is always 0 or 1)

Associate a semaphore with each shared variable

Slide23

Semaphore Functions

int sem_init(sem_t *sem, 0, unsigned int value);

Initializes semaphore sem to value (the middle argument, pshared, is 0 for a semaphore shared between the threads of one process)

void P(sem_t *s);

"I want access"

Waits until s becomes nonzero

If s is nonzero, decrements s by one and returns immediately

Grabs the lock (locking the mutex)

void V(sem_t *s);

"I'm done"

Increments s by one

Unlocks the mutex

If there are any threads waiting at a P operation, wakes up one of them

Slide24

Progress Graph with SemaphoresEach state is labeled with the value of semaphore s

Forbidden region

A collection of states made by combination of P and V operations

s < 0

Encloses the unsafe region

No feasible trajectory can include any of the states in the forbidden region

Since semaphores should be nonnegative

So every feasible trajectory is safe

Slide25

Example

What's the problem with this code?

Deadlock

foo(n-1) can't acquire the lock (mutex), since foo(n) is not done, so it gets stuck

Slide26

Concurrency Issues

Thread safety:

A function is thread-safe iff it always functions correctly when called repeatedly from multiple concurrent threads

Carefully handle shared variables and use proper synchronization to make a function thread-safe

Reentrant functions:

A special class of thread-safe functions that does not use any shared data

Deadlock:

A situation where a collection of threads are blocked waiting for a condition that will never be true

Slide27

Deadlock with Semaphores

When the forbidden regions of two semaphores overlap, there is a deadlock region

Trajectories can enter the deadlock region, but can never leave

Here, each thread is waiting for the other to do a V operation that will never occur

Avoiding deadlock when mutexes are used:

Given a total ordering of all mutexes, each thread acquires the mutexes in that order

Slide28

Class AttendanceUse this link:

https://onlinepoll.ucla.edu/polls/3734

Open till 3:50 PM

Slide29

Acknowledgment

The contents and figures are taken from the "Computer Systems: A Programmer's Perspective", Ed. 3, Bryant and O'Hallaron lecture slides