Biologically-Inspired Massively-Parallel Computation

Steve Furber
The University of Manchester
steve.furber@manchester.ac.uk
Turing Centenary
Turing in Manchester
Outline
- 63 years of progress
- Building brains
- The SpiNNaker project
- The networking challenge
- A generic neural modelling platform
- Plans & conclusions
Manchester Baby (1948)
SpiNNaker CPU (2011)
63 years of progress

Baby:
- filled a medium-sized room
- used 3.5 kW of electrical power
- executed 700 instructions per second

SpiNNaker ARM968 CPU node:
- fills ~3.5 mm^2 of silicon (130 nm)
- uses 40 mW of electrical power
- executes 200,000,000 instructions per second
Energy efficiency

Baby: 5 Joules per instruction
SpiNNaker ARM968: 0.000 000 000 2 Joules per instruction
25,000,000,000 times better than Baby!
(James Prescott Joule: born Salford, 1818)
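The improvement factor follows directly from the power and speed figures quoted above; a quick check:

```python
# Energy per instruction = power / instruction rate.
baby_j_per_instr = 3.5e3 / 700        # Baby: 3.5 kW at 700 instructions/s = 5 J
spinn_j_per_instr = 40e-3 / 200e6     # ARM968 node: 40 mW at 200 MIPS = 2e-10 J
ratio = baby_j_per_instr / spinn_j_per_instr
print(f"improvement: {ratio:,.0f}x")  # 25,000,000,000x
```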
Outline
- 63 years of progress
- Building brains
- The SpiNNaker project
- The networking challenge
- A generic neural modelling platform
- Plans & conclusions
Bio-inspiration

- How can massively parallel computing resources accelerate our understanding of brain function?
- How can our growing understanding of brain function point the way to more efficient parallel, fault-tolerant computation?
Building brains

Brains demonstrate:
- massive parallelism (10^11 neurons)
- massive connectivity (10^15 synapses)
- excellent power-efficiency: much better than today's microchips
- low-performance components (~100 Hz)
- low-speed communication (~metres/sec)
- adaptivity: tolerant of component failure
- autonomous learning
Building brains

Neurons:
- multiple inputs, single output (c.f. a logic gate)
- useful across multiple scales (10^2 to 10^11)

Brain structure:
- regularity, e.g. the 6-layer cortical 'microarchitecture'
Neural Computation

To compute we need: processing, communication, storage.

Processing: abstract model
- linear sum of weighted inputs (ignores non-linear processes in dendrites)
- non-linear output function
- learn by adjusting synaptic weights

(diagram: inputs x1..x4 with weights w1..w4 feed the output y = f(w1 x1 + w2 x2 + w3 x3 + w4 x4))
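The abstract model above is just a weighted sum passed through a non-linear output function. A minimal sketch, using a logistic sigmoid for f (an illustrative choice; the slides only require some non-linearity):

```python
import math

def neuron(weights, inputs):
    """y = f(sum of w_i * x_i), with f a logistic sigmoid."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-activation))

# Four inputs and weights, as in the slide's diagram (values invented):
y = neuron([0.5, -1.0, 0.25, 2.0], [1.0, 0.0, 1.0, 1.0])
```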
Leaky integrate-and-fire model (Processing)

- inputs are a series of spikes
- total input is a weighted sum of the spikes
- neuron activation is the input with a "leaky" decay
- when activation exceeds a threshold, the output fires
- habituation, refractory period, ...?
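A minimal discrete-time sketch of the model, with assumed values for the decay factor and threshold:

```python
def lif(spike_trains, weights, decay=0.9, threshold=1.0):
    """One leaky integrate-and-fire neuron over discrete time steps.

    spike_trains: one 0/1 spike sequence per input.
    Returns the output spike train.
    """
    v, out = 0.0, []
    for inputs in zip(*spike_trains):
        v = decay * v + sum(w * s for w, s in zip(weights, inputs))
        if v > threshold:
            out.append(1)
            v = 0.0          # reset on firing (a crude refractory model)
        else:
            out.append(0)
    return out

spikes = lif([[1, 1, 1, 0], [0, 1, 0, 1]], weights=[0.6, 0.5])
```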
Izhikevich model (Processing)

Two variables, one fast (v), one slow (u):

v' = 0.04v^2 + 5v + 140 - u + I
u' = a(bv - u)

The neuron fires when v > 30; then v is reset to c and u to u + d.
a, b, c & d select the behaviour.
(www.izhikevich.com)
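A direct Euler-integration sketch of the model, using Izhikevich's published regular-spiking parameters (a = 0.02, b = 0.2, c = -65, d = 8) and an assumed constant input current:

```python
def izhikevich(I, steps=1000, dt=1.0, a=0.02, b=0.2, c=-65.0, d=8.0):
    """Euler integration of the two-variable Izhikevich model."""
    v, u, fired = c, b * c, []
    for t in range(steps):
        v += dt * (0.04 * v * v + 5 * v + 140 - u + I)
        u += dt * a * (b * v - u)
        if v >= 30.0:            # spike: reset v, bump the recovery variable u
            fired.append(t)
            v, u = c, u + d
    return fired

spike_times = izhikevich(I=10.0)  # fires repeatedly under constant input
```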
Communication

Spikes:
- biological neurons communicate principally via 'spike' events
- asynchronous
- information is only: which neuron fires, and when it fires
- 'Address Event' Representation (AER)
Storage

Synaptic weights:
- stable over long periods of time, with diverse decay properties?
- adaptive, with diverse rules: Hebbian, anti-Hebbian, LTP, LTD, ...

Axon 'delay lines'
Neuron dynamics: multiple time constants
Dynamic network states
Outline
- 63 years of progress
- Building brains
- The SpiNNaker project
- The networking challenge
- A generic neural modelling platform
- Plans & conclusions
SpiNNaker project

- A million mobile phone processors in one computer
- Able to model about 1% of the human brain...
- ...or 10 mice!
Design principles

- Virtualised topology: physical and logical connectivity are decoupled
- Bounded asynchrony: time models itself
- Energy frugality: processors are free; the real cost of computation is energy
SpiNNaker system
SpiNNaker node
SpiNNaker chip

- Mobile DDR SDRAM interface
- Multi-chip packaging by UNISEM Europe
Outline
- 63 years of progress
- Building brains
- The SpiNNaker project
- The networking challenge
- A generic neural modelling platform
- Plans & conclusions
Network – packets

Four packet types:
- MC (multicast): source routed; carry events (spikes)
- P2P (point-to-point): used for bootstrap, debug, monitoring, etc.
- NN (nearest neighbour): build the address map, flood-fill code
- FR (fixed route): carry 64-bit debug data to the host

A timestamp mechanism removes errant packets, which could otherwise circulate forever.

Packet formats:
- MC: header (8 bits: P, ER, TS, T), event ID (32 bits), optional payload (32 bits)
- P2P: header (8 bits: P, SQ, TS, T), address (16-bit source + 16-bit destination), optional payload (32 bits)
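The MC format can be illustrated as an 8-bit header plus a 32-bit event ID naming the firing neuron. The field positions below (type in the top bits, a 2-bit timestamp, payload flag in bit 0) are a simplification for illustration, not the exact SpiNNaker encoding:

```python
import struct

def make_mc_packet(event_id, timestamp, has_payload=False):
    # Simplified header layout: type (2 bits) | timestamp (2 bits) | payload flag.
    header = (0b00 << 6) | ((timestamp & 0x3) << 4) | int(has_payload)
    return struct.pack(">BI", header, event_id)

def parse_mc_packet(packet):
    header, event_id = struct.unpack(">BI", packet)
    return {"type": header >> 6,
            "timestamp": (header >> 4) & 0x3,
            "has_payload": bool(header & 1),
            "event_id": event_id}

pkt = make_mc_packet(event_id=0x00010400, timestamp=2)
info = parse_mc_packet(pkt)
```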
Network – MC Router

All MC spike-event packets are sent to a router. A ternary CAM keeps the router size manageable at 1024 entries (but careful network mapping is also essential).
- A CAM 'hit' yields a set of destinations for this spike event: automatic multicasting.
- A CAM 'miss' routes the event to a 'default' output link.

(diagram: an incoming event ID is matched against ternary CAM entries such as 0 0 1 0 0 X 1 1 X, where X is a don't-care bit; the matching entry selects a set of inter-chip and on-chip destinations)
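A ternary CAM lookup can be sketched in software as a value/mask match, where a zero mask bit marks a don't-care (X) position; the table entry and destination names here are invented examples:

```python
def ternary_match(event_id, entries, default):
    """entries: (value, care_mask, destinations); care bit 0 = don't-care (X)."""
    for value, care, dests in entries:
        if (event_id & care) == (value & care):
            return dests              # hit: multicast to the whole set
    return default                    # miss: take the default output link

# One 9-bit entry in the diagram's style, 0 0 1 0 0 X 1 1 X:
table = [(0b001000110, 0b111110110, {"link_NE", "core_3"})]
hit = ternary_match(0b001001110, table, {"default_link"})   # differs only in an X bit
miss = ternary_match(0b000000000, table, {"default_link"})
```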
Outline
- 63 years of progress
- Many cores make light work
- Building brains
- The SpiNNaker project
- The networking challenge
- A generic neural modelling platform
- Plans & conclusions
Problem mapping

The problem is represented as an abstract network of nodes with a certain behaviour. On SpiNNaker the problem is split into two parts:
- the behaviour of each node is embodied as an interrupt handler in code, which is compiled, linked, and loaded as binary files into core instruction memory;
- the abstract problem topology is loaded into firmware routing tables.

The code says "send message" but has no control over where the output message goes. Our job is to make the model behaviour reflect reality.
Bisection performance

- 1,024 links in each direction
- ~10 billion packets/s at a 10 Hz mean firing rate
- 250 Gbps bisection bandwidth
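The packet-rate figure is consistent with the modelling target stated earlier: roughly 10^9 neurons (about 1% of the brain) firing at a 10 Hz mean rate, one MC packet per spike. The per-link figure below is an inference, not a slide number:

```python
neurons = 1e9                            # ~1% of the brain's 10^11 neurons
mean_rate_hz = 10                        # mean firing rate from the slide
packets_per_s = neurons * mean_rate_hz   # one MC packet per spike: 1e10/s
per_link_gbps = 250 / 1024               # implied mean bandwidth per bisection link
```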
Event-driven software model
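In this style, all computation runs in handlers triggered by events such as an arriving spike packet or a periodic timer tick. The class, event names, and handlers below are illustrative, not the actual SpiNNaker API:

```python
class EventDrivenCore:
    """Computation happens only in registered event handlers."""
    def __init__(self):
        self.handlers = {}

    def on(self, event, handler):
        self.handlers[event] = handler

    def dispatch(self, event, *args):   # models the hardware interrupt dispatch
        self.handlers[event](*args)

core = EventDrivenCore()
potentials = {}

def on_spike(neuron_id, weight):        # an MC packet arrived: accumulate input
    potentials[neuron_id] = potentials.get(neuron_id, 0.0) + weight

def on_timer():                         # periodic tick: update/reset neuron state
    potentials.clear()

core.on("spike", on_spike)
core.on("timer", on_timer)
core.dispatch("spike", 7, 0.5)
core.dispatch("spike", 7, 0.25)
```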
SpiNNaker robot control
48-node PCB
Outline
- 63 years of progress
- Building brains
- The SpiNNaker project
- The networking challenge
- A generic neural modelling platform
- Plans & conclusions
SpiNNaker machines

- 10^3 machine: 864 cores, 1 PCB, 75 W
- 10^4 machine: 10,368 cores, 1 rack, 900 W (NB 12 PCBs for operation without aircon)
- 10^5 machine: 103,680 cores, 1 cabinet, 9 kW
- 10^6 machine: 1M cores, 10 cabinets, 90 kW
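These tiers scale linearly from the 48-node board (48 chips x 18 cores = 864 cores, ~75 W); a quick consistency check, with the boards-per-tier counts inferred from the figures above:

```python
cores_per_board, watts_per_board = 48 * 18, 75     # 864 cores, ~75 W per PCB
boards = {"10^4": 12, "10^5": 120, "10^6": 1200}   # boards per tier (inferred)
for tier, n in boards.items():
    print(tier, n * cores_per_board, "cores,", n * watts_per_board, "W")
```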
Conclusions

- Brains represent a significant computational challenge, now coming within range?
- SpiNNaker is driven by the brain-modelling objective: virtualised topology, bounded asynchrony, energy frugality
- The major architectural innovation is the multicast communications infrastructure
- We have working hardware: 48-node, 864-ARM PCBs now; the first multi-PCB systems are now working