
Neural Network Theory

Table of Contents

Part 1: The Motivation and History of Neural Networks

Part 2: Components of Artificial Neural Networks

Part 3: Particular Types of Neural Network Architectures

Part 4: Fundamentals on Learning and Training Samples

Part 5: Applications of Neural Network Theory and Open Problems

Part 6: Homework

Part 7: Bibliography

Part 1: The Motivation and History of Neural Networks

Motivation

Biologically inspired

The organization of the brain is considered when constructing network configurations and algorithms

The brain

A human neuron has four elements:

Dendrites – receive signals from other cells

Synapses – where information is stored at the contact points between neurons

Axons – output signals transmitted

Cell body – produces all necessary chemicals for the neuron to function properly

Association to Neural Networks

Artificial neurons have

Input channels

Cell body

Output channel

And synapses are simulated with a weight

Main Characteristics adapted from Biology

Self-organization and learning capability

Generalization capability

Fault tolerance

The 100-step rule

Experiments showed that a human can recognize the picture of a familiar object or person in about 0.1 seconds, which corresponds to a neuron switching time of about 10^-3 seconds, i.e. about 100 discrete time steps of parallel processing.

A computer following the von Neumann architecture, in contrast, can do practically nothing in 100 sequential time steps, i.e. 100 assembler steps.


Word to the wise

We must be careful comparing the nervous system with a complicated contemporary device. In ancient times, the brain was compared to a pneumatic machine, in the Renaissance to a clockwork, and in the early 1900s to a telephone network.

History of Neural Network Theory

1943 - Warren McCulloch and Walter Pitts introduced models of neurological networks

1947 - Pitts and McCulloch indicated a practical field of application for neural networks

1949 - Karl Lashley defended his thesis that brain information storage is realized as a distributed system.

History Continued

1960 - Bernard Widrow and Marcian Hoff introduced the first fast and precise adaptive learning system; it was the first neural network in wide commercial use. Hoff later joined Intel Corporation.

1961 - Karl Steinbuch introduced technical realizations of associative memory, which can be seen as predecessors of today's neural associative memories.

1969 - Marvin Minsky and Seymour Papert published a precise analysis of the perceptron, showing that the perceptron model was not capable of representing many important problems, and deduced that the field would be a research "dead end".

History Part 3

1973 - Christoph von der Malsburg used a neuron model that was non-linear and biologically more motivated.

1974 - Paul Werbos developed, in his Harvard dissertation, a learning procedure called backpropagation of error.

1982 - Teuvo Kohonen described the self-organizing feature maps, also known as Kohonen maps.

1985 - John Hopfield published an article describing a way of finding acceptable solutions for the Travelling Salesman Problem using Hopfield nets.

Simple Example of a neural network

Assume we have a small robot with n distance sensors from which it extracts input data. Each sensor provides a real numeric value at any time. The robot should "sense" when it is about to crash: it drives along until one of its sensors indicates that it is going to collide with an object.

Neural networks allow the robot to "learn when to stop". We treat the neural network as a "black box": we do not know its internal structure but only regard its behavior in practice. We show the robot examples of when to drive on and when to stop; these examples are called training samples and are taught to the neural network by a learning procedure (an algorithm or a mathematical formula). The neural network in the robot then generalizes from these samples and learns when to stop.

Part 2: Components of Artificial Neural Networks

Flynn's Taxonomy of Computer Design

                       | Single instruction | Multiple instruction | Single program | Multiple program
Single data stream     | SISD               | MISD                 | -              | -
Multiple data stream   | SIMD               | MIMD                 | SPMD           | MPMD

Neural computers are a particular case of the MSIMD (multiple SIMD) architecture.

Simplest case: the algorithm represents an operation of multiplying a large-dimensionality matrix (or vector) by a vector (a sketch of this kernel follows below).

The number of operation cycles in the problem-solving process is determined by the physical nature and complexity of the problem.
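As an illustration of this simplest case, here is a minimal sketch (in Python with NumPy; the layer sizes and numeric values are made up for the example) of the algorithmic kernel: multiplying a weight matrix by an input vector, the core operation a neural computer carries out in parallel.

    import numpy as np

    # Hypothetical sizes: a layer of 4 neurons fed by a 3-dimensional input.
    W = np.array([[ 0.2, -0.5,  1.0],
                  [ 0.7,  0.1, -0.3],
                  [-0.4,  0.9,  0.6],
                  [ 0.0,  0.8, -0.2]])   # weight matrix (4 x 3)
    x = np.array([1.0, 2.0, -1.0])       # input vector

    # The kernel operation: one matrix-vector product per processing cycle.
    y = W @ x
    print(y)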

Neural “Clustering”

A “cluster” is a synchronously functioning group of single-bit processors whose organization closely matches the implementation of the main part of the algorithm.

This provides solutions to two additional problems:

1) to minimize or eliminate the information interchange between nodes of the neural computer in the process of problem solving

2) to solve weakly formalized problems (e.g. learning for optimal pattern recognition, self-learning clusterization, etc.)

Definitions

Neurons

Neuron – a nonlinear, parameterized, bounded function y = f(x_1, x_2, ..., x_n; w_1, w_2, ..., w_p), where {x_i} are the variables and {w_j} are the parameters (or weights) of the function. (In the simplest, binary case the values are restricted to {0, 1}.)

The variables of the neuron are often called its inputs, and its value y is its output.

The function f can be parameterized in any appropriate fashion. The most frequently used potential v is a weighted sum of the inputs with an additional constant term called the "bias", such that

v = w_0 + sum_{i=1}^{n} w_i x_i
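A minimal sketch of this definition (Python; the weights, bias and inputs are arbitrary example values, and the logistic function is used as one possible choice of f):

    import math

    def neuron(x, w, w0):
        # Potential: weighted sum of the inputs plus the constant "bias" term w0.
        v = w0 + sum(wi * xi for wi, xi in zip(w, x))
        # One common choice of nonlinearity f: the logistic (sigmoid) function.
        return 1.0 / (1.0 + math.exp(-v))

    x = [0.5, -1.0, 2.0]      # inputs x_1 ... x_n
    w = [0.8, 0.3, -0.5]      # weights w_1 ... w_n
    print(neuron(x, w, w0=0.1))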

Neural Networks

Neural Network – a sorted triple (N, V, w), where

N is the set of neurons,

V is a set {(i, j) | i, j in N} whose elements are called connections between neuron i and neuron j, and

the function w : V -> R defines the weights; w((i, j)), the weight of the connection between neuron i and neuron j, is abbreviated w_{i,j}.

The Propagation function

Looking at a neuron j, we will usually find many neurons with a connection to j. For a neuron j, the propagation function receives the outputs o_{i_1}, ..., o_{i_n} of the other neurons i_1, ..., i_n which are connected to j, and transforms them, in consideration of the connecting weights w_{i,j}, into the network input net_j that can be further processed by the activation function.

The network input is the result of the propagation function; most commonly it is the weighted sum net_j = sum_i (o_i * w_{i,j}).
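A small sketch, assuming the weighted sum as propagation function, of how the network input of a neuron j can be computed from the outputs of its predecessor neurons; the connection set and weight values are invented for the example:

    # Weights w[(i, j)] for the connections (i, j) in V; values are arbitrary.
    w = {(1, 3): 0.5, (2, 3): -1.2}

    # Current outputs o_i of the neurons.
    o = {1: 0.9, 2: 0.4}

    def net_input(j, w, o):
        # Propagation function: weighted sum of the outputs of all neurons i
        # that have a connection (i, j) to neuron j.
        return sum(o[i] * w_ij for (i, k), w_ij in w.items() if k == j)

    print(net_input(3, w, o))   # 0.9*0.5 + 0.4*(-1.2) = -0.03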

Threshold function

Neurons get activated if the network input exceeds their threshold value:

Definition: Let j be a neuron. The threshold value Θ_j is uniquely assigned to j and marks the position of the maximum gradient value of the activation function (basically a switching value).

Activation Function

Definition: Let j be a neuron. The activation function is defined as

a_j(t) = f_act( net_j(t), a_j(t-1), Θ_j )

i.e. it transforms the network input net_j(t) and the previous activation state a_j(t-1) into a new activation state a_j(t), with the threshold value Θ_j playing an important role.

Further properties of the activation function

It is advisable that f, the activation function, be a sigmoid function.

Alternatively, the parameters may be assigned to the neuron nonlinearity itself, i.e. they belong to the very definition of the activation function; such is the case when the function f is a radial basis function (RBF) or a wavelet. For instance, the output of a Gaussian RBF is given by

y = exp( -||x - c||^2 / (2 sigma^2) )

where c is the position of the center of the Gaussian and sigma is its standard deviation.

The main difference between the two above categories of neurons is that RBFs and wavelets are local nonlinearities which vanish asymptotically in all directions of input space, whereas neurons that have a potential and a sigmoid nonlinearity have an infinite range of influence along the direction defined by v = 0.
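A short sketch contrasting the two categories of neurons described above (Python/NumPy; the weights, center, width and inputs are arbitrary example values): a sigmoid neuron of the potential v, and a Gaussian RBF that vanishes far from its center.

    import numpy as np

    def sigmoid_neuron(x, w, w0):
        v = w0 + np.dot(w, x)            # potential: bias plus weighted sum
        return 1.0 / (1.0 + np.exp(-v))  # sigmoid: infinite range of influence

    def gaussian_rbf(x, c, sigma):
        # Local nonlinearity: vanishes asymptotically in all directions.
        return np.exp(-np.sum((x - c) ** 2) / (2.0 * sigma ** 2))

    x = np.array([0.5, -0.5])
    print(sigmoid_neuron(x, w=np.array([1.0, 2.0]), w0=0.0))
    print(gaussian_rbf(x, c=np.array([0.0, 0.0]), sigma=1.0))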

Optimal Control Theory

Zermelo’s problem and the handout

Example problem

Part 3: Particular Types of Neural Network Architectures

Transfer from logical basis to threshold basis

In the case of neural computers, the logical basis of the computer system in the simplest case is the threshold basis. This basis maximally corresponds to the logical basis of the major problems being solved. The neural computer is a maximally parallelized system for a given algorithmic kernel implementation.

The number of operation cycles in the problem-solving process (the number of adjustment cycles for the optimization of the secondary functional) in the neural computer is determined by the physical nature and complexity of the problem.

Fermi or Logistic Function and tanh(x)

Fermi or logistic function:

f(x) = 1 / (1 + e^(-x))

which maps to the range of values (0, 1).

Hyperbolic tangent:

tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))

which maps to the range (-1, 1).
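A quick numerical check of the two functions and their ranges (Python); the relation tanh(x) = 2 f(2x) - 1, used here only as a sanity check, follows from the definitions above.

    import math

    def fermi(x):                 # logistic function, range (0, 1)
        return 1.0 / (1.0 + math.exp(-x))

    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(x, fermi(x), math.tanh(x), 2.0 * fermi(2.0 * x) - 1.0)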

Neural networks with direct connections

Neural networks with cross connections

Neural networks with ordered backward connections

Neural networks with amorphous backward connections

Multilayer neural networks with sequential connections

Multilayer neural network

Feedforward Networks

Feedforward neural network - a nonlinear function of its inputs which is the composition of the functions of its neurons.

A feedforward network with n inputs, N_h hidden neurons and N_o output neurons computes N_o nonlinear functions of its n input variables as compositions of the N_h functions computed by the hidden neurons.

Feedforward networks are static: e.g. if the input is constant, so is the output.

Feedforward multilayer networks with sigmoid nonlinearities are often termed multilayer perceptrons, or MLPs.
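A minimal sketch of such a composition (Python/NumPy; the layer sizes, random weights and tanh hidden nonlinearity are illustrative assumptions, not taken from the slides): the outputs are obtained by composing the hidden-neuron functions with the output neurons.

    import numpy as np

    rng = np.random.default_rng(0)
    n, n_hidden, n_out = 3, 4, 2          # example sizes

    W1, b1 = rng.normal(size=(n_hidden, n)), np.zeros(n_hidden)
    W2, b2 = rng.normal(size=(n_out, n_hidden)), np.zeros(n_out)

    def mlp(x):
        h = np.tanh(W1 @ x + b1)          # hidden-neuron functions (sigmoid-like)
        return W2 @ h + b2                # output neurons: composition of the above

    x = np.array([0.2, -0.7, 1.5])
    print(mlp(x))                         # constant input -> constant output (static network)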

Feedforward network diagram

Completely linked networks (Clique)

Completely linked networks permit connections between all neurons, except for direct recurrences. Furthermore, the connections must be symmetric. Thus, every neuron can become an input neuron. (Such a fully linked topology is called a clique.)

Directed Terms

If the function to be computed by the feedforward neural network is thought to have a significant linear component, it may be useful to add linear terms (called directed terms) to the above structure.

Recurrent Networks

General Form:

Neural networks that include cycles. Since the output of a neuron cannot be a function of itself at the same instant of time, we must explicitly take time into account: the output of a neuron can, however, be a function of its own past values. Such networks are considered discrete-time systems.

Each connection of a recurrent neural network is assigned a delay value (possibly equal to zero), in addition to being assigned a weight as in feedforward networks.

Canonical Form of recurrent networks

Governed by recurrent discrete-time equations. The general mathematical description of a linear system is given by the state equations

x(t+1) = A x(t) + B u(t)
y(t) = C x(t) + D u(t)

where x(t) is the state vector at time t, u(t) is the input vector at time t, y(t) is the output vector at time t, and A, B, C, D are matrices.

Property: Any recurrent neural network, however complex, can be cast into a canonical form, made of a feedforward neural network some outputs of which (the state outputs) are fed back to the inputs through unit delays.
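A small simulation of this canonical form (Python/NumPy; the matrices and the constant input sequence are arbitrary example values): the state outputs are fed back to the inputs through unit delays.

    import numpy as np

    A = np.array([[0.9, 0.1], [0.0, 0.8]])   # state matrix
    B = np.array([[1.0], [0.5]])              # input matrix
    C = np.array([[1.0, -1.0]])               # output matrix
    D = np.array([[0.0]])                     # direct term

    x = np.zeros((2, 1))                      # state vector x(0)
    for t in range(5):
        u = np.array([[1.0]])                 # example input u(t)
        y = C @ x + D @ u                     # output at time t
        x = A @ x + B @ u                     # next state, through the unit delay
        print(t, y.ravel())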

Canonical form of recurrent network diagram

The order of activation in neural networks

Synchronous activation - all neurons change their values synchronously, i.e. they simultaneously calculate their network inputs, activations and outputs, and pass them on. This is closest to biology, is the most generic order, and can be used with networks of arbitrary topology (a small sketch of one such cycle follows after this list).

Random order - a neuron i is randomly chosen and its network input net_i, activation a_i and output o_i are updated. For n neurons, a cycle is the n-fold execution of this step. Not always useful.

Random permutation - each neuron is chosen exactly once, but in random order, during one cycle. This order is used rarely because it is generally useless and time-consuming.

Topological order of activation - the neurons are updated during one cycle according to a fixed order determined by the network topology.
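A sketch of one synchronous activation cycle (Python; the weight dictionary and starting outputs are invented, and the Fermi function is used as the activation function): all network inputs are computed from the old outputs before any neuron is updated.

    import math

    w = {(1, 2): 0.5, (2, 1): -0.3, (1, 3): 1.0, (3, 2): 0.7}   # example weights
    o = {1: 0.2, 2: 0.8, 3: 0.0}                                # current outputs

    def fermi(x):
        return 1.0 / (1.0 + math.exp(-x))

    def synchronous_cycle(o, w):
        # First compute every network input from the *old* outputs ...
        net = {j: sum(o[i] * wij for (i, k), wij in w.items() if k == j) for j in o}
        # ... then update all activations/outputs simultaneously.
        return {j: fermi(net[j]) for j in o}

    print(synchronous_cycle(o, w))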

When to use Neural Networks

The fundamental property of neural networks with supervised training is the parsimonious approximation property, i.e. their ability to approximate any sufficiently regular function with arbitrary accuracy. Therefore, neural networks may be advantageous in any application that requires finding, in a machine learning framework, a nonlinear relation between numerical data.

Before using one, make sure that

1) a nonlinear model is actually necessary, and

2) a neural network is preferable to, for instance, a polynomial approximation, which is typically the case when the number of variables is large (larger than or equal to 3).

Part 4: Fundamentals on Learning and Training Samples

Theoretically, a neural network could learn by

Developing new connections

Deleting existing connections

Changing connecting weights

Changing the threshold values of neurons

Varying one or more of the three neuron functions (activation, propagation, output)

Developing new neurons

Deleting neurons

Changing the connecting weights is the most common learning procedure.

Different types of training

Unsupervised learning - the training set only consists of input patterns, the network tries, by itself, to detect similarities and to generate pattern classes

Reinforcement learning - the training set consists of input patterns; after completion of a sequence, a value is returned to the network indicating whether the result was right or wrong and, possibly, how right or wrong it was.

Supervised learning - the training set consists of input patterns together with their correct results, so that the network can receive a precise error vector.

Supervised learning steps

Enter input pattern

Forward propagation of the input by the network, generation of the output

Compare the output with the desired output and provide the error vector

Calculate corrections of the network based on the error vector

Apply the corrections (see the sketch below)
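A toy sketch of these steps for a single linear neuron (Python; the data, learning rate and gradient-descent correction rule are illustrative assumptions rather than the method prescribed by the slides):

    # Training samples: input patterns with their desired (correct) outputs.
    samples = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 0.0)]

    w, b, lr = [0.0, 0.0], 0.0, 0.1          # weights, bias, learning rate

    for epoch in range(50):
        for x, desired in samples:
            out = sum(wi * xi for wi, xi in zip(w, x)) + b      # forward propagation
            error = desired - out                               # error (1-dim error vector)
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]  # calculate and ...
            b = b + lr * error                                  # ... apply corrections

    print(w, b)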

Error vector

Usually determined using the root-mean-square error (RMSE).

Minimizing the error does not always guarantee finding the global minimum; training may only find a local minimum.

To calculate the RMSE (a code sketch of these steps follows below):

1) take the error of each data point and square the value

2) sum the squared error terms

3) divide by the number of data values

4) take the square root of that value
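A direct translation of these four steps into code (Python; the small data and estimation lists are made-up example values, not the homework data set):

    import math

    data       = [2.0, 4.0, 6.0]
    estimation = [1.0, 5.0, 6.0]

    errors       = [d - e for d, e in zip(data, estimation)]   # error of each data point
    squared      = [err ** 2 for err in errors]                # 1) square each error
    sum_squared  = sum(squared)                                # 2) sum the squared terms
    mean_squared = sum_squared / len(data)                     # 3) divide by the number of values
    rmse         = math.sqrt(mean_squared)                     # 4) take the square root

    print(rmse)   # sqrt((1 + 1 + 0) / 3) ≈ 0.816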

Part 5: Applications of Neural Network Theory and Open Problems

Open problems

Identifying whether a neural network will converge in finite time

Training a neural network to distinguish local versus global minima

Neural modularity

Applications of Neural Network Theory

Traveling Salesman problem

Image Compression

Character Recognition

Optimal Control Problems

Part 6: Homework

1) Show that the derivatives of the following functions can be expressed in terms of the functions themselves:

Fermi function: f(x) = 1 / (1 + e^(-x)), for which f'(x) = f(x) * (1 - f(x))

Hyperbolic tangent function: tanh(x), for which tanh'(x) = 1 - tanh(x)^2

Optimal control problem

2)


Find the RMSE of the data set below:

Sample | Data | Estimation
   1   |   1  |   -1
   2   |   4  |    7
   3   |   3  |    1
   4   |   0  |   -2
   5   |   9  |    7
   6   |  11  |   11
   7   |  12  |   13
   8   |   6  |    7
   9   |   8  |    8
  10   |  20  |   17
  11   |  11  |    9
  12   |  11  |   13
  13   |   2  |    2
  14   |   0  |    0
  15   |   4  |    5
  16   |   9  |    9
  17   |   5  |    5
  18   |  13  |   14
  19   |  15  |   17
  20   |   1  |    0

Part 7: Bibliography

Works Cited

Dreyfus, G. Neural Networks: Methodology and Applications. Berlin: Springer, 2005. Print.

Galushkin, A. I. Neural Networks Theory. Berlin: Springer, 2007. Print.

Kriesel, David. A Brief Introduction to Neural Networks. Manuscript, n.d. Web. 28 Mar. 2016.

Lenhart, Suzanne, and John T. Workman. Optimal Control Applied to Biological Models. Boca Raton: Chapman & Hall/CRC, 2007. Print.

Ripley, Brian D. Pattern Recognition and Neural Networks. Cambridge: Cambridge UP, 1996. Print.

Rojas, Raúl. Neural Networks: A Systematic Introduction. Berlin: Springer-Verlag, 1996. Print.

Wasserman, Philip D. Neural Computing: Theory and Practice. New York: Van Nostrand Reinhold, 1989. Print.

https://www.researchgate.net/post/What_are_the_most_important_open_problems_in_the_field_of_artificial_neural_networks_for_the_next_ten_years_and_why