Slide1
Deep Learning Insights and Open-ended Questions
CS 501: CS Seminar
Min Xian
Assistant Professor
Department of Computer Science
University of Idaho
Slide2
Image from NVIDIA
Slide3
Researchers: Geoff Hinton, Yann LeCun, Andrew Ng, Yoshua Bengio, …
Google Trends: Deep Learning
Slide4
Deep Learning in Industry
Companies | Projects | Investment | Description
Google | DeepMind, since 2010 | $500 million |
Intel | Autonomous driving system | $15.3 billion |
Facebook | DeepFace, 2014 | - | Nine-layer network trained on 4 million faces; 97% accuracy vs. 85% (FBI)
Nvidia | GPU, CUDA, since 2009 | - | Increases the speed of deep learning systems by more than 100 times
Apple, Tesla, Baidu, …
Slide5
Timeline (axis from 1940 to 2017)
1943: Math model of NN
1950: Turing Test
1965: First functional NN with many layers
1969: Minsky’s two problems: XOR and computing power
1982: Apply the backpropagation algorithm to NN
1989: LeCun, handwritten digit recognition
1995: SVM
2006: Geoff Hinton’s deep belief nets
2009: Deng’s improvement
2009: Nvidia’s NN-GPU
2009: Fei-Fei Li’s ImageNet
2012: AlexNet (CNN)
2014: DeepFace
2014: Ian Goodfellow’s GAN
2015: AlphaGo
Slide6
What is Deep Learning?
Deep Learning is about Neural Networks (NNs)
The Mostly Complete Chart of Neural Networks by the team at the Asimov Institute.
An example of a feedforward NN
Slide7
What is Deep Learning?
Deep Learning is about neural nets
Multiple layers of nonlinear processing units (nodes)
Learn data representations by supervised or unsupervised learning
Form a hierarchical data representation from low-level to high-level
The Mostly Complete Chart of Neural Networks by the team at the Asimov Institute.
An example of a shallow neural net
Slide8
Feedforward Neural Nets
Layers: Input, Hidden, Output (class 1, class 2)
Highly structured, organized in layers
Group of classifiers
Feedforward propagation
Slide9
Feedforward Neural Nets: An Example
Input: Height, Weight, Temperature
Hidden
Output: sick, healthy
Slide10
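The sick/healthy network from the feedforward example can be sketched in a few lines of Python. This is a minimal illustration, not a trained model: the four hidden nodes, the weights, the sigmoid activation, and the input scaling are all assumptions chosen for the sketch.

```python
import math

def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per node, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def forward(features, params):
    """Feedforward propagation: input -> hidden -> output."""
    hidden = layer(features, params["W_hidden"], params["b_hidden"])
    return layer(hidden, params["W_out"], params["b_out"])

# Hypothetical, untrained parameters: 3 inputs -> 4 hidden nodes -> 2 outputs.
params = {
    "W_hidden": [[0.2, -0.1, 0.4], [-0.3, 0.1, 0.2],
                 [0.05, 0.3, -0.2], [0.1, 0.1, 0.1]],
    "b_hidden": [0.0, 0.1, -0.1, 0.0],
    "W_out": [[0.5, -0.4, 0.3, 0.1], [-0.5, 0.4, -0.3, -0.1]],
    "b_out": [0.0, 0.0],
}

# Height, weight, temperature rescaled to roughly [0, 1]; the scaling is assumed.
patient = [0.70, 0.65, 0.90]
scores = forward(patient, params)
labels = ["sick", "healthy"]
prediction = labels[scores.index(max(scores))]
print(dict(zip(labels, scores)), "->", prediction)
```

Each output node acts as one classifier over the hidden layer's activations, which is the sense in which a feedforward net is a "group of classifiers".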
From Shallow Nets To Deep Nets
Biological foundation
Dealing with complex patterns with high representation capacity
Biological Neural Nets (100 billion neurons)
Slide11
Ability to Recognize Complex Patterns
Break down complex patterns into simpler patterns
Use simple patterns as building blocks to detect complex patterns
An example of a CNN for face recognition
Slide12
Why did it take 50 years?
The Vanishing Gradient Problem makes it very hard to train a deep net
Backpropagation: (0.9)^100
Slow training process
No high-quality big data set
No powerful computing devices
NVIDIA GPU and deep learning machine, 2009
Our machine: 8×GTX 1080, 8×8GB memory, 8×2560 CUDA cores.
ImageNet: Fei-Fei Li, 2009
Total number of images: 14,197,122
Number of images with bounding box annotations: 1,034,908 (3000 classes)
Slide13
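The (0.9)^100 figure from the "Why did it take 50 years?" slide is easy to check numerically. The sketch below assumes, as a simplification, that backpropagation scales the gradient by the same factor of 0.9 at every layer; in a real network the factor varies per layer and depends on the weights and activations.

```python
def gradient_scale(factor, depth):
    """Product of `depth` identical per-layer factors seen during backpropagation."""
    scale = 1.0
    for _ in range(depth):
        scale *= factor
    return scale

# The gradient reaching the first layer shrinks geometrically with depth.
for depth in (1, 10, 50, 100):
    print(f"{depth:>3} layers: gradient scaled by {gradient_scale(0.9, depth):.2e}")
```

At 100 layers the factor is roughly 2.7 × 10^-5, so the earliest layers receive almost no update signal and training becomes very slow.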
Choice of Deep Learning Models
Deep learning models
Convolutional Neural Net (CNN): machine vision problems, object detection (Yann LeCun)
Recursive Neural Tensor Net (RNTN): discover the hierarchical structure of data
Recurrent Neural Net (RNN): do forecasting based on sequence input
Deep Belief Net (DBN): small labelled dataset, pretraining, fine-tuning
Restricted Boltzmann Machine (RBM): no vanishing gradient problem; automatically finds patterns in data by reconstructing the input (Geoff Hinton)
Autoencoder
Slide14
Choice of Deep Learning Models
General Guideline:
Classification: DBN, CNN
Time series analysis and forecasting: RNN
Applications:
Text/Document analysis: RNN, RNTN
Image analysis: CNN, DBN
Image captioning: RNN, CNN
Video recognition: CNN
Self-driving: RNN, CNN
Statistic planning: RNN
Speech recognition: RNN
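The application list above can be kept as a small lookup table. This is only a convenience sketch: the name `MODEL_GUIDE` and the helper `suggest_models` are hypothetical, and the entries are copied from the slide rather than derived from any benchmark.

```python
# Guideline table copied from the slide; keys are lowercased for lookup.
MODEL_GUIDE = {
    "text/document analysis": ["RNN", "RNTN"],
    "image analysis": ["CNN", "DBN"],
    "image captioning": ["RNN", "CNN"],
    "video recognition": ["CNN"],
    "self-driving": ["RNN", "CNN"],
    "statistic planning": ["RNN"],
    "speech recognition": ["RNN"],
}

def suggest_models(application):
    """Return the slide's suggested model families, or an empty list if unlisted."""
    return MODEL_GUIDE.get(application.strip().lower(), [])

print(suggest_models("Image captioning"))
```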
Slide15
When to Use Deep Learning?
To:
Right amount of data
Complex patterns
Computing infrastructure
Not to:
Not enough data
Has inside knowledge of data, can design good features
Does not have the computing resources
Slide16
How to get started with deep learning
Courses at the University of Idaho:
CS 404/504: Machine Learning
CS 404/504: Deep Learning
CS 470/570: Artificial Intelligence
Other resources:
Andrew Ng’s Machine Learning course (Coursera)
Yoshua Bengio’s book Deep Learning
Slide17
How to get started with deep learning
Deep Learning Platforms: a set of tools and interfaces for building deep nets
Selection of deep nets: CNN, DBN, MLP, RNN, RNTN
Data preprocessing
UI
Infrastructure, GPU
Slide18
How to get started with deep learning
Software Platforms: install on your personal hardware
H2O.ai: MLP
Dato GraphLab: CNN and MLP
Full Platforms: handle all technical issues
Ersatz Labs
Slide19
How to get started with deep learning
Deep learning libraries: software libraries
Built by a highly qualified software team
Regularly maintained
Open source
Surrounded by a large community
Commercial-grade libraries: Deeplearning4j, Torch, Caffe and TensorFlow
Educational or scientific research libraries: Theano, DeepMat and TensorFlow
Slide20
Deep Learning Trends and Discussion
Scales of data and computation drive the progress of deep learning
Plot of performance against amount of data: traditional approaches (SVM, random forest, logistic regression, etc.), medium neural nets, deep neural nets
Q1: Big data and large models: good or bad?
Q2: Is big data necessary for learning?
Slide21
Deep Learning Trends and Discussion
Overfitting and underfitting, or variance and bias
Plot of error against training time: training error, test error, and the gap between them
Training set
Test set
Generalization ability
Question 3: How to judge if a model is overfitting or underfitting?
Slide22
Deep Learning Trends and Discussion
Human-level error?
Underfitting: compare human-level error and training error
Solution: bigger model, training longer
Overfitting: compare test error and training error
Solution: early stopping, dropout, regularization, get more data
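Of the overfitting remedies listed, early stopping is the simplest to show in code. The sketch below is one common formulation, assumed here rather than taken from the slides: stop once the validation error has not improved for `patience` consecutive epochs, and keep the weights from the best epoch. The error values are made up for illustration.

```python
def early_stopping_epoch(val_errors, patience=3):
    """Return (best_epoch, stop_epoch) for a list of per-epoch validation errors.

    Training stops once the validation error has failed to improve on its best
    value for `patience` consecutive epochs.
    """
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch  # stop here; keep the weights from best_epoch
    return best_epoch, len(val_errors) - 1  # ran out of epochs without triggering

# Made-up validation errors: they fall, then rise as the model starts to overfit.
val_errors = [0.40, 0.31, 0.25, 0.22, 0.23, 0.24, 0.26, 0.30]
best, stopped = early_stopping_epoch(val_errors, patience=3)
print(f"best epoch {best}, stopped at epoch {stopped}")
```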
Plot of error against time: test error, human-level error, training error
Slide23
Deep Learning Trends and Discussion
Overfitting and underfitting: a practical strategy
Data split: Training set, Validation, Test
Training error is high? Yes: bigger model, training longer, new architecture
Validation error is high? Yes: more data, regularization, new architecture
Otherwise: done
From Andrew Ng
Slide24
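The practical strategy for overfitting and underfitting reads naturally as a two-step check. The sketch below is an interpretation, not the slide's own procedure: it assumes "training error is high" means well above a target such as human-level error, and "validation error is high" means well above the training error. The function name `next_step` and the `tolerance` margin are hypothetical.

```python
def next_step(train_error, val_error, target_error, tolerance=0.02):
    """Walk the two checks in order and return the suggested action.

    `target_error` stands in for human-level error; `tolerance` is an assumed
    margin for deciding when an error counts as "high".
    """
    if train_error - target_error > tolerance:
        # Underfitting (high bias): the model cannot even fit the training set.
        return "bigger model, train longer, or new architecture"
    if val_error - train_error > tolerance:
        # Overfitting (high variance): large gap between training and validation.
        return "more data, regularization, or new architecture"
    return "done"

print(next_step(train_error=0.15, val_error=0.16, target_error=0.01))
print(next_step(train_error=0.02, val_error=0.12, target_error=0.01))
print(next_step(train_error=0.02, val_error=0.03, target_error=0.01))
```

The training-error check comes first because adding data or regularization does little for a model that cannot yet fit its training set.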
Deep Learning Trends and Discussion
End-to-End Learning: output much more complex results, not just numbers
Object recognition: image → numbers (1, 2, …, 1000)
Product review → sentiment: positive (1) or negative (-1)
Image captioning → sentence
Audio → transcript
Medical image → cancer
Tumor detection, feature extraction and selection
Slide25
Deep Learning Trends and Discussion
End-to-End Learning
Medical image → cancer
Tumor detection, segmentation, feature extraction and selection
Deep nets
Q4: Is end-to-end learning good for all problems?
Slide26
Deep Learning Trends and Discussion
Q5: Is unsupervised learning the future of AI/deep learning?
Deep learning started with unsupervised learning
Exciting and difficult: learning simple and complex concepts
Expensive to collect labeled data
Weakly supervised: large amount of unlabeled data + small set of labeled data
Slide27
Questions?
Min Xian, Assistant Professor
Department of Computer Science | UI-IF
TAB 309 | 208-757-5425
mxian@uidaho.edu