Machine Learning Overview - PowerPoint Presentation (uploaded 2020-07-02)

Presentation Transcript

Slide1

Machine Learning Overview

Tamara Berg
CS 590-133 Artificial Intelligence

Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore, Percy Liang, Luke Zettlemoyer, Rob Pless, Killian Weinberger, Deva Ramanan


Slide2

Announcements

HW4 is due April 3.
Reminder: Midterm 2 is next Thursday.
Next Tuesday's lecture topics will not be included (but the material will be on the final, so attend!).
Midterm review: Monday, 5pm in FB009.

Slide3

Midterm Topic List

Be able to define the following terms and answer basic questions about them:

Reinforcement learning
Passive vs active RL
Model-based vs model-free approaches
Direct utility estimation
TD learning and TD Q-learning
Exploration vs exploitation
Policy search
Application to Backgammon/Aibos/helicopters (at a high level)

Probability
Random variables
Axioms of probability
Joint, marginal, conditional probability distributions
Independence and conditional independence
Product rule, chain rule, Bayes' rule

Slide4

Midterm Topic List

Bayesian Networks
General structure and parameters
Calculating joint and conditional probabilities
Independence in Bayes nets (Bayes Ball)
Bayesian inference
Exact inference (inference by enumeration, variable elimination)
Approximate inference (forward sampling, rejection sampling, likelihood weighting)
Networks for which efficient inference is possible

Naïve Bayes
Parameter learning, including Laplace smoothing
Likelihood, prior, posterior
Maximum likelihood (ML), maximum a posteriori (MAP) inference
Application to spam/ham classification
Application to image classification (at a high level)

Slide5

Midterm Topic List

HMMs
Markov property
Markov chains
Hidden Markov Model (initial distribution, transitions, emissions)
Filtering (forward algorithm)

Machine Learning
Unsupervised/supervised/semi-supervised learning
K-means clustering
Training, tuning, testing, generalization

Slide6

Machine learning

Image source:

https://www.coursera.org/course/ml

Slide7

Machine learning

Definition:
Getting a computer to do well on a task without explicitly programming it
Improving performance on a task based on experience

Slide8

Big Data!

Slide9

What is machine learning?

Computer programs that can learn from data.
Two key components:
Representation: how should we represent the data?
Generalization: the system should generalize from its past experience (observed data items) to perform well on unseen data items.

Slide10

Types of ML algorithms

Unsupervised: algorithms operate on unlabeled examples
Supervised: algorithms operate on labeled examples
Semi/partially-supervised: algorithms combine both labeled and unlabeled examples

Slide11

Slide12

Clustering

The assignment of objects into groups (aka clusters) so that objects in the same cluster are more similar to each other than objects in different clusters. Clustering is a common technique for statistical data analysis, used in many fields, including machine learning, data mining, pattern recognition, image analysis, and bioinformatics.

Slide13

Common similarity/distance measures: Euclidean distance, angle between data vectors, etc.

Slide14

Slide15

K-means clustering

Want to minimize the sum of squared Euclidean distances between points x_i and their nearest cluster centers m_k:

D = Σ_k Σ_{i in cluster k} ‖x_i − m_k‖²
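This objective is typically minimized with Lloyd's algorithm: alternate between assigning each point to its nearest center and moving each center to the mean of its cluster. A minimal pure-Python sketch (the 2-D toy points are illustrative, not from the slides):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm for the k-means objective: alternate between
    assigning each point to its nearest center and moving each center
    to the mean of its assigned points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers at k random points

    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    for _ in range(iters):
        # Assignment step: each point joins its nearest center's cluster
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sqdist(p, centers[c]))
            clusters[nearest].append(p)
        # Update step: move each center to its cluster mean
        # (keep the old center if a cluster went empty)
        for c, cl in enumerate(clusters):
            if cl:
                centers[c] = tuple(sum(coord) / len(cl) for coord in zip(*cl))
    return centers, clusters

# Two well-separated toy blobs in 2-D
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(pts, 2)
```

Note that the result depends on the random initialization; in practice k-means is often restarted several times and the run with the lowest objective is kept.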

Slide16 – Slide29: figures only (no transcript text)

Source: Hinrich Schutze

Slide30

Hierarchical clustering strategies

Agglomerative clustering
Start with each data point in a separate cluster
At each iteration, merge the two "closest" clusters

Divisive clustering
Start with all data points grouped into a single cluster
At each iteration, split the "largest" cluster
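The agglomerative strategy above can be sketched directly. This illustrative pure-Python version uses single linkage (cluster distance = distance between the two closest points), which is one of several common linkage choices, not the only one:

```python
def agglomerative(points, target_k):
    """Bottom-up clustering: start with every point in its own cluster,
    then repeatedly merge the two closest clusters (single linkage)
    until only target_k clusters remain."""
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    clusters = [[p] for p in points]  # one singleton cluster per point
    while len(clusters) > target_k:
        # Single linkage: cluster distance = closest pair of points
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(sqdist(p, q) for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair of clusters
    return clusters

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = agglomerative(pts, 2)
```

Recording the sequence of merges (rather than stopping at target_k) yields the full hierarchy of clusterings.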

Slide31

Produces a hierarchy of clusterings

Slide32


Slide33

Divisive Clustering

Top-down (instead of bottom-up as in agglomerative clustering)
Start with all data points in one big cluster
Then recursively split clusters
Eventually each data point forms a cluster on its own
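One way to implement the recursive splitting described above is sketched below. The farthest-pair seeding is just one simple splitting heuristic chosen for illustration; the slides do not prescribe a particular split rule:

```python
def split(cluster):
    """Split one cluster in two: seed with the farthest pair of points,
    then send every point to the nearer seed (a simple heuristic)."""
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    a, b = max(((p, q) for p in cluster for q in cluster),
               key=lambda pair: sqdist(*pair))
    near_a = [p for p in cluster if sqdist(p, a) <= sqdist(p, b)]
    near_b = [p for p in cluster if sqdist(p, a) > sqdist(p, b)]
    return near_a, near_b

def divisive(points, target_k):
    """Top-down clustering: start with one big cluster and repeatedly
    split the largest one until target_k clusters remain."""
    clusters = [list(points)]
    while len(clusters) < target_k:
        biggest = max(range(len(clusters)), key=lambda i: len(clusters[i]))
        clusters += list(split(clusters.pop(biggest)))
    return clusters

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = divisive(pts, 2)
```

Letting target_k grow to the number of points reproduces the "each data point forms a cluster on its own" endpoint.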

Slide34

Flat or hierarchical clustering?

For high efficiency, use flat clustering (e.g. k-means)
For deterministic results: hierarchical clustering
When a hierarchical structure is desired: hierarchical algorithm
Hierarchical clustering can also be applied if K cannot be predetermined (it can start without knowing K)

Source: Hinrich Schutze

Slide35

Clustering in Action – an example from computer vision

Slide36

Recall: Bag of Words Representation

Represent a document as a "bag of words"

Slide37

Bag-of-features models

Slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba

Slide38

Bags of features for image classification

Extract features

Slide39

Bags of features for image classification
Extract features
Learn "visual vocabulary"

Slide40

Bags of features for image classification
Extract features
Learn "visual vocabulary"
Represent images by frequencies of "visual words"

Slide41

1. Feature extraction

Slide42

2. Learning the visual vocabulary

Slide43

2. Learning the visual vocabulary

Clustering

Slide44

2. Learning the visual vocabulary

Clustering

Visual vocabulary

Slide45

Example visual vocabulary

Fei-Fei et al. 2005

Slide46

3. Image representation

[Figure: each image represented as a histogram of visual word frequencies]
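Step 3 can be sketched as follows, assuming local descriptors have already been extracted (step 1) and the visual vocabulary of cluster centers has already been learned (step 2). The 2-D "descriptors" and 3-word vocabulary below are hypothetical stand-ins for real feature vectors:

```python
def bow_histogram(descriptors, vocabulary):
    """Map each local descriptor to its nearest visual word (vocabulary
    entry) and return the normalized frequency of each word."""
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    counts = [0] * len(vocabulary)
    for desc in descriptors:
        word = min(range(len(vocabulary)),
                   key=lambda w: sqdist(desc, vocabulary[w]))
        counts[word] += 1
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical vocabulary of 3 "visual words" and one image's descriptors
vocab = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
image_descriptors = [(0.1, 0.2), (4.9, 5.1), (5.2, 4.8), (9.8, 0.3)]
hist = bow_histogram(image_descriptors, vocab)  # -> [0.25, 0.5, 0.25]
```

The resulting fixed-length histogram is what gets fed to a classifier, regardless of how many descriptors each image produced.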

Slide47

Types of ML algorithms

Unsupervised: algorithms operate on unlabeled examples
Supervised: algorithms operate on labeled examples
Semi/partially-supervised: algorithms combine both labeled and unlabeled examples

Slide48

Slide49

Slide50

Slide51

Slide52

Example: Sentiment analysis

http://gigaom.com/2013/10/03/stanford-researchers-to-open-source-model-they-say-has-nailed-sentiment-analysis/

http://nlp.stanford.edu:8080/sentiment/rntnDemo.html

Slide53

Example: Image classification

Input: images (apple, pear, tomato, cow, dog, horse)
Desired output: the corresponding class labels

Slide54

http://yann.lecun.com/exdb/mnist/index.html

Slide55

Example: Seismic data

[Scatter plot: body wave magnitude vs. surface wave magnitude, separating nuclear explosions from earthquakes]

Slide56

Slide57

The basic classification framework

y = f(x)

Learning: given a training set of labeled examples {(x_1, y_1), …, (x_N, y_N)}, estimate the parameters of the prediction function f.

Inference: apply f to a never-before-seen test example x and output the predicted value y = f(x).

Here y is the output, f is the classification function, and x is the input.
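As a toy instance of this framework (a sketch, not a method prescribed by the course), take f to be a nearest-centroid classifier: learning estimates the per-class mean vectors, and inference assigns a new x to the class with the nearest mean:

```python
def learn(training_set):
    """Learning: estimate the parameters of f from labeled pairs (x, y).
    For a nearest-centroid rule the parameters are per-class mean vectors."""
    sums, counts = {}, {}
    for x, y in training_set:
        counts[y] = counts.get(y, 0) + 1
        prev = sums.get(y, (0,) * len(x))
        sums[y] = tuple(s + a for s, a in zip(prev, x))
    return {y: tuple(s / counts[y] for s in sums[y]) for y in sums}

def predict(params, x):
    """Inference: apply f to a never-before-seen test example x."""
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(params, key=lambda y: sqdist(x, params[y]))

# Hypothetical labeled training set of 2-D feature vectors
train = [((0, 0), "a"), ((1, 0), "a"), ((9, 9), "b"), ((10, 10), "b")]
f_params = learn(train)
label = predict(f_params, (0.5, 0.5))  # -> "a"
```

The split into a learning step (fit parameters from labeled data) and an inference step (apply f to new inputs) is the same for every classifier in the framework, however simple or complex f is.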

Slide58

Naïve Bayes classifier

P(y | x) ∝ P(y) ∏_d P(x_d | y), where x_d is a single dimension or attribute of x.
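A minimal sketch of a multinomial Naïve Bayes spam/ham classifier with Laplace smoothing, matching the midterm topic list; the tiny document set below is invented for illustration:

```python
from math import log
from collections import Counter

def train_nb(docs):
    """docs: list of (word_list, label) pairs. Learn the class priors and
    the word counts needed for Laplace-smoothed likelihoods P(word | label)."""
    label_counts = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in label_counts}
    for words, label in docs:
        word_counts[label].update(words)
    vocab = {w for words, _ in docs for w in words}
    return label_counts, word_counts, vocab

def classify_nb(model, words):
    """MAP inference: argmax_y  log P(y) + sum_d log P(x_d | y)."""
    label_counts, word_counts, vocab = model
    n_docs = sum(label_counts.values())
    best_label, best_score = None, None
    for label in label_counts:
        n_words = sum(word_counts[label].values())
        score = log(label_counts[label] / n_docs)  # log prior
        for w in words:
            # Laplace smoothing: add 1 to every word count
            score += log((word_counts[label][w] + 1) / (n_words + len(vocab)))
        if best_score is None or score > best_score:
            best_label, best_score = label, score
    return best_label

docs = [(["win", "cash", "now"], "spam"), (["cash", "prize"], "spam"),
        (["meeting", "tomorrow"], "ham"), (["lunch", "tomorrow"], "ham")]
model = train_nb(docs)
guess = classify_nb(model, ["cash", "prize", "now"])  # -> "spam"
```

Working in log space avoids underflow from multiplying many small probabilities, and the +1 smoothing keeps unseen words from zeroing out a class.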

Slide59

Example: Image classification

Input: image representation
Classifier (e.g. Naïve Bayes, neural net, etc.)
Output: predicted label (e.g. "car")

Slide60

Example: Training and testing

Key challenge: generalization to unseen examples
Training set (labels known)
Test set (labels unknown)
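The train/test protocol above can be sketched as follows; the 1-D data and the 1-nearest-neighbor rule are illustrative stand-ins for a real dataset and classifier:

```python
import random

def train_test_split(data, test_frac=0.2, seed=0):
    """Hold out a test set whose labels the learner never sees during training."""
    data = list(data)
    random.Random(seed).shuffle(data)
    n_test = int(len(data) * test_frac)
    return data[n_test:], data[:n_test]  # (training set, test set)

def accuracy(predict, test_set):
    """Generalization estimate: fraction of unseen examples classified correctly."""
    return sum(predict(x) == y for x, y in test_set) / len(test_set)

# Toy 1-D data: two well-separated classes
data = [(x, "neg") for x in range(5)] + [(x + 10, "pos") for x in range(5)]
train, test = train_test_split(data)

def nn_predict(x):
    """1-nearest-neighbor rule, fit on the training set only."""
    return min(train, key=lambda example: abs(example[0] - x))[1]

acc = accuracy(nn_predict, test)  # -> 1.0 on this easy toy problem
```

Accuracy measured on the training set itself would be misleadingly optimistic; only held-out examples estimate generalization.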

Slide61

Slide62

Some classification methods

Nearest neighbor (10^6 examples)
Shakhnarovich, Viola, Darrell 2003
Berg, Berg, Malik 2005

Neural networks
LeCun, Bottou, Bengio, Haffner 1998
Rowley, Baluja, Kanade 1998

Support Vector Machines and Kernels
Guyon, Vapnik
Heisele, Serre, Poggio 2001

Conditional Random Fields
McCallum, Freitag, Pereira 2000
Kumar, Hebert 2003

Slide63

Classification … more soon

Slide64

Types of ML algorithms

Unsupervised: algorithms operate on unlabeled examples
Supervised: algorithms operate on labeled examples
Semi/partially-supervised: algorithms combine both labeled and unlabeled examples

Slide65

Supervised learning has many successes:
recognize speech
steer a car
classify documents
classify proteins
recognize faces, objects in images
...

Slide Credit: Avrim Blum

Slide66

However, for many problems, labeled data can be rare or expensive (you need to pay someone to do it, it requires special testing, …). Unlabeled data is much cheaper.

Slide Credit: Avrim Blum

Slide67

Domains where labeled data is rare or expensive but unlabeled data is cheap:
Speech
Images
Medical outcomes
Customer modeling
Protein sequences
Web pages

Slide Credit: Avrim Blum

Slide68

[Figure from Jerry Zhu]

Slide Credit: Avrim Blum

Slide69

Can we make use of cheap unlabeled data?

Slide Credit: Avrim Blum

Slide70

Semi-Supervised Learning

Can we use unlabeled data to augment a small labeled sample to improve learning?

But unlabeled data is missing the most important info!
But maybe it still has useful regularities that we can use.

Slide Credit: Avrim Blum