1 Machine Learning - PowerPoint Presentation

danika-pritchard
Uploaded On 2015-09-25



Presentation Transcript

Slide1

1

Machine Learning

Spring 2013

Rong Jin

Slide2

2

CSE847 Machine Learning

Instructor: Rong Jin

Office Hour: Tuesday 4:00pm-5:00pm

TA: Qiaozi Gao, Thursday 4:00pm-5:00pm

Textbook

Machine Learning

The Elements of Statistical Learning

Pattern Recognition and Machine Learning

Many subjects are from papers

Web site: http://www.cse.msu.edu/~cse847

Slide3

3

Requirements

~10 homework assignments

Course project

Topic: visual object recognition

Data: over one million images with extracted visual features

Objective: build a classifier that automatically identifies the class of objects in images

Midterm exam & final exam

Slide4

4

Goal

Familiarize you with the state of the art in Machine Learning

Breadth: many different techniques

Depth: Project

Hands-on experience

Develop a machine learning way of thinking

Learn how to model real-world problems with machine learning techniques

Learn how to deal with practical issues

Slide5

5

Course Outline

Theoretical Aspects

Information Theory

Optimization Theory

Probability Theory

Learning Theory

Practical Aspects

Supervised Learning Algorithms

Unsupervised Learning Algorithms

Important Practical Issues

Applications

Slide6

6

Today’s Topics

Why machine learning?

Example: learning to play backgammon

General issues in machine learning

Slide7

7

Why Machine Learning?

Past: most computer programs were written by hand

Future: computers should be able to program themselves through interaction with their environment

Slide8

8

Recent Trends

Recent progress in algorithms and theory

Growing flood of online data

Computational power is available

Growing industry

Slide9

Big Data Challenge

2.7 zettabytes (1 zettabyte = $10^{21}$ bytes) of data exist in the digital universe today.

A huge amount of data is generated on the Internet every minute:

YouTube users upload 48 hours of video

Facebook users share 684,478 pieces of content

Instagram users share 3,600 new photos

http://www.visualnews.com/2012/06/19/how-much-data-created-every-minute/

Slide10

Big Data Challenge

High dimensional data appears in many applications of machine learning

Fine grained visual classification [1]

250,000 features

Slide11

Why Data Size Matters?

Matrix completion

Classification, clustering, recommender systems

Slide12

Why Data Size Matters?

A matrix can be perfectly recovered provided the number of observed entries $\geq O(rn \log^2(n))$

Slide13

Why Data Size Matters?

The recovery error can be arbitrarily large if the number of observed entries $< O(rn \log(n))$

Slide14

Why Data Size Matters?

[Figure: error versus # observed entries, with marks at $O(rn \log(n))$ and $O(rn \log^2(n))$ and a region labeled "Unknown"]

Slide15

Difficult access to finance for small & medium businesses

Minimum loan

Tedious loan approval procedure

Low approval rate

Long cycle

Completely big data driven

Leverage e-commerce data for financial services

Alibaba Small and Micro Financial Services

Slide16

Insurance contracts have a year-on-year growth rate of 100%.

Over 1 billion contracts in 2013

Over 100 million contracts in one day on November 11, 2013

Shipping Insurance for Returned Products

Slide17

Uniform 5% fixed rate

Fixed rate

Solely based on historical data and demographics

Actuarial approach

Simple

Easy to explain

Pricing model based on a few parameters

Data based pricing

Relatively accurate

Millions of features, real time pricing

Machine learned model

Dynamic pricing

Highly accurate

Shipping Insurance for Returned Products

Slide18

18

Three Niches for Machine Learning

Data mining: using historical data to improve decisions

Medical records → medical knowledge

Software applications that are difficult to program by hand

Autonomous driving

Image Classification

User modeling

Automatic recommender systems

Slide19

19

Typical Data Mining Task

Given:

9147 patient records, each describing pregnancy and birth

Each patient record contains 215 features

Task:

Predict classes of future patients at high risk for Emergency Cesarean Section

Slide20

20

Data Mining Results

One of 18 learned rules:

If

no previous vaginal delivery, and

abnormal 2nd Trimester Ultrasound, and

malpresentation at admission

Then

probability of Emergency C-Section is 0.6

Slide21

21

Credit Risk Analysis

Learned Rules:

If

Other-Delinquent-Account > 2, and

Number-Delinquent-Billing-Cycles > 1

Then

Profitable-Customer? = no

If

Other-Delinquent-Account = 0, and

(Income > $30K or Years-of-Credit > 3)

Then

Profitable-Customer? = yes

Slide22

22

Programs too Difficult to Program By Hand

ALVINN drives 70 mph on highways

Slide23

23

Programs too Difficult to Program By Hand

ALVINN drives 70 mph on highways

Slide24

24

Programs too Difficult to Program By Hand

Visual object recognition

Slide25

25

Image Retrieval using Texts

Slide26

26

Software that Models Users

History

Description: A homicide detective and a fire marshal must stop a pair of murderers who commit videotaped crimes to become media darlings.

Rating:

Description: Benjamin Martin is drawn into the American revolutionary war against his will when a brutal British commander kills his son.

Rating:

Description: A biography of sports legend Muhammad Ali, from his early days to his days in the ring.

Rating:

What to Recommend?

Description: A high-school boy is given the chance to write a story about an up-and-coming rock band as he accompanies it on their concert tour.

Recommend: ?

Description: A young adventurer named Milo Thatch joins an intrepid group of explorers to find the mysterious lost continent of Atlantis.

Recommend: ?

No

Yes

Slide27

27

Netflix Contest

Slide28

28

Relevant Disciplines

Artificial Intelligence

Statistics (particularly Bayesian Stat.)

Computational complexity theory

Information theory

Optimization theory

Philosophy

Psychology

…

Slide29

29

Today’s Topics

Why machine learning?

Example: learning to play backgammon

General issues in machine learning

Slide30

30

What is the Learning Problem

Learning = Improving with experience at some task

Improve over task T

With respect to performance measure P

Based on experience E

Example: Learning to Play Backgammon

T: Play backgammon

P: % of games won in world tournament

E: opportunity to play against itself

Slide31

31

Backgammon

More than $10^{20}$ states (boards)

Even the best human players see only a small fraction of all boards during their lifetime

Searching is hard because of the dice (branching factor > 100)

Slide32

32

TD-Gammon by Tesauro (1995)

Trained by playing against itself

Now approximately equal to the best human player

Slide33

33

Learn to Play Chess

Task T: Play chess

Performance P: Percent of games won in the world tournament

Experience E:

What experience?

How shall it be represented?

What exactly should be learned?

Which specific algorithm should be used to learn it?

Slide34

34

Choose a Target Function

Goal:

Policy: $\pi: b \to m$

Choice of value function

$V: b, m \to \mathbb{R}$

B = board

$\mathbb{R}$ = real values

Slide35

35

Choose a Target Function

Goal:

Policy: $\pi: b \to m$

Choice of value function

$V: b, m \to \mathbb{R}$

$V: b \to \mathbb{R}$

B = board

$\mathbb{R}$ = real values

Slide36

36

Value Function V(b): Example Definition

If b is a final board that is won: V(b) = 1

If b is a final board that is lost: V(b) = -1

If b is not a final board: V(b) = E[V(b*)], where b* is the final board reached after playing optimally

Slide37

37

Representation of Target Function V(b)

Same value for each board

Lookup table (one entry for each board)

No Learning

No Generalization

Summarize experience into

Polynomials

Neural Networks

Slide38

38

Example: Linear Feature Representation

Features:

$p_b(b)$, $p_w(b)$ = number of black (white) pieces on board b

$u_b(b)$, $u_w(b)$ = number of unprotected pieces

$t_b(b)$, $t_w(b)$ = number of pieces threatened by opponent

Linear function:

$V(b) = w_0 p_b(b) + w_1 p_w(b) + w_2 u_b(b) + w_3 u_w(b) + w_4 t_b(b) + w_5 t_w(b)$
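A minimal sketch of evaluating this linear function in Python, assuming the six features have already been extracted from a board. The function name, feature values, and weights are hypothetical and chosen only to illustrate the computation; they are not from the course.

```python
from typing import Sequence


def linear_value(weights: Sequence[float], features: Sequence[float]) -> float:
    """Return V(b) = sum_i w_i * f_i(b) for one board's feature vector."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical board, features ordered as (p_b, p_w, u_b, u_w, t_b, t_w).
features = [12, 11, 2, 3, 1, 4]
# Hypothetical weights w_0..w_5 (illustrative values only).
weights = [1.0, -1.0, -0.5, 0.5, -0.25, 0.25]

value = linear_value(weights, features)
print(value)
```

Learning then amounts to choosing the six weights; the feature extraction itself stays fixed.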

Learning:

Estimation of parameters $w_0, \ldots, w_5$

Slide39

39

Given: board b

Predicted value V(b)

Desired value V*(b)

Calculate

$\text{error}(b) = (V^*(b) - V(b))^2$

For each board feature $f_i$

$w_i \leftarrow w_i + c\,(V^*(b) - V(b))\,f_i$

Stochastically minimizes $\sum_b (V^*(b) - V(b))^2$

Tuning Weights

Gradient Descent Optimization

Slide40

40

Obtain Boards

Random boards

Beginner plays

Professional plays

Slide41

41

Obtain Target Values

Person provides value V(b)

Play until termination. If outcome is

Win: $V(b) \leftarrow 1$ for all boards

Loss: $V(b) \leftarrow -1$ for all boards

Draw: $V(b) \leftarrow 0$ for all boards

Play one move: $b \to b'$

$V(b) \leftarrow V(b')$

Play n moves: $b \to b' \to \ldots \to b^{(n)}$

$V(b) \leftarrow V(b^{(n)})$

Slide42

42

A General Framework

Mathematical Modeling

Finding Optimal Parameters

Statistics

Optimization

+

Machine Learning

Slide43

43

Today’s Topics

Why machine learning?

Example: learning to play backgammon

General issues in machine learning

Slide44

44

Important Issues in Machine Learning

Obtaining experience

How to obtain experience?

Supervised learning vs. Unsupervised learning

How many examples are enough?

PAC learning theory

Learning algorithms

Which algorithms can approximate the target function well, and when?

How does the complexity of a learning algorithm affect its accuracy?

Is the target function learnable?

Representing inputs

How to represent the inputs?

How to remove the irrelevant information from the input representation?

How to reduce the redundancy of the input representation?
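As one concrete answer to "How many examples are enough?", a standard PAC result (not taken from these slides) states that for a finite hypothesis class H, a learner that outputs a hypothesis consistent with the training data has error at most ε with probability at least 1 - δ once the number of examples m satisfies m ≥ (1/ε)(ln|H| + ln(1/δ)). A small sketch of that bound; the function name and the numbers are illustrative assumptions.

```python
import math


def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Examples sufficient for a consistent learner over a finite class H.

    Implements m >= (1/epsilon) * (ln|H| + ln(1/delta)).
    """
    if hypothesis_count < 1 or not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("need |H| >= 1 and epsilon, delta in (0, 1)")
    bound = (math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon
    return math.ceil(bound)


# Hypothetical setting: 1000 hypotheses, 10% error, 95% confidence.
m = pac_sample_bound(1000, epsilon=0.1, delta=0.05)
print(m)
```

The bound grows only logarithmically in the number of hypotheses but linearly in 1/ε, which is one way the complexity of the hypothesis class enters the sample-size question.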