Learning Drivers through Imitation using Supervised Methods

Uploaded On 2016-05-09



Presentation Transcript

Slide1

Learning Drivers through Imitation using Supervised Methods

By Luigi Cardamone, Daniele Loiacono and Pier Luca Lanzi

Slide2

The outline

Introduction

Related work

TORCS

Imitation learning

What sensors?

What actions?

What learning method?

What data?

Experimental results

Discussion, conclusions and future work

Slide3

Introduction

What is imitation learning?

Supervised learning

Neuroevolution

Two main methods

Direct methods

Indirect methods

Slide4

Introduction

Direct methods are well known to be very ineffective.

Our methods develop drivers with only 15% lower performance than the best bot in TORCS.

The trick is in “human-like” high-level action prediction

Slide5

The outline

Introduction

Related work

TORCS

Imitation learning

What sensors?

What actions?

What learning method?

What data?

Experimental results

Discussion, conclusions and future work

Slide6

Related work

Imitation learning in computer games

Rule-based NPC for Quake III via two-step process

Quake II NPC via reinforcement learning, fuzzy clustering and Bayesian motion modeling.

Neural networks with backpropagation for Legion II and Motocross The Force.

Drivatar training for Forza Motorsport

Slide7

The outline

Introduction

Related work

TORCS

Imitation learning

What sensors?

What actions?

What learning method?

What data?

Experimental results

Discussion, conclusions and future work

Slide8

What input sensors?

The rangefinder sensor

The lookahead sensor

Slide9

What actions?

4 low-level effectors in TORCS

Wheel

Gas pedal

Brake pedal

Gear change

2 high-level actions in this work

Speed

Trajectory

Slide10

What learning methods?

K-nearest neighbor

Training:

Doesn’t need any training

How was it applied?

Directly during the TORCS race

At each tick, the logged data is searched to find the k most similar instances.

The k most similar instances are selected and averaged

Slide11
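The k-nearest-neighbor procedure described above (search the logged data at each tick, then average the k most similar instances) could be sketched roughly as follows. This is only an illustration under assumptions: the Euclidean distance, the function name knn_predict and the layout of the log as (sensor vector, (speed, trajectory)) pairs are not taken from the slides.

```python
import math


def knn_predict(logged, query, k=20):
    """Average the targets of the k logged instances closest to `query`.

    logged: list of (sensor_vector, (speed, trajectory)) pairs (hypothetical layout)
    query:  sensor vector observed at the current game tick
    """
    if not logged:
        raise ValueError("logged data is empty")
    # Rank logged instances by Euclidean distance to the query.
    # The similarity measure is an assumption; the slides do not name one.
    nearest = sorted(logged, key=lambda item: math.dist(item[0], query))[:k]
    n = len(nearest)
    # Average the high-level actions of the selected instances.
    speed = sum(target[0] for _, target in nearest) / n
    trajectory = sum(target[1] for _, target in nearest) / n
    return speed, trajectory


# Toy log with two sensor readings per instance (made-up values).
log = [
    ((1.0, 1.0), (100.0, 0.0)),
    ((1.0, 2.0), (80.0, 0.2)),
    ((9.0, 9.0), (30.0, -0.5)),
]
speed, trajectory = knn_predict(log, (1.0, 1.5), k=2)
```

Because the whole log is scanned at every tick, the cost of a prediction grows with the size of the logged data set.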

What learning methods?

Neural networks

Training

NeuroEvolution of Augmenting Topologies (NEAT) to evolve both the weights and the topology of a neural network

How was it applied?

2 networks, for speed and target position prediction

Rangefinder networks with 19 angle inputs + 1 bias input

Lookahead networks with 8 segment inputs + 1 bias input

The fitness was defined as the prediction error

Slide12
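As a rough illustration of a fitness based on prediction error, the sketch below scores a candidate predictor by its mean squared error on logged examples. The choice of mean squared error, the function name prediction_error and the stand-in predictor are assumptions; the slides only state that the fitness was the prediction error, and no NEAT library is used here.

```python
def prediction_error(predict, examples):
    """Mean squared error of `predict` over (inputs, target) pairs.

    predict:  callable mapping an input vector to a single number,
              standing in for an evolved network (hypothetical)
    examples: list of (inputs, target) pairs from the logged data
    """
    if not examples:
        raise ValueError("no examples to evaluate")
    total = 0.0
    for inputs, target in examples:
        total += (predict(inputs) - target) ** 2
    return total / len(examples)


# Toy examples: two sensor inputs plus a constant bias input of 1.0.
examples = [
    ((1.0, 2.0, 1.0), 3.0),
    ((0.0, 1.0, 1.0), 2.0),
]


def candidate(inputs):
    # Stand-in for a network output: sums the two sensor inputs, ignores the bias.
    return inputs[0] + inputs[1]


error = prediction_error(candidate, examples)
```

A lower error means a better imitation of the logged driver; how the error is turned into a quantity to maximize depends on the evolutionary framework used.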

What data?

Inferno bot on 3 tracks for 3 laps each

Simple fast track

Difficult track with many fast turns

A difficult track with many slow sharp turns

Only the data of the second lap was recorded

3 data sets with 1982, 3899 and 3619 examples

Additional all-in set with 9500 examples

Slide13

The outline

Introduction

Related work

TORCS

Imitation learning

What sensors?

What actions?

What learning method?

What data?

Experimental results

Discussion, conclusions and future work

Slide14

Experimental results

Overall, we obtained 16 models

2 learning algorithms

3 + 1 datasets

2 types of sensors
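The count of 16 models follows from combining every learning algorithm with every data set and every sensor type (2 x 4 x 2). A small sketch of that enumeration, where the data set labels are placeholders rather than names from the slides:

```python
from itertools import product

algorithms = ["k-nearest neighbor", "NEAT"]
# Three single-track data sets plus the combined one; labels are placeholders.
datasets = ["track 1", "track 2", "track 3", "all-in"]
sensors = ["rangefinder", "lookahead"]

# One model per (algorithm, data set, sensor) combination.
models = list(product(algorithms, datasets, sensors))
print(len(models))  # 16
```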

K-nearest neighbor was applied with k = 20

NEAT was applied with 100 individuals for 100 generations

All the experiments were conducted using TORCS 1.3.1

Slide15

Experimental Results - Evaluation

Each model was evaluated by using it to drive a car on each track for 10,000 game ticks.

The tracks

3 tracks used for training

2 unseen tracks

A simple fast track

A track with many fast and difficult turns

The driver was also equipped with a standard recovery policy.

Slide16

Experimental results - Results

Inferno was better than its imitations

Lookahead sensors are better than rangefinders

K-nearest neighbor is better than NEAT

One of the models had only 15% lower performance than the Inferno bot.

Slide17

The summary of the results

Slide18

Experimental results - Execution time

Direct methods result in low computational cost

Our approach needs 30 times less CPU time to obtain reasonable results

Slide19

Increasing the lookahead

How much lookahead is useful?

A second series of tests with 8 and 16 lookahead values showed overfitting

Slide20

The outline

Introduction

Related work

TORCS

Imitation learning

What sensors?

What actions?

What learning method?

What data?

Experimental results

Discussion, conclusions and future work

Slide21

Discussion

Good drivers

Close to the target bot

Run out of the track in difficult turns as a result of prediction error or a low reactivity in steering

Bad drivers

Many discontinuities in the prediction of trajectories

Causes the car to move quickly from one side of the track to the other

Slide22

Perceptual aliasing

Two different places can be perceived as the same

Usually happens on long straight parts of the road

Can be solved via special treatment of straight parts, full throttle or a bigger lookahead

Slide23

Summary

Supervised learning to imitate a driver

High-level aspect of driving, speed and trajectory rather than low-level effectors

Novel lookahead sensor

Good results with k-nearest neighbor

The Inferno bot is still better due to perceptual aliasing and slow steering during abrupt turns

Slide24

Future work

Exploit structural symmetry on the track

Increase the robustness to noise

Reduce computational cost

Improve steering reaction to abrupt turns