Feature Engineering Studio

Uploaded by phoebe-click on 2016-06-03

Presentation Transcript

Slide1

Feature Engineering Studio

February 2, 2015

Slide2

Welcome to

Problem Proposal Day

Rules for Presenters

Rules for the Rest of the Class

Slide3

Rules for Presenters

Talk for 3 minutes on:

Data set

What variable will you predict?

What kind of variables will you use to predict it?

Why is this worth doing?

And please email me your slides (if any)

Slide4

Rules for Audience

After the presentation

Ask quick questions

Give quick suggestions

Slide5

Criteria

Everyone

Is the problem genuinely important? (usable or publishable)

Is there a good measure of ground truth?

Only if you know what you’re talking about

Is there rich enough data to distill meaningful features?

Is there enough data to be able to take advantage of data mining?

Slide6

Rules for Audience

Be polite!

No interrupting

No rambling

No being mean

Slide7

Presentations

Alphabetical Order Based on Last Name

Tie-Breaker: First Name

Slide8

For next week

Think about how to improve your problem proposal

Rewrite your problem proposal based on the feedback you got today

Then email it to me for further feedback and a “thumbs-up” before the next class

Slide9

Assignment 2

Data Familiarization

“Mucking Around”

Get your data set

Open it in Excel (or another tool you prefer)

Look at your ground truth label (if you have one)

Look at other key variables

What does each variable mean semantically?

If numerical, what are its max, min, average, stdev? Create histograms of key variables.

If categorical, what is the distribution of each value?

Slide10

Assignment 2

Data Familiarization

“Mucking Around”

Write a brief report for me

You don’t need to prepare a presentation

But be ready to discuss what you learn about your data in class

Slide11

What if you don’t have data yet?

Get your data

Slide12

What if you don’t have data yet?

Get your data

If you don’t have your data yet, email me at least 48 hours before the assignment is due and I’ll send you a practice data set

Slide13

How to compute in Excel

If numerical, what are its max, min, average, stdev?

If categorical, what is the distribution of each value?
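For anyone using a tool other than Excel, the two questions above can be answered with a few lines of Python. This is only a minimal sketch using the standard library. The column names and values are made up for illustration and are not the real Class2Data. Note that statistics.stdev computes the sample standard deviation (n - 1 denominator).

```python
import statistics
from collections import Counter

# Hypothetical stand-ins for two columns of a data set (not the real Class2Data).
time_on_task = [12.0, 15.5, 9.0, 22.5, 15.5, 18.0]
help_requested = ["yes", "no", "no", "yes", "no", "no"]


def summarize_numeric(values):
    """Return the max, min, average, and sample standard deviation."""
    return {
        "max": max(values),
        "min": min(values),
        "average": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample stdev (n - 1 denominator)
    }


def summarize_categorical(values):
    """Return each distinct value with its count and its share of all rows."""
    counts = Counter(values)
    total = len(values)
    return {value: (count, count / total) for value, count in counts.items()}


print(summarize_numeric(time_on_task))
print(summarize_categorical(help_requested))
```

Running the same two functions over every key variable gives a quick first look at the data before any modeling.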

Using Class2Data

Slide14

How to do a histogram in Excel
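Outside Excel, the same idea can be sketched in Python: split the range of a variable into equal-width bins and count how many values fall into each one. The function name, the number of bins, and the sample scores below are all assumptions made for illustration. They are not taken from Class2Data.

```python
def histogram(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    low, high = min(values), max(values)
    # Fall back to a width of 1.0 when every value is identical.
    width = (high - low) / num_bins or 1.0
    counts = [0] * num_bins
    for value in values:
        # The maximum value goes into the last bin rather than a new one.
        index = min(int((value - low) / width), num_bins - 1)
        counts[index] += 1
    return [
        (low + i * width, low + (i + 1) * width, counts[i])
        for i in range(num_bins)
    ]


# Hypothetical values for one numerical variable.
scores = [55, 62, 68, 70, 71, 75, 78, 80, 85, 93]

for bin_start, bin_end, count in histogram(scores, 4):
    print(f"{bin_start:6.1f} to {bin_end:6.1f} | {'#' * count}")
```

Trying a few different bin counts is worthwhile, since too few bins can hide the shape of the distribution and too many can make it look noisy.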

Using Class2Data

Slide15

Next Session

2/4 Lab Session: Using RapidMiner

If you don’t know how to build a prediction model in RapidMiner, you should attend this session

If you do know how to build a prediction model in RapidMiner, you don’t have to attend

Slide16

Next Class After That

2/16 Data Cleaning (Assignment 2 due)

Do the assignment

Read the readings

Slide17

Note

2/9 No class

2/11 No class

Slide18

Questions? Comments? Concerns?