Challenges facing data-enabled interdisciplinary training

alexa-scheidler
Uploaded On 2018-03-09



Presentation Transcript

Slide1

Challenges facing data-enabled interdisciplinary training

Slide2

What is DESE?

If your science and engineering is not data-enabled…

…you’re not doing it right.

Slide3

http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram

Slide4

Slide5

Big Data in Agriculture (Today)

Syngenta Challenge: What seed varieties to plant?

Consider expected weather conditions, knowledge about the soil at their farms, and performance studies of candidate soybean varieties from numerous sources.

Slide6

Tomorrow

Problem becomes Gene (60K) × Environment (?) × Phenotype (thousands)

G × E = P

Visualize results so a farmer can understand them: actionable intelligence

Slide7

VELOCITY

Try it!

http://nextml.org/chemistry

Up to thousands of respondents at one time

With each choice, back end must update embedding and deliver new query without noticeable delay for user

Serious data-handling and infrastructure design challenge

Slide8

Variety

Slide9

Leveling the playing field

Everyone comes in with different skills and tool sets.

How do we get each discipline “up to speed” on critical data-science skills…

…without requiring extensive additional coursework / time to degree?

Slide10

One-way street problem

Students in computer science and engineering have data-science skills that apply broadly…

…but “apply skillset A to dataset B” != cross-disciplinary science.

What will engage interest from both computational and applied sides to promote true interactions?

Slide11

Tower of Babel 1

Data science tools and standards vary considerably across disciplines and even across labs…

…yet for students to interact, a common set of tools is required.

How do such standards get set, and what should they be?

Slide12

Tower of Babel 2

Each discipline has its own jargon, which can be efficient within a discipline but a barrier across disciplines.

Talks are hard to follow when (a) you can’t understand the terms and when (b) you have to stop to explain every third word.

How do we promote shared language for data science?

Slide13

Data science infrastructure

Means of collecting, sharing, and documenting data are proliferating.

Especially with big data, issues arise:

Privacy, data sharing, large data sets, documentation of data, etc.

What are the right tools and infrastructure for managing data storage, documentation, access, etc.?

Open science? Amazon? Wiki? GitHub? Slack? WordPress?

Slide14

Plan

Small group breakout 1 (15 min): Elaborate and rank-order the list of challenges

What are we missing? Add any additional challenges to the Google Doc

What is most pressing? Rank-order the listed challenges (last 2 min)

Report back (15 min): Which challenge was your table’s top ranked and why?

Small group breakout 2 (15 min): Top n challenges assigned to tables; regroup at a table that interests you and discuss solutions

Note solutions in the Google Doc

Report back: What are your table’s solutions?

Slide15

These are questions for you!

Teaching basic data science to students who are not in quantitative areas.

What basic skills should scientists have to at least get started?

How should these skills be taught?

How do we promote true interdisciplinary collaboration, rather than partitioning tasks by discipline?

How do we balance the utility of jargon versus its alienating effects? How do we best promote good communication from data science to discipline?

How do we manage variety and promote standards in software use and development?

How do we build infrastructure for big data sharing, security, and documentation?