Scientists See Promise in Deep-Learning Programs


lois-ondreau
Uploaded On 2016-12-10



Presentation Transcript

Slide1

Scientists See Promise in Deep-Learning Programs

Microsoft Seeks an Edge in Analyzing Big Data

Jeff Hawkins Develops a Brainy Big Data Company

Google Offers Big-Data Analytics

The Age of Big Data

How Big Data Became So Big

Why Hire a Lawyer? Computers Are Cheaper

Armies of Expensive Lawyers, Replaced by Cheaper Software

Slide2

The total amount of digital data in the world is estimated to exceed 1.8 zettabytes (1.8 trillion gigabytes).

The digital universe is doubling every 2 years.

85% of that data is owned or controlled by corporations at some point in its lifecycle.
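The doubling claim is ordinary compound growth. A minimal sketch, assuming the 1.8 ZB baseline and a constant two-year doubling period simply continue to hold (an extrapolation for illustration, not a figure from the IDC study); the function name is hypothetical:

```python
# Sketch: project data volume under a fixed doubling period.
BASELINE_ZB = 1.8       # baseline quoted on the slide, in zettabytes
DOUBLING_YEARS = 2.0    # doubling period quoted on the slide
GB_PER_ZB = 10**12      # 1 ZB = 10^21 bytes and 1 GB = 10^9 bytes


def projected_zettabytes(years_ahead: float) -> float:
    """Volume after `years_ahead` years, assuming constant doubling (an assumption)."""
    return BASELINE_ZB * 2 ** (years_ahead / DOUBLING_YEARS)


for years in (0, 2, 4, 10):
    zb = projected_zettabytes(years)
    print(f"+{years:>2} years: {zb:6.1f} ZB = {zb * GB_PER_ZB:.2e} GB")
```

Under these assumptions the volume grows 32-fold in ten years, which is why review approaches that scale linearly with document count become hard to sustain.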

Source: International Data Corporation (IDC) Study, 2012

Slide3

Big Data is Here

And it’s coming soon to a litigation near you…

What’s changed?

Slide4

The Great Commingling

Slide5

Redefining scalability in eDiscovery.

1

1000

1 x 10^12

Slide6

Predictive Coding is a Form of Machine Learning

What is Machine Learning?

Slide7

voice recognition software, e.g., calling your bank or credit card company

handwriting, facial or fingerprint recognition

analyzing market trends and guiding investment decisions

making decisions on applications for credit or loans

modeling and predicting severe weather patterns

filtering spam in your email inbox

targeted marketing on the internet

robotics

It’s already a part of our lives...

Slide8

KEY POINT: Predictive coding is just a part of a continuum of technology-assisted review (TAR) methods that we are already very familiar with in searching and analyzing data.

Key Words

Concept Clustering

Concept Search

Predictive Coding

Three supporting propositions:

Each successive approach incorporates the preceding approaches.

Each successive approach contains more supporting criteria.

All are ultimately based on the concept of pattern matching.

Slide9

Key Words

= Simple pattern matching

External input: “wild,” “wolf,” “pet”
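Keyword search in this sense is exact term matching. A minimal sketch, assuming simple lowercase word tokens and the three external input terms from this slide; the function name and the sample sentences are illustrative only:

```python
import re

# External input terms taken from the slide.
KEYWORDS = {"wild", "wolf", "pet"}


def keyword_hits(text, keywords=KEYWORDS):
    """Return the keywords that appear in `text` as exact, whole words."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(tokens & set(keywords))


print(keyword_hits("Can the wolf be domesticated?"))
print(keyword_hits("Wolves are sometimes kept as exotic pets"))
```

The second sentence returns no hits because "wolves" and "pets" are not the literal terms "wolf" and "pet". That gap is the kind of limitation the later approaches on the continuum try to close.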

dog, cat, rhino, ferret, goldfish, cow, wolf, domestic, wild, pet

Slide10

Concept Clustering

= Organization based on internal relationships

dog, cat, domesticated, wild, pet, rhino, ferret, goldfish, cow, wolf, tiger
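The bit strings that accompany the words on this slide are the 8-bit ASCII encodings of those words, which is how the software sees them: as patterns to compare, not as meanings. A minimal sketch of the round trip; the helper names are hypothetical:

```python
def decode_bits(bits: str) -> str:
    """Decode a string of 0s and 1s as a sequence of 8-bit ASCII characters."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


def encode_text(text: str) -> str:
    """Encode ASCII text as a string of 8-bit groups."""
    return "".join(f"{ord(ch):08b}" for ch in text)


print(decode_bits("011001000110111101100111"))  # prints: dog
print(encode_text("pet"))                       # prints: 011100000110010101110100
```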

01110111011010010110110001100100 (wild)

011001000110111101100111 (dog)

011100000110010101110100 (pet)

Slide11

Concept Searching

= Key words + Concept organization

External input: “zoo,” “wild,” “domesticated”

dog, cat, rhino, ferret, goldfish, cow, wolf, tiger, domesticated, wild, pet

farm

zoo

011110100110111101101111 (zoo)

01110111011010010110110001100100 (wild)

011001000110111101101101011001010111001101110100011010010110001101100001011101000110010101100100 (domesticated)

Slide12

Predictive Coding

= document-level input + probabilistic modeling

External input: human-coded documents

Output: doc-level probability rankings

dog, cat, rhino, ferret, goldfish, cow, wolf, tiger, domesticated, wild, pet

farm

zoo

011110100110111101101111 (zoo)

01110111011010010110110001100100 (wild)

011001000110111101101101011001010111001101110100011010010110001101100001011101000110010101100100 (domesticated)

Slide13

Infer

Step 1: Sample documents from entire set.

Slide14

Step 2: Attorney review of sample documents to create training and control set.

In the European mind, wolves long stood as a symbol of baneful, uncontrollable nature. As far back as the time of Aesop in 500 BCE (Before the Common Era), wolves in literature are portrayed as wicked villains and long-fanged, terrible beasts. Before the Middle Ages, wolves were nearly always the greedy thief, criminal trickster, or cruel remorseless murderer. The wolf does not fare well in the European imagination.

Can the wolf be domesticated?

The domesticated dog is descended from the wolf found in the wild. While some people have occasionally attempted to raise wolves as pets, their 2 ½ inch fangs and tendency to eat nearby small animals such as cats can create socially awkward situations with neighbors.

Responsive

Not Responsive

Slide15

Step 3: Create model from human-coded training set (responsive and not responsive).

In the European mind, wolves long stood as a symbol of baneful, uncontrollable nature. As far back as the time of Aesop in 500 BCE (Before the Common Era), wolves in literature are portrayed as wicked villains and long-fanged, terrible beasts. Before the Middle Ages, wolves were nearly always the greedy thief, criminal trickster, or cruel remorseless murderer. The wolf does not fare well in the European imagination.

Can the wolf be domesticated?

The domesticated dog is descended from the wolf found in the wild. While some people have occasionally attempted to raise wolves as pets, their 2 ½ inch fangs and tendency to eat nearby small animals such as cats can create socially awkward situations with neighbors.

wolves

wolf

pet

Word      Pos.    Neg.
wolf      .98     .08
dog       .56     .43
pet       .42     .28
raise     .61     .09
costner
dances

Word      Assoc       %
wolf      pet         .73
dog       wolf        .43
pet       raise       .88
raise     wolf        .61
raise     werewolf
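The slide does not say how the per-word figures are combined into a document-level score. One common choice is a naive Bayes style log-odds sum, sketched below. Treating the Pos. and Neg. columns as P(word | responsive) and P(word | not responsive), the 0.5 prior, and the function names are all assumptions made for illustration, not the method of any particular review product:

```python
import math
import re

# (Pos., Neg.) values copied from the slide's table. Reading them as
# conditional word probabilities is an assumption.
WORD_PROBS = {
    "wolf": (0.98, 0.08),
    "dog": (0.56, 0.43),
    "pet": (0.42, 0.28),
    "raise": (0.61, 0.09),
}


def log_odds(text: str) -> float:
    """Sum log(Pos/Neg) over the table words that occur in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return sum(
        math.log(pos / neg)
        for word, (pos, neg) in WORD_PROBS.items()
        if word in tokens
    )


def responsive_probability(text: str, prior: float = 0.5) -> float:
    """Turn the log-odds into a probability with the logistic function."""
    score = math.log(prior / (1 - prior)) + log_odds(text)
    return 1 / (1 + math.exp(-score))


for doc in ("Can the wolf be domesticated?", "Quarterly budget meeting notes"):
    print(f"{responsive_probability(doc):.2f}  {doc}")
```

A document containing none of the table words stays at the prior, while each matching word pushes the score toward responsive or not responsive in proportion to how lopsided its two values are.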

011001000110111101100111

Slide16

Step 4: Test model against sample (human-coded) set.

"Dances With Wolves" has the makings of a great work, one that recalls a variety of literary antecedents, everything from "Robinson Crusoe" and "Walden" to "Tarzan of the Apes." Michael Blake's screenplay touches both on man alone in nature and on the 19th-century white man's assuming his burden among the less privileged.

Wolves are sometimes kept as exotic pets, and in some rarer occasions, as working animals. Although closely related to dogs (which are believed to have split from wolves between 10,000 and 100,000 years ago), wolves do not show the same tractability as dogs in living alongside humans. Wolves also need much more space than dogs, about 10-15 sq. miles.

Slide17

Yes

No

Apply model to remainder of documents that have not been reviewed

Responsive

Non-responsive

Slide18

Step 5: Apply model to entire set and rank documents.
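Ranking amounts to sorting documents by their predicted probability and, for reporting, grouping them into 10% bands. A minimal sketch, assuming each document already has a probability score; the document IDs, scores, and function names are hypothetical:

```python
def rank_documents(scores):
    """Return (doc_id, probability) pairs, most likely responsive first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def decile_band(probability):
    """Label the 10%-wide band a probability falls in, e.g. 0.87 -> '80-90%'."""
    lower = min(int(probability * 10), 9) * 10
    return f"{lower}-{lower + 10}%"


# Hypothetical scores for four documents.
scores = {"DOC-001": 0.92, "DOC-002": 0.15, "DOC-003": 0.58, "DOC-004": 0.87}

for doc_id, p in rank_documents(scores):
    print(doc_id, f"{p:.0%}", decile_band(p))
```

Reviewers can then work from the top band downward and decide, with sampling, where further review stops being worthwhile.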

[Figure: ranking scale from 100% down to 0% in 10% steps]

Slide19

PREDICTIVE CODING AND BIG DATA

NYLJ/Pangea3 Webinar

April 15, 2013

Slide20

OUTLINE

Mitigating Big Data in E-Discovery

Stakeholder Analysis

The New Reality of Predictive Coding

Long-Term Trends

Slide21

Mitigating Big Data in e-discovery

Slide22

BIG DATA IN E-DISCOVERY

Bigger haystack: more documents in general

Corporate data culture: more relevant documents

More sources: poses collection/preservation challenges

Slide23

MITIGATING BIG DATA IN E-DISCOVERY

Some mitigating factors:

Principles of proportionality and cooperation

Information governance tools and document management

Technology-assisted review and predictive coding

Slide24

Stakeholder analysis

Slide25

PREDICTIVE CODING STAKEHOLDER ANALYSIS

Judges: generally receptive

Clients: cost efficiencies vs. risk management

Lawyers: new model, building expertise

Slide26

The new reality of predictive coding

Slide27

NEW REALITY OF PREDICTIVE CODING

Slide28

Long-term trends

Slide29

LONG-TERM TRENDS

Over time, Big Data growth > predictive coding benefits

Some document-by-document human review necessary

Strategic nuances in a new discovery battleground

Slide30

CONTACT PANGEA3

Slide31

SEARCH (1)

How do we search for discoverable ESI?

Manually?

With automated assistance?

Which is “better” and why?

M.R. Grossman & G.V. Cormack, “The Grossman-Cormack Glossary of Technology-Assisted Review,” 7 Fed. Cts. Law R. 1 (2013)

Maura R. Grossman & Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient than Exhaustive Manual Review,” XVII Rich. J.L. & Tech. 11 (2011) (available at http://jolt.richmond.edu/v17i3/article11.pdf)

For a “shorter” discussion, see Efficient E-Discovery, ABA Journal 31 (Apr. 2012)

Slide32

SEARCH (2)

Using search terms? How accurate are these? See In re National Ass’n of Music Merchants, Musical Instruments and Equipment Antitrust Litig., 2011 WL 6372826 (S.D. Cal. Dec. 19, 2011)

Slide33

SEARCH (3)

Automated review or “predictive coding” as an alternative to the use of search terms. For decisions which address automated review, see:

EORHB, Inc. v. HOA Holdings LLC, C.A. No. 7409 (Del. Ct. Ch. Oct. 15, 2012)

In re Actos (Pioglitazone) Prod. Liability Litig., MDL No. 6:11-md-2299 (W.D. La. July 27, 2012)

Da Silva Moore v. Publicis Groupe SA, 2012 U.S. Dist. LEXIS 23350 (S.D.N.Y. Feb. 24), aff’d, 11 Civ. 1279 (ALC) (AJP) (S.D.N.Y. Apr. 26, 2012)

Global Aerospace Inc. v. Landow Aviation, L.P., Consol. Case No. CL 61040 (VA Cir. Ct. Apr. 23, 2012)

Slide34

SEARCH (4)

WHAT LESSONS CAN BE DRAWN FROM THE DECISIONS?

Judge approved automated search at a “threshold” level. “Results” may be subject to challenge and later rulings.

Threshold superiority of automated vs. manual review recognized given volume of ESI and attorney review costs.

Large volumes of ESI in issue.

Party seeking to do automated review must offer “transparency of process” or something close to it.

“Reasonableness” of methodology is key.

Speculation by the opposing party is insufficient to defeat threshold approval.

Slide35

SEARCH (5)

LET’S TAKE A DEEP BREATH AND RECAP WHERE WE ARE TODAY, VENDOR HYPE NOTWITHSTANDING:

We have yet to see a judicial analysis of process and results in a contested matter.

Safe to assume that the proponent of a process will bear the burden of proof (whatever that burden might be).

Safe to assume at least some transparency of process may/will be expected.

If “reasonableness” is the standard, how reasonable must the results be? Is “precision” of 80% enough? 90%? Remember, there are no agreed-on standards.
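Precision is the share of documents flagged as responsive that really are responsive; recall is the share of truly responsive documents that were flagged. A minimal sketch of both measures against a human-coded control set; the document IDs and counts are invented for illustration:

```python
def precision_recall(predicted, actual):
    """Compare two sets of document IDs marked responsive.

    predicted: IDs the model flagged; actual: IDs the human reviewers flagged.
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical control set: reviewers marked DOC-001..DOC-020 responsive,
# and the model flagged DOC-013..DOC-022.
human_responsive = {f"DOC-{i:03d}" for i in range(1, 21)}
model_responsive = {f"DOC-{i:03d}" for i in range(13, 23)}

precision, recall = precision_recall(model_responsive, human_responsive)
print(f"precision={precision:.0%} recall={recall:.0%}")
```

In this made-up case precision is 80% while recall is only 40%, so a precision figure on its own says little about how many responsive documents were missed.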

Slide36

INTERLUDE

Assume a party makes production of ESI based on search terms proposed by an adversary. Assume further that the adversary suspects “something” is missing.

Is suspicion enough to warrant direct access to the party’s databases by a consultant retained by the adversary?

If not, what proofs should be required?

Will an attorney’s certification or affidavit suffice?

Will/should the attorney become a witness?

Will experts be needed?

Note, with regard to proofs, S2 Automation LLC v. Micron Technology, Inc., No. 11-0884 (D.N.M. Aug. 9, 2012), where the court, relying on Rule 26(g)(1), required a party to disclose its search methodology.

Slide37

INTERLUDE

A collision between search and ethics?

Assume a party’s attorney knows that search terms proposed by adversary counsel, if applied to the party’s ESI, will not lead to the production of relevant (perhaps highly relevant) ESI.

Absent a lack of candor to adversary counsel or the court under RPC 3.4 (which implies, if not requires, some affirmative statement), does not RPC 1.6 require the party’s attorney to remain silent?

What if the “nonproduction” is learned later? If nothing else, will the party’s attorney suffer bad “PR”?

If the party’s attorney wants to advise the adversary, should the attorney secure her client’s informed consent? What if the client says, “no?”

(with thanks to the Hon. John M. Facciola)

Slide38

INTERLUDE

AS WE THINK ABOUT SEARCH, THINK ABOUT THE ETHICS ISSUES THAT USE OF A NONPARTY VENDOR MAY LEAD TO!