Microsoft Unlocks Business Value with Machine Learning

Uploaded On 2016-08-04



Presentation Transcript

Slide1

Microsoft Unlocks Business Value with Machine Learning

Val Fontama, PhD, Principal Data Scientist

Derek Bevan, Principal Software Architect

Slide2

Data & Decision Sciences Group

Data Science at Microsoft

Summary

Building Data Science team

Agenda

Slide3

Covering the Analytics Spectrum

Descriptive

Diagnostic

Prescriptive

Predictive

IT Professionals

Data Modeling, ETL, Data Warehousing, Data Marts and Cubes

Information Worker

Self-Service & Exploration with Power BI

Data Scientists

Advanced Analytics from Microsoft and 3rd parties

BI Enablement

Advanced Analytics

Enterprise Data Management

What happened?

Why did it happen?

What will happen?

What should I do?

Slide4

Microsoft DDSG - Vision, Mission and Services Offerings

Strategic Analytics Consulting

Data Science Community

Big Data Analytics

Big Data Innovation

Predictive and Prescriptive

Causality Studies

Fraud Detection

System Dynamics

Forecasting

Optimization

Big Data Insights & Visualization

Social & Sentiment Analysis

Web Analytics

POC & Pilot Enablement

Solution Design

Architectural Design Consulting

Community Development

Data Science Training

Data Driven Org Strategy

MCS & EPG Partnership

Industry Showcase

Global Field

External Client

Consulting

Simulation Modeling Services

Mission | Provide advanced analytic expertise to influence strategy and help drive efficiency, grow revenue and improve customer satisfaction

Vision | Build a Culture of Data Driven Decision Making

Slide5

Industries: DDSG Data Scientist Experience

Telecommunications: Fixed Line & Mobile

Financial Services: Banking, Insurance, Real Estate

Health Care: Pharmaceuticals, Biotechnology

Industry/Utility: Aerospace, Utility, Manufacturing

Slide6

Advanced Analytics at Microsoft

Slide7

How to build a predictive model?

1. Define Business Problem

2. Prepare Data

3. Develop Model Through Iterations

4. Deploy Model

5. Monitor Model’s Performance

Business Insights

Slide8

DDSG Solving Real Problems: Sample Client Engagements

Industry Stats

Windows Telemetry

SEGMENTATION

CYCLE TIME REDUCTION

Build a utilization-based customer segmentation by analyzing the clickstream from the Windows Telemetry panel

MS.COM - Targeting

TARGETING

SURFACE TABLET, WINDOWS PHONE 8

Target visitors that showed an interest in Surface, Windows Phone, Xbox on the basis of their MS.com/MS Store behavior

CRM Online

CHURN PREDICTION

PROACTIVE SUBSCRIBER RETENTION

Building a predictive churn model for CRM Online customers to help with retention

ISRM - Security

Enhance ISRM security monitoring and incident response capabilities. Detect potential threats on the Microsoft corporate network.

SECURITY

INTRUSION DETECTION

OEM – Unlicensed Devices

ROI, INSIGHT

WINDOWS 8 DEVICES

Analysis of ROI and development of actionable insight for marketing spend in OEM channels, including manufacturers, retailers and distributors

PIRACY DETECTION

REVENUE GROWTH OPPORTUNITY

Analyzing current trends in piracy of MS products and building models to identify instances of pirated software

LCA – Cybercrime Unit

Slide9

VIDEO

Cybercrime: Piracy Detection

“There’s no one country, business or organization that can tackle cybercrime threats alone. That’s why we invest in bringing partners into our center – law enforcement agencies, partners and customers – to work alongside us.”

Brad Smith, Microsoft’s general counsel and executive vice president of Legal and Corporate Affairs.

Slide10

Cybercrime: Piracy Prevention

Problem:

Cybercrime has cost governments, corporations and the public billions in recent years, but the techniques and level of proof required to solve enterprise cybercrime problems have been extremely challenging. In particular, lost revenue from software piracy impacts an enterprise’s bottom line.

Findings:

Microsoft’s teams combined cyber forensics, big data analysis and machine learning techniques to identify diverse piracy mechanics, stop 3 massive operations in different geographies and recoup over $5M in revenue

Applied Analytics led to stopping piracy at the source by ending a daily leak of license keys from a factory

As a result, several legal cases were recently brought to court

Methodology:

Technological advances and Data Science enabled the Microsoft Cybercrime Center, Legal and Corporate Affairs and Microsoft IT’s Data & Decision Sciences teams to effectively stop unlicensed activity and piracy, backed by the US Computer Fraud and Abuse Act

Microsoft IT DDSG mined large volumes of license related data; predictive models built by the Data Scientists were implemented to score millions of product keys that LCA used successfully to identify fraudulent behavior

Slide11

Preventing Network Intrusion with Machine Learning

Problem:

Detect suspicious activity on the network servers early and eliminate the threat.

Methodology:

File system to store massive security data.

Fully automated workflow to drive end-to-end data receiving and transformation process.

Analysis and visualizations of Windows Events to identify pre-defined threat scenarios.

Move from descriptive analytics to a mature predictive archetype.

Slide12

Churn Analysis

Problem:

A business line is experiencing 36% Churn annually

Findings:

Under-utilization is a key leading indicator (Low usage)

Each 1% reduction of churn results in ~$342K impact
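To make the churn finding concrete, here is a minimal sketch of the arithmetic. It assumes that "1%" means one percentage point of annual churn and that the ~$342K figure scales linearly; the slides state neither. The function name and the 30% target are illustrative only.

```python
# Figures quoted on the slide.
ANNUAL_CHURN_PCT = 36.0          # current annual churn, in percent
IMPACT_PER_POINT_USD = 342_000   # approximate impact per 1% churn reduction


def churn_reduction_impact(current_pct: float, target_pct: float,
                           impact_per_point: float = IMPACT_PER_POINT_USD) -> float:
    """Estimated impact of lowering churn, assuming a linear relationship."""
    if target_pct > current_pct:
        raise ValueError("target churn must not exceed current churn")
    return (current_pct - target_pct) * impact_per_point


# Hypothetical target: bring annual churn down from 36% to 30%.
estimate = churn_reduction_impact(ANNUAL_CHURN_PCT, 30.0)
print(f"Estimated impact: ~${estimate:,.0f}")
```

Under these assumptions a six-point reduction would be worth roughly six times the per-point figure; a real estimate would need the revenue model behind the slide's number.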

Methodology:

40% of data is missing or incomplete

Enumerated key leading indicators (drivers) of churn and scored every subscription with a probability of churn

Developed Random Forest model with ~65% accuracy

Slide13

Customer Targeting With Machine Learning

Problem:

To leverage the history of a person’s behavior on Microsoft.com to identify their interests and predict future actions

Predict which customers are likely to buy Surface or Windows Phone

Methodology:

Big Data Platform – HDP for Windows/Azure HDInsight and Advanced Analytics support

Develop statistical models to determine the probability of users buying a Surface Device

Slide14

Targeting Models Delivered

Windows Phone

Provided list of cookies that are more likely to land on a Windows Phone page

Monthly scoring during 3 months

Surface

Provided list of cookies that are more likely to buy a Surface

Monthly scoring since April 20, 2013
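Monthly scoring of this kind can be pictured as applying an already-fitted model to each cookie's behavior features and handing over the highest-scoring cookies. The sketch below assumes a logistic model; the feature names, weights, intercept and cookie IDs are all hypothetical and are not taken from the presentation.

```python
import math

# Hypothetical coefficients of an already-fitted logistic model.
WEIGHTS = {"surface_page_views": 0.8, "store_visits": 0.5, "days_since_last_visit": -0.1}
INTERCEPT = -3.0


def purchase_probability(features: dict) -> float:
    """Logistic score in (0, 1) for one cookie's behavior features."""
    z = INTERCEPT + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def top_cookies(cookie_features: dict, n: int) -> list:
    """Return the n cookie IDs with the highest scores, best first."""
    scored = {cid: purchase_probability(f) for cid, f in cookie_features.items()}
    return sorted(scored, key=scored.get, reverse=True)[:n]


# Hypothetical cookies with made-up behavior counts.
cookies = {
    "cookie-a": {"surface_page_views": 5, "store_visits": 2, "days_since_last_visit": 1},
    "cookie-b": {"surface_page_views": 0, "store_visits": 0, "days_since_last_visit": 30},
    "cookie-c": {"surface_page_views": 2, "store_visits": 1, "days_since_last_visit": 7},
}
print(top_cookies(cookies, 2))
```

Re-running `top_cookies` on each month's refreshed features would give the monthly list; the cut-off `n` is a business choice rather than a property of the model.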

New Targeting Models Developed for Surface and Windows Phone

Slide15

Path analysis

Geography analysis by Microsoft’s PowerMap

Big Data Analysis

5 months of logs from Microsoft.com

Analysis conducted using Power BI, SQL Server, & Hadoop

Understand the Big Picture of your website’s logs

Text Mining on external and internal queries

Recognize your users quickly before their behavior changes

Big Data Clustering models for user segmentation

Big Data Predictive models for user behavior / targeting

Do this for any sub-site, campaign, user segment, etc.

Leverage big data platform for ongoing model refinement

Slide16

Behavior Analysis

Slide17

Queries on Microsoft.com were logged during a specific time range. The engineering team wanted to know the popular “topics” in this collection of queries (documents).

A text miner tool pre-processed 3 million queries and constructed 25 thematic topics formed by “key words”. The 5 most popular “topics” are listed below.

Text Analysis

Category | Topic Id | Doc cutoff | Terms cutoff | Topic | Num of terms | Num of queries
Multiple | 5.0 | 5.032 | 0.397 | +window, +live, windowsmedia, xp, aspx | 26.0 | 21633.0
Multiple | 15.0 | 3.074 | 0.304 | xp, +window, sp3, xp service pack, +download | 44.0 | 18299.0
Multiple | 13.0 | 3.353 | 0.316 | +window, +vista, +installer, +mobile, +phone | 77.0 | 17771.0
Multiple | 2.0 | 5.804 | 0.432 | +medium, +player, +window, +download, +window | 19.0 | 16713.0
Multiple | 4.0 | 4.999 | 0.402 | +office, +microsoft office, microsoft, +mac, +download | 24.0 | 13088.0

Internal (i.e. on direct Microsoft pages)

Category | Topic Id | Doc cutoff | Terms cutoff | Topic | Num of terms | Num of queries
Multiple | 5.0 | 8.793 | 0.367 | +window, +phone, +bit, +theme, +install | 177.0 | 213487.0
Multiple | 9.0 | 8.133 | 0.343 | microsoft, +microsoft office, +microsoft word, +microsoft essential, +microsoft outlook | 140.0 | 144995.0
Multiple | 10.0 | 7.305 | 0.337 | +window, +phone, +installer, +vista, +server | 174.0 | 132050.0
Multiple | 25.0 | 3.152 | 0.228 | +error, +server, +file, +code, sharepoint | 545.0 | 104760.0
Multiple | 8.0 | 7.818 | 0.343 | +download, +free, +window, +explorer, microsoft | 128.0 | 85837.0

External (i.e. referrals from Google, Yahoo, etc.)

Slide18

Windows OS users

Internal queries

This chart shows groups of similar queries. There are 15 end nodes in total, representing 15 groups. Almost all of these groups are product related.
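Grouping queries into topics rests on mapping each query to a set of key words. The sketch below is a crude keyword-overlap stand-in, not the text miner tool used in the presentation: the topic names, key-word sets, plural-stripping rule and sample queries are all hypothetical.

```python
from collections import Counter

# Hypothetical topics, each defined by a handful of key words.
TOPICS = {
    "windows-xp": {"xp", "sp3", "window", "download"},
    "office": {"office", "microsoft", "mac", "word"},
}


def tokenize(query: str) -> set:
    """Lower-case a query and strip a trailing "s" from longer words."""
    words = query.lower().split()
    return {w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words}


def assign_topic(query: str):
    """Pick the topic sharing the most key words with the query, or None."""
    tokens = tokenize(query)
    best = max(TOPICS, key=lambda name: len(TOPICS[name] & tokens))
    return best if TOPICS[best] & tokens else None


# Made-up queries; the last one matches no topic.
queries = ["Windows XP SP3 download", "microsoft office for mac", "xbox live"]
counts = Counter(assign_topic(q) for q in queries)
print(counts)
```

Counting assignments per topic gives a figure analogous to a "Num of queries" column, although a real text-mining tool derives its topics from the data instead of from a hand-written dictionary.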

Text Analysis

Slide19

Results – Better Targeting increased Revenue

Better customer targeting

Targeting coverage improved by 5% due to predictive models and other measures!

Increased revenue from display Ads

Targeted Ads generated up to 19% of revenue

Revenue per 1000 impressions grew by over 8X

Revenue per click grew by 6X!

Slide20

Building a Data Science Team

Slide21

Data Science Team Composition

Team Experience:

Our Academic Backgrounds

Applied Mathematics

Computer Science

Econometrics

Statistics

Engineering

Our Professional Expertise

Financial Services

Telecommunications

Information Technology

Industrials/Manufacturing

Utilities

Healthcare

Marketing

Domain Experience:

Forecasting/Modeling

Demand Forecasting

Predictive Modeling

Demand-Driven Planning

Credit Modeling

Fraud Detection

Consumer Relations

Sentiment Analysis/Social Media

Inventory Optimization

Customer Acquisition/Segmentation

Membership Portfolio Optimization

Clickstream Data Analysis

Data Science

Design of experiments

Predictive Maintenance

Machine Learning

Big Data Analytics/Innovation

…a key resource for delivering value to the enterprise and your business

Slide22

The Roles: Data Scientist & Customer

…key resources, engaged collaboration essential for delivering value to the enterprise

Data Scientist

Scientific Method

Domain Knowledge

Intellectual Curiosity & Critical Thinking

Visualization & Communication

Math & Statistics

Advanced Computing & Data Management

Business Problem

Insights for Decision Making

Ethical Considerations

Objectivity

Hypotheses

Validation

Transparency

Dialog With Business

Problem Description

Options Considered

Receptive to Conclusions

Customer, Partner, Stakeholder

Slide23

Best Practices

Data Science is a team sport

Hire complementary skills to build a rounded team!

We need a hybrid Data Science team structure for best results

Need a centralized team of Data Scientists to share and promote best practices

And Data Scientists in Line of Business groups for domain knowledge

The Data Science team needs to be a peer of, but not inside, a BI team

Analytics team should span descriptive, diagnostic, predictive and prescriptive analytics

BI only covers descriptive and diagnostic

Data Scientist in a BI team may be under-utilized

Slide24

Introduced Data & Decision Sciences Group

Data Science at Microsoft

Cybercrime and antipiracy

Network intrusion

Customer churn prediction

Customer targeting models

Building a Data Science team

Summary

Slide25

Resources

Learning

Microsoft Certification & Training Resources

www.microsoft.com/learning

msdn

Resources for Developers

http://microsoft.com/msdn

TechNet

Resources for IT Professionals

http://microsoft.com/technet

Sessions on Demand

http://channel9.msdn.com/Events/TechEd

Slide26

Complete an evaluation and enter to win!

Slide27

Evaluate this session

Scan this QR code to evaluate this session.

Slide28

© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Slide29

Slide30

Advanced Telemetry Analytics for Windows

Problem:

We needed a behavior-based customer segmentation for Windows and Office

Very large volumes of telemetry data are collected – over 1.7 Billion mouse clicks and 2.4 Billion keystrokes

Findings:

Successfully developed 7 user behavioral segments

Prioritize investments around activities people do most

Methodology:

How can we effectively mine and extract meaning from the data?

Used clustering techniques to segment data that included hardware, app usage, user data, URLs visited
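The slides do not say which clustering algorithm was used. As an illustration only, the sketch below applies plain k-means to a handful of hypothetical per-user feature vectors; the feature choice, the values, the seeding with the first k points and k=2 are all assumptions made to keep the example small.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain k-means: returns (centroids, labels). Seeds with the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels


# Hypothetical per-user features: (mouse clicks per day, keystrokes per day).
users = [(120, 300), (900, 4000), (100, 250), (950, 4200), (130, 320), (880, 3900)]
centroids, labels = kmeans(users, k=2)
print(labels)
```

With real telemetry, features such as hardware, app usage and URLs visited would need encoding and scaling before distances are meaningful, and the number of segments (7 in the presentation) would be chosen by inspecting the resulting clusters.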