Skills and Training for Big Data Projects

Uploaded On 2020-07-02


ESS Big Data Workshop 2016, 13-14 October 2016, Ljubljana. Philippe NIEUWBOURG.

Presentation Transcript

Slide1

Skills and Training for Big Data Projects

ESS Big Data Workshop 2016, 13-14 October 2016, Ljubljana

Philippe NIEUWBOURG

philippe.nieuwbourg@decideo.com

@nieuwbourg

Slide2

What I will cover

New opportunities coming from new data sources
Skills framework for Big Data
Why it’s urgent to act and launch proofs of concept

Slide3

Wake-up call!

Wayne Smith, Chief Statistician of Canada, resigned on Sept. 16.

“Statistics Canada needed to be more agile because it was facing huge challenges in a world of big data including: demands for up-to-the-minute information that businesses and planners rely on, declining response rates on traditional surveys, and meeting the government’s need for statistics in new policy fields”

Slide4

A Big Data Revolution?

Slide5

A Big NEW Data Revolution?

Slide6

Volume – Velocity – Variety

Volume is the least important
Velocity is a challenge for people working on long-term trends
  Ability to connect to and capture feeds of data coming from social networks, web applications, sensors…
Variety is a challenge for everybody
  We need to invent new ways to find insights in unstructured data
  People today give much more valuable information on social media and cell phones, through statuses, videos, and photos, than through questionnaires

Slide7

New Data Sources

From data sources YOU create and control
To data sources collected by others
  Data quality issues
  Data feeds
  Data to buy or trade
  Data you don’t keep control over:
    Format
    Sustainability
  New data sources, and data sources that disappear

Slide8

Examples

Data coming from mobile phones: geolocation, content served, usage…
Data coming from smart cities: sensors, security cameras…
Data coming from social networks: statuses, images, videos, links, graphs
All data coming from private companies which could be used to understand behaviors:
  Uber data could be used to understand people's mobility
  Airbnb data could be used to understand how and where people travel
  Pinterest is used to understand how people cook, what kind of fashion and decoration they like…

Slide9

Skills framework

Pillars and main skills

Data Management
  Manipulate high-volume datasets
  Move from surveys to raw data
  Deal with dirty data
  Generate structured data from unstructured data
  Deal with uncertainty (new data sources, dead data sources, format evolution…)

Mathematics & Statistics
  Evolution of statistical models
  New analytic techniques based on large data sets
  Impact of big data on existing statistical and analytic techniques
  Find value in new types of datasets (e.g. social media) through network analysis
  Impact of "unlimited" available resources on analytics processes
  Predictive analytics and machine learning

Communication
  Grammar of graphics
  Data storytelling

Legal, Ethics, and Privacy
  Refer to the deliverable D.1.1.
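
As an illustration of the network-analysis skill listed above, here is a minimal sketch in Python. It is not taken from the presentation: the sample posts, the account names, and the helper functions `mention_edges` and `in_degree` are all hypothetical. It turns unstructured status text into structured (author, mentioned account) pairs, then counts how often each account is mentioned, which is a very simple centrality measure.

```python
import re
from collections import Counter

# Hypothetical sample posts as (author, text) pairs; not real data.
POSTS = [
    ("alice", "Great talk by @bob and @carol on open data"),
    ("bob", "Thanks @alice!"),
    ("dave", "Agree with @bob about sensors"),
    ("carol", "@bob @alice see you in Ljubljana"),
]

# A mention is an "@" followed by word characters.
MENTION = re.compile(r"@(\w+)")


def mention_edges(posts):
    """Turn free text into structured (author, mentioned_account) pairs."""
    return [
        (author, name.lower())
        for author, text in posts
        for name in MENTION.findall(text)
    ]


def in_degree(edges):
    """Count how many times each account is mentioned by others."""
    return Counter(target for _, target in edges)


edges = mention_edges(POSTS)
ranking = in_degree(edges).most_common()
print(ranking)
```

On a real feed the same idea would usually be handed to a graph database or a dedicated graph library; the sketch only shows the shape of the task.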

Slide10

Business Skills

Legal & Ethics
Identify data sources
Negotiate data sources
Create value from data and analysis
Sell data and analysis

Slide11

Data Management skills (collect & clean)

Collect or access existing data, with the category of tools and examples of requested tools for each task:

Collect data from social media and other sources
  Category of tools: API programming; social media data resellers
  Examples of requested tools: Restlet, Gnip, Datasift, Topsy, Flume

Transform data
  Category of tools: ETL
  Examples of requested tools: Talend, Informatica, IBM DataStage, Microsoft Integration Services, Ab Initio, Sqoop

Clean data
  Category of tools: Data Quality Management
  Examples of requested tools: Trillium, DataMentors, Pixata, Ataccama, Zookeeper
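
To make the "clean data" task more concrete, the following is a minimal sketch in plain Python rather than in any of the tools named on the slide. The records, the field names, and the helper functions `parse_date` and `clean` are hypothetical and chosen only for illustration. It removes duplicate identifiers, normalizes text, and converts unparseable values to `None` instead of guessing.

```python
from datetime import datetime

# Hypothetical raw records, as they might arrive from a data feed.
RAW_RECORDS = [
    {"id": "1", "city": " Ljubljana ", "visits": "12", "date": "2016-10-13"},
    {"id": "1", "city": "ljubljana", "visits": "12", "date": "2016-10-13"},  # duplicate id
    {"id": "2", "city": "Maribor", "visits": "n/a", "date": "2016-10-14"},   # bad number
    {"id": "3", "city": "", "visits": "7", "date": "14/10/2016"},            # other date format
]

# Date formats we assume the feed may use.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(value):
    """Try each known format; return None if none of them matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    return None


def clean(records):
    """Drop repeated ids, trim and normalize text, coerce numbers and dates."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec["id"] in seen:
            continue  # keep only the first record for each id
        seen.add(rec["id"])
        visits = rec["visits"].strip()
        cleaned.append({
            "id": rec["id"],
            "city": rec["city"].strip().title() or None,
            "visits": int(visits) if visits.isdigit() else None,
            "date": parse_date(rec["date"]),
        })
    return cleaned


cleaned = clean(RAW_RECORDS)
for row in cleaned:
    print(row)
```

Keeping missing values explicit as `None` leaves the decision about imputation to the statistician instead of hiding it in the cleaning step.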

Slide12

Data management skills (Store)

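Store data, with the category of tools and examples of requested tools for each task:

Store structured data
  Category of tools: RDBMS
  Examples of requested tools: Microsoft SQL Server, MySQL, PostgreSQL, HBase, SybaseIQ, Google BigQuery, Amazon Web Services

Store XML data
  Category of tools: XML databases
  Examples of requested tools: VelocityDB, TaffyDB, XML:DB, eXistdb

Store unstructured data
  Category of tools: NoSQL; HDFS
  Examples of requested tools: Cassandra, MongoDB; Cloudera, Hortonworks…

Store relations between objects
  Category of tools: Graph databases
  Examples of requested tools: Neo4j, Tibco Graph Database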

Slide13

Analyze data

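Analyze data, with the category of tools and examples of requested tools for each task:

Query data, manipulate data sets
  Category of tools: New tools used for NoSQL and Hadoop environments
  Examples of requested tools: MapReduce, Spark, Hive, Pig

Statistics
  Category of tools: Statistical software
  Examples of requested tools: SAS, SPSS, Statistica, Excel, R

Develop your own analytical tools
  Category of tools: Programming languages
  Examples of requested tools: Java, Python, Scala, Ruby, Julia

Machine learning
  Category of tools: Software; programming languages
  Examples of requested tools: Dataiku, TIMi, Mahout; Python, R, Scala, Google TensorFlow

Data mining, text analytics, sentiment analysis
  Category of tools: Data mining tools
  Examples of requested tools: RapidMiner, Weka, Angoss, Kxen (SAP), Tanagra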
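
For readers who have not met MapReduce, one of the tools named above, here is a minimal single-process sketch of the pattern in Python. It is an illustration only, not Hadoop or Spark code: the sample documents and the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical. A real cluster distributes the same three phases across many machines.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical input documents.
DOCUMENTS = ["big data big skills", "data sources and data feeds"]


def map_phase(doc):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in doc.lower().split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key, here by summing them."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(docs):
    """Run map, shuffle, and reduce in sequence over all documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in docs)
    return reduce_phase(shuffle(mapped))


counts = word_count(DOCUMENTS)
print(counts)
```

Because the map step looks at one document at a time and the reduce step at one key at a time, both can run in parallel, which is what makes the pattern suitable for high-volume datasets.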

Slide14

Communication skills

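Visualize and share data, with the category of tools and examples of requested tools for each task:

Create charts and dashboards
  Category of tools: Self-service Business Intelligence solutions
  Examples of requested tools: Qlik, Tableau, Tibco Spotfire, Yellowfin, Microsoft PowerBI, Domo, Zoomdata

Program your own charts
  Category of tools: Programming languages; graphics libraries
  Examples of requested tools: R, Python; Plot.ly

Create maps
  Category of tools: Geospatial software
  Examples of requested tools: Galigeo, Esri, CartoDB

Tell a story about your data
  Category of tools: Data storytelling tools
  Examples of requested tools: Tableau, Qlik, Yellowfin, Microsoft PowerPoint, Prezi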

Slide15

NSOs in a competitive world

Slide16

NSOs are not the only source of data

What is the best source to understand people's mobility in a country? Mobile telecommunication providers?
What is the best source to understand what people think of politics? Twitter, Facebook?
What is the best source to measure the happiness of a population? Instagram, Facebook?
What is the best source to measure what people spend on vacation? Anonymous data from banks?

Slide17

GAFAs want to make money from data

Google, Apple, Facebook, Amazon…
Telecommunications companies, banks, web services…
They spend money to collect user data
They know, better than you, the value of the data they collect
They want to make money from that data

Slide18

Do you know Watson?

Slide19

Will you support or fight Algorithmic Regulation?

Algorithmic regulation is a system of governance in which more exact data, collected from citizens via their smart devices and computers, are used to organize collective human life more efficiently.
Big Data is the key to algorithmic regulation.
Who wants to replace you? Every Big Data platform, if it can make money from its data!

Slide20

Skills and Training for Big Data Projects

ESS Big Data Workshop 2016, 13-14 October 2016, Ljubljana

Philippe NIEUWBOURG

philippe.nieuwbourg@decideo.com

@nieuwbourg