Apache NiFi
Better Analytics Demand Better Dataflow
Presented by: Joe Witt, Apache NiFi PPMC Member

Apache NiFi’s job: Enterprise Dataflow Management
Automate the flow of data from any source…to systems which extract meaning and insight…and to those that store and make it available for users

Analytics need data with the following characteristics:
• Quality: Correct, complete, reliable
• Relevance: Right size, rate, format, schema, content, lightweight analysis
• Timeliness: All data has a half-life. Not all data is created equal.
• Secure: Confidential, unaltered
• Compliant: Authorized, traceable
• Recoverable: Errors happen. Iterate until it’s right.

Enterprise Dataflow: “What could possibly go wrong?”
Dataflow – Route, Transform, Mediate
Acquire, Analyze, Store

Dataflow across the enterprise
• Edge Sites
• Regional Sites
• Corporate Datacenters
• Partners

Challenges at the edge
Edge Sites
• Devices may
  - Have low power
  - Use legacy protocols and formats
  - Use emerging protocols and formats
• Communications may be
  - Unstable
  - High latency / Low throughput
  - Expensive
• Data acquired may be
  - Erroneous
  - Devoid of value or ‘noisy’
  - Time sensitive or tolerant
  - Of differing priority
  - Sensitive

Challenges at the core
Corporate Datacenters
• Data may need transformation
  - Enrichment
  - Format/schema conversion
  - Splitting or Aggregation
• Systems may be
  - Down, degraded, returning to service
  - Rate or throughput sensitive
  - Authorized for a subset of data
• Scaling and reliability
  - Controlled data loss only
  - Up (node efficient) & Out (global volume)
• Governance
  - Keeping track of all the information flows
  - Ability to understand and manage the flows
  - Ability to detect and recover from mistakes

Apache NiFi Foundational Concepts
1. The basic building blocks
2. Real-time Command and Control
3. The Power of Provenance

Flow File
• HEADER: UUID, Name, Size, Entry Time
• Attributes Map [[Key | Value]]
• CONTENT
• Types: Events, Objects, Files, Messages, Media
• Formats: JSON, Avro, Text, Mp4, Proprietary
• Sizes: Bytes to GBs
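
The Flow File slide describes a header (UUID, name, size, entry time), a key/value attributes map, and content. A minimal Python sketch of that shape is below. It is an illustration only and not NiFi's actual API: the class name `FlowFileSketch`, its field names, the `with_attribute` helper, and the example attribute key are all hypothetical.

```python
import time
from dataclasses import dataclass, field, replace
from typing import Dict
from uuid import uuid4


@dataclass(frozen=True)
class FlowFileSketch:
    """Hypothetical model of a Flow File: header fields, attributes, content."""

    name: str
    content: bytes = b""
    attributes: Dict[str, str] = field(default_factory=dict)
    # Header fields filled in when the object is created.
    uuid: str = field(default_factory=lambda: str(uuid4()))
    entry_time: float = field(default_factory=time.time)

    @property
    def size(self) -> int:
        # Size is derived from the content, in bytes.
        return len(self.content)

    def with_attribute(self, key: str, value: str) -> "FlowFileSketch":
        # Return a copy with one attribute added; header and content are kept.
        return replace(self, attributes={**self.attributes, key: value})


# Example: wrap a small JSON payload and tag it with an assumed attribute key.
reading = FlowFileSketch(name="reading.json", content=b'{"temp": 21.5}')
tagged = reading.with_attribute("source", "edge-site-1")
print(tagged.name, tagged.size, tagged.attributes)
```

Keeping the attributes separate from the content means a flow can be routed or tagged by attribute without reading the payload, which matters when content ranges from bytes to GBs.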

Flow File Processor

Connections

Flow Controller

NiFi Architecture

NiFi Clustering Model

2. Real-time command and control
• Tighten the feedback loop
  - Changes have consequences (good or bad)
  - And you see them as they occur
• Continuous Improvement
  - Compare real-time vs. historical statistics
  - View data provenance
  - View Content at any stage
• Intuitive user experience
  - Visual programming
  - Logical flow graph

3. The Power of Provenance aka “Dude, where’s my data?”
• Latency Optimization
  - Intra process
  - Inter process
  - End-to-end
• Compliance
  - Prove handling
  - Assess impact
• Understanding
  - Step through time
  - View content
  - View Context

Status and direction for NiFi
Existing Strengths
• Efficient use of each node
  - 100s of MB/s per node
  - 100Ks transactions/s per node
• Simple / Effective scaling model
• Runtime Command and Control
• Data Provenance
Roadmap Highlights
• Distributed durability of data
  - Maybe Kafka backed queues
• High Availability Cluster Manager
• Live / Rolling Upgrades
• Provenance Query Language / Reporting
• A complete user experience enabled by provenance

Learn more about…
• Apache NiFi (incubating) site: http://nifi.incubator.apache.org
• Subscribe to and collaborate at dev@nifi.incubator.apache.org
• Submit Ideas or Issues: https://issues.apache.org/jira/browse/NIFI
• @ApacheNifi