
Big Data and the Internet of Things (IoT) - PowerPoint Presentation


IoT Opportunities and Challenges. Thiab Taha, Computer Science Department, University of Georgia, Athens, GA, USA (thiab@cs.uga.edu). The 8th International Conference on Information Technology.




Presentation Transcript

1. Big Data and the Internet of Things (IoT): Opportunities and Challenges. Thiab Taha, Computer Science Department, University of Georgia, Athens, GA, USA. thiab@cs.uga.edu. The 8th International Conference on Information Technology (ICIT 2017), Internet of Things, May 17-18, 2017, Amman, Jordan.

2. What is the Internet of Things (IoT)? (https://www.nsf.gov/pubs/2017/nsf17072/nsf17072.jsp?WT.mc_id=USNSF_25&WT.mc_ev=click) IoT is generally understood to refer to the internetworking of physical devices that contain electronics, sensors, actuators, and software, and that are able to collect and exchange data about, and in some cases interact with, the physical environment. In brief: the IoT refers to devices that collect and transmit data via the internet.
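To make the "collect and transmit data via the internet" loop concrete, here is a minimal device-side sketch in Python: take a reading and POST it as JSON over HTTP. The endpoint URL, device ID, and sensor function are invented placeholders; a real deployment would substitute its own ingestion service and sensor driver.

```python
import json
import time
import urllib.request

# Hypothetical ingestion endpoint -- a stand-in for whatever cloud
# service a real deployment would use.
INGEST_URL = "https://example.com/api/readings"

def read_temperature_c() -> float:
    """Placeholder for a real sensor driver (e.g., an I2C or 1-Wire read)."""
    return 21.5

def publish_reading() -> None:
    # Package one reading as JSON and POST it to the ingestion service.
    payload = json.dumps({
        "device_id": "sensor-001",          # hypothetical device name
        "temperature_c": read_temperature_c(),
        "timestamp": time.time(),
    }).encode("utf-8")
    req = urllib.request.Request(
        INGEST_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print("server replied:", resp.status)

if __name__ == "__main__":
    publish_reading()
```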

3. http://www.dyogram.com/2017/04/internet-of-things-iot-market-potential-trends-in-2017-and-beyond/

4. Internet of Things (IoT). The IoT faces several challenges, including security, privacy, scalability, design complexity, safety resulting from the lack of human control over systems, software flexibility, etc. Another major challenge of IoT is processing, storing, and analyzing the large amounts of data that come from so many different sources. On the other hand, the IoT has many applications that are extremely useful in our daily life, such as smart cars, smart cities, home appliances and security, health-tracking wearable devices, weather monitors, etc. The IoT has the potential to increase efficiency, accuracy, safety, and convenience through increased interconnection and intelligence of the integrated physical and computing environments. It has the potential to impact every aspect of daily life.

5. Internet of Things (IoT). At the same time, the trend toward connecting "everything" is rapidly expanding the perimeter, or surface of concern, that must be secured, and the potential for exposure of personally identifiable information.

6. Protecting the Internet of Things. While the car-hacking work garnered the most public attention, there are other important security weaknesses. Tadayoshi Kohno was an author on the first publications demonstrating the security risks of wirelessly reprogrammable pacemakers and defibrillators. Doctors disabled the wireless mechanism in former US Vice President Dick Cheney's pacemaker to thwart hacking. Kohno stresses that the benefits of these devices outweigh the security risks and that patients should have no qualms about using them. However, he believes that device manufacturers must improve the security of current and future devices.

7. Given the projected impact of IoT in nearly every industry sector, including healthcare, agriculture/farming, manufacturing, energy, transportation, communication, security, finance, clothing, and sports, foundational precompetitive research is important to enable designs and applications that meet critical performance, security, and privacy guarantees.

8. Internet of Things (IoT). ABI Research's latest data on the Internet of Everything (IoE) show that there are more than 10 billion wirelessly connected devices in the market today, with over 30 billion devices expected by 2020 (https://www.abiresearch.com/press/more-than-30-billion-devices-will-wirelessly-conne/). This year we will have 4.9 billion connected things (forbes.com). "The emergence of standardized ultra-low-power wireless technologies is one of the main enablers of the IoE, with semiconductor vendors and standards bodies at the forefront of the market push, helping to bring the IoE into reality," said Peter Cooney, practice director. "The year 2013 is seen by many as the year of the Internet of Everything, but it will still be many years until it reaches its full potential. The next 5 years will be pivotal in its growth and establishment as a tangible concept to the consumer."

9. [image slide]

10. Gartner, Inc. forecasts that 8.4 billion connected things will be in use worldwide in 2017, up 31 percent from 2016, and that the number will reach 20.4 billion by 2020. Total spending on endpoints and services will reach almost $2 trillion in 2017. Regionally, Greater China, North America, and Western Europe are driving the use of connected things, and the three regions together will represent 67 percent of the overall Internet of Things (IoT) installed base in 2017.
Consumer applications to represent 63 percent of total IoT applications in 2017: the consumer segment is the largest user of connected things, with 5.2 billion units in 2017, which represents 63 percent of the overall number of applications in use (see Table 1). Businesses are on pace to employ 3.1 billion connected things in 2017. "Aside from automotive systems, the applications that will be most in use by consumers will be smart TVs and digital set-top boxes, while smart electric meters and commercial security cameras will be most in use by businesses," said Peter Middleton, research director at Gartner. http://www.gartner.com/newsroom/id/3598917 Gartner, Inc. (NYSE: IT) is the world's leading information technology research and advisory company.

11. Gartner Says 8.4 Billion Connected "Things" Will Be in Use in 2017, Up 31 Percent From 2016; Consumer Applications to Represent 63 Percent of Total IoT Applications in 2017.

Table 1: IoT Units Installed Base by Category (Millions of Units)

Category                        2016       2017       2018       2020
Consumer                     3,963.0    5,244.3    7,036.3   12,863.0
Business: Cross-Industry     1,102.1    1,501.0    2,132.6    4,381.4
Business: Vertical-Specific  1,316.6    1,635.4    2,027.7    3,171.0
Grand Total                  6,381.8    8,380.6   11,196.6   20,415.4

Source: Gartner (January 2017)

12. To give you some perspective on IoT being the "next big thing," here is what analysts are predicting for the IoT market:
Bain Capital (an investment firm) predicts that by 2020 annual revenues could exceed $470B for the IoT vendors selling the hardware, software, and comprehensive solutions.
McKinsey & Company (a consulting firm) estimates the total IoT market size in 2015 was up to $900M, growing to $3.7B in 2020, with a potential economic impact of $2.7T to $6.2T by 2025.
General Electric predicts that investment in the Industrial Internet of Things (IIoT) will top $60 trillion during the next 15 years.
IHS (the market research firm, now IHS Markit) forecasts that the IoT market will grow from an installed base of 15.4 billion devices in 2015 to 30.7 billion devices in 2020 and 75.4 billion in 2025.

13. Below is the global market potential (in billions) of IoT devices owned from 2015 to 2025. In those 10 years, that market will increase by 489.55% globally . . . Wowzer! We really should care about IoT because empirical data show a promising future and exponential market growth. With 75.4 billion IoT devices in the world by 2025, every person and business will own and use many devices. We should be actively engaged and knowledgeable about this space. Not only should we be engaged from an investing perspective, but IoT can totally change your life or business for the better. https://www.facebook.com/MaciejKranzInnovation/posts/1819969151611805. Roundup of Internet of Things Forecasts and Market Estimates, 2016.

14. In addition to smart meters, applications tailored to specific industry verticals (including manufacturing field devices, process sensors for electrical generating plants, and real-time location devices for healthcare) will drive the use of connected things among businesses through 2017, with 1.6 billion units deployed. From 2018 onwards, cross-industry devices, such as those targeted at smart buildings (including LED lighting, HVAC, and physical security systems), will take the lead as connectivity is driven into higher-volume, lower-cost devices. In 2020, cross-industry devices will reach 4.4 billion units, while vertical-specific devices will amount to 3.2 billion units.
Business IoT spending to represent 57 percent of overall IoT spending in 2017: while consumers purchase more devices, businesses spend more. In 2017, in terms of hardware spending, the use of connected things among businesses will drive $964 billion (see Table 2). Consumer applications will amount to $725 billion in 2017. By 2020, hardware spending from both segments will reach almost $3 trillion.

15. Table 2: IoT Endpoint Spending by Category (Millions of Dollars)

Category                        2016       2017       2018       2020
Consumer                     532,515    725,696    985,348  1,494,466
Business: Cross-Industry     212,069    280,059    372,989    567,659
Business: Vertical-Specific  634,921    683,817    736,543    863,662
Grand Total                1,379,505  1,689,572  2,094,881  2,925,787

Source: Gartner (January 2017). http://www.gartner.com/newsroom/id/3598917

16. [image slide]

17. IHS Automotive: The number of cars connected to the Internet worldwide will grow more than sixfold, to 152 million in 2020 from 23 million in 2013.
Navigant Research: The worldwide installed base of smart meters will grow from 313 million in 2013 to nearly 1.1 billion in 2022.
Morgan Stanley: Driverless cars will generate $1.3 trillion in annual savings in the United States, with over $5.6 trillion in savings worldwide.
Machina Research: Consumer electronics M2M connections will top 7 billion in 2023, generating $700 billion in annual revenue.
On World: By 2020, there will be over 100 million Internet-connected wireless light bulbs and lamps worldwide, up from 2.4 million in 2013.
Juniper Research: The wearables market will exceed $1.5 billion in 2014, double its value in 2013.

18. [image slide]

19. Farmers are more closely monitoring crops with the help of sensor networks to ensure a better yield, and factory owners are monitoring operations to spot maintenance issues without requiring costly shutdowns. Major contractors have begun to add sensors to buildings and other large infrastructure as they're being built, hooking them up to simulation engines to spot flaws, inefficiencies, and costly over-engineering before the problems are baked into the design. A few forward-looking cities such as Singapore are using IoT to monitor water networks for leaks, and the shipping industry is beginning to add sensors to crates of perishable food or medicines.

20. NSF: IoT is expected to become ubiquitous, with implementations in the smart home (management of energy use, control of appliances, monitoring of food and other consumables); consumer applications (health and fitness monitoring, condition diagnosis); manufacturing and industrial settings (supply chain management, robotic manufacturing, quality control, health and safety compliance); utility grids and other critical infrastructure (grid optimization, automated fault diagnosis, automated cyber security monitoring and response); and automotive/transportation (optimization for driving conditions, assessing driver alertness, collision/accident avoidance, managing vehicle health). Market verticals that are potentially impacted by innovations in this area include Connected Cities and Homes, Smart Transportation, Smart Agriculture, Industrial IoT, and Retail IoT. In the home, energy savings lead the way, followed by security, remote monitoring, and automation gizmos such as a timed sprinkler system. Proposals are encouraged that address key challenges across the full range of IoT applications.

21. On March 29, 2017, the National Science Foundation (NSF) issued what is called a "Dear Colleague Letter" (DCL), which encourages collaborations between industry, academe, and government in research related to IoT specifically and, more broadly, cyber-physical systems. The aim is to establish multiple Industry-University Cooperative Research Centers (IUCRCs) that, in collaboration with their industry partners, are capable of collectively addressing large-scale and cross-disciplinary challenges in the broad context of IoT. NSF therefore welcomes and encourages proposals in response to the IUCRC program solicitation, NSF 17-516, in the areas outlined in this DCL.

22. NSF: Potential areas of precompetitive research that are of interest include, but are not limited to:
Mobile technologies and applications;
Healthcare and biomedical technologies;
Smart grids and energy management;
IoT platforms, sensors, controls, and actuators;
Agriculture and farming-based applications;
Smart City/Community applications;
Transportation and traffic management systems;
Industrial and manufacturing applications;
Metrics, measurements, and benchmarking;
Standards, practices, and policies (e.g., legal, regulatory); and
Trust, security, and privacy in IoT.

23. NSF: The successful realization of an IoT-enabled world will thus depend not only on solving technical and engineering challenges, but will also require significant collaboration among academe, industry, and government to develop thoughtful and well-crafted standards, practices, and policies (including legal and regulatory) that take into account the complexities and societal implications of the IoT. To this end, any proposed IUCRC in any area related to IoT must include a clear and compelling plan to address relevant trust, security, and privacy issues within the overall mission of the proposed Center.

24. Technical challenges also remain before IoT will reach its true potential. Yet all the key technologies have passed the thresholds required for substantial ROI. Sensors, wireless radios, and processors are getting smaller, cheaper, and more power-efficient. The hard part is hooking it all up. There are a good number of open source software projects for the Internet of Things (https://www.linux.com/news/21-open-source-projects-iot):
Home Assistant (https://home-assistant.io/) -- This up-and-coming grassroots project offers a Python-oriented approach to home automation. See our recent profile on Home Assistant.
Mainspring (http://www.m2mlabs.com/framework) -- M2MLabs' Java-based framework is aimed at M2M communications in applications such as remote monitoring, fleet management, and smart grids. Like many IoT frameworks, Mainspring relies heavily on a REST web service, and offers device configuration and modeling tools.

25. Physical Web/Eddystone (https://google.github.io/physical-web/) -- Google's Physical Web enables Bluetooth Low Energy (BLE) beacons to transmit URLs to your smartphone. It's optimized for Google's Eddystone BLE beacon, which provides an open alternative to Apple's iBeacon. The idea is that pedestrians can interact with any supporting BLE-enabled device, such as parking meters, signage, or retail products.
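To give a sense of what such a beacon actually broadcasts, below is a sketch of the Eddystone-URL frame layout as we read the published Eddystone specification: a frame-type byte (0x10), a calibrated TX-power byte, then the URL compressed with single-byte scheme and expansion codes. The expansion table is abbreviated here, and the example URL and TX power are illustrative, not values from the slide.

```python
def encode_eddystone_url(url: str) -> bytes:
    """Compress a URL using Eddystone-URL's single-byte codes."""
    # Scheme prefixes 0x00-0x03 per the spec (order matters: check
    # the "www." variants before the bare schemes).
    schemes = ["http://www.", "https://www.", "http://", "https://"]
    # Abbreviated expansion table (the spec defines more codes).
    expansions = [".com/", ".org/", ".edu/", ".net/", ".info/", ".biz/", ".gov/"]

    for code, prefix in enumerate(schemes):
        if url.startswith(prefix):
            body, out = url[len(prefix):], bytearray([code])
            break
    else:
        raise ValueError("URL must start with a known scheme prefix")

    i = 0
    while i < len(body):
        for code, exp in enumerate(expansions):
            if body.startswith(exp, i):   # replace common suffix with 1 byte
                out.append(code)
                i += len(exp)
                break
        else:
            out.append(ord(body[i]))      # plain ASCII character
            i += 1
    return bytes(out)

def eddystone_url_frame(url: str, tx_power: int = -18) -> bytes:
    """Service-data payload: frame type 0x10, TX power at 0 m, encoded URL."""
    return bytes([0x10, tx_power & 0xFF]) + encode_eddystone_url(url)

# "https://www.uga.edu/" -> 10 ee 01 75 67 61 02
print(eddystone_url_frame("https://www.uga.edu/").hex())
```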

26. The Thing System (http://thethingsystem.com/) -- This Node.js-based smart home "steward" software claims to support true automation rather than simple notifications. Its self-learning AI software can handle many collaborative M2M actions without requiring human intervention. The lack of a cloud component provides greater security, privacy, and control.
ThingSpeak (https://thingspeak.com/) -- The five-year-old ThingSpeak project focuses on sensor logging, location tracking, triggers and alerts, and analysis. ThingSpeak users can tap a version of MATLAB for IoT analysis and visualizations without buying a license from MathWorks.
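Sensor logging of the kind ThingSpeak provides reduces to a single HTTP request against its documented update endpoint. A minimal sketch, assuming a channel whose field1 holds a temperature reading; the write API key is a placeholder you would replace with your own.

```python
import urllib.parse
import urllib.request

WRITE_API_KEY = "YOUR-WRITE-API-KEY"  # placeholder, per-channel key

def log_temperature(celsius: float) -> str:
    """Log one reading to a ThingSpeak channel via its update endpoint."""
    params = urllib.parse.urlencode({
        "api_key": WRITE_API_KEY,
        "field1": celsius,           # assumes field1 is the temperature field
    })
    with urllib.request.urlopen(
            f"https://api.thingspeak.com/update?{params}") as resp:
        # ThingSpeak replies with the new entry ID, or "0" on failure.
        return resp.read().decode()

print(log_temperature(22.8))
```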

27. Title: Tech Insider Webinar (05/11/2017): Test & Reliability Challenges in the Internet of Things. The Internet of Things (IoT) is an extremely fragmented market and can be defined to include anything from sensors to small servers -- more than 30 billion of them by 2020. It has become crucial for today's IoT chips to use a range of new solutions during the design stage to ensure robustness of manufacturing test, field reliability, and security. Design-for-testing (DFT) engineers need to use new test and reliability solutions to enable power reduction during test, concurrent test, isolated debug and diagnosis, pattern porting, calibration, and uniform access. Moreover, the per-unit price of IoT devices remains a key factor in high-volume production. Thus, minimizing test cost while accommodating these technical issues is a major challenge for IoT. This webinar, besides discussing the key trends and challenges of IoT, will cover solutions to handle the wide range of potential robustness challenges during all periods of the IoT lifecycle, from design to post-silicon bring-up, volume production, and in-system operation.

28. [image slide]

29. [image slide]

30. https://en.wikipedia.org/wiki/Big_data
Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Challenges include capture, storage, analysis, data curation, search, sharing, transfer, visualization, querying, updating, and information privacy. The term "big data" often refers simply to the use of predictive analytics, user behavior analytics, or certain other advanced data analytics methods that extract value from data, and seldom to a particular size of data set. "There is little doubt that the quantities of data now available are indeed large, but that's not the most relevant characteristic of this new data ecosystem."
Analysis of data sets can find new correlations to "spot business trends, prevent diseases, combat crime and so on." Scientists, business executives, practitioners of medicine, advertising, and governments alike regularly meet difficulties with large data sets in areas including Internet search, finance, urban informatics, and business informatics. Scientists encounter limitations in e-Science work, including meteorology, genomics, connectomics, complex physics simulations, biology, and environmental research.

31. Data sets grow rapidly, in part because they are increasingly gathered by cheap and numerous information-sensing mobile devices, aerial sensing (remote sensing), software logs, cameras, microphones, radio-frequency identification (RFID) readers, and wireless sensor networks. The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s; as of 2012, every day 2.5 exabytes (2.5×10^18 bytes) of data are generated.
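A quick worked check of the two growth figures above (capacity doubling every 40 months, and 2.5 exabytes generated per day), with the arithmetic spelled out:

```python
# Per-capita storage capacity doubling every 40 months, compounded.
DOUBLING_MONTHS = 40

def growth_factor(months: float) -> float:
    """Capacity multiplier after the given number of months."""
    return 2 ** (months / DOUBLING_MONTHS)

# Over 10 years (120 months): 2^(120/40) = 2^3 = 8x capacity.
print(growth_factor(120))             # 8.0

# 2.5 exabytes/day expressed in raw bytes:
print(f"{2.5e18:.3e} bytes per day")  # 2.500e+18
```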

32. [image slide]

33. Visualization of daily Wikipedia edits created by IBM. At multiple terabytes in size, the text and images of Wikipedia are an example of big data.

34. [image slide]

35. Big data can be described by the following characteristics:
Volume -- The name "big data" itself contains a term related to size, hence this characteristic. ("18.9 billion network connections; 6 billion people have cell phones; 2.5 quintillion bytes of data are created every day; 40 zettabytes of data will be created in 2020" -- taken from http://www.ibmbigdatahub.com/infographic/four-vs-big-data)
Variety -- Different forms of data types (structured, unstructured, text, multimedia).
Velocity -- The speed at which data is generated and processed to meet the demands and the challenges that lie ahead in the path of growth and development. Analysis of streaming data; milliseconds to seconds to respond.
Veracity -- The quality of the data being captured can vary greatly. Accuracy of analysis depends on the veracity of the source data. Uncertainty of data (how much of the data is accurate).

36. Complexity -- Data management can become a very complex process, especially when large volumes of data come from multiple sources. These data need to be linked, connected, and correlated in order to grasp the information they are supposed to convey. This situation is therefore termed the "complexity" of big data.

37. [image slide]

38. Big data challenges:
It is difficult to scale computing performance and storage capacity with the increased data size.
80% of data is unstructured, and it is growing 15 times faster than structured data.
Data is expected to grow to 40 zettabytes by 2020; total data on the WWW was estimated at 4 zettabytes.
The challenge is that much of the insight and actionable pattern lies in unstructured data.
Data comes from a variety of sources such as sensors, logs, social media, pictures, videos, transaction records, etc.
The goal is to expose key insights for improving customer experiences, enhancing marketing effectiveness, and mitigating financial risks.
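As a small illustration of the unstructured-data point above, the sketch below pulls structured readings out of free-form log lines with a regular expression and skips everything that does not match. The log format, device names, and fields are invented for the example.

```python
import re

# Invented log format: "YYYY-MM-DD HH:MM:SS device=<id> temp=<float>"
LOG_LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"device=(?P<device>\S+) temp=(?P<temp>-?\d+(\.\d+)?)"
)

raw = [
    "2017-05-17 09:01:02 device=sensor-7 temp=21.4",
    "malformed line with no useful fields",
    "2017-05-17 09:01:07 device=sensor-7 temp=21.9",
]

# Keep only lines that parse; the rest of the "unstructured" input is dropped.
readings = [m.groupdict() for line in raw if (m := LOG_LINE.match(line))]
print(readings)
```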

39. CISE Distinguished Lecture Series - Kathy Yelick - May 20, 2015. Kathy Yelick's research is in programming languages, compilers, and algorithms for parallel machines. She earned her Ph.D. in Electrical Engineering and Computer Science from MIT and has been a professor at UC Berkeley since 1991, with a joint research appointment at LBNL since 1996. She was Director of the National Energy Research Scientific Computing Center (NERSC) from 2008 to 2012 and currently leads the Computing Sciences directorate at LBNL.
"In the same way that the Internet has combined with web content and search engines to revolutionize every aspect of our lives, the scientific process is poised to undergo a radical transformation based on the ability to access, analyze, and merge large, complex data sets. Scientists will be able to combine their own data with that of other scientists to validate models, interpret experiments, re-use and re-analyze data, and make use of sophisticated mathematical analyses and simulations to drive the discovery of relationships across data sets. This 'scientific web' will yield higher quality science, more insights per experiment, an increased democratization of science, and a higher impact from major investments in scientific instruments.

40. At the same time, the traditional growth in computing performance is slowing, starting with the flattening of processor clock speeds, but eventually also in transistor density. These trends will limit our ability to field some of the largest systems, e.g., exascale computers, and the cost in hardware, infrastructure, and energy will limit the growth in computing capacity per dollar at all scales. Fundamental research questions exist in computer science to extend the limits of current computing technology through new architectures, programming models, and algorithms, but also to explore options for post-Moore computing. While the largest computing capabilities have traditionally been focused on modeling and simulation, some of the data analysis problems arising from scientific experiments will also require huge computational infrastructure. Thus, a sophisticated understanding of the workload across analytics and simulations is needed to understand how future computer systems should be designed and how technology and infrastructure from other markets can be leveraged.

41. In her talk, she gave some examples of how science disciplines such as biology, materials science, and cosmology are changing in the face of their own data explosion, and how mathematical analyses, programming models, and workflow tools can enable different types of scientific exploration. This will lead to a set of open questions for computer scientists due to the scale of the data sets, the data rates, inherent noise and complexity, and the need to "fuse" disparate data sets. Rather than being at odds with scientific simulation, many important scientific questions will only be answered by combining simulation and observational data, sometimes in a real-time setting. Along with scientific simulations, experimental analytics problems will drive the need for increased computing performance, although the types of computing systems and software configurations may be quite different."

42. HPC. In a non-distributed architecture, data is stored on a central server and applications access this central server. More compute power and storage are added as the data grows. Querying against a huge, centrally located data set makes the system slow and inefficient, and performance suffers.

43. HPC meets big data. Analytics methods are being applied to established HPC domains in industry, government, and academia. High-end commercial analytics is pushing up into HPC, e.g., PayPal. The journey from science to industry/commerce can be relatively short.

44. The HPC and big data boundary is dissolving. An IDC (International Data Corporation) study shows that two-thirds of HPC sites are performing big data analysis. HPC vendors are increasingly targeting commercial markets, whereas commercial vendors are seeing HPC requirements. The goal is to successfully bring the two data-intensive computing paradigms together without "reinventing the wheel," producing environments that have the performance of HPC and the usability and flexibility of the commodity big data stack. IDC has termed this HPDA.

45. HPDA: data-intensive simulation and analytics. HPDA = tasks involving sufficient data volumes and algorithmic complexity to require HPC resources. Structured data, unstructured data, or both; regular (e.g., Hadoop) or irregular (e.g., graph) patterns; smarter mathematical algorithms; higher security and more realism.

46. Factors driving HPDA.
High complexity: HPDA allows companies to aim more complex, intelligent questions at their data infrastructures, an advantage in today's increasingly competitive markets. It is useful for discovering unknown patterns and relationships in data, e.g., fraud detection, or revealing hidden commonalities within millions of archived medical records. This is a transition from static searches to higher-value, dynamic pattern discovery.
High time criticality: information that is not available quickly has little or no value; a weather report for tomorrow is useless if it's unavailable until the day after tomorrow. Data analysis using HPC technology corrects this problem.
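A toy illustration of "dynamic pattern discovery" in the fraud-detection spirit described above: flag transactions whose z-score deviates strongly from the rest. The data and threshold are invented, and real HPDA fraud pipelines are far more sophisticated.

```python
import statistics

# Invented transaction amounts; one is a clear outlier.
amounts = [42.0, 39.5, 41.2, 40.8, 38.9, 912.0, 43.1, 40.2]

mean = statistics.mean(amounts)
stdev = statistics.stdev(amounts)

for i, amt in enumerate(amounts):
    z = (amt - mean) / stdev
    if abs(z) > 2.0:                 # arbitrary cut-off for the sketch
        print(f"transaction {i}: {amt:.2f} looks anomalous (z={z:.1f})")
```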

47. Robust growth in HPC and HPDA. The individual computers in a cluster are called nodes. A typical cluster size is between 16 and 64 nodes, or from 64 to 768 cores. Clusters are connected via a high-speed interconnect fabric, typically InfiniBand. According to IDC, the server market will continue to grow at a rate of 7.3% CAGR from 2011 to 2016, generating about $14.6 billion in revenue by 2016.
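As a worked check of that projection, a 7.3% compound annual growth rate (CAGR) over the five years 2011-2016 is consistent with roughly a $10.3B starting point reaching the quoted $14.6B; the 2011 base below is back-derived purely for illustration:

```python
CAGR = 0.073
revenue_2016 = 14.6e9                      # dollars, per the slide

# Back-derive the implied 2011 base: x * (1 + CAGR)^5 = revenue_2016.
revenue_2011 = revenue_2016 / (1 + CAGR) ** 5
print(f"implied 2011 revenue: ${revenue_2011 / 1e9:.1f}B")   # ~$10.3B

# Compounding it forward recovers the projection.
print(f"2016 projection: ${revenue_2011 * (1 + CAGR) ** 5 / 1e9:.1f}B")
```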

48. HPDA. Storage will continue to be the fastest-growing segment within HPC, growing nearly 9% through 2016 and projected to become a $5.6 billion market. IDC forecasts that the market for HPDA servers will grow at 13.3% CAGR from 2012 to 2016 and will approach $1.4 billion in revenue in 2017. HPDA storage revenue will near $800 million by 2017, with 18.1% CAGR growth through 2016.

49. Software for building HPDA solutions: the ability to ingest data at high rates, then use analytics software to create competitive or innovation advantages; use of Hadoop with HPC infrastructure; high performance, scalable to support near-real-time analytics and HPDA capabilities; reduced processing time for the growing volumes of data in today's distributed computing environments. IT organizations are using Hadoop as a cost-effective data factory for collecting, managing, and filtering large data sets.

50. Hadoop overview. Hadoop is an open-source software framework that processes data-intensive workloads with large, distributed volumes of data. A Hadoop configuration is based on three parts: the Hadoop Distributed File System (HDFS), the Hadoop MapReduce application model, and Hadoop Common. The initial design goal was to use commodity technologies to form large clusters capable of providing cost-effective, high I/O performance.

51. Hadoop. HDFS has these key characteristics: it uses large files; it performs large, block-sequential reads for analytics processing; and it accesses large files using sequential I/O in append-only mode. HDFS splits large files into blocks and distributes them across servers, replicating each data block on more than one server. Data and processing are distributed, and processing is done through MapReduce.
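The MapReduce model mentioned above boils down to a map phase that emits key-value pairs and a reduce phase that aggregates them per key. The word-count sketch below chains the two phases in-process for illustration; on a real Hadoop cluster each phase would run as a separate distributed task (e.g., via Hadoop Streaming), with HDFS supplying the input blocks.

```python
from itertools import groupby
from operator import itemgetter

def mapper(lines):
    """Map phase: emit (word, 1) for every word in the input."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def reducer(pairs):
    """Reduce phase: sum counts per word (input grouped by key)."""
    for word, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    sample = ["big data meets HPC", "big data and IoT"]
    for word, count in reducer(mapper(sample)):
        print(word, count)   # and 1, big 2, data 2, hpc 1, iot 1, meets 1
```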

52. Big Data Consulting Services and Training Center at UGA: The primary goal of this project is to establish a Big Data Consulting Services and Training Center at the University of Georgia (http://research.franklin.uga.edu/bigdata/) with the following goals in mind: to learn about the needs of the researchers across disciplines at the University of Georgia who are involved with big data; to gather and organize information concerning the local and national resources available for collecting, processing, storing, accessing, sharing, and curating such data; to educate researchers on the available resources; and to provide a web-based, searchable, and sustainable resource that contains the necessary information (in the form of tutorials, applications, human contacts, and videos) on the local and national resources and how to access and use them. The consulting services will emphasize three broad areas: data management, high performance computing, and cross-disciplinary coordination around big data.

53. Data Management. In the area of data management, it is becoming increasingly evident that the simple storage provisioning that has worked in the past may no longer be sufficient. Data management certainly involves storage, but also a host of other issues, including those associated with data ownership, curation, history, and organization; dynamic data streams; data access, availability, and security; and ontologies, formatting, and standards. The consulting services will focus on identifying and organizing resources to help researchers account for these issues, thus enabling them to develop strong data management plans, to effectively discover and evaluate discipline-specific repositories, and to structure data for maximum efficiency. A data management plan documents the processes for handling the flow of data from collection through analysis, including software and hardware systems, as well as quality control and validation of these systems. The center will collaborate with the University of Georgia Libraries to address data management issues. The library website will be expanded and integrated with the center website. http://guides2.galib.uga.edu/subject-guide/21-Data-Management-Plans

54. 2. High Performance Computing. In the area of high performance computing, our aim will be to assist researchers in identifying the most appropriate resource for addressing their research and educational needs. Further, the center will assist in training researchers on available software for high performance computing environments such as MPI ("Message Passing Interface"), OpenMP, CUDA ("Compute Unified Device Architecture"), and Hadoop, which supports the processing of large data sets in a distributed computing environment. We will do so through a focus on MapReduce, which is a programming model for processing large data sets with a parallel, distributed algorithm on a cluster. In addition, the center will assist researchers in exploring a variety of national resources, including the Extreme Science and Engineering Discovery Environment (XSEDE), the Titan supercomputer at Oak Ridge National Laboratory, and the open source Integrated Rule-Oriented Data System (iRODS) for managing, sharing, publishing, and preserving digital data, as well as the University of Georgia's own dedicated Hadoop environments in the Georgia Advanced Computing Resource Center (GACRC) and the Information Technology Services (ITS) Group at the Terry College of Business.
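As a flavor of the MPI programming model named above, here is a hedged sketch using the mpi4py Python bindings: each rank sums its own slice of a range (a stand-in for a data shard) and rank 0 collects the total with a reduction. The script name in the run command is hypothetical.

```python
# Run with, e.g.: mpirun -n 4 python partial_sums.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each rank sums its own strided slice of 0..999.
partial = sum(range(rank, 1000, size))

# Combine the partial sums on rank 0.
total = comm.reduce(partial, op=MPI.SUM, root=0)
if rank == 0:
    print("total across", size, "ranks:", total)   # 499500
```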

55. 3. Cross-Disciplinary Coordination around Big Data. The center will actively promote its services throughout the University of Georgia. This effort will complement the many efforts throughout the university that address big data, which include a semi-annual interdisciplinary "big data workshop" (organized by GACRC, the Office of the VP for Research, the Office of Enterprise Information Technology, and the Computer Science and Management Information Systems Departments). Also, UGA has already won a number of grants on issues associated with big data, including two recent NSF CAREER awards focusing on dealing with big data (Krashen & Perdisci) and an NSF-funded Research Coordination Network (RCN) for understanding the management issues associated with cyberinfrastructure (Berente). Also, UGA is a leader in "energy informatics," a new field involving the use of big data to solve energy problems (which, among other things, involves a public-private partnership known as the Georgia Energy Informatics Cluster, or GEIC). Finally, researchers at the university have begun looking into ways that big data can be leveraged in the social sciences. In UGA's Terry College of Business, for example, the Management Information Systems (MIS) department has developed a formal arrangement with Hortonworks (a Hadoop commercialization and service organization)[4] and is initiating a project with XSEDE leadership to investigate techniques for eliciting multidimensional social network data from unstructured data sources. The Big Data Consulting Services and Training Center can act as a catalyst to bring these multiple initiatives more closely together, and to leverage the variety of disparate efforts to make the entire science enterprise more productive.
http://news.uga.edu/releases/article/uga-researchers-receive-nsf-funding-to-conduct-math-and-malware-research/
http://www.terry.uga.edu/news/releases/uga-helps-create-public-private-coalition-to-boost-georgias-energy-efficien
[4] http://hortonworks.com/

56. Many of the elements are already in place at UGA to support data-intensive research. For example, the Georgia Advanced Computing Resource Center (GACRC) (http://gacrc.uga.edu/), the CUDA Teaching Center (http://teachingcuda.uga.edu), and the CUDA Research Center (http://cuda.uga.edu) already offer resources and some training on the use of high performance computing; the UGA Libraries offer services to help researchers write data management plans; and there are a variety of campus-wide and national resources that researchers can use for their data generation, analysis, and management needs. However, there is currently no clearinghouse or coordinating organization to which any researcher on campus can turn in order to best incorporate data-intensive approaches into their research. To meet these needs, we plan to form a campus-wide committee comprised of faculty, administrators, information technology (IT) staff, librarians, and students; survey the campus in order to get a better understanding of data management needs; visit with individuals who are working with data that are large, long-term, and/or structurally complex to learn about their needs in more detail; and identify the campus and national resources that are available for big data users. We will then develop a set of web resources that can serve as both a central resource for the campus and a starting point for further efforts to enhance UGA's cyberinfrastructure capabilities. Finally, we will continue to conduct the popular university-wide "big data" events under the auspices of the center to aid in communicating, coordinating, and disseminating activity around data-intensive research.

57. Intellectual Merit: Across disciplines, researchers are working to meet the opportunity of "big data" by incorporating data-intensive approaches in their research. These disciplines cover a wide range of study areas, including, but not limited to, the physical sciences, mathematics and computer science, the social sciences, engineering, the audiovisual arts, and the medical and biological sciences. They are focused on using data to advance research in widely varying application domains, from internet-based research and scientific computing to medical records management and energy utility management. The goal of the center will be to coordinate across these disciplines and to provide an effective enterprise for moving data-intensive research forward in the university while avoiding duplication of effort.

58. Broader Impact: This project will involve faculty, students, librarians, and technology staff. The goal is to build this consulting center to serve the UGA campus and then extend it to other institutions. As resources become available, this center's reach can be extended across the University System of Georgia, which includes an abundance of minority-serving institutions, and outside the state of Georgia. In addition to the research component necessary to compile and organize information on "big data" resources, there will be significant effort devoted to education, training, outreach, and the curation of up-to-date information that will be relevant to all members of a university community. The center will sponsor "big data" events and outreach efforts to both promote and enable data-intensive science across disciplines.

59. Data curation is the management of data throughout its lifecycle, from creation and initial storage to the time when it is archived for posterity or becomes obsolete and is deleted. The main purpose of data curation is to ensure that data is reliably retrievable for future research purposes or reuse.

60. From Wikipedia: An actuator is a component of a machine that is responsible for moving or controlling a mechanism or system. An actuator requires a control signal and a source of energy. The control signal is relatively low energy and may be electric voltage or current, pneumatic or hydraulic pressure, or even human power. The supplied main energy source may be electric current, hydraulic fluid pressure, or pneumatic pressure. When the control signal is received, the actuator responds by converting the energy into mechanical motion. An actuator is the mechanism by which a control system acts upon an environment. The control system can be simple (a fixed mechanical or electronic system), software-based (e.g., a printer driver or robot control system), a human, or any other input.[1]
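The control-signal-to-motion relationship described above can be caricatured in a few lines: a low-energy command selects how much of the supplied energy becomes mechanical displacement. This is purely an illustrative model, with no real hardware driver behind it.

```python
class LinearActuator:
    """Toy model: a control signal in [0, 1] maps to a stroke position."""

    def __init__(self, stroke_mm: float = 100.0):
        self.stroke_mm = stroke_mm
        self.position_mm = 0.0

    def apply_control(self, signal: float) -> float:
        """signal in [0, 1] (e.g., a scaled voltage) -> new position."""
        signal = max(0.0, min(1.0, signal))   # clamp the control input
        self.position_mm = signal * self.stroke_mm
        return self.position_mm

valve = LinearActuator()
for v in (0.0, 0.25, 1.0):
    print(f"control={v:.2f} -> position={valve.apply_control(v):.1f} mm")
```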

61. [image slide]

62. Cyber-Physical Systems (http://cyberphysicalsystems.org/). Cyber-Physical Systems (CPS) are integrations of computation, networking, and physical processes. Embedded computers and networks monitor and control the physical processes, with feedback loops where physical processes affect computations and vice versa. The economic and societal potential of such systems is vastly greater than what has been realized, and major investments are being made worldwide to develop the technology. The technology builds on the older (but still very young) discipline of embedded systems: computers and software embedded in devices whose principal mission is not computation, such as cars, toys, medical devices, and scientific instruments. CPS integrates the dynamics of the physical processes with those of the software and networking, providing abstractions and modeling, design, and analysis techniques for the integrated whole.

63. [image slide]

64. Beacons are small, often inexpensive devices that enable more accurate location within a narrow range than GPS, cell tower triangulation, or Wi-Fi proximity. Beacons transmit small amounts of data via Bluetooth Low Energy (BLE) up to 50 meters, and as a result are often used for indoor location technology, although beacons can be used outdoors as well. As an example of how beacons can be used: when a customer is in a store, a beacon in that location can communicate with the store's app on the customer's phone to display special offers or additional information for specific products or services the company is currently offering.
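For the ranging side of beacons, implementations commonly estimate distance from received signal strength (RSSI) with the standard log-distance path-loss model, d = 10^((txPower - RSSI) / (10 n)). A sketch follows; the calibrated txPower at 1 m and the environment exponent n are deployment-specific assumptions, not values from the slide.

```python
def estimate_distance_m(rssi_dbm: float, tx_power_dbm: float = -59.0,
                        n: float = 2.0) -> float:
    """Log-distance path-loss estimate: d = 10 ** ((txPower - RSSI) / (10 n)).

    tx_power_dbm is the calibrated RSSI at 1 m; n ~ 2 in free space,
    higher indoors. Both are assumptions for this sketch.
    """
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * n))

for rssi in (-59, -69, -79):
    print(f"RSSI {rssi} dBm -> ~{estimate_distance_m(rssi):.1f} m")
# -59 -> 1.0 m, -69 -> 3.2 m, -79 -> 10.0 m
```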