Slide1
DCO-DS: Moving Forward
DCO Synthesis Meeting. Oct. 29-30, 2015
DCO-DS = DCO Data Science
Slide2Vision…
“Our vision is to develop, facilitate, and maintain sustained multi-way engagement of carbon scientists in multi-scale local to global networks” [for the transformation of our understanding of carbon in Earth].
Organization is required so participants can carry out their mission(s)
Those participants (by definition) may never be in a single organization -> virtual organization
Slide3Virtual Organizations as Socio-Technical Systems
‘…a geographically distributed organization whose members are bound by a long-term common interest or goal, and who communicate and coordinate their work through information technology’ (Ahuja)
‘These members assume well defined roles and status relationships within the context of the virtual group that may be independent of their role and status in the organization employing them’ (Ahuja et al., 1998)
Technology
Communication Patterns
Organizational Structure
Slide4Virtual Organization
Feature:
Outcomes/ values
Dynamic versus static
Evolvable/ ecosystem-like
Heterogeneity tolerance
Attributes of the organization
Roles/ responsibilities
Scale or scalability
Slide5Strategy…
Slide6Mapping…
goal -> use case
participation -> team(s), vetting, acceptance
outcomes/ value -> goals, metrics, evaluation, incentives, data/information/ knowledge projects, responses, decisions
dynamic -> agile working format, small iterations
evolution -> rapid development, evaluation and iteration (open)
Slide7Methodology…
Slide8DCO-DS Evaluation
Form as key input to DCO-DS
Focused on the evaluation of the Deep Carbon virtual Observatory (DCvO)
Evaluation questions will help determine DCvO's role in:
Increasing members, activity and awareness of DCO activities
Enabling search, access, exchange and use of data & information for DCO scientific and educational needs
Needs to further integrate with DCO Members' essential technologies
Phased roll-out to begin early Oct
Wave 1: Executive Committee, Secretariat, Community leads, selected others
Wave 2: DCO SSCs, Engagement
Waves 3, 4, 5, 6: DCO Communities
Slide9Value Philosophy
Value focuses on organizational outputs (or outcomes) rather than inputs
For example: Deployed knowledge and skills vs research budgets
Value relates to benefit of outcomes, rather than outcomes themselves
Products and services enabled by knowledge and skills
Value implies relevant, useful, and usable outcomes
Beneficiaries have to understand and appreciate
Credit: B. Rouse (BEVO) 2008
Slide10Leveraging existing data resources
Interface between DCO Data Portal and other data repositories – key part of post-2019 efforts (e.g. Spring 2015 effort with CoDL/ MBL)
Incorporate specific metadata requirements into the DCO Knowledge Store
Extend DCO Ontology for incorporation of other repository data, and/or utilize existing schema
Provide data in a variety of formats for use (non-specialists)
Populate the metadata and data repository for DCO projects that do not already have their own portal
Work on and develop new boundary activities
Slide11DCO-DS Boundary Activities
Slide12Moving Forward
A technology refresh for major platform components for the DCO network, and a “network” succession plan
Prioritized efforts based on evaluations (Nov-Dec)
Inputs from DCO synthesis discussions and post-2019 committees/ task groups
Significant efforts on data registration and data legacies
And continue to work on existing and develop new boundary activities
Slide13Questions? Comments?
Patrick West, westp@rpi.edu, Peter Fox, pfox@cs.rpi.edu
The Team:
Lead: Peter Fox
Staff: Patrick West, Stephan Zednik and John Erickson
Post Doc: Marshall Ma
Graduate Students: Han Wang, Hao Zhong, Ahmed Eleish
Slide14DCO Knowledge Graph Analytics
Identified key areas of DCO for analysis and visualization, initially:
Publications and publication keywords
User registrations
DCO Member areas of expertise
Instance Creation statistics: who is creating what and associated with what communities.
What would you like to see?
Slide15DCO Knowledge Graph Analytics
Publication Subject Area Word Cloud
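A word cloud of this kind is driven by term frequencies. As a minimal sketch, assuming each publication record carries a list of keyword strings (the sample records and the function name `keyword_weights` are hypothetical and not taken from the DCO Knowledge Graph), the relative weights could be computed like this:

```python
from collections import Counter


def keyword_weights(publications):
    """Return {keyword: weight in (0, 1]} scaled to the most frequent keyword."""
    counts = Counter(
        kw.strip().lower()              # normalize case and whitespace
        for pub in publications
        for kw in pub.get("keywords", [])
        if kw.strip()                   # skip empty keywords
    )
    if not counts:
        return {}
    top = max(counts.values())
    return {kw: n / top for kw, n in counts.items()}


# Hypothetical sample records, for illustration only
sample = [
    {"title": "A", "keywords": ["Carbon cycle", "diamond"]},
    {"title": "B", "keywords": ["carbon cycle", "Methane"]},
    {"title": "C", "keywords": ["carbon cycle", "diamond", ""]},
]
weights = keyword_weights(sample)
```

The weights can then be mapped to font sizes by whatever rendering tool draws the cloud.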
Slide16Current Work: Thermodynamic Data Rescue
A large number of geoscience publications contain datasets that are not available outside the publication text
Extracting, organizing, and reusing these datasets is valuable
Data Science Team and Extreme Physics and Chemistry community member Mark Ghiorso identified thermodynamic datasets about the enthalpy and entropy of chemicals
Slide17Current Work: Geo Sample curation and IGSN
Have GeoSample as a class in DCO ontology and collect the core metadata items for sample registration in the DCO data portal;
Interface between the DCO IGSN Allocation Agent and the IGSN registry agent, with two potential functionalities:
Assign IGSN to a sample record through the DCO data portal in collaboration with UT funded activity
Use IGSN to import sample records from existing repositories to the DCO data portal, if there is a mature IGSN metadata API
Slide18Future Work: Instrument Reporting and Browsing*
Progress to-date:
Reporting on DCO-funded Instrument use by Projects and Field Studies
Referencing DCO Instrument use within Grant Summary Reports within Instrument grants and related project/field study grants
Future work: The Instrument Browser
Dynamically generated instrument list and instrument summary page
A faceted search interface for instruments
Instrument discovery based on nature of use, data collected, projects and point of contact
* Outcome from the DCO Data Science day at RPI in 2014!!!
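The faceted search idea can be illustrated with a small in-memory sketch. Everything here is an assumption for illustration: the `Instrument` fields, the sample records, and the helper names are hypothetical and do not describe the planned Instrument Browser or the DCO ontology.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instrument:
    # Hypothetical facets, loosely following the discovery criteria on the slide
    name: str
    nature_of_use: str
    project: str
    contact: str


def filter_instruments(instruments, **selected):
    """Keep instruments that match every selected facet value."""
    return [
        inst for inst in instruments
        if all(getattr(inst, facet) == value for facet, value in selected.items())
    ]


def facet_counts(instruments, facet):
    """Count how many instruments fall under each value of one facet."""
    return Counter(getattr(inst, facet) for inst in instruments)


# Hypothetical records, for illustration only
catalog = [
    Instrument("Spectrometer A", "field", "Project X", "Contact 1"),
    Instrument("Spectrometer B", "laboratory", "Project X", "Contact 2"),
    Instrument("Sensor C", "field", "Project Y", "Contact 1"),
]
field_only = filter_instruments(catalog, nature_of_use="field")
remaining_projects = facet_counts(field_only, "project")
```

After each selection, the counts for the remaining facets are recomputed on the filtered list, which is what lets a faceted interface show how many results each further choice would leave.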
Slide19Future Work: Deep Carbon Science Trend Analysis
Natural Language Processing (NLP) based analysis of Deep Carbon publication corpus
Extracts entities and relations from the corpus
Constructs a Deep Carbon Knowledge Base consisting of unified entities and relations
Provides structured knowledge for downstream applications and analysis
Includes retrieval of authoritative metadata into DCO Knowledge Graph
Includes Deep Carbon Science Visualization Dashboard
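To make the extract-and-unify step concrete, here is a deliberately simple stand-in: dictionary-based entity matching plus sentence-level co-occurrence as a proxy for relations. This is not the team's NLP pipeline; the vocabulary, the sample corpus, and the function names are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical vocabulary: surface form -> unified entity name
VOCAB = {
    "co2": "carbon dioxide",
    "carbon dioxide": "carbon dioxide",
    "diamond": "diamond",
    "diamonds": "diamond",
    "mantle": "mantle",
}


def extract_entities(sentence):
    """Return the set of unified entities mentioned in one sentence."""
    text = sentence.lower()
    found = set()
    for surface, entity in VOCAB.items():
        # Whole-word match so "diamond" does not fire inside other words
        if re.search(r"\b" + re.escape(surface) + r"\b", text):
            found.add(entity)
    return found


def cooccurrence_relations(corpus):
    """Count unordered entity pairs that appear together in a sentence."""
    relations = Counter()
    for document in corpus:
        # Naive sentence split on terminal punctuation followed by whitespace
        for sentence in re.split(r"(?<=[.!?])\s+", document):
            entities = sorted(extract_entities(sentence))
            relations.update(combinations(entities, 2))
    return relations


# Hypothetical two-document corpus, for illustration only
corpus = [
    "Diamonds form in the mantle. CO2 is released at the surface.",
    "Carbon dioxide can be stored in the mantle.",
]
relations = cooccurrence_relations(corpus)
```

Mapping several surface forms to one name is what "unified entities" refers to here; a production system would typically replace the dictionary lookup with trained entity and relation extractors.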