ICSTI, March 2012. Data and the Scientific Article: researchers perceive data sets as “important, but hard to access” (Publishing Research Consortium, 2010; researchers, N = 3824).
Slide1
Moving forward our shared data agenda: a view from the publishing industry
ICSTI, March 2012
Slide2
Data and the Scientific Article
Researchers perceive data sets as “important, but hard to access”
Publishing Research Consortium, 2010
Researchers, N = 3824
Important, but hard to access
Slide3
Overview: Data & the Scientific Article
Current approaches
Thoughts for the future
Slide4
Supplementary Material
Authors can upload Supplementary Material with their paper
Pros
Coupling of data and article
Peer review
Citation mechanism
Preservation (byte-wise)
Cons
Limited data type support
Compatibility (format support)
Limited capacity
Data not centrally stored
Slide5
Connecting with Data Repositories, 1
Link to CCDC database (indicates that information for this article is available)
Screenshot of journal article on ScienceDirect (http://dx.doi.org/10.1016/j.jfluchem.2009.07.015)
Article Linking example: CCDC
Slide6
Connecting with Data Repositories, 2
... clicking on the CCDC logo takes the reader to a page at the CCDC repository with data related to the article
Screenshot of information page at CCDC (Cambridge Crystallographic Data Centre)
Article Linking example: CCDC
Slide7
Connecting with Data Repositories, 3
Tagged GenBank entry (genetic sequence)
Screenshot of journal article on ScienceDirect (http://dx.doi.org/10.1016/j.biortech.2010.03.063)
Entity Linking example: GenBank Accession Number
Slide8
Connecting with Data Repositories, 4
... clicking on the linked GenBank accession code takes the reader to an information page on the NCBI data repository about that specific genetic sequence
Screenshot of information page at NCBI (National Center for Biotechnology Information)
Entity Linking example: GenBank Accession Number
Slide9
Connecting with Data Repositories, 5
Database                        Subject                 Type of Linking
CCDC                            Crystallography         Article-level
PANGAEA                         Earth Sciences          Article-level*
EMBL Molecular Interactions     Chemistry               Entity, tagging
Molecular INTeraction DB        Chemistry               Entity, tagging
GenBank                         Nucleotides             Entity, tagging
UniProt                         Proteins                Entity, tagging
Protein Data Bank               Proteins                Entity, tagging
ClinicalTrials                  Medicine                Entity, tagging
TAIR (Arabidopsis)              Model organism          Entity, tagging
Mendelian Inheritance in Man    Genetics, inheritance   Entity, tagging
*: with Application
Slide10
The Article of the Future
Slide11
Discovery and Use via SciVerse Applications
Use information from SciVerse and the web
Support for rich user interfaces
Integrated directly into the online article
Simple to build using Content and Framework APIs
Open standards (Apache Shindig, OpenSocial)
Features & Benefits
Slide12
Discovery and Use via SciVerse Applications
Libraries can become a focal point for applications
Researchers can save time and improve their information discovery process
“Apps interacting with results are very important to help save time…”
Specific information can be targeted by applications to facilitate content mining and shorten search time, leaving more time for analysis.
“what faculty is really after is something that ties this all together, so it’s all in one place…”
Applications help researchers extract all information (content, data, figures, etc.) into a single analysis source, which can be a local database at the customer’s institute.
Slide13
Applications example: NCBI Genome Viewer
Scans the article and builds a list of sequences based on NCBI accession numbers tagged in the article
View/analyze sequence data from genes in the article using NCBI Sequence Viewer
See specific information about each strand; zoom in/out; export data
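The scanning step can be illustrated with a minimal sketch. It is not the application's actual implementation: the slide describes accession numbers that are already tagged in the article, whereas this sketch, as an assumption, finds them in plain text with a regular expression covering only the classic GenBank nucleotide shapes (one letter plus five digits, or two letters plus six digits). The record URL pattern, the function names, and the sample accession numbers are likewise illustrative assumptions.

```python
import re

# Classic GenBank nucleotide accession shapes: 1 letter + 5 digits, or
# 2 letters + 6 digits, optionally followed by a version suffix (".1").
ACCESSION_RE = re.compile(r"\b([A-Z]{2}\d{6}|[A-Z]\d{5})(?:\.\d+)?\b")

# Assumed URL pattern for a record page at NCBI; verify before relying on it.
NCBI_URL = "https://www.ncbi.nlm.nih.gov/nuccore/{accession}"


def find_accessions(text):
    """Return unique accession-like tokens in order of first appearance."""
    seen = []
    for match in ACCESSION_RE.finditer(text):
        accession = match.group(1)  # version suffix is dropped
        if accession not in seen:
            seen.append(accession)
    return seen


def link_accessions(text):
    """Map each accession found in the text to its assumed record URL."""
    return {acc: NCBI_URL.format(accession=acc) for acc in find_accessions(text)}


# Placeholder accession numbers, made up for illustration only.
sample = "Sequences were deposited under AB123456 and U12345; see AB123456.1."
for accession, url in link_accessions(sample).items():
    print(accession, url)
```

A plain-text pattern like this will also match unrelated tokens of the same shape, which is one reason tagging at the source is preferable to guessing from the text.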
Screenshots of journal article on ScienceDirect (http://dx.doi.org/10.1016/j.ygeno.2007.07.010)
Slide14
Applications example: PANGAEA
Document identifier sent to PANGAEA data repository for earth sciences
PANGAEA returns map plotted with locations where cited data was collected
Push-pins open with details of dataset and direct link to data on PANGAEA.de
Screenshots of journal article on ScienceDirect (http://dx.doi.org/10.1016/S0377-8398(01)00044-5)
Slide15
Elsevier Enables Content Mining
CONTENT
Customers may:
Run extensive searches and use locally loaded content for text mining purposes for their own research.
Perform extensive mining operations on subscribed content:
Structuring input text
Deriving patterns within the structured text
Evaluating and interpreting the output
Extract semantic entities from Elsevier content in order to recognise and classify the relations between them.
Integrate results on a server used for the customer’s own mining system, for access and use by its researchers through the customer’s internal secure network.
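The three mining operations named above (structuring input text, deriving patterns, evaluating the output) can be sketched in a few lines. This is a toy illustration under stated assumptions, not a description of any Elsevier tool: the entity lexicon, the sample text, the function names, and the choice of sentence-level co-occurrence counting as the "pattern" are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical entity lexicon; a real system would use curated vocabularies.
ENTITIES = {"insulin", "glucose", "liver"}


def structure(text):
    """Step 1: split raw text into sentences, each a list of lowercase tokens."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [re.findall(r"[a-z]+", s.lower()) for s in sentences if s]


def derive_patterns(sentences):
    """Step 2: count entity pairs that co-occur within one sentence."""
    counts = Counter()
    for tokens in sentences:
        present = sorted(ENTITIES.intersection(tokens))
        counts.update(combinations(present, 2))
    return counts


def evaluate(counts, min_count=2):
    """Step 3: keep pairs seen at least min_count times, most frequent first."""
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]


# Made-up sample text for illustration only.
text = (
    "Insulin lowers blood glucose. The liver stores glucose. "
    "Insulin signals the liver to absorb glucose."
)
for pair, n in evaluate(derive_patterns(structure(text))):
    print(pair, n)
```

Co-occurrence counting is only a stand-in for relation extraction; classifying the kind of relation between two entities needs more than counting how often they appear together.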
Enabling developers who wish to design and implement applications to analyse our content, or test applications as part of their research within Elsevier content
Slide16
Our Content Mining Solution Suite
CONTENT
DELIVERY
SEARCH & WORKFLOW
SOLUTIONS
ANALYSIS
Slide17
Current initiative overview
Supplementary Material
Linking to Data Repositories
Presentation via Article of the Future
Discovery and Use via SciVerse Applications
Empower scientists to mine content and use it locally
***************************
Data store (600 terabytes at present)
Executable papers
Workflow tools
Etc.
Slide18
Conclusions: some thoughts for the future
RESEARCHERS
FUNDERS
PUBLISHERS
INSTITUTIONS
Need for aligned strategies and policies, sustainable business models, and concerted collaboration