Biomedical articles per year

2016-05-26 32K 32 0 0

Presentations text content in Biomedical articles per year

Slide1

Slide2

Biomedical articles per year

Slide3

Questions of biomedical experts

Yes/No question: “Are there any DNMT3 proteins present in plants?”

Exact answer: “Yes”

Ideal answer: “Yes. The plant DOMAINS REARRANGED METHYLTRANSFERASE2 (DRM2) is a homolog of the mammalian de novo methyltransferase DNMT3. DRM2 contains a novel arrangement of the motifs required for DNA methyltransferase catalytic activity.”

Slide4

Questions of biomedical experts

Factoid question: “What is the methyl donor of DNA (cytosine-5)-methyltransferases?”

Exact answer: “S-adenosyl-L-methionine”

Ideal answer: “S-adenosyl-L-methionine (AdoMet, SAM) is the methyl donor of DNA (cytosine-5)-methyltransferases. DNA (cytosine-5)-methyltransferases catalyze the transfer of a methyl group from S-adenosyl-L-methionine to the C-5 position of cytosine residues in DNA.”

Slide5

Questions of biomedical experts (III)

List question: “Which species may be used for the biotechnological production of itaconic acid?”

Exact answer: “Aspergillus terreus”, “Aspergillus niger”, “Ustilago maydis”

Ideal answer: “In 1955, the production of itaconic acid was firstly described for Ustilago maydis. Some Aspergillus species, like A. itaconicus and A. terreus, show the ability to synthesize this organic acid and A. terreus can secrete significant amounts to the media. Itaconic acid is mainly supplied by biotechnological processes with the fungus Aspergillus terreus. Cloning of the cadA gene into the citric acid producing fungus A. niger showed that it is possible to produce itaconic acid also in a different host organism.”
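List answers such as the one above are scored in the challenge with a mean F-measure (see the evaluation measures slide). The following is a minimal sketch of that idea for a single question, assuming case-insensitive exact string matching; the function name `list_f_measure` and the system output are hypothetical and are not taken from the BioASQ tools.

```python
from typing import Iterable


def list_f_measure(gold: Iterable[str], returned: Iterable[str]) -> float:
    """F-measure between a gold list and a returned list of answer strings.

    Assumption: answers match only if they are equal after lower-casing.
    """
    gold_set = {g.lower() for g in gold}
    returned_set = {r.lower() for r in returned}
    true_positives = len(gold_set & returned_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(returned_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Gold list from the slide.
gold = ["Aspergillus terreus", "Aspergillus niger", "Ustilago maydis"]
# Hypothetical system output; the third entry is simply not in the gold list.
system = ["Aspergillus terreus", "Ustilago maydis", "Escherichia coli"]

score = list_f_measure(gold, system)
print(round(score, 3))
```

Here two of the three returned items are in the gold list and two of the three gold items are found, so precision and recall are both 2/3.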

Slide6

Questions of biomedical experts (III)

Summary question: “How do histone methyltransferases cause histone modification?”

Exact answer: -

Ideal answer: “Histone methyltransferases (HMTs) are responsible for the site-specific addition of covalent modifications on the histone tails, which serve as markers for the recruitment of chromatin organization complexes. There are two major types of HMTs: histone-lysine N-Methyltransferases and histone-arginine N-methyltransferases. The former methylate specific lysine (K) residues such as 4, 9, 27, 36, and 79 on histone H3 and residue 20 on histone H4. The latter methylate arginine (R) residues such as 2, 8, 17, and 26 on histone H3 and residue 3 on histone H4. Depending on what residue is modified and the degree of methylation (mono-, di- and tri-methylation), lysine methylation of histones is linked to either transcriptionally active or silent chromatin.”

Slide7

Slide8

Finding relevant snippets

Slide9

Not only texts: ontologies, linked data, …

Slide10

Slide11

Information from structured data

List question: “Which forms of cancer is the Tpl2 gene associated with?”

Related concepts:
http://www.disease-ontology.org/api/metadata/DOID:162 (cancer)
http://www.uniprot.org/uniprot/M3K8_RAT (TPL2 synonym)

Related RDF triple:
Subject: http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/3003 (lung cancer)
Predicate: http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseasome/associatedGene
Object: http://www4.wiwiss.fu-berlin.de/diseasome/resource/genes/TPL2
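To make the triple above concrete, here is a minimal sketch of how a subject/predicate/object statement can be held in memory and matched against a question such as the Tpl2 one. It uses only the Python standard library; the names `Triple` and `subjects_for` are illustrative assumptions, and a real system would more likely query a triple store.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object (all URIs here)."""
    subject: str
    predicate: str
    obj: str


DISEASOME = "http://www4.wiwiss.fu-berlin.de/diseasome/resource/"

# The single triple shown on the slide (lung cancer, associatedGene, TPL2).
TRIPLES = [
    Triple(
        subject=DISEASOME + "diseases/3003",
        predicate=DISEASOME + "diseasome/associatedGene",
        obj=DISEASOME + "genes/TPL2",
    ),
]


def subjects_for(triples: List[Triple], predicate: str, obj: str) -> List[str]:
    """Return the subjects of every triple with the given predicate and object."""
    return [t.subject for t in triples if t.predicate == predicate and t.obj == obj]


# "Which forms of cancer is the Tpl2 gene associated with?"
diseases = subjects_for(
    TRIPLES,
    predicate=DISEASOME + "diseasome/associatedGene",
    obj=DISEASOME + "genes/TPL2",
)
print(diseases)
```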

Slide12

Slide13

BioASQ Vision

Make sure this knowledge is used to the benefit of patients.
Need to make it accessible to biomedical experts.
Search is not effective enough.
Push research in automated answering of questions.
A challenge for such systems can achieve a multiplying effect.

Slide14

What is BioASQ?

A challenge funded by the European Union (FP7).

Task a: Hierarchical text classification
Organizers distribute new unclassified PubMed articles.
Participants assign MeSH terms to the articles.
Evaluation based on annotations of PubMed curators.

Task b: IR, QA, summarization, …
Organizers distribute English biomedical questions.
Participants provide: relevant articles, snippets, concepts, triples, “exact” answers, “ideal” answers.
Evaluation: both automatic (GMAP, MRR, ROUGE etc.) and manual (by biomedical experts).
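The items a Task b participant returns for one question can be pictured as a single record. The sketch below is only an illustration of that list of items; the class name `TaskBResponse` and all of its field names are hypothetical and do not reproduce the official BioASQ submission format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TaskBResponse:
    """Hypothetical container for what a participant returns for one question."""
    question: str
    question_type: str  # one of "yesno", "factoid", "list", "summary"
    articles: List[str] = field(default_factory=list)   # relevant article URLs
    snippets: List[str] = field(default_factory=list)   # relevant text snippets
    concepts: List[str] = field(default_factory=list)   # ontology concept URIs
    triples: List[Tuple[str, str, str]] = field(default_factory=list)
    exact_answer: List[str] = field(default_factory=list)
    ideal_answer: str = ""


# Filled in with the factoid example from the earlier slide.
response = TaskBResponse(
    question="What is the methyl donor of DNA (cytosine-5)-methyltransferases?",
    question_type="factoid",
    exact_answer=["S-adenosyl-L-methionine"],
    ideal_answer=(
        "S-adenosyl-L-methionine (AdoMet, SAM) is the methyl donor of "
        "DNA (cytosine-5)-methyltransferases."
    ),
)
print(response.question_type, response.exact_answer)
```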

Slide15

The challenge

Task a

Task b

Slide16

Slide17

Behind the scenes

Slide18

BioASQ Platform

Slide19

Datasets

Task b data contain gold articles, snippets, concepts, triples, “exact” and “ideal” answers prepared by biomedical experts from around Europe.

Task a      1st challenge   2nd challenge
Training    10,876,004      12,628,968
Test        83,490          71,950

Task b      1st challenge   2nd challenge
Training    29              310
Test        281             500

Slide20

Data sources

They include both text and structured info.
PubMed abstracts, PubMed Central articles, MeSH.
Gene Ontology, UniProt, Jochem, Disease Ontology.

Slide21

Annotation: questions and queries

Slide22

Annotation: snippets

Slide23

Annotation: answers

Slide24

Assessment: relevance of material

Slide25

Assessment: information in answers

Slide26

BioASQ social network

Slide27

Oracle

Slide28

Oracle

Slide29

Two cycles

2013 schedule: March 2013, June 2013, August 2013, September 2013

2014 schedule: February 2014, March 2014, May 2014, September 2014

The official challenge is over, but…
Task a continues to run each week.
An oracle for task b will be available soon.
Oracles will remain available.
A third cycle is being designed…

Slide30

Challenge participants so far

Slide31

Challenge participants in each cycle

Slide32

Evaluation measures

Task a: Hierarchical text classification
Flat measures for multi-label classification: Accuracy, MiF, MaF, EBF.
Hierarchical measures: LCA-F (new), HF.
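As a rough illustration of the flat measures, the sketch below computes micro-averaged F (MiF), macro-averaged F (MaF) and example-based F (EBF) over sets of labels per article. It assumes MaF is the mean of per-label F1 scores, which is one common convention; the challenge's exact definitions may differ. The label sets are made-up examples.

```python
from typing import List, Set


def _f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """MiF: pool true/false positives and false negatives over all articles."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    return _f1(tp, fp, fn)


def macro_f(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """MaF: F1 per label, then the unweighted mean over labels."""
    labels = set().union(*gold, *pred)
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        scores.append(_f1(tp, fp, fn))
    return sum(scores) / len(scores) if scores else 0.0


def example_based_f(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """EBF: F1 per article, then the mean over articles."""
    scores = [_f1(len(g & p), len(p - g), len(g - p)) for g, p in zip(gold, pred)]
    return sum(scores) / len(scores) if scores else 0.0


# Two articles with illustrative MeSH-like labels (made-up assignments).
gold = [{"Humans", "DNA Methylation"}, {"Humans"}]
pred = [{"Humans"}, {"Humans", "Mice"}]

print(micro_f(gold, pred), macro_f(gold, pred), example_based_f(gold, pred))
```

MiF is dominated by frequent labels, while MaF gives every label the same weight, so rare labels that are never predicted pull MaF down sharply.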

Task b: IR, QA, summarization, …
Phase A: standard IR measures, mean precision, mean recall, mean F-measure, MAP (used for winner selection), G-MAP.
Phase B:
‘Exact answers’ (based on type): accuracy (yes/no), strict/lenient accuracy, MRR (factoid), mean F-measure (list).
‘Ideal answers’: manual scores from the experts {Readability, Repetition, Information Precision and Recall}, plus ROUGE.
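The ranking measures named above can be sketched in a few lines. This is an illustration under stated assumptions, not the official evaluation code: average precision is normalised by the number of relevant items, G-MAP adds a small constant `eps` before taking logarithms so that a zero score does not collapse the geometric mean, and ranked lists are assumed to contain no duplicates. Function names and example identifiers are hypothetical.

```python
import math
from typing import List, Sequence, Set, Tuple

Run = Tuple[Sequence[str], Set[str]]  # (ranked items, relevant items) for one question


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision@k over the ranks k at which a relevant item appears."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs: List[Run]) -> float:
    """MAP: arithmetic mean of average precision over questions."""
    scores = [average_precision(r, rel) for r, rel in runs]
    return sum(scores) / len(scores) if scores else 0.0


def geometric_map(runs: List[Run], eps: float = 1e-5) -> float:
    """G-MAP: geometric mean of (average precision + eps) over questions."""
    scores = [average_precision(r, rel) for r, rel in runs]
    if not scores:
        return 0.0
    return math.exp(sum(math.log(s + eps) for s in scores) / len(scores))


def mean_reciprocal_rank(ranked_answers: List[Sequence[str]], gold: List[Set[str]]) -> float:
    """MRR: mean of 1/rank of the first correct answer (0 if none is correct)."""
    scores = []
    for ranked, correct in zip(ranked_answers, gold):
        score = 0.0
        for rank, answer in enumerate(ranked, start=1):
            if answer in correct:
                score = 1.0 / rank
                break
        scores.append(score)
    return sum(scores) / len(scores) if scores else 0.0


# Two questions with made-up document identifiers.
runs = [(["d1", "d2", "d3"], {"d1", "d3"}), (["d4", "d5"], {"d5"})]
print(mean_average_precision(runs), geometric_map(runs))

# Two factoid questions; the correct answer is ranked first, then second.
factoid_mrr = mean_reciprocal_rank(
    [["S-adenosyl-L-methionine", "ATP"], ["ATP", "S-adenosyl-L-methionine"]],
    [{"S-adenosyl-L-methionine"}, {"S-adenosyl-L-methionine"}],
)
print(factoid_mrr)
```

Because G-MAP is a geometric mean, a system that fails badly on a few questions is penalised more than under MAP.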

Slide33

First year technology/results overview

Task 1a
Mainly SVMs and learning-to-rank.
Mostly flat classification, ignoring class taxonomy.
Mediocre results by hierarchical methods.
One of the systems outperformed NLM’s system.

Task 1b
Phase A (retrieve relevant documents, concepts, snippets, triples): low performance (compared to baselines).
Phase B (formulate ‘exact’ and ‘ideal’ answers): poor performance for ‘exact’ answers (except for yes/no questions); high performance for ‘ideal’ answers (paragraph-sized summaries), but starting with gold documents, snippets etc.

Large scope for improvements, esp. in Task 1b.

Slide34

“Exact” answer results (batch 2/3)

Slide35

“Ideal” answer results (batch 2/3)

Slide36

Results – task a – flat measures

Slide37

Results – task a – hierarchical
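For readers unfamiliar with hierarchical measures, the sketch below illustrates the general idea behind a hierarchical F-measure: both the gold and the predicted label sets are extended with all of their ancestors before precision and recall are computed, so a prediction in the right branch earns partial credit. This is an assumption-laden illustration, not the challenge's evaluation code: whether the root counts, and how LCA-F restricts the ancestor sets, are not covered here, and the toy hierarchy is hypothetical rather than actual MeSH structure.

```python
from typing import Dict, Iterable, List, Set


def with_ancestors(labels: Iterable[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Return the labels together with every ancestor reachable via `parents`."""
    closed: Set[str] = set()
    stack = list(labels)
    while stack:
        node = stack.pop()
        if node not in closed:
            closed.add(node)
            stack.extend(parents.get(node, []))
    return closed


def hierarchical_f(gold: Set[str], pred: Set[str], parents: Dict[str, List[str]]) -> float:
    """F-measure over ancestor-augmented gold and predicted label sets."""
    gold_aug = with_ancestors(gold, parents)
    pred_aug = with_ancestors(pred, parents)
    overlap = len(gold_aug & pred_aug)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_aug)
    recall = overlap / len(gold_aug)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy hierarchy: child -> list of parents (a DAG is allowed).
PARENTS = {
    "Lung Neoplasms": ["Neoplasms"],
    "Breast Neoplasms": ["Neoplasms"],
    "Neoplasms": ["Diseases"],
}

# A sibling of the correct label scores 0 under flat F, but not here.
hf = hierarchical_f({"Lung Neoplasms"}, {"Breast Neoplasms"}, PARENTS)
print(round(hf, 3))
```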

Slide38

First challenge prizes

Slide39

Sustainability

Making the challenge viable, at very low cost, after the end of the project:

BioASQ Oracle
Software release and installation instructions
Benchmark datasets
BioASQ social network
Involvement of the biomedical community in the process
Attracting sponsors for prizes

Slide40

Project Consortium

National Centre for Scientific Research “Demokritos” – NCSR “D” (EL)
Transinsight GmbH – TI (D)
Universite Joseph Fourier – UJF (F)
University Leipzig – ULEI (D)
Universite Pierre et Marie Curie Paris 6 – UPMC (F)
Athens University of Economics and Business – Research Centre – AUEB-RC (EL)

Slide41

Project Consortium

Slide42

Get in touch!

BioASQ workshop @CLEF (Sheffield, Sept 14)
Visit www.bioasq.org
Follow @BioASQ

Slide43

Useful Links

BioASQ Annotation & assessment tools:
http://at.bioasq.org/
http://assess.bioasq.org/
https://github.com/AKSW/BioASQ-AT

BioASQ social network:
http://sn.bioasq.org/
https://github.com/AKSW/BioASQ-SN

BioASQ platform: http://bioasq.lip6.fr/

BioASQ Oracles: http://bioasq.lip6.fr/oracle/

A. Kosmopoulos, I. Partalas, E. Gaussier, G. Paliouras, I. Androutsopoulos, “Evaluation Measures for Hierarchical Classification: a unified view and novel approaches”, Data Mining and Knowledge Discovery (to appear).
