Biomedical articles per year

Uploaded On 2016-05-26

Presentation Transcript

Slide2

Biomedical articles per year

Slide3

Questions of biomedical experts

Yes/No question: “Are there any DNMT3 proteins present in plants?”

Exact Answer: “Yes”

Ideal Answer: “Yes. The plant DOMAINS REARRANGED METHYLTRANSFERASE2 (DRM2) is a homolog of the mammalian de novo methyltransferase DNMT3. DRM2 contains a novel arrangement of the motifs required for DNA methyltransferase catalytic activity.”

Slide4

Questions of biomedical experts

Factoid question: “What is the methyl donor of DNA (cytosine-5)-methyltransferases?”

Exact Answer: “S-adenosyl-L-methionine”

Ideal Answer: “S-adenosyl-L-methionine (AdoMet, SAM) is the methyl donor of DNA (cytosine-5)-methyltransferases. DNA (cytosine-5)-methyltransferases catalyze the transfer of a methyl group from S-adenosyl-L-methionine to the C-5 position of cytosine residues in DNA.”

Slide5

Questions of biomedical experts (III)

List question: “Which species may be used for the biotechnological production of itaconic acid?”

Exact Answer: “Aspergillus terreus”, “Aspergillus niger”, “Ustilago maydis”

Ideal Answer: “In 1955, the production of itaconic acid was firstly described for Ustilago maydis. Some Aspergillus species, like A. itaconicus and A. terreus, show the ability to synthesize this organic acid and A. terreus can secrete significant amounts to the media. Itaconic acid is mainly supplied by biotechnological processes with the fungus Aspergillus terreus. Cloning of the cadA gene into the citric acid producing fungus A. niger showed that it is possible to produce itaconic acid also in a different host organism.”

Slide6

Questions of biomedical experts (III)

Summary question: “How do histone methyltransferases cause histone modification?”

Exact Answer: -

Ideal Answer: “Histone methyltransferases (HMTs) are responsible for the site-specific addition of covalent modifications on the histone tails, which serve as markers for the recruitment of chromatin organization complexes. There are two major types of HMTs: histone-lysine N-methyltransferases and histone-arginine N-methyltransferases. The former methylate specific lysine (K) residues such as 4, 9, 27, 36, and 79 on histone H3 and residue 20 on histone H4. The latter methylate arginine (R) residues such as 2, 8, 17, and 26 on histone H3 and residue 3 on histone H4. Depending on what residue is modified and the degree of methylation (mono-, di- and tri-methylation), lysine methylation of histones is linked to either transcriptionally active or silent chromatin.”
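
The four question types on the preceding slides differ mainly in the shape of the exact answer. A minimal Python sketch of that distinction follows. The class, field, and type names (`Question`, `qtype`, `exact_answer`, `ideal_answer`, `"yesno"` and so on) are illustrative assumptions, not the official BioASQ data format.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

# Hypothetical type labels; the slides name the four kinds of question,
# not these exact strings.
QUESTION_TYPES = ("yesno", "factoid", "list", "summary")


@dataclass
class Question:
    body: str
    qtype: str
    ideal_answer: str
    # "yes"/"no" for yes/no questions, one string for factoid questions,
    # a list of strings for list questions, None for summary questions.
    exact_answer: Optional[Union[str, List[str]]] = None


def exact_answer_is_well_formed(q: Question) -> bool:
    """Check that the exact answer has the shape its question type implies."""
    if q.qtype == "yesno":
        return q.exact_answer in ("yes", "no")
    if q.qtype == "factoid":
        return isinstance(q.exact_answer, str) and bool(q.exact_answer)
    if q.qtype == "list":
        return (isinstance(q.exact_answer, list)
                and len(q.exact_answer) > 0
                and all(isinstance(a, str) for a in q.exact_answer))
    if q.qtype == "summary":
        return q.exact_answer is None
    return False


# The list question from the slides, with the ideal answer shortened to one sentence.
itaconic = Question(
    body="Which species may be used for the biotechnological production of itaconic acid?",
    qtype="list",
    ideal_answer="Itaconic acid is mainly supplied by biotechnological processes "
                 "with the fungus Aspergillus terreus.",
    exact_answer=["Aspergillus terreus", "Aspergillus niger", "Ustilago maydis"],
)
print(exact_answer_is_well_formed(itaconic))
```

Modelling the exact answer as an optional field keeps summary questions, which carry only an ideal answer, in the same structure as the other three types.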

Slide8

Finding relevant snippets

Slide9

Not only texts: ontologies, linked data, …

Slide11

Information from structured data

List question: “Which forms of cancer is the Tpl2 gene associated with?”

Related concepts:

http://www.disease-ontology.org/api/metadata/DOID:162 (cancer)

http://www.uniprot.org/uniprot/M3K8_RAT (TPL2 synonym)

Related RDF triple:

Subject: http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseases/3003 (lung cancer)

Predicate: http://www4.wiwiss.fu-berlin.de/diseasome/resource/diseasome/associatedGene

Object: http://www4.wiwiss.fu-berlin.de/diseasome/resource/genes/TPL2
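
The related RDF triple above can be held as a plain (subject, predicate, object) tuple, and a list question like this one then amounts to collecting the subjects that share a given predicate and object. The sketch below uses only the one triple from the slide. The helper name `subjects_for` and the in-memory list are assumptions for illustration, not part of any BioASQ tool.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

DISEASOME = "http://www4.wiwiss.fu-berlin.de/diseasome/resource/"

# The single triple shown on the slide: lung cancer --associatedGene--> TPL2.
TRIPLES: List[Triple] = [
    (DISEASOME + "diseases/3003",
     DISEASOME + "diseasome/associatedGene",
     DISEASOME + "genes/TPL2"),
]


def subjects_for(triples: List[Triple], predicate: str, obj: str) -> List[str]:
    """Return every subject linked to `obj` through `predicate`."""
    return [s for (s, p, o) in triples if p == predicate and o == obj]


diseases = subjects_for(
    TRIPLES,
    DISEASOME + "diseasome/associatedGene",
    DISEASOME + "genes/TPL2",
)
print(diseases)
```

With a full linked-data source, the same lookup would normally be expressed as a query against a triple store rather than a list comprehension.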

Slide13

BioASQ Vision

Make sure this knowledge is used to the benefit of patients

Need to make it accessible to biomedical experts

Search is not effective enough

Push research in automated answering of questions

A challenge for such systems can achieve a multiplying effect

Slide14

What is BioASQ?

A challenge funded by the European Union (FP7).

Task a: Hierarchical text classification

Organizers distribute new unclassified PubMed articles.

Participants assign MeSH terms to the articles.

Evaluation based on annotations of PubMed curators.

Task b: IR, QA, summarization, …

Organizers distribute English biomedical questions.

Participants provide: relevant articles, snippets, concepts, triples, “exact” answers, “ideal” answers.

Evaluation: both automatic (GMAP, MRR, ROUGE etc.) and manual (by biomedical experts).
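
Two of the automatic measures named above, MRR and GMAP, can be sketched in a few lines. This is a generic illustration of the standard definitions, not the official BioASQ evaluation code. The function names and the small `eps` added inside the geometric mean (a common way to avoid taking the logarithm of zero) are assumptions.

```python
import math
from typing import List, Sequence, Set


def reciprocal_rank(ranked: Sequence[str], gold: Set[str]) -> float:
    """1/rank of the first correct item in a ranked list, or 0.0 if none is correct."""
    for rank, item in enumerate(ranked, start=1):
        if item in gold:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[Sequence[str]], golds: List[Set[str]]) -> float:
    """Average the reciprocal rank over all questions."""
    return sum(reciprocal_rank(r, g) for r, g in zip(runs, golds)) / len(runs)


def average_precision(ranked: Sequence[str], gold: Set[str]) -> float:
    """Sum precision@k at each rank k holding a relevant item, divided by |gold|."""
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in gold:
            hits += 1
            total += hits / rank
    return total / len(gold) if gold else 0.0


def gmap(ap_values: Sequence[float], eps: float = 1e-5) -> float:
    """Geometric mean of per-question average precision; eps is an assumed smoothing term."""
    return math.exp(sum(math.log(ap + eps) for ap in ap_values) / len(ap_values))


# Toy data: identifiers are placeholders, not real PubMed IDs.
print(mean_reciprocal_rank([["a", "b"], ["x", "y"]], [{"b"}, {"x"}]))
print(average_precision(["d1", "d2", "d3"], {"d1", "d3"}))
```

Because GMAP is a geometric mean, a single question with near-zero average precision pulls the score down far more than it would under the arithmetic mean used by MAP.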

Slide15

The challenge

Task a

Task b

Slide17

Behind the scenes

Slide18

BioASQ Platform

Slide19

Datasets

Task b data contain gold articles, snippets, concepts, triples, “exact” and “ideal” answers prepared by biomedical experts from around Europe.

Task a      1st challenge   2nd challenge
Training    10,876,004      12,628,968
Test        83490           71950

Task b      1st challenge   2nd challenge
Training    29              310
Test        281             500

Slide20

Data sources

They include both text and structured info.

PubMed abstracts, PubMed Central articles, MeSH.

Gene Ontology, UniProt, Jochem, Disease Ontology.

Slide21

Annotation: questions and queries

Slide22

Annotation: snippets

Slide23

Annotation: answers

Slide24

Assessment: relevance of material

Slide25

Assessment: information in answers

Slide26

BioASQ social network

Slide27

Oracle

Slide28

Oracle

Slide29

Two cycles

2013 Schedule: March 2013, June 2013, August 2013, September 2013

2014 Schedule: February 2014, March 2014, May 2014, September 2014

The official challenge is over, but…

Task a continues to run each week.

An oracle for task b will be available soon.

Oracles will remain available.

Third cycle is being designed …

Slide30

Challenge participants so far

Slide31

Challenge participants in each cycle

Slide32

Evaluation measures

Task a: Hierarchical text classification

Flat measures for multi-label classification: Accuracy, MiF, MaF, EBF

Hierarchical measures: LCA-F (new), HF
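
Assuming MiF and MaF above stand for micro- and macro-averaged F-measure (the usual reading for multi-label classification), the difference between the two can be sketched as follows. The function names are assumptions rather than the official evaluation code, and the labels "A" and "B" in the usage example are placeholders, not real MeSH headings.

```python
from typing import Dict, List, Set, Tuple


def _f1(tp: int, fp: int, fn: int) -> float:
    """F-measure from raw counts; 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def label_counts(gold: List[Set[str]], pred: List[Set[str]]) -> Dict[str, Tuple[int, int, int]]:
    """Per-label (tp, fp, fn) counts over all articles."""
    counts: Dict[str, List[int]] = {}
    for g, p in zip(gold, pred):
        for label in g | p:
            c = counts.setdefault(label, [0, 0, 0])
            if label in g and label in p:
                c[0] += 1  # true positive
            elif label in p:
                c[1] += 1  # false positive
            else:
                c[2] += 1  # false negative
    return {label: (c[0], c[1], c[2]) for label, c in counts.items()}


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Pool tp/fp/fn over all labels, then compute one F-measure."""
    counts = list(label_counts(gold, pred).values())
    return _f1(sum(c[0] for c in counts),
               sum(c[1] for c in counts),
               sum(c[2] for c in counts))


def macro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Compute one F-measure per label, then take the unweighted mean."""
    per_label = [_f1(*c) for c in label_counts(gold, pred).values()]
    return sum(per_label) / len(per_label) if per_label else 0.0


# Three articles: label "A" is always right, label "B" is always wrong.
gold = [{"A", "B"}, {"A"}, {"A"}]
pred = [{"A"}, {"A"}, {"A", "B"}]
print(micro_f1(gold, pred), macro_f1(gold, pred))
```

In the toy data the frequent label dominates the micro average, while the macro average gives the rare, badly predicted label equal weight, which is why the two are usually reported together.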

Task b: IR, QA, summarization, …

Phase A: standard IR measures, mean precision, mean recall, mean F-measure, MAP (used for winners selection), G-MAP

Phase B:

‘Exact answers’ (based on type): accuracy (yes/no), strict/lenient accuracy, MRR (factoid), mean F-measure (list)

‘Ideal answers’: manual scores from the experts {Readability, Repetition, Information Precision and Recall}, plus ROUGE

Slide33

First year technology/results overview

Task 1a

Mainly SVMs and learning-to-rank.

Mostly flat classification, ignoring class taxonomy.

Mediocre results by hierarchical methods.

One of the systems outperformed NLM’s system.

Task 1b

Phase A (retrieve relevant documents, concepts, snippets, triples): low performance (compared to baselines).

Phase B (formulate ‘exact’ and ‘ideal’ answers): poor performance for ‘exact’ answers (except for yes/no questions); high performance for ‘ideal’ answers (paragraph-sized summaries), but starting with gold documents, snippets etc.

Large scope for improvements, esp. in Task 1b.

Slide34

“Exact” answer results (batch 2/3)

Slide35

“Ideal” answer results (batch 2/3)

Slide36

Results – task a – flat measures

Slide37

Results – task a – hierarchical

Slide38

First challenge prizes

Slide39

Sustainability

BioASQ Oracle

Software release and installation instructions

Benchmark datasets

BioASQ social network

Involvement of the biomedical community in the process

Attracting sponsors for prizes

Making the challenge viable, at very low cost, after the end of the project

Slide40

Project Consortium

National Centre for Scientific Research “Demokritos” - NCSR “D” (EL)

Transinsight GmbH – TI (D)

Universite Joseph Fourier - UJF (F)

University Leipzig - ULEI (D)

Universite Pierre et Marie Curie Paris 6 – UPMC (F)

Athens University of Economics and Business – Research Centre – AUEB-RC (EL)

Slide41

Project Consortium

Slide42

Get in touch!

BioASQ workshop @CLEF (Sheffield, Sept 14)

Visit www.bioasq.org

Follow @BioASQ

Slide43

Useful Links

BioASQ Annotation & assessment tools:

http://at.bioasq.org/

http://assess.bioasq.org/

https://github.com/AKSW/BioASQ-AT

BioASQ social network:

http://sn.bioasq.org/

https://github.com/AKSW/BioASQ-SN

BioASQ platform: http://bioasq.lip6.fr/

BioASQ Oracles: http://bioasq.lip6.fr/oracle/

A. Kosmopoulos, I. Partalas, E. Gaussier, G. Paliouras, I. Androutsopoulos. Evaluation Measures for Hierarchical Classification: a unified view and novel approaches. Data Mining and Knowledge Discovery (to appear).