Slide1
Machine Learning group meeting
Yiming Zhang
01/27/2012
Slide2
Biological questions
What is the distribution of retained/constitutive introns at the gene level?
What is the number of retained/constitutive introns per gene?
How many retained/constitutive introns are near the 5’ end or 3’ end of genes?
Slide3
Biological questions
Which genes have all introns retained or all introns constitutively spliced?
What is the function of genes that are always alternatively or always constitutively spliced? (GO analysis)
Slide4
Dataset and model
Arabidopsis TAIR10 genome sequence and TAIR10 GFF3 annotation.
After removing redundant entries and introns for which the intron or a flanking exon is shorter than 20 nt, 126,064 introns and 21,091 genes were used.
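The filtering step above can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how redundancy was defined, so the sketch assumes that two introns are redundant when they share chromosome, start, end, and strand. Coordinates are assumed to be 1-based and inclusive, as in GFF3. The names `Exon`, `Intron`, `introns_from_exons`, and `unique_introns` are hypothetical and are not taken from the original pipeline.

```python
from typing import Iterable, List, NamedTuple

MIN_LEN = 20  # nt threshold stated on the slide


class Exon(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive (GFF3 convention)
    end: int
    strand: str


class Intron(NamedTuple):
    chrom: str
    start: int
    end: int
    strand: str


def length(feature) -> int:
    """Length in nt of a feature with inclusive start/end coordinates."""
    return feature.end - feature.start + 1


def introns_from_exons(exons: Iterable[Exon]) -> List[Intron]:
    """Derive the introns of one transcript from its exons and keep only
    those where the intron and both flanking exons are >= MIN_LEN nt."""
    ordered = sorted(exons, key=lambda e: e.start)
    kept = []
    for left, right in zip(ordered, ordered[1:]):
        intron = Intron(left.chrom, left.end + 1, right.start - 1, left.strand)
        if min(length(left), length(intron), length(right)) >= MIN_LEN:
            kept.append(intron)
    return kept


def unique_introns(transcripts: Iterable[Iterable[Exon]]) -> List[Intron]:
    """Collect introns over all transcripts, dropping exact duplicates
    (assumed redundancy criterion: identical chrom/start/end/strand)."""
    seen = set()
    result = []
    for exons in transcripts:
        for intron in introns_from_exons(exons):
            if intron not in seen:
                seen.add(intron)
                result.append(intron)
    return result
```

With two transcripts that share one intron and a second intron of only 10 nt, `unique_introns` would return the shared intron once and drop the short one.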
An AT-specific model (a random forest, RF, trained with 200 trees) was used to make predictions.
Slide5
Preliminary results
Criteria   IntronR num   Cons. intron num   IntronR per gene   Cons. intron per gene
All        43700         82364              2.072              3.905
P>=0.7     4444          4814               0.211              0.228
P>=0.8     2072          2129               0.098              0.101
P>=0.9     484           1145               0.023              0.054
Slide6
Preliminary results
Criteria   Type      Intron num in 5’UTR   Intron num in CDS   Intron num in 3’UTR
All        IntronR   3463                  38901               1336
All        Cons.     1319                  80591               454
P>0.7      IntronR   456                   3761                227
P>0.7      Cons.     38                    4755                21
P>0.8      IntronR   156                   1857                59
P>0.8      Cons.     21                    2097                11
P>0.9      IntronR   67                    405                 12
P>0.9      Cons.     11                    1134                0
Slide7
Preliminary results
Criteria   Type      Chr1    Chr2    Chr3    Chr4    Chr5
All        IntronR   11844   6663    8351    6875    9967
All        Cons.     22148   12394   15920   12547   19355
P>0.7      IntronR   1101    720     914     710     999
P>0.7      Cons.     1252    766     932     725     1139
P>0.8      IntronR   503     362     394     356     457
P>0.8      Cons.     545     332     413     325     514
P>0.9      IntronR   113     91      87      97      96
P>0.9      Cons.     286     182     228     175     274
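
The three results tables can be cross-checked with simple arithmetic: the per-gene columns appear to be the intron counts divided by the 21,091 genes, and the per-region and per-chromosome counts should add up to the overall counts. The sketch below is a reader-side check, not part of the original analysis. The numbers are copied from the slides, and the names `TOTALS`, `BY_REGION`, `BY_CHROM`, `per_gene`, and `breakdown_matches` are hypothetical. The slides write the thresholds both as `P>=` and `P>`; the sketch assumes these denote the same cutoffs.

```python
GENES = 21_091  # genes used, from the "Dataset and model" slide

# Criterion -> (retained intron count, constitutive intron count).
# Keys use "P>=" as in the first table; the later tables write "P>",
# which is assumed here to be the same cutoff.
TOTALS = {
    "All": (43700, 82364),
    "P>=0.7": (4444, 4814),
    "P>=0.8": (2072, 2129),
    "P>=0.9": (484, 1145),
}

# Criterion -> (retained, constitutive), each as (5'UTR, CDS, 3'UTR) counts.
BY_REGION = {
    "All": ((3463, 38901, 1336), (1319, 80591, 454)),
    "P>=0.7": ((456, 3761, 227), (38, 4755, 21)),
    "P>=0.8": ((156, 1857, 59), (21, 2097, 11)),
    "P>=0.9": ((67, 405, 12), (11, 1134, 0)),
}

# Criterion -> (retained, constitutive), each as (Chr1, ..., Chr5) counts.
BY_CHROM = {
    "All": ((11844, 6663, 8351, 6875, 9967), (22148, 12394, 15920, 12547, 19355)),
    "P>=0.7": ((1101, 720, 914, 710, 999), (1252, 766, 932, 725, 1139)),
    "P>=0.8": ((503, 362, 394, 356, 457), (545, 332, 413, 325, 514)),
    "P>=0.9": ((113, 91, 87, 97, 96), (286, 182, 228, 175, 274)),
}


def per_gene(count: int, genes: int = GENES) -> float:
    """Average number of introns per gene, rounded to three decimals."""
    return round(count / genes, 3)


def breakdown_matches(totals: dict, breakdown: dict) -> bool:
    """True if every breakdown row sums to the corresponding total."""
    return all(
        sum(breakdown[criterion][kind]) == totals[criterion][kind]
        for criterion in totals
        for kind in (0, 1)  # 0 = retained, 1 = constitutive
    )


if __name__ == "__main__":
    for criterion, (retained, constitutive) in TOTALS.items():
        print(criterion, per_gene(retained), per_gene(constitutive))
    print("regions consistent:", breakdown_matches(TOTALS, BY_REGION))
    print("chromosomes consistent:", breakdown_matches(TOTALS, BY_CHROM))
```

Under these assumptions the retained and constitutive counts for "All" sum to the 126,064 introns stated on the dataset slide, and both breakdowns agree with the totals.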