
Fig. 1. Transmembrane helix.

Uploaded On 2015-08-06



Presentation Transcript

Fig. 1. Transmembrane helix.

Over the years, a number of different methods have been developed for predicting the topology of TMH proteins. In general, these methods need to predict the following items: (i) the type of each residue (e.g., helix, loop, etc.), (ii) the TMH segments, and (iii) their orientation. The various methods differ in the number of distinct steps that they use to predict the above items. Some methods predict each item individually, others utilize predictors that combine some of these steps, and others predict all three items in a single step. The residue types are predicted either by relying on the fact that membrane segments contain primarily hydrophobic residues (e.g., TopPred [29]) or by utilizing machine-learning approaches (e.g., neural networks, support vector machines) that use as features the amino acid sequence of the protein or evolutionary information in the form of sequence profiles (e.g., PHDhtm [25], MEMSAT3 [10], SVMTop [20]). The segments are identified by using simple hydrophobicity plots [16] to ascertain probable helical segments and then applying various rules based on the expected lengths of the TMH segments to either accept, reject, or break long segments [29, 20, 32]. The segment orientation is often determined by relying on the fact that the regions between TMH segments that are positively charged tend to reside in the intracellular regions of the membrane (positive-inside rule [29]). The approaches that combine segment identification with orientation determination (e.g., MEMSAT3) employ dynamic programming methods to determine the different segments of a TMH protein and its orientation relative to the cytoplasm. Finally, the approaches that predict all of the above items in a single step utilize hidden Markov models (HMM) that capture the different structural components of a TMH protein (e.g., TMH segment, inside loop, outside loop, signal peptide, etc.) as separate modules. These models are trained on either the amino acid sequence of the proteins (e.g., TMHMM [26] and HMMTOP [27]) or on sequence profiles (e.g., Phobius [17]) and predict the topology of a TMH protein by determining its most probable path through that model using Viterbi decoding [22].

This paper focuses on improving the accuracy of HMM-based approaches by combining them with an SVM-based approach that predicts the type of each residue. Specifically, we developed a TMH topology prediction algorithm, called TOPTMH, that solves the residue-type prediction, segment identification, and orientation determination in three distinct steps. The type of each residue is annotated via an SVM-based approach utilizing a window-based encoding of the residues' profile information and a second order exponential kernel function [24, 23, 12]. The segments are identified by using a pair of HMMs that model the different structural components of TMH proteins. The first HMM uses as input the SVM predictions for each residue, whereas the second HMM uses as input hydropathy information as measured by a recently introduced hydrophobicity scale [8]. Finally, the orientation of the predicted segments is determined by applying the positive-inside rule.

The advantages of this approach are three-fold. First, by using a discriminative approach to learn a residue-type prediction model, the accuracy of these predictions is higher than that obtained (indirectly) by the HMM model. Second, by encoding the protein sequences via the SVM predictions, whose signal is significantly higher than that of the raw sequence profile, the demands imposed during HMM parameter estimation are substantially reduced, allowing it to better focus on learning how to correctly identify the different segments. Third, combining the outputs of the HMM models trained on the SVM predictions and on the hydrophobicity scores allows TOPTMH to correctly identify the TMH segments that have an amino acid composition similar to that of signal peptides.

We experimentally evaluated the performance of TOPTMH on three widely used datasets. Our evaluation was performed in two phases. First, we evaluated the gains obtained by TOPTMH by comparing it against an approach that uses a rule-based scheme to identify the TMH segments from the SVM predictions and another that uses just a single HMM model trained on the SVM predictions. Our evaluation showed that the HMM-based segment identification outperforms the rule-based approach by at least 50% in terms of the Q_ok score, which measures per-segment accuracy, and that by combining both the SVM- and the hydrophobicity-based HMM models, a further 3% to 19% improvement can be obtained. Second, we evaluated its performance by comparing it against Phobius [17] and MEMSAT3 [10]. Our evaluation showed that TOPTMH outperforms both of them across the different datasets. We also evaluated the performance of TOPTMH on an independent static benchmark [14]. The results on this blind evaluation showed that TOPTMH achieves the highest scores on high-resolution sequences (Q_2 score of 84% and Q_ok score of 86%) against existing state-of-the-art systems while achieving low signal peptide error.

2 Background and Definitions

2.1 Transmembrane Helical Proteins

The structure of a typical TMH protein is shown in Figure 1. It consists of a series of helical segments passing through the cell's membrane (bilipid layer) separated by loop segments that are either on the inside or the outside side of

3.1 Residue Annotation Step

We developed an SVM-based TMH residue annotation approach that uses features obtained from the protein's PSSM. Its overall structure is similar to that used by existing methods for SVM-based structural and functional annotation of protein residues using position specific scoring matrices (e.g., secondary structure for globular proteins [12], solvent accessible surface area [24], disorder prediction [24], and DNA-binding [24]).

TOPTMH formulates the residue annotation problem as a binary classification problem whose goal is to predict if a residue belongs to a helix state or not. For each residue i of a protein sequence X, the input to the SVM is a (2w+1)-length subsequence (wmer) of X centered at position i. Each wmer is represented by a vector x_i of length (2w+1) × 20 that is obtained by concatenating the rows of the PSSM for each position of the wmer. This wmer-based input is used for both training and prediction. The parameter w determines the length of the local environment around the ith sequence position used while building and applying the model, and its optimal value is determined experimentally.

TOPTMH uses SVMlight [9] to learn the actual SVM model and utilizes the second order exponential function (soe) [12] as its kernel function. The soe kernel has been shown to produce better results than the traditional radial basis function (rbf) kernel for various sequence annotation prediction problems [12, 24, 23]. For a sequence, these predictions are available as a web service called MONSTER (http://bio.dtc.umn.edu/monster). In the context of TOPTMH, the soe kernel function is given by

\[ K^{soe}(x_i, y_j) = \exp\left( 1 + \frac{K_2(x_i, y_j)}{\sqrt{K_2(x_i, x_i)\, K_2(y_j, y_j)}} \right), \tag{1} \]

where x_i and y_j are the vector representations of two wmers, K_2 is given by

\[ K_2(x_i, y_j) = \langle x_i, y_j \rangle + \langle x_i, y_j \rangle^2, \tag{2} \]

and \langle x_i, y_j \rangle denotes the dot-product of the x_i and y_j vectors.

3.2 Segment Identification Step

In order to determine the best approach for identifying the TMH segments we developed and studied three different approaches. The first approach utilizes a simple scheme based on empirical rules and the other two predict the topology by employing hidden Markov models (HMM) [22]. The first HMM-based approach uses a single HMM based solely on the SVM scores, whereas the second uses two HMMs, one based on SVM scores and one based on hydrophobicity scales.

Rule-Based. The rule-based segment identification approach post-processes the SVM-based residue annotations and identifies the segments by applying some

Fig. 2. The layout of the HMM model used in TOPTMH.
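The window encoding and the soe kernel of Section 3.1 can be illustrated with a short Python sketch. It is not the authors' implementation: the function names are hypothetical, the PSSM is assumed to be a list of 20-value rows, window positions that fall outside the sequence are assumed to be zero-padded (the text does not say how sequence ends are handled), and the kernel is assumed to be normalized by K_2(x, x) and K_2(y, y).

```python
import math


def wmer_vector(pssm, i, w):
    """Concatenate the PSSM rows for positions i-w .. i+w into one vector.

    Assumption: positions outside the sequence contribute all-zero rows.
    """
    n = len(pssm)
    width = len(pssm[0])  # 20 columns for a standard PSSM
    vec = []
    for pos in range(i - w, i + w + 1):
        if 0 <= pos < n:
            vec.extend(pssm[pos])
        else:
            vec.extend([0.0] * width)
    return vec


def dot(x, y):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def k2(x, y):
    """Second order term: <x, y> + <x, y>^2."""
    d = dot(x, y)
    return d + d * d


def soe_kernel(x, y):
    """exp(1 + K2(x, y) / sqrt(K2(x, x) * K2(y, y))); x and y must be non-zero."""
    return math.exp(1.0 + k2(x, y) / math.sqrt(k2(x, x) * k2(y, y)))
```

With w = 1 and a 20-column PSSM, `wmer_vector` returns a vector of length 3 × 20 = 60, and under the assumed normalization `soe_kernel(x, x)` equals exp(2) for any non-zero x.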
The HMM models were built using the UMDHMM [11] package (version 1.02), which was modified to take as input annotated protein sequences. The threading of a sequence through the HMM model was done using the Viterbi [22] algorithm.

HMM Based on SVM Scores (HMM-SVM). This approach builds an HMM model that only takes into account the per-residue SVM scores produced by the annotation step. To construct the training set, the SVM score for each residue is computed. Since HMMs are primarily designed to operate on finite size alphabets, the raw SVM scores are discretized into a finite number of bins, with each bin corresponding to a distinct symbol. The final training set for the HMM corresponds to a set of proteins with known TMH topology represented as sequences of SVM-score based bins. A similar SVM-based prediction followed by discretization is performed when this model is used to predict the topology of a test protein. We discretized the SVM scores into equal-size intervals, and assigned all residues with scores ≤ −3 and ≥ 3 into the first and last bin, respectively.

HMM Based on SVM Scores and Hydrophobicity Scores (HMM-SVM+HP). This model builds a pair of HMM models, one based on SVM scores (HMM-SVM) and one based on the hydrophobicity values (HMM-HP) of known TMH sequences, and combines the topology predictions from both HMM models. This approach was motivated by the fact that in certain cases, the SVM-based residue annotation may fail to identify certain hydrophobic TMH segments. This is further discussed in Section 5.

Table 1. Discretization of Hydrophobicity values.
Label   Amino Acids         HP Values
1       R, E, K, D          2.5 ≤ h
2       N, H, P, Q          1.0 ≤ h ≤ 2.5
3       T, Y, G, S          −0.1 ≤ h ≤ 0.9
4       F, V, C, A, M, W    −0.4 ≤ h ≤ −0.1
5       I, L                h ≤ −0.5

HP Values denotes a range of hydrophobicity values decided based on [8].

The HMM-SVM model is identical to that described in the previous section. The HMM-HP model is built by first encoding the amino acids of each TMH protein as a sequence of discretized hydrophobicity values. Table 1 shows the scheme used to discretize the hydrophobicity values for each amino acid. Both the HMM-SVM and HMM-HP models are used independently to predict the TMH segments. The final set of predictions consists of the segments predicted by HMM-SVM and those segments predicted by HMM-HP that do not overlap with any of the segments of HMM-SVM. Two segments are considered to overlap if they have more than five residues in common. Since this approach combines both the SVM- and HP-based HMM models, we will refer to it as HMM-SVM+HP.

3.3 Orientation Determination Step

Once the TMH segments have been identified, their orientation relative to the N-terminus is determined by applying the positive-inside rule [29] using the technique introduced in THUMBUP [32]. In this approach, each protein is first coded into a binary sequence by assigning a one to the first protein residue and to all the arginine and lysine residues, and a zero to the remaining residues. Then, a score is computed for each loop by adding the values of its 15 neighboring residues on each side. If the total score for odd-numbered loops is greater than or equal to that of even-numbered loops, the N-terminus is on the inside of the membrane; otherwise it is outside.

4 Experimental Design

4.1 Datasets

We evaluated the prediction performance of the TOPTMH method on datasets used by the Phobius and MEMSAT3 methods and by participating in the static benchmark [13]. The datasets obtained from the Phobius study included a set of 247 transmembrane proteins and a set of 45 transmembrane proteins that contained signal peptide residues with transmembrane helix segments. We will denote the first dataset as TM-Only and the second as TM-SP. The dataset
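The orientation step of Section 3.3 can be sketched in Python as follows. This is one possible reading of the description rather than the THUMBUP or TOPTMH code: the function names are hypothetical, segments are assumed to be sorted, non-overlapping, half-open (start, end) index pairs, the N- and C-terminal tails are assumed to count as the first and last loops, and "15 neighboring residues on each side" is interpreted as the loop residues lying within 15 positions of each adjacent helix end.

```python
def binary_code(seq):
    """1 for the first residue and for every arginine (R) or lysine (K), else 0."""
    return [1 if (i == 0 or aa in "RK") else 0 for i, aa in enumerate(seq)]


def loop_regions(n, segments):
    """Half-open loop intervals before, between, and after the TMH segments."""
    loops, prev_end = [], 0
    for start, end in segments:
        loops.append((prev_end, start))
        prev_end = end
    loops.append((prev_end, n))
    return loops


def loop_score(code, loop, n, flank=15):
    """Sum the coded values of loop residues within `flank` of a helix end."""
    start, end = loop
    picked = set()
    if start > 0:  # a helix ends just before this loop
        picked.update(range(start, min(end, start + flank)))
    if end < n:  # a helix starts just after this loop
        picked.update(range(max(start, end - flank), end))
    return sum(code[i] for i in picked)


def n_terminus_inside(seq, segments, flank=15):
    """Positive-inside rule: True if odd-numbered loops score >= even ones."""
    code = binary_code(seq)
    loops = loop_regions(len(seq), segments)
    scores = [loop_score(code, lp, len(seq), flank) for lp in loops]
    odd = sum(scores[0::2])   # loops 1, 3, 5, ... (loop 1 is the N-terminal tail)
    even = sum(scores[1::2])  # loops 2, 4, 6, ...
    return odd >= even
```

Using a set of indices in `loop_score` keeps short loops, where the two 15-residue flanks overlap, from counting the same residue twice; whether the original technique does the same is not stated in the text.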
Q_2T^%obs, Q_2T^%prd, and Q_2. Q_2T^%obs is the percentage of observed TMH residues that are predicted correctly (helix recall), Q_2T^%prd is the percentage of predicted TMH residues that are predicted correctly (helix precision), and Q_2 is the percentage of correctly predicted residues (both helix and non-helix).

The per-segment evaluation measures the ability of a method to correctly identify the actual TMH segments. We used three per-segment metrics denoted by Q_htm^%obs, Q_htm^%prd, and Q_ok. Q_htm^%obs is the percentage of observed TMH segments that are predicted correctly (TMH segment recall), Q_htm^%prd is the percentage of predicted TMH segments that are predicted correctly (TMH segment precision), and Q_ok is the percentage of proteins for which all the TMH segments are predicted correctly. Note that Q_ok is a very strict metric, as each protein contributes either a zero or a one. In the above metrics, a predicted TMH segment is considered to be correctly identified if there is an overlap of ten residues between the predicted and observed helix segments. (Earlier techniques used an overlap of only three [3] or five [17] residues, which is too short and can artificially inflate the performance of a scheme.) In addition, a predicted helix segment is counted only once. This is illustrated by considering the following examples:

Obs1:  TTTTTTTTTTTTTTTT------TTTTTTTTTTTTT
Pred1: -----TTTTTTTTTTTTTTTTTTTTTTTTTTT---
Obs2:  ---TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT--
Pred2: TTTTTTTTTTTTTT------TTTTTTTTTTTTTTT

In this example, Obs1 and Pred1 are the observed and predicted TMH segments for a particular protein sequence. During evaluation, the second segment of the Obs1 sequence will not be considered as correctly predicted, since the only segment predicted in Pred1 is already accounted for by the first segment of the Obs1 sequence. On the other hand, the second segment of the Pred2 sequence will be considered as incorrectly predicted, since the first segment is the one matched to the only segment in the Obs2 sequence.

Although the per-residue measures capture the accuracy of a method in predicting the annotation label for a residue, they are not able to assess the ability of the method to identify the TMH segments separated by loop regions of different lengths. Hence, TMH prediction algorithms are mostly evaluated using per-segment metrics.

5 Results

5.1 Residue Annotation Performance

The performance achieved by the SVM-based residue annotation for different values of w is shown in Table 2. This table shows the per-residue performance metrics (Q_2, Q_2T^%obs and Q_2T^%prd) for a subset of the TM-Only dataset. We observe that in terms of the various metrics, the performance achieved for different

Table 3. TMH Segment Identification Performance.

Per-Residue Scores
                 TM-SP                            TM-Only
Method           Q_2     Q_2T^%obs  Q_2T^%prd     Q_2     Q_2T^%obs  Q_2T^%prd
Raw-SVM          96.73   71.10      86.60         90.64   84.30      83.10
Rule             95.16   59.56      95.89         89.19   79.65      87.36
HMM-SVM-D5       96.28   76.39      84.87         89.40   85.54      82.25
HMM-SVM-D7       96.45   76.85      87.72         89.34   85.61      82.23
HMM-SVM-D12      96.24   77.56      84.45         89.31   86.13      81.35
HMM-SVM-D7+HP    97.08   84.80      88.50         89.46   86.21      82.04

Per-Segment Scores
                 TM-SP                            TM-Only
Method           Q_ok    Q_htm^%obs Q_htm^%prd    Q_ok    Q_htm^%obs Q_htm^%prd
Raw-SVM          35.55   85.23      70.09         38.86   94.34      74.33
Rule             64.44   75.00      100.00        70.85   92.88      94.96
HMM-SVM-D5       64.44   84.09      87.05         71.66   95.39      93.73
HMM-SVM-D7       71.11   85.23      92.59         72.06   95.63      93.52
HMM-SVM-D12      60.00   85.22      85.22         70.04   95.80      92.87
HMM-SVM-D7+HP    84.44   93.18      93.18         73.68   96.12      93.33

to predict correctly large contiguous portions of each helical segment. On the other hand, the per-segment performance achieved by the other segment identification approaches is considerably higher. Both the rule- and HMM-based approaches are able to significantly improve over Raw-SVM for both the TM-SP and TM-Only datasets. Among them, the approaches based on HMM-SVM outperform the rule-based approach by 2% to 12%, even though the latter achieved the highest Q_htm^%prd scores (100% and 96.44% for TM-SP and TM-Only, respectively).

The overall best Q_ok results were obtained by the HMM-SVM-D7+HP approach. In particular, the Q_ok values achieved by HMM-SVM-D7+HP are 19% and 3% better than those of the next best performing scheme (HMM-SVM-D7) on the TM-SP and TM-Only datasets, respectively. The large performance advantage of HMM-SVM-D7+HP over HMM-SVM-D7 on the TM-SP dataset is primarily due to increases in recall (Q_htm^%obs). HMM-SVM-D7+HP achieves a Q_htm^%obs of 93.18% compared to the 85.23% achieved by HMM-SVM-D7. A possible explanation for the relatively poor performance of HMM-SVM-D7 is that, due to the signal peptide segments present in some of the sequences in the TM-SP dataset, the SVM model fails to identify some of the TMH residues. However, these residues can be correctly identified when hydrophobicity scores are considered, and as such the combined HMM-SVM-D7+HP approach leads to better overall results.

with both correct topology and location than TOPTMH (147 vs 134). We believe that this is primarily because the binary classification of residues into helix and non-helix prevented TOPTMH from effectively differentiating between inside and outside loops, so it could not perform similarly to MEMSAT3.

TOPTMH Performance on the Static Benchmark. The performance of TOPTMH on the static benchmark is shown in Table 6. The TOPTMH results shown in this table correspond to the results obtained using the HMM-SVM-D7+HP topology prediction approach. From these results we see that TOPTMH achieved the highest Q_ok score of 86% for the high-resolution sequences and the highest Q_2 scores of 84% and 90% for the high- and low-resolution sequences, respectively. Moreover, TOPTMH has performed about 7% better in TMH prediction than both MEMSAT3 and Phobius. Note that even though HMMTOP2 achieved Q_htm^%obs and Q_htm^%prd scores that were higher than the corresponding scores achieved by TOPTMH, its Q_ok score is lower than that achieved by TOPTMH. This is due to the fact that even though HMMTOP2 identified more TMH segments in total than TOPTMH, it was not as successful in predicting proteins for which all of the TMH segments were identified correctly.

Table 6. TMH Benchmark Results.

High Resolution Accuracy
               Per-segment                       Per-residue
Method         Q_ok  Q_htm^%obs  Q_htm^%prd      Q_2   Q_2T^%obs  Q_2T^%prd
TOPTMH         86    95          96              84    75         90
PHDpsihtm08    84    99          98              80    76         83
HMMTOP2        83    99          99              80    69         89
MEMSAT3        80    98          97              83    78         88
Phobius        80    92          93              80    69         84
DAS            79    99          96              72    48         94
TopPred2       75    90          90              77    64         83
TMHMM1         71    90          90              80    68         81
SOSUI          71    88          86              75    66         74
PHDhtm07       69    83          81              78    76         82

Low Resolution Accuracy
               Per-segment                       Per-residue
Method         Q_ok  Q_htm^%obs  Q_htm^%prd      Q_2   Q_2T^%obs  Q_2T^%prd
TOPTMH         66    92          88              90    84         80
PHDpsihtm08    67    95          94              89    87         77
HMMTOP2        66    94          93              90    85         83
MEMSAT3        63    92          87              88    86         76
Phobius        65    90          88              90    81         79
DAS            39    93          81              86    65         85
TopPred2       48    84          79              88    74         71
TMHMM1         72    91          92              90    83         80
SOSUI          49    88          86              88    79         72
PHDhtm07       56    85          86              87    83         75

Results for TOPTMH and MEMSAT3 were obtained by collecting predictions for the test set of the TMH static benchmark [13] and submitting the results to the benchmark server. Phobius [17] predictions were collected by loading the benchmark test sequences to the Phobius web server [13] and submitting the output to the benchmark server. All the other results were provided by the TMH static benchmark evaluation web-site.

15. T. Klabunde and G. Hessler. Drug design strategies for targeting G-protein-coupled receptors. ChemBioChem, 3:928-944, 2002.
16. J. Kyte and R. F. Doolittle. A simple method for displaying the hydropathic character of a protein. Journal of Molecular Biology, 157(1):105-132, 1982.
17. L. Käll, A. Krogh, and E. L. L. Sonnhammer. A combined transmembrane topology and signal peptide prediction method. Journal of Molecular Biology, 338:1027-1036, 2004.
18. L. Käll and E. L. L. Sonnhammer. Reliability of transmembrane predictions in whole-genome data. FEBS Lett., 532(3):415-418, 2002.
19. J. Liu and B. Rost. Comparing function and structure between entire proteomes. Protein Sci., 10:1970-1979, 2001.
20. Allan Lo, Hua-Sheng Chiu, Ting-Yi Sung, Ping-Chiang Lyu, and Wen-Lian Hsu. Enhanced membrane protein topology prediction using a hierarchical classification method and a new scoring function. J Proteome Res, 7(2):487-496, Feb 2008.
21. Amit Oberai, Yungok Ihm, Sanguk Kim, and James U Bowie. A limited universe of membrane protein families and folds. Protein Sci, 15(7):1723-1734, Jul 2006.
22. L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. In Proceedings of the IEEE, volume 77, pages 257-286, 1989.
23. Huzefa Rangwala and George Karypis. fRMSDPred: Predicting local RMSD between structural fragments using sequence information. Proteins, Feb 2008.
24. Huzefa Rangwala, Christopher Kauffman, and George Karypis. A generalized framework for protein sequence annotation. In Proceedings of the NIPS Workshop on Machine Learning in Computational Biology, 2007.
25. B. Rost, P. Fariselli, and R. Casadio. Topology prediction for helical transmembrane proteins at 86% accuracy. Protein Sci, 5(8):1704-1718, Aug 1996.
26. E. L. L. Sonnhammer, G. von Heijne, and A. Krogh. A hidden Markov model for predicting transmembrane helices in protein sequences. In Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology, pages 175-82, 1998.
27. G. E. Tusnády and I. Simon. Principles governing amino acid composition of integral membrane proteins: application to topology prediction. Journal of Molecular Biology, 283(2):489-506, 1998.
28. G. E. Tusnády and I. Simon. The HMMTOP transmembrane topology prediction server. Bioinformatics, 17(9):849-850, 2001.
29. G. von Heijne. Membrane protein structure prediction. Hydrophobicity analysis and the positive-inside rule. Journal of Molecular Biology, 225(2):487-494, 1992.
30. Gunnar von Heijne. Formation of transmembrane helices in vivo - is hydrophobicity all that matters? The Journal of General Physiology, 129(5):353-356, 2007.
31. E. Wallin and G. von Heijne. Genome-wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms. Protein Sci, 7(4):1029-38, 1998.
32. H. Zhou and Y. Zhou. Predicting the topology of transmembrane helical proteins using mean burial propensity and a hidden-Markov-model-based method. Protein Sci, 12:1547-1555, 2003.