Quantification of the differences between quenched and annealed averaging for RNA secondary structures

Tsunglin Liu and Ralf Bundschuh
Department of Physics, Ohio State University, 191 W Woodruff Av., Columbus OH 43210-1117

The analytical study of disordered systems is usually difficult due to the necessity to perform a quenched average over the disorder. Thus, one may resort to the easier annealed ensemble as an approximation to the quenched system. In the study of RNA secondary structures, we explicitly quantify the deviation of this approximation from the quenched ensemble by looking at the correlations between neighboring bases. This quantified deviation then allows us to propose a constrained annealed ensemble which predicts physical quantities much closer to the results of the quenched ensemble without becoming technically intractable.

PACS numbers: 87.14.Gg, 87.15.-v, 05.70.Fh

I. INTRODUCTION

Heteropolymer folding is of crucial significance in molecular biology. It is the basis for the mechanism with which cells can produce three-dimensional building blocks out of the one-dimensional information stored in their genome. Cells achieve this by forming (still one-dimensional) polymers (proteins and RNA) by stringing together different monomers with covalent bonds. All monomers share a compatible backbone, but they have different side chains and occur in a predefined order along the sequence. Physical interactions between these monomers force the polymer to stably fold into a three-dimensional structure. This structure is crucial for the function of the molecule; it is determined by the specific sequence of the polymer [1-4].

In addition to its biological relevance, heteropolymer folding is also a very interesting problem of statistical mechanics [5-17]. The competition between the configurational entropy of the polymer, the overall tendency of the monomers to stick to each other, the sequence disorder, and the preference of folding toward a biologically active native state leads to a very rich thermodynamic phase diagram. While the same qualitative behavior is expected for proteins and RNA, we will here concentrate on RNA since RNA folding is more amenable to analytical and numerical approaches than protein folding. The relative simplicity of the RNA folding problem compared to the protein folding problem does not stem from the fact that RNA consists of only four different bases versus the twenty amino acids the proteins are composed of. It comes from the simpler interaction rules: the dominant interaction between the four bases A, U, G, and C of an RNA molecule is Watson-Crick (G-C and A-U) pair formation, i.e., if two bases have formed a pair they to first order do not take part in any further interactions. Every amino acid of a protein, on the contrary, interacts with all its spatial neighbors, i.e., with on the order of ten other amino acids at a time.

From a statistical physics point of view, the possibility of a glass phase at low temperatures driven by sequence disorder is of special interest in the heteropolymer folding problem [6-12]. Unfortunately, even for the case of RNA folding an analytic quantitative description of the glass phase is still outstanding. Thus, quantitative studies have to either rely on numerics or use what is known as the annealed average. In the annealed average, the free energy of the system is approximated as the logarithm of the ensemble averaged partition function (instead of taking the ensemble average over the logarithm of the partition function, called the quenched average). Physically, this approximation corresponds to treating the sequence degrees of freedom as dynamical instead of frozen variables. Thus, the annealed system represents a sequence ensemble that is coupled to the structural ensemble by way of the interaction energies. This sequence ensemble may be different from the original sequence ensemble of uncorrelated random sequences over which the free energy is supposed to be averaged. Due to these differences between the annealed and the quenched sequence ensemble, the annealed free energy is only an approximation to the true (quenched) free energy of a disordered system.

The purpose of this manuscript is to first quantify the differences between the annealed and the quenched sequence ensembles. Specifically, we will look at the correlation between neighboring bases. We show that while this correlation is strictly zero in the correct (quenched) sequence ensemble, it is non-zero in the annealed sequence ensemble and increases with decreasing temperature, up to complete correlation in certain models of RNA folding. This clearly underlines and quantifies the fundamental shortcomings of the annealed average in the RNA folding problem at low temperatures.

Based on the quantified non-zero nearest neighbor correlations, we then try to diminish the differences between the annealed and quenched ensembles by forcing the annealed ensemble to present zero neighboring correlation. This constrained annealed ensemble behaves much more similarly to the quenched ensemble than the annealed ensemble does. Although the glass phase itself cannot be identified using the constrained annealed ensemble, which only partially corrects the overall non-random correlations, one can obtain thermodynamic quantities which are much closer to the quenched results than the annealed ones using this method.

This paper is organized as follows: In Sec. II, we briefly review the RNA secondary structure and introduce the general RNA folding problem with sequence disorder. In Sec. III, we quantify the deviation of nearest neighbor correlations of the annealed ensemble. Finally, we improve the pure annealed ensemble by applying a constraint of random correlations in Sec. IV.

II. RNA FOLDING PROBLEM WITH SEQUENCE DISORDER

A. RNA secondary structures

RNA is a single-stranded biopolymer of four different bases A, U, C, and G. The strand can fold back onto itself and form helices consisting of stacks of stable Watson-Crick pairs (A with U or G with C). This comparatively simple interaction scheme makes the RNA folding problem very amenable to theoretical approaches without losing the overall flavor of the general biopolymer folding problem [5].

An RNA secondary structure $S$ is characterized by its set of Watson-Crick base pairs $(i,j)$, where $i$ and $j$ denote the $i$th and $j$th base of the RNA polymer respectively (conventionally $i < j$). Here, we follow many previous studies [5-17] and apply the reasonable approximation to exclude so-called pseudoknots [18], i.e., for two Watson-Crick pairs $(i,j) \in S$ and $(k,l) \in S$, configurations with $i < k < j < l$ are not allowed. This approximation is justified because short pseudoknots do not contribute much to the overall energy and long pseudoknots are kinetically difficult to form.

B. Quenched averaging

The properties of RNA folding, especially the possibility of a glass phase driven by the sequence disorder, have been a challenging problem from the statistical physics point of view. To understand the statistics of this disordered system, one first has to assign an energy $E(\sigma,S)$ to every secondary structure $S$ for a given sequence $\sigma$. This could, e.g., simply be the negative of the total number of Watson-Crick base pairs. This then allows us to calculate the partition function

$$Z(\sigma) = \sum_S \chi(\sigma,S)\, e^{-E(\sigma,S)/T} \tag{1}$$

for a given sequence $\sigma$, where $\chi(\sigma,S)$ is one when the secondary structure $S$ is compatible with the sequence $\sigma$ and zero otherwise. Finally, one has to calculate the quenched average

$$F_q = -k_B T\, \langle \ln Z(\sigma) \rangle \tag{2}$$

over all sequences.

C. Annealed averaging

Unfortunately, the quenched free energy $F_q$ is very difficult to calculate. Thus, one can try to approximate the quenched free energy by the much more easily computed annealed free energy, which treats the disordered sequences as dynamic variables. This annealed free energy is only a lower bound of the quenched free energy,

$$F_a = -k_B T \ln \langle Z(\sigma) \rangle \le F_q. \tag{3}$$

It can be quite different from the quenched free energy since the annealed ensemble favors those sequences where more binding pairs are allowed. More importantly, physical quantities derived from this annealed free energy can be very different from their quenched counterparts, as we will show explicitly in the following sections. To be specific, we will measure the correlations between neighboring bases, which are known to vanish in the quenched case.

D. Energy models

In this paper, we study the simplest model of disordered RNA sequences, which contain only the two bases A and U. In assigning free energies to secondary structures, we neglect any loop entropies and focus on the base pairs alone. Besides, for most parts of this manuscript, we do not consider the minimal hairpin length constraint, which requires the two bases of a binding pair to be separated by at least three bases in a real RNA molecule. Within these approximations we consider two different energy models.

In the binding energy model, we simply assign an energy $\epsilon = -1$ to each AU (or UA) binding pair. We denote the corresponding Boltzmann factor by $q = e^{1/T}$. This model captures the main features of the energetics and is simple enough for analytical and numerical studies.

We also study a somewhat more realistic energy model, namely the stacking energy model. In this model, we assign energies to the stacking of two base pairs rather than to individual base pairs. This stacking energy depends in reality on the identities of all four bases involved. We implement this effect by associating a Boltzmann factor $s_1$ with stackings of types AA/UU and UU/AA while associating a different Boltzmann factor $s_2$ with stackings of types AU/UA and UA/AU. To be specific, we will choose these Boltzmann factors as $s_1 = e^{2/T}$ and $s_2 = e^{1/T}$ for the remainder of this communication.

The main reason to study the stacking energy model is that the simple binding energy model is known to be pathological, without a glass phase at low temperature in the disordered sequence ensemble [7-9]. A simple reason is that, whatever the sequence, each base A can always find another base U to pair with, provided we have the same number of bases A and U. Thus, sequence disorder does not cause frustration. In contrast, the energy distribution of the stacking energy model is greatly affected by the sequence, and a structure in which all base pairs are stacked can in general not be found for every sequence. Thus, sequence disorder is expected to cause frustration, and a glass phase is expected in this energy model for low enough temperature.

III. NEAREST NEIGHBOR CORRELATIONS OF THE ANNEALED ENSEMBLE

In this section, we calculate quantitatively how the nearest neighbor correlations in the annealed ensemble deviate from their true values in the random sequence ensemble. To this end, we have to calculate the annealed partition function for sequences of length $N-1$, which is defined as

$$Z_a(N) = \frac{1}{2^{N-1}} \sum_S \left( \sum_\sigma \chi(\sigma,S)\, e^{-E(\sigma,S)/T} \right). \tag{4}$$

FIG. 1: Recursive relation exploring all possible secondary structures for a homogeneous sequence of length N. The wavy lines stand for the contribution from all possible structures and sequences. The straight line stands for non-paired bases.

For the binding energy model, this annealed partition function can be easily obtained via the recursive relation shown in Fig. 1 along the lines of previous studies [10, 19-22] but taking the sequences into account explicitly. The idea is to separate the two cases for the last base, which is either unbound or bound to a certain base $k$, and then relate the partition function to those of shorter lengths as

$$Z_a(N+1,q) = Z_a(N,q) + \frac{q}{2} \sum_{k=1}^{N-1} Z_a(k,q)\, Z_a(N-k,q). \tag{5}$$

With this relation, one can obtain an analytical formula for the annealed partition function in the large $N$ limit by performing the z-transform, which is defined as

$$\hat{Z}_a(z,q) = \sum_{N=1}^{\infty} Z_a(N,q)\, z^{-N}, \tag{6}$$

on the recursive relation. After solving the resulting quadratic equation for $\hat{Z}_a(z,q)$, we can obtain the partition function by doing the inverse z-transform,

$$Z_a(N,q) = \frac{1}{2\pi i} \oint \hat{Z}_a(z,q)\, z^{N-1}\, dz. \tag{7}$$

This approach can be easily generalized to the stacking energy model.

In order to keep track of the correlations introduced by the annealed ensemble, we assign an additional Boltzmann factor $L$ to all AA and UU neighbors within the sequence. The modified annealed partition function is then

$$Z_a(N,q,L) = \frac{1}{2^{N-1}} \sum_S \left( \sum_\sigma \chi(\sigma,S)\, q^{n_q(S)} L^{n_L(\sigma)} \right), \tag{8}$$

where $n_q(S)$ is the number of binding pairs in a secondary structure $S$, and $n_L(\sigma)$ is the number of conjugate neighbors, i.e., AA and UU neighbors in the sequence $\sigma$.

The additional Boltzmann factor complicates the calculation of the partition function since the different bases A and U contribute differently. However, we can still formulate recursive relations by noticing that the two end bases of a sequence piece determine the correlations with other pieces. Thus, we can separate a sequence into two cases where the end bases are either of the same type or not, and formulate the recursive relation for each case independently. The annealed partition function $Z_a(N,q,L)$ is then obtained via z-transform as before. Since the formulation of the recursive relations is quite technical, we only address the result here and defer the details to Appendix A.

From the partition function we can obtain the nearest neighbor correlations by looking at the deviation of the averaged fraction of AU (or UA) neighbors from the expected value 1/2 in the disordered sequence ensemble. This deviation is obtained by taking the derivative as

$$\delta = \frac{1}{2} - \frac{1}{N}\, L\, \partial_L \ln\!\big(Z_a(N,q,L)\big)\Big|_{L=1}. \tag{9}$$

A. Binding energy model

Fig. 2 shows the neighbor correlations for the binding energy model. We find that the deviation moves further away from zero as temperature decreases. This is a direct result of the fact that at low temperature, the main contributions to the annealed partition function come from those sequences which allow a lot of binding pairs, unlike the quenched case where sequences are equally weighted.

FIG. 2: Deviation $\delta$ of the fraction of AU (or UA) nearest neighbors. The deviation is plotted as a function of temperature $T$ in units of the binding energy for the binding energy model. Notice that the deviation moves further away from zero and stops at a fixed constant as temperature decreases. It also approaches a limit larger than zero at high temperature, indicated by the dashed line.

The exact way that the neighbor correlations are biased can be understood as follows. In this binding energy model, the only thing that biases the nearest neighbor correlations is the formation of minimal hairpins, since they force the neighboring bases to be different, i.e., either AU or UA. Thus, the degree of bias is directly coupled to the fraction of smallest hairpins in a sequence.

This assertion can be verified by studying the fraction of minimal hairpins. As an example, we study the zero temperature case where all the bases are expected to be paired. Among all possible pairing structures, we explicitly calculate the fraction of smallest hairpins (with the details shown in Appendix B). As a result, every fourth base is part of a minimal size hairpin. Thus, we have $1/4$ AU (or UA) nearest neighbors from these hairpins and another $1/2 \times 3/4 = 3/8$ from the rest of the bases, since they do not show nearest neighbor correlation bias. The deviation of the fraction of AU (or UA) neighbors is then expected to be $5/8 - 1/2 = 1/8$, which matches exactly the zero temperature limit in Fig. 2. In this case, the sequence, as a dynamic variable, adjusts itself to all the binding pairs.

Even in the high temperature limit, although all allowed sequences are equally weighted, there still exists a finite fraction of minimal size hairpins on average. As a result, the deviation of neighbor correlations approaches a constant larger than zero. The assertion that the deviation is coupled to the formation of minimal size hairpins is again verified when we additionally require all hairpins to be of length larger than one. In this case, the correlation between nearest neighbors becomes random at all temperatures. However, the second nearest neighbor correlations become non-trivial.

This simple binding energy model gives us a taste of how the nearest neighbor correlations are coupled with the energy through the structure, i.e., the formation of minimal hairpins. This correlation is biased since the annealed ensemble puts more weight on lower energy sequences.

B. Stacking energy model

Following the same approach, we check the same deviation as a function of temperature in the more realistic stacking energy model. Again, only the result is quoted here in Fig. 3 (interested readers can check the detailed calculations in Appendix C).

FIG. 3: Deviation $\delta$ of the fraction of AU (or UA) nearest neighbors, as a function of temperature $T$, for the energy model involving stacking energies. Unlike in the case of the binding energy model, the AU (or UA) neighbor correlations are completely biased at zero temperature in the stacking energy model. At high temperature, this deviation approaches the same limit as the binding energy model.

Unlike in the binding energy model, at zero temperature the nearest neighbor correlations of the stacking energy model are completely biased. Almost no AU (or UA) neighbors can be found in this annealed system. This can be understood since at zero temperature, the only dominating structure is a long stem in which all stacking loops are of type $s_1$. Thus, the only two important sequences are the ones made of half consecutive A bases and the other half of U bases.

To verify this structure, we additionally introduce another Boltzmann factor $h$ for each hairpin loop formation. With this Boltzmann factor we can keep track of the fraction of hairpins $f_h$ in the annealed system by calculating

$$f_h = \frac{1}{N}\, h\, \partial_h \ln\!\big(Z_a(N,s_1,s_2,h,L=1)\big)\Big|_{h=1}. \tag{10}$$

From Fig. 4, we do see that the fraction of hairpins of this annealed system indeed goes to zero as temperature goes to zero, which is a feature of the long stem structure.

FIG. 4: Fraction of hairpins $f_h$ as a function of temperature $T$ in the stacking energy model for three different ensembles (annealed, constrained annealed, and quenched).

At high temperature, however, the energy model does not matter since entropy dominates. Thus, the AU (or UA) fraction approaches the same limit as in the binding energy model.

From this stacking energy model, we learn that the more strongly the energy is coupled to the nearest neighbor correlations, the larger the deviation in nearest neighbor correlations of the annealed system will be at low temperature.

IV. CONSTRAINED ANNEALING

So far we have only observed the sequence correlations artificially introduced through the annealed ensemble. However, our approach can in fact be used to generate more realistic ensembles within the annealed framework. The idea is to force the nearest neighbor correlations to be random when performing the annealed average [23, 24].

We simply enforce this random disorder constraint, i.e., the fraction of AU (or UA) neighbors being one half, by setting the Boltzmann factor $L$, which controls the nearest neighbor correlations, to whatever value it needs to have for the correlations of the annealed ensemble to vanish. This constrained annealing turns out to predict thermodynamic quantities much closer to the quenched results, and it can be done immediately following our quantified deviations in disorder.

A. Binding energy model

The constraint for the binding energy model reads

$$\frac{1}{N}\, L\, \partial_L \ln\!\big(Z_a(N,q,L)\big)\Big|_{L=L_c} = \frac{1}{2}. \tag{11}$$

In this energy model, sequences with more AU (or UA) nearest neighbors have to be suppressed, since the unconstrained annealed system favors those neighbors. As a result, $L_c$, which favors AA (or UU) neighbors, is expected to be larger than one in order to meet the constraint. Furthermore, $L_c$ should be larger at lower temperatures since the neighbor correlation is more biased at lower temperatures.

One important note is that the resulting free energy is only defined up to an additive constant, i.e., adding a constant background potential does not change the result at all. Thus, the absolute value of this constrained annealed free energy as well as the Boltzmann factor $L_c$ has no real meaning. For example, one could assign the Boltzmann factor $L$ to AU (or UA) neighbors instead of AA (or UU) neighbors. The resulting chemical potential would then change sign and the free energy would differ by a constant amount. However, the thermodynamic quantities, which are calculated by taking derivatives of the constrained free energy, will not see this constant and are expected to be closer to the quenched result.

To verify this assertion, we compute the average fraction of binding pairs for the binding energy model via $\frac{q}{N}\, \partial_q \ln\!\big(Z_a(N,q,L)\big)$ as a function of temperature. Then, we compare the cases of the annealed ($L = 1$), the constrained annealed ($L = L_c$), and the quenched ensembles.

As to the quenched result, we numerically calculate the partition function for random sequences of length 1280 and collect the data from 1000 random sequences. In order to avoid the trivial finite size effects due to fluctuation of the fraction of A bases away from its expected value 1/2, we only choose sequences that contain exactly 640 A's and 640 U's. The result is shown in Fig. 5. The statistical errors of the quenched results are always smaller than the size of the corresponding symbol, such that within the error bars the quenched results never overlap other curves. This condition holds for all other quenched results in this manuscript.

FIG. 5: Fraction of binding pairs $f_q$ as a function of temperature $T$ in the binding energy model (annealed, constrained annealed, and quenched ensembles). The constraint of random nearest neighbors brings the annealed quantity closer to the quenched numerical estimate. The statistical errors of the quenched results are always smaller than the size of the symbol.

We see that the constrained annealed result is indeed very close to the quenched numerical estimate. However, all three results are rather close to each other anyway. The reason for these three cases being so close to each other is simply that under this energy model the system is not glassy, and every base is able to find another base for pairing. Thus, at zero temperature, all the bases are paired in all three systems. The fact that the nearest neighbor correlations are not biased a lot can also be verified: at $T = 0.1$, we find $L_c$ to be just 1.59. Thus, the chemical potential introduced by the constraint is comparatively small and does not affect the result too much.
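As a rough illustration of the comparison between the plain annealed ($L = 1$) and quenched ensembles in the binding energy model, the following Python sketch iterates the recursion of Eq. (5) for the annealed partition function and uses a standard dynamic-programming recursion for the partition function of a single A/U sequence. It is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, the initial condition $Z_a(1) = 1$ (empty sequence) is assumed, the derivative with respect to $q$ is taken by a finite difference, and the sequence length and sample size are far smaller than the 1280 bases and 1000 sequences quoted in the text. The constrained case ($L = L_c$) is not covered.

```python
import math
import random


def annealed_Z(n_bases, q):
    """Annealed partition function via Eq. (5); Za[N] refers to length N-1."""
    Za = [0.0, 1.0]  # Za[0] is unused padding; Za[1] = 1 is an assumed initial condition
    for N in range(1, n_bases + 1):
        paired = sum(Za[k] * Za[N - k] for k in range(1, N))
        Za.append(Za[N] + 0.5 * q * paired)
    return Za[n_bases + 1]


def quenched_Z(seq, q):
    """Partition function of one A/U sequence: factor q per AU pair,
    no pseudoknots, no minimal hairpin length."""
    n = len(seq)
    # Z[i][j] is the partition function of the half-open piece seq[i:j]
    Z = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            total = Z[i][j - 1]                # last base of the piece unpaired
            for k in range(i, j - 1):          # last base paired with base k
                if seq[k] != seq[j - 1]:       # only A-U or U-A may pair
                    total += q * Z[i][k] * Z[k + 1][j - 1]
            Z[i][j] = total
    return Z[0][n]


def pair_fraction(ln_Z, q, n_bases, eps=1e-4):
    """(q/N) d ln Z / dq, estimated by a central finite difference."""
    return (ln_Z(q * (1 + eps)) - ln_Z(q * (1 - eps))) / (2 * eps * n_bases)


def balanced_random_sequence(n_bases, rng):
    """Random sequence with exactly n_bases/2 A's and n_bases/2 U's."""
    bases = ["A"] * (n_bases // 2) + ["U"] * (n_bases // 2)
    rng.shuffle(bases)
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(0)
    n, T = 40, 0.5                             # illustrative values only
    q = math.exp(1.0 / T)
    seqs = [balanced_random_sequence(n, rng) for _ in range(20)]
    f_quenched = sum(
        pair_fraction(lambda x, s=s: math.log(quenched_Z(s, x)), q, n)
        for s in seqs
    ) / len(seqs)
    f_annealed = pair_fraction(lambda x: math.log(annealed_Z(n, x)), q, n)
    print(f"T = {T}: annealed f_q = {f_annealed:.4f}, quenched f_q = {f_quenched:.4f}")
```

For short sequences, the two routines can be cross-checked against each other: averaging `quenched_Z` over all $2^n$ sequences of length $n$ should reproduce `annealed_Z(n, q)`, because each base pair is compatible with exactly half of the possible base assignments.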
B. Stacking energy model

The situation for the stacking energy model is very different from that of the binding energy model. Here, we follow the same approach and compute the averaged fraction of stacking loops of type AA/UU (or UU/AA) and AU/UA (or UA/AU) as a function of temperature under the constraint

$$\frac{1}{N}\, L\, \partial_L \ln\!\big(Z_a(N,s_1,s_2,h=1,L)\big)\Big|_{L=L_c} = \frac{1}{2}. \tag{12}$$

Similarly, in order to avoid the trivial finite size effects for the quenched numerical estimate, we fix the number of AA, AU, UA, and UU neighbors in the randomly chosen sequences to be 320 each [25].

FIG. 6: Fraction of stacking loops $f_{s_1}$ of type AA/UU (or UU/AA) as a function of temperature $T$ in the stacking energy model (curves: annealed, constrained annealed, 2 constrained annealed, quenched). The constraint of random nearest neighbors fixes this quantity much better than the averaged number of pairs in the binding energy model. The phenomenological constraint, i.e., a fixed fraction of hairpins, brings this quantity only a bit closer to the quenched result.

FIG. 7: Fraction of stacking loops $f_{s_2}$ of type AU/UA (or UA/AU) as a function of temperature $T$ in the stacking energy model (curves: annealed, constrained annealed, 2 constrained annealed, quenched). Again, the constraint of random nearest neighbors greatly improves the result. However, unlike the case in Fig. 6, the constraint of a fixed fraction of hairpins also contributes in bringing the annealed quantity closer to the quenched result.

From Figs. 6 and 7, we see that the constrained annealed results are greatly improved over the plain annealed results. This verifies the idea that larger deviations from the random disorder result in a better correction via the constraint of the random disorder. For this stacking energy model, at $T = 0.1$, $L_c = 0.0067$ is much more different from 1 than in the binding energy model.

From these results, we can see that the constrained annealed ensemble of the stacking energy model behaves in the following way. Since the ensemble is forced to have the same number of AA (or UU) and AU (or UA) neighbors, at zero temperature the dominating structure is still a long stem structure, but with half the stacking loops being of type $s_1$ and the other half being of type $s_2$. This is consistent with the fact that the fraction of hairpins goes to zero as temperature goes to zero for the constrained annealed system, as shown in Fig. 4.

One difference between the quenched ensemble and the constrained annealed ensemble is that not all the bases of a random sequence can form stacking loops. Thus, we have a finite fraction of hairpins in the quenched ensemble (Fig. 4). This difference can be used as an additional phenomenological constraint to improve the constrained annealed system even further.

We apply this additional phenomenological constraint by requiring the fraction of hairpins $f_h$ to fit the quenched numerical estimates and neighboring bases to be uncorrelated at the same time, i.e., we enforce

$$\frac{L}{N}\, \partial_L \ln\!\big(Z_a(N,s_1,s_2,h,L)\big)\Big|_{h=h_c,\,L=L_c} = \frac{1}{2}, \tag{13}$$

$$\frac{h}{N}\, \partial_h \ln\!\big(Z_a(N,s_1,s_2,h,L)\big)\Big|_{h=h_c,\,L=L_c} = f_h(T), \tag{14}$$

where $f_h(T)$ is the quenched numerical estimate in this equation.

From Figs. 6 and 7, we see that this additional constraint slightly improves the fraction of stacking loops $s_1$, but significantly improves the fraction of stacking loops $s_2$. This can be understood since the existence of hairpins introduces AU (or UA) neighbors: if the fraction of AU (or UA) neighbors is also required to be one half, it will decrease the fraction of stacking loops $s_2$ among the stem structures.

V. CONCLUSION

We conclude that the deviation of the annealed ensemble from the quenched ensemble is strongly related to the energy model and can be completely biased when the correlation is strongly coupled to the energy of the system. Quantifying this deviation allows us to do constrained annealing, which brings the predictions of thermodynamic quantities much closer to the real values in the quenched ensemble. As the deviation is larger, the constraint is stronger and thus brings the annealed ensemble even closer to the quenched results. Unfortunately, the biasing toward the quenched ensemble is not strong enough to actually drive the system into the glass transition.

Besides the nearest neighbor correlations, one could also consider the correlations for next nearest neighbors or even two bases separated by arbitrary distances. In principle, all these correlations together would bring us to the exact quenched results and thus to the glass transition. However, the calculations become much more cumbersome as one increases the distance between the two bases, and are left for future work.

VI. APPENDIX

A. Annealed partition function for the binding energy model

The annealed partition function is obtained by first summing over all compatible sequences given a secondary structure $S$ and then summing over all possible structures $S$, which can be done via the recursive relation in Fig. 1. We define the annealed partition function for a sequence of length $N$ as $Z_a(N+1)$. In addition, the annealed partition function for a sequence of length $N$ with its two end bases paired is defined as $A_e(N-1)$. The recursive relation in Fig. 1 then reads

$$Z_a(N+1) = \frac{1+L}{2}\, Z_a(N) + \sum_{k=1}^{N-1} \frac{1+L}{4}\, Z_a(k-1)\, A_e(N-k). \tag{15}$$

The factor $(1+L)/2$ for the first term on the right hand side comes from the contribution in nearest neighbor correlations between the free base $N$ and base $N-1$, and the 2 takes care of averaging over the number of sequences. We have a similar factor in the second term coming from the correlation between base $k$ of the arch and base $k-1$. In the later part we will show that the behavior of the annealed partition function is mainly determined by the arch term $A_e$, so we will only look at this quantity here. The first base of $A_e$ is also specified to be A and the last base to be the conjugate base U.

FIG. 8: Recursive relation for the annealed partition function over heterogeneous sequences where the first and the last bases form a conjugate pair. A letter 'E' is used to denote that the two bases at the ends of the arch are conjugate bases.

Again, the annealed partition function for the arch can be obtained through a similar recursive relation (Fig. 8). The two terms on the right hand side are further decomposed in Figs. 9 and 10.

FIG. 9: Decomposition of an arch with its last base inside being a free base, which can be either A or U, into two cases. The letter 'O' is used to denote that the two bases at the ends of the arch are non-conjugate.

FIG. 10: Separation of arches.

In these relations, we need to keep track of two factors: the energy contributions and the nearest neighbor correlations. From the energetic point of view, an arch can be thought of as simply contributing a Boltzmann factor $q$ and need not stand for a real binding pair, even though initially it is used to represent a real binding pair. Thus, in Fig. 9, as we try to relate the annealed partition function to its shorter length ones, we assume an effective binding pair between bases 1 and $N-1$ simply to conserve the energy contribution. In this case, the two bases are not really paired.

In order to keep track of the correct nearest neighbor correlations, we use a letter E on an arch to denote that the two bases at the ends of the arch are conjugate bases. Similarly, a letter O is used to represent two non-conjugate bases at the ends of the arch. Thus, in Fig. 9, the two cases where base $N-1$ is either A or U are separated and are denoted by the letters E and O, according to whether the bases 1 and $N-1$ are conjugate or not. These notations enable us to connect the decomposed terms recursively back to the relation in Fig. 8.

In Fig. 10, an inner arch can be treated as a free base in considering the energy and correlations for the rest of the bases outside the inner arch. However, there is a difference in counting neighbor correlations for this treatment because the free base looks like a base A from the right, but like a base U from the left. The correct correlations can be obtained if we shift this discrepancy to the last base and flip it from U to A. Thus, the last term carries a letter O on the arch instead of E.

These recursive relations then read

$$A_e(N-1) = \frac{L}{2}\, A_e(N-2) + \frac{1}{2}\, A_o(N-2) + \frac{1}{4} \sum_{k=2}^{N-2} A_e(N-k-1) \big[ L\, A_o(k-1) + A_e(k-1) \big], \tag{16}$$

$$A_o(N-1) = \frac{L}{2}\, A_o(N-2) + \frac{1}{2}\, A_e(N-2) + \frac{1}{4} \sum_{k=2}^{N-2} A_e(N-k-1) \big[ L\, A_e(k-1) + A_o(k-1) \big]. \tag{17}$$

Together with the initial conditions $A_e(1) = q$, $A_e(2) = qL$, $A_o(1) = qL$, and $A_o(2) = q(1+L^2)/2$, one can solve for $A_e(N)$ by performing the z-transforms

$$\hat{A}_e(z) = \sum_{N=1}^{\infty} A_e(N)\, z^{-N}, \tag{18}$$

$$\hat{A}_o(z) = \sum_{N=1}^{\infty} A_o(N)\, z^{-N}, \tag{19}$$

on the recursive relations. After solving for $\hat{A}_e(z)$, $A_e(N)$ can be obtained through the inverse transform

$$A_e(N) = \frac{1}{2\pi i} \oint \hat{A}_e(z)\, z^{N-1}\, dz. \tag{20}$$

From previous studies [10], we know that in the thermodynamic limit the partition function has the analytical form $A_e(N) \propto N^{-3/2}\, z_c(q,L)^N$, where $z_c(q,L)$ is the greatest real part among the branch points obtained from the solution of $\hat{A}_e(z)$. Similarly, if we perform the z-transform on Eq. (15), we can relate the z-transform of the annealed partition function $\hat{Z}_a(z)$ to that of the arch $\hat{A}_e(z)$. Since these two share the same branch points, the asymptotic behavior of the annealed partition function differs from the above formula for the arch only by a different prefactor, which does not play a role in the thermodynamic limit.

The fraction of AA (or UU) neighboring bases per base of the annealed system is then easily calculated as $L\, \partial_L \ln\!\big(z_c(q,L)\big)\big|_{L=1}$. Unfortunately, the analytical solution of this set of polynomial equations is too cumbersome to convey any useful information. Thus, we resort to numerical evaluation of this analytical solution in this paper.

B. Fraction of minimal hairpins at zero temperature

As discussed in the main text, the fraction of minimal size hairpins can be easily obtained once we figure out the partition function. At zero temperature, the partition function is simpler than the finite temperature one since we only need to consider the ground states where all bases are paired. This partition function is obtained through a recursive relation in a similar way, as shown in Fig. 11.

FIG. 11: Recursive relation for the partition function where all the bases are paired.

We define the partition function for a sequence of length $2(N-1)$ as $Z_m(N,h)$, where $h$ is the Boltzmann factor for a minimal size hairpin. The recursive relation then reads

$$Z_m(N+1) = \sum_{k=1}^{N-1} Z_m(k)\, Z_m(N-k+1) + h\, Z_m(N). \tag{21}$$

Together with the initial conditions $Z_m(1) = 1$ and $Z_m(2) = h$, one can obtain the asymptotic behavior through the z-transform. After simple algebra, we have the largest pole $z_c(h) = h + 2\sqrt{h} + 1$ for the z-transform of the partition function $\hat{Z}_m(z,h)$. The partition function $Z_m(N)$ is then proportional to $z_c(h)^N$. The fraction of minimal size hairpins per two bases is then easily calculated as

$$\partial_h \ln z_c(h)\Big|_{h=1} = \frac{1 + 1/\sqrt{h}}{h + 2\sqrt{h} + 1}\bigg|_{h=1} = \frac{1}{2}. \tag{22}$$

Thus, the fraction of minimal size hairpins per base is 1/4.

C. Annealed partition function for the stacking energy model

The calculation for the stacking energy model follows the same approach. However, it is a bit more complicated since we need to keep track of stacking loops involving four bases, which leads us to the recursive relation depicted in Fig. 12.

FIG. 12: Recursive relation for the stacking energy model.

In these recursive relations, we use an additional letter S on the arch to denote the fact that we consider the stacking energy of the stacking loop formed partly by that binding pair. Independent of the type of the arch, all the stacking energies inside the arches are still considered in all cases. Thus, the first term on the right hand side in Fig. 12 does not contain an S because its base $N-1$ is unbound, and no stacking loop can be formed with the binding pair of the arch.

FIG. 13: Decomposition of the annealed partition function whose last base inside the arch is a free base.

Similar to the recursive relation in the previous section, we then discard the last base as a free base, as shown in Fig. 13. Again, the arches on the right hand side are meant to preserve the energy contributions only. In the second line of the relation, we further decompose the terms in order to relate these terms to the first recursive relation in Fig. 12.

FIG. 14: Separation of arches considering the hairpin contribution.

In Fig. 14, we also separate the contributions of the inner arch from the rest as in Fig. 10. One difference is that we consider the contribution from the hairpin loops in this case. Thus, the hairpin loop contained in the outer arch term is not a real hairpin loop because of the existence of the inner arch. The correct result is obtained by adding the last term in the relation.

In this stacking energy model, we denote the annealed partition function for an arch of length $N-1$ as $A_{es}(N)$. The recursive relations then read as follows:

$$
\begin{aligned}
A_{es}(N-1) ={}& \frac{L}{2}\left[ A_{es}(N-2) - (s_1-1)\frac{1+L^2}{4} A_{es}(N-4) \right] + \frac{1}{2}\left[ A_{os}(N-2) - (s_2-1)\frac{2L}{4} A_{es}(N-4) \right] + \frac{s_1 L^2 + s_2}{4} A_{es}(N-3) \\
&+ \frac{1}{4} \sum_{k=3}^{N-2} A_{es}(N-k-1) \left\{ L\left[ A_{os}(k-1) - (s_1-1)\frac{2L}{4} A_{es}(k-3) \right] + \left[ A_{es}(k-1) - (s_2-1)\frac{1+L^2}{4} A_{es}(k-3) \right] + 2\left(\frac{1}{h}-1\right) H_o(k) \right\},
\end{aligned}
\tag{23}
$$

$$
\begin{aligned}
A_{os}(N-1) ={}& \frac{L}{2}\left[ A_{os}(N-2) - (s_1-1)\frac{2L}{4} A_{es}(N-4) \right] + \frac{1}{2}\left[ A_{es}(N-2) - (s_2-1)\frac{1+L^2}{4} A_{es}(N-4) \right] + \frac{s_1 L + s_2 L}{4} A_{es}(N-3) \\
&+ \frac{1}{4} \sum_{k=3}^{N-2} A_{es}(N-k-1) \left\{ L\left[ A_{es}(k-1) - (s_1-1)\frac{1+L^2}{4} A_{es}(k-3) \right] + \left[ A_{os}(k-1) - (s_2-1)\frac{2L}{4} A_{es}(k-3) \right] + 2\left(\frac{1}{h}-1\right) H_e(k) \right\},
\end{aligned}
\tag{24}
$$

where the terms $H_e$ and $H_o$ stand for the contribution from a hairpin. They are obtained separately from a recursive relation similar to the one in Fig. 9, by just replacing the wavy line by a straight line, which means that the bases are not bound. One can then easily formulate the recursive relations for $H_e$ and $H_o$. Together with the initial conditions $A_{es}(1) = h$, $A_{es}(2) = hL$, $A_{es}(3) = h(1 + 3L^2 + s_2 + s_1 L^2)/4$, $A_{os}(1) = hL$, $A_{os}(2) = h(1+L^2)/2$, and $A_{os}(3) = h(3L + L^3 + s_1 L + s_2 L)/4$, we can perform the z-transform to obtain the asymptotic behavior of the annealed partition function for the stacking energy model.

[1] K. A. Dill et al., Protein Sci. 4, 561 (1995).
[2] J. N. Onuchic, Z. Luthey-Schulten, and P. G. Wolynes, Annu. Rev. Phys. Chem. 48, 545 (1997).
[3] T. Garel, H. Orland, and E. Pitard, J. Phys. I (France) 7, 1201 (1997).
[4] E. I. Shakhnovich, Curr. Opin. Struct. Biol. 7, 29 (1997).
[5] P. G. Higgs, Q. Rev. Biophys. 33, 199 (2000).
[6] P. G. Higgs, Phys. Rev. Lett. 76, 704 (1996).
[7] A. Pagnani, G. Parisi, and F. Ricci-Tersenghi, Phys. Rev. Lett. 84, 2026 (2000).
[8] A. K. Hartmann, Phys. Rev. Lett. 86, 1382 (2001).
[9] A. Pagnani, G. Parisi, and F. Ricci-Tersenghi, Phys. Rev. Lett. 86, 1383 (2001).
[10] R. Bundschuh and T. Hwa, Phys. Rev. E 65, 031903 (2002).
[11] F. Krzakala, M. Mézard, and M. Müller, Europhys. Lett. 57, 752 (2002).
[12] E. Marinari, A. Pagnani, and F. Ricci-Tersenghi, Phys. Rev. E 65, 041919 (2002).
[13] M. Müller, F. Krzakala, and M. Mézard, Eur. Phys. J. E 9, 67 (2002).
[14] H. Orland and A. Zee, Nucl. Phys. B 620, 456 (2002).
[15] R. Mukhopadhyay, E. Emberly, C. Tang, and N. S. Wingreen, Phys. Rev. E 68, 041904 (2003).
[16] M. Baiesi, E. Orlandini, and A. L. Stella, Phys. Rev. Lett. 91, 198102 (2003).
[17] P. Leoni and C. Vanderzande, Phys. Rev. E 68, 051904 (2003).
[18] I. Tinoco Jr. and C. Bustamante, J. Mol. Biol. 293, 271 (1999), and references therein.
[19] P. G. de Gennes, Biopolymers 6, 175 (1968).
[20] M. S. Waterman, Advances in Mathematics, Supplementary Studies, edited by G.-C. Rota (Academic, New York, 1978), pp. 167-212.
[21] J. S. McCaskill, Biopolymers 29, 1105 (1990).
[22] M. Zuker and D. Sankoff, Bull. Math. Biol. 46, 591 (1984).
[23] T. Morita, J. Math. Phys. 5, 1401 (1964).
[24] E. Orlandini, A. Rechnitzer, and S. G. Whittington, J. Phys. A 35, 7729 (2002).
[25] S. F. Altschul and B. W. Erickson, Mol. Biol. Evol. 2, 526 (1985).