Protein crystallography for non-crystallographers

Protein crystallography for non-crystallographers. A. Wlodawer et al. FEBS Journal (2007). Journal compilation 2007 FEBS. No claim to original US government works.

filled quite rapidly. It is now possible to download, with a few clicks of a mouse, the structure of a protein of interest and display it using a variety of graphics programs, freely available to anyone with even the simplest modern computer. Once presented as an elegant picture, the structure seems beyond suspicion as to its validity, or perhaps the validity of its interpretation by its authors. But is that always the case?

An assessment of the quality of macromolecular structures, corrected for technical difficulty, novelty, size, resolution, etc., has recently been published [5]. The authors of that study concluded that, on average, the quality of protein structures has been quite constant over the last 35 years, and there is little difference in quality between structures solved in traditional laboratories and by structural genomics (SG) efforts (if anything, the latter are slightly better, at least from some centers). However, a very clear correlation emerged between the quality of the structure and the prestige of the journal in which it was published, with structures in the most exclusive journals being, in general, of statistically lower quality (interestingly, structures published in this journal were found to be, on average, of the highest quality). Of course, the high-impact journals put a proper spin on these results, relating them to the higher complexity of the structures that they accept for publication [6]. However, as interpretation of these structures is at the forefront of structural biology, it is important that readers should be able to assess their quality independently.

The structure of the enzyme frankensteinase (appropriately named after the birthplace of one of the authors of this review, and for some other rather obvious reasons) is presented in Fig. 1A. It certainly looks quite nice, especially to a non-crystallographer, but it does have a few problems, the main one being that no such enzyme exists. However, how could a biochemist or biologist who is not trained in protein crystallography (and, these days, practically nobody is fully trained in this field) recognize this? The purpose of this review is to provide readers with hints that may help them in assessing the level of validity and detail provided by crystal structures (and, to a lesser extent, structures determined by other techniques), define several relevant terms used in crystallographic papers, and give advice on where to find red flags that could affect interpretation of such data. This is not a primer of protein crystallography for non-crystallographers, but rather the musings of four structural biologists, active in various aspects of crystallography, both technical and biological, with a combined total of over 125 years of experience, written for the benefit of those that do not want or need to learn about all the details that go into the solution and refinement of macromolecular structures, but would like to gain confidence in their [...]

How is a crystal structure determined?

Structural crystallography relies almost exclusively on the scattering of X-rays by the electrons in the molecules constituting the investigated sample. (Some other scattering methods, for example, of neutrons or electrons, although very important, are responsible for only a tiny fraction of the published macromolecular structures.) Because the highly similar structural motifs forming the individual unit cells are repeated throughout the entire volume of a crystal in a periodic fashion, it can be treated as a 3D diffraction grating. As a result, the scattering of X-radiation is enhanced enormously in selected directions and extinguished completely in others. This is governed only by the geometry (size and shape) of the crystal unit cell and the wavelength of the X-rays, which should be in the same range as the interatomic distances (chemical bonds) in molecules. However, the effectiveness of interference of the diffracted rays in each direction, and therefore the intensity of each diffracted ray, depends on the constellation of all atoms within the unit cell. In other words, the crystal structure is encoded in the diffracted X-rays: the shape and symmetry of the cell define the directions of the diffracted beams, and the locations of all atoms in the cell define their intensities. The larger the unit cell, the more diffracted beams (called reflections) can be observed. Moreover, the position of each atom in the crystal structure influences the intensities of all the reflections and, conversely, the intensity of each individual reflection depends on the positions of all atoms in the unit cell. It is, therefore, not possible to solve only a selected, small part of the crystal structure without modeling the rest of it, in contrast to other structural techniques such as NMR or extended X-ray absorption fine structure, which can describe only part of the [...]

A diffraction experiment involves measuring a large number of reflection intensities. Because crystals have certain symmetry, some reflections are expected to be equivalent and thus have identical intensity. The average number of measurements per individual, symmetrically unique reflection is called redundancy or multiplicity. Because every reflection is measured with a certain degree of error, the higher the redundancy, the more accurate the final estimation of the averaged reflection intensity. The spread of individual intensities of all symmetry-equivalent reflections, contributing to the same unique reflection, is usually judged by the $R_\mathrm{merge}$ residual (sometimes called $R_\mathrm{sym}$ or $R_\mathrm{int}$), defined later.

Each reflection is characterized by its amplitude and phase. However, only reflection amplitudes can be obtained from the measured intensities and no direct information about reflection phases is provided by the diffraction experiment. According to the well-established diffraction theory, to obtain the structure of the individual diffracting motif (in our case the distribution of electrons in the asymmetric part of the crystal unit cell), it is necessary to calculate the Fourier transformation of the so-called structure factors, or $F$ values, which represent the reflection amplitudes and phases. Several methods are used in protein crystallography to determine the phases. Typically, they lead to an initial approximate electron-density distribution in the crystal, which can be improved in an iterative fashion, eventually converging at a faithful structural model of the protein.

The primary result of an X-ray diffraction experiment is a map of electron density within the crystal.
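
As a rough illustration of the redundancy and spread statistics described above, the following Python sketch computes the average multiplicity and a merging residual from a list of measurements. It assumes the conventional unweighted definition, $R_\mathrm{merge} = \sum_h \sum_i |\langle I_h \rangle - I_{h,i}| / \sum_h \sum_i I_{h,i}$, and assumes that every measurement has already been indexed to its symmetry-unique reflection. The function name `merge_statistics` and the input layout are hypothetical and are not taken from any data-processing program.

```python
from collections import defaultdict


def merge_statistics(observations):
    """Return (multiplicity, r_merge) for a list of measurements.

    observations: iterable of (hkl, intensity) pairs, where hkl is the
    index of the symmetry-unique reflection (assumed to be already
    reduced to the asymmetric unit; this sketch does no symmetry work).
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)
    if not groups:
        raise ValueError("no observations supplied")

    n_measurements = sum(len(values) for values in groups.values())
    # Average number of measurements per unique reflection.
    multiplicity = n_measurements / len(groups)

    numerator = 0.0    # sum of |<I_h> - I_h,i| over all measurements
    denominator = 0.0  # sum of I_h,i over all measurements
    for values in groups.values():
        mean_intensity = sum(values) / len(values)
        numerator += sum(abs(mean_intensity - v) for v in values)
        denominator += sum(values)
    if denominator == 0:
        raise ValueError("sum of intensities is zero")

    return multiplicity, numerator / denominator


if __name__ == "__main__":
    # Two unique reflections measured two and three times (made-up numbers).
    data = [
        ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
        ((0, 1, 0), 50.0), ((0, 1, 0), 50.0), ((0, 1, 0), 50.0),
    ]
    multiplicity, r_merge = merge_statistics(data)
    print(f"multiplicity = {multiplicity:.2f}, R_merge = {100 * r_merge:.1f}%")
```

Because the residual is a single number summed over all reflections, it reports only overall agreement; the sketch makes it easy to see that one badly measured reflection cannot be located from the result alone.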
Fig. 1. Crystal structure of the enzyme frankensteinase. (A) A stereoview showing a tracing of the protein chain in the common rainbow colors (slowly changing from blue N-terminus to red C-terminus). Active site residues are in ball-and-stick rendering, the Mg²⁺ ion is shown as a gray ball, and water molecules as red spheres. Frankensteinase was conceived and refined with the program cited as [22] and drawn with pymol [21]. (B) Detail of the Mg²⁺ binding site. Atoms are shown in ball-and-stick rendering, with carbon atoms colored green, oxygen red, and nitrogen blue. A few problems with this structure need to be emphasized. (a) No such protein has ever existed or is likely to exist in the future. (b) The coordinates were freely taken from several real proteins, but were assembled in a way that would satisfy only M. C. Escher. (c) An 'active site' consisting of the side chains of phenylalanine, leucine, and valine is rather unlikely to have catalytic properties. (d) Identification of a metal ion that is not properly coordinated by any part of the protein is rather doubtful. (e) The distances between the ion and the coordinating atoms are shown with four decimal digit precision, vastly exceeding their accuracy. Besides, the 'bond' distances are entirely unacceptable for magnesium. PDB accession code: For obvious reasons the model of frankensteinase was not deposited in the PDB. It can be obtained upon request from the corresponding author.

This electron distribution is usually interpreted in (chemical) terms of individual atoms and molecules, but it is important to realize that the molecular model consisting of individual atoms is already an interpretation of the primary result of the diffraction experiment. Finally, the atomic model is refined by varying all model parameters to achieve the best agreement between the observed reflection amplitudes ($F_\mathrm{obs}$) and those calculated from the model ($F_\mathrm{calc}$). This agreement is judged by the residual or crystallographic R-factor, defined later. It should be stressed that both $R_\mathrm{merge}$ and the R-factor are global indicators, showing the overall agreement, respectively, between equivalent intensities or observed and calculated amplitudes, and cannot be used to pinpoint individual poorly measured reflections or local incorrectly modeled structural features.

The refinement process usually involves alternating rounds of automated optimization (e.g. according to least-squares or maximum-likelihood algorithms) and manual corrections that improve agreement with the electron-density maps. These corrections are necessary because the automatically refined parameters may get stuck in a (mathematical) local minimum, instead of leading to the global, optimum solution. The model parameters that are optimized by a refinement program include, for each atom, its x, y, z coordinates, and a parameter reflecting its mobility or smearing in space, known as the B-factor (or displacement parameter, sometimes referred to as temperature factor). B-factors are usually expressed in Å² and range from 2 to 100. [If their values in the PDB files are systematically lower than 1.0, they should be multiplied by 80 (8π²) to be brought to the B-factor scale.] The B-factor model used is usually isotropic, i.e. it describes only the amplitude of displacement, but more elaborate models describe the individual anisotropic displacement of each atom. Even in the isotropic approximation, crystallographic models of macromolecules are tremendously complex. For example, a protein molecule of 20 kDa would take about 6000 parameters to refine! Frequently, the number of observations (especially at low resolution, vide infra) is not quite sufficient. For this reason, refinement is carried out under the control of stereochemical restraints which guide its progress by incorporating prior knowledge or chemical common sense [7,8]. The most popular libraries of stereochemical restraints (their standard or target values) have been compiled based on small-molecule structures [9–11] but there is growing evidence from high-quality protein models that the nuances of macromolecular structures should also be taken into account [12].

Another way of model refinement, introduced more recently into macromolecular crystallography, involves dividing the whole structure into rigid fragments and expressing their vibrations in terms of the so-called TLS parameters, which describe the translational, librational and screw movements of each fragment [13]. Selection of rigid groups should be reasonable, corresponding to individual (sub)domains, for example. An exceedingly large number of very small fragments unreasonably increases the number of refined parameters and leads to models not fully justified by the experimental data.

Although many of the steps in crystal structure analysis have been automated in recent years, the interpretation of some fine features in electron-density maps still requires a significant degree of human skill and experience [14]. A degree of subjectivity is thus inevitable in this process and different people working with the same data may occasionally produce slightly different results. This review is primarily intended to advise those who do not have a deep knowledge of crystallography, but need to know how the objectivity and subjectivity embedded in the available crystal structures should be balanced. Detailed procedures used in macromolecular crystallography are explained in a number of books, some describing them in more advanced terms [15,16], others in simpler ways [17,18].

Electron-density maps and how to interpret them

As mentioned earlier, electron-density maps are the primary result of crystallographic experiments, whereas the atomic coordinates reflect only an interpretation of the electron density. Although maps based on the initial experimentally derived phases are sometimes analyzed only by software rather than human eye (a practice that the authors of this review very strongly oppose), we still need to understand what to expect from them.

The basic electron-density map can be calculated numerically by Fourier transformation of the set of observed (experimental) reflection amplitudes $F_\mathrm{obs}$ and their phases. However, because the phases, $\varphi$, are not available experimentally, they are calculated from the current model. Such an ($F_\mathrm{obs}$, $\varphi_\mathrm{calc}$) map represents an approximation of the true structure, depending on the accuracy of the calculated phases, that is, on how good the model is from which the phases were computed. Another type of electron-density map, the so-called difference map, calculated using differences between the observed and calculated amplitudes and calculated phases, ($F_\mathrm{obs} - F_\mathrm{calc}$, $\varphi_\mathrm{calc}$), shows the difference between the true and the currently modeled structures. In such a map, the parts existing in the structure, but not included in the model, should show up in the positive map contours, whereas the parts wrongly introduced into the model and absent in the true structure will be visible in negative contours. In practice, it is customary to use ($2F_\mathrm{obs} - F_\mathrm{calc}$, $\varphi_\mathrm{calc}$) maps, corresponding to a superposition of both previous maps, to show the model electron density as well as the features requiring corrections. Also, the amplitudes used in map calculation are often weighted by statistical factors, reflecting the estimated accuracy of individual amplitudes and phases.

Because all data used to compute maps (both amplitudes and phases) contain a degree of error, the maps also contain some level of noise. Usually a good display contour for the ($2F_\mathrm{obs} - F_\mathrm{calc}$, $\varphi_\mathrm{calc}$) map is about 1$\sigma$ and for the ($F_\mathrm{obs} - F_\mathrm{calc}$, $\varphi_\mathrm{calc}$) map about ±3$\sigma$, where $\sigma$ is the rmsd of all map points from the average value. Higher contour levels may sometimes be used to accentuate certain features, but the use of lower contour levels may be misleading because this may emphasize noise rather than real features.

It is well established that the appearance of Fourier maps depends more on the phases than on amplitudes. Therefore, even if the correct amplitudes are known from a well-conducted diffraction experiment, inaccurate phases may introduce map bias, which may be difficult to eliminate in the iterative refinement and modeling process. This happens because the wrong phases will always reproduce the same erroneous model features, which in turn will produce the same set of erroneous phases. A map used to overcome such a bias is the so-called omit map, a variation of the difference map, in which the $F_\mathrm{calc}$ values are computed from a model with the suspicious fragments deleted. Refinement of such a truncated model is supposed to remove any memory of those fragments in the set of calculated amplitudes and phases. The omit map should then show an unbiased representation of the omitted fragment.

The difference between the initial, experimental and final, optimal electron-density maps is illustrated in Fig. 2. The fragment of the initial map agrees with the final model, but it would not be easy to convincingly build this part of the model into such a map. The map quality is poor because the phases used to construct it were rather inaccurate; it does not result from lack of order, as the protein chain of this fragment is well defined in the crystal, as evidenced by the map calculated with the final phases.

Fig. 2. Stereoviews of electron-density maps. The final atomic model of a fragment of the DraD invasin (PDB code 2axw) [79] is superimposed on the maps. (A) The 1.75 Å resolution map calculated with $F_\mathrm{obs}$ amplitudes and initially estimated phases, contoured at the 1.5$\sigma$ level. This map was used to construct the first model of the protein molecule. (B) The 1.0 Å resolution map calculated with [...] amplitudes and the phases obtained upon completion of the refinement, contoured at 1.7$\sigma$. The final map shows the complete fragment of the chain with considerably better detail, since it was calculated at much higher resolution (using over five times more reflections) and with very accurate phases.

In general, the clarity and interpretability of electron-density maps, even those based on accurate phases, depend on the resolution of the diffraction data (related to the number of reflections used in the calculations). Figure 3 illustrates the appearance of
typical electron-density maps calculated with data truncated at various resolution limits. Whereas at low resolution it is not possible to accurately locate individual atoms, a priori knowledge of the stereochemistry of individual amino acids and peptide groups allows the crystallographer to locate these protein building blocks quite well. With increasing resolution, the maps become clearer, showing separated peaks corresponding to the positions of individual atoms. At atomic resolution, individual peaks are well resolved and their height permits differentiation between atom types. Atomic-resolution maps may show certain non-standard structural features, such as unusual conformations or very short hydrogen bonds. It would not be possible to convincingly model such features into low- or medium-resolution maps. In practice, maps obtained with low-resolution data are even worse than those presented in Fig. 3, because the relative error of diffraction intensities in the resolution shell of 3.5–3.0 Å for crystals diffracting to 3 Å is much larger than for crystals diffracting to 1.5 Å.

Fig. 3. The appearance of electron density as a function of the resolution of the experimental data. The N-terminal fragment (Lys1–Val2–Phe3) of triclinic lysozyme (PDB code 2vb1) [80] with the maps calculated with different resolution cut-offs. Whereas at the highest resolution of 0.65 Å there were 184 676 reflections used for map calculation, at 5 Å resolution only 415 reflections were included.

Most proteins contain regions characterized by an elevated degree of flexibility. In crystals, such flexibility may result either from static or dynamic disorder. Static disorder results from different conformations adopted by a given structural fragment in different unit cells. Dynamic disorder is the consequence of increased mobility or vibrations of atoms or whole molecular fragments within each individual unit cell. The time scale for such vibrations is much shorter than the duration of the diffraction experiment and, as a result, the electron density corresponds to the averaged distribution of electrons in all unit cells of the crystal. In the case of static disorder, maps are averaged spatially over all unit cells irradiated by the X-rays. In the case of dynamic disorder, the electron density is averaged temporally over the time of data collection. In both cases, the electron density is smeared over multiple conformational states of the disordered fragments of the structure. At low resolution, the smeared electron density may be hidden in the noise and such fragments will not be interpretable, but at higher resolution they may appear as distinct, alternative positions if static disorder is present. Figure 4 illustrates a typical case of a fragment existing in multiple conformations.

Fig. 4. Electron density for a region with static disorder. The model and the corresponding map for Arg A63 in the structure of DraD invasin (PDB code 2axw) [79], with its side chain in two conformations. The map was calculated at 1.0 Å resolution and displayed at the 1.7$\sigma$ contour level.

A special case of disorder is always present in the solvent region of all macromolecular crystals. The dominating component of the solvent region is water, although obviously any compound from the crystallization medium may also be present in the interstices between protein molecules. Some water molecules, hydrogen-bonded to atoms at the protein surface in the first hydration shell, are located at well-ordered, fully occupied sites and can be modeled with confidence. Water molecules at longer distances from the protein surface often occupy alternative, partially filled sites and are difficult to model even at very high resolution. The bulk solvent region contains completely disordered molecules and does not show any features except a more or less flat level of electron density. This bulk solvent region usually occupies about 50% of the crystal volume, although some crystals contain either less or more solvent than usual. The amount of solvent can be estimated from the known protein size and the volume of the crystal unit cell, using the so-called Matthews coefficient [19]. Crystals containing more solvent usually display lower diffraction power and resolution, in keeping with the degree of disorder, which is a consequence of weaker stabilization of the protein molecules through intermolecular interactions.

A quick look at the files provided by the Protein Data Bank

Virtually all journals that publish articles describing 3D protein structures require that the authors deposit their results in the PDB. When deposited, each structure is given a unique PDB accession code consisting of four characters. If a structure is later withdrawn or replaced, the code is not reused. Any changes to atomic coordinates result in a new accession code; the old files are then moved into the obsolete area, but can still be accessed (with some effort). Structural information can be subsequently downloaded by the users as a text-formatted file. For a structure with the accession code 9xyz, the corresponding file would be 9xyz.pdb. (For easier handling by computer programs, the same information is also stored in a Crystallographic Information File, 9xyz.cif.) The text file contains a header section with the experimental details and a coordinate section with all experimentally located atoms in the structure of interest. Each atom is identified by an inventory tag specifying its name, residue type, chain label, and residue number, which is followed by five numerical values specifying its location (orthogonal x, y, z coordinates expressed in Å), site occupancy factor (a fraction between 0 and 1), and its displacement parameter or B-factor (expressed in Å²), which (at least in theory) provides information about the amplitude of its oscillation. Any person in the world with Internet access can freely download these files or display them on the computer screen using one of several applications available from the PDB site (http://www.rcsb.org/pdb/). For greater flexibility, it is also possible to use one of the more advanced graphical programs, for example, pymol [21] or the program cited as [22]. These programs, and some others, provide a variety of ways of displaying and manipulating the 3D structures and allow their detailed examination.

A file header gives a description of the X-ray experiment, the calculations that have led to structure determination, and some parameters that can help the reader assess the quality of the structure. Traditionally, the Materials and methods section of papers that described crystallographic experiments explained in detail how the structure was solved and provided information that allowed the reader to evaluate the quality of the experimental data. Recently, high-impact journals have been enforcing much stricter limits on the size of the papers and, at best, an extract of this information can be found in the Supplementary material section, which is usually only available online and frequently is not fully reviewed.

Evaluation of structure quality based on the contents of PDB file headers is not easy for non-crystallographers, yet we must stress that any user of such information should look at the header first, before spending too much time looking at the (potentially illusory) details of the structure. A PDB file usually contains information about data extent and quality (resolution, completeness, $I/\sigma(I)$, $R_\mathrm{merge}$, both overall and in the highest resolution shell), as well as indicators of the quality of the resulting structure, such as $R$ and $R_\mathrm{free}$ (vide infra). In principle, the information that is provided in a PDB deposit should be sufficient to create the Materials and methods section by an appropriate software utility. However, the information in the headers of PDB files is often incomplete, contradictory, or erroneous. An extreme case is illustrated by the deposition 2hyd [23] that corrected a series of faulty structures withdrawn from the PDB (together with papers retracted from several high-impact journals, vide infra). The header of the 2hyd.pdb file does not contain any information on how the correct structure was arrived at; all fields that describe structure solution and quality of the data are designated as NULL. Although, as discussed in the following sections, none of these parameters alone is a rock-solid indicator of the quality of a protein structure, they do provide information that helps in assessing the level of detail that could be gleaned from such a structure. We consider PDB files that do not contain this information to be seriously deficient.
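
To make the layout of the coordinate section more concrete, here is a minimal Python sketch that reads one atom line of a .pdb text file into the fields listed above (name, residue type, chain label, residue number, x, y, z, occupancy, and B-factor). The column positions are taken from the published fixed-column PDB file format rather than from this review, and the `Atom` type, the `parse_atom_record` function, and the example coordinates are hypothetical names and made-up values used only for illustration.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    name: str            # atom name, e.g. "CA"
    residue: str         # residue type, e.g. "LYS"
    chain: str           # chain label
    residue_number: int
    x: float             # orthogonal coordinates in Angstrom
    y: float
    z: float
    occupancy: float     # fraction between 0 and 1
    b_factor: float      # displacement parameter in Angstrom squared


def parse_atom_record(line: str) -> Atom:
    """Parse a single ATOM/HETATM line using fixed column positions."""
    if not line.startswith(("ATOM  ", "HETATM")):
        raise ValueError("not an ATOM or HETATM record")
    return Atom(
        name=line[12:16].strip(),         # columns 13-16
        residue=line[17:20].strip(),      # columns 18-20
        chain=line[21],                   # column 22
        residue_number=int(line[22:26]),  # columns 23-26
        x=float(line[30:38]),             # columns 31-38
        y=float(line[38:46]),             # columns 39-46
        z=float(line[46:54]),             # columns 47-54
        occupancy=float(line[54:60]),     # columns 55-60
        b_factor=float(line[60:66]),      # columns 61-66
    )


# A made-up record, assembled field by field so the columns are explicit.
EXAMPLE_LINE = (
    "ATOM  " "    1" " " " N  " " " "LYS" " " "A" "   1" " " "   "
    "   3.287" "  10.092" "  10.329" "  1.00" "  5.89"
)

if __name__ == "__main__":
    atom = parse_atom_record(EXAMPLE_LINE)
    print(atom.name, atom.residue, atom.chain, atom.residue_number)
    print(atom.x, atom.y, atom.z, atom.occupancy, atom.b_factor)
```

A sketch like this reads only the coordinate lines; the header records that describe data quality are free-form remarks and would need separate handling.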
In addition to the text file (e.g. 9xyz.pdb), each crystallographic PDB deposition should be accompanied by a corresponding file with the experimental structure factor amplitudes (9xyz-sf.cif). Most regretfully, for many of the PDB entries no structure factors are available, and even for the most recent depositions (after 1 January 2000) they are found in only 79% of the cases, despite the National Institutes of Health (NIH) requiring that all deposits that have resulted from NIH-sponsored research should include experimental structure factors as well (most other funding agencies have similar rules). The availability of structure factors allows re-refinement of the structure and independent evaluation of model quality and the claimed accuracy of details (although, of course, such checks are not expected to be performed too frequently).

How to assess the quality of the diffraction data

The quality of macromolecular crystal structures is ultimately dependent on the quality of the diffraction data used in their determination. The most important indicators of data quality are parameters such as resolution, completeness, $I/\sigma(I)$ (or signal-to-noise ratio), and $R_\mathrm{merge}$, overall and in the highest resolution shell. It is very important to understand their meaning and the relationship between their numerical values.

Resolution of diffraction data

An important parameter to consider when assessing the level of confidence in a macromolecular structure is the resolution of the diffraction data utilized for its solution and refinement (often referred to as resolution of the structure). Resolution is measured in Å and can be defined as the minimum spacing ($d$) of crystal lattice planes that still provide measurable diffraction of X-rays. This term defines the level of detail, or the minimum distance between structural features that can be distinguished in the electron-density maps. The higher the resolution, that is, the smaller the $d$ spacing, the better, because there are more independent reflections available to define the structure. The terms customarily applied to resolution are low, medium, high, and atomic (Fig. 5). The appearance of electron density as a function of resolution is shown in Fig. 3.

Fig. 5. Criteria for assessment of the quality of crystallographic models of macromolecular structures. For the resolution and $R$ criteria, the more 'green' (i.e. lower) the value, the better. With $R_\mathrm{free} - R$ and rmsd from ideality the situation is different because there is some optimal value and drastic departures in both directions also set a red flag, although for different reasons. When the difference between $R$ and $R_\mathrm{free}$ exceeds 7%, it indicates possible over-interpretation of the experimental data. But if it is very low (say below 2%), it strongly suggests that the test data set is not truly 'free', for example, because the structure is pseudosymmetric or, even worse, because the test reflections have been compromised in a round of refinement or were not properly transferred from one data set to another. When rmsd(bonds) is very high, it is an obvious signal of model errors. However, when it is very low (e.g. 0.004 Å), it indicates that through too tight restraints the model underwent geometry optimization, rather than refinement driven by the experimental diffraction data. There are different opinions about how rigorous the stereochemical restraints should be. However, because the 'ideal' bond lengths themselves suffer from errors in the order of 0.02 Å, it is reasonable to require the model to adhere to them also only at this level.

The lowest-resolution crystal structures that have been published with the coordinates start at a resolution of [...], which is usually sufficient to provide a very rough idea about the shape of the macromolecule, especially if it contains many helices, as was the case of the first published structure of myoglobin [1]. However, very few crystal structures of even the largest macromolecules are currently published at such low resolution. For example, although early reports of the structure of ribosomal subunits, among the largest asymmetric assemblies studied to date by crystallography, were based on 5 Å data [24], they were quickly followed by a series of structures at 2.4–3.3 Å.

Today's standard for medium resolution starts at 2.7 Å, where there is the first chance to see well-defined water molecules, whose hydrogen-bonding distances are typically that long. Increasingly more structures are now determined to a resolution exceeding 2 Å. The value of 1.5 Å corresponds to typical C–C covalent bonds in macromolecules. When the resolution is significantly beyond this limit (e.g. 1.4 Å), an anisotropic model of atomic displacements can be refined. At 1.2 Å, full atomic resolution is achieved [28,29]. This corresponds to the shortest interatomic distances not involving hydrogen (C=O groups). Direct location of hydrogen atoms in the electron-density map becomes possible at resolution higher than 1.0 Å, because covalent bond distances of hydrogen are in the range 0.9–1.0 Å. The resolution of 0.77 Å corresponds to the physical limit defined by Cu Kα X-ray radiation (1.542 Å). Such resolution is very rarely achieved in macromolecular crystallography [30,31], and is beyond the routine limits of even small-molecule crystallography. Ultra-high resolution allows mapping of deformation electron density, for example, of individual atomic or bonding orbitals.

The claimed resolution of a structure determination is sometimes only nominal. If the average ratio of reflection intensity to its estimated error, $\langle I/\sigma(I) \rangle$, in the highest resolution shell is below 2.0, it can be assumed that the true resolution is not as good. However, if this number is much higher than 2.0, it indicates that the crystal is able to diffract better but the resolution of data was limited by the experimenter or the set-up of the synchrotron experimental station. The use of maximum achievable resolution for refinement not only permits finer structure details to be observed, but also removes possible bias from the model, as higher resolution improves the data-to-parameter ratio.

It has to be noted that the parameters in the PDB deposit header are usually provided for the set of data used for structure refinement, rather than for the data originally used to solve the structure. The set of data used in refinement can be collected with a different experimental protocol than the set of data collected for phasing. For refinement, it is most important to collect a complete data set to the resolution limit of diffraction, whereas for phasing it is most important to collect accurate data at lower resolution, because high-resolution intensities are generally too weak to provide useful phasing signal. For that reason, it is difficult to assess the quality of phasing from the published or deposited information, if a separate experimental data set was used for refinement.

Quality of the experimental diffraction data

The raw result of a modern diffraction experiment is a set of many diffraction images, stored in computer memory as 2D grids of pixels containing intensities of the individual reflections. The intensities have to be integrated over those pixels that represent individual reflections. Most reflections (together with their symmetry equivalents) are measured many times, and their intensities have to be averaged after the application of all necessary corrections and appropriate scaling. This process is known as scaling and merging, and its result is a set of unique reflection intensities, each accompanied by a standard uncertainty, or estimate of error. Multiple observations of the same reflection provide a means to identify and reject potential outliers, which may have resulted, for example, from instrumental glitches. However, the number of such rejections should be minimal, a fraction of a percent at most.

As mentioned previously, the accuracy of the averaged intensities can be judged from the spread of the individual measurements of equivalent reflections by the $R_\mathrm{merge}$ residual. The simple form of $R_\mathrm{merge} = \sum_h \sum_i |\langle I_h \rangle - I_{h,i}| / \sum_h \sum_i I_{h,i}$ (where $h$ enumerates the unique reflections and $i$ their symmetry-equivalent contributors) is not the most useful indicator, because it does not take into account the multiplicity of measurements. More elaborate versions have been proposed [32,33], but they are seldom quoted in practice.

A good set of diffraction data should be characterized by an $R_\mathrm{merge}$ value of 4–5%, although with well-optimized experimental systems it can be even lower. In our opinion, a value higher than 10% suggests sub-optimal data quality. At the highest resolution shell, the $R_\mathrm{merge}$ can be allowed to reach 30–40% for low-symmetry crystals and up to 60% for high-symmetry crystals, since in the latter case the redundancy is usually higher.

In principle, high multiplicity (or redundancy) of measurements is desirable, as it improves the quality of the resulting merged data set, with respect to both the intensities and their estimated uncertainties. However, in practice this effect may be spoiled by radiation damage, initiated in protein crystals by ionizing radiation, especially at the very intense synchrotron beam lines [34,35]. It is not easy in practice to strike an optimal balance between the positive effect of increased multiplicity and the negative influence of radiation damage.

The meaningfulness of measured intensities can be gauged by the average signal-to-noise ratio, $\langle I/\sigma(I) \rangle$. This measure is not always absolutely valid because it is not trivial to accurately estimate the uncertainties of the measurements [$\sigma(I)$]. Usually the diffraction limit is defined at a resolution where the $\langle I/\sigma(I) \rangle$ value decreases to 2.0.

If the data collection experiment was not conducted properly or if there was rapid decay of diffraction power, some reflections may not be measured at all, and the data may not be 100% complete. Because of the properties of Fourier transforms, each value of the electron-density map is correctly calculated only with the contribution of all reflections; thus lack of completeness will negatively influence the quality and interpretability of the maps computed from such data. Data completeness, that is, the coverage of all theoretically possible unique reflections within the measured data set, is therefore another important parameter of data quality.

The above numerical criteria are usually quoted for all data and for the highest resolution shell. Unfortunately, it is not customary to quote these values for the lowest resolution shell, containing the strongest reflections, which are most important for all phasing procedures and for the proper appearance of the electron-density maps. Overall data completeness may reach, for example, 97%, but if the remaining 3% of reflections are all missing from the lowest resolution interval, all crystallographic procedures, from phasing to final model building, will suffer.

As usual, there are exceptions to these rules. This is, for example, the case with viruses, which possess very high internal, non-crystallographic symmetry, in effect increasing the redundancy of the structural motif, even if the data may not be complete. For example, for bluetongue virus, 980 individual crystals were used to collect over 21.5 million reflections, and still the data set was only 53% complete (7.8% in the highest resolution shell). Nevertheless, these data were sufficient for solving the structure [36].

Structure quality: R, R_free, Ramachandran plot, rmsd, and other important indicators

The quality of a crystal structure (and, indirectly, the expected validity of its interpretation) can be assessed based on a number of indicators. The most important ones will be discussed here in a simplified manner, without any attempt to provide mathematical justification for their use, but only to provide some guidance as to their meaning.

R-factor and R_free

As mentioned earlier, residuals, or R-factors, usually expressed as percent, but often as decimal fractions, measure the global relative discrepancy between the experimentally obtained structure factor amplitudes, $F_\mathrm{obs}$, and the calculated structure factor amplitudes, $F_\mathrm{calc}$, obtained from the model. The R-factor, defined as $R = \sum |F_\mathrm{obs} - F_\mathrm{calc}| / \sum F_\mathrm{obs}$, combines the error inherent in the experimental data and the deviation of the model from reality. With increasingly better diffraction data, frequently characterized by $R_\mathrm{merge}$ of 4% or less, the R-factor is effectively a measure of model errors. Well-refined macromolecular structures are expected to have $R$ below 20%. When $R$ approaches 30% (Fig. 5), the structure should be regarded with a high degree of reservation because at least some parts of the model may be incorrect. The best refined macromolecular structures are characterized by $R$ below 10%. Examples of such structures include xylanase 10A at 1.2 Å resolution [37], rubredoxin at 0.92 Å [38], and antifungal protein EAFP2 at 0.84 Å, among others. The atomic resolution structure of L-asparaginase (PDB code 1o7j) describes the positions of over 20 000 independent atoms in the asymmetric unit (including hydrogen atoms), yet it was refined to $R$ = 11% at 1 Å resolution [40]. In small-molecule crystallography, w
herethemodelscontainfeweratomsandthedatacanbecorrectedforvarioussystematicerrors,itisnotunusualtoseeof1…2%.Animportantparameterthatwasintroducedintocrystallographicpracticein1992isfreefreeRfreeiscalculatedanalogouslytonormal-factor,butfor1000randomlyselectedre”ections(veryoftenin”atedtounnecessarilylargesetsduetoblinduseofdefaultsindatareductionsoftware)whichhaveneverenteredintomodelre“nement,althoughtheymighthavein”uencedmodelde“nition[42].Inthisway,ifthemathematicalmodelofthestructurebecomesunreasonablycomplex,i.e.includesparametersforwhichthereisnojusti“cationintheexperimentaldata,willnotimprove(eventhoughthe-factormaydecrease),indicatingover-interpretationofthedata.Thisisbecausethesuper”uousparameterstendtomodeltherandomerrorsoftheworkingdataset,whicharenotcorrelatedwiththeerrorsintheProteincrystallographyfornon-crystallographersA.Wlodaweretal.FEBSJournal(2007)Journalcompilation2007FEBS.NoclaimtooriginalUSgovernmentworks isanimportantvalidationparameterandshouldsetawarningifitexceedsbymorethan7%(Fig.5).Itshighvaluemayindicateover-“ttingoftheexperimentaldata,ormayresultfromaseriousmodeldefect.Forexample,additionofanunreason-ablenumberofwatermoleculesintothenoisyfeaturesofthesolventregionwillalwayslowertheordinary-factor,butwillnotimproveModiÞedformsoftheInadditiontotheconventionalandmostpopularcrys--factordiscussedabove,otherresidualsarealsoinusetogaugetheagreementbetweentherealandmodelworlds.hasalreadybeenmentionedasacross-validationparameterbasedonre”ectionsexcludedfromre“nement.However,itsindependencefromthemodelisnotcompleteasitmaybeusedtodecideonthecourseofre“nement(andmodelcon-struction).Therefore,anevenmoreindependentresidual,called,hasrecentlybeenproposed[42].Thatresidualshouldbebasedonanothersubsetofre”ectionsthatarekeptinavaultandneverusedinanycalculations,exceptforthe“nalAlthoughthisconceptismethodologicallycorrect,itisnotquitecertainwheretoputalimitforsacri“ceofthescarceexperimentalobservationsonthealtarofcross-validation,asremovalofconsecutivesubse
tsofre”ectionsintroducesmathematicalerrorsintheFouriertransformationprocess(mapcalculation)andeffectivelyworsensthe“nalmapinterpretability.Acombinedapplicationoftestingwouldrequire2000…4000re”ections,whichmightamountto20%ofallobservationsforatypicaldatasetforamedium-sizeprotein.Anotherresidual,morecommoninsmall-moleculethaninproteinwork,istheweighted-factororwR2,basedonre”ectionintensitiesandincludingthestatis-ticalweightswithwhichtheobservationsenterthere“nement[43].Theproblemofdataweightingdoesnothaveagoodsolutioninproteincrystallographybecausetheuncertainties(errors)estimatedforthere”ectionintensitiesarenotalwaysveryreliable.Theycanbemoremeaningfulifderivedfromdataofhighredundancy,i.e.whenmanyobservationscontributetothesameaveragedre”ectionintensity.Acompletelydifferentphilosophyisbehindthede“-nitionoftheso-calledreal-space-factor.Here,theresidualiscalculatedtore”ectthecorrelationbetweentheexperimentalelectron-densitymapandtheonegeneratedpurelyfromthemodel.Real-spaceareusedlessfrequently;thedisadvantageisthateventheexperimentalmapis,inmostcases,basedonmodel-derivedphases.Animportantadvantageisthat-factorscanbecalculatedselectivelyfordiffer-entregionsofthemodel,thuseasilyrevealingthetroublingparts,somethingthatisnotobviousfromthediffraction-spaceresiduals.Root-mean-squaredeviationsfromstereochemicalstandardsRmsdfromstandardstereochemistryindicatehowmuchthemodeldepartsfromgeometricalparametersthatareconsideredtypical,orrepresentchemicalcom-monsensebasedonpreviousexperience.Usuallythesamestandardsareusedasrestraints(withadjustableweights)duringstructurere“nement[9,10].Differentparameterscanbeevaluatedbythermsdcriterion,butitismostcommontousethevalueforbondlengthswhencomparingdifferentmodels.Good-quality,med-ium-to-high-resolutionstructuresareexpectedtohavearmsd(bond)of0.02A(Fig.5),althoughnumbershalfthatsizearealsoacceptable.Whenthisnumberbecomestoohigh�(0.03A),itsigni“esthatsome-thingmightbewrongwiththemodel.Itisnotdesir-abletolowerthisvalueatallcosts,becausethesta
ndardsrepresentsomeaveragesandarethemselvesnoterror-free[12].Atveryhighresolution,therestraintcontrolofmodelgeometry(atleastinwell-de“nedareas)becomeslessimportantbecausetheexperimentalinformationstronglydeterminesthecourseofthere“nement.RamachandranplotsandpeptideplanarityTheglobaldeviationsofstereochemicalparametersfromtheirexpectedvalues,discussedabove,mightraisequestionsaboutthequalityofthestructurebutwouldnotpinpointthesourceofpossibleerrors.Totracethem,onenormallyrunsageometryvalidationprogram,suchas[44]orortolookforindicationsofcuriousfeatures.Aparticu-larlyusefultoolistheRamachandranplot[46],show-ingthemappingofpairsoftorsionanglesofthepolypeptidebackbone(de“nedinFig.6)againsttheexpectedcontours.Theangleshaveastrongvali-dationpowerbecausetheirvaluesareusuallynotrestrainedinthere“nement(unlessaspecialtorsion-angle-re“nementmethodisused)[47].TwoexamplesofRamachandranplotareshowninFig.7.FortheErwiniachrysanthemi-asparaginasestructure(PDBcode1o7j;Fig.7A),�90%oftheanglesarefoundinthemostfavoredregionofthediagram.Oneresidue,Thr204,isfoundinthedisallowedregion,butitsstrainedconformationwaswelldocumentedinthatA.Wlodaweretal.Proteincrystallographyfornon-crystallographersFEBSJournal(2007)Journalcompilation2007FEBS.NoclaimtooriginalUSgovernmentworks 
and other asparaginase structures [40], thus this departure from ideality can be accepted with confidence. That is not the case with the Ramachandran plot (Fig. 7B) for the structure of the C3b complement pathway protein (PDB code 2hr0), which appears to suffer from a multitude of problems (vide infra).

The third main-chain conformational parameter, the peptide torsion angle ω, is expected to be close to 180° or exceptionally to 0° for cis-peptides (the latter situation may be more frequent than originally thought). The peptide planes are usually under very tight stereochemical restraints, although there is growing evidence that deviations of ±20° from strict planarity should be treated as not abnormal [12,38,48]. Unreasonably tight peptide planarity restraints may lead to artificial distortions of the neighboring φ/ψ angles in the Ramachandran plot. However, sometimes one encounters in the PDB protein structures with totally impossible peptide-bond torsion angles. Models containing such violations should be regarded as highly suspicious.

Fig. 6. Schematic representation of a fragment of the protein backbone chain with definition of torsion angles (φ, ψ, ω) for the residue. These angles have a reference value of 0° in the eclipsed conformation, but as presented in the figure they are all equal to 180°.

Fig. 7. Two examples of a Ramachandran diagram. (A) Plot for Erwinia chrysanthemi L-asparaginase, one of the largest structures solved to date at atomic resolution (PDB code 1o7j). (B) Plot for the 2.26 Å structure of the C3b complement protein (PDB code 2hr0) characterized by a very large number of main-chain dihedral angles outside of the allowed region, a vast majority of them originating from a single polypeptide chain.

Can we trust the published macromolecular structures?

In our opinion, the general answer to this question is a definite yes, although, as shown below, some problems may be encountered in individual cases. We
discuss here a few problems that we found in the scientific literature and in the deposited coordinates. We would like to stress that such problems are quite rare, although the readers of crystallographic papers should be aware of their existence.

Misrepresentation of crystallographic experiments

Fortunately for the field, known cases of outright fabrication of crystallographic data are extremely rare, maybe because the technique is so heavily based on calculations that data are not easy to fake. Perhaps the best known case of that sort was a discovery that the published diffraction patterns attributed to valyl tRNA were actually those of human carbonic anhydrase B [49]. That substitution was detected by analyzing the unit cell parameters of the published diffraction photographs; their values are quite characteristic for a given crystal, although they might bear chance similarity to crystals of other macromolecules. In that case, the latter possibility was ruled out through careful analysis of other aspects of the presented data.

A case of possible manipulation of diffraction data has recently been described (but it must be stressed that, as of the time of writing of this review, it is not yet officially proven). It was pointed out that the data deposited in the PDB for the structure of protein C3b in the complement pathway, refined at 2.26 Å resolution (PDB code 2hr0), are inconsistent with the known physical properties of macromolecular structures and their diffraction data [50]. For example, the deposited structure factors did not show any indication of the presence of bulk solvent, the electron density of the presumably largely unfolded domain was excellent, and there was no correlation between surface accessibility and the atomic B-factors. In addition, some other features (18 distances between non-bonded atoms of 2 Å, several peptide torsion angles deviating from planarity by as much as 57°, and 4.2% of outliers in the Ramachandran plot, almost all in one subunit; Fig. 7B) are clear indications of serious problems with this structure.

Honest errors in structure determination

In our experience, serious errors in describing a whole macromolecule are rare, especially nowadays, although errors in some local areas might be more common. A structure of ribulose-1,5-bisphosphate carboxylase-oxygenase with the chain of one of the subunits traced completely backwards was published [51], but, in a way that should reassure non-crystallographers, the error was noted almost immediately [52]. The statement found in the abstract of the latter publication, 'one of these models is clearly wrong', paraphrasing the way Winnie-the-Pooh was addressed by Rabbit ('one of us was eating too much, and I knew it wasn't me') [53], is an excellent indication of the self-correcting potential of the collective experience of the crystallographic community. A later re-enactment of this case [8] showed that, although it is possible to refine a backwards-traced structure at medium resolution to acceptable values of R and rmsd(bond), the value of Rfree would remain completely unacceptable (in that case, 61.7%), clearly indicating that the model was in error. With the mandatory use of Rfree, similar errors are unlikely to happen again.

A very recent case of an important series of structures that were seriously misinterpreted points out the danger introduced by deviation from standard crystallographic procedures and by over-interpretation of low-resolution data. The structure of the MsbA ABC transporter protein [54], as well as several related structures published by the same group, had to be retracted after the structure of Sav1866, another member of the family, was published [23]. All structures of these very important integral membrane proteins were solved at low resolution. The structure of MsbA was refined using non-standard protocols that utilized multiple molecular models, and this approach may have masked problems that would have been obvious had the authors stayed with more traditional refinement techniques. It must be stressed that all these structures were very difficult to solve and even the apparently correct structure of Sav1866 is characterized by rather high values of R and Rfree (25.5% and 27.2%, respectively), although such values are not unusual at 3 Å.

Unlike the very rare cases mentioned above in which the whole structures were questionable, local mis-tracing of elements of the protein chain has been more common. A number of such cases have been reviewed previously [8]. Although this type of error may matter very little if it happens to be limited to an area of the protein that is remote from the active site or from site(s) of interaction with other proteins, in other cases it may lead to misinterpretation of biological processes. One well-known case, in which modeling a strand instead of a helix led to postulating a doubtful model of autolysis, was provided by HIV-1 protease [55]. However, similar to the cases mentioned above, the implausibility of the original interpretation became clear almost immediately, when, first, the structure of a related Rous sarcoma virus protease became available [56], and, soon thereafter, when the structure of HIV-1 protease itself was independently determined [57].

One important practical aspect of crystallographic structures is to provide details of the interactions between macromolecules (usually enzymes) and small- or large-molecule inhibitors. Interpretation of such structures depends very much on the quality of the electron density for the inhibitor. In some cases, such as the complex of botulinum neurotoxin type B protease with a small inhibitor BABIM [58], the structural conclusions had to be later retracted, although the crystallographic quality indicators appeared to be more than acceptable (resolution 2.8 Å, R = 16.2%, Rfree = 23.8%). Similarly, the validity of the structure of a complex of the same enzyme with a target peptide was questioned [59], because the 38-residue peptide was apparently fitted to a very noisy map that could not support the interpretation of its structure.

Interpretation (and over-interpretation) of structural models

Assuming that the reader has looked at the header of the PDB file and become convinced that there are no indications of any problems with the diffraction data or with the results of the refinement, what other properties of the structure should be considered? An important aspect of macromolecular crystal structures is the description of solvent areas, as water plays a vital role in the structure of biomolecules and often influences protein function. Another important aspect of the structure is the description of other ligands, especially bound metals. Subsequent interpretation of the structures in terms of known biological and biochemical properties is a crucial step in structural biology. It is also necessary to consider whether the features described in the PDB deposit, such as, for example, placement of hydrogen atoms, could be justified by the resolution and quality of the experimental data.

Solvent structure

The solvent content of protein crystals was first analyzed by Matthews [19] on the basis of the few protein crystal structures known at that time, and was found to range from 27 to 65%. Examination of the current contents of the PDB indicates that this estimate is still valid, with an average of 51%, although some exceptions are present. However, the apparent solvent content of entries such as 2avy (92%) or 1q9i (2.0%) certainly indicates errors in the PDB. The presence of such errors (10 cases with solvent content below 2.5%) must be recognized by the users of this database.

Because X-ray crystallography can observe only objects that are repeated throughout the entire volume of the crystal in a periodic fashion, only well-ordered solvent molecules can be identified in the electron-density map. Moreover, the number of observed water molecules also depends on the resolution of the experimental data. To get a rough estimate of the expected ratio of the number of water molecules to protein residues one should subtract the resolution (in Å) from 3. This indicator could be higher (by up to 100%) for crystal structures with a high solvent content (Matthews coefficient > 3.0 Å³/Da). Thus at low resolution (2.5 Å) it should be possible to identify in the electron-density maps at most 0.3–0.5 ordered water molecules per protein residue and at very high resolution (1.0 Å) this may increase to 2 water molecules per residue. Structures exceeding these limits may contain spurious water molecules.

It should be noted that the inclusion of a water molecule in the model usually increases the number of refinement parameters by four (three coordinates plus the isotropic B-factor) and subsequently decreases the R-factor, so assigning water to each unidentified section of density is very tempting, but may not be justified. The presence of water molecules with high B-factors (> 100 Å²) indicates that the solvent structure was not refined very carefully. A large difference in the values of the B-factors for a solvent molecule and its environment is also very suspicious.

Metal cations

Around 30% of all PDB deposits report the presence of ordered metal ions, with 20% containing a metal located in a site important for the biological activity of the macromolecule. Functional analysis of a number of proteins crucially depends on the ability to identify possible metal ions in an unambiguous way. Unfortunately, PDB files do not contain any information about the procedures that were used for metal assignment and refinement, and even the relevant papers often relegate this information to supplements. Sometimes metal positions are determined directly, utilizing their anomalous scattering of X-rays. Application of this procedure provides the highest credibility, but most often the metals are assigned simply to the high peaks of electron-density maps. When assigning metal ions in the latter way, the experimenter should have examined the number of ligands, the geometry of the coordination sphere, and the B-factor of the ion and its environment. For example, the distance between calcium and oxygen atoms should be 2.40 Å, and between magnesium and oxygen 2.07 Å [60]. If the distances between a putative calcium and the neighboring oxygen atoms are around 2.1 Å, two possibilities should be considered: (a) a magnesium ion is present, but the experimenter has wrongly assigned the density to calcium; or (b) the refinement was performed with inappropriate restraints. Metal ion distance restraints are necessary especially for lower resolution data, where the observation-to-parameter ratios are usually insufficient for unrestrained refinement [12]. Certain metals have preferences for a particular type of coordination, for example Mg²⁺ tends to show octahedral coordination, whereas Zn²⁺ is most often tetrahedral [60–62]. A useful tool for differentiating between various metal ions is the bond valence concept, which takes into account the valence of the metal and the chemical nature of the ligands [63–65]. An example of an ion assigned as Mg²⁺ that violates most of the rules given above is shown in Fig. 1B. Unfortunately, this part of the structure of frankensteinase was copied directly from the file 1q9q deposited in the PDB.

Whereas the presence in a structure of a few metal ions with acceptable distances to the protein and good geometry should be considered normal, the presence of too many such ions that do not make reasonable contacts with the protein should be a matter of concern. For example, the 2.6 Å structure of Thermus thermophilus RNA polymerase (PDB code 1iw7) contains 485 Mg²⁺ ions, the vast majority far beyond 2.07 Å from the nearest oxygen atom. We may safely assume that the identity of most of these ions is very dubious, to say the least.

Placement of hydrogen atoms

Hydrogen atoms lack the electronic core and, in molecules of chemical compounds, their single electron is always involved in the formation of bonds. Hydrogen atoms are therefore the weakest scatterers of X-rays, and even in small-molecule crystallography their direct localization is difficult. The only chance to directly localize them in macromolecular structures is in the difference map after the rest of the structural model has been carefully refined at very high resolution. However, even for those proteins that diffract X-rays to ultra-high resolution, only a fraction of all hydrogen atoms can be identified in such maps.

Although hydrogen atoms are not easy to localize directly, they are obviously present in all proteins, sugars, and nucleic acids, and are involved in many biological processes. The location of most of them can be calculated with good accuracy from the positions of the heavier atoms. As a consequence, it is advisable to include the majority of hydrogen atoms in a structural model at calculated positions, and refine them as riding on their parent atoms. In this way, their parameters are not refined independently, but their coordinates are recalculated after each refinement cycle and their contribution to X-ray scattering is correctly taken into account. Some refinement programs have options for such a treatment of hydrogen atoms in an automatic way. At high resolution their contribution may result in a drop of the overall R-factor by a few percent. Moreover, if H atoms are not included, their contribution is represented completely by the parent atom and its position tends to refine to the center of gravity of both atoms. As a result the geometry of the refined model may be slightly distorted.

Unfortunately, whereas this method is applicable to most hydrogen atoms, which are rigidly connected within such groups as methylene, amide, phenyl, etc., some other hydrogen atoms, often the most interesting from the chemical and biological point of view, e.g. those within hydroxyl groups or within functions that can be easily (de)protonated, such as carboxyl or amino groups, cannot be treated in this way. In some cases, when the model is accurate enough and refined at high resolution, their presence can be inferred indirectly by analyzing the geometry of the chemical environment (Fig. 8A). For example, if the two C–O bond lengths within a carboxyl group differ significantly, then most probably this acidic group is not ionized. The internal C–N–C bond angles in heterocyclic rings, such as in the imidazole ring of histidine, tend to be by up to 5° wider if the nitrogen atom is protonated [66]. In structures refined at ultra-high resolution, as well as in structures obtained by neutron diffraction (a technique not discussed here, but whose utility is well documented) [67,68], positions of some hydrogen atoms can be visualized directly (Fig. 8B).

Some low-resolution coordinate sets were deposited in the PDB with hydrogen atoms that were utilized during the refinement, but which clearly cannot have any experimental basis in structures solved at low resolution. Some examples are provided by 1pma (3.4 Å), 1gtp (3.0 Å), 1pfx (3.0 Å), or 1ned (3.8 Å), among others. The reader might safely assume that these hydrogen atoms were only modeled and not determined experimentally.

Catalytic mechanism

The crystal structure of a nucleic acid complex of the enzyme onconase [69] may represent a case in which the interpretation of the structural results contradicts the established picture by going beyond what can be justified by the extent and quality of the diffraction
data. The authors postulated a novel catalytic mechanism involving the attack on the phosphodiester bond by the Nε2 imidazole atom of the crucial catalytic His97 residue rather than by Nδ1, as is the case with other RNase A-like enzymes. The orientation of the His97 ring in the deposited structure (PDB code 2i5s) was determined, on the basis of the B-factors of the imidazole atoms, to be opposite to that found in all other related structures. However, the interpretation may be an example of trusting crystallographic data beyond the level of credibility. First, the 1.9 Å resolution data were not of the highest quality (Rmerge = 12.5%). Second, the final refined model places the catalytic nitrogen 4.15 Å from the atom being attacked, at an angle that prevents the creation of any hydrogen bonds. It seems to us more likely that either the side chain of His97 might have been trapped in a non-productive orientation, or the refined values of the B-factors, and in consequence the deduced orientation of the histidine ring, were influenced by errors in the data.

Is the structure relevant to explanation of the biological properties?

Infrequently, a macromolecular structure may be completely correct in crystallographic terms, yet the coordinates may not correspond to the biologically relevant state of the molecule. A few examples illustrate this situation. The first structure of the core domain of HIV-1 integrase (PDB code 1itg) contained a cacodylate molecule derived from the crystallization buffer attached to a cysteine side chain located in the active-site area [70]. This led the constellation of the catalytic residues Asp64, Asp116, and Glu152 to assume a non-native configuration, although the distortion of the catalytic apparatus became apparent only later, by comparison with other, unperturbed structures, notably the catalytic domain of integrase from avian sarcoma virus [71,72]. The most significant consequence of the inactive conformation of the catalytic residues was the inability of the two aspartate side chains to bind a catalytic divalent metal cation in a coordinated fashion. Subsequent studies of Mg²⁺ complexes of HIV-1 integrase crystallized in the absence of cacodylate were in full agreement with the structures of other related enzymes [73,74].

Fig. 8. Interpretation of the location of hydrogen atoms. (A) Assignment of hydrogen atoms based on the pattern of carboxylate C–O bond lengths of the residues in the active site of sedolisin refined at 1.0 Å resolution (PDB code 1ga6) [81]. The bond-length errors are about 0.02 Å, therefore the differences between the C–O bonds within the carboxylic groups are not decisive, but strongly suggestive about the protonation state of the Glu and Asp residues, especially since they form an internally consistent pattern. (B) Hydrogen-omit map for the Thr51 residue in the model of triclinic lysozyme refined at 0.65 Å resolution (PDB code 2vb1) [80]. Hydrogen atoms are colored gray. Evidently, at this threonine the methyl and hydroxyl groups do not rotate freely, but adopt stable conformations, due to their interactions with neighboring residues in the crystal.

A different example of the difficulties in gaining mechanistic insights from high-resolution structures of enzymes is provided by a comparison of crystal structures of the proteolytic domain of Lon proteases belonging to two closely related families, A and B. Structural and biochemical investigation of such a domain of Escherichia coli LonA (PDB code 1rre) [75] established the presence of a catalytic dyad consisting of Ser679 and Lys722. However, the subsequently determined structure of a corresponding domain of Methanococcus jannaschii LonB (PDB code 1xhk) indicated the presence of a catalytic triad, which, in addition to the two residues equivalent to the ones mentioned above, also included Asp675 (E. coli numbering) [76]. Such an important structural difference was interpreted in terms of a different catalytic mechanism for these closely related enzyme families. However, an atomic-resolution crystal structure of the catalytic domain of Archaeoglobus fulgidus LonB (PDB code 1z0w) [77], as well as high-resolution structures of a series of mutants, established a
different picture, in which the strand including Ser679 was turned towards solvent, disrupting the catalytic dyad. Mutation of Asp675 to Ala did not affect the activity of the enzyme. The final conclusion, possible only because of the availability of a whole series of structures, was that in the absence of a substrate, product, or inhibitor, the catalytic domain of Lon may adopt an inactive conformation. This is a lesson worth remembering.

A practical approach to evaluating protein structures

Having presented a brief outline of the process leading to the solution of crystal structures and after discussion of the appearance of the electron-density maps and the indicators of quality of both the experimental data and the resulting structures, it is time to summarize some practical approaches to the evaluation of macromolecular structures presented in the scientific literature. A nice picture showing a rendered tracing of the main chain and a few side chains may convey an impression that the structure should be interpreted as is, and even frankensteinase (Fig. 1) may appear to represent at this level a properly solved protein structure. What are the most important indicators that one should pay attention to?

The first thing to check is whether the level of detail of a published structure and the biological inferences drawn from it are justified by the data resolution. One simple indicator to check is the number of modeled water molecules per residue. For example, one of the 1.9 Å structures of a small peptide, not refined by crystallographers (PDB code 1rb1) [78], contains close to 7 water molecules per residue. With many of them situated 1 Å from the protein, it may be safely assumed that this structure should not be interpreted as biologically relevant. If there are more than a few water molecules included at low resolution, the results are unquestionably over-interpreted. Examples of such structures with too generous solvent models are 1zqr with 146 water molecules per 335 residues at 3.7 Å resolution, 1q1p with 237 water molecules per 213 residues at 3.2 Å, or 1hv5 with 2136 water molecules per 972 residues at 2.6 Å. However, structures such as 1ys1 with 147 water molecules per 320 residues refined at 1.1 Å or 2ifq with 102 water molecules and 315 residues at 1.2 Å may underestimate the solvent content. It is obvious to us that the structure 1ixh that contains no solvent at all, despite resolution of 0.98 Å and R-factor of 11.4%, must be an example of a deposition or processing error in the PDB.

Another resolution-related problem is whether individual atomic B-factors were refined, or whether only an overall B-factor (for the whole structure) or group B-factors (for each residue) were refined. Any low-resolution structure in which B-factors were refined individually for each atom should be taken with a grain of salt, because the procedure introduced too many parameters. The structure 1q1p mentioned above is an example of such an approach (refinement of too many parameters and addition of too many solvent molecules often go together). Another example of refinement that is subject to both reservations is provided by a structure also discussed previously, namely DNA polymerase β (PDB code 1zqr) refined at 3.7 Å, in which the final model contains 326 protein residues, 146 water molecules, 3 metal ions, and 15 nucleotides, all refined with individual B-factors that range from 1.0 to 100.0 Å².

For high-resolution structures, anisotropic representation of atomic displacement parameters is substantiated only if the resolution is better than 1.4 Å. At lower resolutions the use of six refined anisotropic parameters instead of one isotropic B-factor is not warranted by the number of reflections available for refinement. Thus a structure of mistletoe lectin I (PDB code 1onk), refined anisotropically at 2.1 Å resolution, is a good example of a procedure that is better avoided.

The next parameters to consult would be the other two Rs presented in Fig. 3. In typical situations, the three criteria should be congruous, i.e. high-resolution structures are expected to be characterized by lower R-factors and better geometrical quality. However, these parameters should not be in the alarming red regions. As an example, the structure of eye-lens aquaporin (PDB code 2c32), refined with individual atomic B-factors using data extending only to 7.01 Å, with R = 39.0% and Rfree = 38.7%, seems to be unacceptable for several of the reasons given above. The 2.2 Å structure of ferric binding protein (PDB code 1d9y) is characterized by R = 18.5% and Rfree = 37.7%, with very large differences between the B-factors of neighboring atoms and groups, with anisotropically refined Fe and S atoms, and with no record of geometry indicators such as rmsd(bond) given in the PDB file. This structure and ones like it should also raise significant concerns.

To further evaluate a structural model, the reader may use validation programs such as those described in [44] or [45]. Presumably they have already been used by the authors, but the results are not always summarized in the articles. We especially recommend checking the Ramachandran plot to make sure that no unexplained torsion angles are found in disallowed regions. Although no longer commonly found in publications, such a plot is available on the PDB website for each deposited structure. It may be a good habit to have a look at the atomic coordinate section of the PDB file to see the level of the B-factors of the atoms that are in the most important fragments of the structure. If one sees values systematically > 40 Å², the fragment may not be well defined at all. An even more important parameter to check is the occupancy factor. Well-ordered atoms should have 1.00 in this column of the PDB file. Values equal to 0.00 mean that such atoms are completely fictitious, without any support from the experiment, added only to mark the chemical composition of the protein sequence. Regions with zero occupancy should never be considered part of the experimental model, and consequently must be excluded from any interpretations. Occupancies higher than 1.0 result from obvious errors. When scrolling through a source PDB file, it may be useful to see if there were any alert flags set by the annotator, and, for the more inquisitive reader, to see what data quality is reported in the experimental section.

In addition to providing the above criteria, a respectable crystallographic publication should show the electron-density map on which the key conclusions hinge. The reader should be able to assess its quality, especially with reference to the contour level at which it is presented.

As discussed throughout this review, if both the coordinates and structure factors are available in the PDB, it is possible to independently assess the quality of published crystal structures and thus adjust expectations about the level of detail that may be safely accepted by the readers. Although some large-scale independent refinement efforts are under way, in which many deposited structures are re-refined using consistent protocols, in a vast majority of cases the readers will not be expected to repeat structure refinement and map analysis themselves.

It is very important to apply some common-sense tests before taking structural results as an absolute proof of the biological properties of macromolecules. Do the proposed active site and the mechanism of action make sense? Clearly, the active site of frankensteinase, with only hydrophobic residues present (Fig. 1A), is unlikely to be able to catalyze any known chemistry. If metal ions are important for structure interpretation, have they been correctly assigned? Again, the example of frankensteinase, in which a putative Mg²⁺ ion does not have the expected coordination (Fig. 1B), shows that strong proofs of the metal identity should be present.

Although we have provided a number of examples showing that not all published structures yield the same level of information, we should stress again that, by and large, an overwhelming majority of crystal structures can be safely assumed to be unquestionably correct. It is important to keep in mind that crystallography is the only method that has extensive built-in quality control criteria of the structural product. In electron microscopy, the model does have a definitive resolution, but typically it is at least an order of magnitude worse than in X-ray crystallography. NMR-derived models do not possess any resolution and their quality cannot be assessed by reference to experiment using a criterion such as the R-factor. They can be evaluated, however, by similar rmsd(bond) and Ramachandran plot criteria. Finally, although we gave some general guidelines for the interpretation of the indicators of structure quality, we must stress that there is some level of subjectivity in their interpretation, and that other crystallographers may not exactly agree with all of our recommendations. That, however, is the beauty of the crystallographic method: it is always open to further refinement.

We would like to thank Heping Zheng for helping with identification of the PDB files mentioned in this review. Original work in the laboratories of AW and ZD was supported by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research, and WM was supported by grants GM74942 and GM53163. The research of MJ was supported b
yaFacultyScholarfellowshipfromtheCenterforCancerResearchoftheNationalCancer1KendrewJC,BodoG,DintzisHM,ParrishRG,Wyck-offH&PhillipsDC(1958)Athree-dimensionalmodelofthemyoglobinmoleculeobtainedbyX-rayanalysis.,662…666.2BernsteinFC,KoetzleTF,WilliamsGJB,MeyerEFJr,BriceMD,RogersJR,KennardO,ShimanouchiT&TasumiM(1977)TheProteinDataBank:acomputer-basedarchival“leformacromolecularstructures.JMol,535…547.3BermanHM,WestbrookJ,FengZ,GillilandG,BhatTN,WeissigH,ShindyalovIN&BournePE(2000)TheProteinDataBank.NucleicAcidsRes,235…242.4LevittM(2007)GrowthofnovelproteinstructuralProcNatlAcadSciUSA,3183…3188.Proteincrystallographyfornon-crystallographersA.Wlodaweretal.FEBSJournal(2007)Journalcompilation2007FEBS.NoclaimtooriginalUSgovernmentworks 5BrownEN&RamaswamyS(2007)Qualityofproteincrystalstructures.ActaCrystallogrDBiolCrystallogr,941…950.6BormanS(2007)Structurequality:crystalstructuresinhotterjournalstendtohavemoreerrors.ChemEng,11.7HendricksonWA(1985)Stereochemicallyrestrainedre“nementofmacromolecularstructures.MethodsEnz-,252…270.8KleywegtGJ&JonesTA(1995)Wherefreedomisgiven,libertiesaretaken.,535…540.9EnghR&HuberR(1991)AccuratebondandangleparametersforX-rayprotein-structurere“nement.CrystallogrDBiolCrystallogr,392…400.10EnghRA&HuberR(2001)Structurequalityandtargetparameters.InInternationalTablesforvol.F(RossmanMG&ArnoldE,eds),pp.382…392.Kluwer,Dordrecht.11AllenFH(2002)TheCambridgeStructuralDatabase:aquarterofamillioncrystalstructuresandrising.CrystallogrDBiolCrystallogr,380…388.12JaskolskiM,GilskiM,DauterZ&WlodawerA(2007)Stereochemicalrestraintsrevisited:howaccuratearere“nementtargetsandhowmuchshouldproteinstruc-turesbeallowedtodeviatefromthem?ActaCrystallogrDBiolCrystallogr,611…620.13PainterJ&MerrittEA(2006)Optimaldescriptionofaproteinstructureintermsofmultiplegroupsundergo-ingTLSmotion.ActaCrystallogrDBiolCrystallogr,439…450.14BranC-I&JonesTA(1990)Betweenobjectivityandsubjectivity.,687…689.15BlundellTL&JohnsonLN(1976)ProteinCrystallogra-.AcademicPress,NewYork,NY.16DrenthJ(19
99)PrinciplesofProteinX-rayCrystallog-.Springer,NewYork,NY.17BlowD(2002)OutlineofCrystallographyforBiologistsOxfordUniversityPress,NewYork,NY.18RhodesG(2006)CrystallographyMadeCrystalClearAcademicPress,Burlington,VT.19MatthewsBW(1968)Solventcontentofproteincrys-JMolBiol,491…497.20SayleRA&Milner-WhiteEJ(1995)RasMol:biomolec-ulargraphicsforall.TrendsBiochemSci,374…376.21DeLanoWL(2002)Thepymolmoleculargraphicssys-.DeLanoScienti“c,SanCarlos,CA.22EmsleyP&CowtanK(2004)Coot:model-buildingtoolsformoleculargraphics.ActaCrystallogrDBiol,2126…2132.23DawsonRJ&LocherKP(2006)Structureofabacte-rialmultidrugABCtransporter.,180…185.24BanN,NissenP,HansenJ,CapelM,MoorePB&SteitzTA(1999)PlacementofproteinandRNAstruc-turesintoa5Aresolutionmapofthe50,841…847.25BanN,NissenP,HansenJ,MoorePB&SteitzTA(2000)Thecompleteatomicstructureofthelargeribo-somalsubunitat2.4A,905…920.26WimberlyBT,BrodersenDE,ClemonsWMJr,Mor-gan-WarrenRJ,CarterAP,VonrheinC,HartschT&RamakrishnanV(2000)Structureofthe30,327…339.27SchluenzenF,TociljA,ZarivachR,HarmsJ,Glueh-mannM,JanellD,BashanA,BartelsH,AgmonI,FranceschiFetal.(2000)Structureoffunctionallyacti-vatedsmallribosomalsubunitat3.3A,615…623.28SheldrickGM(1990)PhaseannealinginSHELX-90:directmethodsforlargerstructures.ActaCrystallogrDBiolCrystallogr,467…473.29MorrisRJ&BricogneG(2003)Sheldricks1.2Aandbeyond.ActaCrystallogrDBiolCrystallogr30JelschC,TeeterMM,LamzinV,Pichon-PesmeV,BlessingRH&LecomteC(2000)Accurateproteincrystallographyatultra-highresolution:valenceelectrondistributionincrambin.ProcNatlAcadSciUSA31HowardEI,SanishviliR,CachauRE,MitschlerA,ChevrierB,BarthP,LamourV,VanZandtM,SibleyE,BonCetal.(2004)UltrahighresolutiondrugdesignI:detailsofinteractionsinhumanaldosereduc-tase-inhibitorcomplexat0.66A,792…804.32DiederichsK&KarplusPA(1997)ImprovedR-factorsfordiffractiondataanalysisinmacromolecularcrystal-NatStructBiol,269…275.33WeissMS&HilgenfeldR(1997)OntheuseofthefactorasaqualityindicatorforX-raydata.JApplCrystallogr,203…205.34GarmanE(2003)Coolcrystals:macromol
ecularcryo-crystallographyandradiationdamage.CurrOpinStruct,545…551.35RavelliRB&GarmanEF(2006)Radiationdamageinmacromolecularcryocrystallography.CurrOpinStruct,624…629.36GrimesJM,BurroughsJN,GouetP,DiproseJM,Mal-byR,ZientaraS,MertensPP&StuartDI(1998)Theatomicstructureofthebluetongueviruscore.,470…478.37DucrosV,CharnockSJ,DerewendaU,DerewendaZS,DauterZ,DupontC,ShareckF,MorosoliR,KluepfelD&DaviesGJ(2000)Substratespeci“cityinglycosidehydrolasefamily10.StructuralandkineticanalysisofStreptomyceslividansxylanase10A.JBiolChem,23020…23026.38EU3-DValidationNetwork(1998)Whochecksthecheckers?Fourvalidationtoolsappliedtoeightatomicresolutionstructures.JMolBiol,417…436.39XiangY,HuangRH,LiuXZ,ZhangY&WangDC(2004)CrystalstructureofanovelantifungalproteinA.Wlodaweretal.Proteincrystallographyfornon-crystallographersFEBSJournal(2007)Journalcompilation2007FEBS.NoclaimtooriginalUSgovernmentworks distinctwith“vedisul“debridgesfromEucommiaulmo-Oliveratanatomicresolution.JStructBiol40LubkowskiJ,DauterM,AghaiypourK,WlodawerA&DauterZ(2003)AtomicresolutionstructureofniachrysanthemiActaCrystallogrDBiolCrystallogr,84…92.41BrungerAT(1992)ThefreeRvalue:anovelstatisticalquantityforassessingtheaccuracyofcrystalstructures.,472…474.42KleywegtGJ(2007)Separatingmodeloptimizationandmodelvalidationinstatisticalcross-validationasappliedtocrystallography.ActaCrystallogrDBiol,939…940.43HamiltonW(1974)Testsforstatisticalsigni“cance.In:InternationalTablesforX-rayCrystallography,vol.IV.(IbersJA&HamiltonWC,eds),pp.285…292.TheKynochPress,Birmingham.44LaskowskiRA,MacArthurMW,MossDS&ThorntonJM(1993):programtocheckthestereochemicalqualityofproteinstructures.JAppl,283…291.45DavisIW,MurrayLW,RichardsonJS&RichardsonDC(2004):structurevalidationandall-atomcontactanalysisfornucleicacidsandtheirNucleicAcidsRes,W615…W619.46RamakrishnanC&RamachandranGN(1965)Stereo-chemicalcriteriaforpolypeptideandproteinchainconformations.IIAllowedconformationforapairofpeptideunits.BiophysJ,909…933.47BrungerAT,AdamsPD,CloreGM,DeLanoWL,GrosP
,Grosse-KunstleveRW,JiangJS,KuszewskiJ,NilgesM,PannuNSetal.(1998)CrystallographyandNMRsystem:anewsoftwaresuiteformacromolecularstructuredetermination.ActaCrystallogrDBiolCrys-,905…921.48AddlagattaA,KrzywdaS,CzapinskaH,OtlewskiJ&JaskolskiM(2001)Ultrahigh-resolutionstructureofaBPTImutant.ActaCrystallogrDBiolCrystallogr49HendricksonWA,StrandbergBE,LiljasA,AmzelLM&LattmanEE(1983)Trueidentityofadiffrac-tionpatternattributedtovalyltRNA.50JanssenBJ,ReadRJ,BrungerAT&GrosP(2007)Crystallography:crystallographicevidencefordeviatingC3bstructure.,E1…E2.51ChapmanMS,SuhSW,CurmiPM,CascioD,SmithWW&EisenbergDS(1988)TertiarystructureofplantRuBisCO:domainsandtheircontacts.,71…52KnightS,AnderssonI&BranCI(1989)Reexami-nationofthethree-dimensionalstructureofthesmallsubunitofRuBisCofromhigherplants.53MilneAA(1926)WinniethePooh,p.25.Methuen,54ChangG&RothCB(2001)StructureofMsbAfromE.coli:ahomologofthemultidrugresistanceATPbindingcassette(ABC)transporters.,1793…55NaviaMA,FitzgeraldPM,McKeeverBM,LeuCT,HeimbachJC,HerberWK,SigalIS,DarkePL&SpringerJP(1989)Three-dimensionalstructureofaspartylproteasefromhumanimmunode“ciencyvirus,615…620.56MillerM,JaskolskiM,RaoJKM,LeisJ&WlodawerA(1989)Crystalstructureofaretroviralproteaseprovesrelationshiptoasparticproteasefamily.,576…579.57WlodawerA,MillerM,JaskolskiM,SathyanarayanaBK,BaldwinE,WeberIT,SelkLM,ClawsonL,SchneiderJ&KentSBH(1989)Conservedfoldinginretroviralproteases:crystalstructureofasyntheticHIV-1protease.,616…621.58HansonMA,OostTK,SukonpanC,RichDH&Ste-vensRC(2000)StructuralbasisforBABIMinhibitionofbotulinumneurotoxintypeBprotease.JAmChem,11268…11269.59RuppB&SegelkeB(2001)Questionsaboutthestruc-tureofthebotulinumneurotoxinBlightchainincom-plexwithatargetpeptide.NatStructBiol,663…664.60HardingMM(1999)Thegeometryofmetal…ligandinteractionsrelevanttoproteins.ActaCrystallogrDBiolCrystallogr,1432…1443.61HardingMM(2002)Metal…ligandgeometryrelevanttoproteinsandinproteins:sodiumandpotassium.CrystallogrDBiolCrystallogr,872…874.62HardingMM(2006)Smallrevisionst
opredicteddis-tancesaroundmetalsitesinproteins.ActaCrystallogrDBiolCrystallogr,678…682.63BreseNE&OKeeffeM(1991)Bond-valenceparame-tersforsolids.ActaCrystallogrDBiolCrystallogr64BrownID(1992)Chemicalandstericconstraintsininorganicsolids.ActaCrystallogrDBiolCrystallogr65MullerS,KopkeS&SheldrickGM(2003)Isthebond-valencemethodabletoidentifymetalatomsinproteinActaCrystallogrDBiolCrystallogr,32…66SinghC(1965)Locationofhydrogenatomsincertainheterocycliccompounds.ActaCrystallogrDBiolCrys-,861…864.67WlodawerA(1982)NeutrondiffractionofcrystallineProgBiophysMolBiol,115…159.68NiimuraN,AraiS,KuriharaK,ChatakeT,TanakaI&BauR(2006)Recentresultsonhydrogenandhydra-tioninbiologystudiedbyneutronmacromolecularcrys-CellMolLifeSci,285…300.Proteincrystallographyfornon-crystallographersA.Wlodaweretal.FEBSJournal(2007)Journalcompilation2007FEBS.NoclaimtooriginalUSgovernmentworks 69LeeJE,BaeE,BingmanCA,PhillipsGNJr&RainesRT(2007)Structuralbasisforcatalysisbyonconase.JMolBiol,doi:10.1016/j.jmb.2007.09.089.70DydaF,HickmanAB,JenkinsTM,EngelmanA,Crai-gieR&DaviesDR(1994)Crystalstructureofthecata-lyticdomainofHIV-1integrase:similaritytootherpolynucleotidyltransferases.,1981…1986.71BujaczG,JaskolskiM,AlexandratosJ,WlodawerA,MerkelG,KatzRA&SkalkaAM(1995)Highresolu-tionstructureofthecatalyticdomainoftheaviansar-comavirusintegrase.JMolBiol,333…346.72BujaczG,JaskolskiM,AlexandratosJ,WlodawerA,MerkelG,KatzRA&SkalkaAM(1996)Thecatalyticdomainofaviansarcomavirusintegrase:conformationoftheactive-siteresiduesinthepresenceofdivalent,89…96.73MaignanS,GuilloteauJP,Zhou-LiuQ,Clement-MellaC&MikolV(1998)CrystalstructuresofthecatalyticdomainofHIV-1integrasefreeandcomplexedwithitsmetalcofactor:highlevelofsimilarityoftheactivesitewithotherviralintegrases.JMolBiol,359…368.74GoldgurY,DydaF,HickmanAB,JenkinsTM,CraigieR&DaviesDR(1998)ThreenewstructuresofthecoredomainofHIV-1integrase:anactivesitethatbindsProcNatlAcadSciUSA,9150…9154.75BotosI,MelnikovEE,CherryS,TropeaJE,KhalatovaAG,RasulovaF,DauterZ,MauriziMR,RotanovaTV,Wl
odawerAetal.(2004)ThecatalyticdomainofEscherichiacoliLonproteasehasauniquefoldandaSer-Lysdyadintheactivesite.JBiolChem,8140…76ImYJ,NaY,KangGB,RhoSH,KimMK,LeeJH,ChungCH&EomSH(2004)TheactivesiteofaLonproteasefromMethanococcusjannaschiidistinctlydif-fersfromthecanonicalcatalyticdyadofLonproteases.JBiolChem,53451…53457.77BotosI,MelnikovEE,CherryS,KozlovS,MakhovskayaOV,TropeaJE,GustchinaA,RotanovaTV&WlodawerA(2005)Atomic-resolutioncrystalstructureoftheproteolyticdomainofLonrevealstheconformationalvariabilityintheactivesitesofLonproteases.JMolBiol,144…78HoltonJ&AlberT(2004)AutomatedproteincrystalstructuredeterminationusingProcNatlAcadSci,1537…1542.79JedrzejczakR,DauterZ,DauterM,PiatekR,ZalewskaB,MrozM,BuryK,NowickiB&KurJ(2006)StructureofDraDinvasinfromuropathogenicEscherichiacoli:adimerwithswappedbeta-tails.CrystallogrDBiolCrystallogr,157…164.80WangJ,DauterM,AlkireR,JoachimiakA&DauterZ(2007)Tricliniclysozymeat0.65ACrystallogrDBiolCrystallogr,1254…1268.81WlodawerA,LiM,GustchinaA,DauterZ,UchidaK,OyamaH,GoldfarbNE,DunnBM&OdaK(2001)Inhibitorcomplexesofthe,15602…15611.A.Wlodaweretal.Proteincrystallographyfornon-crystallographersFEBSJournal(2007)Journalcompilation2007FEBS.NoclaimtooriginalUSgovernmentworks
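Supplementary sketch. The B-factor and occupancy checks recommended in the closing section of this review can be automated for a legacy fixed-column PDB file. The Python below is only an illustrative sketch, not a substitute for a validation program: the function names, the per-residue averaging of B-factors, and the `make_atom_line` helper used to build demonstration records are all assumptions introduced here. It reads occupancy from columns 55–60 and the B-factor from columns 61–66 of ATOM/HETATM records, and applies the 40 Å² rule of thumb quoted in the text.

```python
from collections import defaultdict

B_CUTOFF = 40.0  # Å^2, the rule-of-thumb threshold quoted in the text


def make_atom_line(serial, name, resname, chain, resseq, occ, b):
    """Hypothetical helper: build one fixed-column ATOM record for the demo."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{occ:6.2f}{b:6.2f}")


def parse_atoms(pdb_text):
    """Yield (chain, resseq, resname, occupancy, b_factor) per atom record."""
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")):
            yield (line[21], int(line[22:26]), line[17:20].strip(),
                   float(line[54:60]), float(line[60:66]))


def flag_residues(pdb_text, b_cutoff=B_CUTOFF):
    """Return {(chain, resseq, resname): [flags]} for suspicious residues."""
    per_residue = defaultdict(list)
    for chain, resseq, resname, occ, b in parse_atoms(pdb_text):
        per_residue[(chain, resseq, resname)].append((occ, b))

    report = {}
    for key, atoms in per_residue.items():
        flags = []
        if any(occ == 0.0 for occ, _ in atoms):
            flags.append("zero occupancy")        # atoms with no experimental support
        if any(occ > 1.0 for occ, _ in atoms):
            flags.append("occupancy above 1.0")   # an obvious error
        mean_b = sum(b for _, b in atoms) / len(atoms)
        if mean_b > b_cutoff:
            flags.append("mean B above cutoff")   # fragment may be poorly defined
        if flags:
            report[key] = flags
    return report


# Demonstration on synthetic records (not taken from any deposited structure).
sample = "\n".join([
    make_atom_line(1, "N", "ALA", "A", 1, 1.00, 15.2),
    make_atom_line(2, "CA", "ALA", "A", 1, 1.00, 14.8),
    make_atom_line(3, "NZ", "LYS", "A", 2, 0.00, 20.0),
    make_atom_line(4, "CG", "GLU", "A", 3, 1.00, 55.0),
    make_atom_line(5, "CD", "GLU", "A", 3, 1.00, 61.5),
])
for residue, flags in flag_residues(sample).items():
    print(residue, flags)
```

In practice one would restrict the report to the residues on which the biological interpretation rests, such as the active site or a ligand-binding pocket. Note that occupancies between 0 and 1 are legitimate (for example, alternate conformations) and are deliberately not flagged here; files in mmCIF format would need a different parser.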