REVIEW ARTICLE
published: 04 December 2013
doi: 10.3389/fimmu.2013.00416

A guide to bioinformatics for immunologists

Fiona J. Whelan, Nicholas V. L. Yap, Michael G. Surette, G. Brian Golding and Dawn M. E. Bowdish

Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada
Department of Biology, McMaster University, Hamilton, ON, Canada

Keywords: bioinformatics, immunology, sequence alignments, single-nucleotide polymorphisms, transcriptional profiling, scavenger receptor

Although public perception indicates that bioinformatics is a relatively new discipline born out of the omics age, bioinformatics

Whelan et al. Guide to bioinformatics for immunologists

name (12). Since this time, it has become the de facto file format for most, if not all, bioinformatic sequence analyses. Simply put, this format is a description of a sequence preceded by a greater-than (>) symbol, followed by the sequence in the standard IUPAC nucleotide or protein code.

An accurately annotated and appropriately formatted sequence of the gene(s) of interest is a prerequisite of many bioinformatic techniques. Since 2007, the National Center for Biotechnology Information (NCBI) has made the nucleotide sequences of more than 260,000 organisms accessible through its publicly available database, GenBank (13). GenBank's global coverage of sequence data is ensured by daily exchanges of information with the European Molecular Biology Laboratory's (EMBL) Nucleotide Sequence Database and the DNA Data Bank of Japan (DDBJ) (13). The information stored in GenBank is made accessible through Entrez, NCBI's comprehensive search engine (13). Users of Entrez have the option of searching within specific databases, such as nucleotide and protein sequences, Expressed Sequence Tags (ESTs), and macromolecular structures (14). One such database is Entrez Gene, which provides gene-centered information (15). Entrez Gene includes only those gene records corresponding to genomes which have been fully sequenced or to genes that have active research groups associated with them (15); searching this or other curated databases therefore helps avoid poor search results. Additionally, because some annotations in complete genomes are quite suspect, the use of Entrez Gene prevents the use of inappropriately annotated or low quality sequences.

Searching this database provides useful information such as the Genomic regions, transcripts, and products section, which is helpful in visualizing the exonic structure and chromosomal orientation of a gene. The Bibliography section summarizes peer-reviewed articles in which the gene is at the forefront. Additionally, a multiple sequence alignment of the gene of interest to known homologs can be generated by choosing the Homology section under General gene information; this may be of interest to those conducting cross-species or evolutionary studies.

When gathering sequence data, the user should refer to the section entitled NCBI Reference Sequences (RefSeq) (Figure 1). Using RefSeqs is important because these sequences meet a stringent standard set by NCBI, including the assurance that supporting evidence for the gene is available (16). Here, at least one set of mRNA and protein sequences will be displayed; isoforms of a given protein are displayed as multiple entries. Although we have chosen to use the NCBI's Entrez platform in this example, it should be noted that there are other equally

FIGURE 1 | Retrieval of nucleic acid and protein FASTA formatted sequences from an Entrez Gene search. Upon searching for and selecting the Homo sapiens SCARA3 gene, a variety of information can be retrieved, including identifiers for the Ensembl, Mendelian Inheritance in Man (MIM), and Human Protein Reference Database, in addition to information about the genomic context of the gene. From the NCBI Reference Sequences (RefSeq) section, the most up-to-date and thoroughly curated FASTA formatted sequences may be obtained. Sequences with Accession Identifiers beginning with NM or XM are mRNA and those beginning with NP or XP are protein. Multiple RefSeq entries may be present in the case of gene isoforms. Upon selecting the NP_057324.2 Accession Identifier, information concerning the SCARA3 isoform 1 protein is displayed, including links to publications involving this protein. By selecting FASTA at the top of the page, the FASTA formatted sequence is provided, which includes the reference number, species, and name. This sequence is suitable for input into most online bioinformatic tools.
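The file layout and accession prefixes described in the Figure 1 legend can be illustrated with a minimal Python sketch. It assumes nothing beyond what the text states: a description line introduced by a greater-than symbol, followed by one or more sequence lines, and the NM/XM (mRNA) and NP/XP (protein) prefixes. The function names and the example record are hypothetical and chosen only for illustration; the sequence shown is a made-up placeholder, not the SCARA3 sequence.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Prefix meanings as stated in the Figure 1 legend:
# NM and XM denote mRNA, NP and XP denote protein.
REFSEQ_PREFIXES: Dict[str, str] = {
    "NM": "mRNA",
    "XM": "mRNA",
    "NP": "protein",
    "XP": "protein",
}


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (description, sequence) pairs from FASTA-formatted text."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def refseq_molecule_type(accession: str) -> str:
    """Classify a RefSeq accession by its two-letter prefix."""
    prefix = accession.split("_", 1)[0].upper()
    return REFSEQ_PREFIXES.get(prefix, "unknown")


# Hypothetical record: the accession and residues are placeholders.
EXAMPLE = """>NP_000000.0 hypothetical example protein [Homo sapiens]
MKVRSAGG
DEEKLAQY
"""

records = list(parse_fasta(EXAMPLE))
for description, sequence in records:
    accession = description.split()[0]
    print(accession, refseq_molecule_type(accession), len(sequence))
```

Joining the sequence lines before use matters in practice, because many tools wrap sequences at a fixed width and the line breaks carry no biological meaning.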
Frontiers in Immunology | Molecular Innate Immunity, December 2013 | Volume 4 | Article 416

Table 1 | Public databases containing DNA, mRNA and protein sequences.

Acronym | Name | Hosted by | URL | Features | Reference
GenBank | GenBank | National Center for Biotechnology Information | http://www.ncbi.nlm.nih.gov/genbank/ | An annotated collection of all publicly available DNA sequences (EST, gene and transcript sequences and unannotated single read sequences from genome sequencing projects) | Benson et al. (13)
EMBL-BANK | EMBL Nucleotide Sequence Database | European Molecular Biology Laboratory (EMBL) | http://www.ebi.ac.uk/embl/ | A collection of DNA and RNA sequences submitted by researchers, genome sequencing projects, and patent applications. In addition to querying individual genes, whole genomes may be browsed | Kulikova (56)
DDBJ | DNA Data Bank of Japan | DNA Data Bank of Japan | http://www.ddbj.nig.ac.jp/ | A collection of nucleotide sequences where sequences of recently sequenced genomes are particularly well represented | Miyazaki (57)
UCSC | UCSC Genome Bioinformatics site | Genome Bioinformatics Group at the University of California Santa Cruz | http://genome.ucsc.edu/ | Contains reference sequences and working draft assemblies for a large collection of genomes. Source of sequences for genomes that have not been comprehensively sequenced and annotated (e.g., Neandertal) | Kent et al. (58)
appropriate databases available. Although it is beyond the scope of this review to describe them in detail, Table 1 provides an overview.

PREDICTING POST-TRANSLATIONAL MODIFICATIONS

Post-translational modifications of a protein can include phosphorylation, glycosylation, ubiquitination, methylation, and lipidation amongst many others. Post-translational modification may change the function, cellular localization, or abundance of a protein. Just as understanding protein domains and genomic context can inform the function of a protein, understanding how a protein is post-translationally modified may provide important clues regarding function. For example, signal transduction mediated by the immunoreceptor tyrosine-based activation motif (ITAM) of the T-cell receptor requires the dual phosphorylation of two of its tyrosine residues [reviewed in Ref. (17)]. Predictions as to which of the many possible post-translational modifications are statistically likely in a given protein may explain cellular localization patterns and regulation of protein abundance, and may indicate whether the protein contains specific signaling properties.

As an example, previous research has demonstrated that the prototypical member of the class A scavenger receptors, SRAI, has a serine in its cytoplasmic domain which, when phosphorylated, is essential for its phagocytic function (18, 19). However, it is not known whether the other members of the class A scavenger receptor family, such as SCARA3, contain similar sites of post-translational modification. Knowledge of such sites would suggest that SCARA3, like SRAI, is also a phagocytic receptor whose signaling pathways are conserved within this receptor family. The SCARA3 FASTA formatted protein sequence obtained from NCBI was analyzed using the NetPhos 2.0 Server (Figure 2). This tool was built on the knowledge that the 7 to 12 amino acids neighboring a phosphorylated residue tend to have a specified composition in order to be recognized by specific kinases and phosphatases (20). Using this information, NetPhos predicts sites of phosphorylation in a protein sequence. In the case of SCARA3, multiple sites were identified over the threshold probability value defined by the software to be serine (S)-, threonine (T)-, or tyrosine (Y)-phosphorylated (Figure 2), indicating that even though these residues differ from those identified in SRAI, SCARA3 may possess similar functionality.

In addition to NetPhos, there are many post-translational modification prediction tools publicly available which require the sole input of a protein sequence. A representative collection of these tools is summarized in Table 2.

IDENTIFYING CONSERVED MOTIFS

Some regions of a gene are more susceptible to the accumulation of mutational change over evolutionary time than others, and protection from change is largely due to the biological importance of such a region (21). Highly conserved regions have generally been demonstrated to encode areas essential for a protein's expression or function, where even slight changes would threaten the organism's survival. In contrast, in other areas of a protein, neutral mutations that do not affect protein function may accumulate over time (21). By examining areas of conservation in a protein of interest across its orthologs (i.e., genes separated by a speciation event; the same gene in different species) and paralogs (i.e., genes separated by a gene duplication event; similar genes in the same species), one can predict regions that are important for expression or function (22).

This is accomplished by performing sequence alignments. Simply put, an alignment of sequences is the addition of gaps (each represented as "-") at variable positions in a set of input sequences in order to maximize the number of similar residues per column in the alignment (22). These alignments come in a variety of forms: first, they can be either pairwise, involving only two sequences, or multiple, involving more than two sequences. Second, they can be global, which means the full length of all sequences is aligned, or local, indicating that the best alignment is displayed, even if that means aligning only a portion of the input sequences to each other (23). The use of pairwise versus multiple sequence alignments depends on how many closely related proteins the user has at their disposal; a larger number of closely related sequences will better inform the alignment. However, the choice of local
Table 2 | A representative collection of bioinformatic tools for post-translational modification (PTM) prediction.

Name | Hosted by | PTM predicted | URL/Reference
NetCGlyc 1.0 Server | Center for Biological Sequence Analysis (CBS) | C-mannosylation sites in mammalian proteins | http://genome.cbs.dtu.dk/services/NetCGlyc/; Julenius (59)
NMT | The Research Institute of Molecular Pathology (IMP) Bioinformatics Group | The MYR predictor for prediction of N-terminal N-myristoylation of proteins | http://mendel.imp.univie.ac.at/myristate/SUPLpredictor.htm
PrePS: Prenylation Prediction Suite | The Research Institute of Molecular Pathology (IMP) Bioinformatics Group | Predicts whether a protein is prenylated | http://mendel.imp.ac.at/PrePS/; Maurer-Stroh and Eisenhaber (60)
NetPhos 2.0 Server | Center for Biological Sequence Analysis (CBS) | Predictions of phosphorylation sites on serine, threonine, and tyrosine residues | http://genome.cbs.dtu.dk/services/NetPhos/; Blom et al. (20)
The Sulfinator | ExPASy Bioinformatics Resource Portal | Prediction of tyrosine sulfation sites | http://web.expasy.org/sulfinator/; Monigatti et al. (61)
SUMOplot Analysis tool | Abgent | Predicts the probability of sumoylation sites within a protein sequence | http://www.abgent.com/tools/
ProP 1.0 Server | Center for Biological Sequence Analysis (CBS) | Predicts arginine and lysine propeptide cleavage sites | http://genome.cbs.dtu.dk/services/ProP/; Duckert et al. (62)
UBPred | Indiana University, Columbia University, University of California, San Diego, CA, USA | Predicts protein ubiquitination sites | http://www.ubpred.org/; Radivojac et al. (63)
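Phosphorylation predictors such as the NetPhos 2.0 Server listed in Table 2 rest on the idea that the residues flanking a serine, threonine, or tyrosine influence whether it is recognized. As a rough illustration of that first step only, the Python sketch below lists every S, T, or Y in a protein sequence together with its neighboring residues. It does not reproduce NetPhos or its scoring in any way; the function name, the padding character, and the default of four flanking residues on each side (a 9-residue window, within the 7 to 12 residue range mentioned in the text) are assumptions made for this example.

```python
from typing import List, Tuple

# Residues that can carry a phosphate group according to the text.
PHOSPHO_ACCEPTORS = {"S", "T", "Y"}


def candidate_windows(sequence: str, flank: int = 4) -> List[Tuple[int, str, str]]:
    """Return (1-based position, residue, window) for every S, T, or Y.

    Each window holds the residue plus `flank` neighbors on either side.
    Windows near the termini are padded with '-' so that all windows
    have the same length, 2 * flank + 1.
    """
    seq = sequence.upper()
    padded = "-" * flank + seq + "-" * flank
    hits: List[Tuple[int, str, str]] = []
    for i, residue in enumerate(seq):
        if residue in PHOSPHO_ACCEPTORS:
            # In `padded`, residue i sits at index i + flank,
            # so the window centered on it starts at index i.
            window = padded[i : i + 2 * flank + 1]
            hits.append((i + 1, residue, window))
    return hits


# Hypothetical five-residue peptide, used only to show the output shape.
for position, residue, window in candidate_windows("MASKT", flank=2):
    print(position, residue, window)
```

A real predictor would then score each window against patterns learned from experimentally verified sites and report only those above a threshold, as described for SCARA3 in the text.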
There are many publicly available PTM prediction tools that require only the input of a protein sequence. This table outlines a representative subset that is available as online tools.

versus global alignments is not as straightforward. The results of local alignments are often more meaningful because the method emphasizes regions of high similarity between sequences (23). These types of alignments are quite informative when comparing divergent protein sequences that are hypothesized to share a specific protein domain. However, often a researcher is interested in comparing full-length sequences of high similarity to each other, in which case a global alignment must be employed.

In our case, we were interested in the similarities of SCARA3 to the other members of the class A scavenger receptors (its paralogs) that, to date, have been better characterized in terms of biological function and expression. Any similarities between specific regions of SCARA3 and these well-characterized cousins would allow us to hypothesize that these regions perform similar functions in both proteins. As such, we computed a global alignment of the human SCARA3 protein with the other four members of this protein family (Figure 3). A global sequence alignment is used in this case because previous research has suggested that these proteins have evolved in parallel for many millions of years, resulting in some similar biological functions, suggesting that they share areas of similarity across the full lengths of these proteins (11, 24).

The European Molecular Biology Laboratory's European Bioinformatics Institute (EBI) has a set of tools available for both pairwise (http://www.ebi.ac.uk/Tools/psa) and multiple (http://www.ebi.ac.uk/Tools/msa) sequence alignments. In the example in Figure 3, we perform a global multiple sequence alignment of the class A scavenger receptor protein sequences from Homo sapiens using the ClustalW2 tool (Figure 3A). ClustalW2 was chosen because it is suitable for medium-length alignments, which makes it well suited to the analysis of the scavenger receptors, which are approximately 500 amino acids in length. Additionally, ClustalW2 produces a colorful output, which makes it easy to identify conserved residues and patterns of charge or residue repeats by visual inspection.

A portion of the results of this alignment can be visualized in Figure 3B. Notably, this alignment identified an area of conservation at the C-terminal region of the collagenous domain across all five members of the class A scavenger receptors (Figure 3C). This area, consisting of predominantly charged amino acids, has been previously implicated in ligand binding in SRAI (25). Consequently, we might predict that this region is a ligand-binding site not only in SRAI, but also in the other four members of this protein family.

Another approach to the identification of conserved motifs, especially useful when no known homologs exist, is the use of specialized tools that examine an input sequence for known domains. An example of such a tool is NCBI's Conserved Domain Search (CD-search), which compares a user-provided sequence against an NCBI-curated database of known domains (26). These tools do not capture the intricacies revealed by sequence alignments but can nonetheless be very informative.

STRUCTURAL ANALYSIS

ACQUIRING PUBLICLY AVAILABLE MACROMOLECULAR STRUCTURES

Of course, while clues to a protein's function can be hidden within its sequence, it is ultimately the protein's structure that dictates its function. Because of the ease of DNA and protein sequencing given today's technologies, there is more sequence data available compared to structural evidence; however, databases
resources such as IRIS (Immune response in silico) take a similar approach to characterizing the transcriptional profiles of human leukocyte subsets and include different activation states (47).

GENETIC VARIATION

ANALYSIS OF SINGLE-NUCLEOTIDE POLYMORPHISM

The most common type of variation within the human genome is the single-nucleotide polymorphism (SNP), which occurs, on average, every 1200 base pairs (48). SNPs can be non-synonymous or synonymous; non-synonymous SNPs result in a change in the amino acid sequence of the translated protein, while synonymous SNPs do not alter the amino acid composition because of the redundancy of the genetic code.

Single-nucleotide polymorphism analysis of a protein can greatly aid in the understanding of its function, as these small alterations can result in substantial changes in the functionality of the protein. For example, a SNP at a receptor's binding site may alter the original protein such that it is able to bind a pathogen that it previously could not, or, in contrast, may abolish its ability to bind its usual binding partner. In one study, researchers studied differences in SNP frequencies of Mal/TIRAP to explain differences in TLR2 and TLR4 signaling between European and African populations (49). After cloning the two variants, S180L and S180, results indicated that S180L heterozygous individuals had a higher cytokine production level than S180 homozygous individuals (49). Lower allele frequencies of S180L in African and Asian populations might indicate that selection occurred after humans migrated from Africa, since the variant may have granted added bacterial resistance in the changing habitat (49). This study demonstrates how SNP analyses can be used to identify functional domains of a protein as well as uncover a protein's potential evolutionary history.

There are several publicly available online databases for the analysis of SNPs in a protein of interest (summarized in Table 5); here, we use the University of California, Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) to perform an analysis of SNPs present within SCARA3. Regions of interest can be searched for by entering the name of a gene or its corresponding chromosomal position. The Genome Browser contains multiple tracks that contain different types of annotation, including those based on NCBI RefSeqs, mRNA alignments, and UCSC Genes (50) (Figure 7). In addition, the browser can display reports regarding gene expression, regulation, and variation, among other information (50). The UCSC Genome Browser includes an annotated SNP track with over 23 million reference SNPs from NCBI's SNP Database (dbSNP) (50) (Figure 7B). SNPs are annotated using a refSNP cluster ID number (rs#), which represents all SNPs, often from multiple population studies, that map to the same location in the gene. Additionally, each individual SNP within a cluster is associated with a SNP Accession number (ss#) (48). Selecting a refSNP cluster within the Genome Browser will display information such as the nucleotide

Table 5 | Publicly available single-nucleotide polymorphism (SNP) databases.

Name | Hosted by | URL | Features | Availability | Reference
UCSC | University of California, Santa Cruz, CA, USA | http://genome.ucsc.edu/ | Integrated browser displaying tracks built from annotation sets including SNPs, mRNA, disease association studies, and more | Web applet | Kent (68)
dbSNP | National Center for Biotechnology Information | http://ncbi.nlm.nih.gov/SNP/ | Central database of SNPs with integrated data from multiple population studies including the 1000 Genomes Project | Web applet | Sherry et al. (48)
GWAS central (formerly HGVbase database) | Institutes, Consortia, and individual laboratories | http://gwascentral.org/ | Database of human genetic variation. Displays information on phenotypes, genes, regions, or markers based on SNPs | Web applet | Fredman et al. (69)
ENSEMBL | European Bioinformatics Institute (EBI) | http://ensembl.org/ | Contains available genomes of multiple species. Displays summary information regarding isoforms, SNPs, and other features of genes or proteins | Web applet | Flicek et al. (70)
HapMap | National Center for Biotechnology Information | http://hapmap.ncbi.nlm.nih.gov/ | Contains integrated data of SNPs for haplotype analysis, finding tag SNPs, and for identifying GWAS hits | Web applet | Gibbs et al. (71)
1000 Genomes Project | European Bioinformatics Institute | http://1000genomes.org | Contains 1092 available human genomes for analysis as well as summary documentation regarding SNPs and other variation | FTP download | Abecasis et al. (72)
HaploView | The Broad Institute | http://broadinstitute.org/ | Calculates r² and D′ values for performing haplotype analysis of SNPs with HapMap data or user input data | For download on all major platforms | Barrett et al. (73)

This list includes only SNP databases that focus on human and/or mouse sequences; other, more specialized databases may exist for other organisms. All databases listed accept novel SNPs from private and public organizations.

FIGURE 7 | Using the UCSC Genome Browser to search for single-nucleotide polymorphisms (SNPs) in SCARA3. This browser contains multiple tracks, including the location of SNPs across the length of a protein. Here we show the output from inputting the NCBI RefSeq for the SCARA3 isoform (A). Further options to hide or show more annotation tracks are available directly below the graphical output. Under the Variation and Repeats tab, selecting pack under the Common SNPs option updates the output to include a full display of SNPs represented by their refSNP cluster ID numbers [(B), circled]. Clicking on any of the refSNP cluster IDs leads to a page displaying further information regarding the SNP as well as a link to NCBI's dbSNP database.

FIGURE 8 | Example results page from the NCBI dbSNP database for SCARA3 SNP rs17057523. By following the link from the UCSC Genome Browser to dbSNP, more information is provided for SNP rs17057523, including allele frequencies, ancestral alleles, and chromosomal position (A). Following this information on the database website are other tabs that show more information regarding the SNP that may be useful to investigators. The Population Diversity section displays information regarding allele frequencies from different sampled populations (B). Clicking any of the population links shows information on how the SNP was genotyped, the population sample size, and other experimental conditions used.
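The distinction between synonymous and non-synonymous SNPs drawn in this section can be made concrete with a short Python sketch: translate the codon before and after the substitution and compare the resulting amino acids. The sketch assumes the standard genetic code and a SNP that falls within a single coding codon; the function name is hypothetical, and the codon TCG used in the usage lines is an arbitrary serine codon chosen to mimic a serine-to-leucine change of the kind denoted S180L, not the codon reported in the cited study.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon: str, position: int, alt_base: str) -> str:
    """Classify a single-base substitution within one coding codon.

    `position` is the 0-based index (0, 1, or 2) of the changed base.
    Returns 'synonymous' if the encoded amino acid is unchanged and
    'non-synonymous' otherwise.
    """
    codon = codon.upper()
    alt_base = alt_base.upper()
    if len(codon) != 3 or not set(codon) <= set(BASES):
        raise ValueError("codon must be three DNA bases (A, C, G, T)")
    if position not in (0, 1, 2):
        raise ValueError("position must be 0, 1, or 2")
    if len(alt_base) != 1 or alt_base not in BASES:
        raise ValueError("alt_base must be one of A, C, G, T")
    if codon[position] == alt_base:
        raise ValueError("alt_base is identical to the reference base")

    variant = codon[:position] + alt_base + codon[position + 1:]
    same_residue = CODON_TABLE[codon] == CODON_TABLE[variant]
    return "synonymous" if same_residue else "non-synonymous"


# TCG (serine) -> TTG (leucine): the amino acid changes.
print(classify_snp("TCG", 1, "T"))
# TCG (serine) -> TCA (serine): the amino acid is unchanged.
print(classify_snp("TCG", 2, "A"))
```

The second call illustrates the redundancy of the genetic code mentioned in the text: many third-position changes leave the encoded residue unchanged.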
60. Maurer-Stroh S, Eisenhaber F. Refinement and prediction of protein prenylation motifs. Genome Biol (2005) 6:R55. doi:10.1186/gb-2005-6-6-r55
61. Monigatti F, Gasteiger E, Bairoch A, Jung E. The Sulfinator: predicting tyrosine sulfation sites in protein sequences. Bioinformatics (2002) 18:769–70. doi:10.1093/bioinformatics/18.5.769
62. Duckert P, Brunak S, Blom N. Prediction of proprotein convertase cleavage sites. Protein Eng Des Sel (2004) 17:107–12. doi:10.1093/protein/gzh013
63. Radivojac P, Vacic V, Haynes C, Cocklin RR, Mohan A, Heyen JW, et al. Identification, analysis, and prediction of protein ubiquitination sites. Proteins (2010) 78:365–80. doi:10.1002/prot.22555
64. Andrei RM, Loni T, Callieri M, Zini MF, Maraziti G, Pan MC. BioBlender: A Software for Intuitive Representation of Surface Properties of Biomolecules. (2010). p. 1–19. Available at: http://cds.cern.ch/record/1294402
65. Jmol: An Open-Source Java Viewer for Chemical Structures in 3D. Available at: http://www.jmol.org/
66. Cole C, Barber JD, Barton GJ. The Jpred 3 secondary structure prediction server. Nucleic Acids Res (2008) 36:W197–201. doi:10.1093/nar/gkn238
67. Chou P, Fasman G. Prediction of protein conformation. Biochemistry (1974) 13:222–45. doi:10.1021/bi00699a002
68. Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res (2002) 12:656–64. doi:10.1101/gr.229202
69. Fredman D, Siegfried M, Yuan YP, Bork P, Lehväslaiho H, Brookes AJ. HGVbase: a human sequence variation database emphasizing data quality and a broad spectrum of data sources. Nucleic Acids Res (2002) 30:387–91. doi:10.1093/nar/30.1.387
70. Flicek P, Ahmed I, Amode MR, Barrell D, Beal K, Brent S, et al. Ensembl 2013. Nucleic Acids Res (2012) 41:D48–55. doi:10.1093/nar/gks1236
71. Gibbs RA, Belmont JW, Hardenbol P, Willis TD, Yu F, Yang H, et al. The international HapMap project. Nature (2003) 426:789–96. doi:10.1038/nature02168
72. Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, et al. An integrated map of genetic variation from 1,092 human genomes. Nature (2012) 490:56–65. doi:10.1038/nature11632
73. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics (2005) 21:263–5. doi:10.1093/bioinformatics/bth457

Conflict of Interest Statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Received: 10 September 2013; accepted: 13 November 2013; published online: 04 December 2013.

Citation: Whelan FJ, Yap NVL, Surette MG, Golding GB and Bowdish DME (2013) A guide to bioinformatics for immunologists. Front. Immunol. 4:416. doi: 10.3389/fimmu.2013.00416

This article was submitted to Molecular Innate Immunity, a section of the journal Frontiers in Immunology.

Copyright © 2013 Whelan, Yap, Surette, Golding and Bowdish. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.