Three Years of Experience with Sledgehammer
L. C. Paulson and J. C. Blanchette

The two aspects of problem preparation (translation into first-order logic and identification of relevant facts) each required a substantial research effort. The numerous choices outlined below were made on the basis of innumerable experiments that consumed many thousands of hours of processor time.

2.1 Translation into First-Order Logic

Most interactive theorem provers support a language much richer than that of first-order logic. Isabelle/HOL [21] supports polymorphic higher-order logic [2, 9, 10], augmented with axiomatic type classes [39].¹ Many user problems contain no higher-order features and might be imagined to lie within first-order logic; however, even these problems are full of typing information. Type information can take quadratic space [17] because every term must be annotated with its type, recursively, right down to the variables.

Hurd [13] observed that omitting type information greatly improved the success rate of his theorem prover, Metis. This is hardly surprising, since the type information virtually buries the terms themselves. Hurd was able to omit type information because his proofs are reconstructed within HOL4 [10], which rejected any proofs that did not correspond to well-typed higher-order logic deductions.

Sledgehammer was always intended to rely on an analogous process of sound proof reconstruction, and from the outset it was clear that including complete type information would be unworkable. Completely omitting type information, although successful for HOL4, would not have worked for Isabelle because of its heavy use of type classes. We chose to include enough type information to enforce correct type class reasoning (the type class hierarchy is easily expressed using Horn clauses) but not to specify the type of every term [19, §4]. Some colleagues have expressed horror at the very idea of using unsound translations; the first author has written a lengthy exploration of the salient issues [17, §2.8].

Higher-order problems posed special difficulties. We never expected first-order theorem provers to perform deep higher-order reasoning, but merely hoped to automate proofs where the higher-order steps were trivial. We examined several methods of translating higher-order problems into first-order logic, allowing truth values to be the values of terms and curried functions to take varying numbers of arguments [17]. We eventually adopted a translation based on the one that we used for first-order logic, modified to introduce higher-order mechanisms (such as an “apply operator” for function values) only when absolutely necessary. We thereby eliminated our original distinction between first-order and higher-order problems. A higher-order feature within a problem affects the translation locally, yielding a smooth transition from purely first-order to heavily higher-order problems.

We also experimented with two methods of eliminating λ-abstractions in terms: by translating them into combinator form or by declaring equivalent functions. We ultimately opted for a naive translation scheme based on the combinators S, K, I, B, and C: more sophisticated schemes delivered no additional benefits.

Unfortunately, our experience suggests that Sledgehammer is seldom successful on problems containing higher-order elements. Integration with a genuine higher-order automatic theorem prover, such as LEO-II [5] and Satallax [3], seems necessary. This would pose interesting problems for proof reconstruction: LEO-II's approach is to reduce higher-order problems to first-order ones by repeatedly applying specialised inference rules and then calling first-order ATPs. A LEO-II proof will therefore consist of a string of higher-order steps followed by a first-order proof. The latter part we know how to do; the crucial challenge is to devise a reliable way of emulating the higher-order steps within Isabelle.

Arithmetic remains an issue. A purely arithmetic problem can be solved using decision procedures, but what about problems that combine arithmetic with a significant amount of logic? In principle, Sledgehammer could solve such problems with the help of an ATP that combined arithmetic and logical reasoning, analogous to LEO-II's approach to higher-order logic. Current SMT solvers are probably of little

¹ Isabelle [26] is a generic theorem prover, based on a logical framework [23]. Isabelle/HOL is the specialisation of Isabelle for higher-order logic.
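
The naive combinator translation of λ-abstractions mentioned above can be illustrated with classical bracket abstraction over S, K, I, B, and C. The Python sketch below is only an illustration under stated assumptions: the tuple-based term representation and the names `app`, `occurs`, `abstract`, `translate`, and `normalize` are hypothetical and are not taken from Sledgehammer, and variable names are assumed not to clash with the combinator names.

```python
# Terms (hypothetical representation, for illustration only):
#   "x"                 a variable, constant, or combinator name
#   ("app", f, a)       application of f to a
#   ("lam", x, body)    the abstraction  λx. body


def app(f, *args):
    """Build the left-nested application f a1 a2 ... an."""
    for a in args:
        f = ("app", f, a)
    return f


def occurs(x, t):
    """Return True if variable x occurs in the lambda-free term t."""
    if isinstance(t, str):
        return t == x
    _, f, a = t
    return occurs(x, f) or occurs(x, a)


def abstract(x, t):
    """Bracket abstraction [x] t for a lambda-free term t."""
    if t == x:
        return "I"                          # [x] x      = I
    if not occurs(x, t):
        return app("K", t)                  # [x] t      = K t
    _, f, a = t                             # t is an application mentioning x
    if not occurs(x, f):
        return app("B", f, abstract(x, a))  # [x] (f a)  = B f ([x] a)
    if not occurs(x, a):
        return app("C", abstract(x, f), a)  # [x] (f a)  = C ([x] f) a
    return app("S", abstract(x, f), abstract(x, a))


def translate(t):
    """Eliminate every abstraction in t, innermost first."""
    if isinstance(t, str):
        return t
    if t[0] == "lam":
        _, x, body = t
        return abstract(x, translate(body))
    _, f, a = t
    return app(translate(f), translate(a))


# Rewrite rules: arity and right-hand side of each combinator.
RULES = {
    "I": (1, lambda x: x),
    "K": (2, lambda x, y: x),
    "S": (3, lambda f, g, x: app(f, x, app(g, x))),
    "B": (3, lambda f, g, x: app(f, app(g, x))),
    "C": (3, lambda f, x, y: app(f, y, x)),
}


def normalize(t):
    """Normalise a lambda-free term by rewriting head redexes (may not
    terminate on terms without a normal form)."""
    args = []
    while not isinstance(t, str):           # unwind the application spine
        _, t, a = t
        args.append(a)
    args.reverse()
    if t in RULES and len(args) >= RULES[t][0]:
        n, rule = RULES[t]
        return normalize(app(rule(*args[:n]), *args[n:]))
    return app(t, *[normalize(a) for a in args])
```

For instance, `translate(("lam", "x", app("f", "x", "y")))` yields a term built from C, B, and I, and applying that term to an argument `a` and normalising it gives back `f a y`. The B and C cases simply avoid pushing the abstraction into subterms that do not mention the bound variable, which keeps the translated terms smaller than a pure S/K/I scheme would.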
used Sledgehammer many times while constructing a proof, would it be feasible to run that proof again, perhaps to modify it using a laptop while at a conference? To be useful, Sledgehammer would have to return a piece of proof script that could be executed cheaply.

2.3.1 Reconstruction of the Resolution Proof

The original plan was to emulate the inference rules of automatic theorem provers directly within Isabelle. We should have known better: Hurd [12] had noticed that the proofs delivered by Gandalf [35] were not detailed and explicit enough. We made the same discovery with SPASS and, despite considerable efforts, were only able to reconstruct a handful of proofs [19].

We came up with a new plan: to use a general theorem prover, Metis, to reconstruct each proof step. Metis was designed to be interfaced with LCF-style interactive theorem provers, specifically HOL4. Integrating it with Isabelle's proof kernel required significant effort [28]. Metis then became available to Isabelle users, and it turned out to be capable of reconstructing proof steps easily. The output of Sledgehammer was now a list of calls to Metis, each of which proved a clause.

While the output is primarily designed for replaying proofs, it also has a pedagogical value: unlike Isabelle's automatic tactics, which are black boxes, the proofs delivered by Sledgehammer can be inspected and understood. Consider the theorem “length (tl xs) ≤ length xs”, which states that the tail of a list (the list from which we remove its first element, or the empty list if the list is empty) is shorter than or of equal length as the original list. The proof produced by Vampire, expressed in Isabelle's structured Isar format, looks as follows:

  proof neg_clausify
    assume “¬ length (tl xs) ≤ length xs”
    hence “drop (length xs) (tl xs) ≠ []” by (metis drop_eq_Nil)
    hence “tl (drop (length xs) xs) ≠ []” by (metis drop_tl)
    hence “∀u. xs @ u ≠ xs ∨ tl u ≠ []” by (metis append_eq_conv_conj)
    hence “tl [] ≠ []” by (metis append_Nil2)
    thus “False” by (metis tl.simps(1))
  qed

The neg_clausify method transforms the Isabelle conjecture into negated clause form, ensuring that it has the same shape as the corresponding ATP conjecture. The negation of the clause is introduced by the assume keyword, and a series of intermediate facts introduced by hence lead to a contradiction. This approach was inspired by the Otterfier proof transformation service [40].

Resolution proofs should ideally be translated to natural, intuitive Isabelle proofs. The best-known prior work on translating resolution proofs is TRAMP [16]; its applicability to Sledgehammer is unexplored. Preliminary work has commenced at Munich to see to what extent resolution proofs can be transformed into intelligible proofs. The first step is to transform the proof into a direct proof by applying contraposition repeatedly and introducing case splits where appropriate. For example, the proof above is transformed into

  proof -
    have “tl [] = []” by (metis tl.simps(1))
    hence “∃u. xs @ u = xs ∧ tl u = []” by (metis append_Nil2)
    hence “tl (drop (length xs) xs) = []” by (metis append_eq_conv_conj)
    hence “drop (length xs) (tl xs) = []” by (metis drop_tl)
    thus “length (tl xs) ≤ length xs” by (metis drop_eq_Nil)
  qed

For most Isabelle users, the direct proof is much easier to understand and maintain.

be better to employ even more theorem provers. We have undertaken informal, unpublished experiments involving many other systems.

Gandalf [35] shows great potential, but unfortunately it does not output useful proofs; one cannot easily identify which axioms have taken part in the proof. A simple source code modification to improve the legibility of proofs would allow Gandalf to make useful contributions. Unfortunately, we were unable to identify the necessary changes. Gandalf has been found to be unsound,² but a small percentage of incorrect (and hence unreconstructable) proofs would be tolerable.

SInE, the Sumo Inference Engine [11], is a wrapper around E that is designed to cope with large axiom bases. We pass it more facts than can be handled by the other ATPs, and it sometimes surprises us with original proofs. In the current experimental setup, it is invoked remotely via SystemOnTPTP [34] in parallel with Vampire.

People sometimes suggest that we include Prover9 [15]. In our experiments, Prover9 performed poorly on the large problems generated by Sledgehammer. It could be effective in conjunction with an advanced and selective relevance filter.

We could also run multiple instances of a theorem prover with different heuristics. This is not necessary with Vampire, which attempts a variety of heuristics in separate time slices. It could be particularly effective with E, but designing suitable heuristics requires highly specialised skills.

3 Evaluation

In their “Judgement Day” study, Böhme and Nipkow [8] evaluated Sledgehammer with E, SPASS, and Vampire on 1240 provable proof goals arising in seven representative Isabelle theories:

  Arrow   Arrow's impossibility theorem
  NS      Needham–Schroeder shared-key protocol
  Hoare   Completeness of Hoare logic with procedures
  Jinja   Type soundness of a subset of Java
  SN      Strong normalisation of the typed λ-calculus with de Bruijn indices
  FTA     Fundamental theorem of algebra
  FFT     Fast Fourier transform

Sledgehammer has been developed further since they ran their experiments. In particular, it now communicates with ATPs using full first-order logic instead of clause form, adds SInE to the collection of ATPs, and employs the latest versions of SPASS, Vampire, and Metis. To account for these changes, we ran the Judgement Day benchmark suite on the same hardware as Böhme and Nipkow but with the latest version of Sledgehammer and of the Isabelle theories. When running the four ATPs in parallel for 120 seconds, followed by Metis with a 30-second time limit, Sledgehammer now solves 52% of the goals (compared with 48% in Böhme and Nipkow). The table below gives the success rates for each ATP and theory.

               Arrow    NS   Hoare  Jinja    SN    FTA    FFT    Avg.
  SInE 0.4      18%    22%    43%    31%    61%    53%    17%    40%
  E 1.0         19%    39%    45%    33%    66%    57%    17%    44%
  SPASS 3.7     30%    35%    43%    32%    59%    58%    17%    44%
  Vampire 1.0   36%    40%    50%    35%    63%    60%    17%    47%
  Together      43%    45%    54%    41%    68%    65%    26%    52%

² See http://www.cs.miami.edu/~tptp/TPTP/BustedAsUnsound.html.

    hence “x ⊆ space M” by (metis sets_into_space lambda_system_sets)
    hence “space M - (space M - x) = x” by (metis double_diff equalityE)
    thus “space M - x ∈ lambda_system M f” using x by (force simp add: lambda_system_def)
  qed

Each of the intermediate facts is proved by a call to Metis that was generated using Sledgehammer. While the example features a linear progression of facts, Isar proofs can also be nested to any depth.

Isar also supports calculational reasoning [4]. A chain of reasoning steps, connected by familiar relations such as =, ≤, and <, can be written with separate proofs for each step of the calculation. Once again, if the user can see the intermediate stages of the transformation, then the proof of each step can easily be found. The example below illustrates this type of reasoning:

  proof -
    ...
    have “f (u ∩ (x ∩ y)) + f (u - x ∩ y) = (f (u ∩ (x ∩ y)) + f (u ∩ y - x)) + f (u - y)” by (metis class_semiring.add_a ey)
    also have “... = (f ((u ∩ y) ∩ x) + f (u ∩ y - x)) + f (u - y)” by (metis Int_commute Int_left_commute)
    also have “... = f (u ∩ y) + f (u - y)” using fx Int y u by auto
    also have “... = f u” by (metis fy u)
    finally show “f (u ∩ (x ∩ y)) + f (u - x ∩ y) = f u” .
  qed

Top-down proof development is greatly assisted by a trivial Isar feature: the ability to omit proofs. Where a proof is required, the user may simply insert the word sorry. Isabelle then regards the theorem as proved.⁴ The user can then check that the newly introduced proposition indeed suffices to prove the next proposition in the development. A difficult proof can develop as a series of propositions, each initially “proved” using sorry but eventually using either Sledgehammer, an automatic tactic, or a nested proof development of the same form. Progress in such a proof can be measured in terms of the difficulty of the propositions that lack real proofs. Although we can never be certain that a proof development can be completed until the very end, the ability to write sorry in place of a proof reduces the risk of discovering that a lemma is useless only after spending weeks proving it.

In January 2010,
as part of its new M.Phil. programme, the University of Cambridge offered a lecture course on Isabelle [22]. The course materials included almost no information about the low-level tactics that had been the mainstay of Isabelle proofs for nearly 20 years. Only two of the 12 lectures were devoted to Isar structured proofs, and they took a novel approach: rather than proceeding methodically through the Isar fundamentals, the lectures presented the outer skeleton of a proof, with crucial sections replaced by sorry. They described the idea of trying to eliminate each sorry using either Sledgehammer or some automatic tactic. Practical work submitted by the students later demonstrated that several of them had learnt how to write complex, well-structured proofs. We were happy to reassure them that submitting work generated largely by Sledgehammer was by no means cheating!

⁴ The existence of sorry does not compromise Isabelle's soundness, because it is only permitted during interactive sessions. A theory file containing an occurrence of sorry may not be imported by another theory.

[4] Gertrud Bauer and Markus Wenzel. Calculational reasoning revisited (an Isabelle/Isar experience). In Richard J. Boulton and Paul B. Jackson, editors, Theorem Proving in Higher Order Logics: TPHOLs 2001, LNCS 2152, pages 75–90. Springer, 2001. Online at http://link.springer.de/link/service/series/0558/tocs/t2152.htm.

[5] Christoph Benzmüller, Lawrence C. Paulson, Frank Theiss, and Arnaud Fietzke. LEO-II – A cooperative automatic theorem prover for higher-order logic. In Alessandro Armando, Peter Baumgartner, and Gilles Dowek, editors, Automated Reasoning – 4th International Joint Conference, IJCAR 2008, LNAI 5195, pages 162–170. Springer, 2008.

[6] Christoph Benzmüller and Volker Sorge. ΩANTS – An open approach at combining interactive and automated theorem proving. In Manfred Kerber and Michael Kohlhase, editors, Symbolic Computation and Automated Reasoning, pages 81–97. A. K. Peters, 2000.

[7] Marc Bezem, Dimitri Hendriks, and Hans de Nivelle. Automatic proof construction in type theory using resolution. Journal of Automated Reasoning, 29(3–4):253–275, 2002.

[8] Sascha Böhme and Tobias Nipkow. Sledgehammer: Judgement day. In Jürgen Giesl and Reiner Hähnle, editors, Automated Reasoning (IJCAR 2010), LNCS 6173, pages 107–121. Springer, 2010.

[9] Alonzo Church. A formulation of the simple theory of types. Journal of Symbolic Logic, 5:56–58, 1940.

[10] M. J. C. Gordon and T. F. Melham. Introduction to HOL: A Theorem Proving Environment for Higher Order Logic. Cambridge University Press, 1993.

[11] Kryštof Hoder. SInE (Sumo Inference Engine). http://www.cs.man.ac.uk/~hoderk/sine/.

[12] Joe Hurd. Integrating Gandalf and HOL. In Yves Bertot, Gilles Dowek, André Hirschowitz, Christine Paulin, and Laurent Théry, editors, Theorem Proving in Higher Order Logics: TPHOLs '99, LNCS 1690, pages 311–321. Springer, 1999.

[13] Joe Hurd. First-order proof tactics in higher-order logic theorem provers. In Myla Archer, Ben Di Vito, and César Muñoz, editors, Design and Application of Strategies/Tactics in Higher Order Logics, number NASA/CP-2003-212448 in NASA Technical Reports, pages 56–68, September 2003.

[14] David McAllester. Ontic: A knowledge representation system for mathematics. In Ewing Lusk and Ross Overbeek, editors, 9th International Conference on Automated Deduction, LNCS 310, pages 742–743. Springer, 1988.

[15] William McCune. Prover9 and Mace4. http://www.cs.unm.edu/~mccune/prover9/.

[16] Andreas Meier. TRAMP: Transformation of machine-found proofs into natural deduction proofs at the assertion level (system description). In David McAllester, editor, Automated Deduction – CADE-17 International Conference, LNAI 1831, pages 460–464. Springer, 2000.

[17] Jia Meng and Lawrence C. Paulson. Translating higher-order clauses to first-order clauses. Journal of Automated Reasoning, 40(1):35–60, 2008.

[18] Jia Meng and Lawrence C. Paulson. Lightweight relevance filtering for machine-generated resolution problems. Journal of Applied Logic, 7(1):41–57, 2009.

[19] Jia Meng, Claire Quigley, and Lawrence C. Paulson. Automation for interactive proof: First prototype. Information and Computation, 204(10):1575–1596, 2006.

[20] Tobias Nipkow. A tutorial introduction to structured Isar proofs. http://isabelle.in.tum.de/dist/Isabelle/doc/isar-overview.pdf.

[21] Tobias Nipkow, Lawrence C. Paulson, and Markus Wenzel. Isabelle/HOL: A Proof Assistant for Higher-Order Logic. Springer, 2002. LNCS 2283.

[22] Lawrence C. Paulson. Interactive formal verification. http://www.cl.cam.ac.uk/teaching/0910/L21/. Lecture course materials.

[23] Lawrence C. Paulson. The foundation of a generic theorem prover. Journal of Automated Reasoning, 5(3):363–397, 1989.

[24] Lawrence C. Paulson. Set theory for verification: I. From foundations to functions. Journal of Automated Reasoning, 11(3):353–389, 1993.

[25] Lawrence C. Paulson. Set theory for verification: II. Induction and recursion. Journal of Automated Reasoning, 15(2):167–215, 1995.
