to generate correct programs because each component can partially correct for the mistakes of the others. For example, a database query will return many possible results, most of which will be incorrect, but by leveraging the type system the stitcher can eliminate many unlikely solutions. Even more importantly, the test cases allow Macho to partially detour around the difficult problem of natural language processing. Modern machine learning techniques provide probabilistic answers, whether the question is the meaning of a piece of natural language or the best sample function in the database to use. Backed by its automated debugger, Macho can afford to try multiple solutions.

In addition, combining examples and natural language greatly reduces their ambiguity: the set of programs that satisfies both the natural language and the test cases is much smaller than the sets that satisfy each input individually, although there are some exceptions: Macho found it surprisingly easy to synthesize cat from a unit test using the empty files it used for generating ls. However, we found that most of the time a program that passed even one reasonable test case would be correct. Together natural language and examples form a fairly concrete specification.

2 Architecture

Macho's workflow mirrors a human programmer. It maps the natural language to implied computation, maps those abstractions to concrete Java code, combines the code chunks into a candidate solution, and finally debugs the resulting program. The goal of each subsystem is therefore to minimize the amount of brute force and thereby synthesize the largest possible programs.

2.1 Natural Language Parser

Our natural language parsing subsystem attempts to extract implied chunks of computation and the data flow between them from the words and phrases it receives, and encode that knowledge for the database. Usually the structure of the sentence can be directly transformed to requested computation: verbs imply action, nouns imply objects, and two nouns linked by a preposition imply some sort of conversion code. This mapping is conceptually similar to previous work [1], but Macho's database “understands” a much larger number of concepts, including abbreviations. In order to handle these more varied sentences, we began with an off-the-shelf system provided by the University of Illinois Cognitive Computation group to tag individual words with their part of speech (noun, verb, adjective, etc.) and to split sentences apart into smaller phrases.

Our main problem was fixing the errors of the parser, which was trained on a standard corpus of newspaper articles, not jargon-filled man pages. For example, 'file' is usually a verb, as in “the SEC filed charges against Enron today”, and 'print' is often a noun, e.g., “Their foul prints will not soon be cleansed from the financial system.” These kinds of errors were quite common.

To help detect what words were intended to act as actions, we build a graph of prepositions linking the objects in a sentence together into a tree. A traversal of this tree reveals the relationship between the nouns at its leaves. When we find words that are not linked to the rest of the sentence by this graph, we can guess that they are misclassified verbs.

The parser also provides some hints as to likely control flow. For example, plural adjective or adverbial phrases often imply a filter operation that is implemented as an if statement. The description of grep contains 'lines matching a pattern', which implies only some lines will be used.

2.2 Database

As the subsystem that maps natural language abstractions to concrete Java code, the database is the engine that powers Macho. When the database can suggest reasonable code chunks, the stitching can usually find a correct solution, but when the database fails the space of candidate programs is simply too large to succeed by flailing randomly.

Our original plan was to use Google Code, but we almost immediately dismissed it as completely inadequate. Google Code indexes a huge number of files, but it appears to only perform keyword search on the raw text of the source files, which we found to be inadequate for our problem.

Instead, we developed our own database for Macho. Our first step was to obtain a dataset of about 200,000 Java files from open source projects and compile them using a special version of javac that we modified to emit abstract syntax trees. We compiled rather than parsed because we wanted exact global locations for each function call, and because we didn't want to reuse broken code. Since open source programmers are not exactly paragons of code maintenance, only about half of our source files compiled successfully.

Our database returns candidate methods based on input and output variables, e.g. the query directory → files would return all functions called with an input variable named directory and assigned to a variable named files. This nicely captured the different abstractions that different programmers used to represent code, which is important because functions have only one name.

The problem with this approach is that many things aren't usually implemented as functions. Higher level concepts

| Program | Result | Input | Notes |
|---|---|---|---|
| pwd | success | Print the current working directory. | Difficult as there is no input. |
| pwd | success | Print the user directory. | CWD = “user.dir” in Java. |
| pwd | success | Print the current directory. | Abbreviation! |
| pwd | fail | Print the working directory. | Breaks NLP for arcane reasons. |
| pwd | fail | Show the current working directory. | Database entries for show are mostly graphics. |
| cat | success | Print the lines of a file. | Vanilla. |
| cat | success | Read a file. | Print is synthesized. |
| cat | fail | Display the contents of a file. | Database entries for contents are mostly graphics. |
| cat | fail | Print a file | Solutions print the filename. |
| sort | success | Sort the lines of a file. | Print is synthesized. |
| sort | success | Sort a file by line. | |
| sort | fail | Sort a file. | Insufficiently precise specification. |
| sort | fail | Sort the contents of a file | Database entries for contents are mostly graphics. |
| grep | success | Print the lines in a file matching a pattern. | Solutions using both Java Lib and GNU regexes. |
| grep | fail | Find a pattern in the lines of a file. | Correct except for if statement linking test and print. |
| grep | fail | Search file for a pattern. | Poor resiliency for function names. |
| ls | success | Print the names of files in a directory. Sort the names. | |
| ls | success | Print the contents of a folder. Sort the names. | |
| ls | fail | Print the names of the entries in a directory. | Entries to names fails. |
| ls | fail | Print the files in a directory. | Does not synthesize sort. |
| cp | success | Copy src file to dest file. | Programmer abbreviation! |
| cp | success | Copy file to file. | Ugly but Macho needs to know there are two inputs. |
| cp | fail | Duplicate file to file. | No candidate in database. |
| wget | fail | Download file. | Candidates have extra functionality. |
| wget | fail | Open network connection. Download file. | Macho can't create buffer transfer loop. |
| head | fail | Print the first ten lines of a file. | 'First' is incomprehensible. |
| uniq | fail | Print a file. Ignore adjacent lines. | 'Ignore' and 'adjacent' don't map to libraries. |
| perl | fail | The answer to life, the universe, and everything. | Seems to work, but it's still running. |

Figure 2: Macho's results for generating select coreutils. This figure shows the results for pwd, cat, sort, grep, ls, cp, wget, head, and uniq, and the natural language input we used for each of these programs.

Giving out partial credit is also difficult. Some of Macho's solutions are very close but not byte identical, but automatically determining whether or not an output is sufficiently close to the test case is approximately as hard as generating the program, an artificial version of the Dunning-Kruger effect. Under these circumstances we decided to try to pick an interesting set of natural language inputs right on the border of Macho's capabilities and use our best judgement when the test cases were “close”.

Macho succeeded in generating simple versions of six out of nine coreutils (pwd, cat, sort, grep, cp, and ls) and failed to synthesize wget, head, and uniq. For each core utility, we targeted its default behavior: no options and the minimum number of arguments possible. Since we had the programs available anyway, we used them to generate our unit tests. All of the programs had only one short test and the results are shown in Figure 2.

4 Lessons Learned

4.1 The Database is King

Although most of the programs Macho writes are 10-15 lines or less, there are a lot of potential 10-line Java programs. Brute force really does not get very far: the ability of the database to select reasonable pieces from the natural language heuristics is absolutely critical. In general, when the stitching failed, it was often reasonable to think of a hack, or a simple fix, or just let it run a little longer, but when the database failed Macho had no hope of ever generating a correct solution. Improving Macho will require a superior database above everything else.

4.2 Pure NLP is Bad

Programming with natural language is generally considered a bad idea because specifying details gradually mutates the natural language into a wordy version of Visual Basic. Consider a natural language spec for ls:

    Take the path "/home/zerocool/"
    If the path is a file, print it.
    Otherwise get the list of files in the directory.
    Sort the result alphabetically.
    Go over the result from the beginning to the end:
    If the current element's filename does not begin with ".", print it.

which is our best guess for the input required for Pegasus [3]; it is obvious why most programmers would
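
To make the point about the detailed ls specification concrete, the sketch below follows that specification step by step. It is only an illustration, not output from Macho or Pegasus, and it is written in Python for brevity although Macho targets Java. The function name `ls_spec` is hypothetical, the function returns the lines instead of printing them so that it can be checked, and plain string ordering stands in for "alphabetically".

```python
import os
import tempfile


def ls_spec(path):
    """Follow the natural language ls specification one sentence at a time.

    Hypothetical helper for illustration; returns the lines that the
    specification says to print.
    """
    # "If the path is a file, print it."
    if os.path.isfile(path):
        return [path]
    # "Otherwise get the list of files in the directory."
    names = os.listdir(path)
    # "Sort the result alphabetically." (assumption: plain string ordering)
    names.sort()
    # "Go over the result from the beginning to the end: if the current
    # element's filename does not begin with '.', print it."
    return [name for name in names if not name.startswith(".")]


if __name__ == "__main__":
    # Small demonstration on a throwaway directory.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("b.txt", ".hidden", "a.txt"):
            open(os.path.join(demo_dir, name), "w").close()
        for line in ls_spec(demo_dir):
            print(line)
```

Each sentence of the specification corresponds to roughly one statement, which suggests that a specification at this level of detail saves little over writing the code directly.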
