Who's Afraid of George Kingsley Zipf?
yanglingupennedu
June 2010
Abstract

We explore the implications of Zipf's law for the understanding of linguistic productivity. Focusing on language acquisition, we show that the item/usage-based approach has not been supported by adequate statistical evidence. By contrast, the quantitative properties of a productive grammar can be precisely formulated, and are consistent with even very young children's language. Moreover, drawing from research in computational linguistics, the statistical properties of natural language strongly suggest that the theory of grammar be composed of general principles with an overarching range of applications rather than a collection of item- and construction-specific expressions.

[...] syntactic competence is comprised totally of verb-specific constructions with open nominal slots, rather than abstract and productive syntactic rules under which presumably a broader range of combinations is expected.

Limited morphological inflection. According to a study of child Italian (Pizutto & Caselli 1994), 47% of all verbs used by 3 young children (1;6 to 3;0) were used in 1 person-number agreement form, and an additional 40% were used with 2 or 3 forms, where six forms are possible (3 person × 2 number). Only 13% of all verbs appeared in 4 or more forms. Again, the low level of usage diversity is taken to show the limitedness of generalization characteristic of item-based learning.

Unbalanced determiner usage. Citing Pine & Lieven (1997) and other similar studies, it is found that when children began to use the determiners a and the with nouns, there was almost no overlap in the sets of nouns used with the two determiners, suggesting that the children at this age did not have any kind of abstract category of Determiners that included both of these lexical items. This finding is held to contradict the earliest study (Valian 1986), which maintains that child determiner use is productive and accurate like adults by the age of 2;0.

So far as we can tell, however, this evidence in support of item-based learning has been presented, and accepted, on the basis of intuitive inspections rather than formal empirical tests. For instance, among the numerous examples from child language, no statistical test was given in the major treatment (Tomasello 1992) where the Verb Island Hypothesis and related ideas about item-based learning are put forward. Specifically, no test has been given to show that the observations above are statistically inconsistent with the expectation of a fully productive grammar, the position that item-based learning opposes. Nor, for that matter, are these observations shown to be consistent with item-based learning, which, as we shall see, has not been clearly enough articulated to facilitate quantitative evaluation.

In this paper, we provide statistical analysis to fill these gaps. We demonstrate that children's language use actually shows the opposite of the item-based view; the productivity of children's grammar is in fact confirmed. More broadly, we aim to direct researchers to certain statistical properties of natural language and the challenges they pose for the theory of language and language learning. Our point of departure is a name that has tormented, and will continue to torment, every student of language: George Kingsley Zipf.

2 Zipfian Presence

2.1 Zipfian Words

Under the so-called Zipf's law (Zipf 1949), the empirical distributions of words follow a curious pattern: relatively few words are used very frequently while most words occur rarely, with many occurring only once in even large samples of texts. More precisely, the frequency of a word tends to be approximately inversely proportional to its rank in frequency. Let f be the frequency of the word with the rank of r in a set of N words, then:

$$f = \frac{C}{r} \quad \text{where } C \text{ is some constant} \tag{1}$$

In the Brown corpus (Kucera & Francis 1967), for instance, the word with rank 1 is the, which has the frequency of about 70,000, and the word with rank 2 is of, with the frequency of about 36,000: almost exactly as Zipf's law entails (i.e., 70000 × 1 ≈ 36000 × 2). The Zipfian characterization of word frequency [...]

2.2 Zipfian Combinatorics

The long tail of Zipf's law, which is occupied by low frequency words, becomes even more pronounced when we consider combinatorial linguistic units. Take, for instance, n-grams, the simplest linguistic combination that consists of n consecutive words in a text.[2] Since there are a lot more bigrams and trigrams than words, there are consequently a lot more low frequency bigrams and trigrams in a linguistic sample, as Figure 2 illustrates from the Brown corpus (for related studies, see Teahan 1997, Ha et al. 2002):

[Figure 2: cumulative % of types (y-axis) plotted against frequency (x-axis) for words, bigrams, and trigrams.]

Figure 2. The vast majority of n-grams are rare events. The x-axis denotes the frequency of the gram, and the y-axis denotes the cumulative % of the grams that appear at that frequency or lower.

For instance, about 43% of words occur only once, about 58% of words occur 1-2 times, 68% of words occur 1-3 times, etc. The % of units that occur multiple times decreases rapidly, especially for bigrams and trigrams: approximately 91% of distinct trigram types in the Brown corpus occur only once, and 96% occur once or twice.

The range of linguistic forms is so vast that no sample is large enough to capture all of its varieties even when we make a certain number of abstractions. Figure 3 plots the rank and frequency distributions of syntactic rules of modern English from the Penn Treebank (Marcus et al. 1993). Since the corpus has been manually annotated with syntactic structures, it is straightforward to extract rules and tally their frequencies.[3] The most frequent rule is PP → P NP, followed by S → NP VP: again, the Zipf-like pattern can be seen by the close approximation by a straight line on the log-log scale.

[2] For example, given the sentence the cat chases the mouse, the bigrams (n = 2) are the cat, cat chases, chases the, and the mouse, and the trigrams (n = 3) are the cat chases, cat chases the, chases the mouse. When n = 1, we are just dealing with words.

[3] Certain rules have been collapsed together as the Treebank frequently annotates rules involving distinct functional heads as separate rules.
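The n-gram definition in footnote 2 and the cumulative percentages plotted in Figure 2 can be illustrated with a short Python sketch. It is not part of the original study: the function names `ngrams` and `share_of_types_at_most` are hypothetical, and the only input is the five-word example sentence from the footnote rather than the Brown corpus.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, each as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def share_of_types_at_most(counts, k):
    """Fraction of distinct types whose frequency is k or lower
    (the quantity on the y-axis of Figure 2, as a proportion)."""
    if not counts:
        return 0.0
    return sum(1 for c in counts.values() if c <= k) / len(counts)

# The example sentence from footnote 2.
tokens = "the cat chases the mouse".split()

bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
word_counts = Counter(ngrams(tokens, 1))   # n = 1: plain words

print(bigrams)
print(trigrams)
print(share_of_types_at_most(word_counts, 1))  # share of words seen once
```

Applied to a tokenized corpus instead of a single sentence, the same two functions would give the kind of once-only and once-or-twice percentages quoted in the text.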
Suppose a linguistic sample contains S determiner-noun pairs, which consist of D and N unique determiners and nouns. (In the present case D = 2 for a and the.) The full productivity of the DP rule, by definition, means that the two categories combine independently. Two observations, one obvious and the other novel, can be made about the distributions of the two categories and their combinations.

First, nouns (and open class words in general) will follow Zipf's law. For instance, the singular nouns that appear in the form of DP → D N in the Brown corpus show a log-log slope of -0.97. In the CHILDES (MacWhinney 2000) speech transcripts of six children (see section 3.2 for details), the average value of the log-log slope is -0.98. This means that in a linguistic sample, relatively few nouns occur often but many will occur only once, and a noun that occurs only once of course cannot appear with more than one determiner.

Second, while the combination of D and N is syntactically interchangeable, N's tend to favor one of the two determiners, a consequence of pragmatics and indeed non-linguistic factors. For instance, we say the bathroom more often than a bathroom but a bath more often than the bath, even though all four DPs are perfectly grammatical. The reason for such asymmetries is not a matter of linguistic interest: the bathroom is more frequent than a bathroom only because bodily functions are a more constant theme of life than real estate matters.

We can place these combinatorial asymmetries in a quantitative context. As noted earlier, about 75% of distinct nouns in the Brown corpus occur exclusively with the or a but not both. Even the remaining 25%, which do occur with both, tend to have favorites: only a further 25% (i.e., 6.25% of all nouns) are used with a and the equally frequently, and the remaining 75% are unbalanced. Overall, for nouns that appear with both determiners at least once (i.e., 25% of all nouns), the frequency ratio of the more favored to the less favored determiner is 2.86:1. (Of course, some nouns favor the while others favor a, as the bathroom and bath examples above illustrate.) These general patterns hold for child and adult speech data as well. In the six children's transcripts (section 3.2), the average percentage of balanced nouns among those that appear with both the and a is 22.8%, and the more favored vs. less favored determiner has an average frequency ratio of 2.54:1. Even though these ratios deviate from the perfect 2:1 ratio under the strict version of Zipf's law (the more favored is even more dominant over the less), they clearly point out the considerable asymmetry in category combination usage. As a result, even when a noun appears several times in a sample, there is still a significant chance that it has been paired with a single determiner in all instances.

Together, Zipfian distributions of atomic linguistic units (words; Figure 1) and their combinations (n-grams, Figure 2; phrases, Figure 3) ensure that the determiner-noun overlap must be relatively low unless the sample size S is very large. In section 4, we examine the usage patterns of verbal syntax and morphology, and discover similar patterns. For the moment, we develop a precise mathematical treatment and contrast it with the item-based learning approach in the context of language acquisition.

3 Quantifying Productivity

3.1 Theoretical analysis

Consider a sample (N, D, S), which consists of N unique nouns, D unique determiners, and S determiner-noun pairs. Here D = 2 for the and a, though we consider the general case. The nouns that have appeared with more than one (i.e., two) determiners will have an overlap value of 1; otherwise, they have the overlap value of 0. The overlap value for the entire sample will be the number of 1's divided by N. Our analysis calculates the expected value of the overlap value for the sample (N, D, S) under the [...]

[Figure 4: expected overlap (y-axis, 0 to 1) plotted against rank (x-axis, 0 to 100).]
Figure 4. Expected overlap values for nouns ordered by rank, for N = 100 nouns in a sample size of S = 200 with D = 2 determiners. Word frequencies are assumed to follow the Zipfian distribution. As can be seen, few nouns have high probabilities of occurring with both determiners, but most are (far) below chance. The average overlap is 21.1%.

Under Zipfian distribution of categories and their productive combinations, low overlap values are a mathematical necessity. As we shall see, the theoretical formulation here nearly perfectly matches the distributional patterns in child language, to which we turn presently.

3.2 Determiners and productivity

Methods. To study the determiner system in child language, we consider the data from six children: Adam, Eve, Sarah, Naomi, Nina, and Peter. These are all and only the children in the CHILDES database (MacWhinney 2000) with substantial longitudinal data that starts at the very beginning of syntactic development (i.e., the one or two word stage), so that the item-based stage, if it exists, could be observed. For comparison, we also consider the overlap measure of the Brown corpus (Kucera & Francis 1967), for which productivity is not in doubt.

We first removed the extraneous annotations from the child text and then applied an open source implementation of a rule-based part-of-speech tagger (Brill 1995):[5] words are now associated with their part-of-speech (e.g., preposition, singular noun, past tense verb, etc.). For languages such as English, which has relatively salient cues for part-of-speech (e.g., rigid word order, low degree of morphological syncretism), such taggers can achieve high accuracy at over 97%. This already low error rate causes even less concern for our study, since the determiners a and the are not ambiguous and are always correctly tagged, which reliably contributes to the tagging of the words that follow them. The Brown Corpus is available with manually assigned part-of-speech tags so no computational tagging is necessary.

With tagged datasets, we extracted adjacent determiner-noun pairs for which D is either a or the, and N has been tagged as a singular noun. Words that are marked as unknown, largely unintelligible [...]

[5] Available at http://gposttl.sourceforge.net/.
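The extraction step and the overlap measure described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: it assumes the tagger output is available as (word, tag) tuples, that singular nouns carry the Penn Treebank tag `NN`, and the function names and the toy tagged sentence are hypothetical.

```python
from collections import defaultdict

DETERMINERS = {"a", "the"}

def determiner_noun_pairs(tagged, noun_tag="NN"):
    """Collect adjacent (determiner, noun) pairs from (word, tag) tuples.

    Only 'a'/'the' immediately followed by a word tagged as a singular
    noun are kept; the noun_tag value is an assumption about the tagset.
    """
    pairs = []
    for (w1, _), (w2, t2) in zip(tagged, tagged[1:]):
        if w1.lower() in DETERMINERS and t2 == noun_tag:
            pairs.append((w1.lower(), w2.lower()))
    return pairs

def overlap(pairs):
    """Share of noun types that occur with more than one determiner."""
    dets_by_noun = defaultdict(set)
    for det, noun in pairs:
        dets_by_noun[noun].add(det)
    if not dets_by_noun:
        return 0.0
    both = sum(1 for dets in dets_by_noun.values() if len(dets) > 1)
    return both / len(dets_by_noun)

# Toy hand-tagged input (hypothetical; not taken from CHILDES).
tagged = [("the", "DT"), ("dog", "NN"), ("saw", "VBD"), ("a", "DT"),
          ("dog", "NN"), ("and", "CC"), ("the", "DT"), ("ball", "NN"),
          ("in", "IN"), ("a", "DT"), ("big", "JJ"), ("box", "NN")]

pairs = determiner_noun_pairs(tagged)
print(pairs)
print(overlap(pairs))
```

Because only strictly adjacent pairs are kept, a sequence such as a big box contributes nothing, which mirrors the restriction to adjacent determiner-noun pairs stated in the methods.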
Figure 5. The solid line represents the linear regression fit of the expected vs. empirical values of overlap in Table 1, columns 5 and 6 (r = 1.08, adjusted R² = 0.9716). The dotted line indicates a perfect fit (i.e., the identity function y = x).

Therefore, we conclude that the determiner usage data from child language is consistent with the productive rule DP → D N.

The empirical studies also reveal considerable individual variation in the overlap values, and it is instructive to understand why. As the Brown corpus result shows (Table 1, last row), sample size S, the number of nouns N, or the language user's age alone is not predictive of the overlap value. The variation can be roughly analyzed as follows; see Valian et al. (2009) for a related proposal. Given N unique nouns in a sample of S, a greater overlap value can be obtained if more nouns occur more than once. That is, words whose probabilities are greater than 1/S can increase the overlap value. Zipf's law (2) allows us to express this cutoff line in terms of ranks, as the noun n_r with rank r has the probability of 1/(r H_N). The derivation below uses the fact that the Nth Harmonic Number $H_N = \sum_{i=1}^{N} 1/i$ can be approximated by ln N.

$$S \cdot \frac{1}{r H_N} = 1 \quad\Longrightarrow\quad r = \frac{S}{H_N} \approx \frac{S}{\ln N} \tag{5}$$

That is, only nouns whose ranks are lower than S/(ln N) can be expected to have non-zero overlap. The total overlap is thus a monotonically increasing function of S/(N ln N) which, given the slow growth of ln N, is approximately S/N, a term that must be positively correlated with overlap measures. This result is confirmed in the strongest terms: S/N is a near perfect predictor of the empirical values of overlap (last two columns of Table 1): r = 0.986, p < 0.00001.
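The cutoff rank in (5) and the kind of per-rank expectation plotted in Figure 4 can be explored numerically. The sketch below is an illustration under stated assumptions, not the derivation used in the analysis, which is not reproduced in this excerpt. It assumes that each of the S pairs is drawn independently, that noun r is drawn with probability 1/(r H_N), and that the two determiners are chosen independently of the noun with probabilities 2/3 and 1/3 (a strict Zipfian split for D = 2). The function names are hypothetical.

```python
import math

def harmonic(n):
    """The nth harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / i for i in range(1, n + 1))

def expected_overlap_by_rank(n, s, det_probs=(2 / 3, 1 / 3)):
    """Probability that the noun of each rank is seen with BOTH determiners
    in s independent draws (inclusion-exclusion over the two events
    'never seen with determiner 1' and 'never seen with determiner 2')."""
    d1, d2 = det_probs
    h_n = harmonic(n)
    result = []
    for r in range(1, n + 1):
        p = 1.0 / (r * h_n)           # Zipfian probability of noun r
        no_d1 = (1 - p * d1) ** s     # never paired with determiner 1
        no_d2 = (1 - p * d2) ** s     # never paired with determiner 2
        never = (1 - p) ** s          # noun never drawn at all
        result.append(1 - no_d1 - no_d2 + never)
    return result

def cutoff_rank(n, s):
    """Rank r at which the expected count S / (r * H_N) equals 1."""
    return s / harmonic(n)

per_rank = expected_overlap_by_rank(100, 200)
average_overlap = sum(per_rank) / len(per_rank)

print(average_overlap)                 # compare with the Figure 4 average
print(cutoff_rank(100, 200))           # exact cutoff S / H_N
print(200 / math.log(100))             # approximation S / ln N
```

With N = 100 and S = 200, the settings of Figure 4, the printed average can be compared with the 21.1% reported there; any difference would reflect the assumptions listed above rather than the original formulation.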
[...] frequencies from that child's input (local memory learner) and the determiner-noun pairs along with their frequencies in the entire 1.1 million utterances of adult speech (global memory learner).

For each child with a sample size of S (see Table 1, column 2), and for each variant of the memory model, we use Monte Carlo simulation to randomly draw S pairs from the two sets of data that correspond to the local and global memory learning models. The probability with which a pair is drawn is proportional to its frequency in the two sets of data. Thus, a more frequently used pair in the input will have a higher chance of being drawn, which reflects the frequency effects in learning so often emphasized in the item/usage-based approach (e.g., Tomasello 2001, 2003, Matthews et al. 2005, Bybee & Hopper 2001, among others). Each sample, then, consists of a list of determiner-noun pairs with varying occurrence counts. We calculate the value of overlap from this list, that is, the percentage of nouns that appear with both a and the over the total number of nouns. The results are averaged over 1000 draws. These results are given in Table 2.

Child       Sample size (S)   Overlap % (global memory)   Overlap % (local memory)   Overlap % (empirical)
Eve               831                  16.0                        17.8                      21.6
Naomi             884                  16.6                        18.9                      19.8
Sarah            2453                  24.5                        27.0                      29.2
Peter            2873                  25.6                        28.8                      40.4
Adam             3729                  27.5                        28.5                      32.3
Nina             4542                  28.6                        41.1                      46.7
First 100         600                  13.7                        17.2                      21.8
First 300        1800                  22.1                        25.6                      29.1
First 500        3000                  25.9                        30.2                      34.2
Table 2. The comparison of determiner-noun overlap between two variants of item-based learning and empirical results.

Both sets of overlap values from the two variants of item-based learning (columns 3 and 4) differ significantly from the empirical measures (column 5): p < 0.005 for both the paired t-test and the paired Wilcoxon test. This suggests that children's use of determiners does not follow the predictions of the item-based learning approach; it certainly does not seem to be the result of the child retrieving jointly stored determiner-noun pairs from the input in a frequency sensitive fashion. Naturally, our evaluation here is tentative since the proper test can be carried out only when the theoretical predictions of item-based learning are made clear. And that is exactly the point: the advocates of item-based learning not only rejected the alternative hypothesis without adequate statistical tests, but also accepted the favored hypothesis without adequate statistical tests.

4 An Itemized Look at Verbs

The formal analysis in section 3 can be generalized to child verb syntax and morphology, which are among the main supporting cases for item-based learning. Unfortunately, the acquisition data in support of the Verb Island Hypothesis (Tomasello 1992) and the item-based nature of early morphology (Pizutto & Caselli 1994) cited in section 1 have not been made available in the public domain. But the Zipfian reality is inherent: the combinatorics of verbs and their morphological and syntactic associates are as lopsided in usage distribution as those of the determiners. We now turn to examine the statistical distributions of verbal morphology and syntax.
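The resampling procedure behind the memory-learner columns of Table 2 (draw S stored pairs with probability proportional to input frequency, compute overlap, average over many draws) can be sketched in Python. This is a minimal sketch of that idea, not the simulation code used for the table: the function names are hypothetical, and the input counts are invented toy values rather than CHILDES frequencies.

```python
import random
from collections import defaultdict

def overlap(pairs):
    """Share of noun types that occur with more than one determiner."""
    dets_by_noun = defaultdict(set)
    for det, noun in pairs:
        dets_by_noun[noun].add(det)
    if not dets_by_noun:
        return 0.0
    both = sum(1 for dets in dets_by_noun.values() if len(dets) > 1)
    return both / len(dets_by_noun)

def simulate_memory_learner(pair_freqs, s, draws=1000, seed=0):
    """Average overlap of a learner that only replays stored pairs.

    pair_freqs maps (determiner, noun) to its input frequency; each of the
    `draws` samples contains s pairs drawn with replacement, with
    probability proportional to frequency.
    """
    rng = random.Random(seed)          # fixed seed for repeatable runs
    pairs = list(pair_freqs)
    weights = [pair_freqs[p] for p in pairs]
    total = 0.0
    for _ in range(draws):
        sample = rng.choices(pairs, weights=weights, k=s)
        total += overlap(sample)
    return total / draws

# Invented input counts for illustration only.
input_counts = {
    ("the", "dog"): 50, ("a", "dog"): 10,
    ("the", "ball"): 30, ("a", "cookie"): 5, ("the", "moon"): 1,
}

estimate = simulate_memory_learner(input_counts, s=50, draws=1000, seed=0)
print(estimate)
```

Passing a single child's input counts would correspond to the local memory learner, and counts pooled over all adult speech to the global one; in both cases a noun can show overlap only if both of its pairings were already stored.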
4.2 All verbs are islands

We now study the distributional properties of verbal syntax that have been attributed to the Verb Island Hypothesis. We focus on constructions that involve a transitive verb and its nominal objects, including pronouns and noun phrases. Following the definition of sentence frame in Tomasello's original Verb Island study (1992, p242), each unique lexical item in the object position counts as a unique construction for the verb.

Figure 6 shows the construction frequencies of the top 15 transitive verbs in 1.1 million child-directed utterances. Processing methods are as described in section 3.2 except that here we extract adjacent verb-nominal pairs in part-of-speech tagged texts. For each verb, we count its top 10 most frequent constructions, which are defined as the verb followed by a unique lexical item in the object position (e.g., ask him and ask John are different constructions). For each of the 10 ranks, we tallied the construction frequencies for all 15 verbs.[8]

[Figure 6: log(freq) (y-axis) plotted against log(rank) (x-axis).]

Figure 6. Rank and frequency of verb-object constructions based on 1.1 million child-directed utterances.

The verb construction frequency thus also follows a Zipf-like pattern: even for large corpora, a verb appears in few constructions frequently and in most constructions infrequently if at all. The observation of Verb Islands, that verbs tend to combine with one or few elements out of a large range, is in fact characteristic of a fully productive verbal syntax system.

As far as we know, the quantitative predictions of the Verb Island Hypothesis have not been spelled out, but we may estimate the necessary amount of language sample that would mask these island effects. The appeal to unevenness of verbal construction frequencies seems to reflect the expectation that under full productivity, most verbs ought to appear with most of the possible range of arguments. Substituting verbs and nominals for nouns and determiners, the formal analysis could be carried out for the verbal syntactic system. Instead of calculating the expected number of determiners that a noun appears with, one would calculate the expected number of objects a verb appears with.
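The Zipf-like pattern of the construction tallies can be checked with an ordinary least-squares fit on the log-log scale. The sketch below uses the ten rank totals reported in footnote 8; the function name `loglog_slope` is hypothetical, and the fit is a plain regression of log frequency on log rank, which may differ from how the line in Figure 6 was obtained.

```python
import math

def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank),
    where rank 1 is the first (largest) entry of freqs."""
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Frequency tallies of the top 10 constructions (footnote 8).
tallies = [1904, 838, 501, 301, 252, 189, 137, 109, 88, 75]

slope = loglog_slope(tallies)
print(slope)
```

A slope near -1 corresponds to the strict form of Zipf's law in (1); a steeper negative slope means the top-ranked constructions are even more dominant.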
[8] These verbs are: put, tell, see, want, let, give, take, show, got, ask, make, eat, like, bring and hear. The frequency tallies of the top 10 most frequent constructions are 1904, 838, 501, 301, 252, 189, 137, 109, 88, and 75.

[...] (Jelinek 1998); the n-gram and rule distributions discussed in section 2.2 make these points very clearly.

For the linguist, the Zipfian nature of language raises important questions for the development of linguistic theories. First, Zipf's law hints at the inherent limitations in approaches that stress the storage of construction-specific rules or processes (e.g., Goldberg 2003, Culicover & Jackendoff 2005). For instance, the central tenets of Construction Grammar view constructions as stored pairings of form and function, including morphemes, words, idioms, partially lexically filled and fully general linguistic patterns, and hold that the totality of our knowledge of language is captured by a network of constructions (Goldberg 2003, p219). Yet the Zipfian distribution of linguistic combinations, as illustrated in Figure 3 for the Wall Street Journal and Figure 6 for child-directed speech, ensures that most pairings of form and function simply will never be heard, never mind stored, and those that do appear may do so with such low frequency that no reliable storage and use is possible.

Second, and more generally, Zipf's law challenges the conventional wisdom in current syntactic theorizing that makes use of a highly detailed lexical component; there have been suggestions that all matters of language variation are in the lexicon, which in any case needs to be acquired for individual languages. Yet the effectiveness of lexicalization in grammar has not been fully investigated in large scale studies. However, useful inferences can be drawn from the research on statistical induction of grammar in computational linguistics (Charniak 1993, Collins 2003). These tasks typically take a large set of grammatical rules (e.g., a probabilistic context free grammar) and find appropriate parameter values (e.g., expansion probabilities in a probabilistic context free grammar) on the basis of annotated training data such as the Treebank, where sentences have been manually parsed into phrase structures. The performance of the trained grammar is evaluated by measuring parsing accuracy on a new set of unanalyzed sentences, thereby obtaining some measure of the generalization power of the grammar.

Obviously, inducing a grammar on a computer is hardly the same thing as constructing a theory of grammar by the linguist. Nevertheless, statistical grammar induction can be viewed as a tool that explores what type of grammatical information is in principle available in and attainable from the data, which in turn can guide the linguist in making theoretical decisions. Contemporary work on statistical grammar induction makes use of a wide range of potentially useful linguistic information in the grammar formalism. For instance, a phrase drink water may be represented in multiple forms:

(a) VP → V NP
(b) VP → V_drink NP
(c) VP → V_drink NP_water

(a) is the most general type of context free grammar rule, whereas both (b) and (c) include additional lexical information: (b) provides a lexically specific expansion rule concerning the head verb drink, and the bilexical rule in (c) encodes the item-specific pairing of drink and water, which corresponds to the notion of sentence frame in Tomasello's Verb Island hypothesis (1992; see section 4.2).

By including or excluding the rules of the types above in the grammatical formalism, and evaluating the parsing accuracy of the grammar thus trained, we can obtain some quantitative measure of how much each type of rule, from general to specific, contributes to the grammar's ability to generalize to novel data. Bikel (2004) provides the most comprehensive study of this nature. Bilexical rules (c), similar to the notion of sentence frames and constructions, turn out to provide virtually no gain over simpler models that only use rules of the types (a) and (b). Furthermore, lexicalized rules (b) offer only modest improvement over general categorical rules (a) alone, in which almost all of the grammar's generalization power lies. These findings are not surprising given the Zipfian nature of linguistic productivity: [...]
Chang, F., Lieven, E., & Tomasello, M. (2006). Using child utterances to evaluate syntax acquisition algorithms. In Proceedings of the 28th Annual Conference of the Cognitive Science Society. Vancouver, Canada.
Chomsky, N. (1958). Review of Langage des machines et langage humain by Vitold Belevitch. Language, 34(1), 99-105.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. (1975). Reflections on language. New York: Pantheon.
Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris.
Crain, S. (1991). Language acquisition in the absence of experience. Behavioral and Brain Sciences, 14, 597-650.
Culicover, P. & Jackendoff, R. (2005). Simpler syntax. New York: Oxford University Press.
Freudenthal, D., Pine, J. M., Aguado-Orea, J. & Gobet, F. (2007). Modelling the developmental patterning of finiteness marking in English, Dutch, German and Spanish using MOSAIC. Cognitive Science, 31, 311-341.
Freudenthal, D., Pine, J. M. & Gobet, F. (2009). Simulating the referential properties of Dutch, German and English root infinitives. Language Learning and Development, 5, 1-29.
Gabaix, X. (1999). Zipf's Law for Cities: An Explanation. The Quarterly Journal of Economics, 114, 739-767.
Goldberg, E. (2003). Constructions. Trends in Cognitive Science, 7, 219-224.
Ha, Le Quan, Sicilia-Garcia, E. I., Ming, Ji. & Smith, F. J. (2002). Extension of Zipf's law to words and phrases. Proceedings of the 19th International Conference on Computational Linguistics. 315-320.
Hay, J. & Baayen, H. (2005). Shifting paradigms: gradient structure in morphology. Trends in Cognitive Sciences, 9, 342-348.
Jelinek, F. (1998). Statistical methods for speech recognition. Cambridge, MA: MIT Press.
Kucera, H. & Francis, N. (1967). Computational analysis of present-day English. Providence, RI: Brown University Press.
Legate, J. A. & Yang, C. (2002). Empirical reassessments of poverty stimulus arguments. Linguistic Review, 19, 151-162.
Li, W. (1992). Random texts exhibit Zipf's law-like word frequency distribution. IEEE Transactions on Information Theory, 38(6), 1842-1845.
MacWhinney, B. (2000). The CHILDES Project. Lawrence Erlbaum.
Mandelbrot, B. (1954). Structure formelle des textes et communication: Deux études. Word, 10, 1-27.
Matthews, D., Lieven, E., Theakston, A. & Tomasello, M. (2005). The role of frequency in the acquisition of English word order. Cognitive Development, 20, 121-136.
McNeill, D. (1963). The creation of language by children. In Lyons, J. & Wales, Roger. (Eds.) Psycholinguistic Papers. Edinburgh: Edinburgh University Press. 99-132.
Miller, G. A. (1957). Some effects of intermittent silence. The American Journal of Psychology, 70(2), 311-314.
[...]
Zipf, G. K. (1949). Human behavior and the principle of least effort: An introduction to human ecology. Addison-Wesley.