for word sense disambiguation, either alone, or in conjunction with trad

phoebe-click
Uploaded On 2016-11-22

Presentation Transcript

[...] for word sense disambiguation, either alone or in conjunction with traditional text-based methods. The approach is based on recent work on a method for predicting words for images, which can be learned from image data sets with associated text. When word prediction is constrained to a narrow set of choices, such as possible senses, it can be quite reliable, and we use these predictions either by themselves [...]

[...] and Schütze, 1999; Mihalcea and Moldovan, 1998; Traupman and Wilensky, [...] much work, and a number of innovative ideas, [...] doing significantly [...] is illustrated in Figure 2. It is [...]

The word probabilities are provided by simple frequency tables, and the region probability distributions are Gaussians over [...]

$P(w \mid b) = \sum_{l} P(w \mid l)\, P(b \mid l)\, P(l) \,/\, P(b)$    (1)

where P(l) is the level prior, P(w|l) is [...]

However, if [...] is automatic [...] words. The presence of the words has removed ambiguity from the interpretation of the image.

The second application is, of course, the reverse: using the image to help reduce the ambiguity of the words. We assume that the system has been trained on a set of senses, S, for the vocabulary W. To clarify the notation, we may have [...] associated with the word tiger, as computed by two different processes. The region in the images with the highest posterior [...]

$P(s \mid w, I) \propto P(s \mid I)\, P(s \mid w)$    (2)

where $P(s \mid w) \propto 1$ if s is a sense of w and [...]

[...] Moldovan, 1998). Specifically we use [...] of a group of lions on a rock, the word pride presents a problem to the disambiguator. By far the most common meaning of pride is the deadly sin; in this case, however, we want it disambiguated as a group of lions. Even so, one can look at the different WordNet sense hierarchies for pride and find that one, namely:

pride => animal group => biological group => group, grouping

contains the words animal and biological, making it a better fit for the hierarchy of lion.

With this structure in mind, our algorithm takes the set of keywords and, for each keyword in the set, performs a query such as the one shown above.
Then, for each sense of the keyword, we perform the queries for the other keywords, and for each of their senses we examine the similarities between their hypernym trees. We total up these similarities (shared nodes in the tree) and, for each sense of the keyword, produce a subtotal for that sense. After we have performed this operation for all senses, we divide each subtotal by the complete total over all senses to obtain a score for that sense as the true definition of the keyword.

5 Experiments

For our [...] like "head" [...] of the word a score of one, and typically the word sense vector would simply contain a single value of one, with the other values being zero.

We evaluate word-sense [...] that bank_1 and bank_3 are [...] in [...] runs using different samples for the test and training sets. We restricted the computation of results to those documents where there was clear sense ambiguity. Each such document typically had only one sense problem amid 3 or 4 words without sense problems, so the baseline score using the measure above is greater than 0.80: any strategy will get about 3 out of 4 correct for free. To clarify this further, we include the results of choosing randomly among senses when more than one is available.

The results, shown in Table 1, strongly suggest that images can help disambiguate senses. The naïve method of text-based disambiguation is comparable to chance, whereas adding image information substantially increased performance.

6 Conclusion

These preliminary studies strongly suggest that it is worthwhile to explore combining image information with more sophisticated [...] that we take the next step and apply the method to a data set where there is more sense ambiguity. Possible candidates which we are actively investigating include the museum data used in (Barnard et al., 2001) and news photos with captions available on the web.

In general we have found that it is fruitful to study how image and text information can both complement each other and disambiguate one another.
Different representations of the same thing can help learn co-constructed meaning. Properties which may be implicit in one representation may be more explicit and thus more [...] In particular, in the case of disambiguating words, we have shown that images can provide a non-negligible amount of information which can be exploited by more traditional approaches.

References

Agirre, E. and Rigau, G., 1995. A proposal for word sense disambiguation using conceptual distance, 1st International Conference on Recent Advances in Natural Language Processing, Velingrad.

Barnard, K., Duygulu, P. and Forsyth, D., 2001. Clustering Art, IEEE Conference on Computer Vision and Pattern Recognition, Hawaii, pp. [...]

[...] Processing. MIT Press, Cambridge, MA.

Mihalcea, R. and Moldovan, D., [...]

[...] G.A., Beckwith, R., Fellbaum, C., Gross, D. and Miller, K.J., 1990. Introduction to WordNet: an on-line lexical database. [...]

[...] Sense Disambiguation. CSD-03-1227, Computer Science Division, University of California Berkeley.

Yarowsky, D., 1995. [...] Language Processing. ACL, Cambridge.

Word sense disambiguation strategy [...] Results were computed only on images with at least one ambiguous term. Because typically only one out of four [...]
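
As a rough illustration of the hypernym-overlap scoring described before the Experiments section, and of the combination in Equation (2), the following Python sketch may help. Everything in it is an assumption made for illustration: the toy hypernym chains, the sense labels, the function names, the uniform fallback when no nodes are shared, and the choice to plug the text-based scores in for P(s|w). It does not query WordNet and should not be read as the authors' implementation.

```python
# Hypothetical toy data: keyword -> sense label -> hypernym chain.
# These chains are invented for the example and are NOT real WordNet output.
TOY_HYPERNYMS = {
    "bank": {
        "bank_river": ["slope", "geological formation", "natural object", "entity"],
        "bank_finance": ["financial institution", "institution", "organization", "entity"],
    },
    "river": {
        "river_stream": ["stream", "body of water", "natural object", "entity"],
    },
}


def text_sense_scores(keyword, hypernyms):
    """Score each sense of `keyword` by the hypernym nodes it shares with
    every sense of every other keyword, then normalize over its senses."""
    subtotals = {}
    for sense, chain in hypernyms[keyword].items():
        nodes = set(chain)
        subtotals[sense] = sum(
            len(nodes & set(other_chain))
            for other, senses in hypernyms.items()
            if other != keyword
            for other_chain in senses.values()
        )
    total = sum(subtotals.values())
    if total == 0:
        # Assumption: with no shared nodes at all, fall back to a uniform score.
        return {sense: 1.0 / len(subtotals) for sense in subtotals}
    return {sense: count / total for sense, count in subtotals.items()}


def combine_with_image(text_scores, image_posterior):
    """Multiply a per-sense image posterior P(s|I) by a per-sense text score
    standing in for P(s|w), then renormalize, in the spirit of Equation (2)."""
    unnormalized = {
        sense: image_posterior.get(sense, 0.0) * score
        for sense, score in text_scores.items()
    }
    z = sum(unnormalized.values())
    if z == 0:
        # Assumption: if the image rules out every sense, keep the text scores.
        return dict(text_scores)
    return {sense: value / z for sense, value in unnormalized.items()}


if __name__ == "__main__":
    scores = text_sense_scores("bank", TOY_HYPERNYMS)
    # Hypothetical image posterior favouring the river sense.
    combined = combine_with_image(scores, {"bank_river": 0.9, "bank_finance": 0.1})
    print(scores)
    print(combined)
```

In this toy setting the river sense of bank shares more hypernym nodes with river than the financial sense does, so it receives the larger text score, and an image posterior that agrees with it sharpens the result further. With real data the chains would come from WordNet queries and the image posterior from the trained word-prediction model.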