And how Bidirectional Optimality Theory allows for Verbosity and Precision

Manfred Krifka

The Sad Story of the Metric System in America

Given the beginnings of the United States of America, its sympathy with the French Revolution, and its rationalist attitude towards the institutions of society, one would have expected it to be one of the first nations to adopt the new metric system that was introduced in France in 1800. But the history of the attempts to do so is decidedly mixed. Congress authorized the use of the metric system in 1866. In 1959, American measurements were defined in relation to the metric system. In 1968, the government ordered a study, which was published three years later under the title “A Metric America: A Decision Whose Time Has Come”. The year 1975 then saw the Metric Conversion Act, which led to the establishment of the US Metric Board. Amended in 1988, the Act resulted in the Metric Program, an organization founded to support the various federal agencies, which have been required since 1991 to file an annual report on their efforts to change to the metric system.

In spite of all these attempts, the United States of America is still the one major industrial nation that does not use the metric system as the predominant one. To this day, American schoolchildren have to count with miles that contain 1760 yards, yards that contain 3 feet, and feet that contain 12 inches. They have to memorize that an acre is 4840 square yards and that a gallon contains 231 cubic inches. The costs of this are undoubtedly huge; they include, for example, the Mars mission of 1998 that failed because measurements were converted wrongly.

Still, the general attitude of the American public towards the metric system is largely negative. There are websites with telling addresses like www.metricsucks.com, and it is not uncommon for ordinary people to suspect a secret communist, Catholic or Jewish plot behind the attempts to go metric.
Why did the metric system not catch on? There are many reasons. But one that cannot be taken lightly is that certain well-intended public relations attempts to familiarize the American people with the metric system just did not work. Since the Metric Conversion Act, road distances in National Parks are often given in miles and kilometers. And since then, travelers encounter signs like the following one:

(1) 11.265 km

It is not hard to see why road signs like (1) suggest that the metric system is something for intellectuals, or “rocket scientists”, far too unwieldy for everyday purposes.

Problems with precision

The problem with (1) leads us to the question: How much precision is enough? When can we stop being precise, relax and be a little vague? I won’t have much to say about this in general, but I will have to say something about the relation between precision level and linguistic form.

Assume that the distance between Amsterdam and Vienna, measured as usual from city border to city border along the shortest connecting road path, is 965 kilometers. Now consider the following examples:

(2) A: The distance between Amsterdam and Vienna is one thousand kilometers.
    B: #No, you’re wrong, it’s nine hundred sixty-five kilometers.

B’s reaction strikes us as inadequate; what he says is true but pedantic. But not so in the following exchange, where A’s utterance is actually closer to the truth.

(3) A: The distance between Amsterdam and Vienna is nine hundred seventy-two kilometers.
    B: No, you’re wrong, it’s nine hundred sixty-five kilometers.

The road sign example (1) showed that the phenomena to be talked about here constitute a problem for translation. And indeed, if we translate (2) into the American measurement system, the oddness of the exchange vanishes (one has to know that 600 miles are roughly 965 kilometers, and 621 miles are roughly 1000 kilometers).

(4) A: The distance between Amsterdam and Vienna is six hundred and twenty-one miles.
    B: No, it’s six hundred miles.
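The translation problem can be made concrete with a small sketch. It assumes the international mile of exactly 1.609344 km; the function names are illustrative only, and the reading of sign (1) as a literal conversion of 7 miles is an inference from the number itself, not something stated on the sign.

```python
# Illustrative sketch: a "precise" unit translation keeps the number
# accurate but destroys the roundness of the original expression.
KM_PER_MILE = 1.609344  # international mile, exact by definition


def miles_to_km(miles: float) -> float:
    """Convert miles to kilometers without any rounding."""
    return miles * KM_PER_MILE


def km_to_miles(km: float) -> float:
    """Convert kilometers to miles without any rounding."""
    return km / KM_PER_MILE


# Assumption: sign (1) is a literal conversion of a round "7 miles".
print(round(miles_to_km(7), 3))    # 11.265

# The round "one thousand kilometers" of (2) becomes a non-round mile value.
print(round(km_to_miles(1000)))    # 621

# The round "six hundred miles" of (4) is about 965.6 km.
print(round(miles_to_km(600), 1))  # 965.6
```

A round number in one system almost never maps onto a round number in the other, which is why the pedantic flavor of (2) disappears in (4).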
It is quite obvious that the oddness of (2) cannot simply be stated in terms of truth conditions. Otherwise, the following exchange should be odd as well, but it isn’t (if we disregard the fact that it is odd to render the distance between two cities with a precision of 50 meters in the first place):

(5) A: The distance between Amsterdam and Vienna is one thousand point zero kilometers.
    B: No, you’re wrong, it’s nine hundred sixty-five kilometers.

Examples like (2) can be multiplied at will. Consider the following:

(6) A: The value of π is 3.14159.
    B: #No, it’s 3.1415926535.

(7) A: […]
    B: No, it’s 3.1415926535.

Clearly, the reaction of B in (6) is pedantic, but it is not pedantic in (7).

The first generalization that we can draw from examples (2) to (7) is that one should not increase the level of precision that was set by the first speaker. To say that the distance between Amsterdam and Vienna is one thousand kilometers sets the level of precision to 50 km. The speaker indicates with the choice of words that the only distance values that should be mentioned are 800 km, 900 km, 1000 km, 1100 km, and so on. Changing this level in the reaction is pedantic. Similarly, to say that the value of π is 3.14159 sets the level of precision to 5 decimal places; again, changing this level is pedantic. If the first speaker starts out with a higher level of precision, as in (3), (5) or (7), the reactions of the second speaker at the same level are not considered pedantic.

The translation problems we encountered with (1) and (4) show that the level of precision does not translate under a “precise” translation of terms. A […] into the metric system would have been […], and there is no conceivable translation of (2) into mile measurements that would preserve the oddity of the exchange.

The precision level of an expression can be marked explicitly by modifiers like exactly or roughly.
We virtually have to employ exactly if we want to have a round number understood in a precise way:

(8) The distance between Amsterdam and Vienna is exactly one thousand kilometers.

We can also use these modifiers to support an interpretation that an expression would have had anyway:

(9) a. The distance between Amsterdam and Vienna is exactly nine hundred sixty-five kilometers.
    b. The distance between Amsterdam and Vienna is roughly one thousand kilometers.

But it is ludicrously pedantic to utter (10), because it suggests that one could be even more precise:

(10) The distance between Amsterdam and Vienna is roughly nine hundred sixty-five kilometers.

The notion of precision level is applicable only to the values of measurement terms, and not in cases in which numbers are used in an ordinal way. There is nothing pedantic in B’s reaction in the following example:

(11) A: Her phone number is sixty-five one thousand.
     B: No, her phone number is sixty-five one-thousand and one.

I will refer to the phenomenon we are after by the term precision level choice. The principle of proper precision level choice can be formulated as follows:

(12) Precision level choice: When expressing a measurement of an entity, choose a level of precision that is adequate for the purpose at hand.

In (1), the adequate level of precision for motorists is on the mile or kilometer level; it is simply irrelevant for their concerns, like driving times or the amount of gas in one’s tank, to give distances to a precision down to meters. In (2), speaker A indicates a particular precision level as adequate for the purpose at hand (roughly, 50 km), and B’s reaction is appropriate only at another precision level (roughly, 0.5 km). Notice that B’s reaction is fine as a rejection of the precision level selected by A. For example, if B is the manager of a Dutch transportation company that wants to pay as little as possible to its truck drivers, the exchange sounds quite idiomatic.
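One way to picture a precision level is as a grid of admissible values with a fixed step. The sketch below rests on that assumption (a step of 100 km for “one thousand kilometers”, 1 km for “nine hundred sixty-five kilometers”); the function names and the half-step tolerance are illustrative choices, not part of the account itself.

```python
import math

TRUE_DISTANCE_KM = 965  # the distance assumed throughout the examples


def nearest_on_grid(value: float, step: float) -> float:
    """Closest multiple of `step` to `value` (ties are rounded up)."""
    return math.floor(value / step + 0.5) * step


def acceptable(stated: float, actual: float, step: float) -> bool:
    """A stated value counts as true at a precision level if the actual
    value lies within half a grid step of it (a modelling assumption)."""
    return abs(stated - actual) <= step / 2


# At the coarse level of (2), "one thousand" is the best available value.
print(nearest_on_grid(TRUE_DISTANCE_KM, 100))      # 1000
print(acceptable(1000, TRUE_DISTANCE_KM, 100))     # True

# B's correction in (2) only makes sense after switching to the 1 km level.
print(acceptable(1000, TRUE_DISTANCE_KM, 1))       # False

# In (3), A already speaks at the 1 km level, so the correction is fair.
print(acceptable(972, TRUE_DISTANCE_KM, 1))        # False
```

On this picture, B’s reply in (2) is pedantic because it silently swaps the grid, whereas in (3) both speakers use the same one.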
How do we recognize precision levels? The empirical generalization is easy enough: Round Numbers suggest Round Interpretations in measure expressions. I will call this the RN/RI principle:

(13) RN/RI principle:
     a. Short, simple numbers suggest low precision levels.
     b. Long, complex numbers suggest high precision levels.

This is not necessarily so. In the mathematical language of science we can combine any measurement with any error margin. But it is an evident pragmatic principle in the interpretation of natural language.

The RN/RI principle is, as I said, an empirical generalization only. We should be able to derive it from more general principles of language use, that is, of linguistic pragmatics, which in turn should be motivated as general principles of human behavior. This is the main topic of this article. After a short review of the few previous attempts to deal with the phenomenon, I will try to identify simpler principles that are independently motivated and lead to the RN/RI principle. We will see that two such principles can be identified that give us the correct result, provided they interact in a way that is suggested by Bidirectional Optimality Theory, a theory of the interaction of interpretative principles developed in Blutner (1998, 2000) and subsequent work.

Previous work

The phenomenon we are after is, of course, well known. It belongs to our everyday experience with language, and it has been noted in linguistic work at various places, for example in Sadock (1977), a response to Lakoff (1972), and in Pinkal (1995: 271), a comprehensive theory of the semantics of vagueness. A somewhat more detailed treatment can be found in Wachtel (1980), who develops a theory of rounded numbers for examples like the following:

(14) a. Sam is approximately six feet tall.
     b. Odessa has approximately 1,000,000 inhabitants.

Rounded numbers are marked by indicators of precision levels, like approximately.
Wachtel also observes that we get the approximate interpretation without overt marking with “round” numbers like 1,000,000, but does not explain why this is so.

Curtin (1995), in an unpublished MA thesis, introduces the notion of granularity that can be used in the linguistic representation of reality. He develops the idea that fewer entities and distinctions exist on a coarse-grained level than on a fine-grained level, but that these levels are related to each other by homomorphic mappings. He observes that, with measuring terms, expressions on the coarse-grained level are generally shorter (e.g., one hundred, two hundred, three hundred) than on the fine-grained level (e.g., one hundred, one hundred and one, one hundred and two), but he does not give an explanation why this is so.

Most recently, Lasersohn (1999) developed a theory of imprecise interpretations in terms of pragmatic “slack” of expressions. But the relation to the complexity of expressions is not addressed at all.

A preference for short expressions

It is quite obvious that, everything else being equal, shorter and simpler expressions are preferred. This is a well-known law of the economics of language, and has found a well-known formulation in Zipf (1949). It is also behind a submaxim of manner in Grice’s William James Lectures of 1967, published as Grice (1975), which states, briefly, “Be brief!”. In the Neo-Gricean accounts of Atlas & Levinson (1981), Horn (1984), and Levinson (2000) it corresponds to the I-principle (or Horn’s R-principle), which says: “Produce the minimal linguistic information sufficient to achieve your communicational ends!” (cf. Levinson 2000: 114). The principle is typically motivated as one that is rooted in the interest of the speaker; speakers are lazy, hence like to be brief. However, experience with chatterboxes will make one doubt (i) that this is true of speakers in general in all situations, and (ii) that it is solely based on the interest of speakers.
In any case, one cannot deny that it is a powerful pragmatic principle.

The principle plays an obvious role in precision level choice. It explains why, if the distance between Amsterdam and Vienna is 965 kilometers, one can quite truthfully utter (15.a) instead of (b).

(15) a. The distance between Amsterdam and Vienna is one thousand kilometers.
     b. The distance between Amsterdam and Vienna is nine hundred sixty-five kilometers.

Whatever (15.a) is lacking in precision is compensated by the gain in brevity, at least in typical situations. In general, in a decimal-based system, number words referring to the powers of ten, like 10, 100 or 1000, or multiples thereof, like 30, 400 or 7000, are shorter than number words referring to adjacent natural numbers.

This suggests that principle (a) of the RN/RI (cf. (13)) can be explained by a preference for brevity. In many situations, brevity is a worthier goal to aim for than precision. But can a preference for brevity also explain principle (b) of the RN/RI? Perhaps it can: There is no gain when we interpret (15.b) vaguely, hence it is interpreted in a precise way. I will argue that this line of reasoning is right, and I will develop it more precisely below. But before we do that, a somewhat longish detour concerning brevity is in order.

Consider the following, where we talk about different distances.

(16) a. The distance A is one thousand one hundred kilometers.
     b. The distance B is one thousand and one kilometers.

Clearly, (16.a) can be interpreted in a less precise way even though it is more complex in terms of number of syllables and perhaps even in its morphosyntactic structure. We can counter such examples by saying that, in general, multiples of powers of ten are expressed in a simpler way, and therefore allow for a less precise interpretation, even though there might be exceptions of shorter expressions that are not multiples of powers of 10.

The following examples also show an apparent exception to brevity:

(17) a. The train will arrive in five / fifteen / twenty-five minutes.
     b. The train will arrive in four / sixteen / twenty-four minutes.

We allow for more lax interpretations in the cases of (17.a) than in the corresponding cases of (17.b). The reason cannot be greater brevity of expression in (17.a), as the expressions are equally complex on all counts (morphosyntactic complexity, number of syllables). Obviously, the number or digit “5” is treated in special ways in our decimal-based system, just like the numbers that are powers of ten or multiples thereof. The reason is that it represents the halfway point between a multiple n of a power of 10 and the multiple (n+1) of the same power of 10, and that halves are the conceptually most prominent fractions.

The prominence of half points can also be observed in cases like the following:

(18) a. The train will arrive in half a minute.
     b. The train will arrive in thirty seconds.
     c. The train will arrive in forty seconds.

Here, (18.a) allows for an imprecise interpretation. This also holds, interestingly, for (18.b), perhaps to a somewhat lesser extent, in contrast to examples like (18.c), which are interpreted at a higher precision level. It appears that thirty seconds is just another standardized way to refer to one half of a minute, just as […] is a standardized way to refer to a minute. We find similar phenomena for other measures of time that are not based on the decimal system, cf. e.g. fifteen seconds, fifteen minutes, twelve hours.

Another case in which brevity all by itself does not explain everything is the following. In English there are traces of competing numbering systems, based on 12 (dozen) or 20 (score). I found that people share the intuition that (19.a) is more easily interpreted in a lax way than (19.b):

(19) a. A dozen / two dozen bandits were approaching.
     b. Twelve / twenty-four bandits were approaching.

Notice that the complexity of two dozen bandits and twenty-four bandits is roughly the same: The first expression is syntactically more complex, but the latter one has fewer syllables.

It seems that brevity of expression is related to another factor that is the critical one. The idea is the following: When we want to express certain meanings, we select a particular set of expressions, which I will call the expression-choice space, out of which we choose the expressions that identify the semantic objects. The expressions of the set that we do not choose are possible alternatives, and the fact that they were not chosen may lead to pragmatic implicatures. Expressions that are not in the set do not count in the semantic or pragmatic interpretation. The notion of expression-choice space is similar to Curtin’s notion of granularity. The difference is that granularity concerns meanings, whereas expression-choice space concerns expressions.

To illustrate this idea, consider again the following examples (we assume as above that the “real” distance is 965 kilometers):

(20) a. The distance between Amsterdam and Vienna is one thousand kilometers.
     b. The distance between Amsterdam and Vienna is nine hundred sixty-five kilometers.

In (20.a), the expression-choice space for the number word consists of number words based on the multiples of 100, that is, eight hundred, nine hundred, one thousand, one thousand one hundred, etc. In (b), the expression-choice space for the number word consists of number words based on multiples of 1. In the expression-choice space of (a) there doesn’t even exist a number word nine hundred sixty-five; the number one thousand is the closest one to the real distance. In the expression-choice space of (b), on the other hand, this number word exists, and nine hundred seventy-two is not the best one.

How do we know which items are contained in the expression-choice space? The expression that we are actually using tells us.
If, in a measuring context, we use the number word […], then the expression-choice space will probably contain the number words that are multiples of 10, such as […]. If we use a number word like three hundred, the most likely alternatives are two hundred, three hundred, four hundred, and so on. If we use a number word like one thousand, we can either assume an alternative set like one thousand, two thousand, three thousand, etc., or an alternative set of number words that are multiples of 100, such as nine hundred, one thousand or one thousand one hundred. If we use a number word that refers to the mid number between two adjacent multiples of powers of 10, such as thirty-five, the alternatives are the multiples of this power of 10 and their mid values, such as thirty, thirty-five, forty, etc.

If we evoke numbering systems with different bases, then similar principles apply to the powers of these bases; for example, the number word dozen invokes alternatives like one dozen, two dozen, three dozen, etc. If we refer to dimensions that are measured by measure functions that are not based on the decimal system, such as time, then the expression-choice space may be determined by prominent halves and quarters of the basic measurement units of these dimensions. For example, fifteen minutes will evoke alternatives such as thirty minutes or one hour.

One important point to be mentioned is that a particular expression may allow for different expression-choice spaces; for example, one thousand is, in principle, compatible with number words that express multiples of 100 or multiples of 1000. In general, I will assume that expression-choice spaces of number words based on multiples of lower powers of 10 are possible. The number word one thousand then allows for the following expression-choice spaces:

(21) a. one thousand, two thousand, three thousand, …
     b. … nine hundred, one thousand, one thousand one hundred, …
     c. … nine hundred ninety, one thousand, one thousand and ten, …
     d. … nine hundred ninety-nine, one thousand, one thousand and one, …

There is a systematic relationship between these expression-choice spaces: The upper ones are more coarse-grained, the lower ones are more fine-grained. We can model this relationship by set inclusion of sets of expressions.

An expression like one thousand presumably will prefer an expression-choice space like (21.a) or (b), and not (c) or (d). In general, we can state that the use of an expression that comes with several expression-choice spaces introduces the most coarse-grained, or a more coarse-grained, expression-choice space in the actual pragmatic evaluation.

What is the relation of brevity of expressions to expression-choice space? Put simply, the average complexity of expressions in more fine-grained expression-choice spaces is greater than the average complexity of expressions in less fine-grained expression-choice spaces. Take the following example:

(22) a. one, two, three, four, five, six, seven, … one hundred
     b. five, ten, fifteen, twenty, twenty-five, thirty, … one hundred
     c. ten, twenty, thirty, … one hundred

If we measure complexity by number of syllables, the average complexity of (22.a) is 273/100 = 2.73; of (22.b) it is 46/20 = 2.3, and of (22.c) it is 21/10 = 2.1. Measurement by morphological complexity would yield a similar result. Hence, expressions in less fine-grained expression-choice spaces are, in general, shorter.

We can then qualify the tendency for brevity as one for expression-choice spaces: Speakers prefer expression-choice spaces with expressions of low average complexity. I will call this principle or constraint BriefExpression:

(23) BriefExpression: Expression-choice spaces with shorter, less complex expressions are preferred over expression-choice spaces with longer, more complex ones.

There actually may be systematic pressures that make expressions of coarse-grained expression-choice spaces shorter. Notice that in English, […] is a phonologically and orthographically reduced form (*[…]).
The shortness of the form obviously relates to the duodecimal system found in forms like […]. Such phonological simplification may be mediated by frequency of use. Of course, more comprehensive studies would be necessary to establish any such connection between expression-choice spaces and phonology.

A preference for precise interpretations?

After this somewhat longish digression on brevity, let us return to the general theme of preference for short expressions. We can now phrase it as follows: The pragmatic principle BriefExpression is responsible for our interpretation of a measure term like one thousand kilometers in a vague way. Even if the speaker knows the distance more precisely, he may choose a term that is true only under a vague interpretation, because of the resulting gain in brevity.

It is unclear whether this also explains why we interpret a measure term like nine hundred sixty-five kilometers in a precise way. Intuitively, in this case nothing is to be gained by a vague interpretation, as the expression is long anyway, and therefore we settle on a precise interpretation. In order for this argumentation to go through, we have to assume that, as a matter of principle, precise interpretations of measure terms are preferred over imprecise, vague ones:

(24) PreciseInterpretation: Precise interpretations of measure functions are preferred over vague ones.

We can think of vague interpretations of measure functions as measure functions that allow for errors. For this we can introduce the following precise notation:

(25) If $\mu$ is a measure function, and $p$ is a positive real number, then $\mu_p$ is the vague measure function defined as: For all $x$, $\mu_p(x) = [\mu(x) - p\,\mu(x),\ \mu(x) + p\,\mu(x)]$.

For example, if $\mathrm{km}(x) = 1000$, then $\mathrm{km}_{0.05}(x) = [950, 1050]$.

Why should precise interpretations of measure terms be preferred over vague ones? The obvious reason is: Because they are more informative!
This is a general pragmatic principle; witness Grice’s first submaxim of Quantity, which states that a speaker should make a contribution as informative as required by the current purpose of information exchange.

With BriefExpression and PreciseInterpretation as two independent constraints we can explain the preference patterns as in the following OT tableau. We assume, as before, that the true, precise distance is 965 kilometers.

(26) Expression | BriefExpression | PreciseInterpretation
     (a) The distance between A and V is one thousand kilometers. |   | *
     (b) The distance between A and V is nine hundred sixty-five kilometers. | * |
     (c) The distance between A and V is nine hundred seventy-two kilometers. | * | *

Here, sentence (a) is the shortest candidate, but it violates PreciseInterpretation. Sentence (b) violates BriefExpression, but it allows for a precise interpretation. If we do not rank the two constraints (as indicated by the dotted line), then it is predicted that both sentences are fine. In contrast, (c) violates both constraints, and hence is dispreferred over the other two candidates.

The problem of this view, however, is that an addressee will preferably interpret an expression like The distance between Amsterdam and Vienna is one thousand kilometers as saying that the distance is precisely one thousand kilometers. Under this interpretation, the expression is both brief and precise. But this is clearly not what we find; see the RN/RI principle, (13.a). If we indeed want a precise interpretation of round numbers, we have to mark this explicitly with expressions that indicate the precision level, such as exactly:

(27) The distance between Amsterdam and Vienna is exactly one thousand kilometers.

How can we avoid this problem? Perhaps we should change our line of reasoning by roughly 180 degrees and assume instead a preference for vague interpretations.

A preference for vague interpretations

The opposite of PreciseInterpretation is VagueInterpretation:

(28) VagueInterpretation: Measurement terms are preferably interpreted in a vague way.
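The contrast between precise and vague interpretations rests on the error-interval notation of (25), which can be sketched in a few lines. The names `make_vague` and `km` below are hypothetical, and the toy measure function simply treats its argument as a length that is already given in kilometers.

```python
def make_vague(measure, p):
    """Turn a precise measure function into the vague one of (25):
    x is mapped to the interval [m(x) - p*m(x), m(x) + p*m(x)]."""
    if p <= 0:
        raise ValueError("p must be a positive real number")

    def vague_measure(x):
        value = measure(x)
        return (value - p * value, value + p * value)

    return vague_measure


def km(x):
    """Toy precise measure function (assumption: x is a length in km)."""
    return float(x)


km_005 = make_vague(km, 0.05)
low, high = km_005(1000)

# The interval for "one thousand kilometers" at p = 0.05 is about [950, 1050],
# so it covers the true distance of 965 km assumed in the examples.
print(low, high)
print(low <= 965 <= high)
```

Under the vague measure function the sentence with one thousand kilometers comes out true for the 965 km route, whereas under the precise function `km` it would be false.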
How could such a tendency be motivated? It clearly seems to run against Grice’s first submaxim of Quantity, which states that information content should be maximized. But Grice already added a second submaxim, namely, that the speaker should not give more information than is necessary for the current purpose of information exchange.

A preference for imprecise expressions has been noticed by linguists working in empirical pragmatics. Ochs Keenan (1976), in her seminal study of the pragmatics of rural speech communities in Madagascar, describes conversational maxims that prefer expressions with less informative content over those with more content. One of the reasons, Ochs Keenan claims, is that vague expressions help to save face: One is not as easily proven wrong if one stays vague. This preference is by no means only to be found in exotic rural communities. The philosopher of science Pierre Duhem famously noted that there is a balance between precision and certainty: One cannot be increased except to the detriment of the other (cf. Duhem 1904: 178f., cited after Pinkal 1995: 262).

So, it appears that a tendency for vague interpretation can be motivated. But does it give us what we want? Consider the following tableau:

(29) Expression | BriefExpression | VagueInterpretation
     (a) The distance between A and V is one thousand kilometers. |   |
     (b) The distance between A and V is nine hundred sixty-five kilometers. | * | (*)

Now (a) is preferred, and we also predict that this sentence is preferably understood under a vague interpretation. But (b) is dispreferred; I put the star for a violation of VagueInterpretation in parentheses to indicate that the sentence could be seen either under a vague or a precise interpretation.

There are very obvious problems with this approach. First, (b) is not really dispreferred over (a); it just suggests a precise interpretation.
And more importantly, if (b) is actually uttered, a way to reduce the violations on the hearer’s side is to assume not a precise, but a vague interpretation! This is because this would reduce the violation from two stars to one. But of course this is exactly the opposite of what we find. Something must be fundamentally wrong.

Nevertheless, I will argue that BRIEFEXPRESSION and VAGUEINTERPRETATION are indeed the two constraints that explain the RN/RI principle. But we have to consider the way in which these constraints interact. Put informally, we must find a way to express, and motivate, the idea that one constraint can be violated [...]

The interaction of brevity and vagueness

The theoretical framework that gives us what we need is Bidirectional Optimality Theory, as developed by Blutner (1998) and Blutner (2000); cf. also Dekker & van Rooy (2000) for a game-theoretic interpretation working with the notion of Nash equilibrium.

In classical Optimality Theory (OT), the input to a rule expressed in a tableau of ranked or unranked constraints is a set of expressions, and the output is the one expression, or the set of expressions, that violate the constraints the least. In Bidirectional OT, the input consists of pairs of objects, constraints are independently specified for the members of the pair, and the output consists of those pairs that violate the constraints the least. The constraints are formulated in a modular fashion, for the members of the pairs. But finding the optimal solution(s) requires optimization for both members.

In semantic and pragmatic applications of Bidirectional OT, the pairs consist of an Expression and its Interpretation. There are several ways to define optimality for pairs ⟨E, I⟩ of expressions and interpretations. A particularly interesting one has been suggested by Jäger (2000); it is called bidirectional superoptimality, and it is defined as follows:

(30) A pair ⟨E, I⟩ of a set of candidate expressions GEN is superoptimal iff:
i. There is no superoptimal ⟨E′, I⟩ ∈ GEN such that ⟨E′, I⟩ > ⟨E, I⟩.
ii. There is no superoptimal ⟨E, I′⟩ ∈ GEN such that ⟨E, I′⟩ > ⟨E, I⟩.

The notion of superoptimal pairs ⟨E, I⟩ is thus restricted to those that have no competitor on the expression level or on the interpretation level that is itself superoptimal. (If you think this definition is circular, please [...])

It is the notion of superoptimality that we need to capture the interactions between constraints as we find them in precision level choice phenomena. That is, from the notion of superoptimality it follows that sometimes violating one constraint allows for the violation of other constraints, with no punishment. Consider the following cases:

(31)
i. ⟨The distance between A and V is one thousand kilometers, vague⟩
ii. ⟨The distance between A and V is one thousand kilometers, precise⟩
iii. ⟨The distance between A and V is nine hundred sixty-five kilometers, vague⟩
iv. ⟨The distance between A and V is nine hundred sixty-five kilometers, precise⟩

The rankings that we find for these expression/interpretation pairs are the following, where BE stands for BRIEFEXPRESSION and VI for VAGUEINTERPRETATION:

(32)
i >_BE iii
i >_VI ii
ii >_BE iv
iii >_VI iv

Or, in a Hasse diagram, and in an obvious shorthand:

(33)
                 i. one thousand, vague
                /                      \
ii. one thousand, precise        iii. nine hundred sixty-five, vague
                \                      /
                 iv. nine hundred sixty-five, precise

Notice that (i) and (iv) cannot be compared directly, as they differ both in their expression component and in their interpretation component, and (30) only allows for comparison between pairs that are identical in one component.

Clearly, (31.i) is a superoptimal expression/interpretation pair. There is no superoptimal pair ⟨E, I⟩ such that either ⟨E, I⟩ >_BE i or ⟨E, I⟩ >_VI i. Hence we predict that brief measure expressions are preferably interpreted in a vague way. The pair ii clearly is not superoptimal, because it has a better superoptimal competitor, i. Also, the pair iii is not superoptimal, because it also has i as a better superoptimal competitor. But, interestingly, iv is also a superoptimal pair. The reason is that its only competitors, ii and iii, even though they fare better in terms of BRIEFEXPRESSION and VAGUEINTERPRETATION, respectively, are not themselves superoptimal. Hence we predict that complex measure expressions are preferably interpreted in a precise way.

The intuitive reasoning behind this is as follows. Assume that we encounter an expression like (34):

(34) The distance between Amsterdam and Vienna is nine hundred sixty-five kilometers.

We consider a vague vs. a precise interpretation, ⟨(34), vague⟩ (= iii) and ⟨(34), precise⟩ (= iv). In general, vague interpretations are preferred, so we are inclined to impose a vague interpretation, (iii). But under a vague interpretation, we could express the same truth conditions with (35):

(35) The distance between Amsterdam and Vienna is one thousand kilometers.

The pair ⟨(35), vague⟩ (= i) should be preferred over iii because it is shorter. But the speaker has not uttered (35). Hence we can reject the vague interpretation of (34) (= iii), and select the precise interpretation, (iv). This, precisely, is what was meant above by saying that the violation of one constraint (here, brevity) allows for the violation of another (here, vague interpretation).

This line of argumentation has been used for a number of other phenomena in pragmatics. For example, Blutner employed it to derive the preferred interpretation of double negatives such as not unhappy as ‘happy, but not quite so’, or the interpretation of explicit causatives like cause the sheriff to die as expressing indirect, non-stereotypical causation. These applications all involve a derivation of the M-principle of Levinson (2000), which states that “marked expressions have marked meanings”. In the case at hand, the “marked” measure expression is the complex one, like nine hundred and sixty-five, and the “marked” interpretation is the precise one. With superoptimality, the M-principle is not an axiom anymore, but a theorem.
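As an illustration only, the recursive definition in (30) can be checked mechanically for the four pairs in (31). The following Python sketch is not part of the original argument. The numeric violation counts and the names GEN, better and superoptimal are assumptions made for the example; only the relative order of the counts matters (the shorter expression beats the longer one, the vague interpretation beats the precise one).

```python
from functools import lru_cache
from itertools import product

# Hypothetical violation counts (assumptions for illustration only).
# BE: rough length of the number word; VI: 0 for vague, 1 for precise.
EXPRESSION_COST = {"one thousand": 2, "nine hundred sixty-five": 4}
INTERPRETATION_COST = {"vague": 0, "precise": 1}

# Candidate set: every expression paired with every interpretation.
GEN = list(product(EXPRESSION_COST, INTERPRETATION_COST))


def better(q, p):
    """True if q outranks p; only pairs identical in one component compare."""
    (eq, iq), (ep, ip) = q, p
    if iq == ip and eq != ep:
        # Same interpretation: the briefer expression wins (BE).
        return EXPRESSION_COST[eq] < EXPRESSION_COST[ep]
    if eq == ep and iq != ip:
        # Same expression: the vaguer interpretation wins (VI).
        return INTERPRETATION_COST[iq] < INTERPRETATION_COST[ip]
    return False


@lru_cache(maxsize=None)
def superoptimal(p):
    """p is superoptimal iff no better competitor is itself superoptimal."""
    return not any(better(q, p) and superoptimal(q) for q in GEN)


SUPEROPTIMAL = [p for p in GEN if superoptimal(p)]
print(SUPEROPTIMAL)
```

The recursion terminates because the ranking is strict on a finite candidate set, so every chain of ever better competitors ends. Under the assumed counts, the sketch singles out the brief and vague pair and the long and precise pair, in line with the discussion of (31).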
We can explain the Emergence of the [...]

Conclusion

I showed that by assuming pragmatic tendencies for short expressions and vague interpretations one can account not only for the fact that short expressions have a preference for vague interpretations, but also that long expressions have a preference for precise interpretations. I have qualified the notion of short expressions as one relativized to the expression-choice space, and I have tried to motivate these pragmatic tendencies. The two tendencies have to interact in a particular way so that the two observed pairings of expressions and interpretations result as stable solutions. This can be achieved within the framework of Bidirectional Optimality Theory, where they appear as the two superoptimal solutions. In essence, long expressions are allowed, provided that they are interpreted precisely, because under a vague interpretation a short expression would have been preferred. I have indicated that this line of reasoning is just one application of a general type of explanation that applies to a wide range of phenomena of the type “marked expressions express marked meanings” and “marked meanings are expressed by marked expressions”.

Notes

1. This paper is based on talks given at the Sinn und Bedeutung Conference in December 2000 in Amsterdam, and on a talk at the Hebrew University in Jerusalem in April 2001. I thank the audiences for their comments, and in particular Reinhard Blutner, Larry Horn, Gerhard Jäger, Michael Morreau, Carl Posy, Philippe Schlenker, and Henk Zeevat.

2. Including the good one that it is indeed easier to give common fractions like 1/2, 1/3 or 1/4 for feet and yards, which suggests that we should give up the decimal system of counting in favor of the duodecimal one, if not the sexagesimal one of the Babylonians.

References

Atlas, Jay David and Stephen C. Levinson
1981 It-clefts, informativeness, and logical form: Radical pragmatics (revised standard version).
In: Peter Cole (ed.), Radical pragmatics, 1–61. New York: Academic Press.

Blutner, Reinhard
1998 Lexical pragmatics. Journal of Semantics 15: 115–162.
2000 Some aspects of optimality in natural language interpretation. Journal of Semantics 17: 189–216.

Curtin, Paul
1995 Prolegomena to a theory of granularity. Unpublished MA thesis.

Dekker, Paul and Robert van Rooy
2000 Bi-directional Optimality Theory: An application of game theory. Journal of Semantics 17: 217–242.

Duhem, Paul
1904 La théorie physique, son objet et sa structure.

Grice, Paul
1975 Logic and conversation. In: Peter Cole and Jerry L. Morgan (eds.), Syntax and Semantics 3: Speech Acts, 41–58. New York: Academic Press.

Horn, Laurence R.
1984 Toward a new taxonomy for pragmatic inference: Q-based and R-based implicature. In: Deborah Schiffrin (ed.), Meaning, form, and use in context: Linguistic applications, 11–89. Washington, D.C.: Georgetown University Press.

Jäger, Gerhard
2000 Some notes on the formal properties of bidirectional optimality theory. To appear in Journal of Logic, Language and Information.

Lakoff, George
1972 Hedges: A study in meaning criteria and the logic of fuzzy concepts. Papers from the Eighth Regional Meeting of the Chicago Linguistic Society 1972: 183–228.

Lasersohn, Peter
1999 Pragmatic halos. Language 75: 522–551.

Levinson, Stephen C.
2000 Presumptive meanings. Cambridge (MA): MIT.

Ochs Keenan, Elinor
1976 The universality of conversational implicature. Language in Society 5: 67–80.

Pinkal, Manfred
1995 Logic and lexicon. Dordrecht: Kluwer.

Sadock, Jerrold M.
1977 Truth and approximations. In: Kenneth Whistler, Robert D. Van Valin, Jr., Chris Chiarello, Jeri J. Jaeger, Miriam Petruck, Henry Thompson, Ronya Javkin, and Anthony Woodbury (eds.), Proceedings of the 3rd Annual Meeting of the Berkeley Linguistics Society, 430–439. Berkeley: Berkeley Linguistics Society.

Wachtel, Tom
1980 Pragmatic approximations. Journal of Pragmatics 4: 201–211.

Zipf, George K.
1949 Human behavior and the principle of least effort. Cambridge: Addi-