Optimal interpretation as an alternative to Gricean pragmatics

The paper introduces an optimality-theoretic notion of optimal interpretation based on presupposition theory and shows that it is a viable alternative to Gricean pragmatics. It is moreover directly applicable to presupposition and rhetorical structure and improves the insights in those areas. The last sections are concerned with provisionally making this point.

Since Karttunen (1973) the aim of presupposition theory has been to deal with the projection problem, and influential and lasting contributions have been made by Karttunen himself, Gazdar (1979), Soames (1982) and others. Though there is definitely still controversy and quite a lot of problems, the formulations of Heim and Van der Sandt from the 1980s have not been superseded by better and more comprehensive theories [3]. The projection problem is the problem of predicting which presuppositions emerge as implicatures or entailments of the whole sentence in complex sentences that contain presupposition triggers. The formulation pre-dates the dynamic views of semantics from the 1980s and uses notions like implicature or entailment that come from the logical tradition. Within dynamic semantics, one could restate it as follows: if a sentence with a presupposition trigger is used in a context, under what conditions on the context and the sentence will the presupposition be true in the context that incorporates the interpretation of the sentence? In this formulation, the presupposition can be in the new context because it was already in the old context or because it is added by the interpretation of the utterance. A better alternative, which is needed to incorporate local accommodation and intermediate accommodation in the style of Heim and Van der Sandt, would be: if a sentence with a presupposition trigger is used in a context, under what conditions on the context and the sentence will the presupposition be true in which subcontexts of the context that contains the interpretation of the sentence?

[2] It builds on the unpublished Blutner and Jäger (1999), which reconstructs Van der Sandt's account in optimality theory.
[3] E.g. the recent attempts of Schlenker and his students at providing different foundations for presupposition theory all seem to be aimed at reconstructing Heim's predictions, even in the cases where these are problematic.

OSLa volume 1(1), 2009
The typical facts that need to be explained can be illustrated by the following example:

(1) a. John thinks that if Mary left, Harry will be glad that she left.
    b. John thinks that if Mary was angry, Harry will be glad that she left.

The second but not the first example will force the new context to contain the information that Mary left. In Van der Sandt's theory or in Heim's theory, this is because the trigger be glad has access to Mary left in the condition of the embedded conditional in the first example (resolution), but not in the second. The second sentence can thereby be used only in contexts that either have the information that Mary left or where that information can be added to the context without making it inconsistent (accommodation). For a fuller discussion of Heim's and Van der Sandt's theories see Beaver (1997) and Geurts (1999). The following is an abstract version of these theories.
(i) Presupposition triggers have a presupposition p that must hold at the site of the trigger.
(ii) p should be resolved to an accessible part of the context of the trigger.
(iii) If this is not possible, p should be accommodated (added to some context of the trigger).
(iv) p should preferably be added to the outermost context of the trigger if it is consistent there.
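Principles (i) to (iv) can be read as a small procedure. The following is only a sketch under strong simplifying assumptions that are not part of the theories discussed: contexts are modelled as sets of proposition labels ordered from the outermost to the local context, every listed context counts as accessible, and consistency is checked only against the accommodation site itself. The names `negate` and `project` are hypothetical.

```python
from typing import List, Set, Tuple


def negate(p: str) -> str:
    """Toy negation on proposition labels: 'p' <-> 'not p' (assumption)."""
    return p[4:] if p.startswith("not ") else "not " + p


def project(p: str, contexts: List[Set[str]]) -> Tuple[str, int]:
    """Decide what happens to presupposition p of a trigger.

    contexts[0] is the outermost (global) context and contexts[-1]
    the local context of the trigger.
    """
    # (ii) Resolution: look for p in the accessible contexts, local one first.
    for i in range(len(contexts) - 1, -1, -1):
        if p in contexts[i]:
            return ("resolve", i)
    # (iii)/(iv) Accommodation: add p to the outermost context
    # in which it does not clash with the negation of p.
    for i, ctx in enumerate(contexts):
        if negate(p) not in ctx:
            return ("accommodate", i)
    # (i) cannot be met: the trigger cannot be interpreted.
    return ("fail", -1)


# (1a): the antecedent of the embedded conditional supplies "Mary left".
example_1a = [set(), set(), {"Mary left"}]
# (1b): the antecedent supplies something else.
example_1b = [set(), set(), {"Mary was angry"}]

print(project("Mary left", example_1a))  # ('resolve', 2)
print(project("Mary left", example_1b))  # ('accommodate', 0)
```

If the global context already contains "not Mary left", the same call moves the accommodation one level inwards, which corresponds to intermediate accommodation in this toy model.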
Principles (i) and (ii) are in essence due to Karttunen. (iii) and (iv) are more problematic. The problem with (iii) is that not all triggers accommodate, and with (iv) that accommodations do not always go to the outermost context. (iii) is not hard to motivate: it follows from the fact that presuppositions are not superfluous (they can be conditions on definedness of concepts or mark properties of the contexts in which they are used) and so are related to the acceptability of the utterance. The surrounding contexts are the ones that are relevant for resolution. So any accommodation that meets (iii) restores the situation that would be needed for resolution. But (iv) is not easy to motivate, and the lack of motivation is an embarrassment to the theory. Van der Sandt (but not Heim) advocated a stronger version (iv') of (iv).
(iv') p should preferably be added to the outermost context of the trigger in which it is consistent.
This version has been shown by Beaver to make some wrong predictions, especially in cases like (2).
(2) Most German housewives wash their Porsche on Sundays.
Principle (iv') predicts that this should mean (3) (which might well be true).
(3) Most German housewives who have a Porsche wash it on Sundays.
But nearly everybody understands it as meaning (4), which is clearly not true [4] (local accommodation).
(4) Most German housewives have a Porsche and wash it on Sundays.
The four principles can be generalised to four general constraints. What principle (i) says is that the local [5] truth of the presupposition for the speaker is required for the use of the presupposition trigger: if the local truth fails, the presupposition trigger cannot be used. From the hearer perspective, it means that her reconstruction of what the speaker is intending to say must take the presupposition trigger into account and the local truth must be checked or created within her interpretation of what the speaker is saying. Why? Because otherwise the interpretation would not be such that the hearer could have used the same words to achieve the same intention, putting herself in the speaker's position. This last formulation can be underpinned by Grice's theory of non-natural meaning (meaning NN, Grice 1957), which requires the recognition of the intention of the speaker by the hearer to be intended by the speaker as part of her plan to realise that intention. Recognition of the intention by the hearer would involve at least the truth of the statement that the hearer can imagine that, had she been the speaker, she could have used the same words. And this is the general constraint on interpretations that we will adopt:

1 FAITH [6]: the hearer could have used the utterance herself to express the interpretation, if she had been the speaker.

[4] In fact, most people read it as implicating that all German housewives have Porsches, a reading not predicted by Van der Sandt.
[5] This formulation ignores a complication with referring expressions and particles, where the presupposition can fail to be true in the local context, provided it is true in an accessible context. This complication can be ignored, as discussed later, because such triggers do not accommodate.
The last proviso is for dealing with perspectival shifts, for limitations on the speaker's command of the language, for errors etc. For presupposition triggers, FAITH captures the generalisation that triggers can be used only where the local truth of their presupposition is assumed. (This could be taken as a definition of what it means to be a presupposition trigger.) But presupposition is only a special case: the FAITH constraint is about all restrictions on how to express oneself in a language, which includes all lexical, syntactic, semantic, morphological and phonological rules and facts. The lexical, phonological and semantic fact that some words, intonation patterns and constructions are presupposition triggers is the fact needed for presupposition. FAITH can be articulated in many ways. One could look at the language and speech generation systems and the principles on which these are based. Or one can look at optimality theoretic syntax and phonology. But in principle any theory that defines the relation between linguistic form and interpretation can be used, including the recent probabilistic approaches. In this paper, OT syntax will be assumed and, at one point, max-constraints in OT syntax [7]. Since expressive constraints should be part of any theory of grammar, this again does not rule out other models. FAITH is not an absolute constraint. It allows for imperfect use of language or speech, in which case the hearer has to correct for the imperfections. This aspect will be ignored in the rest of the paper.
One aspect of principle (iv) (see above) is a well-known principle of interpretation: the fact that internally inconsistent interpretations, and interpretations that clash with the context, give way to consistent interpretations. In probabilistic approaches to language one implicitly makes the further assumption that interpretations that are less plausible than others given the context also give way to more plausible ones. This can be formulated simply as:

2 PLAUSIBLE: maximise plausibility (an interpretation is bad if there is a more plausible interpretation)

[6] In OT, faithfulness constraints are the constraints on the relation between the input and the output.
[7] This allows a precise formulation in terms of OT syntax and phonology. Suppose a complete OT system is given deciding which candidate pronunciation is optimal for a certain semantical input in a context. An interpretation meets FAITH if there is no alternative interpretation for which the utterance would be better in the context from which the speaker made the utterance. The notion of "better" can be spelt out in terms of the given constraint system.

PLAUSIBLE applied without restriction would make any interpretation maximally plausible, i.e. trivial. FAITH is a necessary restriction: only more plausible interpretations for which the hearer, stepping into the speaker's shoes, could have used the same expression will come in place of less plausible ones. So PLAUSIBLE does not rule out surprising interpretations or even corrections (implausible in any context, because they are inconsistent with it). The rules that govern language production allow people to say interesting things, because it is not possible to ignore them in interpretation. Without them, language users are condemned to exchanging unsurprising messages. For presupposition accommodation, PLAUSIBLE just rules out accommodation sites if more plausible ones are available. PLAUSIBLE is the central principle for disambiguation. The rest of (iv) will be dealt with later on. Principles (ii) and (iii) are repeated below:

(ii) p should be resolved to an accessible part of the context of the trigger.
(iii) If this is not possible p should be accommodated (added to some context of the trigger).
These principles together prefer resolution over accommodation, and this can again be seen as the outcome of a more general constraint, that of preferring not to make unnecessary extra assumptions in interpretations, especially the assumption of new discourse referents of all kinds [8]. A similar constraint has been adopted in another optimality theoretic treatment, Hendriks and de Hoop (2001).

3 *NEW: old referents are preferred over connected referents, which are preferred over new referents.

Again, without limitations this principle condemns the hearer to interpret anything as old news. But it is meant to apply after FAITH and PLAUSIBLE have ruled out other interpretations. It is below PLAUSIBLE, because new interpretations can be more plausible than ones in which the information expressed is identified with old information and where referents are identified or connected with old discourse referents. *NEW is not the reason why presupposition triggers or pronouns are resolved to given information or given identity. The assumption is that triggers, as lexical, phonological or syntactic items, mark their referents and presuppositions as given (or more accurately, they mark for properties that imply givenness). So FAITH is responsible for the obligation to resolve. *NEW is responsible for optional anaphora only. For example, the Slavic bare singular lexical NP can be both definite and indefinite. It is therefore not a presupposition trigger even though it has all the anaphoric and bridging possibilities of a definite NP.

[8] Discourse referents go back to Karttunen (1976) and are incorporated into all schemes of dynamic interpretation. Nothing in this paper conflicts with the discourse representation theory of Kamp and Reyle (1993). See also the appendix.
For presupposition, resolution of the presupposed material to existing material accessible in a context of the trigger means that the demands of FAITH are satisfied. There is then no need for accommodation, and accommodation will be blocked by *NEW.

*NEW is a powerful principle. It prefers resolving interpretations of Russian bare NPs over bridging ones and bridging ones over indefinite ones. And it gives the same preferences for English definite NPs and prefers bridging indefinites over unconnected ones. Moreover, as will be discussed later, in combination with RELEVANCE, it is also responsible for the defaults in rhetorical discourse structure.

(iv) p should preferably be added to the outermost context of the trigger if it is consistent there.

This preference will be a consequence of RELEVANCE.
4 RELEVANCE: let the interpretation decide any of the activated questions it seems to address.
In this formulation of RELEVANCE, "the activated questions" should be read as presupposing that there are such questions. In particular, there must be one activated question that captures the point of the utterance [9]. Questions can be activated by conversation participants raising them explicitly, by being evoked by previous contributions and other features of the context, and finally by being evoked by the utterance itself.

Unrestricted RELEVANCE would not work: RELEVANCE comes after the other constraints. The reason is simple: any arbitrary way of settling all activated questions would always be the most relevant interpretation. But such an interpretation is random and would not be what the speaker intended. Therefore it should be below FAITH. But also below *NEW, because otherwise global accommodation would be allowed when local resolution is possible. It should be below PLAUSIBLE, because what is implausible cannot be implicated. Similar principles have been defended before by Grice (1975), by Sperber and Wilson (1984), by Van Rooy (2003) and others. It can be derived from cooperativity: activated questions are goals of the conversation, whether they are already given or activated by the utterance itself. Cooperativity can also be used to argue that the utterance itself addresses an activated question or one activated by itself.

[9] Without this assumption, it would not be possible to recapture the insight of Stalnaker (1979) that assertions cannot just bring information that is already in the common ground. The question answered by the clause as such can be identified with its clausal topic.

For presupposition, RELEVANCE works as follows. Assume that a trigger presupposes p and that p cannot be resolved. This activates the question "whether p". The speaker indicates by her use of the trigger that she is assuming a positive answer and thereby seems to address the question, unless the context already has conflicting information. The interpretation that answers "whether p" positively adds the information that p to the outermost context.
The constraints are meant to apply in the order in which they were given. FAITH can force implausible interpretations, and that possibility is necessary for telling interesting things and for carrying out corrections. FAITH also may force the introduction of new discourse referents and prevent the settling of activated questions. PLAUSIBLE is a filter on how items can be resolved and on whether they can be resolved, thus sometimes overriding *NEW. RELEVANCE must be below *NEW because global accommodation can be more relevant than local resolution (which appears to be always preferred). The constraint system of these four constraints in the given order can be understood as an ordered optimality theoretic system of constraints, an OT pragmatics. The constraints allow violation and the ordering fully decides the conflicts that can arise. But there are also substantial differences with optimality theoretic systems as employed in phonology and syntax.

First of all, there is no indication that any kind of variation is possible with respect to the ordering. Second, it is not clear that finite scoring of errors is the most appropriate. A candidate interpretation loses if another interpretation is better with respect to the constraint, where better may be interpreted by finite as well as continuous scoring techniques. Both of these features suggest that this OT pragmatics should not be part of Optimality Theoretic linguistics (on a narrow view, OT linguistics is just the account of FAITH) but is rather just another optimisation problem, like robot ethics, choosing a party dress, making an investment decision in a situation of partial information and others that have been noted in the literature. The special problem solved by pragmatics is explaining linguistic behaviour as a special case of explaining human behaviour and of explaining natural events. Grice (1957) starts by giving some examples of natural meaning, as in (5).
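The strict ordering of the four constraints can be illustrated with a lexicographic comparison of violation profiles. This is a minimal sketch, not the paper's formal system: the labels FAITH, PLAUSIBLE, *NEW and RELEVANCE, the function name `optimal` and all violation scores are assumptions made for the illustration. Scores are plain numbers, so both finite and continuous scoring fit.

```python
from typing import Dict, List, Tuple

# Assumed labels for the four constraints, highest-ranked first.
ORDER = ("FAITH", "PLAUSIBLE", "*NEW", "RELEVANCE")


def optimal(candidates: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the candidate interpretations with the best violation profile.

    Each candidate maps constraint labels to a violation score
    (missing label = no violation). Profiles are compared
    lexicographically, so a higher-ranked constraint always decides
    before a lower-ranked one is consulted.
    """
    def profile(name: str) -> Tuple[float, ...]:
        return tuple(candidates[name].get(c, 0.0) for c in ORDER)

    best = min(profile(name) for name in candidates)
    return [name for name in candidates if profile(name) == best]


# Illustrative scores for a presupposition that can be resolved locally.
with_antecedent = {
    "local resolution": {},
    "global accommodation": {"*NEW": 1},
    "local accommodation": {"*NEW": 1, "RELEVANCE": 1},
}
# The same trigger when no antecedent is accessible.
without_antecedent = {
    "global accommodation": {"*NEW": 1},
    "local accommodation": {"*NEW": 1, "RELEVANCE": 1},
}

print(optimal(with_antecedent))     # ['local resolution']
print(optimal(without_antecedent))  # ['global accommodation']
```

In the second case both candidates violate *NEW equally, so the decision falls to RELEVANCE, which favours settling the question "whether p" in the outermost context.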
(5) a. These spots mean that the child has the measles.
    b. Smoke means fire.

These are statements in which some natural phenomena (the spots and the smoke) are explained. The statements explain them by giving their cause: the measles and the fire. A natural explanation is just supplying another natural phenomenon (temporally and spatially connected to the explanandum) for which there is a causal principle that can derive the explanandum from the explanation. The correctness question about the explanation is whether the explanation really caused the explanandum. It is not clear that one can ever be entirely sure. To account for the powers of explanation of natural phenomena that one finds in humans and animals, it seems useful to think of explanation in terms of optimisation. The quality of explanation can then be thought of as something that biological and cultural evolution maximised by providing an ever sounder grasp on the component factors. The quality of explanation is clearly a factor in survival and sexual selection as well as in social success. So there is evolutionary pressure for better explanation abilities both in biological and in cultural evolution. The overall concept that is optimised can be articulated as a constraint system much like the interpretation system of the last section.
[1] The explanation should be a possible cause.

Presumably, earlier successful explanations are remembered, which leads to associative links that provide a range of possible causes. This is analogous to FAITH: only interpretations that can be possible causes of the utterances are considered.
The difference is that an utterance is behaviour that the interpreter could also carry out herself to realise the communicative intention that is the cause.This provides a strong test on the causal link.In the explanation of natural phenomena this must be filled by representations of causal knowledge.
[2] If there is more than one possible cause, one should compare them for probability in the context. The most likely should be chosen: this maximises the chance of being right.

[3] If there is more than one most probable cause, they can be inspected for newness: is part of the cause already known? What part is completely new? What new states of affairs do we need to postulate? Minimising the new hypotheses is best.
In the philosophy of science, [3] is Ockham's razor.
[1] and [2] can be related to techniques like Bayesian nets in AI.
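Steps [1] to [3] can be sketched as a filter followed by two selections. The example below is an illustration only: the class `Cause`, the function `best_explanations` and every number are invented for the purpose, and the score used for [2] is the unnormalised Bayesian posterior, prior times likelihood, which is one simple way of comparing causes for probability in a context.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import List


@dataclass
class Cause:
    name: str
    prior: Fraction        # assumed P(cause) in the context
    likelihood: Fraction   # assumed P(observation | cause)
    new_hypotheses: int    # new states of affairs that must be postulated


def best_explanations(causes: List[Cause]) -> List[str]:
    # [1] Keep only possible causes of the observation.
    possible = [c for c in causes if c.likelihood > 0]
    if not possible:
        return []
    # [2] Keep the most probable ones (posterior is proportional
    #     to prior * likelihood, so no normalisation is needed).
    top = max(c.prior * c.likelihood for c in possible)
    likely = [c for c in possible if c.prior * c.likelihood == top]
    # [3] Among those, minimise the number of new hypotheses.
    fewest = min(c.new_hypotheses for c in likely)
    return [c.name for c in likely if c.new_hypotheses == fewest]


# Invented numbers for "these spots mean that the child has the measles".
spots = [
    Cause("measles", Fraction(1, 10), Fraction(9, 10), 1),
    Cause("allergy", Fraction(1, 5), Fraction(3, 10), 1),
    Cause("sunburn", Fraction(1, 2), Fraction(0), 0),
]
print(best_explanations(spots))  # ['measles']
```

Exact fractions are used so that ties in step [2] are detected reliably; with floating-point scores a tolerance would be needed.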
If the analogy is correct, the pragmatic system is just the ability to form successful explanations of utterances. The only new element is RELEVANCE, which does not have an analogue in the explanation of natural phenomena and would indeed be expressive of cooperation and communication. If questions have been raised or are being raised and information is offered that pertains to them, the hearer has the right to expect that the interlocutor is cooperative and that the information given answers those questions. Language use is voluntary behaviour, and a natural explanation for why the behaviour is wanted and produced is given by precisely the activated questions.

The analogy with the philosophy of science can then be carried further. Philology would go beyond natural science in asking for understanding of the texts and artefacts that it studies, a reconstruction of the intentions behind those objects. The difference with natural science stems from the fact that the aim of a communicative event is communication and that the human interpreter can, as a potential producer, try to reconstruct the intentions and goals behind the communicative act. And the same situation holds for the objects studied by philology: they have a purpose and, as a fellow human being, one can try to reconstruct the purposes that led to their creation.
There is continuity in this account between interpretations of natural phenomena, behaviour, involuntary vocalisations, non-linguistic communication and linguistic communication.It is the same system relying on the same kind of resources (connections between causes and effects in experience) and producing explanations of experiences.Where the explanation of behaviour starts to include the intentions of the person who carried out the behaviour, these intentions may themselves be explained by the currently activated questions that are common ground between speaker and hearer.
As an example, consider the following situation. John stands by the road waving his jacket at me. I should be asking myself first when I would be standing by the road waving my jacket at somebody. This is FAITH: it requires me to consider possible explanations. The possible answers should be weighed by plausibility, and more plausible answers should be selected in preference over less plausible ones (PLAUSIBLE). I should then be wondering about the new elements in my explanation: can they be eliminated, can I connect them to already known things (*NEW)? And if there are activated questions, can John be settling them by his waving (RELEVANCE)? If one supposes that we were looking for a lost cow, the proper explanation may be that John has seen it and that he is indicating with his waving where it is. In this case, the pragmatic system turns a gesture that is not very meaningful by itself into a statement: I have found the cow! Here it is!
The theory cannot be a general pragmatics unless it is able to recapture the effects of Gricean pragmatics, in particular conversational implicatures.This section discusses a number of examples only and tries to make it clear how the implicatures should follow.Grice explains particularised conversational implicatures as deliberate floutings of his maxims: the speaker is overtly uncooperative.It is then the task of the hearer to work out reasons for why the speaker does not follow the maxim she is flouting.
The following extra assumption is necessary for these implicatures.It belongs to linguistic competence that special expression strategies can be followed: sometimes things can be expressed better by a joke, irony, understatement, hyperbole, metaphor and marked forms (forms whose meaning is normally expressed in another way).It can be more effective to do so, more amusing or more polite.
The speaker and the hearer share knowledge of these strategies, and it would be part of production optimality theory to say more about when certain intentions prefer expression by these special strategies and about the rules that define the strategies. This area is largely unexplored, but nothing indicates that an account is impossible. An account of this kind is necessary to make these cases pass FAITH.

Here, the only aim is to show that the hearer can infer the intention behind the utterances in question.
For a group of implicatures, a conflict arises with PLAUSIBLE. Irony, metaphor, understatement and hyperbole have literal interpretations that cannot be true, because they conflict with knowledge about the speaker, general knowledge or conceptual knowledge. Assuming that the speaker is being ironical, understating, metaphoric or hyperbolic leads to a different hypothesis about what the speaker is trying to express and drastically improves plausibility. Saying that you heard a thousand feet in the corridor relies on the hearer recognising that this cannot be true and on the hearer realising that this is only a way of expressing the impression the event made on the speaker. Saying that John is a cunning fox similarly conflicts with plausibility, and the interpretation relies on figuring out how it could still be saying something that is true of John. In a marked expression FAITH does not map the straightforward interpretation to the chosen form, as in (6).

(6) Mrs T. produced a series of sounds closely resembling the score of "Home Sweet Home".

The literal interpretation would be mapped to: Mrs. T sang "Home Sweet Home". But with the extra innuendo that there was something wrong with the singing, the hearer can imagine producing the form herself (if she were as clever as Grice or the reviewer he is quoting).

In another example of Grice's, (7), the speaker addresses a different topic from the one given: the qualifications of John as a lecturer.

(7) John is an excellent cyclist.
The statement is on topic again (and thus meets *NEW and RELEVANCE better) if the refusal to address the given topic is recognised, including the reasons that might have led the speaker to do so. RELEVANCE gives rise to generalised conversational implicatures: exhaustivity implicatures including scalar implicatures.

(8) Utterance: John has 3 sheep and five cows.

These implicatures do not arise in situations where the questions are not activated.
PLAUSIBLE would be responsible for the general assumption of normality (a normal car, a normal killing, a normal drink) since it prefers the most probable interpretations. For a case like drink (normally: alcoholic drink) this would be a question of figuring out what is typically meant by drink, something that is correlated with the frequency with which drink is used for alcoholic drinks. The context (e.g. the person who is supposed to drink it is 6 years old) or modifiers (soft) can influence this decision.
For clausal implicatures, expressive constraints in syntax need to be assumed, i.e. they are in the province of FAITH. For clausal implicatures, the following two are needed: max () and max (), which force the formal expression of the speaker's adherence to the proposition expressed or the formal expression of the doubts that the speaker has with respect to the clause expressed. The expressive constraints can only be satisfied if the morphology or lexical material is available in the language for the expression of adherence or doubt. This is the case for English indicative conditionals, which stand in opposition with both the causal and the counterfactual constructions. This means that the speaker cannot take the condition or the consequence of an indicative conditional to be true or subject to doubt without being forced into an alternative construction. Choosing the indicative conditional thereby implicates that both condition and consequence are open issues. The clausal implicatures of disjuncts of a disjunction follow in the same way from the competition with just saying one of the disjuncts in isolation. Other non-entailed clauses (e.g. complements of belief attributions) do not give rise to the clausal implicatures because expression alternatives are not available.
The constraints except FAITH (which merely determines the space of possible interpretations) promote certain interpretations over others. PLAUSIBLE militates in favour of probable interpretations. *NEW tries to keep referents old. RELEVANCE prefers settling questions over leaving them open. What to do if the speaker wants to express an implausible reading, or draw the attention to new objects, or does not want to settle an activated question or a question that is activated by his utterance? Such a case automatically puts the speaker under an expressive obligation, since doing nothing means that the speaker will be misunderstood. If the misunderstanding is a question of the value of a semantic feature, it is likely that there is an expressive constraint to express the feature. And there are often grammaticalized devices for expressing such semantic features. The following are some markers that can be associated with the task of putting the constraints out of action.

• PLAUSIBLE: mirative markers (even, only, just, yet, already), correction markers (no, but, however), intonational contrastive marking. Even lexical words can be thought of as devices that mark against PLAUSIBLE.
• *NEW: indefinite marking, presentational constructions, contrastive markers and contrastive intonation, tense change (tense morphemes, now, then), mood change, additive marking (too, also, and)
• RELEVANCE: sentence final rising intonation, "wave" intonation, words like "some" and "certain" expressing uncertainty, downtoners, mood.

This is as predicted. Language evolution has created markers to act against the tendencies promoted by pragmatics. The real problem is that one can also find many grammaticalised markers for tendencies that are being enforced by the constraints, in particular old markers, while old interpretations are already enforced by *NEW.
The problem can be solved. If PLAUSIBLE were not restricted by FAITH, i.e. by absolute rules, it would make it nearly impossible to say anything interesting. E.g. I could not correct your opinion by saying (10).

(10) He is NOT in Spain.

It would be much better for you to interpret he as referring to somebody other than the person you have your opinion about: some person who, also according to you, is not in Spain. And if no such person presents himself, by inventing a person like that. This would violate *NEW, but would also make the interpretation more plausible. Consider another example (11).
(11) John believes that that is precious.
     (pointing at a plastic coffee mug or without pointing)

Given that John is reasonable, any other referent for that for which it is more plausible that it is precious is preferable, and such objects can be created by accommodation. But accommodations or resolutions to less activated antecedents are excluded for pronouns and demonstratives, and that is exactly what allows corrections and the attribution of implausible opinions using these devices. The typical old markers are marking for certain properties: he marks its referent as a highly activated male person, that marks its referent as an object in the utterance situation that is indicated by the speaker. The hearer cannot reconstruct the marked feature of these items unless the referents are really highly activated or really in the utterance situation and indicated. If they are not, the interpreter has to give up or enter into real repair strategies. So the conclusion must be that old marking of this kind is lexically coded and belongs to FAITH in order to override PLAUSIBLE.
It is not an accident that the DRT development algorithm (Kamp and Reyle 1993) includes absolute pronoun resolution: all pronouns must be resolved. This indicates that proper anaphora belongs to FAITH: otherwise, there would be exceptions due to PLAUSIBLE. The property that licenses their use is a feature of the context as such, and they are typically ways of satisfying expressive constraints.
Expressive constraints would also be responsible for preference based on the referential hierarchy assumed in Gundel et al. (1993) or on the similar hierarchies in natural language generation (Reiter and Dale 2000).The following hierarchy could be part of a generation system for Dutch.
(12) first and second person pronouns > reflexives > 3rd person pronoun > demonstrative pronouns > anaphoric and bridging definites and short names > full demonstrative NPs > full descriptions and full names > indefinites
The higher type of NP is preferred if the condition for its use is met. This gives a corresponding hierarchy of max(F) constraints where F varies over conversational participant, c-commanded, high activation, indicated, activation [10], visible and uniqueness.
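The preference expressed by hierarchy (12) can be sketched as a simple lookup: walk down the hierarchy and take the first NP type whose licensing feature holds of the referent. This is only an illustrative sketch. The pairing of each NP type with exactly one feature follows the order in which the two lists are given and is an assumption, and the names `HIERARCHY`, `FALLBACK` and `choose_np_type` are hypothetical.

```python
from typing import List, Set, Tuple

# Hierarchy (12), highest type first. The one-to-one pairing of NP types with
# licensing features is an assumption based on the order of the two lists.
HIERARCHY: List[Tuple[str, str]] = [
    ("first/second person pronoun", "conversational participant"),
    ("reflexive", "c-commanded"),
    ("3rd person pronoun", "high activation"),
    ("demonstrative pronoun", "indicated"),
    ("anaphoric/bridging definite or short name", "activation"),
    ("full demonstrative NP", "visible"),
    ("full description or full name", "uniqueness"),
]

# Indefinites sit at the bottom and need no licensing feature in this sketch.
FALLBACK = "indefinite"


def choose_np_type(features: Set[str]) -> str:
    """Return the highest NP type whose licensing feature holds of the referent."""
    for np_type, feature in HIERARCHY:
        if feature in features:
            return np_type
    return FALLBACK
```

A referent that is both highly activated and unique would come out as a 3rd person pronoun, since the higher type wins whenever its condition is met.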
[10] This would be the common denominator of bridging and anaphoric uses of short definites.The high activation of the antecedent for bridging spreads to the referent of the bridging NP, earlier mention activated the antecedent of the anaphoric definite NP.
Most of these marking items can be thought of as presupposition triggers and the features they express as the associated presupposition, but they lack many of the properties of normal triggers like bindability and accommodation (see Beaver and Zeevat (2006) for more discussion).
Proper presupposition triggers can be distinguished from marking devices and particles because there is no associated expressive constraint. This restricts this class to lexical triggers, i.e. to predicates that can be applied to an arrangement of objects only if that arrangement already meets some criterion, the presupposition. This leaves us with factives, quantifiers, start and finish and a class of nouns and verbs (bachelor, buy) [11]. Lexical presupposition seems important for disambiguation. Derived uses of words like strong or run are safe because conflicts between presuppositions force the assumption of different readings. In (13), one cannot assume a meaning with much muscular power or jumping locomotion, because that presupposes animacy (of the subject).
(13) strong (of drinks): with much alcohol
run (of engines): functioning
The emergence of lexical presupposition cannot be just due to the needs of disambiguation, however. The simplest way to think of how lexical presupposition arises is by general considerations about classification. Once one has a class of objects or situations, the members of that class can be classified further. Membership of the class is a lexical presupposition of the words that are used to divide the class into subclasses.
It is a consequence of being a lexical presupposition that the speaker must take the presupposition to hold in the local context of its use (see Zeevat (1992) for a fuller discussion). If it is not common ground that the presupposition holds in the local context, there are open and activated questions "whether p?" for any context C of the trigger (searching for p in a context C is conceptually hard to distinguish from raising the question "whether p?" in that context). To be more precise, positive contexts determine an operator X from which the natural question "whether X p?" can be formed. E.g. the context representing what is attributed as a propositional attitude to John corresponds to the operator "John believes that" and the question "whether John believes that p?". The context C corresponding to the consequent of a conditional corresponds to the operator "if the condition is true" and the question "whether if the condition is true, then p?". The context corresponding to the scope of a proper quantifier corresponds to the operator "if an arbitrary x satisfies the restrictor" and the question "whether p(x) if x satisfies the restrictor?". Non-positive contexts (monotone decreasing contexts) do not have associated operators. For any of these questions, the speaker has suggested that they have a positive answer by using the trigger in them. By  the speaker should be taken as settling these questions positively unless this leads to violations of .
[11] This leaves a number of triggers in an unclear position. E.g. clefts and pseudo-clefts do not clearly have the properties of lexical triggers, yet also seem to lack expressive constraints. I will defer discussion to another occasion.
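The way a trigger evokes one question per positive context can be illustrated with a small sketch. It is a simplification under stated assumptions: contexts are represented as plain dictionaries, every attitude context is mapped to a "believes that" operator, and the names `operator_for` and `evoked_questions` are hypothetical. Non-positive contexts yield no operator and therefore no question.

```python
from typing import Dict, List, Optional


def operator_for(context: Dict[str, str]) -> Optional[str]:
    """Return the operator X for a positive context, or None for other contexts."""
    kind = context["kind"]
    if kind == "global":
        return ""  # the outermost context contributes no operator
    if kind == "attitude":
        return f"{context['holder']} believes that"
    if kind == "consequent":
        return "if the condition is true, then"
    if kind == "quantifier scope":
        return "if an arbitrary x satisfies the restrictor, then"
    return None  # non-positive contexts have no associated operator


def evoked_questions(p: str, contexts: List[Dict[str, str]]) -> List[str]:
    """Form the question "whether X p?" for every positive context of the trigger."""
    questions = []
    for context in contexts:
        operator = operator_for(context)
        if operator is None:
            continue
        body = f"{operator} {p}".strip()
        questions.append(f"whether {body}?")
    return questions
```

For a trigger inside an attitude report about Mary, the sketch produces one question for the outermost context and one for the belief context.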
Zeevat (ms) applies a formal version of these ideas to a range of outstanding problems of presupposition projection. Here I just mention three problems that seem to go away at once. First, Heim has always insisted on a proper rational motivation for the preference for global accommodation, while admitting that there was a strong empirical case for assuming it. The  constraint is such a motivation, since it can be rationally motivated and it gives a preference for global accommodation. The question with respect to the outermost context is always raised, if resolution is impossible, and positively settled unless there is an inconsistency with the outermost context.
Second, an old puzzle first noted by Karttunen (1973) is illustrated in (14).
(14) Mary hopes her admirer will visit her tonight.
Karttunen notes that (by projection tests) two presuppositions are projected by "her admirer", the ones in (15).
Mary believes she has an admirer.
The present account immediately predicts this (the question associated with the context of Mary's hopes is: Does Mary believe that she has an admirer?), while the utterance has a marked reading with only local accommodation. Heim's and Van der Sandt's accounts have no explanation. Third, the present account gives the correct explanation of partial accommodation, where an incomplete antecedent of the presupposition is enriched to a fully matching one as in (16).
(16) John was ill last week and now he is glad he took care of his fever.
* will prefer adding fewer new discourse referents over adding more, will therefore prefer partial resolution over full accommodation, and will thereby implicate that John was ill and had a fever last week. The main objection to an account of this kind, accommodation in non-positive contexts, is discussed in Zeevat (ms).
It came as a surprise that presupposition theory would be relevant to rhetorical structure. To the degree that the following points are convincing, they form the strongest argument for the view on pragmatics presented in this paper: the application to implicature brings nothing new (except the programme of giving a generative account of non-literal use of language), and the application to presupposition is important, but since presupposition was the starting point of the account, it is unsurprising that it should be applicable.
The application to rhetorical structure however came for free and a proper foundation in pragmatics was much needed. The current account does not lead far away from the extant accounts such as SDRT, Abduction [12], Discourse Grammar and Rhetorical Structure Theory. From the perspective of the many views on rhetorical structure that there are, it is surprising that there is a purely pragmatic foundation for a number of central phenomena. The phenomena themselves are well-known and have been described by additional axioms or mechanisms.

Defaults on rhetorical relations and the Right Frontier Constraint can be underpinned by *
There are two versions of developing this idea. The first starts from the assumption that topics are just another discourse referent, which is then subject to *. This version will be adopted here [13]. The topic should be old, otherwise a part of the given topic and otherwise bridge to it. Being totally new is the worst case. There should also be the assumption that there is a preference for higher activation when selecting the old topic, the topic of which the current topic is part or to which it bridges. All of this should be part of *. Now adopt the following two common assumptions: (a) Lists, Contrastive Pairs and Narrations have a discourse topic that is different from the topics of the clauses that make them up, and (b) clauses have their own topic. Topics should here be taken as the characterisation of the point of the clause in the larger discourse, in terms of a question. E.g. the second clauses in (17-h) would be dealing with the indicated topics.
How do you know that John fell? (Justification)
[12] This is perhaps the closest relative to the account in this paper. Abduction is explanation and it is Hobbs' view that pragmatics is explanation. From the perspective of this paper, abduction is a combination of  and , but lacks * and . This section presents the case for the importance of these principles. Abduction could be a promising way of implementing the proposal in this paper, if one finds the right way of integrating the two additional principles.
[13] The other starts from the main temporal discourse referent of the clauses and applies * to those. The version with topics needs an argument to show that topics are discourse referents in order to motivate the application of *. The version with temporalities is slightly less straightforward and used in Zeevat (t.a.).
The last clause always gives the most activated topic. But when it cannot be the pivot (no plausible identity, part of, bridging, overarching topic or complement relation applies), the most activated topic takes its place. That the defaults between discourse relations are in fact as indicated can be seen by considering ambiguous pairs. Cf. Jasinskaja (2007). Concessions are almost obligatorily marked, contrast less so, while restatements and elaboration almost never are, and lists and narrations have a middle position (see Taboada (2006) for a systematic investigation). Non-transparent markers like "and", "but" and "though" occur at the bottom end, which points to a much greater need for these markers and has presumably caused their grammaticalisation. If one finds markers higher up, they tend to be transparent, i.e. lexical. The exceptions are some markers for Explanation (e.g. English "for", Dutch "immers" and "want" and German "ja" and "denn").
If one assumes that Lists, Narrations, Contrasts and Concessions deactivate the topic of their left daughters, while Elaborations, Backgrounds, and Justifications merely lower the activation level of the topic of the pivot, the Right Frontier Constraint also follows from *, which moreover motivates a default for the closest element on the right frontier.
Both assumptions are very reasonable: the top three levels can be seen as the epistemic and causal grounding of the pivot and the pivot topic should be kept active while this happens. Moving to the next element in a List, Narration or Contrast is a sign that the topic of the last element (the pivot) has been dealt with to the satisfaction of the speaker and can be deactivated.
Optional marking can be seen as an interplay between , * and 
Marking of rhetorical relations (especially in the higher levels of (19)) is optional. The semantic/pragmatic effect of the marked and the unmarked clause seems the same in a context that prefers the relation marked by the marker. This sort of situation is difficult to capture in an account of the relation between meaning and form. In Montague grammar, one would have to construct multiple syntactic analyses (with hidden operators?) to account for the unmarked clause. In the more modern design of DRT and SDRT, one needs a separate process of discourse resolution that adds the extra operator after syntactic and semantic processing. This improves on Montague grammar by avoiding syntactically unmotivated ambiguities and by solving the problem of selecting the right reading in the ambiguous situation, but it offers no account of the decision problem for the speaker: when should the marker be added, and when is it superfluous?
The following example, due to Jason Mattausch, illustrates this problem. The markers "then" and "because" are necessary because if they were omitted, the normal interpretation would be a different one (Explanation rather than Narration or the other way round). If the marker is inserted when the markerless version gives the same result, the markers seem redundant and, while not grammatically incorrect, it is definitely not good style to use them in that case.  should be able to disregard a feature Explanation or Narration in the input, but also be able to insert "because" or "then" to express it, i.e. the version with and without the marker should be equally optimal. That is the grammatical side, but for an account of speaking, the form should be monitored by pragmatics. This means that the speaker (while planning, speaking or even afterwards) should try to interpret what she is producing. If the interpretation does not match what she is trying to say (the input to the selection of the optimal form), she should see if there is an equally optimal form which would lead to the intended interpretation.
Early detection would lead to an improved form, later detection to self-correction.
In this way, the markerless optimal form can lose out to another optimal form that marks for the distinction. The selection of the form should be simultaneous with the interpretation system selecting the best interpretation of the form in the context.
Monitoring should be conceived as another optimisation problem, one that checks a set of ranked features [14] of the input for their intended values in the interpretation preferred by the system  > * > . Together with an economy constraint this gives a notion of pragmatic production optimality:  >  >  which would be able to handle optional marking and expressive constraints.
This comes down to saying that rhetorical relation is not grammatically marked (unlike number, definiteness, tense and aspect in English) but still has to be expressed in the sense that wrong preferred interpretations need to be marked against.The adequate characterisation of the process requires full-scale interpretation and has been one of the arguments for bidirectional optimality theory.Many proposals for bidirectional OT have been found wanting in Beaver and Lee (2003).Assuming interpretational monitoring is however enough and there is good psychological evidence that monitoring indeed takes place.
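The monitoring idea can be made concrete with a toy sketch: the speaker tries the most economical form first, interprets it as a hearer would, and only adds a marker if the preferred interpretation differs from the intended relation. Everything here is an assumption made for illustration. The table `DEFAULT_RELATION` stands in for the full interpretation system, and `MARKERS`, `interpret` and `produce` are hypothetical names.

```python
from typing import Dict, Optional, Tuple

# Toy stand-in for the interpretation system: the relation a hearer would
# prefer for an unmarked clause pair (assumed values for the running example).
DEFAULT_RELATION: Dict[Tuple[str, str], str] = {
    ("John fell.", "Bill pushed him."): "Explanation",
    ("John fell.", "Mary smiled at him."): "Narration",
}

# Markers that force a relation (assumed, after the "then"/"because" example).
MARKERS: Dict[str, str] = {"Explanation": "Because", "Narration": "Then"}


def interpret(first: str, second: str, marker: Optional[str] = None) -> str:
    """Return the relation the hearer would assign to the clause pair."""
    for relation, word in MARKERS.items():
        if marker == word:
            return relation
    return DEFAULT_RELATION[(first, second)]


def produce(first: str, second: str, intended: str) -> str:
    """Pick the most economical form whose interpretation matches the intention."""
    # Candidates are ordered by economy: the unmarked form comes first.
    for marker in (None, MARKERS[intended]):
        if interpret(first, second, marker) == intended:
            if marker is None:
                return f"{first} {second}"
            return f"{first} {marker} {second}"
    raise ValueError("no candidate form expresses the intended relation")
```

With these toy defaults, an intended Explanation for Bill's pushing stays unmarked, while an intended Narration for the same clause pair comes out with "Then".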
The reason for optional marking of rhetorical relations lies in the other constraints.In the example, it is  that prefers to let Mary smile after John's falling, overriding *.In the case of Bill's pushing, both  and * would prefer Explanation. should be unimportant, since the three stronger constraints suffice for a decision.The need to mark explains why lexical items have been recruited for marking rhetorical relations.The defaults generated by * explain why grammaticalisation is stronger at the lower end of the hierarchy.
Finally,  captures the effect that the strongest rhetorical relation compatible with the utterance in the context is always the one that is preferred by interpreters. This can strengthen a List to Narration and a Narration to Result, a Background to an Explanation, an Elaboration to a Restatement and a Restatement to a Conclusion. It is natural to assume that reporting a new event activates questions like "what then?", "what resulted?", "why?", "so?", etc. When these questions are not answered yet and the new statement can be taken as addressing them, the interpreter must assume that the statement settles them. This brief section tried to show that the interpretational constraint system derived from presupposition theory can explain the defaults in discourse interpretation and can in fact refine the picture. It would however be wrong to see it as a revolution: most of the insights in the area are preserved.
[14] In case of several features competing for a means of expression the most important feature should win. The famous case is competition for priority in word order between subjects and topics. If there is a conflict which would leave the subject unmarked, the subject wins this competition. For more discussion, see Zeevat (2006).
[8]
The pragmatic constraint system used in this paper comes from studies of the interpretational effect of presupposition triggers. If one tries to understand the particular assumptions in presupposition theory as instances of plausible general interpretational principles, as is done in this paper, and makes the further assumption that this is all there is in interpretation, one obtains a mono-directional pragmatic system like Relevance Theory or Hobbs's abduction framework. There is a divergence here with attempts to do pragmatics in the other direction, as Grice does, or the intriguing intermediate proposals by Horn (1984), Levinson (2000) and Blutner (2000) to develop a bidirectional pragmatics.
Grice's insight that meaning_NN crucially involves the intention that the intention behind the utterance is recognised seems to lead directly to the attempt to better understand the intentions that speakers have when they produce something with meaning_NN: the intention is to be cooperative given the goals of the conversation. But to stop there misses the point that meaning_NN is also quite continuous with natural meaning. It is at least a partial explanation of John's saying "I am hungry" that John is hungry. And this partial explanation is enough for reaching the further effects that John is hoping to achieve with his utterance. Or for trying the same trick oneself in order to achieve the same effects.
Cooperativity or relevance could well be something that emerges after an earlier stage in which communicative acts are just interpreted as natural events, with the emergence of cooperativity mainly due to engaging in such acts oneself.This makes the interpretation system a specialisation of the explanation system for natural events to communicative acts.The development of complex natural languages is conditioned by this interpretation system and the continuous monitoring that the system makes possible.
Put rather crudely, explanation can be refined to include relevance or cooperativity.But cooperativity cannot be refined to include explanation.It follows that there is a limit to how much a cooperativity based pragmatics can achieve.
The main case for the system proposed in this paper is however the breadth of its scope without losing applicability to detailed problems. The aim of this paper was to make this point in a sketchy and programmatic way. It surprised me that the "general nonsense" of presupposition theory can be used to do Gricean pragmatics and to explain rhetorical structure. But this is exactly what general pragmatics should be: insights on politeness marking should be applicable to the marking of discourse relations, obscure insights in pronoun resolution to conversational implicatures, and so on.
The idea that optimality theoretic pragmatics can be done as a generalisation of presupposition theory dates from January 2000 and appears in Zeevat (2001). Helen de Hoop and David Beaver managed to throw me into despair however by accurate questions about the relation of pronouns and *. In retrospect, I can only be grateful to these excellent colleagues. I owe a debt to Masha Averintseva who asked me to teach a course in Berlin about rhetorical structure, during which I discovered the relevance of the system for rhetorical structure. A first full version of this paper was presented at the Ustaoset Sprik workshop and I am very grateful to the organisers of that stimulating experience. I owe important feedback to audiences in Amsterdam, Oslo, Austin and Baltimore. Special thanks go to Katrin Erk who is responsible for a reformulation of the  constraint. Many thanks also to Katja Jasinskaja who managed to shake up my thinking about rhetorical relations in a profound way.

       
This appendix tries to give a DRT-based approximation to the system of four constraints introduced in the paper. My aim is to show that an approximation to an implementation is quite feasible, though much more work is needed in three areas: estimating plausibility that is not consistency, dealing with incorrect utterances and dealing with non-literal speech.
Question evocation for presupposition triggers can be handled; a full treatment of other cases needs further study.
I: an utterance U and an old DRS K 0 C: the set of all legal DRSs K. Evaluation of candidate interpretations K of U in K 0 : : The hearer, imagining to be the speaker, should be able to use U to effect the change from K 0 to K.   [213] algorithm must obey the following question regime: All questions activated in K 0 and all questions evoked by U of which K does not entail that they have no true answer are in K in one of the following three ways: 1 as an activated question of K 2 if K but not K 0 entails that the question has a positive answer, the unification of the positive answer with the exhaustive answer to the question is in K 3 if the question is evoked by a presupposition trigger, its positive answer is in K : There is no more plausible alternative DRS given K 0 .
Implementation: prove the consistency of K. Corrections lose to extensions of K_0. Smaller corrections win over larger corrections. Approximation: use a sort-based plausibility estimate over all atomic predications in the new K \ (K_0 ∩ K) and multiply.
Winners are the maximally plausible extensions allowing for some uncertainty.
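The multiplicative estimate can be sketched as follows, treating DRSs as plain sets of atomic predications. This is a rough illustration under several assumptions: the consistency proof and the treatment of corrections are left out, the per-predication estimates come from a hypothetical lookup function rather than a real sort-based model, and "allowing for some uncertainty" is modelled by an invented `tolerance` parameter.

```python
import math
from typing import Callable, Dict, List, Set


def plausibility(k0: Set[str], k: Set[str], estimate: Callable[[str], float]) -> float:
    """Multiply the estimates of all atomic predications in K \\ (K_0 ∩ K)."""
    new_predications = k - (k0 & k)
    return math.prod(estimate(p) for p in new_predications)


def most_plausible(
    k0: Set[str],
    candidates: Dict[str, Set[str]],
    estimate: Callable[[str], float],
    tolerance: float = 0.1,  # assumed way of "allowing for some uncertainty"
) -> List[str]:
    """Return the names of candidates whose score is close to the best score."""
    scores = {name: plausibility(k0, k, estimate) for name, k in candidates.items()}
    best = max(scores.values())
    return [name for name, score in scores.items() if score >= best * (1 - tolerance)]
```

A candidate that adds nothing new receives the empty product, 1.0, so it is never beaten on plausibility alone in this sketch.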
*: There is no alternative interpretation in which a discourse referent is more activated or, if not activated, more connected than in the candidate.

Implementation:
Count the number of new DRs. Subtract 2/3 for each new DR that is part of another DR, 1/3 for each new DR that is bridging to another DR. Candidates with lowest scores go through.
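The counting scheme can be written down directly. In this sketch each new discourse referent of a candidate is labelled with a status, and the labels "new", "part" and "bridging" as well as the function names are hypothetical. Exact fractions are used so that ties between candidates are not disturbed by rounding.

```python
from fractions import Fraction
from typing import Dict, List


def new_referent_score(new_drs: Dict[str, str]) -> Fraction:
    """Count new DRs, discounting 2/3 for parts and 1/3 for bridging DRs."""
    score = Fraction(0)
    for status in new_drs.values():
        score += 1
        if status == "part":
            score -= Fraction(2, 3)
        elif status == "bridging":
            score -= Fraction(1, 3)
    return score


def lowest_scoring(candidates: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the names of the candidates with the lowest score."""
    scores = {name: new_referent_score(drs) for name, drs in candidates.items()}
    best = min(scores.values())
    return [name for name, score in scores.items() if score == best]
```

On this scoring, a candidate that introduces one bridging referent (score 2/3) beats one that introduces a completely new referent (score 1), and a candidate with no new referents beats both.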
: All activated questions that may be settled are settled.

Implementation:
There is no other candidate K with fewer activated questions.
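Taken together, the implementations above suggest a pipeline of successive filters. The sketch below assumes a strict ranking in the order in which the appendix lists the constraints, and it assumes that every candidate has already passed the first (speaker-oriented) check. The field names `plausibility`, `new_score` and `open_questions` are hypothetical, and the tolerance used for plausibility elsewhere is left out for brevity.

```python
from typing import Callable, Dict, List

Candidate = Dict[str, float]


def keep_best(candidates: List[Candidate], cost: Callable[[Candidate], float]) -> List[Candidate]:
    """Keep only the candidates with the minimal cost."""
    best = min(cost(c) for c in candidates)
    return [c for c in candidates if cost(c) == best]


def optimal_interpretations(candidates: List[Candidate]) -> List[Candidate]:
    """Filter by plausibility, then new-referent score, then open questions."""
    survivors = keep_best(candidates, lambda c: -c["plausibility"])
    survivors = keep_best(survivors, lambda c: c["new_score"])
    survivors = keep_best(survivors, lambda c: c["open_questions"])
    return survivors
```

Because each filter only sees the survivors of the previous one, a lower constraint can never rescue a candidate that a higher constraint has already ruled out.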
John came here, he would have seen the problem.
b. Because John came here, he has seen the problem.
c. If John had come here, he would have seen the problem.

fell. Bill pushed him. Why did John fall? (Explanation)
b. John fell. I saw it.

fell. Bill pushed him.
b. John fell. Then Bill pushed him.
c. John fell. Mary smiled at him.
d. John fell. Because Mary smiled at him.

(21) a. John fell. Because Bill pushed him.
b. John fell. Then Mary smiled at him.

Approximation: an adaptation of the DRS development algorithm geared to give all possible developments including corrections.

It is in virtue of those topics that one can assign rhetorical relations to the second clause, indicated between the brackets after the question. If the pivot, as in the example, is the last clause, * gives the following preferences: