Possibility Theory
Didier Dubois and Henri Prade (2007), Scholarpedia, 2(10):2074. | doi:10.4249/scholarpedia.2074 | revision #137677 [link to/cite this article] |
Possibility theory is an uncertainty theory devoted to the handling of incomplete information. As such, it complements probability theory. It differs from the latter by the use of a pair of dual set-functions (possibility and necessity measures) instead of only one. This feature makes it easier to capture partial ignorance. Besides, it is not additive and makes sense on ordinal structures. The name Theory of Possibility was coined in (Zadeh 1978), inspired by (Gaines and Kohout 1975). In Zadeh's view, possibility distributions were meant to provide a graded semantics to natural language statements. However, possibility and necessity measures can also be the basis of a full-fledged representation of partial belief that parallels probability (Dubois and Prade 1988). Then, it can be seen either as a coarse, non-numerical version of probability theory, or as a framework for reasoning with extreme probabilities, or yet as a simple approach to reasoning with imprecise probabilities (Dubois, Nguyen and Prade, 2000).
Contents |
Basic Notions
A possibility distribution is a mapping \(\pi\) from a set of states of affairs S to a totally ordered scale such as the unit interval \([0, 1]\ .\) The function \(\pi\) represents the knowledge of an agent (about the actual state of affairs) distinguishing what is plausible from what is less plausible, what is the normal course of things from what is not, what is surprising from what is expected. It represents a flexible restriction on what the actual state of affairs is, with the following conventions:
- \( \pi(s) = 0 \) means that state s is rejected as impossible;
- \( \pi(s) = 1\) means that state s is totally possible (= plausible or unsurprising).
If the state space is exhaustive, at least one of its elements should be the actual world, so that at least one state is totally possible (normalisation). Distinct values may simultaneously have a degree of possibility equal to 1.
Possibility theory is driven by the principle of minimal specificity. It states that any hypothesis not known to be impossible cannot be ruled out. A possibility distribution is said to be at least as specific as another one if and only if each state is at least as possible according to the latter as to the former (Yager 1983). Then, the most specific one is the most restrictive and informative.
In the possibilistic framework, extreme forms of partial knowledge can be captured, namely:
- Complete knowledge: for some state \(s_0, \pi(s_0) = 1 \) and \(\pi(s) = 0 \) for other states s (only \( s_0 \) is possible)
- Complete ignorance\[\pi(s) = 1,\forall s\in S \ ,\] (all states are totally possible).
Given a simple query of the form does an event A occur?, where A is a subset of states, or equivalently does the actual state lie in A, a response to the query can be obtained by computing degrees of possibility and necessity, respectively (if the possibility scale is \( [0, 1] \) ): \[ \Pi(A) = \sup_{s\in A} \pi(s); N(A) = \inf_{s\notin A} 1 - \pi(s). \]
The possibility degree \(\Pi(A) \) evaluates to what extent event A is consistent with the knowledge \(\pi \ ,\) while \(N(A) \) evaluates to what extent \(A \) is certainly implied by the knowledge. The possibility-necessity duality is expressed by \( N(A) = 1 - \Pi(A^c), \) where \( A^c \) is the complement of \(A. \) Generally, \(\Pi(S) = N(S) = 1 \) and \(\Pi(\emptyset ) = N(\emptyset ) = 0 \ .\) Possibility measures satisfy the basic maxitivity property: \[ \Pi(A \cup B) = \max(\Pi(A), \Pi(B) ). \]
Necessity measures satisfy an axiom dual to that of possibility measures, namely \( N(A \cap B) = \min(N(A), N(B)). \) On infinite spaces, these axioms must hold for infinite families of sets.
Human knowledge is often expressed in a declarative way using statements to which some belief qualification is attached. Certainty-qualified pieces of uncertain information of the form \(A \) is certain to degree \(\alpha \) can then be modelled by the constraint \(N(A) \geq \alpha. \) The least specific possibility distribution reflecting this information assign possibility 1 to states where \(A \) is true and \(1 - \alpha \) to states where A is false.
Apart from \(\Pi \ ,\) which represents the idea of potential possibility, another measure of guaranteed possibility can be defined (Dubois, Hajek and Prade, 2000) \[\Delta(A) = \inf_{s\in A} \pi(s).\] It estimates to what extent all states in \(A \) are actually possible according to evidence.
Notions of conditioning and independence were studied for possibility measures. Conditional possibility is defined similarly to probability theory using a Bayesian like equation of the form (Dubois and Prade, 1988): \[ \Pi(B\cap A) = \Pi(B\mid A) \star \Pi(A) \] However, in the ordinal setting the operation \(\star\) cannot be a product and is changed into the minimum. In the numerical setting, there are several ways to define conditioning, not all of which have this form (Walley, 1996). There are several variants of possibilistic independence (De Cooman, 1997; Dubois et al. 1997; De Campos and Huete, 1999). Generally, independence in ordinal possibility theory is neither symmetric, nor insensitive to negation. For non Boolean variables, independence between events is not equivalent to independence between variables. Joint possibility distributions on Cartesian products of domains can be represented by means of graphical structures similar to Bayesian networks for joint probabilities (see Borgelt et al. 2000; Benferhat Dubois Garcia and Prade 2002). Such graphical structures can be taken advantage of for evidence propagation (Ben Amor et al., 2003) or learning (Borgelt and Kruse, 2003).
Historical Background
Zadeh was not the first scientist to speak about formalising notions of possibility. The modalities possible and necessary have been used in philosophy at least since the Middle-Ages in Europe, based on Aristotle's works. More recently these concepts became the building blocks of modal logic that emerged at the beginning of the XXth century. In this approach, possibility and necessity are all-or-nothing notions, and handled at the syntactic level. Independently from Zadeh's view, the notion of possibility as opposed to probability was central in the works of one economist, Shackle, and is present in those of two philosophers, Cohen and Lewis:
- The English economist G. L. S. Shackle (1962) introduced a full-fledged approach to uncertainty and decision in the 1940-1970's based on degrees of potential surprise of events. They are degrees of impossibility, that is, degrees of necessity of the opposite events. The degree of surprise of an event made of a set of elementary hypotheses is the degree of surprise of its least surprising realisation. Potential surprise is understood as disbelief. The disbelief notion introduced later in (Spohn 1990) employs the same type of convention as potential surprise, using the set of ordinals as a disbelief scale. Shackle also introduces a notion of conditional possibility. A framework very similar to the one of Shackle was proposed by the philosopher L. J. Cohen who considered the problem of legal reasoning (Cohen, 1977).
- The philosopher David Lewis introduced a graded notion of possibility in the form of a weak order between possible worlds he calls comparative possibility (Lewis, 1973). He relates this concept of possibility to a notion of similarity between possible worlds and a most plausible one. Comparative possibility relations are instrumental in defining the truth conditions of counterfactual statements. Comparative possibility relations \( \geq_{\Pi} \) obey the key axiom, for all events A, B, C\[ A \geq_\Pi B \rightarrow C \cup A \geq_{\Pi} C \cup B.\] This axiom was later independently proposed in (Dubois 1986) in an attempt to derive a possibilistic counterpart to De Finetti and Savage comparative probabilities.
- Zadeh (1978) proposed an interpretation of membership functions of fuzzy sets as possibility distributions encoding flexible constraints induced by natural language statements. Zadeh articulated the relationship between possibility and probability, noticing that what is probable must preliminarily be possible. However, the view of possibility degrees developed in his paper refers to the idea of graded feasibility (degrees of ease, as in the example of how many eggs can Hans eat for his breakfast) rather than to the epistemic notion of plausibility laid bare by Shackle. Nevertheless, the key axiom of maxitivity for possibility measures is highlighted. Later on, Zadeh (1979) acknowledged the connection between possibility theory, belief functions and upper/lower probabilities, and proposed their extensions to fuzzy events and fuzzy information granules.
This state of fact lays bare two branches of possibility theory: the qualitative and the quantitative one (Dubois and Prade, 1998).
Qualitative possibility theory
Qualitative possibility relations can be partially specified by a set of constraints of the form \( A >_{\Pi} B \ ,\) where A and B are events that may take the form of logical formulae. If these constraints are consistent, this relation can be completed through the principle of minimal specificity. A plausibility ordering on S can be obtained by assigning to each state of affairs its highest possibility level in agreement with the constraints (see Benferhat et al.1997, 1998). A general discussion on the relation between possibility relations and partial orders on state of affairs is in (Halpern, 1997).
Qualitative possibility relations can be represented by (and only by) possibility measures ranging on any totally ordered set (especially a finite one, Dubois 1986). This absolute representation on an ordinal scale is slightly more expressive than the purely relational one. When the finite set S is large and generated by a propositional language, qualitative possibility distributions can be efficiently encoded in possibilistic logic (Dubois, Lang, Prade, 1994). A possibilistic logic base K is a set of pairs \( (\phi,\alpha )\ ,\) where \( \phi\) is a Boolean expression and \( \alpha \) takes on values on an ordinal scale. This pair encodes the constraint \( N(\phi ) \geq \alpha \) where \( N(\phi) \) is the degree of necessity of the set of models of \( \phi \ .\) Each prioritized formula \( (\phi,\alpha )\) expresses a necessity-qualified statement. It is interpreted as the least specific possibility distribution on interpretations where this statement holds. Thus a prioritized formula has a fuzzy set of models. The fuzzy intersection of the fuzzy sets of models of all prioritized formulas in K yields an associated plausibility ordering on S.
Syntactic deduction from a set of prioritized clauses is achieved by refutation using an extension of the standard resolution rule, whereby \( (\phi \vee \psi, \min(\alpha, \beta)) \) can be derived from \( (\phi \vee \xi, \alpha ) \) and \( (\psi \vee \neg \xi, \beta). \) This rule, which evaluates the validity of an inferred proposition by the validity of the weakest premiss, goes back to Theophrastus, a disciple of Aristotle. Possibilistic logic is an inconsistency-tolerant extension of propositional logic that provides a natural semantic setting for mechanizing non-monotonic reasoning (Benferhat, Dubois, Prade, 1998), with a computational complexity close to that of propositional logic.
The main idea behind qualitative possibility theory is that the state of the world is by default assumed to be as normal as possible, neglecting less normal states. In particular, the events accepted as true are those true in all the most plausible states, namely the ones with positive degrees of necessity These assumptions lead us to interpret the plausible inference of a proposition B from another A, under knowledge \( \pi \) as follows: B should be true in all the most plausible states were A is true. It also means that the agent considers B as an accepted belief in the context A. This kind of inference is nonmonotonic in the sense that in the presence of additional information C, B may fail to remain an accepted belief. This is similar to the fact that a conditional probability \( P(B\mid A\cap C) \) may be low even if \( P(B\mid A) \) is high. The properties of this consequence relation are now well-understood (Benferhat, Dubois and Prade, 1997).
Decision-theoretic foundations of qualitative possibility were established by Dubois et al. (2001) in the act-based setting of Savage. Two qualitative decision rules under uncertainty can be justified: a pessimistic one and an optimistic one, respectively generalizing Wald's maximin and maximax criteria under ignorance, refined by accounting for a plausibility ordering on the state space.
Quantitative possibility theory
The phrase "quantitative possibility" refers to the case when possibility degrees range in the unit interval. In that case, a precise articulation between possibility and probability theories is useful to provide an interpretation to possibility and necessity degrees. Several such interpretations can be consistently devised (see (Dubois, 2006) for a detailed survey):
- a degree of possibility can be viewed as an upper probability bound (Walley, 1996; De Cooman and Aeyels, 1999), Especially, probabilistic inequalities such as Chebychev one, can be interpreted as defining possibility distributions (Dubois et al., 2004).
- a possibility measure is also a special case of a plausibility function in the theory of evidence (Shafer, 1987). It is then equivalent to a consonant random set.
- a possibility distribution can be viewed as a likelihood function (Dubois, Moral and Prade, 1997).
- Confidence or dispersion intervals are often extracted from statistical information and are attached a confidence level like 0.95 per cent. Varying this confidence level yields a family of nested intervals that can be represented as a possibility distribution (Dubois et al. 2004).
- Following a very different approach, possibility theory can represent probability distributions with extreme, infinitesimal values (Spohn, 1990).
- The theory of large deviations in probability theory also handles set-functions that look like possibility measures (Nguyen and Bouchon-Meunier, 2003).
There are finally close connections between possibility theory and idempotent analysis (Kolokoltsov and Maslov, 1997).
Applications
Possibility theory has not been the main framework for engineering applications of fuzzy sets in the past. However, two directions where it proved useful can be highlighted:
- Interval analysis has been generalized in the setting of possibility theory. This is the calculus of fuzzy numbers; see (Dubois, Kerre, Mesiar and Prade, 2000) for a survey. It is then analogous to random variable calculations. Finding the potential of possibilistic representations in computing conservative bounds for probabilistic calculations is certainly a major challenge.
- Possibility theory also offers a framework for preference modeling in constraint-directed reasoning. Both prioritized and soft constraints can be captured by possibility distributions expressing degrees of feasibility rather than plausibility (Dubois, Fargier and Prade, 1996).
Other applications of possibility theory can be found in fields such as data analysis (Wolkenhauer, 1998; Tanaka and Guo, 1999), structural learning (Borgelt et al., 2000), database querying (Bosc and Prade, 1997), diagnosis (Cayrac et al., 1996), belief revision (Benferhat, Dubois, Prade and Williams, 2002), argumentation (Amgoud and Prade, 2004), case-based reasoning (Huellermeier, 2007).
Lastly, possibility theory is also being studied from the point of view of its relevance in cognitive psychology. Experimental results in cognitive psychology (Raufaste et al., 2003) suggest that there are situations where people reason about uncertainty using the rules or possibility theory, rather than with those of probability theory.
References
- L. Amgoud, H. Prade (2004) Reaching agreement through argumentation: A possibilistic approach. Proc. of the 9th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR'04)(Whistler, BC, Canada), AAAI Press, p. 175-182.
- N. Ben Amor, S. Benferhat, K. Mellouli (2003) Anytime propagation algorithm for min-based possibilistic graphs. Soft Comput. 8(2): 150-161. doi:10.1007/s00500-002-0255-x.
- S. Benferhat, D. Dubois and H. Prade (1997) Nonmonotonic reasoning, conditional objects and possibility theory. Artificial Intelligence, 92: 259-276. doi:10.1016/S0004-3702(97)00012-X.
- S. Benferhat, D. Dubois, and H. Prade (1998) Practical handling of exception-tainted rules and independence information in possibilistic logic. Applied Intelligence, 9: 101-127,1998.
- S. Benferhat, D. Dubois, L. Garcia, H. Prade (2002) On the transformation between possibilistic logic bases and possibilistic causal networks. Int. J. Approx. Reasoning 29(2): 135-173. doi:10.1016/S0888-613X(01)00061-5.
- S. Benferhat, D. Dubois, H. Prade and M.-A. Williams (2002) A practical approach to revising prioritized knowledge bases, Studia Logica,70:105-130. doi:10.1023/A:1014658309853.
- C. Borgelt, J. Gebhardt, R. Kruse (2000) Possibilistic Graphical Models. In: G. Della Riccia et al. Computational Intelligence in Data Mining, CISM Series, N°408, Springer, Berlin.
- C. Borgelt, R. Kruse (2003) Operations and evaluation measures for learning possibilistic graphical models. Artif. Intell. 148(1-2): 385-418 doi:10.1016/S0004-3702(03)00024-9.
- P. Bosc and H. Prade (1997) An introduction to the fuzzy set and possibility theory-based treatment of soft queries and uncertain of imprecise databases. In: P. Smets, A. Motro, editors Uncertainty Management in Information Systems, Dordrecht: Kluwer, 285-324. doi:10.1007/978-1-4615-6245-0_10.
- D. Cayrac, D. Dubois, M. Haziza, H. Prade (1996) Handling uncertainty with possibility theory and fuzzy sets in a satellite fault diagnosis application. IEEE Trans. on Fuzzy Systems, 4, 251-269. doi:10.1109/91.531769.
- L.M. De Campos and J.F. Huete (1999) Independence concepts in possibility theory, Fuzzy Sets and Systems, 103, Part 1: 127-152; Part II: 487-506.
- L.J. Cohen (1977) The Probable and the Provable. Oxford: Clarendon. doi:10.1093/acprof:oso/9780198244127.001.0001.
- G. De Cooman (1997) Possibility theory. Part I: Measure- and integral-theoretic groundwork; Part II: Conditional possibility; Part III: Possibilistic independence. Int. J. of General Syst., 25: 291-371.
- G. De Cooman and D. Aeyels (1999) Supremum-preserving upper probabilities. Information Sciences, 118; 173-212.
- D. Dubois (1986) Belief structures, possibility theory and decomposable measures on finite sets. Computers and AI, 5: 403-416.
- D. Dubois (2006) Possibility theory and statistical reasoning. Comput. Stat. and Data Anal. 51(1): 47-69. doi:10.1016/j.csda.2006.04.015.
- D. Dubois, H. Fargier, H. Prade (1996) Possibility theory in constraint satisfaction problems: Handling priority, preference and uncertainty. Applied Intelligence, 6, 287-309. doi:10.1007/BF00132735.
- D. Dubois D., L. Fariñas del Cerro, A. Herzig and H. Prade (1997) Qualitative relevance and independence: A roadmap. Proc. of the 15h Inter. Joint Conf. on Artif. Intell., Nagoya, Japan, 62-67.
- D. Dubois, L. Foulloy, G. Mauris, H. Prade (2004) Probability-possibility transformations, triangular fuzzy sets, and probabilistic inequalities. Reliable Computing 10: 273-297. doi:10.1023/B:REOM.0000032115.22510.b5.
- D. Dubois, P. Hajek and H. Prade (2000) Knowledge-driven versus data-driven logics. J. Logic, Lang. and Inform., 9: 65-89. doi:10.1023/A:1008370109997.
- D. Dubois, E. Kerre, R. Mesiar, H. Prade (2000) Fuzzy interval analysis. In : D. Dubois, H. Prade (Eds.), Fundamentals of Fuzzy Sets. Kluwer, Boston, Mass, 483-581. doi:10.1007/978-1-4615-4429-6_11.
- D. Dubois, J. Lang and H. Prade (1994) Possibilistic logic. In D.M. Gabbay, et al., eds, Handbook of Logic in AI and Logic Programming, Vol. 3, Oxford University Press, 439-513.
- D. Dubois, S. Moral, H. Prade (1997) A semantics for possibility theory based on likelihoods. J. of Mathematical Analysis and Applications, 205, 359-380. doi:10.1006/jmaa.1997.5193.
- D. Dubois, H.T. Nguyen and H. Prade (2000) Fuzzy sets and probability: misunderstandings, bridges and gaps. In: D. Dubois and H. Prade, eds, Fundamentals of Fuzzy Sets. Boston, Mass: Kluwer, 343-438. doi:10.1007/978-1-4615-4429-6_8.
- D. Dubois and H. Prade (1988) Possibility Theory, New York: Plenum. doi:10.1007/978-1-4684-5287-7.
- D. Dubois and H. Prade (1998) Possibility theory: Qualitative and quantitative aspects. In D. M. Gabbay and P. Smets P., editors Handbook of Defeasible Reasoning and Uncertainty Management Systems, Vol. 1., Dordrecht: Kluwer Academic, 169-226. doi:10.1007/978-94-017-1735-9_6.
- D. Dubois, H. Prade and R. Sabbadin (2001) Decision-theoretic foundations of possibility theory. Eur. J. Operational Research, 128: 459-478. doi:10.1016/S0377-2217(99)00473-7.
- J. Y. Halpern (1997) Defining relative likelihood in partially-ordered structures. J. Artif. Intell. Res. 7: 1-24.
- B.R Gaines and L. Kohout (1975) Possible automata. Proc. Int. Symp. Multiple-Valued logics, Bloomington, IN, 183-196.
- E. Huellermeier (2007) Case-based Approximate Reasoning. Springer.
- V. N. Kolokoltsov, V. P. Maslov (1997) Idempotent analysis and applications, Kluwer Acad. Publisher. doi:10.1007/978-94-015-8901-7_1.
- D. L. Lewis (1973) Counterfactuals. Oxford: Basil Blackwell.
- H. T. Nguyen, B. Bouchon-Meunier (2003) Random sets and large deviations principle as a foundation for possibility measures. Soft Computing, 8:61-70. doi:10.1007/s00500-002-0258-7.
- E. Raufaste, R. Da Silva Neves, C. Mariné (2003) Testing the descriptive validity of possibility theory in human judgements of uncertainty. Artificial Intelligence, 148: 197-218. doi:10.1016/S0004-3702(03)00021-3.
- G. L.S. Shackle (1961) Decision, Order and Time in Human Affairs. 2nd edition, Cambridge University Press, UK.
- G. Shafer (1987) Belief functions and possibility measures. In J.C. Bezdek, ed., Analysis of Fuzzy Information Vol. I: Mathematics and Logic, Boca Raton, FL : CRC Press, 51-84.
- W. Spohn (1990) A general, nonprobabilistic theory of inductive reasoning. In R.D. Shachter, et al., editors, Uncertainty in Artificial Intelligence, Vol. 4. Amsterdam: North Holland. 149-158.
- H. Tanaka, P.J. Guo (1999) Possibilistic Data Analysis for Operations Research. Heidelberg: Physica-Verlag.
- P. Walley (1996) Measures of uncertainty in expert systems Artificial Intelligence, 83,1-58. doi:10.1016/0004-3702(95)00009-7.
- O. Wolkenhauer (1998) Possibility Theory with Applications to Data Analysis. Chichester: Research Studies Press.
- R.R. Yager (1983) An introduction to applications of possibility theory. Human Systems Management, 3: 246-269.
- L. A. Zadeh (1978) Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems, 1: 3-28. doi:10.1016/0165-0114(78)90029-5.
- L. A. Zadeh (1979) Fuzzy sets and information granularity. In M.M. Gupta et al., eds, Advances in Fuzzy Set Theory and Applications}, Amsterdam: North-Holland, 3-18.
Internal references
- Zhong-Lin Lu and Barbara Anne Dosher (2007) Cognitive psychology. Scholarpedia, 2(8):2769. doi:10.4249/scholarpedia.2769.
- Olaf Sporns (2007) Complexity. Scholarpedia, 2(10):1623. doi:10.4249/scholarpedia.1623.
- Milan Mares (2006) Fuzzy sets. Scholarpedia, 1(10):2031. doi:10.4249/scholarpedia.2031.
- Mark Aronoff (2007) Language. Scholarpedia, 2(5):3175. doi:10.4249/scholarpedia.3175.