The most incomprehensible thing about the world is that it is comprehensible. Albert Einstein
Bayesian Cognition: Probabilistic models of action, perception, inference, decision and learning
Julien Diard, CNRS - Laboratoire de Psychologie et NeuroCognition
Pierre Bessière, CNRS - Laboratoire de Physiologie de la Perception et de l’Action
To get more info http://diard.wordpress.com Julien.Diard@upmf-grenoble.fr Bayesian-Programming.org ftp://ftp-serv.inrialpes.fr/pub/emotion/bayesian-programming/Cours Pierre.Bessiere@college-de-france.fr
Course outline and schedule
Bessière c1 15/11: Incompleteness, uncertainty, Bayesian Program, Bayesian inference
Diard c2 29/11, c3 13/12, c4 03/01: Bayesian models in robotics and cognitive science
Diard c5 10/01: Model selection, machine learning, model distinguishability
Bessière c6 17/01: Complements: inference algorithms, maximum entropy
Perception test Daniel J. Simons & Christopher Chabris, Harvard University
http://nivea.psycho.univ-paris5.fr/demos/BONETO.MOV http://www.youtube.com/watch?v=ubNF9QNEQLA http://viscog.beckman.illinois.edu/flashmovie/12.php
Probability Theory as an alternative to Logic The actual science of logic is conversant at present only with things either certain, impossible, or entirely doubtful, none of which (fortunately) we have to reason on. Therefore the true logic for this world is the calculus of Probabilities, which takes account of the magnitude of the probability which is, or ought to be, in a reasonable man's mind. James Clerk Maxwell
Incompleteness and Uncertainty A very small cause which escapes our notice determines a considerable effect that we cannot fail to see, and then we say that the effect is due to chance. H. Poincaré
Shape from Image
Shape from Motion
Colas, F., Droulez, J., Wexler, M. & Bessière, P. (2008) Unified probabilistic model of perception of three-dimensional structure from optic flow; in Biological Cybernetics, in press
Colas, F. (2006) Perception des objets en mouvement : Composition bayésienne du flux optique et du mouvement de l’observateur, PhD thesis, INPG
Illusions: McGurk effect
Courtesy of Masso Arnt, Associate Professor, University of Oslo
Cathiard, M.-A., Schwartz, J.-L. & Abry, C. (2001). Asking a naive question to the McGurk effect: why does audio [b] give more [d] percepts with visual [g] than with visual [d]? In Proceedings of Auditory Visual Speech Processing, AVSP'2001, Aalborg, Copenhague, 138-142.
Credit card fraud detection
Beam-in-the-Bin experiment (Set-up)
Beam-in-the-Bin experiment (Results)
Beam-in-the-Bin experiment (Results)
Beam-in-the-Bin experiment (Results)
Logical Paradigm Incompleteness
Bayesian Paradigm: P(M S | D C), P(M | S D C)
Principle
Incompleteness -> Uncertainty -> Decision
Bayesian Learning (from Incompleteness to Uncertainty): Preliminary Knowledge + Experimental Data = Probabilistic Representation
Bayesian Inference (from Uncertainty to Decision)
Thesis
Probabilistic inference and learning theory, considered as a model of reasoning, is a new paradigm (an alternative to logic) to explain and understand perception, inference, decision, learning and action.
Probability theory is nothing but common sense reduced to calculation. Marquis Pierre-Simon de Laplace
The actual science of logic is conversant at present only with things either certain, impossible, or entirely doubtful, none of which (fortunately) we have to reason on. Therefore the true logic for this world is the calculus of Probabilities, which takes account of the magnitude of the probability which is, or ought to be, in a reasonable man's mind. James Clerk Maxwell
By inference we mean simply: deductive reasoning whenever enough information is at hand to permit it; inductive or probabilistic reasoning when, as is almost invariably the case in real problems, all the necessary information is not available. Thus the topic of "Probability as Logic" is the optimal processing of uncertain and incomplete knowledge. E.T. Jaynes
Subjectivist vs objectivist epistemology of probabilities?
A water treatment unit (1) Complete simulation Incomplete model Observe the consequences of this incompleteness 11 values, 0 the worst
A water treatment unit (2)
A water treatment unit (3)
A water treatment unit (4)
A water treatment unit (5)
Uncertainty on O due to inaccuracy on S
Uncertainty due to the hidden variable H
Not taking into account the effect of hidden variables may lead to a wrong decision (1)
C = 0, 1 or 2 leads to the optimal value O* = 6.
With H, the “reality” is somewhat more complex.
The adequate choice of C is more complex but also more informed.
Not taking into account the effect of hidden variables may lead to a wrong decision (2)
C = 0, 1 or 2 leads to the optimal value O* = 6.
With H, the “reality” is somewhat more complex.
The adequate choice of C is more complex but also more informed.
Basic Concepts Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise. John W. Tukey
Bayesian Spam Detection
Classify texts into 2 categories: “spam” or “nonspam”.
Only available information: a set of words.
Adapt to the user and learn from experience.
Variable
Probability
Normalization postulate
Conditional probability
Variable conjunction
Conjunction postulate
Syllogisms
Logical Syllogisms: Modus Ponens, Modus Tollens
Probabilistic Syllogisms: Modus Ponens, Modus Tollens
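The formulas of this slide did not survive extraction. As a hedged reconstruction in the style of Jaynes (not necessarily the exact content of the original slide), the logical syllogisms and their probabilistic counterparts can be written as follows, assuming the implication a ⇒ b is encoded as P(b | a ∧ p) = 1 and that every conditioning proposition has non-zero probability:

```latex
% Logical syllogisms
\text{Modus ponens: } \quad a \wedge (a \Rightarrow b) \;\vdash\; b
\qquad
\text{Modus tollens: } \quad \neg b \wedge (a \Rightarrow b) \;\vdash\; \neg a

% Probabilistic counterparts, assuming P(b \mid a \wedge p) = 1
\begin{align*}
P(b \mid a \wedge p) &= 1
  && \text{(modus ponens)} \\
P(\neg a \mid \neg b \wedge p) &= 1
  && \text{(modus tollens, since } P(a \wedge \neg b \mid p) = P(a \mid p)\,P(\neg b \mid a \wedge p) = 0\text{)} \\
P(a \mid b \wedge p) &= \frac{P(a \mid p)\,P(b \mid a \wedge p)}{P(b \mid p)}
  = \frac{P(a \mid p)}{P(b \mid p)} \;\ge\; P(a \mid p)
  && \text{(observing } b \text{ makes } a \text{ more plausible)} \\
P(b \mid \neg a \wedge p) &\le P(b \mid p)
  && \text{(observing } \neg a \text{ makes } b \text{ less plausible)}
\end{align*}
```

The last inequality follows from P(b | p) = P(a | p) + P(¬a | p) P(b | ¬a ∧ p), which gives P(b | p) − P(b | ¬a ∧ p) = P(a | p) (1 − P(b | ¬a ∧ p)) ≥ 0. The last two lines have no counterpart in classical logic.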
Marginalization rule
Joint distribution and questions (1)
Joint distribution and questions (2) 3^(N+1) - 2^(N+1)
Joint distribution and questions (3) 3^(N+1) - 2^(N+1)
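The count shown on these slides can be recovered by a short counting argument. This is a sketch under two assumptions that the slides do not restate: the spam model has N+1 variables (Spam plus N word variables), and a question must have at least one searched variable.

```latex
\begin{align*}
\#\{\text{ways to label each of the } N+1 \text{ variables as Searched, Known or Free}\} &= 3^{N+1} \\
\#\{\text{labelings with no Searched variable (Known or Free only)}\} &= 2^{N+1} \\
\#\{\text{questions}\} &= 3^{N+1} - 2^{N+1}
\end{align*}
```

For instance, with a single word variable W (N = 1) this gives 9 − 4 = 5 questions: P(Spam), P(W), P(Spam ∧ W), P(Spam | W) and P(W | Spam).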
Decomposition
Bayesian Network
Parametric Forms (1)
Parametric Forms (2)
Identification
Specification = Variables + Decomposition + Parametric Forms
Variables: the choice of the relevant variables for the problem
Decomposition: the expression of the joint probability distribution as a product of simpler distributions
Parametric Forms: the choice of the mathematical function for each of these distributions
Description = Specification + Identification
Questions (1)
Questions (2)
Questions (3)
Question (4)
Bayesian Program = Description + Question
Program
  Description
    Specification (Preliminary Knowledge p)
      Variables
      Decomposition
      Parametrical Forms or Recursive Question
    Identification (Experimental Data d)
  Question (Utilization)
Bayesian Program = Description + Question
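To make the spam example concrete, here is a minimal plain C++ sketch of such a program. It is not the authors' implementation: it assumes a naive decomposition P(Spam ∧ W_0 ∧ ... ∧ W_{N-1}) = P(Spam) ∏ P(W_n | Spam), binary word variables, and parameters identified by Laplace-smoothed counts. The names `SpamModel`, `learn` and `posterior_spam` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One text, encoded as word-presence flags: w[n] is true if word n appears.
using WordFlags = std::vector<bool>;

// Hypothetical naive Bayes spam model (illustrative sketch, not SpamSieve).
struct SpamModel {
    std::size_t num_words;
    std::size_t texts[2] = {0, 0};            // texts[s]: texts seen with Spam = s
    std::vector<std::size_t> word_counts[2];  // word_counts[s][n]: texts of class s containing word n

    explicit SpamModel(std::size_t n) : num_words(n) {
        word_counts[0].assign(n, 0);
        word_counts[1].assign(n, 0);
    }

    // Identification: update the counts from one labelled text.
    void learn(const WordFlags& w, bool is_spam) {
        assert(w.size() == num_words);
        ++texts[is_spam];
        for (std::size_t n = 0; n < num_words; ++n)
            if (w[n]) ++word_counts[is_spam][n];
    }

    // Parametric form for P(Spam = s): Laplace-smoothed frequency (assumption).
    double prior(bool s) const {
        return (1.0 + texts[s]) / (2.0 + texts[0] + texts[1]);
    }

    // Parametric form for P(W_n = present | Spam = s), also Laplace-smoothed.
    double word_prob(std::size_t n, bool present, bool s) const {
        const double p_present = (1.0 + word_counts[s][n]) / (2.0 + texts[s]);
        return present ? p_present : 1.0 - p_present;
    }

    // Question: P(Spam = true | w_0 ... w_{N-1}), by normalizing the joint.
    double posterior_spam(const WordFlags& w) const {
        assert(w.size() == num_words);
        double joint[2];
        for (int s = 0; s < 2; ++s) {
            joint[s] = prior(s != 0);
            for (std::size_t n = 0; n < num_words; ++n)
                joint[s] *= word_prob(n, w[n], s != 0);
        }
        return joint[1] / (joint[0] + joint[1]);
    }
};
```

A caller would construct `SpamModel m(N)`, call `m.learn(flags, label)` for each labelled text, then query `m.posterior_spam(flags)`. Before any learning, every text gets probability 0.5, which is what the smoothing is meant to provide.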
Results SpamSieve http://c-command.com/spamsieve/
Theoretical Basis
Content:
Definitions and notations
Inference rules
Bayesian program: Model specification, Model identification, Model utilization
Logical Propositions
Logical propositions are denoted by lowercase names: a, b, ...
Usual logical operators: a ∧ b (conjunction), a ∨ b (disjunction), ¬a (negation)
Probability of Logical Propositions
We assume that to assign a probability to a given proposition a, it is necessary to have at least some preliminary knowledge, summed up by a proposition p. The probability of a is therefore written P(a | p).
We will be interested in reasoning on the probabilities of the conjunctions, disjunctions and negations of propositions, denoted, respectively, by P(a ∧ b | p), P(a ∨ b | p) and P(¬a | p).
We will also be interested in the probability of proposition a conditioned by both the preliminary knowledge p and some other proposition b: P(a | b ∧ p).
Normalization and Conjunction Postulates
Bayes rule
Cox Theorem
Resolution Principle
Why not take the disjunction rule as an axiom?
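The formulas for this slide were lost in extraction. The following is a sketch of the standard statements, written with the preliminary knowledge p introduced above. It also suggests one answer to the question about the disjunction rule: it need not be an axiom because it can be derived from the two postulates.

```latex
\begin{align*}
\text{Normalization:} \quad & P(a \mid p) + P(\neg a \mid p) = 1 \\
\text{Conjunction:} \quad & P(a \wedge b \mid p) = P(a \mid p)\,P(b \mid a \wedge p)
                                             = P(b \mid p)\,P(a \mid b \wedge p) \\
\text{Bayes rule:} \quad & P(b \mid a \wedge p) = \frac{P(b \mid p)\,P(a \mid b \wedge p)}{P(a \mid p)}
  \qquad (P(a \mid p) \neq 0)
\end{align*}

% Disjunction rule derived from the two postulates
\begin{align*}
P(a \vee b \mid p)
  &= 1 - P(\neg a \wedge \neg b \mid p) \\
  &= 1 - P(\neg a \mid p)\,\bigl(1 - P(b \mid \neg a \wedge p)\bigr) \\
  &= P(a \mid p) + P(\neg a \wedge b \mid p) \\
  &= P(a \mid p) + P(b \mid p)\,\bigl(1 - P(a \mid b \wedge p)\bigr) \\
  &= P(a \mid p) + P(b \mid p) - P(a \wedge b \mid p)
\end{align*}
```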
Discrete Variable
Variables are denoted by names starting with an uppercase letter: X, Y, ...
By definition, a discrete variable X is a set of propositions x_i that are:
Mutually exclusive: x_i ∧ x_j is false for i ≠ j
Exhaustive: at least one x_i is true
The cardinal of X is denoted:
Variable Conjunction Not a variable
Conjunction rule
Normalization rule Proof
Marginalization rule Proof
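The proofs themselves are missing from the extracted slides. A sketch of how they usually go, assuming X and Y are discrete variables as defined above and writing the sum over X for the sum over all values of X:

```latex
% Normalization rule: the values x_i of X are exhaustive and mutually exclusive,
% so the disjunction rule applied repeatedly (with P(x_i \wedge x_j \mid p) = 0) gives
\begin{align*}
1 = P(x_1 \vee x_2 \vee \dots \mid p) = \sum_{i} P(x_i \mid p)
\quad\Longrightarrow\quad
\sum_{X} P(X \mid p) = 1
\end{align*}

% Marginalization rule: conjunction rule, then normalization conditioned on Y
\begin{align*}
\sum_{X} P(X \wedge Y \mid p)
  &= \sum_{X} P(Y \mid p)\,P(X \mid Y \wedge p) \\
  &= P(Y \mid p) \sum_{X} P(X \mid Y \wedge p) \\
  &= P(Y \mid p)
\end{align*}
```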
Contraction/Expansion rule
Rules
Description
The purpose of a description is to specify an effective method to compute a joint distribution on a set of variables {X1, X2, ..., XN}, given some preliminary knowledge p and a set of experimental data d.
This joint distribution is denoted as: P(X1 ∧ X2 ∧ ... ∧ XN | d ∧ p)
Decomposition
Partition in K subsets:
Conjunction rule:
Conditional independence:
Decomposition:
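The four formulas of this slide are missing. A plausible reconstruction, assuming the variables are partitioned into K subsets whose conjunctions are called L_1, ..., L_K, and that R_k denotes the conjunction of the variables of L_1, ..., L_{k-1} on which L_k actually depends (both names are assumptions, not taken from the extracted text):

```latex
\begin{align*}
\text{Partition:} \quad
  & X_1 \wedge \dots \wedge X_N = L_1 \wedge L_2 \wedge \dots \wedge L_K \\
\text{Conjunction rule:} \quad
  & P(L_1 \wedge \dots \wedge L_K \mid d \wedge p)
    = P(L_1 \mid d \wedge p)\,\prod_{k=2}^{K} P(L_k \mid L_{k-1} \wedge \dots \wedge L_1 \wedge d \wedge p) \\
\text{Conditional independence:} \quad
  & P(L_k \mid L_{k-1} \wedge \dots \wedge L_1 \wedge d \wedge p)
    = P(L_k \mid R_k \wedge d \wedge p) \\
\text{Decomposition:} \quad
  & P(X_1 \wedge \dots \wedge X_N \mid d \wedge p)
    = P(L_1 \mid d \wedge p)\,\prod_{k=2}^{K} P(L_k \mid R_k \wedge d \wedge p)
\end{align*}
```

The conjunction rule step is exact. Only the conditional independence step encodes modelling knowledge, and it is what makes each factor small enough to specify and learn.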
Parametrical Forms or Recursive Questions
Question
Given a description, a question is obtained by partitioning the set of variables into 3 subsets: the searched variables, the known variables and the free variables.
We define Search, Known and Free as the conjunctions of the variables belonging to these three sets.
We define the corresponding question as the distribution: P(Search | Known ∧ d ∧ p)
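Answering a question from the joint distribution only requires the marginalization and conjunction rules. The derivation below is a sketch, writing known for a particular value of Known and Z for the normalization constant (Z is a name introduced here):

```latex
\begin{align*}
P(\mathit{Search} \mid \mathit{known} \wedge d \wedge p)
  &= \sum_{\mathit{Free}} P(\mathit{Search} \wedge \mathit{Free} \mid \mathit{known} \wedge d \wedge p)
     && \text{(marginalization)} \\
  &= \frac{\sum_{\mathit{Free}} P(\mathit{Search} \wedge \mathit{Free} \wedge \mathit{known} \mid d \wedge p)}
          {P(\mathit{known} \mid d \wedge p)}
     && \text{(conjunction)} \\
  &= \frac{1}{Z} \sum_{\mathit{Free}} P(\mathit{Search} \wedge \mathit{Free} \wedge \mathit{known} \mid d \wedge p),
  \qquad
  Z = \sum_{\mathit{Search}} \sum_{\mathit{Free}} P(\mathit{Search} \wedge \mathit{Free} \wedge \mathit{known} \mid d \wedge p)
\end{align*}
```

The joint distribution inside the sum is then replaced by its decomposition, so the answer is computed from the parametric forms alone.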
Inference
2 optimisation problems
API and Inference Engine (ProBT®, ProBAYES.com, Bayesian-Programming.org)
Bayesian Program = Description (Specification: Variables, Decomposition, Parametric Forms; Identification: learning from instances) + Question
#include <iostream>
using namespace std;
int main () {
  //Variables
  plFloat read_time;
  plIntegerType id_type(0,1);
  plFloat times[5] = {1,2,3,5,10};
  plSparseType time_type(5,times);
  plSymbol id("id",id_type);
  plSymbol time("time",time_type);
  //Parametrical forms
  //Construction of P(id)
  plProbValue id_dist[2] = {0.75,0.25};
  plProbTable P_id(id,id_dist);
  //Construction of P(time | id = john)
  plProbValue t_john_dist[5] = {20,30,10,5,2};
  plProbTable P_t_john(time,t_john_dist);
  //Construction of P(time | id = bill)
  plProbValue t_bill_dist[5] = {2,6,10,40,20};
  plProbTable P_t_bill(time,t_bill_dist);
  //Construction of P(time | id)
  plKernelTable Pt_id(time,id);
  plValues t_and_id(time^id);
  t_and_id[id] = 0;
  Pt_id.push(P_t_john,t_and_id);
  t_and_id[id] = 1;
  Pt_id.push(P_t_bill,t_and_id);
  //Decomposition
  // P(time id) = P(id) P(time | id)
  plJointDistribution jd(time^id,P_id*Pt_id);
  //Question
  //Getting the question P(id | time)
  plCndKernel Pid_t;
  jd.ask(Pid_t,id,time);
  cout<<"P(id | time)= "<<Pid_t<<"\n";
  //Read a time from the keyboard
  cout<<"Time? : ";
  cin>>read_time;
  //Getting P(id | time = read_time)
  plKernel Pid_readTime;
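For readers without access to ProBT, the same question P(id | time) can be sketched in plain C++. This is an illustration, not the ProBT engine: it assumes the weight tables from the slide are normalized to sum to 1 before use (which the slide's values suggest but the extracted text does not state), and the function names `normalize` and `posterior_id_given_time` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kNumIds = 2;    // id = 0 (john), id = 1 (bill)
constexpr std::size_t kNumTimes = 5;  // time values {1, 2, 3, 5, 10}, addressed by index

// Normalize a table of non-negative weights so that it sums to 1.
template <std::size_t N>
std::array<double, N> normalize(std::array<double, N> weights) {
    double sum = 0.0;
    for (double v : weights) sum += v;
    for (double& v : weights) v /= sum;
    return weights;
}

// Returns {P(id = john | time), P(id = bill | time)} for the time value
// at position time_index in {1, 2, 3, 5, 10}.
std::array<double, kNumIds> posterior_id_given_time(std::size_t time_index) {
    assert(time_index < kNumTimes);

    // Parametric forms, with the numbers taken from the slide.
    const std::array<double, kNumIds> p_id = {0.75, 0.25};
    const std::array<double, kNumTimes> p_time_given_john =
        normalize(std::array<double, kNumTimes>{20.0, 30.0, 10.0, 5.0, 2.0});
    const std::array<double, kNumTimes> p_time_given_bill =
        normalize(std::array<double, kNumTimes>{2.0, 6.0, 10.0, 40.0, 20.0});

    // Decomposition: P(time id) = P(id) P(time | id).
    const double joint_john = p_id[0] * p_time_given_john[time_index];
    const double joint_bill = p_id[1] * p_time_given_bill[time_index];

    // Question: normalize the joint over id for the observed time.
    const double z = joint_john + joint_bill;
    return {joint_john / z, joint_bill / z};
}
```

Calling `posterior_id_given_time(0)` asks who was most likely at work when the observed time is 1, and `posterior_id_given_time(4)` does the same for time 10. Short times favour john and long times favour bill, despite john's higher prior.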