CAP 6635 Artificial Intelligence
Homework 4
Question 1 [2 pts]: Table 1 shows the probability values of different events. Use the table to calculate the following values and show your work (a computational sketch follows Table 1):
· The probability that a person has a cavity [0.25 pt]
· The probability of a toothache event [0.25 pt]
· The joint probability of cavity and toothache [0.25 pt]
· The conditional probability of no cavity, given that the patient has a toothache [0.25 pt]
· The conditional probability of no cavity, given that the patient does not have a toothache [0.25 pt]
· Determine whether cavity and toothache are independent or not, and explain why [0.25 pt]
· Given that a patient has a cavity, determine whether the tooth probe catch is conditionally independent of toothache or not, and explain why [0.25 pt]
· Given that a patient does not have a cavity, determine whether the tooth probe catch is conditionally independent of toothache or not, and explain why [0.25 pt]
Table 1
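Since Table 1 itself is not reproduced here, the sketch below uses hypothetical joint values (borrowed from the dentist example later in these notes) purely to show the mechanics; substitute Table 1's actual numbers.

```python
# Hypothetical joint table: joint[(cavity, toothache)] = P(Cavity, Toothache).
# These numbers are placeholders, not Table 1's values.
joint = {(True, True): 0.12, (True, False): 0.08,
         (False, True): 0.08, (False, False): 0.72}

p_cavity = sum(p for (c, t), p in joint.items() if c)       # marginal P(cavity)
p_toothache = sum(p for (c, t), p in joint.items() if t)    # marginal P(toothache)
p_nocavity_given_toothache = joint[(False, True)] / p_toothache
# Independence check: P(cavity, toothache) =? P(cavity) * P(toothache)
independent = abs(joint[(True, True)] - p_cavity * p_toothache) < 1e-12
print(p_cavity, p_toothache, p_nocavity_given_toothache, independent)
```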
Question 2 [2 pts]: A patient takes a lab test and the result comes back positive. Assume the test returns a correct positive result in only 95% of the cases in which the disease is actually present, and a correct negative result in only 95% of the cases in which the disease is not present. Assume further that 0.001 of the entire population has this cancer.
· Use Bayes' rule to derive the probability of any test result being positive [1 pt]
· Use Bayes' rule to derive the probability of the patient having the cancer given that his/her lab test is positive (list the major steps; a computational sketch follows). [1 pt]
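A minimal sketch of the required computation pattern, using the numbers stated in the question (the first quantity comes from the law of total probability, the second from Bayes' rule):

```python
p_disease = 0.001              # prevalence P(disease)
p_pos_given_d = 0.95           # sensitivity P(+ | disease)
p_neg_given_nd = 0.95          # specificity P(- | no disease)

# Total probability: P(+) = P(+|D)P(D) + P(+|~D)P(~D)
p_pos = p_pos_given_d * p_disease + (1 - p_neg_given_nd) * (1 - p_disease)
# Bayes' rule: P(D|+) = P(+|D)P(D) / P(+)
p_d_given_pos = p_pos_given_d * p_disease / p_pos
print(p_pos, p_d_given_pos)    # ~0.0509, ~0.0187
```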
Question 3 [2 pts]: Figure 1 shows a Bayesian network. Use the first letter to denote each named variable (e.g., T for Tampering), and answer the following questions:
· Show the joint probability of the whole Bayesian network [0.25 pt]
· How many probability values are needed (i.e., should be given) in order to calculate the joint probability of the whole network, and why [0.25 pt]
· Prove that “Alarm” and “Report” are conditionally independent, given “Leaving” [0.25 pt]
· Prove that “Alarm” and “Smoke” are conditionally independent, given “Fire” [0.25 pt]
· Assume x ⊥ y denotes that x is independent of y, and x ⊥ y | z denotes that x and y are conditionally independent, given z. Complete Table 2, and use ⊥ to mark correct answers. [1 pt]
Table 2
Figure 1
Question 4 [2 pts]: Figure 2 shows a Bayesian network where r denotes “rain”, s denotes “sprinkler”, and w denotes “wet lawn” (each variable takes binary values 1 or 0). The prior probabilities of rain and sprinkler, and the conditional probability values, are given as follows:
p(r = 1) = 0.05
p(s = 1) = 0.1
p(w = 1|r = 0, s = 0) = 0.001
p(w = 1|r = 0, s = 1) = 0.97
p(w = 1|r = 1, s = 0) = 0.90
p(w = 1|r = 1, s = 1) = 0.99
· Show the joint probability formula of the whole network, and calculate the joint probability value P(r=1, s=1, w=1). [0.25 pt]
· Calculate the overall probability that the lawn is wet, i.e., P(w=1). [0.25 pt]
· After observing that the lawn is wet, calculate the probability that the sprinkler was left off (i.e., s=0). [0.5 pt]
· After observing that the lawn is wet, calculate the probability that there was rain (i.e., r=1). (A sketch of the enumeration follows Figure 2.) [0.5 pt]
Figure 2
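A minimal enumeration sketch for this network, using the CPT values given above; the posterior queries follow the same pattern by conditioning on w=1:

```python
from itertools import product

p_r = {1: 0.05, 0: 0.95}
p_s = {1: 0.1, 0: 0.9}
p_w = {(0, 0): 0.001, (0, 1): 0.97, (1, 0): 0.90, (1, 1): 0.99}  # P(w=1 | r, s)

def joint(r, s, w):
    """P(r, s, w) = P(r) P(s) P(w | r, s) -- the network factorization."""
    pw = p_w[(r, s)] if w == 1 else 1 - p_w[(r, s)]
    return p_r[r] * p_s[s] * pw

p_w1 = sum(joint(r, s, 1) for r, s in product([0, 1], repeat=2))   # P(w=1)
p_r1_given_w1 = sum(joint(1, s, 1) for s in [0, 1]) / p_w1         # P(r=1 | w=1)
print(joint(1, 1, 1), p_w1, p_r1_given_w1)
```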
Question 5 [2 pts]: Figure 3 shows a Bayesian network similar to Figure 2, but w1 denotes your lawn and w2 denotes your neighbor's lawn (each variable takes binary values 1 or 0). In this case, rain will wet both your lawn and your neighbor's lawn, whereas your sprinkler will only wet your lawn. The prior probabilities and conditional probability values are given as follows:
p(r = 1) = 0.05
p(s = 1) = 0.1
p(w1 = 1|r = 0, s = 0) = 0.001
p(w1 = 1|r = 0, s = 1) = 0.97
p(w1 = 1|r = 1, s = 0) = 0.90
p(w1 = 1|r = 1, s = 1) = 0.99
p(w2 = 1|r = 1) = 0.90
p(w2 = 1|r = 0) = 0.1
· Show the joint probability formula of the whole network, and calculate the joint probability value P(r=1, s=1, w1=1, w2=1). [0.25 pt]
· Calculate the overall probability that both your lawn and your neighbor's lawn are wet, i.e., P(w1=1, w2=1). [0.25 pt]
· After observing that your lawn and your neighbor's lawn are both wet, calculate the probability that the sprinkler was left on (i.e., s=1). [0.5 pt]
· After observing that your lawn and your neighbor's lawn are both wet, calculate the probability that there was rain (i.e., r=1). (A sketch follows Figure 3.) [0.5 pt]
Figure 3
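The same enumeration extends with one extra factor for w2; the posteriors below illustrate "explaining away" (rain becomes the far more likely cause once both lawns are wet):

```python
from itertools import product

p_r = {1: 0.05, 0: 0.95}
p_s = {1: 0.1, 0: 0.9}
p_w1 = {(0, 0): 0.001, (0, 1): 0.97, (1, 0): 0.90, (1, 1): 0.99}  # P(w1=1 | r, s)
p_w2 = {1: 0.90, 0: 0.1}                                          # P(w2=1 | r)

def joint(r, s, w1, w2):
    """P(r, s, w1, w2) = P(r) P(s) P(w1 | r, s) P(w2 | r)."""
    a = p_w1[(r, s)] if w1 == 1 else 1 - p_w1[(r, s)]
    b = p_w2[r] if w2 == 1 else 1 - p_w2[r]
    return p_r[r] * p_s[s] * a * b

evidence = sum(joint(r, s, 1, 1) for r, s in product([0, 1], repeat=2))
p_r1 = sum(joint(1, s, 1, 1) for s in [0, 1]) / evidence   # P(r=1 | w1=1, w2=1)
p_s1 = sum(joint(r, 1, 1, 1) for r in [0, 1]) / evidence   # P(s=1 | w1=1, w2=1)
print(evidence, p_r1, p_s1)
```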
Question 6 [3 pts]: Given the following toy dataset with 15 instances:
· Please manually construct a Naïve Bayes classifier (list the major steps), including the values of the prior probabilities [1.0 pt] and the conditional probabilities [1.0 pt]. Please use the m-estimate to calculate the conditional probabilities (m = 1, and p equal to 1 divided by the number of attribute values for each attribute). A computational sketch follows the table below.
· Please use your Naïve Bayes classifier to determine whether a person should play tennis or not, under the conditions “Outlook=Overcast & Temperature=Hot & Humidity=Normal & Wind=Weak”. [1 pt]
ID  Outlook   Temperature  Humidity  Wind    Class
1   Sunny     Hot          High      Weak    No
2   Sunny     Hot          High      Strong  No
3   Overcast  Hot          High      Weak    Yes
4   Rain      Mild         High      Weak    Yes
5   Rain      Cool         Normal    Weak    Yes
6   Rain      Cool         Normal    Strong  No
7   Overcast  Cool         Normal    Strong  Yes
8   Sunny     Mild         High      Weak    No
9   Sunny     Cool         Normal    Weak    Yes
10  Rain      Mild         Normal    Weak    Yes
11  Sunny     Mild         Normal    Strong  Yes
12  Overcast  Mild         High      Strong  Yes
13  Overcast  Mild         Normal    Weak    No
14  Rain      Hot          High      Strong  Yes
15  Rain      Mild         High      Strong  No
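A sketch of the m-estimate Naïve Bayes computation the question describes (m = 1, p = 1/#values for each attribute), run over the table above:

```python
from collections import Counter

data = [  # (Outlook, Temperature, Humidity, Wind, Class), rows 1-15 above
    ("Sunny","Hot","High","Weak","No"), ("Sunny","Hot","High","Strong","No"),
    ("Overcast","Hot","High","Weak","Yes"), ("Rain","Mild","High","Weak","Yes"),
    ("Rain","Cool","Normal","Weak","Yes"), ("Rain","Cool","Normal","Strong","No"),
    ("Overcast","Cool","Normal","Strong","Yes"), ("Sunny","Mild","High","Weak","No"),
    ("Sunny","Cool","Normal","Weak","Yes"), ("Rain","Mild","Normal","Weak","Yes"),
    ("Sunny","Mild","Normal","Strong","Yes"), ("Overcast","Mild","High","Strong","Yes"),
    ("Overcast","Mild","Normal","Weak","No"), ("Rain","Hot","High","Strong","Yes"),
    ("Rain","Mild","High","Strong","No"),
]
m = 1
n_values = [3, 3, 2, 2]   # cardinality of Outlook, Temperature, Humidity, Wind
classes = Counter(row[-1] for row in data)

def score(x, c):
    """Prior times product of m-estimate conditionals: (n_c + m*p) / (n + m)."""
    s = classes[c] / len(data)
    for i, v in enumerate(x):
        n_c = sum(1 for row in data if row[i] == v and row[-1] == c)
        s *= (n_c + m * (1 / n_values[i])) / (classes[c] + m)
    return s

x = ("Overcast", "Hot", "Normal", "Weak")
print({c: score(x, c) for c in classes})   # pick the class with the larger score
```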
Uncertain Knowledge Reasoning and Learning
Uncertainty, Bayesian Network, and Naïve Bayes Classification
Chapters: 12, 13, 19
Outline
• Uncertainty & Probability
– Distribution
– Independence, conditional independence
• Bayes’ Rule
• Bayesian Network
– Joint distribution
– 3-Way Bayesian Network
– Bayesian Network Construction & Inference
• Naïve Bayes Classification
Uncertainty
• Suppose we are catching a flight at 9 AM at FLL airport. When should we leave Boca for FLL?
Leaving   Catch/Miss   P(on time)
8:00 AM   1/9          1/10 = 0.1
7:00 AM   6/4          6/10 = 0.6
6:30 AM   9/1          9/10 = 0.9
…
Which action to choose? It depends on my preferences for missing the flight vs. airport waiting time, etc.
The decision about when to leave is determined by two factors:
Probability theory: beliefs about events
Utility theory: representation of preferences
Decision theory = probability theory + utility theory
What is Probability
• Probability
– Calculus for dealing with nondeterminism and uncertainty
– The world is full of uncertainty
• Deterministic models are inadequate and ineffective
• Probabilistic model
– Determines how often we expect different things to occur
• Where do the numbers for probabilities come from?
– Frequentist view
• Numbers come from experiments
– Objectivist view
• Numbers are inherent properties of the universe
– Subjectivist view
• Numbers denote the agent’s beliefs
Random Variable
• A random variable x takes on a defined set of
values with different probabilities.
• For example, if you roll a die, the outcome is random (not fixed)
and there are 6 possible outcomes, each of which occurs with
probability one-sixth.
– X=1, 3, 2, 2, 1, 6, 5
• For example, if you poll people about their voting preferences,
the percentage of the sample that responds “Yes on Proposition
100” is a random variable (the percentage will be slightly
different every time you poll).
• Roughly, probability is how frequently we expect
different outcomes to occur if we repeat the
experiment over and over (“frequentist” view)
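A quick simulation of this frequentist reading for the die example (a minimal sketch; the exact frequencies will vary from run to run):

```python
import random

rolls = [random.randint(1, 6) for _ in range(100_000)]
# Relative frequency of each face approaches 1/6 as the number of rolls grows.
for face in range(1, 7):
    print(face, rolls.count(face) / len(rolls))
```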
Discrete or Continuous Random
Variables
• Discrete random variables have a
countable number of outcomes
– Examples: Dead/alive, treatment/placebo, dice,
counts, etc.
– Count frequency by histogram
• Continuous random variables have an
infinite continuum of possible values.
– Examples: blood pressure, weight, the speed
of a car, the real numbers from 1 to 6.
– Count frequency by density
Axioms of probability
• For any propositions A, B
1. 0 ≤ P(A) ≤ 1 and 0 ≤ P(B) ≤ 1
2. Probability of the whole space S is 1: P(S) = 1
3. P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
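A small check of axiom 3 (inclusion-exclusion) on die-roll events, assuming A = “even roll” and B = “roll ≤ 3”:

```python
from fractions import Fraction

omega = range(1, 7)                    # sample space of one fair die roll
A = {x for x in omega if x % 2 == 0}   # even rolls: {2, 4, 6}
B = {x for x in omega if x <= 3}       # {1, 2, 3}
P = lambda E: Fraction(len(E), 6)      # uniform probability

# P(A or B) = P(A) + P(B) - P(A and B): 5/6 == 1/2 + 1/2 - 1/6
assert P(A | B) == P(A) + P(B) - P(A & B)
```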
Distributions
• Probability distribution
– Probabilities of all possible values of the random variable.
• Random variable:
– A numerical value whose measured value can change from one replicate of the experiment to another
• Weather is one of <sunny, rain, cloudy, snow>
• P(Weather) = <0.72, 0.1, 0.08, 0.1>
– Normalized, i.e., sums to 1. Also note the bold font: bold normally denotes a vector
– If E1, E2, …, Ek are mutually exclusive
• P(X ∈ {E1, E2, …, Ek}) = P(X=E1) + P(X=E2) + … + P(X=Ek)
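A concrete check of normalization and the mutually exclusive sum rule, using the Weather distribution above:

```python
p_weather = {"sunny": 0.72, "rain": 0.1, "cloudy": 0.08, "snow": 0.1}
assert abs(sum(p_weather.values()) - 1.0) < 1e-12   # normalized: sums to 1

# Mutually exclusive outcomes: P(rain or snow) = P(rain) + P(snow)
print(p_weather["rain"] + p_weather["snow"])        # 0.2
```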
Priors
• Prior or unconditional probabilities of propositions
– Corresponding to belief prior to the arrival of any (new) evidence
– (Number of desired outcomes) / (total number of outcomes)
e.g., P(cavity) = P(Cavity=true) = 0.1
P(Weather=sunny) = 0.72
P(cavity ∧ Weather=sunny) = 0.072
Notation: P(A∧B), P(A and B), P(A, B), and P(AB) all denote the same thing.
Dilemma at the Dentist’s
• What is the probability of a cavity given a toothache?
• What is the probability of a cavity given the probe catches?
Joint Probability
The entire table sums to 1.
          Toothache   ¬Toothache
Cavity    0.12        0.08
¬Cavity   0.08        0.72

P(Cavity ∧ Toothache) = 0.12

Counts (observed over 100 individuals):

          Toothache   ¬Toothache
Cavity    12          8
¬Cavity   8           72

P(A, B) = P(B, A)
Law of Total Probability
• If B1, B2, B3, … is a partition of the sample space S, then for any event A we have
– P(A) = Σi P(A ∧ Bi) = Σi P(A | Bi) P(Bi)
– For event A = “cavity”, the partition space is {toothache, ¬toothache}
• P(Cavity) = P(Cavity ∧ toothache) + P(Cavity ∧ ¬toothache) = 0.12 + 0.08 = 0.2
– For event A = “toothache”, the partition space is {cavity, ¬cavity}
• P(toothache) = P(toothache ∧ cavity) + P(toothache ∧ ¬cavity) = 0.12 + 0.08 = 0.2

          Toothache   ¬Toothache
Cavity    0.12        0.08
¬Cavity   0.08        0.72

(Conditional probability P(A | Bi) will be discussed later.)
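A short sketch that rebuilds the table from the observed counts and checks both marginals via the law of total probability:

```python
counts = {("cavity", "toothache"): 12, ("cavity", "no toothache"): 8,
          ("no cavity", "toothache"): 8, ("no cavity", "no toothache"): 72}
n = sum(counts.values())                     # 100 observed individuals
joint = {k: v / n for k, v in counts.items()}

# Law of total probability: marginalize out the other variable.
p_cavity = joint[("cavity", "toothache")] + joint[("cavity", "no toothache")]
p_toothache = joint[("cavity", "toothache")] + joint[("no cavity", "toothache")]
print(p_cavity, p_toothache)                 # 0.2 0.2
```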
Inference by Enumeration
Conditional Probability
P(A ∧ B) = P(A | B) P(B) = P(B | A) P(A)
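Worked example from the dentist table above: P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache) = 0.12 / 0.2 = 0.6, and consistently P(cavity ∧ toothache) = P(cavity | toothache) P(toothache) = 0.6 × 0.2 = 0.12.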
Inference by Enumeration
Absolute Independence
• Absolute independence is powerful but rare
– For n independent biased coins, O(2^n) → O(n)
2×2×2×4 = 32 entries → 2×2×2 + 4 = 12 entries
Conditional independence
• If I have a cavity, the probability that the probe catches in it
doesn't depend on whether I have a toothache:
– P(catch | toothache, cavity) = P(catch | cavity)
• P(catch | cavity) = P(catch ∧ cavity) / P(cavity)
= (0.108 + 0.072) / (0.108 + 0.012 + 0.072 + 0.008) = 0.18 / 0.2 = 0.9
• P(catch | toothache,