IB MATH AA HL
Unit D2 · Solutions

Probability — Solutions

Companion to the IB-Style Practice Set

EASY · MEDIUM · HARD · Paper 1A · Paper 1B · Paper 2 · Paper 3 · HL ONLY

Syllabus 4.5 – 4.6 SL + AHL 4.10 Bayes · AA HL



v1.1 · companion to Unit_D2_Probability_Practice.html v1.1 · 12 Qs · 91 marks · mark-by-mark with M1 / A1 / R1 callouts

PART I  ·  PAPER 1 SECTION A — SOLUTIONS
No calculator · 22 marks

Section A — Worked Solutions

Q1 · EASY · Paper 1A · 4.5 Sample Spaces · [5 marks]

Two fair dice. (a) $P(\text{sum}=7)$. (b) $P(\text{sum even})$. (c) $P(\text{at least one }6)$.

Answers:  (a) $\dfrac{1}{6}$  ·  (b) $\dfrac{1}{2}$  ·  (c) $\dfrac{11}{36}$

(a) Sum equals $7$ M1·A1

Pairs giving sum $7$: $(1,6),(2,5),(3,4),(4,3),(5,2),(6,1)$ — exactly $6$ outcomes out of $36$. $P = 6/36 = \boxed{\tfrac{1}{6}}.$

(b) Sum is even A1

Sum is even iff both dice match parity. Both odd: $3 \times 3 = 9$ pairs. Both even: $3 \times 3 = 9$ pairs. Total $18/36 = \boxed{\tfrac{1}{2}}$.

(c) At least one $6$ — complement trick M1·A1

$P(\text{no }6 \text{ on either die}) = (5/6)^2 = 25/36$. So $$ P(\text{at least one }6) \;=\; 1 - \tfrac{25}{36} \;=\; \boxed{\tfrac{11}{36}}. $$
Complement first. "At least one" almost always opens with the complement rule. Counting the at-least-one outcomes directly is error-prone (double-counting); $1 - P(\text{none})$ is bulletproof.
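
All three answers can be checked by brute force, since the sample space has only $36$ equally likely outcomes. The sketch below is illustrative only; the helper name `prob` is an assumption and not part of the practice set.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes for two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on an (a, b) outcome."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

p_sum_7 = prob(lambda o: o[0] + o[1] == 7)
p_sum_even = prob(lambda o: (o[0] + o[1]) % 2 == 0)

# "At least one 6" counted directly, and again via the complement rule.
p_at_least_one_6 = prob(lambda o: 6 in o)
p_via_complement = 1 - prob(lambda o: 6 not in o)

print(p_sum_7, p_sum_even, p_at_least_one_6)  # 1/6 1/2 11/36
```

Using `Fraction` keeps every value exact, so the printed results can be compared directly with the boxed answers.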

Q2 · EASY · Paper 1A · 4.5 Complementary / Expected · [5 marks]

$P(\text{rain}) = 0.4$, days independent. (a) $P(\text{no rain})$. (b) Expected rainy days in $30$. (c) $P(\ge 1\text{ rainy day in }3)$.

Answers:  (a) $0.6$  ·  (b) $12$  ·  (c) $0.784$

(a) A1

$P(R^{\,\prime}) = 1 - 0.4 = 0.6.$

(b) Expected frequency M1·A1

$E[\text{rainy days in }30] = n \cdot p = 30 \cdot 0.4 = \boxed{12}$ days.

(c) At least one rainy day in three M1·A1

$P(\text{no rain on any of 3 days}) = 0.6^3 = 0.216$. $P(\ge 1 \text{ rainy}) = 1 - 0.216 = \boxed{0.784}.$
Expected frequency vs probability. $E[X] = np$ is a count (units: "days"); $P$ is a probability (unitless, $[0,1]$). Keep them straight in answer writing.
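
A minimal sketch of parts (b) and (c), assuming independent days as the question states. The function name `at_least_one` is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def at_least_one(p, n):
    """P(at least one success in n independent trials), via the complement."""
    return 1 - (1 - p) ** n

p_rain = Fraction(2, 5)              # 0.4, kept exact
expected_rainy_days = 30 * p_rain    # a count of days, not a probability
p_some_rain = at_least_one(p_rain, 3)

print(expected_rainy_days)           # 12
print(float(p_some_rain))            # 0.784
```

Note that `expected_rainy_days` may exceed $1$ because it is a count, whereas `p_some_rain` must lie in $[0,1]$.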

Q3 · MEDIUM · Paper 1A · 4.6 ME vs Independent · [6 marks]

$P(A) = 0.3$, $P(B) = 0.5$. (a) ME: $P(A \cup B)$. (b) Ind: $P(A \cap B)$. (c) Ind: $P(A \cup B)$.

Answers:  (a) $0.8$  ·  (b) $0.15$  ·  (c) $0.65$

(a) Mutually exclusive case M1·A1

ME means $A \cap B = \varnothing$, so $P(A \cap B) = 0$. Addition rule: $$ P(A \cup B) \;=\; P(A) + P(B) - P(A \cap B) \;=\; 0.3 + 0.5 - 0 \;=\; \boxed{0.8}. $$

(b) Independent: intersection M1·A1

Independence means $P(A \cap B) = P(A)\,P(B) = 0.3 \cdot 0.5 = \boxed{0.15}.$

(c) Independent: union M1·A1

Apply addition rule with the intersection from (b): $$ P(A \cup B) \;=\; 0.3 + 0.5 - 0.15 \;=\; \boxed{0.65}. $$
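ME $\ne$ Independent. Mutually exclusive events with non-zero probabilities are never independent ($P(A \cap B) = 0$, whereas $P(A)P(B) > 0$). The two concepts are often confused, but for such events they cannot both hold: if $A$ and $B$ are mutually exclusive, knowing that $A$ occurred tells you that $B$ did not.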
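
The contrast between the two cases can be sketched with one addition-rule helper. The name `union` is an assumption made for this example.

```python
from fractions import Fraction

def union(p_a, p_b, p_both):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

p_a, p_b = Fraction(3, 10), Fraction(1, 2)

# Mutually exclusive: the intersection is empty.
p_union_me = union(p_a, p_b, 0)

# Independent: the intersection is the product.
p_both_ind = p_a * p_b
p_union_ind = union(p_a, p_b, p_both_ind)

# Mutually exclusive events with non-zero probabilities fail the product test.
me_is_independent = (0 == p_a * p_b)

print(p_union_me, p_both_ind, p_union_ind, me_is_independent)
# 4/5 3/20 13/20 False
```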

Q4 · MEDIUM · Paper 1A · 4.6 Conditional · [6 marks]

Two-way table: $50$ M + $50$ F; $65$ pass total; $30$ M pass. (a) $P(\text{Pass})$. (b) $P(\text{Pass}\mid\text{Male})$. (c) $P(\text{Female}\mid\text{Pass})$. (d) Independence?

Answers:  (a) $\tfrac{13}{20} = 0.65$  ·  (b) $\tfrac{3}{5} = 0.6$  ·  (c) $\tfrac{7}{13}$  ·  (d) not independent

(a) A1

$P(\text{Pass}) = 65/100 = \boxed{0.65}.$

(b) M1·A1

Condition on Male; restrict to the male row: $P(\text{Pass}\mid\text{Male}) = 30/50 = \boxed{0.6}.$

(c) M1·A1

Condition on Pass; restrict to the pass column. Of the $65$ students who passed, $65 - 30 = 35$ are female, so $P(\text{Female}\mid\text{Pass}) = 35/65 = \boxed{7/13} \approx 0.538.$

(d) Independence check R1

$P(\text{Pass} \cap \text{Male}) = 30/100 = 0.3$. $P(\text{Pass})\,P(\text{Male}) = 0.65 \cdot 0.5 = 0.325$. Since $0.3 \ne 0.325$, the two events are not independent.
Conditional restricts the universe. $P(B \mid A)$ reads "given $A$"; the denominator switches from $n(U) = 100$ to $n(A)$, and the numerator counts only the outcomes in $A \cap B$. Both numerator and denominator change.
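
The "restrict the universe" idea can be sketched by storing the table as counts and changing only the denominator. The cell values for Fail and Female are derived from the totals given in the question; the helper `n` is a hypothetical name.

```python
from fractions import Fraction

# Two-way table of counts; the female cells follow from the totals:
# 65 passes overall minus 30 male passes leaves 35 female passes.
counts = {
    ("Male", "Pass"): 30, ("Male", "Fail"): 20,
    ("Female", "Pass"): 35, ("Female", "Fail"): 15,
}
total = sum(counts.values())

def n(sex=None, result=None):
    """Count of students matching the given sex and/or result."""
    return sum(c for (s, r), c in counts.items()
               if sex in (None, s) and result in (None, r))

p_pass = Fraction(n(result="Pass"), total)
p_pass_given_male = Fraction(n("Male", "Pass"), n(sex="Male"))
p_female_given_pass = Fraction(n("Female", "Pass"), n(result="Pass"))

# Independence test: compare the joint probability with the product.
p_joint = Fraction(n("Male", "Pass"), total)
independent = p_joint == p_pass * Fraction(n(sex="Male"), total)

print(p_pass, p_pass_given_male, p_female_given_pass, independent)
# 13/20 3/5 7/13 False
```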

PART II  ·  PAPER 1 SECTION B — SOLUTIONS
No calculator · 23 marks

Section B — Extended Solutions

Q5 · MEDIUM · Paper 1B · 4.6 Tree · [7 marks]

$4$ red, $6$ blue. Two draws without replacement.

Answers:  (b) $\tfrac{2}{15}$  ·  (c) $\tfrac{8}{15}$  ·  (d) $\tfrac{2}{3}$

(a) Tree diagram M1·A1

                      +-- 3/9  R   →  RR : 4/10 · 3/9 = 12/90 = 2/15
      +-- 4/10  R ----+
      |               +-- 6/9  B   →  RB : 4/10 · 6/9 = 24/90 = 4/15
      |
      |               +-- 4/9  R   →  BR : 6/10 · 4/9 = 24/90 = 4/15
      +-- 6/10  B ----+
                      +-- 5/9  B   →  BB : 6/10 · 5/9 = 30/90 = 1/3

Award M1 for the structure (two stages, with the second-draw denominators reduced by $1$ because there is no replacement); A1 for correctly written conditional probabilities.

(b) $P(RR)$ M1·A1

$P(RR) = \dfrac{4}{10} \cdot \dfrac{3}{9} = \dfrac{12}{90} = \boxed{\dfrac{2}{15}}.$

(c) One of each colour M1·A1

$P(RB) + P(BR) = \dfrac{4}{10}\cdot\dfrac{6}{9} + \dfrac{6}{10}\cdot\dfrac{4}{9} = \dfrac{24}{90} + \dfrac{24}{90} = \dfrac{48}{90} = \boxed{\dfrac{8}{15}}.$

(d) $P(\text{second blue} \mid \text{first red})$ A1

Given the first ball was red, the bag now has $3$ red and $6$ blue ($9$ total). $P(B_2 \mid R_1) = \boxed{\dfrac{6}{9} = \dfrac{2}{3}}.$
Symmetry check. The two "one-of-each" paths $RB$ and $BR$ give identical probabilities here ($24/90$ each). This always holds when drawing without replacement (and also for independent draws): both products use the numerators $4$ and $6$ over the denominators $10$ and $9$, just in a different order. It is a useful arithmetic check.
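
The four leaf probabilities of the tree can be reproduced by walking each path and updating the bag after every draw. This is a sketch under the question's assumptions; `path_probability` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

def path_probability(path, bag):
    """Probability of drawing the colours in `path`, in order, without replacement."""
    bag = dict(bag)           # copy so the caller's bag is not modified
    prob = Fraction(1)
    for colour in path:
        prob *= Fraction(bag[colour], sum(bag.values()))
        bag[colour] -= 1      # the drawn ball is not returned
    return prob

bag = {"R": 4, "B": 6}
paths = {"".join(p): path_probability(p, bag) for p in product("RB", repeat=2)}

p_one_of_each = paths["RB"] + paths["BR"]

print(paths["RR"], p_one_of_each, sum(paths.values()))  # 2/15 8/15 1
```

The final `sum` is a quick check that the leaves of the tree add up to $1$.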

Q6 · HARD · Paper 1B · 4.6 Venn / Independence · [8 marks]

$n=100$; $|M|=65$, $|P|=50$, $|M \cap P|=25$. (a) Venn. (b) Neither. (c) Exactly one. (d) $P(P\mid M)$. (e) Independence.

Answers:  (b) $10$  ·  (c) $0.65$  ·  (d) $\tfrac{5}{13}$  ·  (e) not independent

(a) Venn diagram regions M1·A1

Centre (both): $25$. Math-only: $65 - 25 = 40$. Physics-only: $50 - 25 = 25$. Outside (neither): $100 - 40 - 25 - 25 = 10$.

(b) Neither A1

$|(M \cup P)^{\,\prime}| = 100 - |M \cup P| = 100 - (65 + 50 - 25) = 100 - 90 = \boxed{10}.$

(c) Exactly one M1·A1

$|M\text{-only}| + |P\text{-only}| = 40 + 25 = 65 \Rightarrow P = \boxed{65/100 = 0.65}.$

(d) $P(P \mid M)$ M1·A1

$$ P(P \mid M) \;=\; \frac{P(P \cap M)}{P(M)} \;=\; \frac{25/100}{65/100} \;=\; \frac{25}{65} \;=\; \boxed{\frac{5}{13}}. $$

(e) Independence test R1

$P(M \cap P) = 25/100 = 0.25$ vs $P(M)\,P(P) = 0.65 \cdot 0.5 = 0.325$. Since $0.25 \ne 0.325$, $M$ and $P$ are not independent. (Taking one subject makes the other less likely — the events are negatively associated here.)
Two routes to an independence check. Either compare $P(A \cap B)$ with $P(A)P(B)$, or compare $P(A \mid B)$ with $P(A)$. The two tests are equivalent whenever $P(B) > 0$, so pick whichever uses numbers already on the page.
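
Both routes can be run side by side on the Venn-diagram counts. The sketch below uses illustrative variable names and only the four numbers given in the question.

```python
from fractions import Fraction

n_total, n_m, n_p, n_both = 100, 65, 50, 25

# The four disjoint regions of the Venn diagram.
regions = {
    "both": n_both,
    "M only": n_m - n_both,
    "P only": n_p - n_both,
}
regions["neither"] = n_total - sum(regions.values())

p_m = Fraction(n_m, n_total)
p_p = Fraction(n_p, n_total)
p_both = Fraction(n_both, n_total)
p_exactly_one = Fraction(regions["M only"] + regions["P only"], n_total)
p_p_given_m = p_both / p_m

# Route 1: joint versus product. Route 2: conditional versus marginal.
route_1_independent = p_both == p_m * p_p
route_2_independent = p_p_given_m == p_p

print(regions["neither"], p_exactly_one, p_p_given_m)  # 10 13/20 5/13
print(route_1_independent, route_2_independent)        # False False
```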

Q7 · HARD · Paper 1B · AHL 4.10 Bayes · [8 marks] · HL

Two machines $A$ (60%), $B$ (40%); defect rates $2\%, 5\%$. (a) Tree. (b) $P(D)$. (c) $P(A \mid D)$. (d) Why $P(A\mid D) < P(A)$?

Answers:  (b) $0.032$  ·  (c) $P(A\mid D) = \tfrac{3}{8} = 0.375$  ·  (d) $B$ has higher defect rate

(a) Tree diagram M1·A1

                     +-- 0.02  D    →  P(A∩D) = 0.6 · 0.02 = 0.012
      +-- 0.6  A ----+
      |              +-- 0.98  D′   →  P(A∩D′) = 0.588
      |
      |              +-- 0.05  D    →  P(B∩D) = 0.4 · 0.05 = 0.020
      +-- 0.4  B ----+
                     +-- 0.95  D′   →  P(B∩D′) = 0.380


(b) Law of total probability M1·A1

$$ P(D) \;=\; P(A)P(D\mid A) + P(B)P(D\mid B) \;=\; 0.6 \cdot 0.02 + 0.4 \cdot 0.05 \;=\; 0.012 + 0.020 \;=\; \boxed{0.032}. $$

(c) Bayes' Theorem M1·M1·A1

$$ P(A \mid D) \;=\; \frac{P(D \mid A)\,P(A)}{P(D)} \;=\; \frac{0.02 \cdot 0.6}{0.032} \;=\; \frac{0.012}{0.032} \;=\; \boxed{\frac{3}{8} = 0.375}. $$

(d) Why posterior $<$ prior R1

Machine $B$ has a defect rate ($5\%$) more than double Machine $A$'s ($2\%$). So among defective items, $B$ is over-represented relative to its $40\%$ share of output, which pulls $P(A \mid D)$ below $P(A) = 0.6$.
Bayes in one symbolic line. Memorise the equation $P(H \mid E) = \dfrac{P(E \mid H)\,P(H)}{P(E)}$ where $H$ is the hypothesis and $E$ is the evidence. Then $P(E)$ is always built from the law of total probability. The whole AA HL Bayes machinery sits in those two formulas.
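
The two formulas can be combined into one small helper: total probability gives $P(E)$, and dividing each joint term by it gives the posteriors. This is a sketch, and the function name `posterior` is an assumption.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Return (P(E), {H: P(H | E)}) from P(H) and P(E | H) over a partition of hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    p_evidence = sum(joint.values())      # law of total probability
    return p_evidence, {h: joint[h] / p_evidence for h in joint}

priors = {"A": Fraction(60, 100), "B": Fraction(40, 100)}
defect_rates = {"A": Fraction(2, 100), "B": Fraction(5, 100)}

p_defect, post = posterior(priors, defect_rates)
print(p_defect, post["A"], post["B"])     # 4/125 3/8 5/8
```

The same helper applies to Q10: with priors $0.7$ (bus) and $0.3$ (walk) and on-time rates $0.85$ and $0.6$, it returns $P(OT) = 31/40 = 0.775$ and $P(B \mid OT) = 119/155$.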

PART III  ·  PAPER 2 — SOLUTIONS
Calculator · 18 marks

Section C — Paper 2 (Calculator) Solutions

Q8 · EASY · Paper 2 · 4.6 Two-Way Table · [5 marks]

$200$ students; Y11 = 100 (60 AA, 40 AI); Y12 = 100 (70 AA, 30 AI).

Answers:  (a) $0.5$  ·  (b) $0.7$  ·  (c) $0.3$  ·  (d) not independent

(a)–(c) A1·A1·A1

$P(\text{Y11}) = 100/200 = 0.5$; $\ P(\text{AA}\mid\text{Y12}) = 70/100 = 0.7$; $\ P(\text{Y11}\cap\text{AA}) = 60/200 = 0.3$.

(d) Independence check M1·R1

$P(\text{Y11})\,P(\text{AA}) = 0.5 \cdot (130/200) = 0.5 \cdot 0.65 = 0.325$. Compare with $P(\text{Y11}\cap\text{AA}) = 0.3$. Since $0.3 \ne 0.325$, the events are not independent.
Marginal probabilities first. Always read marginal $P(\text{AA}) = 130/200$ from the column total before comparing with conditionals. The full marginal/joint/conditional set is what tells you whether independence holds.

Q9 · MEDIUM · Paper 2 · 4.6 Sequential Draws · [7 marks]

$5$ white, $3$ black, no replacement.

Answers:  (a) $\tfrac{5}{14}$  ·  (b) $\tfrac{9}{14}$  ·  (c) $\tfrac{5}{7}$  ·  (d) $\tfrac{1}{2}$

(a) M1·A1

$P(WW) = \dfrac{5}{8} \cdot \dfrac{4}{7} = \dfrac{20}{56} = \boxed{\dfrac{5}{14}}.$

(b) Complement M1·A1

$P(\ge 1 \text{ black in 2 draws}) = 1 - P(WW) = 1 - \dfrac{5}{14} = \boxed{\dfrac{9}{14}}.$

(c) Conditional M1·A1

After drawing a black, the bag has $5$ white and $2$ black ($7$ total). $P(W_2 \mid B_1) = \boxed{\dfrac{5}{7}}.$

(d) Third draw A1

Two whites already drawn — bag now has $3$ white and $3$ black. $P(B_3 \mid W_1 \cap W_2) = \dfrac{3}{6} = \boxed{\dfrac{1}{2}}.$
Without replacement = condition on history. Each new draw updates the bag composition. Track the conditional bag state, not the original counts.
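
As an independent check that does not track the bag state at all, every ordered draw of three balls can be enumerated and counted. The helpers `prob` and `colour` are hypothetical names for this sketch.

```python
from fractions import Fraction
from itertools import permutations

balls = ["W"] * 5 + ["B"] * 3

# Every ordered draw of three distinct balls is equally likely.
draws = list(permutations(range(len(balls)), 3))

def prob(event, given=lambda d: True):
    """P(event | given) by counting equally likely ordered draws."""
    pool = [d for d in draws if given(d)]
    return Fraction(sum(1 for d in pool if event(d)), len(pool))

def colour(d, i):
    """Colour of the i-th ball (0-based) in ordered draw d."""
    return balls[d[i]]

p_ww = prob(lambda d: colour(d, 0) == "W" and colour(d, 1) == "W")
p_some_black = prob(lambda d: "B" in (colour(d, 0), colour(d, 1)))
p_w2_given_b1 = prob(lambda d: colour(d, 1) == "W",
                     given=lambda d: colour(d, 0) == "B")
p_b3_given_ww = prob(lambda d: colour(d, 2) == "B",
                     given=lambda d: colour(d, 0) == "W" and colour(d, 1) == "W")

print(p_ww, p_some_black, p_w2_given_b1, p_b3_given_ww)  # 5/14 9/14 5/7 1/2
```

Conditioning here is literally "restrict the pool of draws", which matches the bag-state reasoning used in parts (c) and (d).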

Q10 · HARD · Paper 2 · 4.6 Tree + Conditional · [6 marks]

$P(B) = 0.7$ bus, $P(W) = 0.3$ walks; $P(OT\mid B) = 0.85$, $P(OT\mid W) = 0.6$.

Answers:  (a) $0.775$  ·  (b) $0.595$  ·  (c) $\tfrac{119}{155} \approx 0.768$

(a) Law of total probability M1·M1·A1

$$ P(OT) \;=\; P(B)P(OT \mid B) + P(W)P(OT \mid W) \;=\; 0.7 \cdot 0.85 + 0.3 \cdot 0.6 \;=\; 0.595 + 0.18 \;=\; \boxed{0.775}. $$

(b) Joint A1

$P(B \cap OT) = P(B) \cdot P(OT \mid B) = 0.7 \cdot 0.85 = \boxed{0.595}.$

(c) Reverse conditional M1·A1

$$ P(B \mid OT) \;=\; \frac{P(B \cap OT)}{P(OT)} \;=\; \frac{0.595}{0.775} \;=\; \frac{119}{155} \;\approx\; \boxed{0.768}. $$ (Sanity: $P(B \mid OT) > P(B) = 0.7$ — yes, because the bus has a higher on-time rate, so knowing she was on time raises the posterior on the bus.)
Reverse-conditional = mini-Bayes. $P(B \mid OT) = \dfrac{P(OT \mid B)P(B)}{P(OT)}$. At SL this is framed as "two-way reading of the tree"; at HL it's exactly Bayes' Theorem. Same formula either way.

PART IV  ·  PAPER 3 (HL) — SOLUTIONS
Calculator · 28 marks

Section D — Paper 3 (HL Extended) Solutions

Q11 · HARD · Paper 3 · AHL 4.10 Multi-Source Bayes · [12 marks] · HL

Three regions; prevalence/sensitivity/specificity differ; find regional posteriors then a population-weighted posterior.

Answers:  (a) $P(D \mid +)_A = \tfrac{1}{3} \approx 0.333$  ·  (b) B: $0.400$; C: $0.286$  ·  (d) $P(B \mid +) \approx 0.515$

Statement of Bayes (used throughout) R1

$$ P(D \mid +) \;=\; \frac{P(+ \mid D)\,P(D)}{P(+)}, \qquad P(+) \;=\; P(+\mid D)P(D) + P(+\mid D^{\,\prime})P(D^{\,\prime}). $$ Note $P(+ \mid D^{\,\prime}) = 1 - \text{specificity}$ (the false-positive rate).

(a) Region A M1·A1·A1

$P(D) = 0.05$, $P(+\mid D) = 0.95$, $P(+\mid D^{\,\prime}) = 1 - 0.90 = 0.10$. $$ P(+) \;=\; 0.05 \cdot 0.95 + 0.95 \cdot 0.10 \;=\; 0.0475 + 0.095 \;=\; 0.1425. $$ $$ P(D \mid +)_A \;=\; \frac{0.0475}{0.1425} \;=\; \boxed{\frac{1}{3} \approx 0.333}. $$

(b) Regions B and C M1·A1·A1

Region B: $P(+) = 0.10 \cdot 0.90 + 0.90 \cdot 0.15 = 0.09 + 0.135 = 0.225$. $$ P(D \mid +)_B \;=\; \frac{0.09}{0.225} \;=\; \boxed{0.400}. $$ Region C: $P(+) = 0.02 \cdot 0.98 + 0.98 \cdot 0.05 = 0.0196 + 0.049 = 0.0686$. $$ P(D \mid +)_C \;=\; \frac{0.0196}{0.0686} \;\approx\; \boxed{0.286}. $$

(c) Low-prevalence paradox R1·R1

Region C has only $2\%$ prevalence, so even a highly accurate test produces many false positives by sheer volume. The false positives ($5\%$ of the $98\%$ healthy population, i.e. $4.9\%$ of everyone tested) outnumber the true positives ($98\%$ of the $2\%$ with the disease, i.e. $1.96\%$) by $2.5$ to $1$. Result: most positives in Region C are false. Posterior probability depends on prevalence, not test accuracy alone.

(d) Population-weighted posterior — was the positive person from B? M1·M1·A1·A1

Use the values of $P(+ \mid \text{Region})$ computed above:

| Region | $P(\text{Region})$ | $P(+\mid \text{Region})$ | Joint $P(\text{Region}\cap +)$ |
| --- | --- | --- | --- |
| A | $0.40$ | $0.1425$ | $0.0570$ |
| B | $0.35$ | $0.225$ | $0.07875$ |
| C | $0.25$ | $0.0686$ | $0.01715$ |
| $P(+)$ | | | $0.1529$ |

$$ P(B \mid +) \;=\; \frac{P(B \cap +)}{P(+)} \;=\; \frac{0.07875}{0.1529} \;\approx\; \boxed{0.515}. $$
Two layers of conditioning. Part (d) compounds two layers of mixing: within a region you mix disease vs no-disease; across regions you mix the regional populations. The general rule: when the posterior depends on a hierarchy of mixtures, build the joint distribution first (the table above), then condition on the observed evidence.
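
The "joint first, then condition" rule can be sketched as follows. The region parameters are read off the working in parts (a), (b) and (d); the dictionary layout and the name `p_positive` are assumptions made for this illustration.

```python
from fractions import Fraction as F

# (population share, prevalence, sensitivity, specificity) per region,
# read off the working in parts (a), (b) and (d).
regions = {
    "A": (F(40, 100), F(5, 100), F(95, 100), F(90, 100)),
    "B": (F(35, 100), F(10, 100), F(90, 100), F(85, 100)),
    "C": (F(25, 100), F(2, 100), F(98, 100), F(95, 100)),
}

def p_positive(prevalence, sens, spec):
    """P(+) within one region, by the law of total probability."""
    return sens * prevalence + (1 - spec) * (1 - prevalence)

# Joint distribution P(Region and +), then condition on the positive result.
joint = {r: share * p_positive(prev, sens, spec)
         for r, (share, prev, sens, spec) in regions.items()}
p_pos = sum(joint.values())
region_given_pos = {r: j / p_pos for r, j in joint.items()}

print(float(p_pos))                            # 0.1529
print(round(float(region_given_pos["B"]), 3))  # 0.515
```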

Q12 · HARD · Paper 3 · AHL 4.10 Sequential Bayes · [16 marks] · HL

Rare-marker screening, $P(D) = 0.01$, sens $0.99$, spec $0.95$, conditionally independent retests. Updates the posterior across one then two positives, then explores false-alarm rate and a weak-spec scenario.

Answers:  (b) $P(D \mid +) = \tfrac{1}{6} \approx 0.167$  ·  (c) $P(D \mid +,+) \approx 0.798$  ·  (e) $P(+,+ \mid D^{\,\prime}) = 0.0025$  ·  (f) $n = 4$, posterior $\approx 0.990$

(a) Statement of Bayes + identification R1·A1

$$ P(D \mid +) \;=\; \frac{P(+ \mid D)\,P(D)}{P(+)}, \qquad P(+) \;=\; P(+\mid D)P(D) + P(+\mid D^{\,\prime})P(D^{\,\prime}). $$ For this scenario: $P(D) = 0.01$, $P(+\mid D) = 0.99$ (sensitivity), $P(+\mid D^{\,\prime}) = 1 - 0.95 = 0.05$ (false-positive rate), and $P(D^{\,\prime}) = 1 - 0.01 = 0.99$.

(b) Posterior after one positive M1·M1·A1

$$ P(+) \;=\; (0.99)(0.01) + (0.05)(0.99) \;=\; 0.0099 + 0.0495 \;=\; 0.0594. $$ $$ P(D \mid +) \;=\; \frac{0.0099}{0.0594} \;=\; \boxed{\frac{1}{6} \approx 0.167}. $$ Even with $99\%$ sensitivity, the posterior only reaches $\sim 17\%$ — the $1\%$ base rate dominates after one positive.

(c) Posterior after two consecutive positives M1·M1·A1

New prior $P(D) = 1/6$, so $P(D^{\,\prime}) = 5/6$. Apply Bayes again with the same test characteristics: $$ P(+) \;=\; (0.99)\!\cdot\!\tfrac{1}{6} + (0.05)\!\cdot\!\tfrac{5}{6} \;=\; \tfrac{0.99 + 0.25}{6} \;=\; \tfrac{1.24}{6} \;\approx\; 0.2067. $$ $$ P(D \mid +,+) \;=\; \frac{0.99 \cdot \tfrac{1}{6}}{1.24/6} \;=\; \frac{0.99}{1.24} \;\approx\; \boxed{0.798}. $$ Direct sanity check via the original prior: $$ P(D \mid +,+) \;=\; \frac{P(D)\,[P(+\mid D)]^{2}}{P(D)[P(+\mid D)]^{2} + P(D^{\,\prime})[P(+\mid D^{\,\prime})]^{2}} \;=\; \frac{0.01 \cdot 0.9801}{0.01 \cdot 0.9801 + 0.99 \cdot 0.0025} \;\approx\; 0.798. \;\checkmark $$

(d) Why one positive $\to$ two positives is so dramatic R1·R1

Bayes is multiplicative in odds: each conditionally independent positive multiplies the current odds by the likelihood ratio $\text{LR}^{+} = \dfrac{P(+\mid D)}{P(+\mid D^{\,\prime})} = \dfrac{0.99}{0.05} = 19.8$. Prior odds are $0.01/0.99 = 1/99$. One positive: odds $\to 19.8/99 = 0.2$ (so probability $= 1/6$). Two positives: odds $\to 19.8^{2}/99 = 3.96$ (probability $\approx 0.798$). The second positive does not "add"; it multiplies, and a $\approx 20\times$ amplifier applied twice gives a $\approx 400\times$ shift in the odds.
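
The odds form of Bayes follows in one step from the statement in part (a): write the theorem for $D$ and for $D^{\,\prime}$ and divide, so that $P(+)$ cancels. A short derivation, using only symbols already defined: $$ \frac{P(D \mid +)}{P(D^{\,\prime} \mid +)} \;=\; \frac{P(+\mid D)\,P(D)\,/\,P(+)}{P(+\mid D^{\,\prime})\,P(D^{\,\prime})\,/\,P(+)} \;=\; \underbrace{\frac{P(+\mid D)}{P(+\mid D^{\,\prime})}}_{\text{LR}^{+}} \times \underbrace{\frac{P(D)}{P(D^{\,\prime})}}_{\text{prior odds}}. $$ To convert odds back to a probability, use $\text{probability} = \dfrac{\text{odds}}{1 + \text{odds}}$: for example $\dfrac{0.2}{1.2} = \dfrac{1}{6}$ and $\dfrac{3.96}{4.96} \approx 0.798$.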

(e) False-alarm rate for two consecutive positives M1·A1·R1

Conditional independence given non-disease: $$ P(+,+ \mid D^{\,\prime}) \;=\; [P(+\mid D^{\,\prime})]^{2} \;=\; (0.05)^{2} \;=\; \boxed{0.0025}. $$ Screening interpretation. In a population of $100{,}000$ healthy patients screened twice, expect $\approx 250$ to test positive twice in a row purely by chance — still rare per individual ($0.25\%$), but a sizeable absolute number once the screen is large, so two-positive protocols need a confirmatory third assay or follow-up.

(f) Weaker specificity $0.90$ — smallest $n$ for posterior $> 0.95$ M1·M1·A1

Now $P(+\mid D^{\,\prime}) = 0.10$, so $\text{LR}^{+} = 0.99/0.10 = 9.9$. Posterior odds after $n$ independent positives: $$ \text{odds}_n \;=\; \text{prior odds} \times (\text{LR}^{+})^{n} \;=\; \frac{0.01}{0.99} \cdot (9.9)^{n} \;=\; \frac{(9.9)^{n}}{99}. $$ Want posterior prob $> 0.95 \Leftrightarrow$ odds $> 19$: $$ \frac{(9.9)^{n}}{99} > 19 \;\Longleftrightarrow\; (9.9)^{n} > 1881 \;\Longleftrightarrow\; n > \frac{\log 1881}{\log 9.9} \approx \frac{3.274}{0.9956} \approx 3.29. $$ Iterate on GDC:

| $n$ | odds | posterior |
| --- | --- | --- |
| $3$ | $9.9^{3}/99 \approx 9.80$ | $\approx 0.907$ (fails) |
| $4$ | $9.9^{4}/99 \approx 97.03$ | $\approx \boxed{0.990}$ (succeeds) |

Smallest $n = \boxed{4}$, with posterior $\approx 0.990$.
The screening-test design heuristic. Raising the specificity (lowering the false-positive rate) buys posterior power much faster than improving sensitivity, because in low-prevalence settings the denominator $P(+)$ is dominated by false positives from the healthy majority. That is why two-stage screens (a cheap sensitive test, then an expensive specific confirmatory test) are so effective: they engineer the multiplication in (d) on purpose.
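
The odds-form update and the search for the smallest $n$ in part (f) can be sketched as below, assuming conditionally independent retests as the question states. The function names `posterior_after` and `smallest_n` are hypothetical.

```python
from fractions import Fraction

def posterior_after(n, prior, sens, spec):
    """Posterior P(D | n positives), assuming conditionally independent tests."""
    lr_plus = sens / (1 - spec)                  # likelihood ratio of a positive
    odds = prior / (1 - prior) * lr_plus ** n    # odds form of Bayes
    return odds / (1 + odds)

def smallest_n(threshold, prior, sens, spec):
    """Fewest consecutive positives that push the posterior above threshold.

    Assumes the likelihood ratio exceeds 1; otherwise the loop never ends.
    """
    n = 0
    while posterior_after(n, prior, sens, spec) <= threshold:
        n += 1
    return n

prior, sens = Fraction(1, 100), Fraction(99, 100)

one = posterior_after(1, prior, sens, Fraction(95, 100))
two = posterior_after(2, prior, sens, Fraction(95, 100))
n_needed = smallest_n(Fraction(95, 100), prior, sens, Fraction(90, 100))

print(one, round(float(two), 3), n_needed)  # 1/6 0.798 4
```

Changing `spec` in the last call is a quick way to explore how strongly the required number of retests depends on the false-positive rate.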
