Unit 4: Data Collections
单元 4:数据集合
Arrays, ArrayLists, 2D arrays, text files, wrapper classes, searching & sorting algorithms, and recursion.
数组、ArrayList、二维数组、文本文件、包装类、查找与排序算法,以及递归。
Ethical and Social Issues Around Data Collection
数据收集中的伦理与社会问题
Before diving into data structures, consider the ethical implications of collecting and storing data. Every program that handles personal information must prioritize user privacy.
在深入数据结构之前,先思考收集和存储数据所带来的伦理影响。任何处理个人信息的程序都必须把用户隐私放在首位。
Privacy Risks: When using a computer, personal privacy is at risk. Programmers should attempt to safeguard the personal privacy of users when developing new programs.
Algorithmic Bias: Systemic and repeated errors in a program that create unfair outcomes for a specific group of users. Programmers should be aware of the data set collection method and the potential for bias.
Data Quality: Some data sets are incomplete or contain inaccurate data. Using such data can cause programs to work incorrectly or inefficiently.
隐私风险:使用计算机时,个人隐私存在被侵犯的风险。开发新程序时,程序员应尽力保护用户的个人隐私。
算法偏见:程序中存在的系统性、重复性错误,会对特定用户群体产生不公平的结果。程序员应了解数据集的收集方式以及潜在的偏见。
数据质量:有些数据集不完整或包含不准确的数据。使用这类数据可能导致程序运行不正确或效率低下。
Introduction to Using Data Sets
使用数据集入门
A data set is a collection of specific pieces of information or data. Data sets can be manipulated and analyzed to solve a problem or answer a question.
数据集是一组具体的信息或数据。我们可以对数据集进行操作和分析,以解决问题或回答某个问题。
Processing Data: When analyzing data sets, values within the set are accessed and utilized one at a time, then processed according to the desired outcome.
Visual Planning: Data can be represented in a diagram by using a chart or table. This visual can be used to plan the algorithm that will be used to manipulate the data.
处理数据:分析数据集时,集合中的值会被逐个访问、逐个使用,然后按所需结果进行处理。
可视化规划:可以用图表或表格把数据画出来。这种可视化形式可以用来规划接下来操作数据所要使用的算法。
Array Creation and Access
数组的创建与访问
An array stores multiple values of the same type. The values can be either primitive values or object references. An array's length is established at creation and cannot be changed.
数组(array)存储多个相同类型的值。这些值既可以是基本类型的值,也可以是对象引用。数组的长度(length)在创建时就确定,之后不能改变。
Creating Arrays
创建数组
// Method 1: Using the new keyword// 方法 1:使用 new 关键字 int[] numbers = new int[5]; // {0, 0, 0, 0, 0} double[] grades = new double[3]; // {0.0, 0.0, 0.0} boolean[] flags = new boolean[4]; // {false, false, false, false} String[] names = new String[3]; // {null, null, null} // Method 2: Using an initializer list// 方法 2:使用初始化列表 int[] primes = {2, 3, 5, 7, 11}; String[] days = {"Mon", "Tue", "Wed"};
new, elements are initialized to defaults: int → 0, double → 0.0, boolean → false, reference types → null.
new 创建时,元素会被初始化(initialization)为默认值:int → 0,double → 0.0,boolean → false,引用类型 → null。
Accessing & Modifying Elements
访问与修改元素
int[] arr = {10, 20, 30, 40, 50}; // Access using index (0-based)// 使用索引访问(从 0 开始) int first = arr[0]; // 10 int last = arr[4]; // 50 // Modify an element// 修改一个元素 arr[2] = 99; // arr is now {10, 20, 99, 40, 50}// arr 现在是 {10, 20, 99, 40, 50} // Array length (attribute, not method!)// 数组长度(是字段,不是方法!) int len = arr.length; // 5
0 through array.length - 1. Accessing an index outside this range throws an ArrayIndexOutOfBoundsException.
array index)范围是 0 到 array.length - 1。访问该范围之外的索引会抛出越界(out-of-bounds)异常 ArrayIndexOutOfBoundsException。
arr[2] after int[] arr = new int[5];?int[] arr = new int[5]; 之后,arr[2] 的值是多少?int arrays default to 0. Index 2 is valid (0–4), so the value is 0.int 数组的默认值为 0。索引 2 在合法范围内(0–4),因此值为 0。int array is created with new, all elements default to 0.new 创建 int 数组时,所有元素的默认值都是 0。length on arrays vs length() on Strings and ArrayLists
Arrays use arr.length, no parentheses, because length is a public field, not a method. Strings use s.length(); ArrayLists use list.size(). Writing arr.length() or s.length is a compile error and shows up on MCQs as a "which line fails to compile?" trap.
length 与 String/ArrayList 的 length()
数组使用 arr.length,没有括号,因为 length 是公有字段而不是方法。String 使用 s.length();ArrayList 使用 list.size()。写成 arr.length() 或 s.length 都是编译错误,常以"下列哪一行无法编译?"的形式出现在选择题中。
Array Traversals
数组遍历
Traversing an array means using repetition statements to access all or an ordered sequence of elements.
遍历(traverse)数组是指用循环语句访问数组的所有元素或一段有序的元素。
Indexed for Loop
带索引的 for 循环
int[] arr = {3, 7, 1, 9, 5}; for (int i = 0; i < arr.length; i++) { System.out.print(arr[i] + " "); } // Output: 3 7 1 9 5// 输出:3 7 1 9 5
Enhanced for Loop (for-each)
增强 for 循环(enhanced for loop / for-each)
for (int val : arr) { System.out.print(val + " "); } // Output: 3 7 1 9 5// 输出:3 7 1 9 5
| Feature | Indexed for Loop | Enhanced for Loop |
|---|---|---|
| Access by index | ✅ Yes | ❌ No |
| Modify array elements | ✅ Yes | ❌ No (copies only) |
| Traverse in reverse | ✅ Yes | ❌ No |
| Traverse partial array | ✅ Yes | ❌ No (full traversal) |
| Simpler syntax | ❌ More verbose | ✅ Cleaner |
| 特性 | 带索引的 for 循环 | 增强 for 循环 |
|---|---|---|
| 通过索引访问 | ✅ 可以 | ❌ 不可以 |
| 修改数组元素 | ✅ 可以 | ❌ 不可以(只是副本) |
| 反向遍历 | ✅ 可以 | ❌ 不可以 |
| 遍历部分数组 | ✅ 可以 | ❌ 不可以(必须完整遍历) |
| 语法更简洁 | ❌ 较冗长 | ✅ 更简洁 |
int[] arr = {1, 2, 3}; // Doesn't work, val is a copy// 不起作用,val 是副本 for (int val : arr) { val *= 2; } // arr is still {1, 2, 3}// arr 仍是 {1, 2, 3} // Works, indexed for assigns through the array reference// 有效,带索引的 for 通过数组引用赋值 for (int i = 0; i < arr.length; i++) { arr[i] *= 2; } // arr is now {2, 4, 6}// arr 现在是 {2, 4, 6}
Why this matters: the enhanced for loop binds a copy of each value (for primitives) or a copy of each reference (for objects) to the loop variable. Assigning to that variable replaces the local copy and nothing else. To mutate the array itself, you need the index so you can write through arr[i] = ....
Object arrays are different: if the array holds object references, the enhanced for variable holds a reference to the same object. Calling a mutator method on it (e.g., account.deposit(10)) does change the object the array still points to. Only reassigning the variable (account = new Account()) leaves the array unchanged.
int[] arr = {1, 2, 3}; // Doesn't work, val is a copy// 不起作用,val 是副本 for (int val : arr) { val *= 2; } // arr is still {1, 2, 3}// arr 仍是 {1, 2, 3} // Works, indexed for assigns through the array reference// 有效,带索引的 for 通过数组引用赋值 for (int i = 0; i < arr.length; i++) { arr[i] *= 2; } // arr is now {2, 4, 6}// arr 现在是 {2, 4, 6}
为什么这很重要:增强 for 循环把每个值的副本(基本类型)或每个引用的副本(对象类型)绑定到循环变量上。给该变量赋值只会替换本地的副本,对其他东西没有影响。要真正修改数组本身,必须通过索引写入 arr[i] = ...。
对象数组则不同:如果数组存的是对象引用,增强 for 的循环变量持有的是同一对象的引用。调用其上的修改方法(例如 account.deposit(10))会改变数组所指向的同一个对象。只有重新赋值变量(account = new Account())时,数组才不会受到影响。
Implementing Array Algorithms
实现数组算法
Standard algorithms that traverse arrays to accomplish common tasks are fundamental to AP CSA.
通过遍历数组完成常见任务的标准算法,是 AP CSA 的基础内容。
Find Minimum / Maximum求最小值 / 最大值
public static int findMax(int[] arr) { int max = arr[0]; for (int val : arr) { if (val > max) max = val; } return max; }
Compute Sum / Average计算总和 / 平均值
public static double average(int[] arr) { int sum = 0; for (int val : arr) { sum += val; } return (double) sum / arr.length; }
Count / Check Elements with a Property统计 / 判断具有某种性质的元素
// Count elements greater than a threshold// 统计大于阈值的元素 public static int countAbove(int[] arr, int threshold) { int count = 0; for (int val : arr) { if (val > threshold) count++; } return count; } // Check if ALL elements are positive// 判断是否所有元素都为正 public static boolean allPositive(int[] arr) { for (int val : arr) { if (val <= 0) return false; } return true; } // Check if ANY element is negative// 判断是否存在负数元素 public static boolean hasNegative(int[] arr) { for (int val : arr) { if (val < 0) return true; } return false; }
Shift / Rotate / Reverse Elements元素的平移 / 循环移位 / 反转
// Shift left (first element lost, last becomes 0)// 左移(首元素丢失,末位变为 0) for (int i = 0; i < arr.length - 1; i++) { arr[i] = arr[i + 1]; } arr[arr.length - 1] = 0; // Reverse the array// 反转数组 for (int i = 0; i < arr.length / 2; i++) { int temp = arr[i]; arr[i] = arr[arr.length - 1 - i]; arr[arr.length - 1 - i] = temp; }
Detect Duplicates / Consecutive Pairs检测重复元素 / 相邻元素对
// Check for duplicates// 检测重复元素 public static boolean hasDuplicates(int[] arr) { for (int i = 0; i < arr.length; i++) { for (int j = i + 1; j < arr.length; j++) { if (arr[i] == arr[j]) return true; } } return false; } // Access consecutive pairs// 访问相邻元素对 for (int i = 0; i < arr.length - 1; i++) { System.out.println(arr[i] + " and " + arr[i + 1]); }
Using Text Files
使用文本文件
A file provides persistent storage, data survives after the program ends. Java uses the File and Scanner classes to read from text files.
文件提供持久化存储,数据在程序结束后依然保留。Java 使用 File 和 Scanner 类来读取文本文件。
Reading from a File
从文件读取内容
import java.io.File; import java.io.IOException; import java.util.Scanner; public static void readFile() throws IOException { File f = new File("data.txt"); Scanner sc = new Scanner(f); while (sc.hasNext()) { String line = sc.nextLine(); System.out.println(line); } sc.close(); }
| Method | Returns | Description |
|---|---|---|
nextInt() | int | Reads next integer |
nextDouble() | double | Reads next double |
nextBoolean() | boolean | Reads next boolean |
nextLine() | String | Reads next full line |
next() | String | Reads next token (word) |
hasNext() | boolean | True if more data remains |
close() | void | Closes the scanner/file |
| 方法 | 返回值 | 说明 |
|---|---|---|
nextInt() | int | 读取下一个整数 |
nextDouble() | double | 读取下一个 double 值 |
nextBoolean() | boolean | 读取下一个布尔值 |
nextLine() | String | 读取下一整行 |
next() | String | 读取下一个 token(单词) |
hasNext() | boolean | 是否还有更多数据 |
close() | void | 关闭扫描器/文件 |
The split() Method
split() 方法
String line = "Alice,90,85,92"; String[] parts = line.split(","); // parts = {"Alice", "90", "85", "92"}// parts = {"Alice", "90", "85", "92"}
File class, you must add throws IOException to the method header. The File and IOException classes are in java.io and must be imported.
File 类时,必须在方法签名上加上 throws IOException。File 和 IOException 类位于 java.io 包中,必须 import。
Wrapper Classes
包装类(wrapper class)
The Integer and Double classes wrap primitive types into objects. Both are immutable, once created, their values cannot change.
Integer 和 Double 类把基本类型包装成对象。它们都是不可变的——一旦创建,值就不能再改变。
Autoboxing & Unboxing
自动装箱(autoboxing)与拆箱(unboxing)
// Autoboxing: primitive → wrapper object// 自动装箱:基本类型 → 包装对象 Integer x = 5; // int → Integer// int → Integer Double y = 3.14; // double → Double// double → Double // Unboxing: wrapper object → primitive// 拆箱:包装对象 → 基本类型 int a = x; // Integer → int// Integer → int double b = y; // Double → double// Double → double
Parsing Strings to Numbers
把字符串解析成数字
int num = Integer.parseInt("42"); // "42" → 42// "42" → 42 double d = Double.parseDouble("3.14"); // "3.14" → 3.14// "3.14" → 3.14
int and double values in an ArrayList via autoboxing: ArrayList<Integer>.
int 和 double 值存进 ArrayList,例如 ArrayList<Integer>。
ArrayList Methods
ArrayList 的方法
An ArrayList is a mutable-size collection that stores object references. Unlike arrays, it can grow and shrink dynamically.
ArrayList 是一种大小可变的集合,用来存储对象引用。与数组不同,它可以动态地增大或缩小。
Creating an ArrayList
创建 ArrayList
import java.util.ArrayList; ArrayList<String> names = new ArrayList<String>(); ArrayList<Integer> nums = new ArrayList<Integer>();
| Method | Returns | Description |
|---|---|---|
size() | int | Number of elements |
add(E obj) | boolean | Appends obj to end; returns true |
add(int i, E obj) | void | Inserts obj at index i; shifts others right |
get(int i) | E | Returns element at index i |
set(int i, E obj) | E | Replaces element at i; returns old element |
remove(int i) | E | Removes at i; shifts others left; returns removed |
| 方法 | 返回值 | 说明 |
|---|---|---|
size() | int | 元素个数 |
add(E obj) | boolean | 把 obj 追加到末尾;返回 true |
add(int i, E obj) | void | 在索引 i 处插入 obj;其后元素整体右移 |
get(int i) | E | 返回索引 i 处的元素 |
set(int i, E obj) | E | 替换索引 i 处的元素;返回原元素 |
remove(int i) | E | 删除索引 i 处的元素;其后元素整体左移;返回被删元素 |
ArrayList<String> list = new ArrayList<String>(); list.add("A"); // ["A"] list.add("B"); // ["A", "B"] list.add(1, "X"); // ["A", "X", "B"] list.set(0, "Z"); // ["Z", "X", "B"] list.remove(1); // ["Z", "B"] String s = list.get(0); // "Z" int sz = list.size(); // 2
| Feature | Array | ArrayList |
|---|---|---|
| Size | Fixed at creation | Dynamic (grows/shrinks) |
| Stores | Primitives or objects | Objects only (use wrappers) |
| Access | arr[i] | list.get(i) |
| Length | arr.length | list.size() |
| Modify | arr[i] = val | list.set(i, val) |
| Insert/Delete | Manual shifting | Built-in methods |
| 特性 | 数组 | ArrayList |
|---|---|---|
| 大小 | 创建时固定 | 动态(可增可缩) |
| 存储内容 | 基本类型或对象 | 只能存对象(使用包装类) |
| 访问方式 | arr[i] | list.get(i) |
| 长度 | arr.length | list.size() |
| 修改 | arr[i] = val | list.set(i, val) |
| 插入 / 删除 | 手动移动元素 | 内置方法 |
list.add(1, "X") on list ["A", "B", "C"], what is the list?["A", "B", "C"] 执行 list.add(1, "X") 之后,列表变成什么?add(1, "X") inserts "X" at index 1, shifting "B" and "C" to the right.add(1, "X") 在索引 1 处插入 "X","B" 和 "C" 整体右移。add(1, "X") inserts at index 1: elements at index 1+ shift right.add(1, "X") 在索引 1 处插入:索引 1 及之后的元素整体右移。ArrayList Traversals
ArrayList 遍历
Traversing an ArrayList works similarly to arrays but with critical pitfalls when modifying the list during traversal.
遍历 ArrayList 的方式与遍历数组类似,但在遍历过程中修改列表时存在严重的坑。
// Indexed for loop// 带索引的 for 循环 for (int i = 0; i < list.size(); i++) { System.out.println(list.get(i)); } // Enhanced for loop// 增强 for 循环 for (String s : list) { System.out.println(s); }
Skipping elements on removal: When removing elements with an indexed for loop, either traverse backward or decrement the index after removal.
删除时跳过元素:在带索引的 for 循环中删除元素时,要么倒序遍历,要么在删除后将索引减一。
Safe Removal During Traversal
遍历过程中的安全删除
// Method 1: Traverse backward// 方法 1:倒序遍历 for (int i = list.size() - 1; i >= 0; i--) { if (list.get(i).equals("remove me")) { list.remove(i); } } // Method 2: Decrement index after removal// 方法 2:删除后将索引减一 for (int i = 0; i < list.size(); i++) { if (list.get(i).equals("remove me")) { list.remove(i); i--; // adjust to not skip next element// 调整以免跳过下一个元素 } }
Start with list = ["a", "x", "x", "b"] and try to remove every "x" with a forward indexed loop and no i-- adjustment:
i | list at start | get(i) | action | list at end |
|---|---|---|---|---|
| 0 | ["a", "x", "x", "b"] | "a" | keep | ["a", "x", "x", "b"] |
| 1 | ["a", "x", "x", "b"] | "x" | remove → shift left | ["a", "x", "b"] |
| 2 | ["a", "x", "b"] | "b" | keep, ❌ skipped the "x" that slid to index 1 | ["a", "x", "b"] |
Final list: ["a", "x", "b"]. One "x" was never removed because the remaining elements shifted left under the loop cursor. The two fixes shown above (i-- after removal, or traverse backward) both keep every element exactly under inspection once.
从 list = ["a", "x", "x", "b"] 开始,用正向带索引循环且不加 i-- 调整地删除所有 "x":
i | 开始时的 list | get(i) | 动作 | 结束时的 list |
|---|---|---|---|---|
| 0 | ["a", "x", "x", "b"] | "a" | 保留 | ["a", "x", "x", "b"] |
| 1 | ["a", "x", "x", "b"] | "x" | 删除 → 左移 | ["a", "x", "b"] |
| 2 | ["a", "x", "b"] | "b" | 保留,❌ 跳过了滑到索引 1 的那个 "x" | ["a", "x", "b"] |
最终列表:["a", "x", "b"]。有一个 "x" 始终没被删除,因为后续元素在循环游标下方左移了。上面给出的两种修复方法(删除后执行 i--,或倒序遍历)都能保证每个元素正好被检查一次。
list.add(...) or list.remove(...) inside the body throws ConcurrentModificationException at runtime, even though the code compiles. MCQs love to plant this in a snippet that "looks correct". The fix is always: switch to an indexed for loop, or use an iterator with its own .remove().
list.add(...) 或 list.remove(...),编译是能过的,但运行时会抛出 ConcurrentModificationException。选择题最喜欢把这种"看起来没问题"的代码作为陷阱。解决办法永远是:改用带索引的 for 循环,或使用带有自己 .remove() 方法的迭代器。
Implementing ArrayList Algorithms
实现 ArrayList 算法
The same standard algorithms from arrays apply to ArrayLists, plus the ability to insert and delete elements. Some algorithms require traversing multiple String, array, or ArrayList objects simultaneously.
数组上的那套标准算法同样适用于 ArrayList,并且还可以插入和删除元素。有些算法需要同时遍历多个 String、数组或 ArrayList 对象。
Remove All Elements with a Property删除所有具有某种性质的元素
// Remove all even numbers (traverse backward)// 删除所有偶数(倒序遍历) for (int i = nums.size() - 1; i >= 0; i--) { if (nums.get(i) % 2 == 0) { nums.remove(i); } }
Insert Elements Conditionally按条件插入元素
// Insert separator between consecutive duplicates// 在相邻重复元素之间插入分隔符 for (int i = 0; i < list.size() - 1; i++) { if (list.get(i).equals(list.get(i + 1))) { list.add(i + 1, "---"); i++; // skip the inserted element// 跳过刚插入的元素 } }
2D Array Creation and Access
二维数组的创建与访问
A 2D array is stored as an array of arrays, essentially a table with rows and columns. Its size is fixed at creation.
二维数组本质上是数组的数组,相当于一张带行列的表格。它的大小在创建时就已固定。
Creating 2D Arrays
创建二维数组
// Using new (3 rows, 4 columns, all zeros)// 使用 new(3 行 4 列,全为 0) int[][] grid = new int[3][4]; // Using initializer list// 使用初始化列表 int[][] matrix = { {1, 2, 3}, {4, 5, 6} }; // 2 rows, 3 columns// 2 行 3 列
Accessing Elements
访问元素
int val = matrix[1][2]; // row 1, col 2 → 6// 第 1 行第 2 列 → 6 matrix[0][0] = 99; // set top-left to 99// 将左上角设为 99 // Dimensions// 维度 int rows = matrix.length; // 2 int cols = matrix[0].length; // 3 // Access a single row (it's a 1D array)// 访问单独一行(它本身就是一维数组) int[] firstRow = matrix[0]; // {99, 2, 3}
arr.length = number of rows. arr[0].length = number of columns. First index is row, second is column: arr[row][col].
arr.length = 行数。arr[0].length = 列数。第一个下标是行,第二个下标是列:arr[row][col]。
2D Array Traversals
二维数组的遍历
2D arrays require nested loops to traverse all elements. Two common orderings are row-major and column-major.
二维数组需要用嵌套循环才能遍历所有元素。常见的两种遍历顺序是行优先(row-major)和列优先(column-major)。
Row-Major Order
行优先顺序
for (int r = 0; r < grid.length; r++) { for (int c = 0; c < grid[0].length; c++) { System.out.print(grid[r][c] + " "); } System.out.println(); }
Column-Major Order
列优先顺序
for (int c = 0; c < grid[0].length; c++) { for (int r = 0; r < grid.length; r++) { System.out.print(grid[r][c] + " "); } }
Enhanced for Loop with 2D Arrays
二维数组中的增强 for 循环
for (int[] row : grid) { // outer: each row (1D array)// 外层:每一行(一维数组) for (int val : row) { // inner: each element// 内层:每一个元素 System.out.print(val + " "); } }
int[]) because it receives each row. The inner loop variable is the element type (e.g., int). Assigning to the enhanced for variable does not change the array.
int[]),因为它接收每一行。内层循环变量的类型是元素类型(例如 int)。给增强 for 的循环变量赋值不会改变数组本身。
int array, the outer loop variable type should be:int 数组的嵌套增强 for 循环中,外层循环变量的类型应该是?int[].int[]。int[].int[]。int[][] g = { {1, 2, 3}, {4, 5, 6} }; // Row-major prints: 1 2 3 4 5 6// 行优先输出:1 2 3 4 5 6 for (int r = 0; r < g.length; r++) for (int c = 0; c < g[0].length; c++) System.out.print(g[r][c] + " "); // Column-major prints: 1 4 2 5 3 6// 列优先输出:1 4 2 5 3 6 for (int c = 0; c < g[0].length; c++) for (int r = 0; r < g.length; r++) System.out.print(g[r][c] + " ");
Quick way to tell which one a snippet is doing: look at which index is in the outer loop. If the row index r moves slowest (outer), the output goes row by row. If the column index moves slowest, the output goes column by column. Total iteration count is rows × cols either way.
int[][] g = { {1, 2, 3}, {4, 5, 6} }; // Row-major prints: 1 2 3 4 5 6// 行优先输出:1 2 3 4 5 6 for (int r = 0; r < g.length; r++) for (int c = 0; c < g[0].length; c++) System.out.print(g[r][c] + " "); // Column-major prints: 1 4 2 5 3 6// 列优先输出:1 4 2 5 3 6 for (int c = 0; c < g[0].length; c++) for (int r = 0; r < g.length; r++) System.out.print(g[r][c] + " ");
判断代码是哪种遍历的快速方法:看哪个下标在外层循环。如果行下标 r 变化最慢(在外层),输出就是按行进行。如果列下标变化最慢,输出就是按列进行。无论哪种方式,总迭代次数都是 rows × cols。
{{1,2},{3,4,5}}). Using g[0].length as the inner bound assumes every row is the same length as row 0, which fails on ragged arrays. The safer pattern is g[r].length as the inner bound, which re-reads the row length each pass.
ragged 2D array)
Java 的二维数组是数组的数组,所以各行的长度可以不同(例如 {{1,2},{3,4,5}})。用 g[0].length 作为内层循环边界,就是默认所有行的长度都和第 0 行一样,遇到不规则数组就会出错。更安全的写法是用 g[r].length 作为内层边界,每次都重新读取当前行的长度。
Implementing 2D Array Algorithms
实现二维数组算法
Standard 2D array algorithms extend 1D algorithms to work on the entire grid or specific rows, columns, or subsections.
标准的二维数组算法是把一维算法扩展到整张网格,或者扩展到特定的行、列或子区域上。
Sum of a Specific Row or Column求某一行或某一列的和
// Sum of row r// 第 r 行的和 public static int rowSum(int[][] grid, int r) { int sum = 0; for (int c = 0; c < grid[r].length; c++) sum += grid[r][c]; return sum; } // Sum of column c// 第 c 列的和 public static int colSum(int[][] grid, int c) { int sum = 0; for (int r = 0; r < grid.length; r++) sum += grid[r][c]; return sum; }
Find Max of Entire 2D Array求整张二维数组的最大值
public static int findMax2D(int[][] grid) { int max = grid[0][0]; for (int[] row : grid) { for (int val : row) { if (val > max) max = val; } } return max; }
Searching Algorithms
查找(searching)算法
Linear Search
线性查找(linear search)
Linear search checks each element in order until the target is found or all elements have been checked. Works on both sorted and unsorted data.
线性查找按顺序依次检查每一个元素,直到找到目标或检查完所有元素为止。对有序和无序数据都适用。
public static int linearSearch(int[] arr, int target) { for (int i = 0; i < arr.length; i++) { if (arr[i] == target) return i; } return -1; // not found// 未找到 }
Linear Search on a 2D Array
在二维数组上的线性查找
public static boolean search2D(int[][] grid, int target) { for (int[] row : grid) { for (int val : row) { if (val == target) return true; } } return false; }
Sorting Algorithms
排序(sorting)算法
Selection Sort
选择排序
Repeatedly finds the smallest element in the unsorted portion and swaps it into its correct final position.
不断地在未排序部分中找出最小的元素,并将它交换到它最终正确的位置上。
public static void selectionSort(int[] arr) { for (int i = 0; i < arr.length - 1; i++) { int minIdx = i; for (int j = i + 1; j < arr.length; j++) { if (arr[j] < arr[minIdx]) minIdx = j; } int temp = arr[i]; arr[i] = arr[minIdx]; arr[minIdx] = temp; } }
Insertion Sort
插入排序
Takes each element and inserts it into its correct (but not necessarily final) position in the sorted portion by shifting elements.
取出每个元素,通过移动其他元素,把它插入到已排序部分中正确的位置(但不一定是最终位置)。
public static void insertionSort(int[] arr) { for (int i = 1; i < arr.length; i++) { int key = arr[i]; int j = i - 1; while (j >= 0 && arr[j] > key) { arr[j + 1] = arr[j]; j--; } arr[j + 1] = key; } }
| Feature | Selection Sort | Insertion Sort |
|---|---|---|
| Method | Select min, swap into place | Insert into sorted portion |
| Position after pass | Final position | Correct but not necessarily final |
| Best on | Small arrays | Nearly sorted arrays |
| 特性 | 选择排序 | 插入排序 |
|---|---|---|
| 方法 | 选出最小值,交换到位 | 插入到已排序部分 |
| 每轮过后元素位置 | 最终位置 | 正确但不一定是最终位置 |
| 适合的场景 | 小数组 | 接近有序的数组 |
Both algorithms keep the first k elements sorted after pass k, but only one of them puts those elements in their final positions.
Start: [5, 2, 4, 6, 1, 3]
Selection sort, after pass 1: [1, 2, 4, 6, 5, 3]
↳ Position 0 holds the global minimum, FINAL position.
Insertion sort, after pass 1: [2, 5, 4, 6, 1, 3]
↳ The first two elements are sorted, but 5 is NOT in its final position.
The 1 and 3 still to come will push it right.
How AP MCQs phrase this: "After pass 3, which of the following must be true?" The selection-sort answer is "the smallest 3 elements are in positions 0–2". The insertion-sort answer is "the first 3 elements are in sorted order relative to each other" (not relative to the full array).
两种算法在第 k 轮过后,前 k 个元素都是有序的,但只有其中一种算法会让这些元素处于它们的最终位置。
起始: [5, 2, 4, 6, 1, 3]
选择排序,第 1 轮后: [1, 2, 4, 6, 5, 3]
↳ 位置 0 已放上全局最小值,处于最终位置。
插入排序,第 1 轮后: [2, 5, 4, 6, 1, 3]
↳ 前两个元素已经有序,但 5 还不在它的最终位置。
后面还会出现的 1 和 3 会把它向右挤。
AP 选择题的常见问法:"第 3 轮过后,下列哪一项一定成立?"选择排序的答案是"最小的 3 个元素位于位置 0–2"。插入排序的答案是"前 3 个元素相对彼此有序"(而不是相对于整个数组有序)。
n elements, selection sort runs n - 1 outer passes (the last element falls into place automatically). Insertion sort also runs n - 1 outer passes (it starts at index 1 and inserts each subsequent element). The classic wrong answer on MCQs is n passes.
n 的数组,选择排序的外层循环跑 n - 1 轮(最后一个元素会自动落到正确位置)。插入排序的外层循环也跑 n - 1 轮(它从索引 1 开始,依次插入后续每一个元素)。选择题中经典的错误答案是 n 轮。
Recursion
递归(recursion)
A recursive method is a method that calls itself. Every recursive method needs at least one base case (stops the recursion) and at least one recursive call.
递归方法是会调用自身的方法。每个递归方法都至少需要一个基线条件(base case,用来终止递归)和至少一次递归调用(recursive case)。
public static int factorial(int n) { if (n == 0) // base case// 基线条件 return 1; return n * factorial(n - 1); // recursive call// 递归调用 } // factorial(4) → 4 * 3 * 2 * 1 * 1 = 24// factorial(4) → 4 * 3 * 2 * 1 * 1 = 24
Key Insights:
• Each recursive call gets its own set of local variables and parameters.
• Parameter values capture progress just like loop control variables do.
• Any recursive solution can be replicated with iteration, and vice versa.
关键要点:
• 每次递归调用都有自己一套独立的局部变量和形参。
• 形参的值记录着进度,就像循环的控制变量那样。
• 任何递归解法都可以用迭代来实现,反之亦然。
mystery(5) return if mystery(int n) returns 0 when n <= 0, else n + mystery(n - 2)?mystery(int n) 在 n <= 0 时返回 0,否则返回 n + mystery(n - 2),则 mystery(5) 返回什么?public static String mystery(int n) { if (n <= 0) return ""; return mystery(n - 1) + n + " "; } // mystery(3) returns what?// mystery(3) 返回什么?
Step through it:
mystery(3)
= mystery(2) + 3 + " "
= (mystery(1) + 2 + " ") + 3 + " "
= ((mystery(0) + 1 + " ") + 2 + " ") + 3 + " "
= (("" + 1 + " ") + 2 + " ") + 3 + " "
= ("1 " + 2 + " ") + 3 + " "
= "1 2 " + 3 + " "
= "1 2 3 "
The pattern to look for: when the recursive call comes before the rest of the expression, the work happens on the way back up the call stack, in reverse order of the calls. When the recursive call comes after, the work happens on the way down. Whichever side the work sits on determines whether you build the string forward or backward.
public static String mystery(int n) { if (n <= 0) return ""; return mystery(n - 1) + n + " "; } // mystery(3) returns what?// mystery(3) 返回什么?
一步步追踪:
mystery(3)
= mystery(2) + 3 + " "
= (mystery(1) + 2 + " ") + 3 + " "
= ((mystery(0) + 1 + " ") + 2 + " ") + 3 + " "
= (("" + 1 + " ") + 2 + " ") + 3 + " "
= ("1 " + 2 + " ") + 3 + " "
= "1 2 " + 3 + " "
= "1 2 3 "
需要观察的规律:当递归调用出现在表达式的前面时,真正的工作发生在调用栈返回的过程中,并且顺序与调用的顺序相反。当递归调用出现在后面时,工作则发生在向下递归的过程中。工作落在哪一边,决定了你是正着构造字符串还是反着构造。
"". For one that returns an int sum, it returns 0. For one that returns a count, it might return 0 or 1 depending on whether the base case "counts itself". The wrong empty value is the most common reason a recursive MCQ trace is off by one element.
""。对于返回 int 累加和的方法,则返回 0。对于返回计数的方法,可能返回 0 也可能返回 1,要看基线条件本身是否"算上自己"。错误的空值是递归选择题追踪结果差一位的最常见原因。
Recursive Searching and Sorting
递归查找与排序
Binary Search
二分查找(binary search)
Binary search requires data to be sorted. It starts at the middle and eliminates half of the remaining elements each step.
二分查找要求数据必须有序。它从中间开始,每一步都把剩下的元素排除掉一半。
public static int binarySearch(int[] arr, int lo, int hi, int target) { if (lo > hi) return -1; // base case: not found// 基线条件:未找到 int mid = (lo + hi) / 2; if (arr[mid] == target) return mid; else if (arr[mid] < target) return binarySearch(arr, mid + 1, hi, target); else return binarySearch(arr, lo, mid - 1, target); }
| Feature | Linear Search | Binary Search |
|---|---|---|
| Requires sorted data? | No | Yes |
| Worst-case comparisons (n items) | n | log₂(n) |
| Example: 1,000,000 elements | 1,000,000 | ~20 |
| 特性 | 线性查找 | 二分查找 |
|---|---|---|
| 是否要求数据有序? | 否 | 是 |
| 最坏情况比较次数(n 个元素) | n | log₂(n) |
| 例:1,000,000 个元素 | 1,000,000 | 约 20 |
Merge Sort
归并排序
Merge sort is a recursive sorting algorithm that divides the array in half, sorts each half, then merges the sorted halves together.
归并排序是一种递归的排序算法:把数组对半分开,分别对两半进行排序,然后再把已排序的两半合并起来。
1. Divide: Split the array into two halves repeatedly until each subarray has one element.
2. Conquer: Each single-element subarray is trivially sorted.
3. Merge: Combine sorted subarrays back together, maintaining order.
1. 分解:反复把数组对半分开,直到每个子数组只剩一个元素。
2. 解决:单元素子数组本身就是有序的。
3. 合并:把有序的子数组重新合并起来,保持有序。
// Trace: merge sort on [38, 27, 43, 3]// 追踪:对 [38, 27, 43, 3] 进行归并排序 // Split: [38, 27] and [43, 3]// 分解:[38, 27] 和 [43, 3] // Split: [38] [27] [43] [3]// 分解:[38] [27] [43] [3] // Merge: [27, 38] [3, 43]// 合并:[27, 38] [3, 43] // Merge: [3, 27, 38, 43]// 合并:[3, 27, 38, 43]
Unit 4 MCQ Patterns That Show Up Every Year
单元 4 每年都会出现的选择题套路
Unit 4 carries the most weight on the AP CSA MCQ section. The questions cluster around a handful of operational pitfalls. Knowing the pattern usually beats tracing the whole snippet.
单元 4 在 AP CSA 选择题中占比最大。题目集中考查几类操作上的陷阱。摸清套路往往比从头追踪整段代码更高效。
length vs length() vs size()
Arrays: arr.length (field, no parens). Strings: s.length() (method). ArrayLists: list.size() (method). MCQs frequently include a "won't compile" trap that swaps one for another. (See Topic 4.3.)
length 与 length() 与 size()
数组:arr.length(字段,不带括号)。String:s.length()(方法)。ArrayList:list.size()(方法)。选择题经常埋"无法编译"的陷阱,把它们互相调换。(见小节 4.3。)
for (int x : arr) doesn't change arr. Calling a mutator on an object loop variable does change the object. The answer choice that says "the array is unchanged" is right for primitives. (See Topic 4.4.)
for (int x : arr) 中给循环变量重新赋值不会改变 arr。如果循环变量是对象,调用它的修改方法会改变对象本身。对于基本类型,正确答案是"数组保持不变"。(见小节 4.4。)
list.remove(i) in a forward indexed loop without a compensating i-- skips the element that slid into position i. Either traverse backward or adjust the index. Using an enhanced for at all throws ConcurrentModificationException. (See Topic 4.9.)
list.remove(i) 而没有相应的 i--,会跳过滑到位置 i 的元素。要么倒序遍历,要么调整索引。只要使用增强 for 循环就会抛出 ConcurrentModificationException。(见小节 4.9。)
rows × cols. Ragged 2D arrays use g[r].length as the per-row inner bound, not g[0].length. (See Topic 4.12.)
rows × cols。不规则二维数组的内层边界要用 g[r].length(按行读取),而不是 g[0].length。(见小节 4.12。)
k elements are the k smallest, in final position. Insertion sort: the first k elements are sorted relative to each other, but not necessarily in their final spots. (See Topic 4.15.)
k 个元素就是最小的 k 个,且处于最终位置。插入排序:前 k 个元素相对彼此有序,但不一定在它们的最终位置上。(见小节 4.15。)
ArrayList<Integer>.remove() overload trap
list.remove(2) on an ArrayList<Integer> removes the element at index 2, not the value 2. To remove the value, pass it boxed: list.remove(Integer.valueOf(2)). AP MCQs occasionally hide this in a "what is the list after these calls?" trace.
ArrayList<Integer>.remove() 重载陷阱
对 ArrayList<Integer> 调用 list.remove(2),删除的是索引 2 处的元素,而不是值为 2 的元素。要删除某个值,必须把它装箱后传入:list.remove(Integer.valueOf(2))。AP 选择题偶尔会把这一点藏在"执行这些调用后列表是什么?"的追踪题里。
Flashcards
闪卡
Click each card to flip and reveal the answer.
点击每张卡片翻面查看答案。
int array element?int 数组元素的默认值?0.lengthArrayList: ???数组:
.lengthArrayList:???
.size(), it's a method, not a field.size(),是方法,不是字段ArrayIndexOutOfBoundsExceptionConcurrentModificationException不行,会抛出ConcurrentModificationExceptionint → ???自动装箱:int → ???IntegerAutomatic primitive → wrapper conversion
Integer基本类型 → 包装类的自动转换
grid?二维数组 grid 的行数?grid.length2. A recursive call (moves toward base)1. 一个基线条件(终止递归)
2. 一次递归调用(向基线条件靠近)
split(",") return?split(",") 返回什么?String[] array split by the delimiter按分隔符切分后得到的 String[] 数组java.util.ArrayListUnit 4, Practice Quiz
单元 4 · 练习小测
x after int[] arr = {5, 10, 15, 20}; int x = arr[arr.length - 1];?int[] arr = {5, 10, 15, 20}; int x = arr[arr.length - 1]; 之后,x 的值是多少?arr.length is 4, so arr[3] is 20 (the last element).arr.length 为 4,因此 arr[3] 是 20(最后一个元素)。arr.length - 1 = 3, so arr[3] = 20.arr.length - 1 = 3,所以 arr[3] = 20。ConcurrentModificationException?ConcurrentModificationException?int[][] m = new int[3][4];, what is m[0].length?int[][] m = new int[3][4];,m[0].length 的值是多少?m.length is 3 (rows). m[0].length is 4 (columns in each row).m.length 为 3(行数)。m[0].length 为 4(每行的列数)。m.length = rows (3). m[0].length = columns (4).m.length = 行数(3)。m[0].length = 列数(4)。mystery(4) return?public static int mystery(int n) { if (n == 1) return 1; return n + mystery(n - 1); }mystery(4) 返回什么?public static int mystery(int n) { if (n == 1) return 1; return n + mystery(n - 1); }AP-Style Practice Problems
AP 风格练习题
15 exam-level multiple-choice questions (3 medium · 12 hard), code-trace heavy with realistic distractors and fully worked solutions. Built for top-score prep — go here after you've worked through the notes and the in-page quiz above.
15 道考试难度的选择题(3 道中等 · 12 道困难),以代码追踪为主,配有逼真的干扰项和完整的解析。专为冲高分而设计——读完笔记并完成上面的小测之后再来做这部分。
鼎睿学苑 · Dingrui Scholars
AP Computer Science A, Unit 4: Data Collections · 2026 Edition
AP 计算机科学 A · 单元 4:数据集合 · 2026 版