ARMv8.3 Pointer Authentication in xnu
=====================================

Introduction
------------

This document describes xnu's use of the ARMv8.3-PAuth extension. Specifically,
xnu uses ARMv8.3-PAuth to protect against Return-Oriented-Programming (ROP)
and Jump-Oriented-Programming (JOP) attacks, which attempt to gain control flow
over a victim program by overwriting return addresses or function pointers
stored in memory.

It is assumed the reader is already familiar with the basic concepts behind
ARMv8.3-PAuth and what its instructions do. The "ARMv8.3-A Pointer
Authentication" section of Google Project Zero's ["Examining Pointer
Authentication on the iPhone
XS"](https://googleprojectzero.blogspot.com/2019/02/examining-pointer-authentication-on.html)
provides a good introduction to ARMv8.3-PAuth. The reader may find more
comprehensive background material in:

* The "Pointer authentication in AArch64 state" section of the [ARMv8
  ARM](https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile)
  describes the new instructions and registers associated with ARMv8.3-PAuth.

* [LLVM's Pointer Authentication
  documentation](https://github.com/apple/llvm-project/blob/apple/master/clang/docs/PointerAuthentication.rst)
  outlines how clang uses ARMv8.3-PAuth instructions to harden key C, C++,
  Swift, and Objective-C language constructs.

### Threat model

Pointer authentication's threat model assumes that an attacker has found a gadget
to read and write arbitrary memory belonging to a victim process, which may
include the kernel. The attacker does *not* have the ability to execute
arbitrary code in that process's context. Pointer authentication aims to
prevent the attacker from gaining control flow over the victim process by
overwriting sensitive pointers in its address space (e.g., return addresses
stored on the stack).

Following this threat model, xnu takes a two-pronged approach to prevent the
attacker from gaining control flow over the victim process:

1. Both xnu and first-party binaries are built with LLVM's `-arch arm64e` flag,
   which generates pointer-signing and authentication instructions to protect
   addresses stored in memory (including ones pushed to the stack). This
   process is generally transparent to xnu, with exceptions discussed below.

2. On exception entry, xnu hashes critical register state before it is spilled
   to memory. On exception return, the reloaded state is validated against this
   hash.

The ["xnu PAC infrastructure"](#xnu-pac-infrastructure) section discusses how
these hardening techniques are implemented in xnu in more detail.


Key generation on Apple CPUs
----------------------------

ARMv8.3-PAuth implementations may use an <span style="font-variant:
small-caps">implementation defined</span> cipher. Apple CPUs implement an
optional custom cipher with two key-generation changes relevant to xnu.


### Per-boot diversifier

Apple's optional cipher adds a per-boot diversifier. In effect, even if xnu
initializes the "ARM key" registers (`APIAKey`, `APGAKey`, etc.) with constants,
signing a given value will still produce different signatures from boot to boot.


### Kernel/userspace diversifier

Apple CPUs also contain a second diversifier known as `KERNKey`. `KERNKey` is
automatically mixed into the final signing key (or not) based on the CPU's
exception level. When xnu needs to sign or authenticate userspace-signed
pointers, it uses the `ml_enable_user_jop_key` and `ml_disable_user_jop_key`
routines to manually enable or disable `KERNKey`. `KERNKey` allows the CPU to
effectively use different signing keys for userspace and kernel, without needing
to explicitly reprogram the generic ARM keys on every kernel entry and exit.
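
As a rough mental model, both diversifiers can be pictured as extra inputs
folded into the key before anything is signed. The sketch below is a toy in
portable C and not Apple's cipher: the struct, the helper names, and the use of
XOR as the mixing step are all assumptions made purely for illustration. It
shows only the data flow described above, namely that a constant ARM key still
yields a different effective key on each boot, and that the kernel can
temporarily drop the `KERNKey` contribution to operate on userspace-signed
pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model only. How Apple CPUs really combine key material is not
 * described in this document; XOR is used so the data flow is easy to see.
 */
typedef struct {
    uint64_t arm_key;          /* value programmed into e.g. APIAKey */
    uint64_t boot_diversifier; /* hypothetical per-boot value */
    uint64_t kern_key;         /* hypothetical stand-in for KERNKey */
    bool     kern_key_enabled; /* true while KERNKey is being mixed in */
} toy_cpu_keys_t;

/* Effective signing key under the toy model. */
static uint64_t
toy_effective_key(const toy_cpu_keys_t *cpu)
{
    assert(cpu != NULL);
    uint64_t key = cpu->arm_key ^ cpu->boot_diversifier;
    if (cpu->kern_key_enabled) {
        key ^= cpu->kern_key;
    }
    return key;
}

/*
 * Models the kernel briefly selecting the userspace key, the job that xnu
 * brackets with ml_enable_user_jop_key / ml_disable_user_jop_key.
 */
static uint64_t
toy_key_for_user_pointer(toy_cpu_keys_t *cpu)
{
    assert(cpu != NULL);
    bool saved = cpu->kern_key_enabled;
    cpu->kern_key_enabled = false;          /* drop the KERNKey contribution */
    uint64_t key = toy_effective_key(cpu);
    cpu->kern_key_enabled = saved;          /* restore the kernel's view */
    return key;
}
```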


xnu PAC infrastructure
----------------------

For historical reasons, the xnu codebase collectively refers to xnu + iOS's
pointer authentication infrastructure as Pointer Authentication Codes (PAC). The
remainder of this document will follow this terminology for consistency with
xnu.

### arm64e binary "slice"

Binaries with PAC instructions are not fully backwards-compatible with non-PAC
CPUs. Hence LLVM/iOS treat PAC-enabled binaries as a distinct ABI "slice" named
arm64e. xnu enforces this distinction by disabling the PAC keys when returning
to non-arm64e userspace, effectively turning ARMv8.3-PAuth auth and sign
instructions into no-ops (see the ["SCTLR_EL1"](#sctlr-el1) heading below for
more details).

### Kernel pointer signing

xnu is built with `-arch arm64e`, which causes LLVM to automatically sign and
authenticate function pointers and return addresses spilled onto the stack. This
process is largely transparent to software, with some exceptions:

- During early boot, xnu rebases and signs the pointers stored in its own
  `__thread_starts` section (see `rebase_threaded_starts` in
  `osfmk/arm/arm_init.c`).

- As parts of the userspace shared region are paged in, the page-in handler must
  also slide and re-sign any signed pointers stored in it. The ["Signed
  pointers in shared regions"](#signed-pointers-in-shared-regions) section
  discusses this in further detail.

- Assembly routines must manually sign the return address with `pacibsp` before
  pushing it onto the stack, and use an authenticating `retab` instruction in
  place of `ret`. xnu provides assembly macros `ARM64_STACK_PROLOG` and
  `ARM64_STACK_EPILOG` which emit the appropriate instructions for both arm64
  and arm64e targets.

  Likewise, branches in assembly to signed C function pointers must use the
  authenticating `blraa` instruction in place of `blr`.

- Signed pointers must be stripped with `ptrauth_strip` before they can be
  compared against compile-time constants like `VM_MIN_KERNEL_ADDRESS`.
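
To see why the last point matters, the following toy model shows a range check
going wrong on a signed pointer and succeeding once the signature bits are
stripped. It is a portable simulation, not kernel code: the bit layout
(signature in bits 63..56 and 54..47, with bit 55 selecting the address half),
the `TOY_MIN_KERNEL_ADDRESS` value, and all `toy_` names are assumptions chosen
for illustration. On an arm64e target the stripping would be done with
`ptrauth_strip` from `<ptrauth.h>` rather than by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: signature bits 63..56 and 54..47, selector bit 55. */
#define TOY_PAC_MASK            0xFF7F800000000000ULL
#define TOY_BIT55               0x0080000000000000ULL
/* Placeholder constant standing in for VM_MIN_KERNEL_ADDRESS. */
#define TOY_MIN_KERNEL_ADDRESS  0xFFFFFFE000000000ULL

/*
 * Replace the signature field with copies of bit 55, which restores the
 * canonical (unsigned) form of the address.
 */
static uint64_t
toy_strip(uint64_t ptr)
{
    /* The selector bit must not be part of the signature field. */
    assert((TOY_PAC_MASK & TOY_BIT55) == 0);
    return (ptr & TOY_BIT55) ? (ptr | TOY_PAC_MASK) : (ptr & ~TOY_PAC_MASK);
}

/* Range check that is safe to call on signed and unsigned pointers alike. */
static bool
toy_is_kernel_address(uint64_t maybe_signed_ptr)
{
    return toy_strip(maybe_signed_ptr) >= TOY_MIN_KERNEL_ADDRESS;
}
```

With this layout, the signed value `0x12B47FF007004000` compares *below*
`TOY_MIN_KERNEL_ADDRESS` if used directly, even though it strips to the kernel
address `0xFFFFFFF007004000`.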

### Testing data pointer signing

xnu contains tests for each manually qualified data pointer that should be
updated as new pointers are qualified. The tests allocate a structure
containing a `__ptrauth`-qualified member, and write a pointer to that member.
We can then compare the stored value, which should be signed, with a manually
constructed signature. See `ALLOC_VALIDATE_DATA_PTR`.

Tests are triggered by setting the `kern.run_ptrauth_data_tests` sysctl. The
sysctl is implemented, and BSD structures are tested, in `bsd/tests/ptrauth_data_tests_sysctl.c`.
Mach structures are tested in `osfmk/tests/ptrauth_data_tests.c`.

### Managing PAC register state

xnu generally tries to avoid reprogramming the CPU's PAC-related registers on
kernel entry and exit, since this could add significant overhead to a hot
codepath. Instead, xnu uses the following strategies to manage the PAC register
state.

#### A keys

Userspace processes' A keys (`AP{IA,DA,GA}Key`) are derived from the field
`jop_pid` inside `struct task`. For implementation reasons, an exact duplicate
of this field is cached in the corresponding `struct machine_thread`.

A keys are randomly generated at shared region initialization time (see ["Signed
pointers in shared regions"](#signed-pointers-in-shared-regions) below) and
copied into `jop_pid` during process activation. This shared region, and hence
associated A keys, may be shared among arm64e processes under specific
circumstances:

1. "System processes" (i.e., processes launched from first-party signed binaries
   on the iOS system image) generally use a common shared region with a default
   `jop_pid` value, separate from non-system processes.

   If a system process wishes to isolate its A keys even from other system
   processes, it may opt into a custom shared region using an entitlement in
   the form `com.apple.pac.shared_region_id=[...]`. That is, two processes with
   the entitlement `com.apple.pac.shared_region_id=foo` would share A keys and
   shared regions with each other, but not with other system processes.

2. Other arm64e processes automatically use the same shared region/A keys if
   their respective binaries are signed with the same team-identifier strings.

3. `posix_spawnattr_set_ptrauth_task_port_np()` allows explicit "inheriting" of
   A keys during `posix_spawn()`, using a supplied mach task port. This API is
   intended to support debugging tools that may need to auth or sign pointers
   using the target process's keys.
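
The first two sharing rules can be restated as a small predicate. This is only
a restatement of the list above in C, under stated assumptions: `toy_proc_t`
and its fields are hypothetical and do not correspond to xnu structures, rule 3
(explicit inheritance through `posix_spawn()`) is deliberately left out, and
treating a non-system process without a team identifier as sharing with nobody
is a choice made for this sketch rather than documented behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of the inputs the rules above depend on. */
typedef struct {
    bool        is_system;        /* first-party binary on the system image */
    const char *shared_region_id; /* com.apple.pac.shared_region_id value, or NULL */
    const char *team_id;          /* code-signing team identifier, or NULL */
} toy_proc_t;

/* True only when both strings are present and equal. */
static bool
toy_same_string(const char *a, const char *b)
{
    return a != NULL && b != NULL && strcmp(a, b) == 0;
}

/* Would these two arm64e processes end up with the same A keys? */
static bool
toy_shares_a_keys(const toy_proc_t *a, const toy_proc_t *b)
{
    assert(a != NULL && b != NULL);

    /* System and non-system processes never share a region. */
    if (a->is_system != b->is_system) {
        return false;
    }
    if (a->is_system) {
        /* Rule 1: default system region, unless an entitlement opts out. */
        if (a->shared_region_id == NULL && b->shared_region_id == NULL) {
            return true;
        }
        return toy_same_string(a->shared_region_id, b->shared_region_id);
    }
    /* Rule 2: non-system processes are grouped by team identifier. */
    return toy_same_string(a->team_id, b->team_id);
}
```

Under this model, two system processes that both carry
`com.apple.pac.shared_region_id=foo` share keys with each other but not with a
system process that has no such entitlement.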

#### B keys

Each process is assigned a random set of "B keys" (`AP{IB,DB}Key`) on process
creation. As a special exception, processes which inherit their parents' memory
address space (e.g., during `fork`) will also inherit their parents' B keys.
These keys are stored as the field `rop_pid` inside `struct task`, with an exact
duplicate in `struct machine_thread` for implementation reasons.

xnu reprograms the ARM B-key registers during context switch, via the macro
`set_process_dependent_keys_and_sync_context` in `cswitch.s`.

xnu uses the B keys internally to sign pointers pushed onto the kernel stack,
such as stashed LR values. Note that xnu does *not* need to explicitly switch
to a dedicated set of "kernel B keys" to do this:

1. The `KERNKey` diversifier already ensures that the actual signing keys are
   different between xnu and userspace.

2. Although reprogramming the ARM B-key registers will affect xnu's signing keys
   as well, pointers pushed onto the stack are inherently short-lived.
   Specifically, there will never be a situation where a stack pointer value is
   signed with one `current_task()`, but needs to be authed under a different
   active `current_task()`.

#### SCTLR_EL1

As discussed above, xnu disables the ARM keys when returning to non-arm64e
userspace processes. This is implemented by manipulating the `EnIA`, `EnIB`,
`EnDA`, and `EnDB` bits in the ARM `SCTLR_EL1` system register. When
these bits are cleared, auth or sign instructions using the respective keys
will simply pass through their inputs unmodified.

Initially, xnu cleared these bits during every `exception_return` to a
non-arm64e process. Since xnu itself uses these keys, the exception vector
needs to restore the same bits on every exception entry (implemented in the
`EL0_64_VECTOR` macro).

Apple A13 CPUs now have controls that allow xnu to keep the PAC keys enabled at
EL1, independent of `SCTLR_EL1` settings. On these CPUs, xnu only needs to
reconfigure `SCTLR_EL1` when context-switching from a "vanilla" arm64 process to
an arm64e process, or vice-versa (`pmap_switch_user_ttb_internal`).
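
The bit manipulation involved is small enough to sketch. The bit positions
below are the architectural ones for `SCTLR_EL1` (`EnIA` = 31, `EnIB` = 30,
`EnDA` = 27, `EnDB` = 13); everything else, including the `TOY_` macro names,
the helper functions, and the way a "signature" is placed in the top byte, is an
assumption for illustration and operates on plain integers rather than on the
real system register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural positions of the PAC enable bits in SCTLR_EL1. */
#define TOY_SCTLR_EnIA (1ULL << 31)
#define TOY_SCTLR_EnIB (1ULL << 30)
#define TOY_SCTLR_EnDA (1ULL << 27)
#define TOY_SCTLR_EnDB (1ULL << 13)
#define TOY_SCTLR_PAC_KEYS \
    (TOY_SCTLR_EnIA | TOY_SCTLR_EnIB | TOY_SCTLR_EnDA | TOY_SCTLR_EnDB)

/* SCTLR_EL1 value to run userspace with, given the task's ABI. */
static uint64_t
toy_sctlr_for_userspace(uint64_t sctlr, bool task_is_arm64e)
{
    return task_is_arm64e ? (sctlr | TOY_SCTLR_PAC_KEYS)
                          : (sctlr & ~TOY_SCTLR_PAC_KEYS);
}

/*
 * Toy stand-in for a sign instruction using the IA key: the caller supplies
 * an 8-bit "signature" that is placed in the top byte. With EnIA clear, the
 * pointer passes through unmodified.
 */
static uint64_t
toy_pacia(uint64_t sctlr, uint64_t ptr, uint64_t toy_pac)
{
    assert((ptr >> 56) == 0 && toy_pac <= 0xFF);
    if ((sctlr & TOY_SCTLR_EnIA) == 0) {
        return ptr;
    }
    return ptr | (toy_pac << 56);
}
```

For a non-arm64e task, `toy_sctlr_for_userspace` clears all four bits, so
`toy_pacia` returns its input unchanged, mirroring the pass-through behavior
described above.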

### Signed pointers in shared regions

Each userspace process has a *shared region* mapped into its address space,
consisting of code and data shared across all processes of the same processor
type, bitness, root directory, and (for arm64e processes) team ID. Comments at
the top of `osfmk/vm/vm_shared_region.c` discuss this region, and the process of
populating it, in more detail.

As the VM layer pages in parts of the shared region, any embedded pointers must
be rebased. Although this process is not new, PAC adds a new step: these
embedded pointers may be signed, and must be re-signed after they are rebased.
This process is implemented as `vm_shared_region_slide_page_v3` in
`osfmk/vm/vm_shared_region.c`.

xnu signs these embedded pointers using a shared-region-specific A key
(`sr_jop_key`), which is randomly generated when the shared region is created.
Since these pointers will be consumed by userspace processes, xnu temporarily
switches to the userspace A keys when re-signing them.

### Signing spilled register state

xnu saves register state into kernel memory when taking exceptions, and reloads
this state on exception return. If an attacker has write access to kernel
memory, it can modify this saved state and effectively get control over a
victim thread's control flow.

xnu hardens against this attack by calling `ml_sign_thread_state` on exception
entry to hash certain registers before they're saved to memory. On exception
return, it calls the complementary `ml_check_signed_state` function to ensure
that the reloaded values still match this hash. `ml_sign_thread_state` hashes a
handful of particularly sensitive registers:

* `pc, lr`: directly affect control-flow
* `cpsr`: controls process's exception level
* `x16, x17`: used by LLVM to temporarily store unauthenticated addresses

`ml_sign_thread_state` also uses the address of the thread's `arm_saved_state_t`
as a diversifier. This step keeps attackers from using `ml_sign_thread_state`
as a signing oracle. An attacker may attempt to create a sacrificial thread,
set this thread to some desired state, and use kernel memory access gadgets to
transplant the xnu-signed state onto a victim thread. Because the victim
thread's `arm_saved_state_t` lives at a different address, its diversifier
differs, and `ml_check_signed_state` will detect a hash mismatch in the victim
thread.

Apart from exception entry and return, xnu calls `ml_check_signed_state` and
`ml_sign_thread_state` whenever it needs to mutate one of these sensitive
registers (e.g., advancing the PC to the next instruction). This process looks
like:

1. Disable interrupts
2. Load `pc, lr, cpsr, x16, x17` values and hash from thread's
   `arm_saved_state_t` into registers
3. Call `ml_check_signed_state` to ensure values have not been tampered with
4. Mutate one or more of these values using *only* register-to-register
   instructions
5. Call `ml_sign_thread_state` to re-hash the mutated thread state
6. Store the mutated values and new hash back into thread's `arm_saved_state_t`
7. Restore old interrupt state

Critically, none of the sensitive register values can be spilled to memory
between steps 1 and 7. Otherwise an attacker with kernel memory access could
modify one of these values and use step 5 as a signing oracle. xnu implements
these routines entirely in assembly to ensure full control over register use,
using a macro `MANIPULATE_SIGNED_THREAD_STATE()` to generate boilerplate
instructions.

Interrupts must be disabled whenever `ml_check_signed_state` or
`ml_sign_thread_state` are called, starting *before* their inputs (`x0`--`x5`)
are populated. To understand why, consider what would happen if the CPU could
be interrupted just before step 5 above. xnu's exception handler would spill
the entire register state to memory. If an attacker has kernel memory access,
they could attempt to replace the spilled `x0`--`x5` values. These modified
values would then be reloaded into the CPU during exception return; and
`ml_sign_thread_state` would be called with new, attacker-controlled inputs.
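
The check, mutate, re-sign sequence and the role of the address diversifier can
be modeled in portable C. This is a toy under several assumptions: the
`toy_saved_state_t` layout, the fixed `TOY_KEY`, and the use of the splitmix64
finalizer in place of the hardware's keyed hash are all illustrative, and plain
C cannot give the guarantees that motivate xnu's assembly implementation
(values staying in registers, interrupts disabled). The sketch reports a
mismatch to its caller; how xnu reacts to one is outside its scope.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down saved state; not arm_saved_state_t's layout. */
typedef struct {
    uint64_t pc, lr, x16, x17;
    uint32_t cpsr;
    uint64_t hash;
} toy_saved_state_t;

/* Placeholder key; real keys live in CPU registers, not in memory. */
#define TOY_KEY 0x0123456789abcdefULL

/* Invertible 64-bit mixer (splitmix64 finalizer), standing in for the MAC. */
static uint64_t
toy_mix(uint64_t x)
{
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/* Hash the sensitive values, diversified by the saved state's address. */
static uint64_t
toy_hash_state(const toy_saved_state_t *ss, uint64_t pc, uint32_t cpsr,
    uint64_t lr, uint64_t x16, uint64_t x17)
{
    uint64_t h = toy_mix(TOY_KEY ^ (uint64_t)(uintptr_t)ss);
    h = toy_mix(h ^ pc);
    h = toy_mix(h ^ cpsr);
    h = toy_mix(h ^ lr);
    h = toy_mix(h ^ x16);
    h = toy_mix(h ^ x17);
    return h;
}

static void
toy_sign_state(toy_saved_state_t *ss)
{
    assert(ss != NULL);
    ss->hash = toy_hash_state(ss, ss->pc, ss->cpsr, ss->lr, ss->x16, ss->x17);
}

static bool
toy_check_state(const toy_saved_state_t *ss)
{
    assert(ss != NULL);
    return ss->hash ==
        toy_hash_state(ss, ss->pc, ss->cpsr, ss->lr, ss->x16, ss->x17);
}

/* Advance the PC by one 4-byte instruction, following steps 1 to 7. */
static bool
toy_advance_pc(toy_saved_state_t *ss)
{
    assert(ss != NULL);
    /* Step 1 (disable interrupts) has no equivalent in this model. */
    /* Step 2: load values and hash into locals (registers in real code). */
    uint64_t pc = ss->pc, lr = ss->lr, x16 = ss->x16, x17 = ss->x17;
    uint32_t cpsr = ss->cpsr;
    uint64_t hash = ss->hash;
    /* Step 3: verify the loaded copies, not the values still in memory. */
    if (hash != toy_hash_state(ss, pc, cpsr, lr, x16, x17)) {
        return false;
    }
    /* Step 4: mutate the local copy only. */
    pc += 4;
    /* Steps 5 and 6: re-hash from the locals, then store value and hash. */
    ss->pc = pc;
    ss->hash = toy_hash_state(ss, pc, cpsr, lr, x16, x17);
    /* Step 7 (restore interrupt state) has no equivalent here either. */
    return true;
}
```

In this model, copying a signed `toy_saved_state_t` from a sacrificial thread
over a victim's state fails `toy_check_state`, because the hash was computed
with the sacrificial structure's address.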

### thread_set_state

The `thread_set_state` call lets userspace modify the register state of a target
thread. Signed userspace state adds a wrinkle to this process, since the
incoming FP, LR, SP, and PC values are signed using the *userspace process's*
key.

xnu handles this in two steps. First, `machine_thread_state_convert_from_user`
converts the userspace thread state representation into an in-kernel
representation. Signed values are authenticated using `pmap_auth_user_ptr`,
which involves temporarily switching to the userspace keys.

Second, `thread_state64_to_saved_state` applies this converted state to the
target thread. Whenever `thread_state64_to_saved_state` modifies a register
that makes up part of the thread state hash, it uses
`MANIPULATE_SIGNED_THREAD_STATE()` as described above to update this hash.


### Signing arbitrary data blobs

xnu provides `ptrauth_utils_sign_blob_generic` and `ptrauth_utils_auth_blob_generic`
to sign and authenticate arbitrary blobs of data. Callers are responsible for
storing the pointer-sized signature returned. The signature is a rolling MAC
of the data, using the `pacga` instruction, mixed with a provided salt and optionally
further diversified by storage address.

Use of these functions is inherently racy. The data must be read from memory
before each pointer-sized block can be added to the signature. In normal operation,
standard thread-safety semantics protect against corruption. In the malicious
case, however, it may be possible to time an overwrite of the buffer to land
just before signing or just after authentication.

Callers of these functions must take care to minimise these race windows by
using them immediately preceding/following a write/read of the blob's data.
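
The shape of such a rolling MAC can be sketched in portable C. This is not the
algorithm behind `ptrauth_utils_sign_blob_generic`: the `toy_` functions, their
parameters, the zero-padding of a short final block, the mixing of the length,
and the splitmix64 finalizer standing in for `pacga` are all assumptions made
for illustration. The sketch shows only the structure described above, a
signature accumulated one pointer-sized block at a time, seeded by a salt and
optionally by the blob's storage address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Invertible 64-bit mixer (splitmix64 finalizer), standing in for pacga. */
static uint64_t
toy_mix(uint64_t x)
{
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/* Toy rolling MAC over pointer-sized blocks of the blob. */
static uint64_t
toy_sign_blob(const void *blob, size_t len, uint64_t salt,
    bool diversify_by_address)
{
    assert(blob != NULL || len == 0);
    const unsigned char *bytes = blob;

    uint64_t sig = toy_mix(salt);
    sig = toy_mix(sig ^ (uint64_t)len);
    if (diversify_by_address) {
        sig = toy_mix(sig ^ (uint64_t)(uintptr_t)blob);
    }
    for (size_t off = 0; off < len; off += sizeof(uint64_t)) {
        uint64_t block = 0;
        size_t n = (len - off < sizeof(block)) ? (len - off) : sizeof(block);
        /* Each block is read from memory here: this is the race window. */
        memcpy(&block, bytes + off, n); /* zero-pads a short final block */
        sig = toy_mix(sig ^ block);
    }
    return sig;
}

/* Recompute the signature and compare it with the stored one. */
static bool
toy_auth_blob(const void *blob, size_t len, uint64_t salt,
    bool diversify_by_address, uint64_t expected)
{
    return toy_sign_blob(blob, len, salt, diversify_by_address) == expected;
}
```

With address diversification enabled, a byte-for-byte copy of a signed blob at
a different address fails `toy_auth_blob`; without it, the copy authenticates,
which is the trade-off the optional diversifier controls.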