Instead of protecting the global read-only data structure after startup
via the read-only flag, which can be reverted, use the in Linux 6.10
introduced irreversible syscall mseal(2).
On Android, MTE is always enabled in Zygote, and is disabled after fork for apps that didn't opt-in
to MTE.
Depends on the slab canary adjustments in the previous commit.
- tag slab allocations with [1..14] tags
- tag freed slab allocations with the "15" tag value to detect accesses to freed slab memory
- when generating tag value for a slab slot, always exclude most recent tag value for that slot
(to make use-after-free detection more reliable) and most recent tag values of its immediate
neighbors (to detect linear overflows and underflows)
If N_ARENA is greater than 1 `thread_arena` is initially to N_ARENA,
which is an invalid index into `ro.size_class_metadata[]`.
The actual used arena is computed in init().
Ensure init() is called if a new thread is only using realloc() to avoid
UB, e.g. pthread_mutex_lock() might crash due the memory not holding an
initialized mutex.
Affects mesa 23.2.0~rc4.
Example back trace using glmark2 (note `arena=4` with the default
N_ARENA being 4):
Program terminated with signal SIGSEGV, Segmentation fault.
#0 ___pthread_mutex_lock (mutex=0x7edff8d3f200) at ./nptl/pthread_mutex_lock.c:80
type = <optimized out>
__PRETTY_FUNCTION__ = "___pthread_mutex_lock"
id = <optimized out>
#1 0x00007f0ab62091a6 in mutex_lock (m=0x7edff8d3f200) at ./mutex.h:21
No locals.
#2 0x00007f0ab620c9b5 in allocate_small (arena=4, requested_size=24) at h_malloc.c:517
info = {size = 32, class = 2}
size = 32
c = 0x7edff8d3f200
slots = 128
slab_size = 4096
metadata = 0x0
slot = 0
slab = 0x0
p = 0x0
#3 0x00007f0ab6209809 in allocate (arena=4, size=24) at h_malloc.c:1252
No locals.
#4 0x00007f0ab6208e26 in realloc (old=0x72b138199120, size=24) at h_malloc.c:1499
vma_merging_reliable = false
old_size = 16
new = 0x0
copy_size = 139683981990973
#5 0x00007299f919e556 in attach_shader (ctx=0x7299e9ef9000, shProg=0x7370c9277d30, sh=0x7370c9278230) at ../src/mesa/main/shaderapi.c:336
n = 1
#6 0x00007299f904223e in _mesa_unmarshal_AttachShader (ctx=<optimized out>, cmd=<optimized out>) at src/mapi/glapi/gen/marshal_generated2.c:1539
program = <optimized out>
shader = <optimized out>
cmd_size = 2
#7 0x00007299f8f2e3b2 in glthread_unmarshal_batch (job=job@entry=0x7299e9ef9168, gdata=gdata@entry=0x0, thread_index=thread_index@entry=0) at ../src/mesa/main/glthread.c:139
cmd = 0x7299e9ef9180
batch = 0x7299e9ef9168
ctx = 0x7299e9ef9000
pos = 0
used = 3
buffer = 0x7299e9ef9180
shared = <optimized out>
lock_mutexes = <optimized out>
batch_index = <optimized out>
#8 0x00007299f8ecc2d9 in util_queue_thread_func (input=input@entry=0x72c1160e5580) at ../src/util/u_queue.c:309
job = {job = 0x7299e9ef9168, global_data = 0x0, job_size = 0, fence = 0x7299e9ef9168, execute = <optimized out>, cleanup = <optimized out>}
queue = 0x7299e9ef9058
thread_index = 0
#9 0x00007299f8f1bcbb in impl_thrd_routine (p=<optimized out>) at ../src/c11/impl/threads_posix.c:67
pack = {func = 0x7299f8ecc190 <util_queue_thread_func>, arg = 0x72c1160e5580}
#10 0x00007f0ab5aa63ec in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:444
ret = <optimized out>
pd = <optimized out>
out = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139683974242608, 2767510063778797177, -168, 11, 140727286820160, 126005371879424, -4369625917767903623, -2847048016936659335}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0,
0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#11 0x00007f0ab5b26a2c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
This needs to be disabled for compatibility with the exploit protection
compatibility mode on GrapheneOS. hardened_malloc shouldn't be trying to
initialize itself when exploit protection compatibility mode is enabled.
This has to be handled in our Bionic integration instead.
Calls to malloc with a zero size are extremely rare relative to normal
usage of the API. It's generally only done by inefficient C code with
open coded dynamic array implementations where they aren't handling zero
size as a special case for their usage of malloc/realloc. Efficient code
wouldn't be making these allocations. It doesn't make sense to optimize
for the performance of rare edge cases caused by inefficient code.