Compare commits

...

24 commits

Author SHA1 Message Date
Ganwtrs
261b7bbf09 Correct title of README from Hardened malloc to hardened_malloc 2025-12-06 00:40:28 -05:00
Ganwtrs
74ef8a96ed Remove spaces around the slash (like one/two) 2025-12-05 21:55:56 -05:00
dependabot[bot]
c110ba88f3 build(deps): bump actions/checkout from 5 to 6
Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-11-20 13:27:29 -05:00
charles25565
a000fd4b5e Bump minimum AOSP version to QPR1 2025-11-15 17:04:35 -05:00
Charles
5cb0ff9f4d gitignore: use exact matches 2025-10-29 16:26:38 -04:00
Daniel Micay
e371736b17 drop legacy compiler versions from GitHub workflow 2025-09-23 18:12:57 -04:00
Daniel Micay
c46d3cab33 add newer Clang versions for GitHub workflow 2025-09-23 18:12:39 -04:00
Christian Göttsche
33ed3027ab Fix two typos 2025-09-21 12:35:28 -04:00
Christian Göttsche
86dde60fcf ReadMe: adjust section about library location 2025-09-21 12:35:28 -04:00
charles25565
ff99511eb4 Update dependencies in README
Update from bookworm to trixie, updating GKIs, and changing to Android 16.
2025-09-17 11:03:53 -04:00
Daniel Micay
c392d40843 update GitHub actions/checkout to 5 2025-08-12 00:28:58 -04:00
Віктор Дуйко
7481c8857f docs: updated the license date 2025-04-05 13:13:18 -04:00
Christian Göttsche
1d7fc7ffe0 support GCC15
GCC 15 starts warning about non NUL-terminated string literals:

    chacha.c:44:31: error: initializer-string for array of ‘char’ truncates NUL terminator but destination lacks ‘nonstring’ attribute (17 chars into 16 available) [-Werror=unterminated-string-initialization]
       44 | static const char sigma[16] = "expand 32-byte k";
          |                               ^~~~~~~~~~~~~~~~~~
2025-04-03 18:31:55 -04:00
Daniel Micay
4fe9018b6f rename calculate_waste.py to calculate-waste 2025-02-17 12:47:30 -05:00
Daniel Micay
3ab23f7ebf update libdivide to 5.2.0 2025-01-25 16:13:22 -05:00
Daniel Micay
c894f3ec1d add newer compiler versions for GitHub workflow 2024-12-15 22:20:01 -05:00
Daniel Micay
c97263ef0c handle GitHub runner image updates
clang-14 and clang-15 are no longer installed by default.
2024-12-15 22:18:40 -05:00
Daniel Micay
a7302add63 update outdated branch in README 2024-10-23 06:36:02 -04:00
Daniel Micay
b1d9571fec remove trailing whitespace 2024-10-12 03:23:52 -04:00
Daniel Micay
e03579253a preserve PROT_MTE when releasing memory 2024-10-12 03:19:16 -04:00
Daniel Micay
9739cb4690 use wrapper for calling memory_map_mte 2024-10-12 03:19:03 -04:00
Daniel Micay
aa950244f8 reuse code for memory_map_mte
This drops the separate error message since that doesn't seem useful.
2024-10-12 03:18:36 -04:00
Daniel Micay
6402e2b0d4 reduce probability hint for is_memtag_enabled 2024-10-12 03:17:44 -04:00
Daniel Micay
e86192e7fe remove redundant warning switches for Android
Android already enables -Wall and -Wextra in the global soong build
settings.
2024-10-09 19:57:15 -04:00
14 changed files with 154 additions and 90 deletions


@@ -11,9 +11,9 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
- version: [12]
+ version: [14]
steps:
- - uses: actions/checkout@v4
+ - uses: actions/checkout@v6
- name: Setting up gcc version
run: |
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-${{ matrix.version }} 100
@@ -24,9 +24,11 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
- version: [14, 15]
+ version: [19, 20]
steps:
- - uses: actions/checkout@v4
+ - uses: actions/checkout@v6
+ - name: Install dependencies
+ run: sudo apt-get update && sudo apt-get install -y --no-install-recommends clang-19 clang-20
- name: Setting up clang version
run: |
sudo update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-${{ matrix.version }} 100
@@ -38,7 +40,7 @@ jobs:
container:
image: alpine:latest
steps:
- - uses: actions/checkout@v4
+ - uses: actions/checkout@v6
- name: Install dependencies
run: apk update && apk add build-base python3
- name: Build
@@ -46,7 +48,7 @@ jobs:
build-ubuntu-gcc-aarch64:
runs-on: ubuntu-latest
steps:
- - uses: actions/checkout@v4
+ - uses: actions/checkout@v6
- name: Install dependencies
run: sudo apt-get update && sudo apt-get install -y --no-install-recommends gcc-aarch64-linux-gnu g++-aarch64-linux-gnu libgcc-s1-arm64-cross cpp-aarch64-linux-gnu
- name: Build

.gitignore

@@ -1,2 +1,2 @@
- out/
+ /out/
- out-light/
+ /out-light/


@@ -5,8 +5,6 @@ common_cflags = [
"-fPIC",
"-fvisibility=hidden",
//"-fno-plt",
- "-Wall",
- "-Wextra",
"-Wcast-align",
"-Wcast-qual",
"-Wwrite-strings",


@@ -1,4 +1,4 @@
- Copyright © 2018-2024 GrapheneOS
+ Copyright © 2018-2025 GrapheneOS
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal


@@ -1,4 +1,4 @@
- # Hardened malloc
+ # hardened_malloc
* [Introduction](#introduction)
* [Dependencies](#dependencies)
@@ -65,14 +65,14 @@ used instead as this allocator fundamentally doesn't support that environment.
## Dependencies
- Debian stable (currently Debian 12) determines the most ancient set of
+ Debian stable (currently Debian 13) determines the most ancient set of
supported dependencies:
- * glibc 2.36
+ * glibc 2.41
- * Linux 6.1
+ * Linux 6.12
- * Clang 14.0.6 or GCC 12.2.0
+ * Clang 19.1.7 or GCC 14.2.0
- For Android, the Linux GKI 5.10, 5.15 and 6.1 branches are supported.
+ For Android, the Linux GKI 6.1, 6.6 and 6.12 branches are supported.
However, using more recent releases is highly recommended. Older versions of
the dependencies may be compatible at the moment but are not tested and will
@@ -83,7 +83,7 @@ there will be custom integration offering better performance in the future
along with other hardening for the C standard library implementation.
For Android, only the current generation, actively developed maintenance branch of the Android
- Open Source Project will be supported, which currently means `android13-qpr2-release`.
+ Open Source Project will be supported, which currently means `android16-qpr1-release`.
## Testing
@@ -159,14 +159,17 @@ line to the `/etc/ld.so.preload` configuration file:
The format of this configuration file is a whitespace-separated list, so it's
good practice to put each library on a separate line.
- On Debian systems `libhardened_malloc.so` should be installed into `/usr/lib/`
- to avoid preload failures caused by AppArmor profile restrictions.
+ For maximum compatibility `libhardened_malloc.so` can be installed into
+ `/usr/lib/` to avoid preload failures caused by AppArmor profiles or systemd
+ ExecPaths= restrictions. Check for logs of the following format:
+ ERROR: ld.so: object '/usr/local/lib/libhardened_malloc.so' from /etc/ld.so.preload cannot be preloaded (failed to map segment from shared object): ignored.
Using the `LD_PRELOAD` environment variable to load it on a case-by-case basis
will not work when `AT_SECURE` is set such as with setuid binaries. It's also
generally not a recommended approach for production usage. The recommendation
is to enable it globally and make exceptions for performance critical cases by
- running the application in a container / namespace without it enabled.
+ running the application in a container/namespace without it enabled.
Make sure to raise `vm.max_map_count` substantially too to accommodate the very
large number of guard pages created by hardened\_malloc. As an example, in
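As a concrete illustration of the preload setup described in the hunk above (a sketch only: the /usr/lib install path follows the README's recommendation, while the sysctl file name and the 1048576 value are assumed examples rather than values taken from this diff), the configuration could look like:

/etc/ld.so.preload:
/usr/lib/libhardened_malloc.so

/etc/sysctl.d/hardened_malloc.conf:
vm.max_map_count = 1048576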
@@ -252,7 +255,7 @@ The following boolean configuration options are available:
* `CONFIG_WRITE_AFTER_FREE_CHECK`: `true` (default) or `false` to control
sanity checking that new small allocations contain zeroed memory. This can
detect writes caused by a write-after-free vulnerability and mixes well with
- the features for making memory reuse randomized / delayed. This has a
+ the features for making memory reuse randomized/delayed. This has a
performance cost scaling to the size of the allocation, which is usually
acceptable. This is not relevant to large allocations because they're always
a fresh memory mapping from the kernel.
@@ -338,7 +341,7 @@ larger caches can substantially improves performance).
## Core design
- The core design of the allocator is very simple / minimalist. The allocator is
+ The core design of the allocator is very simple/minimalist. The allocator is
exclusive to 64-bit platforms in order to take full advantage of the abundant
address space without being constrained by needing to keep the design
compatible with 32-bit.
@@ -370,13 +373,13 @@ whether it's free, along with a separate bitmap for tracking allocations in the
quarantine. The slab metadata entries in the array have intrusive lists
threaded through them to track partial slabs (partially filled, and these are
the first choice for allocation), empty slabs (limited amount of cached free
- memory) and free slabs (purged / memory protected).
+ memory) and free slabs (purged/memory protected).
Large allocations are tracked via a global hash table mapping their address to
their size and random guard size. They're simply memory mappings and get mapped
on allocation and then unmapped on free. Large allocations are the only dynamic
memory mappings made by the allocator, since the address space for allocator
- state (including both small / large allocation metadata) and slab allocations
+ state (including both small/large allocation metadata) and slab allocations
is statically reserved.
This allocator is aimed at production usage, not aiding with finding and fixing
@@ -387,7 +390,7 @@ messages. The design choices are based around minimizing overhead and
maximizing security which often leads to different decisions than a tool
attempting to find bugs. For example, it uses zero-based sanitization on free
and doesn't minimize slack space from size class rounding between the end of an
- allocation and the canary / guard region. Zero-based filling has the least
+ allocation and the canary/guard region. Zero-based filling has the least
chance of uncovering latent bugs, but also the best chance of mitigating
vulnerabilities. The canary feature is primarily meant to act as padding
absorbing small overflows to render them harmless, so slack space is helpful
@@ -421,11 +424,11 @@ was a bit less important and if a core goal was finding latent bugs.
* Top-level isolated regions for each arena
* Divided up into isolated inner regions for each size class
* High entropy random base for each size class region
- * No deterministic / low entropy offsets between allocations with
+ * No deterministic/low entropy offsets between allocations with
different size classes
* Metadata is completely outside the slab allocation region
* No references to metadata within the slab allocation region
- * No deterministic / low entropy offsets to metadata
+ * No deterministic/low entropy offsets to metadata
* Entire slab region starts out non-readable and non-writable
* Slabs beyond the cache limit are purged and become non-readable and
non-writable memory again
@@ -646,7 +649,7 @@ other. Static assignment can also reduce memory usage since threads may have
varying usage of size classes.
When there's substantial allocation or deallocation pressure, the allocator
- does end up calling into the kernel to purge / protect unused slabs by
+ does end up calling into the kernel to purge/protect unused slabs by
replacing them with fresh `PROT_NONE` regions along with unprotecting slabs
when partially filled and cached empty slabs are depleted. There will be
configuration over the amount of cached empty slabs, but it's not entirely a
@@ -693,7 +696,7 @@ The secondary benefit of thread caches is being able to avoid the underlying
allocator implementation entirely for some allocations and deallocations when
they're mixed together rather than many allocations being done together or many
frees being done together. The value of this depends a lot on the application
- and it's entirely unsuitable / incompatible with a hardened allocator since it
+ and it's entirely unsuitable/incompatible with a hardened allocator since it
bypasses all of the underlying security and would destroy much of the security
value.
@@ -957,7 +960,7 @@ doesn't handle large allocations within the arenas, so it presents those in the
For example, with 4 arenas enabled, there will be a 5th arena in the statistics
for the large allocations.
- The `nmalloc` / `ndalloc` fields are 64-bit integers tracking allocation and
+ The `nmalloc`/`ndalloc` fields are 64-bit integers tracking allocation and
deallocation count. These are defined as wrapping on overflow, per the jemalloc
implementation.


@@ -44,7 +44,7 @@ void *set_pointer_tag(void *ptr, u8 tag) {
return (void *) (((uintptr_t) tag << 56) | (uintptr_t) untag_pointer(ptr));
}
- // This test checks that slab slot allocation uses tag that is distint from tags of its neighbors
+ // This test checks that slab slot allocation uses tag that is distinct from tags of its neighbors
// and from the tag of the previous allocation that used the same slot
void tag_distinctness() {
// tag 0 is reserved


@@ -41,7 +41,7 @@ static const unsigned rounds = 8;
a = PLUS(a, b); d = ROTATE(XOR(d, a), 8); \
c = PLUS(c, d); b = ROTATE(XOR(b, c), 7);
- static const char sigma[16] = "expand 32-byte k";
+ static const char sigma[16] NONSTRING = "expand 32-byte k";
void chacha_keysetup(chacha_ctx *x, const u8 *k) {
x->input[0] = U8TO32_LITTLE(sigma + 0);
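The change above relies on the NONSTRING macro added to util.h later in this comparison. As a hedged, self-contained sketch of the pattern (the macro body mirrors the util.h hunk and assumes a compiler that provides __has_attribute, as current GCC and Clang do), this is how GCC's nonstring attribute marks a character array that deliberately drops the NUL terminator so the GCC 15 -Wunterminated-string-initialization diagnostic stays quiet:

// Sketch: mirrors the NONSTRING macro from the util.h hunk in this comparison.
#if __has_attribute(nonstring)
#define NONSTRING __attribute__((nonstring))
#else
#define NONSTRING
#endif

// Exactly 16 characters with no trailing NUL; NONSTRING documents that this
// is intentional, so GCC 15 does not warn about the truncated terminator.
static const char sigma[16] NONSTRING = "expand 32-byte k";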


@@ -94,6 +94,24 @@ static inline bool is_memtag_enabled(void) {
}
#endif
+ static void *memory_map_tagged(size_t size) {
+ #ifdef HAS_ARM_MTE
+ if (likely51(is_memtag_enabled())) {
+ return memory_map_mte(size);
+ }
+ #endif
+ return memory_map(size);
+ }
+ static bool memory_map_fixed_tagged(void *ptr, size_t size) {
+ #ifdef HAS_ARM_MTE
+ if (likely51(is_memtag_enabled())) {
+ return memory_map_fixed_mte(ptr, size);
+ }
+ #endif
+ return memory_map_fixed(ptr, size);
+ }
#define SLAB_METADATA_COUNT
struct slab_metadata {
@@ -470,7 +488,7 @@ static void write_after_free_check(const char *p, size_t size) {
}
#ifdef HAS_ARM_MTE
- if (likely(is_memtag_enabled())) {
+ if (likely51(is_memtag_enabled())) {
return;
}
#endif
@@ -505,7 +523,7 @@ static void set_slab_canary_value(UNUSED struct slab_metadata *metadata, UNUSED
static void set_canary(UNUSED const struct slab_metadata *metadata, UNUSED void *p, UNUSED size_t size) {
#if SLAB_CANARY
#ifdef HAS_ARM_MTE
- if (likely(is_memtag_enabled())) {
+ if (likely51(is_memtag_enabled())) {
return;
}
#endif
@@ -517,7 +535,7 @@ static void set_canary(UNUSED const struct slab_metadata *metadata, UNUSED void
static void check_canary(UNUSED const struct slab_metadata *metadata, UNUSED const void *p, UNUSED size_t size) {
#if SLAB_CANARY
#ifdef HAS_ARM_MTE
- if (likely(is_memtag_enabled())) {
+ if (likely51(is_memtag_enabled())) {
return;
}
#endif
@@ -624,7 +642,7 @@ static inline void *allocate_small(unsigned arena, size_t requested_size) {
write_after_free_check(p, size - canary_size);
set_canary(metadata, p, size);
#ifdef HAS_ARM_MTE
- if (likely(is_memtag_enabled())) {
+ if (likely51(is_memtag_enabled())) {
p = tag_and_clear_slab_slot(metadata, p, slot, size);
}
#endif
@@ -661,7 +679,7 @@ static inline void *allocate_small(unsigned arena, size_t requested_size) {
if (requested_size) {
set_canary(metadata, p, size);
#ifdef HAS_ARM_MTE
- if (likely(is_memtag_enabled())) {
+ if (likely51(is_memtag_enabled())) {
p = tag_and_clear_slab_slot(metadata, p, slot, size);
}
#endif
@@ -688,7 +706,7 @@ static inline void *allocate_small(unsigned arena, size_t requested_size) {
if (requested_size) {
set_canary(metadata, p, size);
#ifdef HAS_ARM_MTE
- if (likely(is_memtag_enabled())) {
+ if (likely51(is_memtag_enabled())) {
p = tag_and_clear_slab_slot(metadata, p, slot, size);
}
#endif
@@ -717,7 +735,7 @@ static inline void *allocate_small(unsigned arena, size_t requested_size) {
write_after_free_check(p, size - canary_size);
set_canary(metadata, p, size);
#ifdef HAS_ARM_MTE
- if (likely(is_memtag_enabled())) {
+ if (likely51(is_memtag_enabled())) {
p = tag_and_clear_slab_slot(metadata, p, slot, size);
}
#endif
@@ -805,7 +823,7 @@ static inline void deallocate_small(void *p, const size_t *expected_size) {
bool skip_zero = false;
#ifdef HAS_ARM_MTE
- if (likely(is_memtag_enabled())) {
+ if (likely51(is_memtag_enabled())) {
arm_mte_tag_and_clear_mem(set_pointer_tag(p, RESERVED_TAG), size);
// metadata->arm_mte_tags is intentionally not updated, see tag_and_clear_slab_slot()
skip_zero = true;
@@ -890,7 +908,7 @@ static inline void deallocate_small(void *p, const size_t *expected_size) {
if (c->empty_slabs_total + slab_size > max_empty_slabs_total) {
int saved_errno = errno;
- if (!memory_map_fixed(slab, slab_size)) {
+ if (!memory_map_fixed_tagged(slab, slab_size)) {
label_slab(slab, slab_size, class);
stats_slab_deallocate(c, slab_size);
enqueue_free_slab(c, metadata);
@@ -1242,15 +1260,7 @@ COLD static void init_slow_path(void) {
if (unlikely(memory_protect_rw_metadata(ra->regions, ra->total * sizeof(struct region_metadata)))) {
fatal_error("failed to unprotect memory for regions table");
}
- #ifdef HAS_ARM_MTE
- if (likely(is_memtag_enabled())) {
- ro.slab_region_start = memory_map_mte(slab_region_size);
- } else {
- ro.slab_region_start = memory_map(slab_region_size);
- }
- #else
- ro.slab_region_start = memory_map(slab_region_size);
- #endif
+ ro.slab_region_start = memory_map_tagged(slab_region_size);
if (unlikely(ro.slab_region_start == NULL)) {
fatal_error("failed to allocate slab region");
}
@@ -1895,7 +1905,7 @@ EXPORT int h_malloc_trim(UNUSED size_t pad) {
struct slab_metadata *iterator = c->empty_slabs;
while (iterator) {
void *slab = get_slab(c, slab_size, iterator);
- if (memory_map_fixed(slab, slab_size)) {
+ if (memory_map_fixed_tagged(slab, slab_size)) {
break;
}
label_slab(slab, slab_size, class);


@@ -17,8 +17,8 @@
#include "memory.h"
#include "util.h"
- void *memory_map(size_t size) {
+ static void *memory_map_prot(size_t size, int prot) {
- void *p = mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
+ void *p = mmap(NULL, size, prot, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
if (unlikely(p == MAP_FAILED)) {
if (errno != ENOMEM) {
fatal_error("non-ENOMEM mmap failure");
@@ -28,22 +28,19 @@ void *memory_map(size_t size) {
return p;
}
+ void *memory_map(size_t size) {
+ return memory_map_prot(size, PROT_NONE);
+ }
#ifdef HAS_ARM_MTE
// Note that PROT_MTE can't be cleared via mprotect
void *memory_map_mte(size_t size) {
- void *p = mmap(NULL, size, PROT_MTE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
- if (unlikely(p == MAP_FAILED)) {
- if (errno != ENOMEM) {
- fatal_error("non-ENOMEM MTE mmap failure");
- }
- return NULL;
- }
- return p;
+ return memory_map_prot(size, PROT_MTE);
}
#endif
- bool memory_map_fixed(void *ptr, size_t size) {
+ static bool memory_map_fixed_prot(void *ptr, size_t size, int prot) {
- void *p = mmap(ptr, size, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, -1, 0);
+ void *p = mmap(ptr, size, prot, MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, -1, 0);
bool ret = p == MAP_FAILED;
if (unlikely(ret) && errno != ENOMEM) {
fatal_error("non-ENOMEM MAP_FIXED mmap failure");
@@ -51,6 +48,17 @@ bool memory_map_fixed(void *ptr, size_t size) {
return ret;
}
+ bool memory_map_fixed(void *ptr, size_t size) {
+ return memory_map_fixed_prot(ptr, size, PROT_NONE);
+ }
+ #ifdef HAS_ARM_MTE
+ // Note that PROT_MTE can't be cleared via mprotect
+ bool memory_map_fixed_mte(void *ptr, size_t size) {
+ return memory_map_fixed_prot(ptr, size, PROT_MTE);
+ }
+ #endif
bool memory_unmap(void *ptr, size_t size) {
bool ret = munmap(ptr, size);
if (unlikely(ret) && errno != ENOMEM) {


@@ -15,6 +15,9 @@ void *memory_map(size_t size);
void *memory_map_mte(size_t size);
#endif
bool memory_map_fixed(void *ptr, size_t size);
+ #ifdef HAS_ARM_MTE
+ bool memory_map_fixed_mte(void *ptr, size_t size);
+ #endif
bool memory_unmap(void *ptr, size_t size);
bool memory_protect_ro(void *ptr, size_t size);
bool memory_protect_rw(void *ptr, size_t size);


@@ -98,7 +98,7 @@ class TestSimpleMemoryCorruption(unittest.TestCase):
self.assertEqual(stderr.decode("utf-8"),
"fatal allocator error: invalid free\n")
- def test_invalid_malloc_usable_size_small_quarantene(self):
+ def test_invalid_malloc_usable_size_small_quarantine(self):
_stdout, stderr, returncode = self.run_test(
"invalid_malloc_usable_size_small_quarantine")
self.assertEqual(returncode, -6)


@@ -11,9 +11,11 @@
#ifndef LIBDIVIDE_H
#define LIBDIVIDE_H
- #define LIBDIVIDE_VERSION "5.1"
+ // *** Version numbers are auto generated - do not edit ***
+ #define LIBDIVIDE_VERSION "5.2.0"
#define LIBDIVIDE_VERSION_MAJOR 5
- #define LIBDIVIDE_VERSION_MINOR 1
+ #define LIBDIVIDE_VERSION_MINOR 2
+ #define LIBDIVIDE_VERSION_PATCH 0
#include <stdint.h>
@@ -34,8 +36,15 @@
#include <arm_neon.h>
#endif
+ // Clang-cl prior to Visual Studio 2022 doesn't include __umulh/__mulh intrinsics
+ #if defined(_MSC_VER) && defined(LIBDIVIDE_X86_64) && (!defined(__clang__) || _MSC_VER>1930)
+ #define LIBDIVIDE_X64_INTRINSICS
+ #endif
#if defined(_MSC_VER)
+ #if defined(LIBDIVIDE_X64_INTRINSICS)
#include <intrin.h>
+ #endif
#pragma warning(push)
// disable warning C4146: unary minus operator applied
// to unsigned type, result still unsigned
@@ -238,18 +247,28 @@ static LIBDIVIDE_INLINE struct libdivide_u32_branchfree_t libdivide_u32_branchfr
static LIBDIVIDE_INLINE struct libdivide_s64_branchfree_t libdivide_s64_branchfree_gen(int64_t d);
static LIBDIVIDE_INLINE struct libdivide_u64_branchfree_t libdivide_u64_branchfree_gen(uint64_t d);
- static LIBDIVIDE_INLINE int16_t libdivide_s16_do_raw(int16_t numer, int16_t magic, uint8_t more);
+ static LIBDIVIDE_INLINE int16_t libdivide_s16_do_raw(
+ int16_t numer, int16_t magic, uint8_t more);
static LIBDIVIDE_INLINE int16_t libdivide_s16_do(
int16_t numer, const struct libdivide_s16_t *denom);
- static LIBDIVIDE_INLINE uint16_t libdivide_u16_do_raw(uint16_t numer, uint16_t magic, uint8_t more);
+ static LIBDIVIDE_INLINE uint16_t libdivide_u16_do_raw(
+ uint16_t numer, uint16_t magic, uint8_t more);
static LIBDIVIDE_INLINE uint16_t libdivide_u16_do(
uint16_t numer, const struct libdivide_u16_t *denom);
+ static LIBDIVIDE_INLINE int32_t libdivide_s32_do_raw(
+ int32_t numer, int32_t magic, uint8_t more);
static LIBDIVIDE_INLINE int32_t libdivide_s32_do(
int32_t numer, const struct libdivide_s32_t *denom);
+ static LIBDIVIDE_INLINE uint32_t libdivide_u32_do_raw(
+ uint32_t numer, uint32_t magic, uint8_t more);
static LIBDIVIDE_INLINE uint32_t libdivide_u32_do(
uint32_t numer, const struct libdivide_u32_t *denom);
+ static LIBDIVIDE_INLINE int64_t libdivide_s64_do_raw(
+ int64_t numer, int64_t magic, uint8_t more);
static LIBDIVIDE_INLINE int64_t libdivide_s64_do(
int64_t numer, const struct libdivide_s64_t *denom);
+ static LIBDIVIDE_INLINE uint64_t libdivide_u64_do_raw(
+ uint64_t numer, uint64_t magic, uint8_t more);
static LIBDIVIDE_INLINE uint64_t libdivide_u64_do(
uint64_t numer, const struct libdivide_u64_t *denom);
@@ -315,7 +334,7 @@ static LIBDIVIDE_INLINE int32_t libdivide_mullhi_s32(int32_t x, int32_t y) {
}
static LIBDIVIDE_INLINE uint64_t libdivide_mullhi_u64(uint64_t x, uint64_t y) {
- #if defined(LIBDIVIDE_VC) && defined(LIBDIVIDE_X86_64)
+ #if defined(LIBDIVIDE_X64_INTRINSICS)
return __umulh(x, y);
#elif defined(HAS_INT128_T)
__uint128_t xl = x, yl = y;
@@ -341,7 +360,7 @@ static LIBDIVIDE_INLINE uint64_t libdivide_mullhi_u64(uint64_t x, uint64_t y) {
}
static LIBDIVIDE_INLINE int64_t libdivide_mullhi_s64(int64_t x, int64_t y) {
- #if defined(LIBDIVIDE_VC) && defined(LIBDIVIDE_X86_64)
+ #if defined(LIBDIVIDE_X64_INTRINSICS)
return __mulh(x, y);
#elif defined(HAS_INT128_T)
__int128_t xl = x, yl = y;
@@ -914,12 +933,11 @@ struct libdivide_u32_branchfree_t libdivide_u32_branchfree_gen(uint32_t d) {
return ret;
}
- uint32_t libdivide_u32_do(uint32_t numer, const struct libdivide_u32_t *denom) {
+ uint32_t libdivide_u32_do_raw(uint32_t numer, uint32_t magic, uint8_t more) {
- uint8_t more = denom->more;
- if (!denom->magic) {
+ if (!magic) {
return numer >> more;
} else {
- uint32_t q = libdivide_mullhi_u32(denom->magic, numer);
+ uint32_t q = libdivide_mullhi_u32(magic, numer);
if (more & LIBDIVIDE_ADD_MARKER) {
uint32_t t = ((numer - q) >> 1) + q;
return t >> (more & LIBDIVIDE_32_SHIFT_MASK);
@@ -931,6 +949,10 @@ uint32_t libdivide_u32_do(uint32_t numer, const struct libdivide_u32_t *denom) {
}
}
+ uint32_t libdivide_u32_do(uint32_t numer, const struct libdivide_u32_t *denom) {
+ return libdivide_u32_do_raw(numer, denom->magic, denom->more);
+ }
uint32_t libdivide_u32_branchfree_do(
uint32_t numer, const struct libdivide_u32_branchfree_t *denom) {
uint32_t q = libdivide_mullhi_u32(denom->magic, numer);
@@ -1074,12 +1096,11 @@ struct libdivide_u64_branchfree_t libdivide_u64_branchfree_gen(uint64_t d) {
return ret;
}
- uint64_t libdivide_u64_do(uint64_t numer, const struct libdivide_u64_t *denom) {
+ uint64_t libdivide_u64_do_raw(uint64_t numer, uint64_t magic, uint8_t more) {
- uint8_t more = denom->more;
- if (!denom->magic) {
+ if (!magic) {
return numer >> more;
} else {
- uint64_t q = libdivide_mullhi_u64(denom->magic, numer);
+ uint64_t q = libdivide_mullhi_u64(magic, numer);
if (more & LIBDIVIDE_ADD_MARKER) {
uint64_t t = ((numer - q) >> 1) + q;
return t >> (more & LIBDIVIDE_64_SHIFT_MASK);
@@ -1091,6 +1112,10 @@ uint64_t libdivide_u64_do(uint64_t numer, const struct libdivide_u64_t *denom) {
}
}
+ uint64_t libdivide_u64_do(uint64_t numer, const struct libdivide_u64_t *denom) {
+ return libdivide_u64_do_raw(numer, denom->magic, denom->more);
+ }
uint64_t libdivide_u64_branchfree_do(
uint64_t numer, const struct libdivide_u64_branchfree_t *denom) {
uint64_t q = libdivide_mullhi_u64(denom->magic, numer);
@@ -1430,11 +1455,10 @@ struct libdivide_s32_branchfree_t libdivide_s32_branchfree_gen(int32_t d) {
return result;
}
- int32_t libdivide_s32_do(int32_t numer, const struct libdivide_s32_t *denom) {
+ int32_t libdivide_s32_do_raw(int32_t numer, int32_t magic, uint8_t more) {
- uint8_t more = denom->more;
uint8_t shift = more & LIBDIVIDE_32_SHIFT_MASK;
- if (!denom->magic) {
+ if (!magic) {
uint32_t sign = (int8_t)more >> 7;
uint32_t mask = ((uint32_t)1 << shift) - 1;
uint32_t uq = numer + ((numer >> 31) & mask);
@@ -1443,7 +1467,7 @@ int32_t libdivide_s32_do(int32_t numer, const struct libdivide_s32_t *denom) {
q = (q ^ sign) - sign;
return q;
} else {
- uint32_t uq = (uint32_t)libdivide_mullhi_s32(denom->magic, numer);
+ uint32_t uq = (uint32_t)libdivide_mullhi_s32(magic, numer);
if (more & LIBDIVIDE_ADD_MARKER) {
// must be arithmetic shift and then sign extend
int32_t sign = (int8_t)more >> 7;
@@ -1458,6 +1482,10 @@ int32_t libdivide_s32_do(int32_t numer, const struct libdivide_s32_t *denom) {
}
}
+ int32_t libdivide_s32_do(int32_t numer, const struct libdivide_s32_t *denom) {
+ return libdivide_s32_do_raw(numer, denom->magic, denom->more);
+ }
int32_t libdivide_s32_branchfree_do(int32_t numer, const struct libdivide_s32_branchfree_t *denom) {
uint8_t more = denom->more;
uint8_t shift = more & LIBDIVIDE_32_SHIFT_MASK;
@@ -1599,11 +1627,10 @@ struct libdivide_s64_branchfree_t libdivide_s64_branchfree_gen(int64_t d) {
return ret;
}
- int64_t libdivide_s64_do(int64_t numer, const struct libdivide_s64_t *denom) {
+ int64_t libdivide_s64_do_raw(int64_t numer, int64_t magic, uint8_t more) {
- uint8_t more = denom->more;
uint8_t shift = more & LIBDIVIDE_64_SHIFT_MASK;
- if (!denom->magic) { // shift path
+ if (!magic) { // shift path
uint64_t mask = ((uint64_t)1 << shift) - 1;
uint64_t uq = numer + ((numer >> 63) & mask);
int64_t q = (int64_t)uq;
@@ -1613,7 +1640,7 @@ int64_t libdivide_s64_do(int64_t numer, const struct libdivide_s64_t *denom) {
q = (q ^ sign) - sign;
return q;
} else {
- uint64_t uq = (uint64_t)libdivide_mullhi_s64(denom->magic, numer);
+ uint64_t uq = (uint64_t)libdivide_mullhi_s64(magic, numer);
if (more & LIBDIVIDE_ADD_MARKER) {
// must be arithmetic shift and then sign extend
int64_t sign = (int8_t)more >> 7;
@@ -1628,6 +1655,10 @@ int64_t libdivide_s64_do(int64_t numer, const struct libdivide_s64_t *denom) {
}
}
+ int64_t libdivide_s64_do(int64_t numer, const struct libdivide_s64_t *denom) {
+ return libdivide_s64_do_raw(numer, denom->magic, denom->more);
+ }
int64_t libdivide_s64_branchfree_do(int64_t numer, const struct libdivide_s64_branchfree_t *denom) {
uint8_t more = denom->more;
uint8_t shift = more & LIBDIVIDE_64_SHIFT_MASK;
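A hedged usage sketch of the do_raw() split shown above (the function and field names come from the hunks in this comparison; the divisor 7, the helper name, and the include path are illustrative assumptions): the existing *_do() entry points keep their interface and now forward to *_do_raw(), which lets a caller that stores the magic/more pair outside a struct perform the same division.

#include <assert.h>
#include <stdint.h>
#include "libdivide.h" // include path is an assumption; this repo vendors the header

static uint32_t divide_both_ways(uint32_t numer) {
    // Precompute the magic/more pair for dividing by 7.
    struct libdivide_u32_t d = libdivide_u32_gen(7);
    // Wrapper path: reads d.magic and d.more internally.
    uint32_t via_struct = libdivide_u32_do(numer, &d);
    // Raw path: the caller passes the fields directly, no struct needed at the call site.
    uint32_t via_raw = libdivide_u32_do_raw(numer, d.magic, d.more);
    assert(via_struct == via_raw && via_raw == numer / 7);
    return via_raw;
}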

util.h

@@ -9,7 +9,9 @@
#define noreturn __attribute__((noreturn))
#define likely(x) __builtin_expect(!!(x), 1)
+ #define likely51(x) __builtin_expect_with_probability(!!(x), 1, 0.51)
#define unlikely(x) __builtin_expect(!!(x), 0)
+ #define unlikely51(x) __builtin_expect_with_probability(!!(x), 0, 0.51)
#define min(x, y) ({ \
__typeof__(x) _x = (x); \
@@ -30,6 +32,13 @@
#define STRINGIFY(s) #s
#define ALIAS(f) __attribute__((alias(STRINGIFY(f))))
+ // supported since GCC 15
+ #if __has_attribute (nonstring)
+ # define NONSTRING __attribute__ ((nonstring))
+ #else
+ # define NONSTRING
+ #endif
typedef uint8_t u8;
typedef uint16_t u16;
typedef uint32_t u32;
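A hedged sketch of what the new likely51()/unlikely51() hints express (the macro bodies are copied from the util.h hunk above; the example function and its name are illustrative): __builtin_expect_with_probability() is the GCC/Clang builtin behind them, and the 0.51 probability tells the compiler the condition is only barely biased, a much weaker claim than the near-certain bias implied by plain likely()/unlikely(). The h_malloc.c hunks above switch the is_memtag_enabled() checks to these weaker hints.

// Macro bodies copied from the util.h hunk above.
#define likely51(x) __builtin_expect_with_probability(!!(x), 1, 0.51)
#define unlikely51(x) __builtin_expect_with_probability(!!(x), 0, 0.51)

// Illustrative use: a flag that is roughly a coin toss across deployments
// (for example, whether memory tagging is enabled on a given device) should
// not push the other branch far out of line, so the weak 0.51 hint is used
// instead of the strong likely()/unlikely() hints.
static int pick_path(int flag) {
    if (likely51(flag)) {
        return 1; // slightly favored branch
    }
    return 0; // still expected to be taken almost half the time
}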