From f4b5bb033dc4430bbd31dcae8a55f988360bcec5 Mon Sep 17 00:00:00 2001 From: Sebastian Pipping Date: Wed, 17 Sep 2025 23:14:02 +0200 Subject: [PATCH] lib: Document and regression-proof absence of integer overflow from expat_realloc Matthew Fernandez (@Smattr) and I teamed up on whether function expat_realloc could be vulnerable to integer overflow in line: mallocedPtr = parser->m_mem.realloc_fcn(mallocedPtr, sizeof(size_t) + size); ^ We ended up with a mathematical proof that, fortunately, the current code already is safe from overflow. The proof uses technique "proof by contradiction". Let's assume, there *was* a risk of integer overflow. For a risk of overflow, these four conditions would all need to be met, together: (1) `SIZE_MAX < sizeof(size_t) + size` or we would not hit an overflow on `size_t`. (2) `size > prevSize` or `expat_malloc` would have already not allocated earlier as `expat_realloc` relies on `expat_malloc` for the initial allocation. (3) `rootParser->m_alloc_tracker.bytesAllocated >= sizeof(size_t) + prevSize` or the previous allocation would be gone already or have bypassed accounting. The code is not thread-safe in general, race conditions are off the table. (4) `rootParser->m_alloc_tracker.bytesAllocated + (size - prevSize) <= SIZE_MAX` or `expat_heap_increase_tolerable` would have returned `false` and the overflow line would not be reached. We encoded this for the Z3 Theorem Prover (https://github.com/Z3Prover/z3) and ended up with this document: $ cat proof_v2.smt2 ; Copyright (c) 2025 Matthew Fernandez ; Copyright (c) 2025 Sebastian Pipping ; Licensed under the MIT license ; (1), (2), (3), (4) form a contradiction ; define `SIZE_MAX` (declare-fun SIZE_MAX () (_ BitVec 64)) (assert (= SIZE_MAX #xffffffffffffffff)) ; define `sizeof(size_t)` (declare-fun sizeof_size_t () (_ BitVec 64)) (assert (= sizeof_size_t #x0000000000000008)) ; claim we have inputs `size`, `prevSize`, and `bytesAllocated` (declare-fun size () (_ BitVec 64)) (declare-fun prevSize () (_ BitVec 64)) (declare-fun bytesAllocated () (_ BitVec 64)) ; assume `SIZE_MAX - sizeof(size_t) < size` (1) (assert (bvult (bvsub SIZE_MAX sizeof_size_t) size)) ; assume `bytesAllocated >= sizeof(size_t) + prevSize` (3) (assert (bvuge bytesAllocated (bvadd sizeof_size_t prevSize))) ; assume `bytesAllocated - prevSize <= SIZE_MAX - size` (4) (assert (bvule (bvsub bytesAllocated prevSize) (bvsub SIZE_MAX size))) ; assume `SIZE_MAX - sizeof(size_t) >= prevSize` (anti-overflow for 3) (assert (bvuge (bvsub SIZE_MAX sizeof_size_t) prevSize)) ; prove we have a contradiction (check-sat) Note that we operate on fixed-size bit vectors here, and hence had to transform the assertions to not allow integer overflow by themselves. Z3 confirms the contradiction, and thus the absence of integer overflow: $ z3 -smt2 -model proof_v2.smt2 unsat Co-authored-by: Matthew Fernandez CVE: CVE-2025-59375 Upstream-Status: Backport [https://github.com/libexpat/libexpat/commit/f4b5bb033dc4430bbd31dcae8a55f988360bcec5] Signed-off-by: Peter Marko --- lib/xmlparse.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/lib/xmlparse.c b/lib/xmlparse.c index de159493..24fd7b97 100644 --- a/lib/xmlparse.c +++ b/lib/xmlparse.c @@ -969,6 +969,10 @@ expat_realloc(XML_Parser parser, void *ptr, size_t size, int sourceLine) { } } + // NOTE: Integer overflow detection has already been done for us + // by expat_heap_increase_tolerable(..) above + assert(SIZE_MAX - sizeof(size_t) >= size); + // Actually allocate mallocedPtr = parser->m_mem.realloc_fcn(mallocedPtr, sizeof(size_t) + size);