By Dale Weiler
Linux binary compatibility is plagued by one thing that is often overlooked when evaluating shipping software on Linux. This article will deconstruct how to arrive at that conclusion, how to address it when shipping software today, and what needs to be done to actually fix it.
At JangaFX, we make several products that run natively on Linux. We love the flexibility and power that Linux offers our developers, but shipping software on it is a whole different challenge.
Linux is an incredibly powerful platform, but when it comes to shipping software, it can feel like a minefield. Unlike other operating systems, Linux isn’t just one system—it’s a chaotic mix of different services, libraries, and even philosophies. Every distribution does things a little differently, and that means the same executable that works flawlessly on one system might completely break on another.
This shouldn’t even be a problem. The Linux kernel itself has maintained relatively stable system calls. But everything built on top of it changes constantly in ways that break compatibility, making it incredibly frustrating to ship software that "just works." If you’re developing for Linux, you’re not targeting a single platform—you’re navigating an ecosystem that has evolved without much concern for binary compatibility.
Some of us, coming from the game industry before moving into VFX, have dealt with this problem before. Shipping games on Linux has always been a nightmare, and the same issues persist regardless of industry. In this article, we're going to explain why we think containers are the wrong approach, how we build and ship Linux software in a way that actually works, what we think is responsible for Linux's binary compatibility problem, and what needs to change to fix it.
Tools like Flatpak, AppImage, and similar solutions attempt to simplify shipping executables by creating "containers"—or, as we've recently taken to calling them, "a Linux Environment inside a Linux". Using Linux features like namespaces and chroots, these solutions package an entire Linux environment, complete with all required dependencies, into a single self-contained bundle. In extreme cases, this means shipping an entire Linux user-space just for one application.
One of the major challenges with these containerized solutions is that they often don’t work well with applications that need to interact with the rest of the system. To access hardware-accelerated APIs like OpenGL, Vulkan, VDPAU or CUDA, an application must dynamically link against the system's graphics driver libraries. Since these libraries exist outside the container and cannot be shipped with the application, various "pass-through" techniques have been developed to work around this, some of which introduce runtime overhead (e.g., shimming libraries). Because containerized applications are isolated from the system, they often feel isolated too. This creates consistency issues, where the application may not recognize the user’s name, home directory, system settings, desktop environment preferences, or even have proper access to the filesystem.
To work around these limitations, many containerized environments rely on the XDG Desktop Portal protocol, which introduces yet another layer of complexity. This system requires IPC (inter-process communication) through DBus just to grant applications access to basic system features like file selection, opening URLs, or reading system settings—problems that wouldn’t exist if the application weren’t artificially sandboxed in the first place.
We don’t believe that piling on more layers is an acceptable solution. As engineers, we need to stop and ask ourselves: "should we keep adding to this tower of Babel?", or is it time to peel back some of these abstractions and reevaluate them? At some point, the right solution isn’t more complexity—it’s less.
While containerized solutions can work under certain conditions, we believe that shipping lean, native executables—without containers—provides a more seamless and integrated experience that better aligns with user expectations.
When you compile your application, it links against the specific library versions present on the build machine. This means that by default, the versions on a user's system may not match, causing compatibility issues. Let’s assume the user has all the necessary libraries installed, but the versions don’t match what your application was built against. This is where the real problem begins. Short of shipping the exact machine used to deploy your application, how do you ensure compatibility with the versions installed on a user’s system?
We believe there are two ways to solve this problem, and we've given them our own names:

- The Replication Approach: ship the library versions your application was built against alongside the application itself.
- The Relaxation Approach: build against versions of libraries old enough to be compatible with nearly any system a user might run.

The first approach works well in cases where the necessary libraries may not exist on a user's machine, but it fails for libraries that cannot be shipped (we call these "system libraries"). The second approach is particularly effective for system libraries and is the approach we use at JangaFX.
There are various libraries present on a Linux machine that cannot be shipped because they are system libraries. These are libraries tied to the system itself and cannot be provided in a container. Typically, these include user-space drivers for the GPU, enterprise-installed security components, and, of course, libc itself.
If you’ve ever tried to distribute Linux binaries, you may have encountered an error message like this:
/lib64/libc.so.6: version `GLIBC_2.18' not found
For those unaware, glibc (GNU C Library) provides the C standard library, POSIX APIs, and the dynamic linker responsible for loading shared libraries (including glibc itself).
GLIBC is an example of a "system library" that cannot be bundled with your application because it includes the dynamic linker itself. This linker is responsible for loading other libraries, some of which may also depend on GLIBC—but not always. Complicating matters further, since GLIBC is a dynamic library, it must also load itself. This self-referential, chicken-and-egg problem highlights GLIBC’s complexity and monolithic design, as it attempts to fulfill multiple roles simultaneously. A large downside to this monolithic design is that upgrading GLIBC often requires upgrading the entire system. Later in this article, we will explain why this structure needs to change to truly solve Linux’s binary compatibility problem.
Before you suggest statically linking GLIBC—that’s not an option. GLIBC relies on dynamic linking for features like NSS modules, which handle hostname resolution, user authentication, and network configuration, among other dynamically loaded components. Static linking breaks this because it does not include the dynamic linker, which is why GLIBC does not officially support it. Even if you managed to statically link GLIBC—or used an alternative like musl—your application would be unable to load any dynamic libraries at runtime. Static linking the dynamic linker itself is not possible, for reasons that will be explained later. In short, this would prevent your application from dynamically linking against any system libraries at all.
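To make this concrete, here is a small sketch of why full static linking falls apart in practice: something as ordinary as resolving a hostname goes through NSS, and glibc loads the relevant NSS modules (such as libnss_files.so or libnss_dns.so) with dlopen at runtime.

```c
// Sketch: hostname resolution with getaddrinfo(). Under glibc this consults
// NSS, which dlopen()s plugins like libnss_files.so / libnss_dns.so at
// runtime. A fully statically linked binary cannot load those plugins the
// same way, which is one reason glibc does not support static linking.
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>

int main(void) {
    struct addrinfo hints = {0}, *res = NULL;
    hints.ai_family   = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    int rc = getaddrinfo("example.com", "80", &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return 1;
    }
    freeaddrinfo(res);
    puts("resolved");
    return 0;
}
```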
Since our application relies on many non-system libraries that may not be installed on the user’s system, we need a way to include them. The most straightforward approach is the Replication Approach, where we ship these libraries alongside our application. However, this negates the benefits of dynamic linking, such as shared memory usage and system-wide updates. In such cases, statically linking these libraries into the application is a better choice, as it eliminates dependency issues entirely. It also enables additional optimizations, such as LTO, and results in a smaller package by stripping unused components from the included libraries.
Instead, we take a different approach: statically linking everything we can. When doing so, special care is needed if a dependency embeds another dependency within its static library. We've encountered static libraries that include object files from other static libraries (e.g., libcurl), but we still need to link them separately. This duplication is conveniently avoided with dynamic libraries, but with static libraries, you may need to extract all object files from the archive and remove the embedded ones manually. Similarly, compiler runtimes like libgcc default to dynamic linking. We recommend using -static-libgcc.
Finally, when it comes to dealing with system libraries, we use the Relaxation Approach. Rather than requiring exact or newer versions of system libraries, we link against versions that are old enough to be nearly universally compatible. This increases the likelihood that the user’s system libraries will work with our application, reducing dependency issues without the need for containerization or bundling system components and shims.
The method we suggest when linking against older system libraries is to obtain a corresponding older Linux environment. You don’t need to install an old Linux version on physical hardware or even set up a full virtual machine. Instead, a chroot provides a lightweight, isolated environment within an existing Linux installation, allowing you to build against an older system without the overhead of full virtualization. Ironically, this suggests that containers were the right solution all along—just not at runtime, but at build time.
To achieve this, we use debootstrap, an excellent script that creates a minimal Debian installation from scratch. Debian is particularly suited for this approach due to its stability and long-term support for older releases, making it a great choice for ensuring compatibility with older system libraries.
Of course, once you have an older Linux setup, you may find that its binary package toolchains are too outdated to build your software. To address this, we compile a modern LLVM toolchain from source and use it to build both our dependencies and our software. The details of this process are beyond the scope of this article.
Finally, we automate the entire debootstrap process with a Python script, which we've included here for reference.
```python
#!/bin/env python3
import os, subprocess, shutil, multiprocessing

PACKAGES = [ 'build-essential' ]

DEBOOTSTRAP = 'https://salsa.debian.org/installer-team/debootstrap.git'
ARCHIVE = 'http://archive.debian.org/debian'
VERSION = 'jessie' # Released in 2015

def chroot(pipe):
    try:
        os.chroot('chroot')
        os.chdir('/')

        # Setup an environment for the chroot
        env = {
            'HOME': '/root',
            'TERM': 'xterm',
            'PATH': '/bin:/usr/bin:/sbin:/usr/sbin'
        }

        # The Debian is going to be quite old and so the keyring keys will likely be
        # expired. To work around this we will replace the sources.list to contain
        # '[trusted=yes]'
        with open('/etc/apt/sources.list', 'w') as fp:
            fp.write(f'deb [trusted=yes] http://archive.debian.org/debian {VERSION} main\n')

        # Update and install packages
        subprocess.run(['apt', 'update'], env=env)
        subprocess.run(['apt', 'install', '-y', *PACKAGES], env=env)

        #
        # Script your Linux here, remember to pass `env=env` to subprocess.run.
        #
        # We suggest downloading GCC 7.4.0, compiling from source, and installing
        # it since it's the minimum version required to compile the latest LLVM from
        # source. We then suggest downloading, compiling from source, and installing
        # the latest LLVM, which as of time of writing is 20.1.0.
        #
        # You can then compile and install all other source packages your software
        # requires from source using this modern LLVM toolchain.
        #
        # You can also enter the chroot with an interactive shell from this script
        # by uncommenting the following and running this script as usual.
        # subprocess.run(['bash'])
        #
        # You can send messages to the parent with pipe.send()

        pipe.send('Done') # This one has special meaning in main
    except Exception as exception:
        pipe.send(exception)

def main():
    # We need to run as root to use 'mount', 'umount', and 'chroot'
    if os.geteuid() != 0:
        print('Script must be run as root')
        return False

    with multiprocessing.Manager() as manager:
        mounts = manager.list()
        pipe = multiprocessing.Pipe()

        def mount(parts):
            subprocess.run(['mount', *parts])
            mounts.append(parts[-1])

        # Ensure we have a fresh chroot and clone of debootstrap
        shutil.rmtree('chroot', ignore_errors=True)
        shutil.rmtree('debootstrap', ignore_errors=True)
        os.mkdir('chroot')

        # Clone debootstrap and use it to build the minimal Debian system
        subprocess.run(['git', 'clone', DEBOOTSTRAP])
        subprocess.run(['./debootstrap', '--arch', 'amd64', VERSION, '../chroot', ARCHIVE],
                       env={**os.environ, 'DEBOOTSTRAP_DIR': '.'},
                       cwd='debootstrap')

        # Mount nodes needed for the chroot
        mount(['-t', 'proc', '/proc', 'chroot/proc'])
        mount(['--rbind', '/sys', 'chroot/sys'])
        mount(['--make-rslave', 'chroot/sys'])
        mount(['--rbind', '/dev', 'chroot/dev'])
        mount(['--make-rslave', 'chroot/dev'])

        # Setup the chroot in a separate process
        process = multiprocessing.Process(target=chroot, args=(pipe[1],))
        process.start()
        try:
            while True:
                data = pipe[0].recv()
                if isinstance(data, Exception):
                    raise data
                else:
                    print(data)
                if data == 'Done':
                    break
        finally:
            process.join()
            for umount in reversed(list(set(mounts))):
                subprocess.run(['umount', '-R', umount])
            subprocess.run(['sync'])

if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        print('Cancelled')
```
Generally, most applications do not link directly against system libraries, and instead load whichever is present on the user's machine at runtime. So while these libraries are considered system components, they typically have few system dependencies beyond libc itself. This is what makes libc—specifically GLIBC—the real source of compatibility issues: it's essentially the only system component directly linked against.
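As a minimal sketch of what that looks like in practice (the Vulkan loader is just an example here), an application typically picks up whatever the system provides at runtime rather than linking against it at build time:

```c
// Minimal sketch: load the system's Vulkan loader at runtime instead of
// linking against it at build time. Error handling kept deliberately short.
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    // Whatever libvulkan.so.1 the user's system provides is the one we get.
    void *vulkan = dlopen("libvulkan.so.1", RTLD_NOW | RTLD_LOCAL);
    if (!vulkan) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }

    // Resolve an entry point by name; the rest of the API is reached from it.
    void *proc = dlsym(vulkan, "vkGetInstanceProcAddr");
    printf("vkGetInstanceProcAddr = %p\n", proc);

    dlclose(vulkan);
    return 0;
}
```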
In just the past two years, our team has encountered three separate GLIBC-specific compatibility issues, each directly impacting our products:
From our perspective, the core issue with GLIBC is that it tries to do far too much. It’s a massive, monolithic system that handles everything—from system calls and memory management to threading and even the dynamic linker. This tight coupling is why upgrading GLIBC often means upgrading the entire system; everything is intertwined. If it were broken into smaller, more focused components, users could update only the parts that change, rather than dragging their whole system along with it.
More importantly, separating the dynamic linker from the C library itself would allow multiple versions of libc to coexist, eliminating a major source of compatibility issues. This is exactly how Windows handles it, which is one of the reasons Windows maintains such strong binary compatibility. You can still run decades-old Windows software today because Microsoft doesn’t force everything to be tied to a single, ever-changing libc.
Of course, this isn’t as simple as just splitting things apart. GLIBC has deep cross-cutting concerns, particularly with threading, TLS (Thread-Local Storage), and global memory management.
For example, even if you managed to get two versions of GLIBC to coexist, returning allocated memory from one and attempting to free it in another would likely lead to serious issues. Their heaps would be unaware of each other, potentially using different allocation strategies, causing unpredictable failures. To avoid this, even the heap would likely need to be separated into its own dedicated libheap.
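A minimal sketch of that hazard, assuming a hypothetical setup in which the shared library and the application end up bound to two different libc/heap implementations:

```c
#include <stdlib.h>
#include <string.h>

// Imagine this function lives in a shared library bound to "libc A".
char *make_greeting(void) {
    char *s = malloc(32);          // allocated from heap A
    if (s) strcpy(s, "hello");
    return s;
}

// And the application is bound to "libc B".
int main(void) {
    char *s = make_greeting();     // pointer came from heap A
    free(s);                       // handed to heap B: the two heaps know
                                   // nothing about each other's allocations
    return 0;
}
```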
We think a better approach would be breaking GLIBC into distinct libraries, something like this:
- libsyscall: A static library linked by libheap, libthread and libc to gain access to shared system call code. Since it's static it's embedded in all three. You can pretend this library otherwise does not exist.
- libdl: Links libsyscall statically. Is a true, free-standing library, depending on nothing. Provided as both a static and dynamic library. When you link against it statically you can still load things with it dynamically. You just end up with a dynamic linker inside your executable.
- libheap: Links libsyscall statically. Provided only as a dynamic library. Cannot ever be linked against statically.
- libthread: Links libheap. Provided only as a dynamic library. Cannot ever be linked against statically.
- libc: Links libthread, and thus libheap, and libdl transitively. Provided as both a static and dynamic library. When linking statically it links libdl statically. The linking of libthread and libheap is always done dynamically though: through the included libdl if linked statically, or through the libdl program loader if linked dynamically.

These libraries would be aware of each other and allow multiple versions to coexist in the same address space. That way, we don't end up with this brittle mess where upgrading GLIBC can break everything. The actual structure would look something like this:
This architecture is quite similar to Windows, where the equivalents of libsyscall, libdl, libheap, and libthread are all bundled within a single kernel32.dll. This DLL is pre-mapped and automatically loaded into the address space of every executable on Windows.
When libc is linked statically:

- The application embeds libc and libdl (it is not the program loader).
- It loads libheap and libthread using embedded libdl.

```
[Application]
      │
      ▼
[libc (static)]
      │
      ▼
[libdl (static)]
  ├── [libheap (dynamic)]
  └── [libthread (dynamic)]
        └── [libheap (dynamic)]
```
- libc and libdl are embedded in the executable, meaning the application itself starts execution.
- libdl dynamically loads libthread and libheap.

When libc is linked dynamically:

- The application is loaded via the program interpreter (libdl).
- libdl (program loader) loads the application and resolves dependencies.
- It then loads libc, libheap, and libthread.

```
[Application (interpreter entry)]
      │
      ▼
[libdl (program loader)]
      │
      ▼
[libc (dynamic)]
  ├── [libheap (dynamic)]
  ├── [libthread (dynamic)]
  │     └── [libheap (dynamic)]
      │
      ▼
[Application (regular entry)]
```
Scenario | libdl (included by) | libc (how it loads) | libthread (via libdl) | libheap (via libdl) |
---|---|---|---|---|
Static libc | Static linking | Static linking | Linked by libdl | Linked by libdl |
Dynamic libc | Program interpreter | Linked by libdl | Linked by libdl | Linked by libdl |
This architecture effectively reduces the binary compatibility problem to two key system libraries: libheap and libthread. These cannot be statically linked because they manage shared resources critical to the entire system.
The reason is straightforward—heap memory must be shared across all components, ensuring compatibility between allocations and deallocations. Similarly, TLS and threading require a unified system-wide approach, as they involve complex initialization and finalization logic, particularly for global constructors and destructors. However, these components are relatively small and stable, meaning they undergo fewer changes that would necessitate version updates.
This is, of course, a non-trivial amount of rearchitecting, which naturally raises the question: why is libc implemented the way it is instead of this alternative approach?
Setting historical reasons aside, attempting to solve this problem quickly becomes difficult the moment you start writing any code using libc. Here is a trivial example of the issues that arise when trying to support multiple versions of libc.
Suppose you have a dynamic library that contains the following C code.
```c
#include <stdio.h>

FILE* open_thing() {
    return fopen("thing.bin", "r");
}
```
And your application links against this library and calls open_thing. Your application would be responsible for calling fclose on the returned FILE*. If your code links against a different version of libc from the version the library links against, then it would be calling the wrong implementation of fclose!
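For illustration, the application side of that boundary might look like this (a sketch; open_thing comes from the library above):

```c
#include <stdio.h>

// Provided by the shared library shown above.
extern FILE* open_thing(void);

int main(void) {
    FILE *fp = open_thing();   // FILE* created by the library's libc
    if (!fp) return 1;
    // ...use the file...
    fclose(fp);                // closed by the application's libc
    return 0;
}
```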
Suppose though, that libc was written in such a way that the FILE* returned always required a version field or a pointer to a vtable containing the implementation of fclose (and other functions), and every version of libc agreed on this so that it could always call the correct one across this ABI boundary. This would solve this compatibility issue, but now let's say your code calls fflush.
```c
// Defined in header <stdio.h>
int fflush(FILE *fp);
```
Except it doesn't flush that file; instead, it passes NULL:

```c
fflush(NULL);
```
If you're not familiar with C's fflush function, passing NULL to it requires flushing all open files (every FILE*). However, in this scenario, it would only flush files seen by the libc version your application is linked against, not those opened by other libc versions (such as the one used by open_thing).
To handle this correctly, each libc version would need a way to enumerate files across all other libc instances, including dynamically loaded ones, ensuring that every file is visited exactly once without forming cycles. This enumeration must also be thread-safe. Additionally, while enumeration is in progress, another libc could be dynamically loaded (e.g., via dlopen) on a separate thread, or a new file could be opened (e.g., a global constructor in a dynamically loaded library calling fopen).
This global list of things owned by libc shows up in multiple places. Take for instance:
```c
// Defined in header <stdlib.h>
int atexit(void (*func)(void));
```
Registers the function pointed to by func to be called on normal program termination (via exit() or returning from main()). The functions will be called in reverse order they were registered, i.e. the function registered last will be executed first.

There is also another variant of this called at_quick_exit.
This implies that somewhere within libc, there must be a list of functions registered via atexit that need to be executed in reverse order. For multiple libc implementations to coexist, any system handling atexit must not only enumerate and call all registered functions, but also establish a total order for how they were inserted across all instances of libc.
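One way to picture that requirement (purely a sketch; none of these names exist in any real libc) is a single process-wide registry that every libc version appends to, so that one total order exists across all of them:

```c
#include <stddef.h>

typedef void (*atexit_fn)(void);

// Hypothetical process-wide registry shared by every loaded libc instance.
typedef struct shared_atexit_entry {
    atexit_fn                    fn;
    unsigned long                sequence;  // total registration order
    struct shared_atexit_entry  *next;
} shared_atexit_entry;

typedef struct shared_atexit_registry {
    shared_atexit_entry *head;            // most recently registered first
    unsigned long        next_sequence;
    // A real design also needs a lock that every libc version agrees on.
} shared_atexit_registry;

// Each libc's atexit() would funnel into something like this instead of
// keeping its own private list.
void shared_atexit_register(shared_atexit_registry *reg,
                            shared_atexit_entry *entry, atexit_fn fn) {
    entry->fn       = fn;
    entry->sequence = reg->next_sequence++;
    entry->next     = reg->head;
    reg->head       = entry;
}

// exit() in any libc would then walk the combined list from head to tail,
// which is reverse registration order across every libc instance.
void shared_atexit_run(shared_atexit_registry *reg) {
    for (shared_atexit_entry *e = reg->head; e != NULL; e = e->next)
        e->fn();
}
```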
Essentially, any resource owned by one libc would need to be sharable and accessible from any other version of libc. This turns out to be quite a lot of things. For the sake of our argument, we've gone through all of the standard C (not POSIX) functions that produce or operate on a resource with an opaque implementation, where careful consideration would be needed.
Header | Function | Resource | Notes |
---|---|---|---|
<fenv.h> | N/A | fexcept_t | Float environment exceptions need to be stable across libc. |
<fenv.h> | * | fexcept_t | Any functions using this type |
<fenv.h> | fegetenv | fenv_t | Float environment needs to be stable across libc. |
<fenv.h> | * | fenv_t | Any functions using this type |
<locale.h> | localeconv | struct lconv | Common initial sequence needs to be stable across libc. |
<math.h> | N/A | int | Math defines need to have a stable set of integer values across libc. |
<setjmp.h> | N/A | jmp_buf | Usually defined by compiler |
<setjmp.h> | * | jmp_buf | Usually defined by compiler intrinsic |
<signal.h> | N/A | int | Signal defines need to have a stable set of integer values across libc. |
<signal.h> | N/A | sig_atomic_t | Stable type across libc |
<stdarg.h> | N/A | va_list | Usually defined by compiler |
<stdarg.h> | va_start | va_list | Usually defined by compiler intrinsic |
<stdarg.h> | * | va_list | Any functions or macros using this type |
<stdatomic.h> | * | _Atomic T | Stable type across libc |
<stdatomic.h> | N/A | int | Atomic defines need to have a stable set of integer values across libc. |
<stdatomic.h> | N/A | typedef | Many typedefs need to have a stable set of types across libc. |
<stddef.h> | N/A | typedef | Many typedefs need to have a stable set of types across libc. |
<stdint.h> | N/A | typedef | Many typedefs need to have a stable set of types across libc. |
<stdint.h> | N/A | int | Many defines need to have a stable set of types across libc. |
<stdio.h> | * | FILE | Many functions (any taking FILE* or returning FILE* ) |
<stdio.h> | N/A | typedef | Many types need to have a stable set of types across libc. |
<stdio.h> | N/A | int | Many defines need to have a stable set of integer values across libc. |
<stdio.h> | N/A | N/A | Locale for string formatting needs to be shared across libc |
<stdio.h> | stderr | N/A | Needs to be a macro that expands to a function call like __stdio(STDERR_FILENO) |
<stdio.h> | stdout | N/A | Needs to be a macro that expands to a function call like __stdio(STDOUT_FILENO) |
<stdio.h> | stdin | N/A | Needs to be a macro that expands to a function call like __stdio(STDIN_FILENO) |
<stdlib.h> | N/A | div_t | Needs to have a stable definition across libc. |
<stdlib.h> | N/A | ldiv_t | Needs to have a stable definition across libc. |
<stdlib.h> | N/A | lldiv_t | Needs to have a stable definition across libc. |
<stdlib.h> | N/A | int | Many defines need to have a stable set of integer values across libc. |
<stdlib.h> | call_once | once_flag | Needs to be stable across libc and also libthread |
<stdlib.h> | rand | N/A | Global PRNG needs to be shared across libc. |
<stdlib.h> | srand | N/A | Global PRNG needs to be shared across libc. |
<stdlib.h> | aligned_alloc | void* | Shared heap |
<stdlib.h> | calloc | void* | Shared heap |
<stdlib.h> | free | void* | Shared heap |
<stdlib.h> | free_sized | void* | Shared heap |
<stdlib.h> | free_aligned_sized | void* | Shared heap |
<stdlib.h> | malloc | void* | Shared heap |
<stdlib.h> | realloc | void* | Shared heap |
<stdlib.h> | atexit | N/A | Global list needs to be shared across libc. |
<stdlib.h> | at_quick_exit | N/A | Global list needs to be shared across libc. |
<string.h> | strcoll | N/A | LC_COLLATE locale needs to be shared across libc |
<threads.h> | N/A | cnd_t | Any opaque method |
<threads.h> | N/A | thrd_t | Any opaque method |
<threads.h> | N/A | tss_t | Any opaque method |
<threads.h> | N/A | mtx_t | Any opaque method |
<threads.h> | * | cnd_t | Many functions using this type |
<threads.h> | * | thrd_t | Many functions using this type |
<threads.h> | * | tss_t | Many functions using this type |
<threads.h> | * | mtx_t | Many functions using this type |
<threads.h> | * | typedef | Many types need to have a stable set of types across libc. |
<threads.h> | N/A | int | Many defines need to have a stable set of integer values across libc. |
<threads.h> | call_once | once_flag | See <stdlib.h> above |
<time.h> | N/A | typedef | Many types need to have a stable set of types across libc. |
<time.h> | N/A | struct tm | Common initial sequence needs to be stable across libc. |
<uchar.h> | N/A | char8_t | Needs to be same across libc |
<uchar.h> | N/A | char16_t | Needs to be same across libc |
<uchar.h> | N/A | char32_t | Needs to be same across libc |
<uchar.h> | * | char8_t | Many functions using this type |
<uchar.h> | * | char16_t | Many functions using this type |
<uchar.h> | * | char32_t | Many functions using this type |
<uchar.h> | N/A | mbstate_t | Any opaque method |
<uchar.h> | * | mbstate_t | Many functions using this type. |
<wchar.h> | * | * | Repeat of <uchar.h> essentially |
<wctype.h> | N/A | wctrans_t | Need to be same across libc |
<wctype.h> | N/A | wctype_t | Need to be same across libc |
<wctype.h> | * | wctrans_t | Many functions using this type |
<wctype.h> | * | wctype_t | Many functions using this type |
Most defines (of constants) and ABI-exposed types (and typedefs) should just be stable across libc implementations for this to work reliably. Since these are baked into executables, there's no real way to modify or change them without breaking things anyway. For the opaque resources (listed as "Any opaque method" above), we propose affixing a pointer to a vtable containing the implementation as the first value in the type; this way, functions operating on it can always recover the correct implementation and dispatch it indirectly through the vtable. Other methods, like using a version field, can also work here.
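As a rough sketch of that idea (the layout and names here are hypothetical, not any real libc's ABI), an opaque type like FILE would begin with a vtable pointer installed by whichever libc created it:

```c
#include <stddef.h>

typedef struct shared_file shared_file;   // stand-in for FILE

// Agreed-upon vtable layout, stable across libc versions.
typedef struct shared_file_vtable {
    size_t (*read)(shared_file *, void *, size_t);
    size_t (*write)(shared_file *, const void *, size_t);
    int    (*flush)(shared_file *);
    int    (*close)(shared_file *);
} shared_file_vtable;

struct shared_file {
    const shared_file_vtable *vtable;  // always the first field
    // ...implementation-specific state owned by the creating libc...
};

// Any libc version can operate on a file it did not create by dispatching
// through the vtable installed by the libc that did create it.
int shared_fclose(shared_file *fp) {
    return fp->vtable->close(fp);
}
```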
Regardless, certain aspects of libc introduce complexity, particularly global and thread-local elements like errno and locale. However, with a well-designed architecture, these challenges can be effectively addressed.
Memory allocation functions—calloc, malloc, aligned_alloc, realloc, and free—from <stdlib.h> pose another difficulty. Since they can return any pointer, tracking them is non-trivial. One possible approach is to store a pointer to a vtable in the allocation header, allowing each allocation to reference its implementation. However, this method incurs significant performance and memory overhead. Instead, we propose centralizing heap management in a dedicated libheap. This would also contain implementations of the POSIX extensions like posix_memalign.
Things get even more interesting when moving from standard C to POSIX, which introduces unique challenges that require libc support. Some of these functionalities might be better split into separate libraries (for example, why is the DNS resolver in libc?). Among these challenges though, setxid stands out.
For those unfamiliar, permissions in POSIX—such as the real, effective, and saved user/group IDs—apply at the process level. However, Linux treats threads as independent processes that share memory, meaning these permissions are managed per thread rather than per process. To comply with POSIX semantics, libc must interrupt every thread, forcing them to execute code that invokes the system call to modify their thread-local permissions. This must be done atomically, without failure, and while remaining async-signal safe. This is a nightmare to implement and has proven to be a major challenge. More importantly, getting it right is crucial for security.
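A heavily simplified sketch of that mechanism (hypothetical names, and glossing over the synchronization described above): the calling thread asks every other thread, via a signal, to apply the same set*id system call to itself.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

static volatile uid_t pending_uid;

// Each interrupted thread applies the change to itself; on Linux the kernel
// tracks these IDs per thread, not per process.
static void setxid_handler(int sig) {
    (void)sig;
    syscall(SYS_setuid, pending_uid);
}

// 'threads' stands in for the registry of live threads that a real
// libthread would have to maintain.
int broadcast_setuid(uid_t uid, pthread_t *threads, size_t count) {
    struct sigaction sa;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sa.sa_handler = setxid_handler;
    sigaction(SIGRTMIN, &sa, NULL);   // a real libc reserves a signal for this

    pending_uid = uid;
    for (size_t i = 0; i < count; i++)
        pthread_kill(threads[i], SIGRTMIN);

    // A real implementation must also wait for every thread to acknowledge,
    // block thread creation while this runs, and handle failures atomically,
    // all while remaining async-signal safe.
    syscall(SYS_setuid, uid);         // apply to the calling thread too
    return 0;
}
```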
Ultimately, this means libc must track every thread and provide a way to synchronously execute code across all threads. To address this, we propose consolidating threading, TLS, and the necessary POSIX compliance mechanisms within a single libthread.
There are many additional complexities we’ve glossed over and many alternative ways this can be implemented. The key takeaway is that these issues are solvable—they just require significant architectural changes. This necessitates reevaluating this aspect of the Linux userspace from the ground up with binary compatibility as a core design principle. GLIBC has never seriously attempted this. Until someone decides, "enough is enough, let’s fix this," binary compatibility on Linux will remain an unsolved problem; we strongly believe this is a problem worth solving.