By Dale Weiler
Linux binary compatibility is plagued by one thing that is often overlooked when evaluating shipping software on Linux. This article will deconstruct how to arrive at that conclusion, how to address it when shipping software today, and what needs to be done to actually fix it.
At JangaFX, we make several products that run natively on Linux. We love the flexibility and power that Linux offers our developers, but shipping software on it is a whole different challenge.
Linux is an incredibly powerful platform, but when it comes to shipping software, it can feel like a minefield. Unlike other operating systems, Linux isn’t just one system—it’s a chaotic mix of different services, libraries, and even philosophies. Every distribution does things a little differently, and that means the same executable that works flawlessly on one system might completely break on another.
This shouldn’t even be a problem. The Linux kernel itself has maintained relatively stable system calls. But everything built on top of it changes constantly in ways that break compatibility, making it incredibly frustrating to ship software that "just works." If you’re developing for Linux, you’re not targeting a single platform—you’re navigating an ecosystem that has evolved without much concern for binary compatibility.
Some of us, coming from the game industry before moving into VFX, have dealt with this problem before. Shipping games on Linux has always been a nightmare, and the same issues persist regardless of industry. In this article, we're going to explain why we think containers are the wrong approach, how we build and ship Linux software in a way that actually works, what we think is responsible for Linux's binary compatibility problem, and what needs to change to fix it.
Tools like Flatpak, AppImage, and similar solutions attempt to simplify shipping executables by creating "containers"—or, as we've recently taken to calling them, "a Linux Environment inside a Linux". Using Linux features like namespaces and chroots, these solutions package an entire Linux environment, complete with all required dependencies, into a single self-contained bundle. In extreme cases, this means shipping an entire Linux user-space just for one application.
One of the major challenges with these containerized solutions is that they often don’t work well with applications that need to interact with the rest of the system. To access hardware-accelerated APIs like OpenGL, Vulkan, VDPAU or CUDA, an application must dynamically link against the system's graphics driver libraries. Since these libraries exist outside the container and cannot be shipped with the application, various "pass-through" techniques have been developed to work around this, some of which introduce runtime overhead (e.g., shimming libraries). Because containerized applications are isolated from the system, they often feel isolated too. This creates consistency issues, where the application may not recognize the user’s name, home directory, system settings, desktop environment preferences, or even have proper access to the filesystem.
To work around these limitations, many containerized environments rely on the XDG Desktop Portal protocol, which introduces yet another layer of complexity. This system requires IPC (inter-process communication) through DBus just to grant applications access to basic system features like file selection, opening URLs, or reading system settings—problems that wouldn’t exist if the application weren’t artificially sandboxed in the first place.
We don’t believe that piling on more layers is an acceptable solution. As engineers, we need to stop and ask ourselves: "should we keep adding to this tower of Babel?", or is it time to peel back some of these abstractions and reevaluate them? At some point, the right solution isn’t more complexity—it’s less.
While containerized solutions can work under certain conditions, we believe that shipping lean, native executables—without containers—provides a more seamless and integrated experience that better aligns with user expectations.
When you compile your application, it links against the specific library versions present on the build machine. This means that by default, the versions on a user's system may not match, causing compatibility issues. Let’s assume the user has all the necessary libraries installed, but the versions don’t match what your application was built against. This is where the real problem begins. Short of shipping the exact machine used to deploy your application, how do you ensure compatibility with the versions installed on a user’s system?
We believe there are two ways to solve this problem, and we've given them our own names:

- The Replication Approach: ship the library versions your application was built against alongside the application itself.
- The Relaxation Approach: build against versions of libraries old enough to be compatible with nearly any system a user might run.

The first approach works well in cases where the necessary libraries may not exist on a user's machine, but it fails for libraries that cannot be shipped (we call these "system libraries"). The second approach is particularly effective for system libraries and is the approach we use at JangaFX.
There are various libraries present on a Linux machine that cannot be shipped because they are system libraries. These are libraries tied to the system itself and cannot be provided in a container. Typically, these include user-space drivers for the GPU, enterprise-installed security components, and, of course, libc itself.
If you’ve ever tried to distribute Linux binaries, you may have encountered an error message like this:
/lib64/libc.so.6: version `GLIBC_2.18' not found
For those unaware, glibc (GNU C Library) provides the C standard library, POSIX APIs, and the dynamic linker responsible for loading shared libraries (including glibc itself).
GLIBC is an example of a "system library" that cannot be bundled with your application because it includes the dynamic linker itself. This linker is responsible for loading other libraries, some of which may also depend on GLIBC—but not always. Complicating matters further, since GLIBC is a dynamic library, it must also load itself. This self-referential, chicken-and-egg problem highlights GLIBC’s complexity and monolithic design, as it attempts to fulfill multiple roles simultaneously. A large downside to this monolithic design is that upgrading GLIBC often requires upgrading the entire system. Later in this article, we will explain why this structure needs to change to truly solve Linux’s binary compatibility problem.
Before you suggest statically linking GLIBC—that’s not an option. GLIBC relies on dynamic linking for features like NSS modules, which handle hostname resolution, user authentication, and network configuration, among other dynamically loaded components. Static linking breaks this because it does not include the dynamic linker, which is why GLIBC does not officially support it. Even if you managed to statically link GLIBC—or used an alternative like musl—your application would be unable to load any dynamic libraries at runtime. Static linking the dynamic linker itself is not possible, for reasons that will be explained later. In short, this would prevent your application from dynamically linking against any system libraries at all.
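To make this concrete, here is a small sketch of why full static linking falls apart in practice: something as ordinary as resolving a hostname goes through NSS, and glibc loads the relevant NSS modules (such as libnss_files.so or libnss_dns.so) with dlopen at runtime.

```c
// Sketch: hostname resolution with getaddrinfo(). Under glibc this consults
// NSS, which dlopen()s plugins like libnss_files.so / libnss_dns.so at
// runtime. A fully statically linked binary cannot load those plugins the
// same way, which is one reason glibc does not support static linking.
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>

int main(void) {
    struct addrinfo hints = {0}, *res = NULL;
    hints.ai_family   = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    int rc = getaddrinfo("example.com", "80", &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return 1;
    }
    freeaddrinfo(res);
    puts("resolved");
    return 0;
}
```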
Since our application relies on many non-system libraries that may not be installed on the user’s system, we need a way to include them. The most straightforward approach is the Replication Approach, where we ship these libraries alongside our application. However, this negates the benefits of dynamic linking, such as shared memory usage and system-wide updates. In such cases, statically linking these libraries into the application is a better choice, as it eliminates dependency issues entirely. It also enables additional optimizations, such as LTO, and results in a smaller package by stripping unused components from the included libraries.
Instead, we take a different approach: statically linking everything we can. When doing so, special care is needed if a dependency embeds another dependency within its static library. We've encountered static libraries that include object files from other static libraries (e.g., libcurl), but we still need to link them separately. This duplication is conveniently avoided with dynamic libraries, but with static libraries, you may need to extract all object files from the archive and remove the embedded ones manually. Similarly, compiler runtimes like libgcc default to dynamic linking. We recommend using -static-libgcc.
Finally, when it comes to dealing with system libraries, we use the Relaxation Approach. Rather than requiring exact or newer versions of system libraries, we link against versions that are old enough to be nearly universally compatible. This increases the likelihood that the user’s system libraries will work with our application, reducing dependency issues without the need for containerization or bundling system components and shims.
The method we suggest when linking against older system libraries is to obtain a corresponding older Linux environment. You don’t need to install an old Linux version on physical hardware or even set up a full virtual machine. Instead, a chroot provides a lightweight, isolated environment within an existing Linux installation, allowing you to build against an older system without the overhead of full virtualization. Ironically, this suggests that containers were the right solution all along—just not at runtime, but at build time.
To achieve this, we use debootstrap, an excellent script that creates a minimal Debian installation from scratch. Debian is particularly suited for this approach due to its stability and long-term support for older releases, making it a great choice for ensuring compatibility with older system libraries.
Of course, once you have an older Linux setup, you may find that its binary package toolchains are too outdated to build your software. To address this, we compile a modern LLVM toolchain from source and use it to build both our dependencies and our software. The details of this process are beyond the scope of this article.
Finally, we automate the entire debootstrap process with a Python script, which we've included here for reference.
```python
#!/bin/env python3
import os, subprocess, shutil, multiprocessing

PACKAGES = [ 'build-essential' ]

DEBOOTSTRAP = 'https://salsa.debian.org/installer-team/debootstrap.git'
ARCHIVE = 'http://archive.debian.org/debian'
VERSION = 'jessie' # Released in 2015

def chroot(pipe):
    try:
        os.chroot('chroot')
        os.chdir('/')

        # Setup an environment for the chroot
        env = {
            'HOME': '/root',
            'TERM': 'xterm',
            'PATH': '/bin:/usr/bin:/sbin:/usr/sbin'
        }

        # The Debian is going to be quite old and so the keyring keys will likely be
        # expired. To work around this we will replace the sources.list to contain
        # '[trusted=yes]'
        with open('/etc/apt/sources.list', 'w') as fp:
            fp.write(f'deb [trusted=yes] http://archive.debian.org/debian {VERSION} main\n')

        # Update and install packages
        subprocess.run(['apt', 'update'], env=env)
        subprocess.run(['apt', 'install', '-y', *PACKAGES], env=env)

        #
        # Script your Linux here, remember to pass `env=env` to subprocess.run.
        #
        # We suggest downloading GCC 7.4.0, compiling from source, and installing
        # it since it's the minimum version required to compile the latest LLVM from
        # source. We then suggest downloading, compiling from source, and installing
        # the latest LLVM, which as of time of writing is 20.1.0.
        #
        # You can then compile and install all other source packages your software
        # requires from source using this modern LLVM toolchain.
        #
        # You can also enter the chroot with an interactive shell from this script
        # by uncommenting the following and running this script as usual.
        # subprocess.run(['bash'])
        #
        # You can send messages to the parent with pipe.send()

        pipe.send('Done') # This one has special meaning in main
    except Exception as exception:
        pipe.send(exception)

def main():
    # We need to run as root to use 'mount', 'umount', and 'chroot'
    if os.geteuid() != 0:
        print('Script must be run as root')
        return False

    with multiprocessing.Manager() as manager:
        mounts = manager.list()
        pipe = multiprocessing.Pipe()

        def mount(parts):
            subprocess.run(['mount', *parts])
            mounts.append(parts[-1])

        # Ensure we have a fresh chroot and clone of debootstrap
        shutil.rmtree('chroot', ignore_errors=True)
        shutil.rmtree('debootstrap', ignore_errors=True)
        os.mkdir('chroot')

        # Clone debootstrap and use it to build the minimal Debian system
        subprocess.run(['git', 'clone', DEBOOTSTRAP])
        subprocess.run(['./debootstrap', '--arch', 'amd64', VERSION, '../chroot', ARCHIVE],
                       env={**os.environ, 'DEBOOTSTRAP_DIR': '.'},
                       cwd='debootstrap')

        # Mount nodes needed for the chroot
        mount(['-t', 'proc', '/proc', 'chroot/proc'])
        mount(['--rbind', '/sys', 'chroot/sys'])
        mount(['--make-rslave', 'chroot/sys'])
        mount(['--rbind', '/dev', 'chroot/dev'])
        mount(['--make-rslave', 'chroot/dev'])

        # Setup the chroot in a separate process
        process = multiprocessing.Process(target=chroot, args=(pipe[1],))
        process.start()
        try:
            while True:
                data = pipe[0].recv()
                if isinstance(data, Exception):
                    raise data
                else:
                    print(data)
                if data == 'Done':
                    break
        finally:
            process.join()
            for umount in reversed(list(set(mounts))):
                subprocess.run(['umount', '-R', umount])
            subprocess.run(['sync'])

if __name__ == '__main__':
    try:
        main()
    except KeyboardInterrupt:
        print('Cancelled')
```
Generally, most applications do not link directly against system libraries, and instead load whichever is present on the user's machine at runtime. So while these libraries are considered system components, they typically have few system dependencies beyond libc itself. This is what makes libc—specifically GLIBC—the real source of compatibility issues: it's essentially the only system component directly linked against.
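As a minimal sketch of what that looks like in practice (the Vulkan loader is just an example here), an application typically picks up whatever the system provides at runtime rather than linking against it at build time:

```c
// Minimal sketch: load the system's Vulkan loader at runtime instead of
// linking against it at build time. Error handling kept deliberately short.
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    // Whatever libvulkan.so.1 the user's system provides is the one we get.
    void *vulkan = dlopen("libvulkan.so.1", RTLD_NOW | RTLD_LOCAL);
    if (!vulkan) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }

    // Resolve an entry point by name; the rest of the API is reached from it.
    void *proc = dlsym(vulkan, "vkGetInstanceProcAddr");
    printf("vkGetInstanceProcAddr = %p\n", proc);

    dlclose(vulkan);
    return 0;
}
```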
In just the past two years, our team has encountered three separate GLIBC-specific compatibility issues, each directly impacting our products:
From our perspective, the core issue with GLIBC is that it tries to do far too much. It’s a massive, monolithic system that handles everything—from system calls and memory management to threading and even the dynamic linker. This tight coupling is why upgrading GLIBC often means upgrading the entire system; everything is intertwined. If it were broken into smaller, more focused components, users could update only the parts that change, rather than dragging their whole system along with it.
More importantly, separating the dynamic linker from the C library itself would allow multiple versions of libc to coexist, eliminating a major source of compatibility issues. This is exactly how Windows handles it, which is one of the reasons Windows maintains such strong binary compatibility. You can still run decades-old Windows software today because Microsoft doesn’t force everything to be tied to a single, ever-changing libc.
Of course, this isn’t as simple as just splitting things apart. GLIBC has deep cross-cutting concerns, particularly with threading, TLS (Thread-Local Storage), and global memory management.
For example, even if you managed to get two versions of GLIBC to coexist, returning allocated memory from one and attempting to free it in another would likely lead to serious issues. Their heaps would be unaware of each other, potentially using different allocation strategies, causing unpredictable failures. To avoid this, even the heap would likely need to be separated into its own dedicated libheap.
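A minimal sketch of that hazard, assuming a hypothetical setup in which the shared library and the application end up bound to two different libc/heap implementations:

```c
#include <stdlib.h>
#include <string.h>

// Imagine this function lives in a shared library bound to "libc A".
char *make_greeting(void) {
    char *s = malloc(32);          // allocated from heap A
    if (s) strcpy(s, "hello");
    return s;
}

// And the application is bound to "libc B".
int main(void) {
    char *s = make_greeting();     // pointer came from heap A
    free(s);                       // handed to heap B: the two heaps know
                                   // nothing about each other's allocations
    return 0;
}
```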
We think a better approach would be breaking GLIBC into distinct libraries, something like this:
- libsyscall: A static library linked by libheap, libthread and libc to gain access to shared system call code. Since it's static it's embedded in all three. You can pretend this library otherwise does not exist.
- libdl: Links libsyscall statically. Is a true, free-standing library, depending on nothing. Provided as both a static and dynamic library. When you link against it statically you can still load things with it dynamically. You just end up with a dynamic linker inside your executable.
- libheap: Links libsyscall statically. Provided only as a dynamic library. Cannot ever be linked against statically.
- libthread: Links libheap. Provided only as a dynamic library. Cannot ever be linked against statically.
- libc: Links libthread, and thus libheap, and libdl transitively. Provided as both a static and dynamic library. When linking statically it links libdl statically. The linking of libthread and libheap is always done dynamically though: through the included libdl if linked statically, or through the libdl program loader if linked dynamically.

These libraries would be aware of each other and allow multiple versions to coexist in the same address space. That way, we don't end up with this brittle mess where upgrading GLIBC can break everything. The actual structure would look something like this:
This architecture is quite similar to Windows, where the equivalents of libsyscall, libdl, libheap, and libthread are all bundled within a single kernel32.dll. This DLL is pre-mapped and automatically loaded into the address space of every executable on Windows.
When libc is linked statically:

- The application embeds libc and libdl (it is not the program loader).
- It loads libheap and libthread using embedded libdl.

```
[Application]
      │
      ▼
[libc (static)]
      │
      ▼
[libdl (static)]
  ├── [libheap (dynamic)]
  └── [libthread (dynamic)]
        └── [libheap (dynamic)]
```
- libc and libdl are embedded in the executable, meaning the application itself starts execution.
- libdl dynamically loads libthread and libheap.

When libc is linked dynamically:

- The application is loaded via the program interpreter (libdl).
- libdl (program loader) loads the application and resolves dependencies.
- It then loads libc, libheap, and libthread.

```
[Application (interpreter entry)]
      │
      ▼
[libdl (program loader)]
      │
      ▼
[libc (dynamic)]
  ├── [libheap (dynamic)]
  ├── [libthread (dynamic)]
  │     └── [libheap (dynamic)]
      │
      ▼
[Application (regular entry)]
```
Scenario | libdl (included by) | libc (how it loads) | libthread (via libdl) | libheap (via libdl) |
---|---|---|---|---|
Static libc | Static linking | Static linking | Linked by libdl | Linked by libdl |
Dynamic libc | Program interpreter | Linked by libdl | Linked by libdl | Linked by libdl |
This architecture effectively reduces the binary compatibility problem to two key system libraries: libheap and libthread. These cannot be statically linked because they manage shared resources critical to the entire system.
The reason is straightforward—heap memory must be shared across all components, ensuring compatibility between allocations and deallocations. Similarly, TLS and threading require a unified system-wide approach, as they involve complex initialization and finalization logic, particularly for global constructors and destructors. However, these components are relatively small and stable, meaning they undergo fewer changes that would necessitate version updates.
This is, of course, a non-trivial amount of rearchitecting, which naturally raises the question: why is libc implemented the way it is instead of this alternative approach?
Setting historical reasons aside, attempting to solve this problem quickly becomes difficult the moment you start writing any code using libc. Here is a trivial example of the issues that arise when trying to support multiple versions of libc.
Suppose you have a dynamic library that contains the following C code.
```c
#include <stdio.h>

FILE* open_thing() {
    return fopen("thing.bin", "r");
}
```
And your application links against this library and calls open_thing. Your application would be responsible for calling fclose on the returned FILE*. If your code links against a different version of libc from the version the library links against, then it would be calling the wrong implementation of fclose!
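For illustration, the application side of that boundary might look like this (a sketch; open_thing comes from the library above):

```c
#include <stdio.h>

// Provided by the shared library shown above.
extern FILE* open_thing(void);

int main(void) {
    FILE *fp = open_thing();   // FILE* created by the library's libc
    if (!fp) return 1;
    // ...use the file...
    fclose(fp);                // closed by the application's libc
    return 0;
}
```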
Suppose though, that libc was written in such a way that the FILE* returned always required a version field or a pointer to a vtable containing the implementation of fclose (and other functions), and every version of libc agreed on this so that it could always call the correct one across this ABI boundary. This would solve this compatibility issue, but now let's say your code calls fflush.
```c
// Defined in header <stdio.h>
int fflush(FILE *fp);
```
Except it doesn't flush that file; instead, it passes NULL:

```c
fflush(NULL);
```
If you're not familiar with C's fflush function, passing NULL to it requires flushing all open files (every FILE*). However, in this scenario, it would only flush files seen by the libc version your application is linked against, not those opened by other libc versions (such as the one used by open_thing).
To handle this correctly, each libc version would need a way to enumerate files across all other libc instances, including dynamically loaded ones, ensuring that every file is visited exactly once without forming cycles. This enumeration must also be thread-safe. Additionally, while enumeration is in progress, another libc could be dynamically loaded (e.g., via dlopen) on a separate thread, or a new file could be opened (e.g., a global constructor in a dynamically loaded library calling fopen).
This global list of things owned by libc shows up in multiple places. Take for instance:
```c
// Defined in header <stdlib.h>
int atexit(void (*func)(void));
```
Registers the function pointed to by func to be called on normal program termination (via exit() or returning from main()). The functions will be called in reverse order they were registered, i.e. the function registered last will be executed first.

There is also another variant of this called at_quick_exit.
This implies that somewhere within libc, there must be a list of functions registered via atexit that need to be executed in reverse order. For multiple libc implementations to coexist, any system handling atexit must not only enumerate and call all registered functions, but also establish a total order for how they were inserted across all instances of libc.
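One way to picture that requirement (purely a sketch; none of these names exist in any real libc) is a single process-wide registry that every libc version appends to, so that one total order exists across all of them:

```c
#include <stddef.h>

typedef void (*atexit_fn)(void);

// Hypothetical process-wide registry shared by every loaded libc instance.
typedef struct shared_atexit_entry {
    atexit_fn                    fn;
    unsigned long                sequence;  // total registration order
    struct shared_atexit_entry  *next;
} shared_atexit_entry;

typedef struct shared_atexit_registry {
    shared_atexit_entry *head;            // most recently registered first
    unsigned long        next_sequence;
    // A real design also needs a lock that every libc version agrees on.
} shared_atexit_registry;

// Each libc's atexit() would funnel into something like this instead of
// keeping its own private list.
void shared_atexit_register(shared_atexit_registry *reg,
                            shared_atexit_entry *entry, atexit_fn fn) {
    entry->fn       = fn;
    entry->sequence = reg->next_sequence++;
    entry->next     = reg->head;
    reg->head       = entry;
}

// exit() in any libc would then walk the combined list from head to tail,
// which is reverse registration order across every libc instance.
void shared_atexit_run(shared_atexit_registry *reg) {
    for (shared_atexit_entry *e = reg->head; e != NULL; e = e->next)
        e->fn();
}
```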
Essentially, any resource owned by one libc would need to be sharable and accessible from any other version of libc. This turns out to be quite a lot of things. For the sake of our argument, we've gone through all of the standard C (not POSIX) functions that produce or operate on a resource with an opaque implementation, where careful consideration would be needed.
Header | Function | Resource | Notes |
---|---|---|---|
<fenv.h> | N/A | fexcept_t | Float environment exceptions need to be stable across libc. |
<fenv.h> | * | fexcept_t | Any functions using this type |
<fenv.h> | fegetenv | fenv_t | Float environment needs to be stable across libc. |
<fenv.h> | * | fenv_t | Any functions using this type |
<locale.h> | localeconv | struct lconv | Common initial sequence needs to be stable across libc. |
<math.h> | N/A | int | Math defines need to have a stable set of integer values across libc. |
<setjmp.h> | N/A | jmp_buf | Usually defined by compiler |
<setjmp.h> | * | jmp_buf | Usually defined by compiler intrinsic |
<signal.h> | N/A | int | Signal defines need to have a stable set of integer values across libc. |
<signal.h> | N/A | sig_atomic_t | Stable type across libc |
<stdarg.h> | N/A | va_list | Usually defined by compiler |
<stdarg.h> | va_start | va_list | Usually defined by compiler intrinsic |
<stdarg.h> | * | va_list | Any functions or macros using this type |
<stdatomic.h> | * | _Atomic T | Stable type across libc |
<stdatomic.h> | N/A | int | Atomic defines need to have a stable set of integer values across libc. |
<stdatomic.h> | N/A | typedef | Many typedefs need to have a stable set of types across libc. |
<stddef.h> | N/A | typedef | Many typedefs need to have a stable set of types across libc. |
<stdint.h> | N/A | typedef | Many typedefs need to have a stable set of types across libc. |
<stdint.h> | N/A | int | Many defines need to have a stable set of types across libc. |
<stdio.h> | * | FILE | Many functions (any taking FILE* or returning FILE* ) |
<stdio.h> | N/A | typedef | Many types need to have a stable set of types across libc. |
<stdio.h> | N/A | int | Many defines need to have a stable set of integer values across libc. |
<stdio.h> | N/A | N/A | Locale for string formatting needs to be shared across libc |
<stdio.h> | stderr | N/A | Needs to be a macro that expands to a function call like __stdio(STDERR_FILENO) |
<stdio.h> | stdout | N/A | Needs to be a macro that expands to a function call like __stdio(STDOUT_FILENO) |
<stdio.h> | stdin | N/A | Needs to be a macro that expands to a function call like __stdio(STDIN_FILENO) |
<stdlib.h> | N/A | div_t | Needs to have a stable definition across libc. |
<stdlib.h> | N/A | ldiv_t | Needs to have a stable definition across libc. |
<stdlib.h> | N/A | lldiv_t | Needs to have a stable definition across libc. |
<stdlib.h> | N/A | int | Many defines need to have a stable set of integer values across libc. |
<stdlib.h> | call_once | once_flag | Needs to be stable across libc and also libthread |
<stdlib.h> | rand | N/A | Global PRNG needs to be shared across libc. |
<stdlib.h> | srand | N/A | Global PRNG needs to be shared across libc. |
<stdlib.h> | aligned_alloc | void* | Shared heap |
<stdlib.h> | calloc | void* | Shared heap |
<stdlib.h> | free | void* | Shared heap |
<stdlib.h> | free_sized | void* | Shared heap |
<stdlib.h> | free_aligned_sized | void* | Shared heap |
<stdlib.h> | malloc | void* | Shared heap |
<stdlib.h> | realloc | void* | Shared heap |
<stdlib.h> | atexit | N/A | Global list needs to be shared across libc. |
<stdlib.h> | at_quick_exit | N/A | Global list needs to be shared across libc. |
<string.h> | strcoll | N/A | LC_COLLATE locale needs to be shared across libc |
<threads.h> | N/A | cnd_t | Any opaque method |
<threads.h> | N/A | thrd_t | Any opaque method |
<threads.h> | N/A | tss_t | Any opaque method |
<threads.h> | N/A | mtx_t | Any opaque method |
<threads.h> | * | cnd_t | Many functions using this type |
<threads.h> | * | thrd_t | Many functions using this type |
<threads.h> | * | tss_t | Many functions using this type |
<threads.h> | * | mtx_t | Many functions using this type |
<threads.h> | * | typedef | Many types need to have a stable set of types across libc. |
<threads.h> | N/A | int | Many defines need to have a stable set of integer values across libc. |
<threads.h> | call_once | once_flag | See <stdlib.h> above |
<time.h> | N/A | typedef | Many types need to have a stable set of types across libc. |
<time.h> | N/A | struct tm | Common initial sequence needs to be stable across libc. |
<uchar.h> | N/A | char8_t | Needs to be same across libc |
<uchar.h> | N/A | char16_t | Needs to be same across libc |
<uchar.h> | N/A | char32_t | Needs to be same across libc |
<uchar.h> | * | char8_t | Many functions using this type |
<uchar.h> | * | char16_t | Many functions using this type |
<uchar.h> | * | char32_t | Many functions using this type |
<uchar.h> | N/A | mbstate_t | Any opaque method |
<uchar.h> | * | mbstate_t | Many functions using this type. |
<wchar.h> | * | * | Repeat of <uchar.h> essentially |
<wctype.h> | N/A | wctrans_t | Need to be same across libc |
<wctype.h> | N/A | wctype_t | Need to be same across libc |
<wctype.h> | * | wctrans_t | Many functions using this type |
<wctype.h> | * | wctype_t | Many functions using this type |
Most defines (of constants) and ABI-exposed types (and typedefs) should just be stable across libc implementations for this to work reliably. Since these are baked into executables, there's no real way to modify or change them without breaking things anyway. For the opaque resources (listed as "Any opaque method" above), we propose affixing a pointer to a vtable containing the implementation as the first value in the type; this way, functions operating on it can always recover the correct implementation and dispatch it indirectly through the vtable. Other methods, like using a version field, can also work here.
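As a rough sketch of that idea (the layout and names here are hypothetical, not any real libc's ABI), an opaque type like FILE would begin with a vtable pointer installed by whichever libc created it:

```c
#include <stddef.h>

typedef struct shared_file shared_file;   // stand-in for FILE

// Agreed-upon vtable layout, stable across libc versions.
typedef struct shared_file_vtable {
    size_t (*read)(shared_file *, void *, size_t);
    size_t (*write)(shared_file *, const void *, size_t);
    int    (*flush)(shared_file *);
    int    (*close)(shared_file *);
} shared_file_vtable;

struct shared_file {
    const shared_file_vtable *vtable;  // always the first field
    // ...implementation-specific state owned by the creating libc...
};

// Any libc version can operate on a file it did not create by dispatching
// through the vtable installed by the libc that did create it.
int shared_fclose(shared_file *fp) {
    return fp->vtable->close(fp);
}
```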
Regardless, certain aspects of libc introduce complexity, particularly global and thread-local elements like errno and locale. However, with a well-designed architecture, these challenges can be effectively addressed.
Memory allocation functions—calloc, malloc, aligned_alloc, realloc, and free—from <stdlib.h> pose another difficulty. Since they can return any pointer, tracking them is non-trivial. One possible approach is to store a pointer to a vtable in the allocation header, allowing each allocation to reference its implementation. However, this method incurs significant performance and memory overhead. Instead, we propose centralizing heap management in a dedicated libheap. This would also contain implementations of the POSIX extensions like posix_memalign.
Things get even more interesting when moving from standard C to POSIX, which introduces unique challenges that require libc support. Some of these functionalities might be better split into separate libraries (for example, why is the DNS resolver in libc?). Among these challenges though, setxid stands out.
For those unfamiliar, permissions in POSIX—such as the real, effective, and saved user/group IDs—apply at the process level. However, Linux treats threads as independent processes that share memory, meaning these permissions are managed per thread rather than per process. To comply with POSIX semantics, libc must interrupt every thread, forcing them to execute code that invokes the system call to modify their thread-local permissions. This must be done atomically, without failure, and while remaining async-signal safe. This is a nightmare to implement and has proven to be a major challenge. More importantly, getting it right is crucial for security.
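A heavily simplified sketch of that mechanism (hypothetical names, and glossing over the synchronization described above): the calling thread asks every other thread, via a signal, to apply the same set*id system call to itself.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

static volatile uid_t pending_uid;

// Each interrupted thread applies the change to itself; on Linux the kernel
// tracks these IDs per thread, not per process.
static void setxid_handler(int sig) {
    (void)sig;
    syscall(SYS_setuid, pending_uid);
}

// 'threads' stands in for the registry of live threads that a real
// libthread would have to maintain.
int broadcast_setuid(uid_t uid, pthread_t *threads, size_t count) {
    struct sigaction sa;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sa.sa_handler = setxid_handler;
    sigaction(SIGRTMIN, &sa, NULL);   // a real libc reserves a signal for this

    pending_uid = uid;
    for (size_t i = 0; i < count; i++)
        pthread_kill(threads[i], SIGRTMIN);

    // A real implementation must also wait for every thread to acknowledge,
    // block thread creation while this runs, and handle failures atomically,
    // all while remaining async-signal safe.
    syscall(SYS_setuid, uid);         // apply to the calling thread too
    return 0;
}
```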
Ultimately, this means libc must track every thread and provide a way to synchronously execute code across all threads. To address this, we propose consolidating threading, TLS, and the necessary POSIX compliance mechanisms within a single libthread.
There are many additional complexities we’ve glossed over and many alternative ways this can be implemented. The key takeaway is that these issues are solvable—they just require significant architectural changes. This necessitates reevaluating this aspect of the Linux userspace from the ground up with binary compatibility as a core design principle. GLIBC has never seriously attempted this. Until someone decides, "enough is enough, let’s fix this," binary compatibility on Linux will remain an unsolved problem; we strongly believe this is a problem worth solving.