Add article on ELF linking

Emery Hemingway 2022-10-07 21:24:42 -05:00
parent d2bf5375d8
commit 20796fabae
2 changed files with 26 additions and 1 deletions


@ -24,7 +24,8 @@ portability.
## Articles
[Genodepkgs post-mortem report](https://gemini.spam.works/users/emery/sigil-report.gmi)
- [Deferred linking in Sigil](./doc/deferred-linking.md)
- [Genodepkgs post-mortem report](https://gemini.spam.works/users/emery/sigil-report.gmi)
## Repository layout

doc/deferred-linking.md Normal file

@ -0,0 +1,24 @@
# Deferred linking
Sigil generally avoids dynamically linking its ELF binaries. To explain why, we must first review the meaning of the term "dynamic-link".
> dynamic - Characterized by or tending to produce continuous change or advance.
That is the definition from the 1973 edition of "The American Heritage Dictionary" (as good as any for clarifying the language of UNIX). Dynamic linking is so called because the runtime behavior of a program may continuously change as a system's "dynamic libraries" are upgraded. To borrow a [metaphor](https://www.tweag.io/blog/2022-07-14-taming-unix-with-nix/), a dynamic link is like a pointer to some behavior. These pointers might be weakly typed or not typed at all, and they might dangle. Obviously this sort of pointer is useful or it wouldn't be tolerated. For example, such a pointer could point to some hardware-specific behavior that isn't static at build time, or it could point to behavior with proprietary internals. In the case of Genode it is kernel-specific behaviors that are hidden beneath such pointers, which is why it can only be said that Sigil generally avoids this sort of linking.
Sigil binaries aren't required to be static-linked at runtime, but they are usually "shared-linked" using absolute references, and the same can be said of Nix packages for any platform. This is because we discipline our builds and test results to be deterministic, and we likewise expect the behaviors of a program at runtime to be selected deterministically. For Nix this is enforced in two ways. First, there is no global library namespace at /lib; ELF binaries list a "runtime path" for finding libraries with some explicit /nix/store/… directories that must be determined at compile time. Second, relative names like `libz.so` can be replaced by absolute file-system paths, which is preferred because it avoids expensive directory searches that can certainly be resolved at compile time. The Nix method can be described as "deferred-linking" because the link is finalized at runtime rather than compile time, and it is without continuous change in behavior at runtime.
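The second point, replacing relative names with absolute paths, can be illustrated with a minimal sketch. It assumes the library names and the runtime path have already been read out of the binary; the function name `resolve_needed` and its arguments are hypothetical and are not part of Nix or Sigil.

```python
import os


def resolve_needed(needed, runpath):
    """Resolve library names to absolute paths at build time.

    needed  -- library names as listed by a binary, e.g. ["libz.so"]
    runpath -- directories of the binary's runtime path, in search order
    Both arguments are assumed to have been extracted from the ELF already.
    """
    resolved = {}
    for name in needed:
        if os.path.isabs(name):
            # Already absolute, nothing to search for.
            resolved[name] = name
            continue
        for directory in runpath:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                # First match wins, mirroring a runtime directory search.
                resolved[name] = candidate
                break
        else:
            # No directory on the runtime path provides this library.
            raise FileNotFoundError(f"{name} not found on runtime path")
    return resolved
```

Doing this search once at build time means the loader no longer has to walk directories at program start, and a missing library becomes a build failure rather than a runtime one.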
Sigil takes this concept further by linking using absolute identifiers that are independent of the file-system. We do not want file-system traversals at program load time because we do not give programs access to a file-system unless absolutely necessary. This is simply to avoid the complexity of a file-system, to avoid open-on-fail policies that might allow mutation of system libraries, to prevent programs from observing what other libraries are present on a system, and because there are cases where a file-system isn't otherwise required.
## ERIS linking
Sigil ELF binaries are mostly linked using [ERIS](https://eris.codeberg.page) URNs instead of file-system paths. The Genode loader and linker operate over named ROM dataspaces (memory region capabilities) and are already independent of a file-system layer, so that layer retains the standard Genode behavior and loading the ERIS content into a ROM dataspace is handled externally.
When packages are built, the ELF binaries are compiled and linked normally (or as they would be for either Genode or NixOS). After linking, a fixup step resolves all libraries to absolute file-system paths, and then the ELF headers are patched to replace libraries with deterministic ERIS identifiers that correspond to each absolute file-system path. The runtime closure of all libraries that will be loaded via ERIS and their mappings to build-time file-system paths are recorded in `/nix/store/…/nix-support/eris-manifest.dhall`. This manifest is detected by the Nix package manager as an explicit dependency on those paths, and the manifest can be evaluated later to collect the closure from the build file-system. The mappings are also recorded within a `.note.eris-links` ELF section as inspired by the "[Package information on ELF objects](https://lwn.net/ml/fedora-devel/CA+voJeU--6Wk8j=D=i3+Eu2RrhWJACUiirX2UepMhp0krBM2jg@mail.gmail.com/)" proposal and subsequent standards, albeit with a [CBOR](https://cbor.io/) rather than JSON encoding. This ELF note allows any ERIS URN in a binary to be backtracked to its build-time path.
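The data flow of the fixup step can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Sigil tooling: the function `patch_needed` is hypothetical, the ERIS encoding itself is left to a caller-supplied `urn_for_path` callable, and the real step patches ELF headers rather than Python lists.

```python
def patch_needed(needed, resolved_paths, urn_for_path):
    """Return (patched_needed, links) for one binary.

    needed         -- library names as listed by the binary after linking
    resolved_paths -- dict of library name -> absolute build-time path
    urn_for_path   -- hypothetical callable mapping an absolute path to the
                      ERIS URN of that file's content
    """
    patched = []
    links = {}
    for name in needed:
        path = resolved_paths[name]
        urn = urn_for_path(path)
        # The binary now refers to the library by URN instead of by name.
        patched.append(urn)
        # Remember where the URN came from so it can be traced back later.
        links[urn] = path
    return patched, links
```

The `links` mapping stands in for the information that the manifest and the ELF note preserve: for each URN, the build-time path it was derived from.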
The CBOR data within the `.note.eris-links` section consists of the self-describing CBOR tag (55799) wrapping a map of text keys to byte-strings [tagged as ERIS read-capabilities](http://purl.org/eris#name-cbor-tag) (276). The normative [CDDL](https://datatracker.ietf.org/doc/html/rfc8610) description:
``` cddl
#6.55799({ * tstr => #6.276(bstr) })
```
At the time of writing there is no utility for dumping the note into a human-readable format.
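As a starting point for such a utility, the following is a minimal sketch of a decoder for the CBOR payload described by the CDDL above. It assumes the payload bytes have already been extracted from the section with any ELF note framing stripped, and it handles only definite-length encodings. The function names are hypothetical.

```python
SELF_DESCRIBED = 55799  # self-describing CBOR tag
ERIS_READ_CAP = 276     # tag for ERIS read-capabilities


def _read_head(data, pos):
    """Decode one CBOR head; return (major type, argument, next position)."""
    if pos >= len(data):
        raise ValueError("truncated CBOR data")
    major, info = data[pos] >> 5, data[pos] & 0x1F
    pos += 1
    if info < 24:
        return major, info, pos
    if info > 27:
        raise ValueError("indefinite or reserved lengths are not handled")
    size = 1 << (info - 24)  # argument occupies 1, 2, 4 or 8 bytes
    if pos + size > len(data):
        raise ValueError("truncated CBOR data")
    return major, int.from_bytes(data[pos:pos + size], "big"), pos + size


def _expect(data, pos, major, what):
    """Read a head and check that it has the expected major type."""
    got, arg, pos = _read_head(data, pos)
    if got != major:
        raise ValueError(f"expected {what}")
    return arg, pos


def decode_eris_links(payload):
    """Decode #6.55799({ * tstr => #6.276(bstr) }) into a dict of str -> bytes."""
    tag, pos = _expect(payload, 0, 6, "self-describing tag")
    if tag != SELF_DESCRIBED:
        raise ValueError("missing self-describing CBOR tag")
    count, pos = _expect(payload, pos, 5, "map")
    links = {}
    for _ in range(count):
        length, pos = _expect(payload, pos, 3, "text key")
        key = bytes(payload[pos:pos + length]).decode("utf-8")
        pos += length
        tag, pos = _expect(payload, pos, 6, "read-capability tag")
        if tag != ERIS_READ_CAP:
            raise ValueError("value is not tagged as an ERIS read-capability")
        length, pos = _expect(payload, pos, 2, "byte string")
        links[key] = bytes(payload[pos:pos + length])
        pos += length
    return links
```

Printing each key of the returned dictionary alongside the hexadecimal form of its value would give a simple human-readable dump. Converting the binary read-capabilities back into URN text is left out of this sketch.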