


## Linking

Linking is the process of creating an executable, a library or another object file out of object files ([Wikipedia](https://en.wikipedia.org/wiki/Linker_%28computing%29)).
During linking, values that were unknown to the compiler (for example, the addresses of external functions and variables, or the address the code will execute from) can be filled into the code.

A linker script is used, among other things, to tell the linker where in memory specific parts of the executable should be placed.

In a hosted environment (when building a program to run under a full-featured operating system, like GNU/Linux), a linker script is usually provided by the toolchain and used if no other script is given. In a bare-metal project, the developer usually has to write their own linker script, in which they specify the binary image's **load address** and section layout.

The contents of an object code file or executable (our .o or .elf) are grouped into named sections. Common names are .text (usually contains code), .data (usually contains statically allocated variables initialized to non-zero values), .bss (usually reserves memory for statically allocated variables initialized to zero) and .rodata (usually contains statically allocated variables that are not going to be modified).

In a hosted environment, when an executable (say, in ELF format) is run, the contents of its sections are usually placed in different memory segments with different access privileges, so that, for example, code is not writable and variable contents are not executable. This helps reduce the risk of buffer overflow exploits.

In a bare-metal environment like ours, we don't execute an ELF file directly (except in QEMU, which is not the preferred approach anyway), but rather a raw binary image created from an ELF file. Still, the notion of a section is used along the way.

During linking, one or more object code files are combined into one file (in our case an executable). The section contents of the input files land in sections of the output file, in a way defined by the linker script. In a hosted environment, a linker script would likely put the contents of input .text sections in an output .text section, the contents of input .data sections in an output .data section, and so on. The developer can, however, use sections with different names (although some linkers might then behave unexpectedly) and assign their contents as they see fit using a linker script.
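As a rough illustration of the mapping described above, a minimal GNU ld script might look like the sketch below. The load address 0x8000 and the entry symbol _start are assumptions chosen for the example; they are not taken from our scripts.

```ld
/* Minimal sketch; 0x8000 and _start are hypothetical example values. */
ENTRY(_start)

SECTIONS
{
  /* Set the location counter: output sections are placed from here on. */
  . = 0x8000;

  /* Collect the .text* input sections of all input files into output .text. */
  .text : { *(.text*) }

  /* Same idea for read-only data and initialized data. */
  .rodata : { *(.rodata*) }
  .data : { *(.data*) }

  /* Zero-initialized data, plus common symbols. */
  .bss : { *(.bss*) *(COMMON) }
}
```

Output sections are laid out in the order in which they appear in the script, starting at the address assigned to the location counter.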

In a linker script, it is possible to mark a section as NOLOAD (usually done for .bss), which, in our case, causes that section not to be included in the binary image later created with objcopy.
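A NOLOAD section could be declared roughly as in the following sketch. The address and section contents are illustrative assumptions, not a copy of our scripts.

```ld
SECTIONS
{
  . = 0x8000;                 /* hypothetical load address */

  .text : { *(.text*) }
  .data : { *(.data*) }

  /* (NOLOAD) marks the output section as not loadable: it still gets
     addresses assigned, but carries no contents to be loaded. */
  .bss (NOLOAD) : { *(.bss*) *(COMMON) }
}
```

Because such a section has no contents in the image, the startup code is responsible for zeroing the corresponding memory if the program relies on it being zero.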

It is also possible to treat same-named input sections differently depending on which file they came from, and even to use wildcards when specifying file names.
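The per-file selection mentioned above can be sketched as follows. The file names startup.o and driver_*.o are hypothetical and only serve to show the pattern syntax.

```ld
SECTIONS
{
  . = 0x8000;                 /* hypothetical load address */

  /* Code from the (hypothetical) startup.o goes first. The leading *
     lets the pattern match regardless of the directory part of the name. */
  .text.startup : { *startup.o(.text*) }

  /* Code from any file whose name matches driver_*.o gets its own
     output section. */
  .text.drivers : { *driver_*.o(.text*) }

  /* Everything not matched above. An input section is placed by the
     first pattern in the script that matches it. */
  .text : { *(.text*) }
}
```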

Variables can be created, as well as new symbols, which can then be referenced from C code.
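A sketch of defining symbols in a linker script and using them from C is given below. The names _example_image_start and _example_image_end are made up for this illustration and do not appear in our scripts.

```ld
SECTIONS
{
  . = 0x8000;                 /* hypothetical load address */

  /* Assigning the location counter to a name creates a symbol whose
     value is the current address. */
  _example_image_start = .;

  .text : { *(.text*) }
  .data : { *(.data*) }

  _example_image_end = .;
}

/* On the C side such a symbol has an address but no storage of its own,
 * so it is typically declared as an array and only its address is used:
 *
 *   extern char _example_image_start[], _example_image_end[];
 *   size_t image_size = _example_image_end - _example_image_start;
 */
```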

The alignment of specific parts of the resulting image can also be defined easily.
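Alignment is usually expressed with the ALIGN builtin, as in this sketch. The alignment values and the load address are arbitrary examples.

```ld
SECTIONS
{
  . = 0x8000;                 /* hypothetical load address */

  .text : { *(.text*) }

  /* Round the location counter up to the next 4096-byte boundary,
     so that .data starts on that boundary. */
  . = ALIGN(4096);
  .data : { *(.data*) }

  .bss : {
    *(.bss*)
    /* Pad the end of the section to a multiple of 8 bytes. */
    . = ALIGN(8);
  }
}
```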

We made use of all those possibilities in our scripts.

The physical memory layout of the kernel is defined in src/arm/PL1/kernel/kernel_stage2.ld. Symbols defined there, such as _stack_end, are referenced in the C header src/arm/PL1/kernel/memory.h.

Although src/arm/PL1/kernel/kernel.ld and src/arm/PL1/loader/loader.ld define a starting address, that address is irrelevant, because the position-independent assembly code for the [first stages of loader and kernel](./Boot_explained.txt) does not depend on it.

At the beginning of this project, we had very little understanding of linker scripts' syntax. [This article](https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/sections.html#OUTPUT-SECTION-DESCRIPTION) proved useful and allowed us to learn the required parts in a short time. As discussing the entire syntax of linker scripts is beyond the scope of this documentation, we refer the reader to that resource.