From 039cc132cb42fbd026482d18d04cc60bfe8b9ce3 Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Sat, 18 Jan 2020 19:51:30 +0100 Subject: explain linker scripts --- Linker-scripts-explained.txt | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) create mode 100644 Linker-scripts-explained.txt diff --git a/Linker-scripts-explained.txt b/Linker-scripts-explained.txt new file mode 100644 index 0000000..428cf52 --- /dev/null +++ b/Linker-scripts-explained.txt @@ -0,0 +1,24 @@ +Linking is the process of creating an executable, library or another object file out of object files. +During linking, values previously unknown to the compiler (e.g. the addresses of external functions/variables, or the address from which the code will execute) might be injected into the code. + +A linker script is used, among other things, to tell the linker where in memory the specific parts of the executable should lie. + +In a hosted environment (when building a program to run under a full-featured operating system, like GNU/Linux), a linker script is usually provided by the toolchain and used if no other script is provided. In a bare-metal project, the developer usually has to write their own linker script, in which they specify the binary image's **load address** and section layout. + +Contents of an object code file or executable (our .o or .elf) are grouped into sections. Sections have names. Common names are .text (usually contains code), .data (usually contains statically-allocated variables initialized to non-zero values), .bss (usually used to reserve memory for statically-allocated variables initialized to zero), .rodata (usually contains statically-allocated variables that are not going to be modified). 
+In a hosted environment, when an executable (say, of elf format) is executed, contents of its sections are usually placed in different memory segments with different access privileges, so that, for example, code is not writable and variable contents are not executable. This helps reduce the risk of buffer overflow exploits. +In a bare-metal environment like ours, we don't execute an elf file directly (except in qemu, which is not the preferred approach anyway), but rather a raw binary image created from an elf file. Still, the notion of section is used along the way. + +During linking, one or more object code files are combined into one file (in our case an executable). Section contents of input files land in some sections of the output file, in a way defined in the linker script. In a hosted environment, a linker script would likely put contents of input .text sections in a .text section, contents of input .data sections in a .data section, etc. The developer can, however, use sections with different names (although weird behaviour of some linkers might occur) and assign their contents in their preferred way using a linker script. + +In a linker script it is possible to specify a section as NOLOAD (usually used for .bss), which, in our case, causes that section not to be included in the binary image later created with objcopy. +It is also possible to treat same-named input sections differently depending on what file they came from and even use wildcards when specifying file names. +Variables can be created, as well as new symbols, which can then be referenced from C code. +Defining alignment of specific parts of the future image is also easily achievable. +We made use of all those possibilities in our scripts. + +In src/arm/PL1/kernel/kernel_stage2.ld the physical memory layout of the kernel is defined. Symbols defined there, such as _stack_end, are referenced in the C header src/arm/PL1/kernel/memory.h. 
+ +While src/arm/PL1/kernel/kernel.ld and src/arm/PL1/loader/loader.ld define the starting address, it is irrelevant, as the assembly-written position-independent code for first stages of loader and kernel does not depend on that address. + +At the beginning of this project, we had very little understanding of linker scripts' syntax. proved useful and allowed us to learn the required parts in a short time. As discussing the entire syntax of linker scripts is beyond the scope of this documentation, we refer the reader to that resource. -- cgit v1.2.3 From 5c8308c7ef5f4e528389ef7787dfc5e87d0a16d2 Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Sat, 18 Jan 2020 20:26:34 +0100 Subject: add UART-related TODO needed for compatibility with stock firmware --- TODOs | 2 ++ 1 file changed, 2 insertions(+) diff --git a/TODOs b/TODOs index af2bf64..9143a9b 100644 --- a/TODOs +++ b/TODOs @@ -70,6 +70,8 @@ high priority TODOs are higher; low priority ones and completed ones are lower; * Check if setting user mode's sp and lr can be achieved by msr instead of switching to system mode. If so, use this method. +* Explicitly select PL011 UART for communication on GPIO 14 & 15 (right now it is already selected when using rpi-open-firmware, but stock firmware on RPi 3 has miniUART there as default; perhaps this is all, that is needed to run the kernel under stock firmware). This is done with alternative function assignments - described in BCM2835 ARM Peripherals + * partially DONE - one can always add more, but we have the most important stuff * Implement some basic utilities for us to use (memcpy, printf, etc...) * partailly DONE - svc works; once we implement processes we could also kill them on aborts * develop userspace process supervision (handling of interrupt caused by svc instruction, proper handling of other data abort, undefined instruction, etc.) 
-- cgit v1.2.3 From ad15edc4b3a3901812c823422f415b4ec9cc1177 Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Sat, 18 Jan 2020 21:03:06 +0100 Subject: make UART our own --- src/arm/PL1/PL1_common/uart.c | 57 ++++++++++++++++++++----------------------- src/arm/PL1/PL1_common/uart.h | 18 ++++++++++++-- src/arm/PL1/kernel/setup.c | 2 +- 3 files changed, 43 insertions(+), 34 deletions(-) diff --git a/src/arm/PL1/PL1_common/uart.c b/src/arm/PL1/PL1_common/uart.c index 4dd1c2b..94aae46 100644 --- a/src/arm/PL1/PL1_common/uart.c +++ b/src/arm/PL1/PL1_common/uart.c @@ -3,53 +3,48 @@ #include "uart.h" #include "global.h" -// Loop times in a way that the compiler won't optimize away -static inline void delay(int32_t count) +void uart_init(uint32_t baud, uint32_t config) { - asm volatile("__delay_%=: subs %[count], %[count], #1; bne __delay_%=\n" - : "=r"(count): [count]"0"(count) : "cc"); -} - -void uart_init() -{ - // Disable PL011_UART. + // PL011 UART must be disabled before configuring wr32(PL011_UART_CR, 0); - - // Setup the GPIO pin 14 && 15. - // Disable pull up/down for all GPIO pins & delay for 150 cycles. + // GPIO pins used for UART should have pull up/down disabled + // Procedure as described in BCM2835 ARM Peripherals wr32(GPPUD, 0); - delay(150); - // Disable pull up/down for pin 14,15 & delay for 150 cycles. + for (int i = 0; i < 150; i++) // delay for at least 150 cycles + asm volatile("nop"); + wr32(GPPUDCLK0, (1 << 14) | (1 << 15)); - delay(150); - // Write 0 to GPPUDCLK0 to make it take effect. - wr32(GPPUDCLK0, 0); + for (int i = 0; i < 150; i++) + asm volatile("nop"); - // Set integer & fractional part of baud rate. - // Divider = UART_CLOCK/(16 * Baud) - // Fraction part register = (Fractional part * 64) + 0.5 - // UART_CLOCK = 3000000; Baud = 115200. + wr32(GPPUDCLK0, 0); + + wr32(GPPUD, 0); - // Divider = 3000000 / (16 * 115200) = 1.627 = ~1. - wr32(PL011_UART_IBRD, 1); - // Fractional part register = (.627 * 64) + 0.5 = 40.6 = ~40. 
- wr32(PL011_UART_FBRD, 40); + // Setting clock rate + // As described in UART (PL011) Technical Reference Manual + uint32_t int_part = DEFAULT_UART_CLOCK_RATE / (16 * baud); + uint32_t rest = DEFAULT_UART_CLOCK_RATE % (16 * baud); + uint32_t fract_part = (rest * 64 * 2 + 1) / (2 * 16 * baud); + + wr32(PL011_UART_IBRD, int_part); + wr32(PL011_UART_FBRD, fract_part); - // Set 8 bit data transmission (1 stop bit, no parity) - // and disable FIFO to be able to receive interrupt every received + // Set data transmission specified by caller + // Don't enable FIFO to be able to receive interrupt every received // char, not every 2 chars - wr32(PL011_UART_LCRH, (1 << 5) | (1 << 6)); + wr32(PL011_UART_LCRH, config); - // set interrupt to come when transmit FIFO becomes ≤ 1/8 full + // Set interrupt to come when transmit FIFO becomes ≤ 1/8 full // or receive FIFO becomes ≥ 1/8 full // (not really matters, since we disabled FIFOs) wr32(PL011_UART_IFLS, 0); - // Enable PL011_UART, receive & transfer part of UART.2 - wr32(PL011_UART_CR, (1 << 0) | (1 << 8) | (1 << 9)); + // Enable UART receiving and transmitting, as well as UART itself + wr32(PL011_UART_CR, (1 << 9) | (1 << 8) | 1); // At first, it's probably safer to disable interrupts :) uart_irq_disable(); diff --git a/src/arm/PL1/PL1_common/uart.h b/src/arm/PL1/PL1_common/uart.h index 96f3634..e02b3c8 100644 --- a/src/arm/PL1/PL1_common/uart.h +++ b/src/arm/PL1/PL1_common/uart.h @@ -5,12 +5,26 @@ #include "global.h" #include "interrupts.h" +#define DEFAULT_UART_CLOCK_RATE 3000000 + +#define UART_WLEN_8_BITS (0b11 << 5) +#define UART_WLEN_7_BITS (0b10 << 5) +#define UART_WLEN_6_BITS (0b01 << 5) +#define UART_WLEN_5_BITS 0b00 + +#define UART_2_STOP (1 << 3) +#define UART_1_STOP 0 + +#define UART_ODD_PAR (1 << 1) +#define UART_EVEN_PAR ((1 << 1) | (1 << 2)) +#define UART_NO_PAR 0 + // The offsets for reach register. -// Controls actuation of pull up/down to ALL GPIO pins. 
+// GPIO Pin Pull-up/down Enable #define GPPUD (GPIO_BASE + 0x94) -// Controls actuation of pull up/down for specific GPIO pin. +// GPIO Pin Pull-up/down Enable Clock 0 #define GPPUDCLK0 (GPIO_BASE + 0x98) // The base address for UART. diff --git a/src/arm/PL1/kernel/setup.c b/src/arm/PL1/kernel/setup.c index bf7c9a1..865a719 100644 --- a/src/arm/PL1/kernel/setup.c +++ b/src/arm/PL1/kernel/setup.c @@ -11,7 +11,7 @@ void setup(uint32_t r0, uint32_t machine_type, struct atag_header *atags) { - uart_init(); + uart_init(115200, UART_1_STOP | UART_NO_PAR | UART_WLEN_8_BITS); // When we attach screen session after loading kernel with socat // we miss kernel's greeting... So we'll make the kernel wait for -- cgit v1.2.3 From 65165039f351cc694bf300743d296c2eb4c25fe9 Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Sat, 18 Jan 2020 21:46:47 +0100 Subject: last doc pieces in this week --- ...reating-a-separate-txt-for-each-or-ordering-them-in-any-way.txt | 7 +++++++ 1 file changed, 7 insertions(+) create mode 100644 Various-random-things-explained-I-dont-feel-like-creating-a-separate-txt-for-each-or-ordering-them-in-any-way.txt diff --git a/Various-random-things-explained-I-dont-feel-like-creating-a-separate-txt-for-each-or-ordering-them-in-any-way.txt b/Various-random-things-explained-I-dont-feel-like-creating-a-separate-txt-for-each-or-ordering-them-in-any-way.txt new file mode 100644 index 0000000..0a3ae06 --- /dev/null +++ b/Various-random-things-explained-I-dont-feel-like-creating-a-separate-txt-for-each-or-ordering-them-in-any-way.txt @@ -0,0 +1,7 @@ +A supervisor call happens when the svc (previously called swi) instruction gets executed. An exception is then entered. A supervisor call is the standard way for a user process to ask the kernel for something. As user code might request many different things, the kernel must somehow know which one was requested. The svc instruction takes one immediate operand. 
The supervisor call exception handler can check at what address execution was taking place, read the svc instruction from there and inspect its bytes. This way, by executing svc with different immediate values, the user mode code can request different things from the kernel - the value in svc encodes the request's type. +To save time and for the sake of simplicity, we don't make use of immediates in svc and instead encode the call's type in r0. In our implementation we decided that a supervisor call preserves and clobbers the same registers as a function call and returns values through r0, just as a function call does. This enables us to perform the supervisor call as a call to a function defined in src/arm/PL0/svc.S. Calls from C are performed in src/arm/PL0/PL0_utils.c and request type encodings are defined in src/arm/common/svc_interface.h (they must be known to both user mode code and handler code). + +We've compiled useful utilities (e.g. memcpy(), strlen()) in src/arm/common/strings.c. Those do not depend on the environment and can be used by user mode code, kernel code and even bootloader code. +Functions used for I/O (like puts()) are also defined in a common way for privileged and unprivileged code. They do, however, rely on the existence of putchar() and getchar(). In PL0 code (src/arm/PL0/PL0_utils.c), putchar() and getchar() are defined to perform a supervisor call that does the actual I/O. In the PL1 code, they are defined as operations on UART. + +src/arm/PL1/PL1_common/uart.c implements putchar() and getchar() in terms of UART. Those implementations are blocking - they poll UART peripheral registers in a loop, checking if the device is ready to perform the operation. They are, however, accompanied by functions getchar_non_blocking() and putchar_non_blocking(), which check **once** if the device is ready and only perform the operation if it is. Otherwise, they return an error value. Their purpose is to be used with interrupts. 
With interrupt-driven UART we avoid waiting in a loop - instead, an IRQ comes when the desired UART operation completes. The code that wants to write to or read from UART does, however, need to tie its operation to the IRQ handler and scheduler. -- cgit v1.2.3
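
The divisor arithmetic that uart_init() performs in the "make UART our own" patch can be tried out on a host machine. The following is a minimal sketch, not the project's code. It assumes the round-to-nearest rule given in the PL011 manual for the fractional part, FBRD = integer(fraction * 64 + 0.5), so it may differ from the patch's expression for some baud rates. The names compute_baud_divisor and struct baud_divisor are hypothetical; the 3 MHz clock is the value of DEFAULT_UART_CLOCK_RATE in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two values written to IBRD and FBRD. */
struct baud_divisor {
	uint32_t ibrd; /* integer part of clock / (16 * baud) */
	uint32_t fbrd; /* fractional part, scaled by 64 and rounded */
};

/* Sketch only (assumed helper, not part of the repository). */
struct baud_divisor compute_baud_divisor(uint32_t clock_hz, uint32_t baud)
{
	assert(baud != 0);

	/* 64-bit arithmetic so that 16 * baud and rest * 128 cannot overflow */
	uint64_t denom = 16ull * baud;
	uint64_t rest = clock_hz % denom;
	struct baud_divisor d;

	d.ibrd = (uint32_t)(clock_hz / denom);

	/* round(rest * 64 / denom) computed in integers:
	 * floor((2 * rest * 64 + denom) / (2 * denom)) */
	d.fbrd = (uint32_t)((2 * rest * 64 + denom) / (2 * denom));

	/* A fraction very close to 1 rounds up to 64, which does not fit
	 * the 6-bit FBRD field; carry it into the integer part instead. */
	if (d.fbrd == 64) {
		d.ibrd += 1;
		d.fbrd = 0;
	}

	return d;
}
```

With the patch's defaults (3000000 Hz clock, 115200 baud) this yields IBRD = 1 and FBRD = 40, the same values that the removed code wrote as constants.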