re-add problems

author: Wojtek Kosior <kwojtus@protonmail.com> 2020-01-21 17:59:14 +0100
committer: Wojtek Kosior <kwojtus@protonmail.com> 2020-01-21 17:59:14 +0100
commit: f9435ea17e8d5143651da9e0f530a77f36aa7ebb (patch)
tree: 0cc30f34d59fdb8e0b91594f83ea3eccfdd17ef7
parent: 01e7392be71591fc6ae8de2bcb53734527b7190e (diff)
download: rpi-MMU-example-f9435ea17e8d5143651da9e0f530a77f36aa7ebb.tar.gz
rpi-MMU-example-f9435ea17e8d5143651da9e0f530a77f36aa7ebb.zip
2 files changed, 116 insertions, 5 deletions
diff --git a/README.md b/README.md
index 56aa06c..5afae68 100644
--- a/README.md
+++ b/README.md
@@ -68,8 +68,21 @@
 <li><a href="#sec-13-4">13.4. UARTs</a></li>
 </ul>
 </li>
-<li><a href="#sec-14">14. Afterword</a></li>
-<li><a href="#sec-15">15. Sources of Information</a></li>
+<li><a href="#sec-14">14. Problems faced</a>
+<ul>
+<li><a href="#sec-14-1">14.1. Ramfs alignment</a></li>
+<li><a href="#sec-14-2">14.2. <i>COM</i> section</a></li>
+<li><a href="#sec-14-3">14.3. Bare-metal position-independent code</a></li>
+<li><a href="#sec-14-4">14.4. Linker section naming</a></li>
+<li><a href="#sec-14-5">14.5. Context switches</a></li>
+<li><a href="#sec-14-6">14.6. Different modes' sp register</a></li>
+<li><a href="#sec-14-7">14.7. Switching between system mode and user mode</a></li>
+<li><a href="#sec-14-8">14.8. UART interrupt masking</a></li>
+<li><a href="#sec-14-9">14.9. Terminal stdin breaking</a></li>
+</ul>
+</li>
+<li><a href="#sec-15">15. Afterword</a></li>
+<li><a href="#sec-16">16. Sources of Information</a></li>
 </ul>
 </div>
 </div>
@@ -1344,7 +1357,61 @@ disabled when being configured, which is also fulfilled by uart\\<sub>init</sub>
 The PL011 is toroughly described in
 [BCM2837 ARM Peripherals](https://cs140e.sergio.bz/docs/BCM2837-ARM-Peripherals.pdf) as well as [PrimeCell UART (PL011) Technical Reference Manual](http://infocenter.arm.com/help/topic/com.arm.doc.ddi0183f/DDI0183.pdf).
 
-# Afterword<a id="sec-14" name="sec-14"></a>
+# Problems faced<a id="sec-14" name="sec-14"></a>
+
+## Ramfs alignment<a id="sec-14-1" name="sec-14-1"></a>
+
+Our ramfs needs to be 4-byte aligned in memory, but when objcopy creates the embeddable file, it doesn't (at least by default) mark its data section as requiring 2\*\*2 alignment. There has to be a .=ALIGN(4) line in the linker script before ramfs\_embeddable.o. At some point we forgot about it, which caused the ramfs to misbehave.
+Bugs located in the linker script, like this one, are often non-obvious, which makes them hard to trace.
+
+## *COM* section<a id="sec-14-2" name="sec-14-2"></a>
+
+Many sources mention *COMMON* as the section in object files produced by compilation that contains a specific kind of uninitialized (0-initialized) data (similar to .bss). Obviously, it has to be included in the linker script.
+Unfortunately, gcc names this section differently, namely *COM*. This caused our linker script to not include it in the actual image. Instead, it was placed somewhere after the last section defined in the linker script. This happened to be after our NOLOAD stack section, where the first free MMU section is. Due to how our memory management algorithm works, this part of physical memory always gets allocated to the first process, which gets its code copied there.
+This bug caused incredibly weird behaviour. The user space code would fail with either an abort or an undefined instruction, always on the second PL0 instruction. That was because some statically allocated scheduler variable in *COM* was getting mapped at that address. It took probably a few hours of analysing generated assembly in radare2 and modifying [scheduler.c](../src/arm/PL1/kernel/scheduler.c) and [PL0\_test.c](../src/arm/PL0/PL0_test.c) to find that the problem lay in the linker script.
+
+## Bare-metal position-independent code<a id="sec-14-3" name="sec-14-3"></a>
+
+We wanted to make the bootloader and kernel able to run regardless of the address they are loaded at (see also the comment in [kernel's stage1 linker script](../src/arm/PL1/kernel/kernel.ld)).
+To achieve this goal, we added -fPIC to the compilation options of all ARM code. With this, we decided that, instead of embedding code in other code using objcopy, we could put relevant pieces of code in separate linker script sections, link them together and then copy entire sections to some other address at runtime. For example, the exception vector would be linked with the actual kernel (loaded at 0x8000), but then copied along with the exception handling routines to 0x0. It did work in 2 cases (the exception vector and libkernel), but once most of the project was modified to use this method of code embedding, it turned out to be faulty and work had to be done to move back to the use of objcopy.
+The problem is that -fPIC (as well as -fPIE) requires code to be loaded by some operating system or bootloader that can fill in its GOT (global offset table). This is not done in an environment like ours.
+It is possible to generate ARM bare-metal position-independent code that would work without a GOT, but support for this is not implemented in gcc and is not a common feature in general.
+The solution was to write stage1 of both the bootloader and the kernel in careful, position-independent assembly. This required more effort, but was ultimately successful.
+
+## Linker section naming<a id="sec-14-4" name="sec-14-4"></a>
+
+Weird behaviour occurs when trying to link object files with nonstandard section names using the GNU linker. Output sections defined in the linker script didn't cause problems in our case. Problems occurred when input sections were nonstandard (such as sections generated by using \_\_attribute\_\_((section("name"))) in GCC-compiled C code), as they would not be included or would be included in the wrong place, despite being explicitly listed for inclusion in the linker script's SECTIONS command.
+At one point, renaming a section from .boot to .text.boot made the code work properly.
+
+## Context switches<a id="sec-14-5" name="sec-14-5"></a>
+
+This section describes a mistake we made while working on the project.
+At first, we didn't know about the special features of the SUBS pc, lr and ldm rn, {pc}^ instructions. Our code would switch to user mode by branching to code in a PL0-accessible memory section and having it execute the cps instruction. This worked, but was not good, because code executed by the kernel was in a memory section writable by userspace code.
+The first improvement was separating that code into "libkernel". Libkernel would be in a PL0-executable but non-writable section and would perform the switch.
+It did work; however, it was not the right way.
+We later learned how to achieve the same with subs/ldm and removed libkernel, making the project a bit simpler.
+
+## Different modes' sp register<a id="sec-14-6" name="sec-14-6"></a>
+
+System mode has a separate stack pointer from supervisor mode, so upon a switch from supervisor to system mode it has to be set to point to the actual stack.
+At first we didn't know about that, which led to undefined behaviour. At some points during development, changing a line of code in one place would make a bug occur or not occur in some other, unrelated place in the kernel.
+
+## Switching between system mode and user mode<a id="sec-14-7" name="sec-14-7"></a>
+
+It is also not allowed (undefined behaviour) to switch from system mode directly to user mode, which we were not aware of and which also caused some problems/bugs.
+
+## UART interrupt masking<a id="sec-14-8" name="sec-14-8"></a>
+
+Both the BCM2835 ARM Peripherals manual and the manual for the PL011 UART itself say that writing 0s to PL011\_UART\_IMSC unmasks specific interrupts. Practical experiments showed that it's the opposite: writing 1s enables specific interrupts and writing 0s disables them.
+UART code on wiki.osdev was also written to disable interrupts in the way described in the manuals. The interrupts were then unmasked instead of masked. This didn't cause problems in practice, as UART interrupts also have to be unmasked elsewhere (the register defined as ARM\_ENABLE\_IRQS\_2 in [interrupts.h](../src/arm/PL1/kernel/interrupts.h)) to actually occur.
+
+## Terminal stdin breaking<a id="sec-14-9" name="sec-14-9"></a>
+
+The very simple pipe\_image program breaks stdin when run.
+Even other programs run in the same (bash) shell after pipe\_image cannot read from stdin.
+In zsh, other commands run interactively after pipe\_image do work, but commands executed after pipe\_image inside a shell function still exhibit the problem.
+
+# Afterword<a id="sec-15" name="sec-15"></a>
 
 This project has been done as part of the Embedded Systems course on
 [AGH University of Science and Technology](https://www.agh.edu.pl/en/). The goal of the project was to investigate and program the
@@ -1378,7 +1445,7 @@ colleagues who happen to be work with the codebase.
 
 In case on any bugs or questions, the authors can be contacted at kwojtus@protonmail.com.
 
-# Sources of Information<a id="sec-15" name="sec-15"></a>
+# Sources of Information<a id="sec-16" name="sec-16"></a>
 
 -   wiki.osdev.org
 -   ARM GCC Inline Assembler Cookbook - <http://www.ethernut.de/en/documents/arm-inline-asm.html>
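
The "Ramfs alignment" entry added in the README.md hunk above hinges on what `. = ALIGN(4)` does to the linker's location counter: it rounds the current address up to the next multiple of 4. A minimal C sketch of that rounding, for illustration only; `align_up` is a hypothetical helper and not part of the project:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the project): round addr up to the next
 * multiple of alignment, which must be a non-zero power of two. */
static uint32_t align_up(uint32_t addr, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}
```

Under this sketch, an address that is already a multiple of 4 is returned unchanged, while any other address moves forward by at most 3 bytes, which is why forgetting the directive can go unnoticed until the preceding section happens to end on an odd boundary.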
diff --git a/README.org b/README.org
index ab071dd..d30b091 100644
--- a/README.org
+++ b/README.org
@@ -235,7 +235,6 @@ Rule clean removes all the files generated in build/.
 
 Rules that don't generate files are marked as PHONY.
 
-
 * Project structure
   Directory structure of the project:
 
@@ -1278,6 +1277,51 @@ disabled when being configured, which is also fulfilled by uart\_init().
 The PL011 is toroughly described in
 [[https://cs140e.sergio.bz/docs/BCM2837-ARM-Peripherals.pdf][BCM2837 ARM Peripherals]] as well as [[http://infocenter.arm.com/help/topic/com.arm.doc.ddi0183f/DDI0183.pdf][PrimeCell UART (PL011) Technical Reference Manual]].
 
+* Problems faced
+
+** Ramfs alignment
+Our ramfs needs to be 4-byte aligned in memory, but when objcopy creates the embeddable file, it doesn't (at least by default) mark its data section as requiring 2**2 alignment. There has to be a .=ALIGN(4) line in the linker script before ramfs_embeddable.o. At some point we forgot about it, which caused the ramfs to misbehave.
+Bugs located in the linker script, like this one, are often non-obvious, which makes them hard to trace.
+
+** /COM/ section
+Many sources mention /COMMON/ as the section in object files produced by compilation that contains a specific kind of uninitialized (0-initialized) data (similar to .bss). Obviously, it has to be included in the linker script.
+Unfortunately, gcc names this section differently, namely /COM/. This caused our linker script to not include it in the actual image. Instead, it was placed somewhere after the last section defined in the linker script. This happened to be after our NOLOAD stack section, where the first free MMU section is. Due to how our memory management algorithm works, this part of physical memory always gets allocated to the first process, which gets its code copied there.
+This bug caused incredibly weird behaviour. The user space code would fail with either an abort or an undefined instruction, always on the second PL0 instruction. That was because some statically allocated scheduler variable in /COM/ was getting mapped at that address. It took probably a few hours of analysing generated assembly in radare2 and modifying [[../src/arm/PL1/kernel/scheduler.c][scheduler.c]] and [[../src/arm/PL0/PL0_test.c][PL0_test.c]] to find that the problem lay in the linker script.
+
+** Bare-metal position-independent code
+We wanted to make the bootloader and kernel able to run regardless of the address they are loaded at (see also the comment in [[../src/arm/PL1/kernel/kernel.ld][kernel's stage1 linker script]]).
+To achieve this goal, we added -fPIC to the compilation options of all ARM code. With this, we decided that, instead of embedding code in other code using objcopy, we could put relevant pieces of code in separate linker script sections, link them together and then copy entire sections to some other address at runtime. For example, the exception vector would be linked with the actual kernel (loaded at 0x8000), but then copied along with the exception handling routines to 0x0. It did work in 2 cases (the exception vector and libkernel), but once most of the project was modified to use this method of code embedding, it turned out to be faulty and work had to be done to move back to the use of objcopy.
+The problem is that -fPIC (as well as -fPIE) requires code to be loaded by some operating system or bootloader that can fill in its GOT (global offset table). This is not done in an environment like ours.
+It is possible to generate ARM bare-metal position-independent code that would work without a GOT, but support for this is not implemented in gcc and is not a common feature in general.
+The solution was to write stage1 of both the bootloader and the kernel in careful, position-independent assembly. This required more effort, but was ultimately successful.
+
+** Linker section naming
+Weird behaviour occurs when trying to link object files with nonstandard section names using the GNU linker. Output sections defined in the linker script didn't cause problems in our case. Problems occurred when input sections were nonstandard (such as sections generated by using __attribute__((section("name"))) in GCC-compiled C code), as they would not be included or would be included in the wrong place, despite being explicitly listed for inclusion in the linker script's SECTIONS command.
+At one point, renaming a section from .boot to .text.boot made the code work properly.
+
+** Context switches
+This section describes a mistake we made while working on the project.
+At first, we didn't know about the special features of the SUBS pc, lr and ldm rn, {pc}^ instructions. Our code would switch to user mode by branching to code in a PL0-accessible memory section and having it execute the cps instruction. This worked, but was not good, because code executed by the kernel was in a memory section writable by userspace code.
+The first improvement was separating that code into "libkernel". Libkernel would be in a PL0-executable but non-writable section and would perform the switch.
+It did work; however, it was not the right way.
+We later learned how to achieve the same with subs/ldm and removed libkernel, making the project a bit simpler.
+
+** Different modes' sp register
+System mode has a separate stack pointer from supervisor mode, so upon a switch from supervisor to system mode it has to be set to point to the actual stack.
+At first we didn't know about that, which led to undefined behaviour. At some points during development, changing a line of code in one place would make a bug occur or not occur in some other, unrelated place in the kernel.
+
+** Switching between system mode and user mode
+It is also not allowed (undefined behaviour) to switch from system mode directly to user mode, which we were not aware of and which also caused some problems/bugs.
+
+** UART interrupt masking
+Both the BCM2835 ARM Peripherals manual and the manual for the PL011 UART itself say that writing 0s to PL011_UART_IMSC unmasks specific interrupts. Practical experiments showed that it's the opposite: writing 1s enables specific interrupts and writing 0s disables them.
+UART code on wiki.osdev was also written to disable interrupts in the way described in the manuals. The interrupts were then unmasked instead of masked. This didn't cause problems in practice, as UART interrupts also have to be unmasked elsewhere (the register defined as ARM_ENABLE_IRQS_2 in [[../src/arm/PL1/kernel/interrupts.h][interrupts.h]]) to actually occur.
+
+** Terminal stdin breaking
+The very simple pipe_image program breaks stdin when run.
+Even other programs run in the same (bash) shell after pipe_image cannot read from stdin.
+In zsh, other commands run interactively after pipe_image do work, but commands executed after pipe_image inside a shell function still exhibit the problem.
+
 * Afterword
 
 This project has been done as part of the Embedded Systems course on
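
The "UART interrupt masking" entry in this patch reports that, in the authors' experiments, writing 1s to PL011_UART_IMSC enables interrupts and writing 0s disables them. A minimal C sketch of computing a new register value under that observed behaviour follows. The helper name, the bit constants (RXIM as bit 4, TXIM as bit 5) and the valid-bits mask are illustrative assumptions rather than code from the project, and no hardware is accessed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed IMSC bit positions, for illustration only. */
#define IMSC_RXIM       (UINT32_C(1) << 4)  /* receive interrupt */
#define IMSC_TXIM       (UINT32_C(1) << 5)  /* transmit interrupt */
#define IMSC_VALID_BITS UINT32_C(0x7FF)     /* assumed: bits 0..10 defined */

/* Hypothetical helper: return the IMSC value with the given interrupt
 * bits enabled (set to 1) or disabled (cleared to 0), following the
 * behaviour the authors observed in practice. */
static uint32_t imsc_update(uint32_t imsc, uint32_t bits, bool enable)
{
    assert((bits & ~IMSC_VALID_BITS) == 0);
    return enable ? (imsc | bits) : (imsc & ~bits);
}
```

On real hardware the returned value would still have to be written to the register, and, as the entry notes, the interrupt must also be unmasked through ARM_ENABLE_IRQS_2 before it can actually occur.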