finish explaining how MMU works and how we use it

author: Wojtek Kosior <kwojtus@protonmail.com> 2020-01-15 16:45:35 +0100
committer: Wojtek Kosior <kwojtus@protonmail.com> 2020-01-15 16:45:35 +0100
commit: 65328387c7880af777dcfd9c399cb453a87b44c1 (patch)
tree: 6cb8e318db3467ccce71b471599203394cf815e3
parent: c77286c6951223be1c216c19278cecca3b43ceb5 (diff)
download: rpi-MMU-example-65328387c7880af777dcfd9c399cb453a87b44c1.tar.gz
rpi-MMU-example-65328387c7880af777dcfd9c399cb453a87b44c1.zip
1 files changed, 23 insertions, 5 deletions
diff --git a/MMU-explained.txt b/MMU-explained.txt
index 61fbd9c..e640aaa 100644
--- a/MMU-explained.txt
+++ b/MMU-explained.txt
@@ -14,9 +14,10 @@ This aids operating system's memory management in several ways
 
 A given mapping can be made valid for only one execution mode (e.g. a region accessible only from privileged mode) or only for certain types of accesses (e.g. a memory region can be made non-executable, which guards against program code accidentally jumping there; this is important for countering buffer-overflow exploits). A disallowed access triggers a processor exception, which passes control to an appropriate interrupt service routine.
 
-General configuration of the MMU in ARM processors it is present on is done through registers of the appropriate coprocessor (cp15). Translations are managed through translation table. It is an array of 32-bit or 64-bit entries (also called descriptors) describing how their corresponding memory regions should be mapped. A number of leftmost bits of a virtual address constitutes an index into the translation table to be used for translating it. This way no virtual addresses need to be stored in the table and MMU can perform translations in O(1) time.
 
+The Raspberry Pi boards we use have ARMv7-A-compatible processors, which we currently run only in 32-bit mode. The information here applies to those systems (there are Pi boards with both older and newer processors, with fewer or more features available).
 
+On ARM processors that have an MMU, its general configuration is done through registers of the appropriate coprocessor (cp15). Translations are managed through a translation table. It is an array of 32-bit or 64-bit entries (also called descriptors) describing how their corresponding memory regions should be mapped. A number of the leftmost bits of a virtual address constitutes an index into the translation table, selecting the entry used to translate that address. This way no virtual addresses need to be stored in the table and the MMU can perform translations in O(1) time.
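
As an illustration of the indexing described above, here is a minimal sketch in C. It assumes the ARMv7-A short-descriptor format with 1 MiB sections, where the top 12 bits of a virtual address select one of 4096 table entries. The function and macro names are made up for this example and are not taken from the project.

```c
#include <stdint.h>

/* Assumption: 1 MiB sections, so the low 20 bits are the offset
 * within a section and the top 12 bits are the table index. */
#define SECTION_SHIFT       20u
#define SECTION_OFFSET_MASK 0x000FFFFFu

/* Index of the translation table entry used for this virtual address. */
static inline uint32_t section_index(uint32_t va)
{
    return va >> SECTION_SHIFT;
}

/* Physical address obtained from a section descriptor: the section base
 * comes from the descriptor, the offset comes from the virtual address. */
static inline uint32_t section_translate(uint32_t descriptor, uint32_t va)
{
    return (descriptor & ~SECTION_OFFSET_MASK) | (va & SECTION_OFFSET_MASK);
}
```

No search is involved: the entry is found with a single shift, which is why the lookup cost does not depend on how many mappings exist.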
 
 Coprocessor 15 contains several registers that control the behaviour of the MMU. They are all accessed through the mcr and mrc ARM instructions.
 1. SCTLR, System Control Register - "provides the top level control of the system, including its memory system"
@@ -25,7 +26,7 @@ Coprocessor 15 contains several registers, that control the behaviour of the MMU
       · whether data cache is enabled
       · whether instruction cache is enabled
       · whether TEX remap is enabled
-         TEX remap is a feacher, that changes how some translation table entry bit fields (called C, B and TEX) are used. We're not using TEX remap in our project.
+         TEX remap is a feature that changes how some translation table entry bit fields (called C, B and TEX) are used. We're not using TEX remap in our project.
       · whether access flags are enabled
          Enabling access flag causes one translation table descriptor bit normally used to specify access permissions of a region to be used as access flag. We don't use this feature either
 2. DACR, Domain Access Control Register - "defines the access permission for each of the sixteen memory domains"
@@ -49,11 +50,28 @@ Besides, translation table descriptor also specifies:
 2. Other memory attributes (cacheability, shareability).
 3. which domain the memory belongs to.
 
-Page table is something simillar to translation table, but it's entries define smaller regions (called, well - pages). When a translation table descriptor describing a page table gets used for translation, then entry in that page table (with some middle bits of the virtual address used as index into it) is fetched and used. This allows for better granularity of mappings while not requiring the page tables to occupy space if small pages are not needed. We can say, that 2-level translations are performed. On some versions of ARM translations can have more levels than here.
+A page table is something similar to a translation table, but its entries define smaller regions (called, well, pages). When a translation table descriptor describing a page table gets used for translation, an entry in that page table (with some middle bits of the virtual address used as an index into it) is fetched and used. This allows for better granularity of mappings while not requiring the page tables to occupy space if small pages are not needed. We can say that 2-level translations are performed. On some versions of ARM, translations can have more levels than here. This means the MMU might sometimes need to fetch several entries from tables of different levels to compute the physical address. This is called a translation table walk.
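
The 2-level lookup can be sketched in C as follows. This is only an illustration, assuming the ARMv7-A short-descriptor format with 4 KiB small pages (256-entry page tables aligned to 1 KiB). All names are hypothetical, and, as noted below, the project itself does not use page tables yet.

```c
#include <stdint.h>

/* Pieces of a 32-bit virtual address when 4 KiB small pages are used. */
struct va_parts {
    uint32_t l1_index;    /* bits [31:20]: index into the translation table */
    uint32_t l2_index;    /* bits [19:12]: index into the page table */
    uint32_t page_offset; /* bits [11:0]:  offset within the page */
};

static inline struct va_parts split_va(uint32_t va)
{
    struct va_parts p = { va >> 20, (va >> 12) & 0xFFu, va & 0xFFFu };
    return p;
}

/* Address of the page table entry to fetch, given a first-level descriptor
 * that points to a page table (table base in bits [31:10]). Entries are
 * 4 bytes wide, hence the shift by 2. */
static inline uint32_t l2_entry_address(uint32_t l1_descriptor, uint32_t va)
{
    return (l1_descriptor & 0xFFFFFC00u) | (split_va(va).l2_index << 2);
}

/* Final physical address, given a small page descriptor
 * (page base in bits [31:12]). */
static inline uint32_t small_page_translate(uint32_t l2_descriptor, uint32_t va)
{
    return (l2_descriptor & 0xFFFFF000u) | split_va(va).page_offset;
}
```

A walk for one address would thus read the translation table entry at l1_index, then the page table entry at l2_entry_address(), and finally combine the page base with the offset.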
 
 As of 15.01.2020 page tables and small pages are not used in the project (although programming them is on the TODO list).
 
-Our project uses C bitfield structs for operating on coprocessor registers' contents and translation table descriptors. This is an elegant and readable approach, yet little-portable across compilers. Current struct definitions are sure to work properly with GCC.
-
 Despite the overwhelming amount of configuration options available, most can be left with defaults and this is how it's done in this project. Those default settings usually make the MMU behave as in older ARM versions, when some options were not yet available (and hence, the entire system was simpler).
 
+Our project uses C bitfield structs for operating on SCTLR and TTBR contents and on translation table descriptors (for DACR, bit shifts are more appropriate, and for TTBCR, our default configuration means just writing 0 to the register). This is an elegant and readable approach, yet not very portable across compilers. Current struct definitions are sure to work properly with GCC.
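
To show what the bitfield approach looks like, here is a minimal sketch of a section descriptor written as a C bitfield struct. It is not the project's actual definition (see the header files mentioned in this document for that); the type and field names are hypothetical. The field order assumes that bitfields are allocated starting from the least significant bit, which is what GCC does on little-endian targets but is not guaranteed by the C standard.

```c
#include <stdint.h>

/* Hypothetical layout of an ARMv7-A short-format section descriptor,
 * listed from bit 0 upwards. */
struct section_descriptor_sketch {
    uint32_t pxn          : 1;  /* bit 0 */
    uint32_t one          : 1;  /* bit 1: must be 1 for a section */
    uint32_t b            : 1;  /* bit 2 */
    uint32_t c            : 1;  /* bit 3 */
    uint32_t xn           : 1;  /* bit 4: execute-never */
    uint32_t domain       : 4;  /* bits [8:5] */
    uint32_t impl_defined : 1;  /* bit 9 */
    uint32_t ap_1_0       : 2;  /* bits [11:10]: AP[1:0] */
    uint32_t tex          : 3;  /* bits [14:12] */
    uint32_t ap_2         : 1;  /* bit 15: AP[2] */
    uint32_t s            : 1;  /* bit 16: shareable */
    uint32_t ng           : 1;  /* bit 17: not global */
    uint32_t zero         : 1;  /* bit 18: 0 for a section */
    uint32_t ns           : 1;  /* bit 19 */
    uint32_t base         : 12; /* bits [31:20]: section base address */
};

/* The union gives access to the same 32 bits as a raw word. */
union descriptor_sketch {
    uint32_t raw;
    struct section_descriptor_sketch fields;
};

_Static_assert(sizeof(union descriptor_sketch) == 4,
               "descriptor must be exactly 32 bits");

/* Build a descriptor for a section that is accessible from PL1 only
 * and belongs to domain 0. */
static inline uint32_t make_section(uint32_t phys_base_mib)
{
    union descriptor_sketch d = { .raw = 0 };
    d.fields.one    = 1;
    d.fields.ap_1_0 = 1;  /* AP[2:0] = 0b001: PL1 access only */
    d.fields.domain = 0;
    d.fields.base   = phys_base_mib;
    return d.raw;
}
```

The named fields make the intent easy to read, but the raw value is only correct if the compiler lays the fields out as assumed, which is the portability concern mentioned above.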
+
+Structs describing SCTLR, DACR and TTBR are defined in src/arm/PL1/kernel/cp_regs.h, while those describing translation table descriptors - in src/arm/PL1/kernel/translation_table_descriptors.h.
+
+Before the MMU is enabled, all memory is seen as it really is, i.e. every address used by code is a physical address. Code that turns the MMU on keeps executing from the same addresses afterwards, so the only feasible way of enabling it is to first set the descriptors in the translation table to map all addresses (mapping just the addresses used by the kernel would be enough) to themselves. This is called a flat map.
+
+Here is how setting up a flat map, turning on the MMU and managing memory sections are done in our project:
+1. The translation table is defined in the linker script src/arm/PL1/kernel/kernel_stage2.ld as a NOLOAD section. C code gets the table's start and end addresses from symbols defined in that linker script (see arm/PL1/kernel/memory.h).
+2. Function setup_flat_map() defined in arm/PL1/kernel/paging.c enables the MMU with a flat map. It prints relevant information to UART while performing the following operations:
+   · In a loop writes all descriptors to the translation table, setting them as sections, accessible from PL1 only, belonging to domain 0.
+   · Sets DACR to allow domain 0 memory accesses based on translation table descriptor permissions and block accesses to other domains (only domain 0 is used in this project).
+   · Makes sure TEX remap, access flag, caches and the MMU are disabled in SCTLR. Disabling some of them might be unnecessary, because the MMU is assumed to be disabled at start and enabled caches might cause no problems as long as only a flat map is used. Still, the way it is done right now is known to work well and optimizations are not needed.
+   · Clears all caches and TLBs (again, it is suspected that at least some of this is unnecessary).
+   · Writes a TTBCR setting that causes only one translation table, with 32-bit entries, to be used.
+   · Makes TTBR0 point to the start of the translation table. The rest of the attributes in TTBR0 (concerning how table entries are accessed) are left as 0s (defaults).
+   · Enables the MMU and caches by setting the appropriate bits in SCTLR.
+After some cp15 register writes, the isb assembly instruction is used, which causes the ARM core to wait until the changes take effect (otherwise some later instructions could possibly be executed before this happens).
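
The steps above can be sketched roughly as follows. This is not the project's setup_flat_map(); it is a simplified illustration using plain bit shifts instead of bitfield structs. All names are hypothetical, data cache maintenance is left out, and the caller is assumed to pass a 16 KiB-aligned table. The cp15 part is only compiled for 32-bit ARM targets (ARMv7 or newer), so the table-filling part can also be tried on another machine.

```c
#include <stdint.h>

#define SECTION_COUNT  4096u                    /* 4 GiB / 1 MiB */
#define DESC_SECTION   (1u << 1)                /* bits [1:0] = 0b10 */
#define DESC_AP_PL1    (1u << 10)               /* AP[2:0] = 0b001: PL1 only */
#define DESC_DOMAIN(n) ((uint32_t)(n) << 5)     /* bits [8:5] */
#define DACR_CLIENT(n) (1u << (2 * (n)))        /* 0b01: check permissions */

#define SCTLR_M   (1u << 0)   /* MMU enable */
#define SCTLR_C   (1u << 2)   /* data cache enable */
#define SCTLR_I   (1u << 12)  /* instruction cache enable */
#define SCTLR_TRE (1u << 28)  /* TEX remap enable */
#define SCTLR_AFE (1u << 29)  /* access flag enable */

/* Map every 1 MiB section to itself: PL1-only access, domain 0. */
static void fill_flat_map(uint32_t *table)
{
    for (uint32_t i = 0; i < SECTION_COUNT; i++)
        table[i] = (i << 20) | DESC_DOMAIN(0) | DESC_AP_PL1 | DESC_SECTION;
}

#if defined(__arm__) && defined(__ARM_ARCH) && __ARM_ARCH >= 7
static void enable_mmu_flat_sketch(uint32_t *table)
{
    uint32_t sctlr;

    fill_flat_map(table);

    /* DACR: domain 0 is a client, all other domains have no access. */
    asm volatile("mcr p15, 0, %0, c3, c0, 0" :: "r"(DACR_CLIENT(0)));

    /* SCTLR: make sure MMU, caches, TEX remap and access flag are off. */
    asm volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(sctlr));
    sctlr &= ~(SCTLR_M | SCTLR_C | SCTLR_I | SCTLR_TRE | SCTLR_AFE);
    asm volatile("mcr p15, 0, %0, c1, c0, 0\n\tisb" :: "r"(sctlr) : "memory");

    /* Invalidate TLBs and instruction cache (data cache is omitted here). */
    asm volatile("mcr p15, 0, %0, c8, c7, 0\n\t"
                 "mcr p15, 0, %0, c7, c5, 0\n\t"
                 "dsb\n\tisb" :: "r"(0) : "memory");

    /* TTBCR = 0, then TTBR0 = table address with attribute bits left as 0. */
    asm volatile("mcr p15, 0, %0, c2, c0, 2" :: "r"(0));
    asm volatile("mcr p15, 0, %0, c2, c0, 0\n\tisb"
                 :: "r"((uint32_t)table) : "memory");

    /* SCTLR: enable the MMU and caches. */
    sctlr |= SCTLR_M | SCTLR_C | SCTLR_I;
    asm volatile("mcr p15, 0, %0, c1, c0, 0\n\tisb" :: "r"(sctlr) : "memory");
}
#endif
```

Because every entry maps a section to itself, the instruction that follows the final SCTLR write is fetched from the same address both before and after the MMU is turned on.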
+
+In arm/PL1/kernel/paging.c the function claim_and_map_section() can be used to modify an entry in the translation table to create a new mapping. Memory allocation, also done in that source file, uses some lists to describe free and taken sections and has nothing to do with the MMU.