From c5532c2b0fcca6ddfb838f050231fabbf92cdcaa Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Tue, 14 Jan 2020 18:17:21 +0100 Subject: delete old debug line --- Makefile | 1 - 1 file changed, 1 deletion(-) diff --git a/Makefile b/Makefile index e9d9852..66f3b0a 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,6 @@ # actual recipes for everything are in build/Makefile; % : - echo generic $(MAKE) -C build $@ # below is just for shell auto-completion -- cgit v1.2.3 From f5c270d1b5177a9c0c006356ed2b8b32301d7491 Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Wed, 15 Jan 2020 02:10:05 +0100 Subject: add more explaination about how MMU works --- MMU-explained.txt | 43 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/MMU-explained.txt b/MMU-explained.txt index af350ff..61fbd9c 100644 --- a/MMU-explained.txt +++ b/MMU-explained.txt @@ -14,5 +14,46 @@ This aids operating system's memory management in several ways A given mapping can be made valid for only one execution mode (i.e. region only accessible from privileged mode) or only certain types of accesses (i.e. a memory region can be made non-executable, which guards against accidental jumping there by program code (important for countering buffer-overflow exploits)). An unallowed access triggers a processor exception, which passes control to an appropriate interrupt service routine. -General configuration of the MMU in ARM processors it is present on is done through registers of the appropriate coprocessor (cp15). Translations are managed through translation table. It is an array of 32-bit or 64-bit entries describing how their corresponding memory regions should be mapped. A number of leftmost bits of a virtual address constitutes an index into the translation table to be used for translating it. This way no virtual addresses need to be stored in the table and MMU can perform translations in O(1) time. 
+General configuration of the MMU in ARM processors it is present on is done through registers of the appropriate coprocessor (cp15). Translations are managed through translation table. It is an array of 32-bit or 64-bit entries (also called descriptors) describing how their corresponding memory regions should be mapped. A number of leftmost bits of a virtual address constitutes an index into the translation table to be used for translating it. This way no virtual addresses need to be stored in the table and MMU can perform translations in O(1) time. + + + +Coprocessor 15 contains several registers, that control the behaviour of the MMU. They are all accessed through mcr and mrc arm instructions. +1. SCTLR, System Control Register - "provides the top level control of the system, including its memory system" + Bits of this register control, among other things: + · whether the MMU is enabled + · whether data cache is enabled + · whether instruction cache is enabled + · whether TEX remap is enabled + TEX remap is a feacher, that changes how some translation table entry bit fields (called C, B and TEX) are used. We're not using TEX remap in our project. + · whether access flags are enabled + Enabling access flag causes one translation table descriptor bit normally used to specify access permissions of a region to be used as access flag. We don't use this feature either +2. DACR, Domain Access Control Register - "defines the access permission for each of the sixteen memory domains" + Entries in translation table define which of available 16 memory domains a memory region belongs to. Bits of DACR specify what permissions apply to each of the domains. Possible setting are to allow accesses to regions based on settings in translation table descriptor or to allow/disallow all accesses regardless of access permission bits in translation table. +3. 
TTBR0, Translation Table Base Register 0 - "holds the base address of translation table 0, and information about the memory it occupies" + System mode programmer can choose (with respect to some alignment requirements) where in the physical memory to put the translation table. Chosen address (actually, only a number of it's leftmost bits) has to be put in TTBR for the MMU to know where the table lies. Other bits of this register control some memory attributes relevant for accesses to table entries by the MMU +3. TTBR1, Translation Table Base Register 1 - simillar function to TTBR0 (see below for explaination of dual TTBR) +4. TTBCR, Translation Table Base Control Register + Bits of this register control + · How TLBs (Translation Lookaside Buffers) are used. TLBs are a mechanism of caching translation table entries. + · Whether to use some extension feature, that changes traslation table entries and TTBR* lengths to 64-bit (we're not using this, so we won't go into details) + · How a translation table is selected. There can be 2 translation tables and there are 2 cp15 registers (TTBR0 and TTBR1) to hold their base addresses. When 2 tables are in use, then on each memory access some leftmost bits of virtual address determine which one should be used. If the bits are all 0s - TTBR0-pointed table is used. Otherwise - TTBR1 is used. This allows OS developer to use separate translation tables for kernelspace and userspace (i.e. by having the kernelspace code run from virtual addresses starting with 1 and userspace code run from virtual addresses starting with 0). A field of TTBCR determines how many leftmost bits of virtual address are used for that (and also affects TTBR0 format). In the simplest setup (as in our project) this number is 0, so only the table specified in TTBR0 is used. + +Translation table consists of 4096 entries, each describing a 1MB memory region. An entry can be of several types: +1. 
Invalid entry - the corresponding virtual addresses can not be used +2. Section - description of a mapping of 1MB memory region +3. Supersection - description of a mapping of 16MB memory region, that has to be repeated 16 times in consecutive memory sections (can be used to map to physical addresses higher than 2^32) +4. Page table - no mapping is given yet, but a page table is pointed. See below. +Besides, translation table descriptor also specifies: +1. Access permissions. +2. Other memory attributes (cacheability, shareability). +3. which domain the memory belongs to. + +Page table is something simillar to translation table, but it's entries define smaller regions (called, well - pages). When a translation table descriptor describing a page table gets used for translation, then entry in that page table (with some middle bits of the virtual address used as index into it) is fetched and used. This allows for better granularity of mappings while not requiring the page tables to occupy space if small pages are not needed. We can say, that 2-level translations are performed. On some versions of ARM translations can have more levels than here. + +As of 15.01.2020 page tables and small pages are not used in the project (although programming them is on the TODO list). + +Our project uses C bitfield structs for operating on coprocessor registers' contents and translation table descriptors. This is an elegant and readable approach, yet little-portable across compilers. Current struct definitions are sure to work properly with GCC. + +Despite the overhelming amount of configuration options available, most can be left with deafults and this is how it's done in this project. Those default settings usually make the MMU behave as in older ARM versions, when some options were not yet available (and hence, the entire system was simpler). 
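The address arithmetic described in MMU-explained.txt above can be sketched in portable C. This is only an illustration and not code from the project: all names (table_index, uses_ttbr1, flat_section_descriptor, translate_section) are hypothetical. It assumes the 32-bit short descriptor format, where the two lowest bits of an entry give its type, and it leaves the permission, domain and memory attribute fields as 0, so the descriptors it builds only demonstrate the address part.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SHIFT  20u   /* 1MB sections: the low 20 bits are the offset */
#define TABLE_ENTRIES  4096u /* 2^32 / 2^20 entries cover the address space */
#define DESC_TYPE_MASK 0x3u  /* entry type lives in bits [1:0] */
#define DESC_INVALID   0x0u  /* corresponding addresses can not be used */
#define DESC_SECTION   0x2u  /* 1MB section (supersections not handled here) */

/* The leftmost 12 bits of a virtual address index the translation table. */
static uint32_t table_index(uint32_t va)
{
    return va >> SECTION_SHIFT;
}

/* With n leftmost bits used for table selection (the TTBCR field described
 * above), TTBR0 is used when those bits are all 0s and TTBR1 otherwise.
 * n == 0 means that only the TTBR0-pointed table is ever used. */
static int uses_ttbr1(uint32_t va, unsigned n)
{
    assert(n <= 7u); /* the field is 3 bits wide */
    if (n == 0u)
        return 0;
    return (va >> (32u - n)) != 0u;
}

/* Flat map: entry number `index` maps the index-th megabyte to itself. */
static uint32_t flat_section_descriptor(uint32_t index)
{
    return (index << SECTION_SHIFT) | DESC_SECTION;
}

/* Software model of a one-level lookup. Returns 1 and stores the physical
 * address in *pa for a section entry, returns 0 for any other entry type. */
static int translate_section(const uint32_t *table, uint32_t va, uint32_t *pa)
{
    uint32_t entry = table[table_index(va)];

    if ((entry & DESC_TYPE_MASK) != DESC_SECTION)
        return 0;
    *pa = (entry & 0xFFF00000u) | (va & 0x000FFFFFu);
    return 1;
}
```

Filling an array of TABLE_ENTRIES words with flat_section_descriptor(i) and calling translate_section() on any address then gives that same address back, which is the property a flat map is supposed to have. No search is involved, which is why the lookup takes O(1) time.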
-- cgit v1.2.3 From c7b47accc6de3521f10c323983a1b325a60fb421 Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Wed, 15 Jan 2020 15:22:16 +0100 Subject: also enable data and instruction cache when enabling the MMU --- src/arm/PL1/kernel/paging.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/arm/PL1/kernel/paging.c b/src/arm/PL1/kernel/paging.c index 771c681..4c3dccf 100644 --- a/src/arm/PL1/kernel/paging.c +++ b/src/arm/PL1/kernel/paging.c @@ -101,10 +101,11 @@ void setup_flat_map(void) // enable MMU puts("enabling the MMU"); - // redundant - we already have SCTLR contents in the variable - // asm("mrc p15, 0, %0, c1, c0, 0" : "=r" (SCTLR.raw)); + // we already have SCTLR contents in the variable - SCTLR.fields.M = 1; + SCTLR.fields.M = 1; // enable MMU + SCTLR.fields.C = 1; // enable data cache + SCTLR.fields.I = 1; // enable instruction cache asm("mcr p15, 0, %0, c1, c0, 0\n\r" "isb" :: "r" (SCTLR.raw) : "memory"); -- cgit v1.2.3 From c77286c6951223be1c216c19278cecca3b43ceb5 Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Wed, 15 Jan 2020 16:30:05 +0100 Subject: for safety - invalidate caches when creating a new mapping --- src/arm/PL1/kernel/paging.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/src/arm/PL1/kernel/paging.c b/src/arm/PL1/kernel/paging.c index 4c3dccf..6da9905 100644 --- a/src/arm/PL1/kernel/paging.c +++ b/src/arm/PL1/kernel/paging.c @@ -242,6 +242,14 @@ uint16_t claim_and_map_section // write modified descriptor to the table *section_entry = descriptor; + // invalidate instruction cache + asm("mcr p15, 0, r0, c7, c5, 0\n\r" // r0 gets ignored + "isb" ::: "memory"); + + // invalidate branch-prediction + asm("mcr p15, 0, r0, c7, c5, 6\n\r" // r0 - same as above + "isb" ::: "memory"); + // invalidate main Translation Lookup Buffer asm("mcr p15, 0, r1, c8, c7, 0\n\r" "isb" ::: "memory"); -- cgit v1.2.3 From 65328387c7880af777dcfd9c399cb453a87b44c1 Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Wed, 15 
Jan 2020 16:45:35 +0100 Subject: finish explaining how MMU works and how we use it --- MMU-explained.txt | 28 +++++++++++++++++++++++----- 1 file changed, 23 insertions(+), 5 deletions(-) diff --git a/MMU-explained.txt b/MMU-explained.txt index 61fbd9c..e640aaa 100644 --- a/MMU-explained.txt +++ b/MMU-explained.txt @@ -14,9 +14,10 @@ This aids operating system's memory management in several ways A given mapping can be made valid for only one execution mode (i.e. region only accessible from privileged mode) or only certain types of accesses (i.e. a memory region can be made non-executable, which guards against accidental jumping there by program code (important for countering buffer-overflow exploits)). An unallowed access triggers a processor exception, which passes control to an appropriate interrupt service routine. -General configuration of the MMU in ARM processors it is present on is done through registers of the appropriate coprocessor (cp15). Translations are managed through translation table. It is an array of 32-bit or 64-bit entries (also called descriptors) describing how their corresponding memory regions should be mapped. A number of leftmost bits of a virtual address constitutes an index into the translation table to be used for translating it. This way no virtual addresses need to be stored in the table and MMU can perform translations in O(1) time. +In RaspberryPi environments used by us, there are ARMv7-A-compatible processors, which we currently use only in 32-bit mode. Information here is relevant to those systems (there are Pi boards with both older and newer processors, with more or less functionality and features available). +General configuration of the MMU in ARM processors it is present on is done through registers of the appropriate coprocessor (cp15). Translations are managed through translation table. It is an array of 32-bit or 64-bit entries (also called descriptors) describing how their corresponding memory regions should be mapped. 
A number of leftmost bits of a virtual address constitutes an index into the translation table to be used for translating it. This way no virtual addresses need to be stored in the table and MMU can perform translations in O(1) time. Coprocessor 15 contains several registers, that control the behaviour of the MMU. They are all accessed through mcr and mrc arm instructions. 1. SCTLR, System Control Register - "provides the top level control of the system, including its memory system" @@ -25,7 +26,7 @@ Coprocessor 15 contains several registers, that control the behaviour of the MMU · whether data cache is enabled · whether instruction cache is enabled · whether TEX remap is enabled - TEX remap is a feacher, that changes how some translation table entry bit fields (called C, B and TEX) are used. We're not using TEX remap in our project. + TEX remap is a feature, that changes how some translation table entry bit fields (called C, B and TEX) are used. We're not using TEX remap in our project. · whether access flags are enabled Enabling access flag causes one translation table descriptor bit normally used to specify access permissions of a region to be used as access flag. We don't use this feature either 2. DACR, Domain Access Control Register - "defines the access permission for each of the sixteen memory domains" @@ -49,11 +50,28 @@ Besides, translation table descriptor also specifies: 2. Other memory attributes (cacheability, shareability). 3. which domain the memory belongs to. -Page table is something simillar to translation table, but it's entries define smaller regions (called, well - pages). When a translation table descriptor describing a page table gets used for translation, then entry in that page table (with some middle bits of the virtual address used as index into it) is fetched and used. This allows for better granularity of mappings while not requiring the page tables to occupy space if small pages are not needed. 
We can say, that 2-level translations are performed. On some versions of ARM translations can have more levels than here. +Page table is something simillar to translation table, but it's entries define smaller regions (called, well - pages). When a translation table descriptor describing a page table gets used for translation, then entry in that page table (with some middle bits of the virtual address used as index into it) is fetched and used. This allows for better granularity of mappings while not requiring the page tables to occupy space if small pages are not needed. We can say, that 2-level translations are performed. On some versions of ARM translations can have more levels than here. This means the MMU might sometimes need to fetch several entries from different level tables to compute the physical address. This is called a translation table walk. As of 15.01.2020 page tables and small pages are not used in the project (although programming them is on the TODO list). -Our project uses C bitfield structs for operating on coprocessor registers' contents and translation table descriptors. This is an elegant and readable approach, yet little-portable across compilers. Current struct definitions are sure to work properly with GCC. - Despite the overhelming amount of configuration options available, most can be left with deafults and this is how it's done in this project. Those default settings usually make the MMU behave as in older ARM versions, when some options were not yet available (and hence, the entire system was simpler). +Our project uses C bitfield structs for operating on SCTLR and TTBR contents (with DACR - bit shifts are more appropriate and with TTBCR - our default configuration means just writing 0 to register) and translation table descriptors. This is an elegant and readable approach, yet little-portable across compilers. Current struct definitions are sure to work properly with GCC. 
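A rough sketch of the two styles mentioned in the paragraph above (a bitfield struct for SCTLR, bit shifts for DACR) could look as follows. It is not the definition from cp_regs.h: the type and function names are hypothetical, only the bits discussed in this document are named, and the layout relies on GCC allocating bitfields starting from the least significant bit on a little-endian target, which is exactly the portability caveat mentioned above.

```c
#include <stdint.h>

/* Hypothetical SCTLR layout; unnamed ranges are lumped together. */
typedef union {
    uint32_t raw;
    struct {
        uint32_t M          : 1;  /* bit 0: MMU enable */
        uint32_t A          : 1;  /* bit 1: alignment checking */
        uint32_t C          : 1;  /* bit 2: data cache enable */
        uint32_t bits_3_11  : 9;
        uint32_t I          : 1;  /* bit 12: instruction cache enable */
        uint32_t bits_13_27 : 15;
        uint32_t TRE        : 1;  /* bit 28: TEX remap enable */
        uint32_t AFE        : 1;  /* bit 29: access flag enable */
        uint32_t bits_30_31 : 2;
    } fields;
} sctlr_sketch_t;

/* Takes the current register value and returns it with the MMU and both
 * caches enabled; all other bits are left untouched. */
static uint32_t sctlr_enable_mmu_and_caches(uint32_t current)
{
    sctlr_sketch_t s;

    s.raw = current;
    s.fields.M = 1;
    s.fields.C = 1;
    s.fields.I = 1;
    return s.raw;
}

/* DACR holds a 2-bit field for each of the 16 domains. */
enum {
    DOMAIN_NO_ACCESS = 0, /* every access is blocked */
    DOMAIN_CLIENT    = 1, /* descriptor permissions are checked */
    DOMAIN_MANAGER   = 3  /* descriptor permissions are ignored */
};

static uint32_t dacr_set_domain(uint32_t dacr, unsigned domain, uint32_t mode)
{
    unsigned shift = 2u * (domain & 15u);

    return (dacr & ~(UINT32_C(3) << shift)) | ((mode & 3u) << shift);
}
```

On real hardware the resulting values would be written to cp15 with the mcr instruction; here they are plain integers, so the sketch can be compiled and checked on any machine where the compiler lays out bitfields the same way.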
+
+Structs describing SCTLR, DACR and TTBR are defined in src/arm/PL1/kernel/cp_regs.h, while those describing translation table descriptors - in src/arm/PL1/kernel/translation_table_descriptors.h.
+
+Before the MMU is enabled, all memory is seen as it really is. Therefore, the only feasible way of enabling it is by initially setting the descriptors in translation table to map all addresses (mapping just addresses used by the kernel would be enough) to themselves, so that the code being executed stays at the same addresses once translation starts. It is called a flat map.
+
+How setting up a flat map, turning on the MMU and managing memory sections are done in our project:
+1. Translation table is defined in the linker script src/arm/PL1/kernel/kernel_stage2.ld as a NOLOAD section. C code gets the table's start and end addresses from symbols defined in that linker script (see arm/PL1/kernel/memory.h).
+2. Function setup_flat_map() defined in arm/PL1/kernel/paging.c enables MMU with a flat map. It prints relevant information to uart while performing the following operations:
+ · In a loop writes all descriptors to the translation table, setting them as sections, accessible from PL1 only, belonging to domain 0.
+ · Sets DACR to allow domain 0 memory accesses based on translation table descriptor permissions and block accesses to other domains (only domain 0 is used in this project).
+ · Makes sure TEX remap, access flag, caches and the MMU are disabled in SCTLR. Disabling some of them might be unnecessary, because MMU is assumed to be disabled on the start and enabled caches might cause no problems as long as only flat map is used. Still, the way it is done right now is known to work well and optimizations are not needed.
+ · Clears all caches and TLBs (again, it is suspected that some of this is unnecessary).
+ · Writes a TTBCR setting that causes only one, 32-bit translation table to be used.
+ · Makes TTBR0 point to the start of translation table.
The rest of the attributes in TTBR0 (concerning how table entries are being accessed) are left as 0s (defaults).
+ · Enables the MMU and caches by setting the appropriate bits in SCTLR.
+After some cp15 register writes, the isb assembly instruction is used, which causes the ARM core to wait until changes take effect (otherwise some later instructions could possibly be executed before this happens).
+
+In arm/PL1/kernel/paging.c the function claim_and_map_section() can be used to modify an entry in translation table to create a new mapping. Memory allocation, also done in that source file, uses lists to describe free and taken sections and has nothing to do with the MMU.
-- cgit v1.2.3 From ca3bf744f826225bb041afc4779ee19493d5440e Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Wed, 15 Jan 2020 16:47:14 +0100 Subject: remove garbage-comments --- src/arm/PL1/kernel/interrupts.c | 33 +-------------------------------- 1 file changed, 1 insertion(+), 32 deletions(-) diff --git a/src/arm/PL1/kernel/interrupts.c b/src/arm/PL1/kernel/interrupts.c index 121d79c..5695e6f 100644 --- a/src/arm/PL1/kernel/interrupts.c +++ b/src/arm/PL1/kernel/interrupts.c @@ -3,11 +3,8 @@ #include "svc_interface.h" #include "armclock.h" #include "scheduler.h" -/** - @brief The undefined instruction interrupt handler -**/ - +// defined in setup.c void __attribute__((noreturn)) setup(void); // from what I've heard, reset is never used on the Pi; @@ -105,31 +102,3 @@ void fiq_handler(void) { error("fiq happened"); } - - -/* Here is your interrupt function */ -//void -//__attribute__((interrupt("IRQ"))) -//__attribute__((section(".interrupt_vectors.text"))) -//irq_handler2(void) { -// /* You code goes here */ -//// uart_puts("GOT INTERRUPT!\r\n"); -// -// local_timer_clr_reload_reg_t temp = { .IntClear = 1, .Reload = 1 }; -// QA7->TimerClearReload = temp; // Clear interrupt & reload -//} - -///* here is your main */ -//int enable_timer(void) { -// -// QA7->TimerRouting.Routing = 
LOCALTIMER_TO_CORE0_IRQ; // Route local timer IRQ to Core0 -// QA7->TimerControlStatus.ReloadValue = 100; // Timer period set -// QA7->TimerControlStatus.TimerEnable = 1; // Timer enabled -// QA7->TimerControlStatus.IntEnable = 1; // Timer IRQ enabled -// QA7->TimerClearReload.IntClear = 1; // Clear interrupt -// QA7->TimerClearReload.Reload = 1; // Reload now -// QA7->Core0TimerIntControl.nCNTPNSIRQ_IRQ = 1; // We are in NS EL1 so enable IRQ to core0 that level -// QA7->Core0TimerIntControl.nCNTPNSIRQ_FIQ = 0; // Make sure FIQ is zero -//// uart_puts("Enabled Timer\r\n"); -// return(0); -//} \ No newline at end of file -- cgit v1.2.3 From ebc85a24c9c04232e775f5bc7cf4ea9af8e1caa7 Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Wed, 15 Jan 2020 17:04:14 +0100 Subject: add TODO concerning smarter use of memory attributes --- TODOs | 2 ++ 1 file changed, 2 insertions(+) diff --git a/TODOs b/TODOs index ebfafc5..e99a150 100644 --- a/TODOs +++ b/TODOs @@ -54,6 +54,8 @@ high priority TODOs are higher; low priority ones and completed ones are lower; * write some procedures for dumping registers and other stuff (for use in debugging); maybe print registers' contents on data/prefetch abort? +* Memory regions can be configured as one of several types, which affects how memory reads/writes are performed by the processor. Dig into that and use the best appropriate settings in paging.c (i.e. normal memory instead of strongly-ordered memory for RAM). + * partially DONE - one can always add more, but we have the most important stuff * Implement some basic utilities for us to use (memcpy, printf, etc...) * partailly DONE - svc works; once we implement processes we could also kill them on aborts * develop userspace process supervision (handling of interrupt caused by svc instruction, proper handling of other data abort, undefined instruction, etc.) 
-- cgit v1.2.3 From 98ffccc6a86529c1479b3b17bbff3c13f654c49c Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Wed, 15 Jan 2020 17:27:15 +0100 Subject: add TODO concerning variable assignment in Makefile --- TODOs | 2 ++ 1 file changed, 2 insertions(+) diff --git a/TODOs b/TODOs index e99a150..dd7707e 100644 --- a/TODOs +++ b/TODOs @@ -56,6 +56,8 @@ high priority TODOs are higher; low priority ones and completed ones are lower; * Memory regions can be configured as one of several types, which affects how memory reads/writes are performed by the processor. Dig into that and use the best appropriate settings in paging.c (i.e. normal memory instead of strongly-ordered memory for RAM). +* In the Makefile: is =? the right assignment for, say, CFLAGS? + * partially DONE - one can always add more, but we have the most important stuff * Implement some basic utilities for us to use (memcpy, printf, etc...) * partailly DONE - svc works; once we implement processes we could also kill them on aborts * develop userspace process supervision (handling of interrupt caused by svc instruction, proper handling of other data abort, undefined instruction, etc.) -- cgit v1.2.3 From 9ed55d7612be0ffd17e3e9cc08bea7225470ee67 Mon Sep 17 00:00:00 2001 From: Wojtek Kosior Date: Thu, 16 Jan 2020 16:19:00 +0100 Subject: START explaining makefile --- Makefile-explained.txt | 13 +++++++++++++ 1 file changed, 13 insertions(+) create mode 100644 Makefile-explained.txt diff --git a/Makefile-explained.txt b/Makefile-explained.txt new file mode 100644 index 0000000..c199059 --- /dev/null +++ b/Makefile-explained.txt @@ -0,0 +1,13 @@ +Our project contains 2 Makefiles: one in it's root directory and one in build/. 
The reason is that a Makefile can simply, elegantly and efficiently produce files in the directory it resides in, but producing files in any other directory requires that directory to be specified in many rules across the Makefile and in general complicates things. Also, a problem arises when trying to link objects from outside the current directory. If an object is referenced by name in a linker script (which is a frequent practice in our scripts) and is passed to gcc with a path, then it would also need to appear with that path in the linker script.
+Because of that, a Makefile is present in build/ that produces files into its own directory, and the Makefile in the project's root is used as a proxy to that first one - it calls make recursively in build/ with the same target it was called with.
+
+From now on only the Makefile in build/ will be discussed.
+
+In the Makefile, variables with the names of certain tools and their command line flags are defined (using ?= assignment, which only assigns a value if the variable does not have one yet, so one can supply their own value from the environment or on the command line). In case a cross-compiler with a different triple should be used, ARM_BASE, normally set to arm-none-eabi, can be set to something like arm-linux-gnueabi or even /usr/local/bin/arm-none-eabi.
+
+All variables discussed below are defined using := assignment, which causes them to only be evaluated once instead of on every reference to them.
+
+Objects that should be linked together to create each of the .elf files are listed in their respective variables. E.g. objects to be used for creating kernel_stage2.elf are all listed in KERNEL_STAGE2_OBJECTS. When adding a new source file to the kernel, it is enough to add its respective .o file to that list to make it compile and link properly. No other Makefile modifications are needed.
+In a similar fashion, the RAMFS_FILES variable specifies files that should be put in the ramfs image that will be embedded in the kernel.
Adding another file only requires listing it there. However, if the file is to be found somewhere other than build/, it might be useful to use the vpath directive to tell make where to look for it.
+
+Variables dirs and dirs_colon are defined to
-- cgit v1.2.3