#### Warning! This README has not been updated! I have finished and successfully defended the thesis since then. ####

## About

This repository shall contain the code for 'Laboratory station based on programmable logic device for WebAssembly execution evaluation', developed as my engineering thesis at AGH University of Science and Technology in Kraków, Poland.

The project utilizes Verilog HDL. Icarus Verilog Simulator is used for simulation and test benches, while Yosys, arachne-pnr/nextpnr and icestorm are the tools chosen for synthesis, place and route, and bitstream generation for Olimex's iCE40HX8K-EVB FPGA board.

## Technical choices

I'm using one of the few FPGAs with a fully libre toolchain. SystemVerilog and VHDL are not yet (officially) supported in Yosys, so I'm using Verilog2005.

I'm writing my own stack machine CPU for the job. Another option would be to run an existing register-based CPU (picorv32?) on the FPGA and interpret Wasm on it. Although my thesis topic is broad enough to allow that, I didn't go this way, because:
- there'd be nothing innovative in this approach,
- I'd end up mostly copying others' code, ending up with a copy-paster's thesis...

I'm using Wishbone pipelined interconnect for the CPU and other components.

The Wasm binary format was not designed for direct execution, so I'm instead creating a minimal stack machine that would allow an almost 1:1 translation of Wasm code to its own instruction format. I still think it's possible to make a CPU that would execute Wasm directly - it's just a matter of a bit more effort.

The stack machine is and will be limited. That's why some more complex Wasm instructions (e.g. 64-bit operations, maybe float operations) have to be replaced with calls to software routines.

The goal is to write some minimal "bootloader" that would translate Wasm to my stack machine's instructions on-device. The SPI flash chip on the iCE40HX8K-EVB is 2MB big. The configuration stored on it is below 137KB.
I'm going to use the remaining memory to store the actual Wasm code for execution. The initial booting code will be preloaded to embedded RAM (the iCE40HX8K has such a feature and Yosys supports it).

I'm using VGA (640x480@60Hz) with a self-created text mode for communicating with the outside world. UART is also planned.

I wrote an assembler for my stack machine (tclasm.tcl). The actual assembly instructions are expressed in terms of tcl command executions, so we could call it pseudo-assembly. Before embracing tcl, I needed a way to express memory reads and writes for some test benches and created a simple macro assembler (include/macroasm.vh). I probably should have used tcl from the beginning...

Everything is done through some (quite sophisticated) Makefiles.

## Project structure

- Makefile - needs no explanation...
- Makefile.config - included by Makefile and Makefile.test; defines variables and makes it easy to, e.g., change the compiler command
- Makefile.util - also included by Makefile and Makefile.test; defines things that didn't semantically fit into Makefile.config
- design/ - Verilog sources that will get synthesized for the FPGA (plus some other files, like initial memory contents)
- models/ - Verilog modules used in testing
- tests/ - benches, each in its own subdirectory, with a Makefile including Makefile.test
- tclasm.tcl - implementation of a simple assembler in terms of tcl commands
- include/ - Verilog header files for inclusion
- tools/ - small C programs
- COPYING - 0BSD license
- README.txt - You're reading it

## Project status

I'm quite a bit delayed with the work (I should have had a working prototype in June...), but I'm working on it.

I made a previous attempt at the problem in July. Work was going extremely slowly and I felt that my code was really bad. This is partly because I had no serious hardware design experience before. Now I have started anew. My current approach is less CISCy.
I'm also doing everything in the simulator, with test benches for every module, and I plan to get the design running on the FPGA once it is able to display something through VGA. That's different from my previous approach, where I was trying to make something run on the board first and then write tests for it.

I'm now determined to use Wishbone, because I believe it helps me keep the design clean.

My stack machine is currently able to do some operations, like memory accesses, addition, unsigned division and jumps, but it's not yet ready to have most of Wasm translated to it. At the beginning of September I changed the design and instruction format and rewrote the stack machine. The current one can be considered my third approach :p

### Thoughts

It's indeed an interesting project, but from a practical point of view it's still going to be more efficient to JIT-compile Wasm on a register-based architecture... Perhaps it'd be more useful to optimize an existing processor (OpenRISC, OpenSPARC, RISC-V) for that?
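
As a rough illustration of the "almost 1:1 translation" idea from Technical choices, here is a minimal Python sketch. It is not the actual implementation: the real instruction set is defined by the Verilog sources and tclasm.tcl. The target mnemonics (`const`, `add`, `div`, `call`), the routine name `__i64_add` and the convention that a 64-bit value occupies two 32-bit stack slots with the low word pushed first are all assumptions made up for this example.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

# Wasm instructions assumed to map directly onto one stack machine instruction.
# The target mnemonics are hypothetical, not the real instruction set.
DIRECT = {"i32.const": "const", "i32.add": "add", "i32.div_u": "div"}

# Wasm instructions assumed too complex for the hardware; each one becomes
# a call to a (hypothetical) software routine.
SOFTWARE = {"i64.add": "__i64_add"}


def translate(wasm):
    """Translate a list of (mnemonic, *operands) tuples almost 1:1."""
    out = []
    for op, *args in wasm:
        if op in DIRECT:
            out.append((DIRECT[op], *args))
        elif op == "i64.const":
            # Assumption: a 64-bit value takes two 32-bit slots, low word first.
            value = args[0] & MASK64
            out.append(("const", value & MASK32))
            out.append(("const", value >> 32))
        elif op in SOFTWARE:
            out.append(("call", SOFTWARE[op]))
        else:
            raise ValueError(f"unsupported Wasm instruction: {op}")
    return out


def i64_add(stack):
    """Software routine: pop two 64-bit values, push their sum modulo 2**64."""
    b_hi, b_lo = stack.pop(), stack.pop()
    a_hi, a_lo = stack.pop(), stack.pop()
    total = (((a_hi << 32) | a_lo) + ((b_hi << 32) | b_lo)) & MASK64
    stack.append(total & MASK32)
    stack.append(total >> 32)


ROUTINES = {"__i64_add": i64_add}


def run(program):
    """Interpret translated instructions on a stack of 32-bit words."""
    stack = []
    for op, *args in program:
        if op == "const":
            stack.append(args[0] & MASK32)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & MASK32)
        elif op == "div":
            b, a = stack.pop(), stack.pop()
            stack.append(a // b)  # unsigned division; b == 0 is not handled
        elif op == "call":
            ROUTINES[args[0]](stack)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


program = translate([("i32.const", 7), ("i32.const", 5), ("i32.add",)])
print(run(program))  # [12]
```

The 32-bit instructions translate one to one, `i64.const` expands into two pushes, and `i64.add` turns into a call, which is the kind of fallback I have in mind for operations the stack machine won't support in hardware.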