nito, a proof-of-concept compiler from Uxntal to C
 
 
 
 
Wim Vanderbauwhede ffe7d82f47 nm 2022-10-18 19:40:04 +01:00

nito -- 二兎

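わたしとあなたの違いは何?
眠れない まぶたにかけめぐる
一兎二兎 数えるうちに訪れる丑三つ時

What is the difference between me and you?
Sleepless, they race across my eyelids
One hare, two hares: while I count them, the dead of night arrives

(チリヌルヲワカ)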

A proof-of-concept compiler from Uxntal to C. The generated code is linked with a slightly modified version of the Uxn VM/Varvara code to provide stand-alone applications. For more details on the compiler design and implementation, please read my blog post.

Installation

You need Raku because that's what I wrote it in. The easiest way to install it is to use Rakubrew. As the generated code relies on the Uxn VM, it needs the SDL2 GUI library.

Usage

Generating code

./nito.raku <flags> <path to the source file>

The flags are -T to generate tal and -C to generate C. With -C you can also pass an optional -O flag for optimisations; currently the values are 0, 1, 2 or 3.

The default is currently -O=0 but the best performance is achieved with -O=2. See the blog post for details on the optimisations.

Building the generated code

The generated C code should be put in gen-c-src/program.c. For example, from the demos folder:

../nito.raku -C <tal file> > ../gen-c-src/program.c

To build the emulator code, use the script build_uxn_sdl.sh:

./build_uxn_sdl.sh

This will build an executable, uxnprog for GUI applications or uxncliprog for command-line applications.

The code has macros DBG and DBG_ for printing out debug info and a macro DEFENSIVE that guards code to catch stack overflows and underflows. You can change the defaults in the script build_uxn_sdl.sh.

Testing and debugging

I am using the examples in the demos folder. In that folder there are a few helper scripts:

./generate_compile_run.sh <tal file>

will generate, compile and run a demo. For command-line apps, use

./generate_compile_run_cli.sh <tal file>

For both commands you can add -O=... to turn on optimisations.

Finally,

./test_gen_uxn.sh <tal file>

or

./test_gen_uxn_cli.sh <tal file>

will generate the modified tal code that serves as the intermediate representation for emitting C.

Status

Working demos

GUI demos

  • dvd
  • polycat
  • move
  • amiga
  • bitwise
  • bunnymark
  • bifurcan
  • life
  • snake
  • wireworld
  • cube3d
  • mandelbrot
  • piano
  • calc
  • ray

Command line demos

  • procblim: Uxntal macro processor
  • primes: Prime number generator
  • fib, fib2, fi32: Fibonacci number generators
  • stencil: A 3-D 6-point stencil

Failing demos

  • drool

Performance

I did some preliminary performance evaluation using primes, fib* and stencil. With -O=2, the compiled version is up to 12x faster than the original version. There are still a lot of additional optimisations that could be implemented, but I think they will only result in a small additional speed-up.

Limitations

This is a proof-of-concept so it will certainly have bugs. Also, fundamentally it does not support run-time evaluation through self-modification of the instructions. So something like this is not supported:

#06 #07 LIT ADD 
#00 STR BRK 

Self-modification of data is supported though, and this is used in several of the demos listed above. So patterns like

LIT2 &v $2
LIT2 &x $1 &y $1

are supported.
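
One way to see why this kind of self-modification is compatible with static compilation is that the opcode stays fixed and only the operand bytes change, so the operand can be treated as an ordinary memory cell. The Python sketch below illustrates that idea only; it is not necessarily how nito implements it, and V_ADDR, store_v and lit2_v are hypothetical names.

```python
# Sketch only: models `LIT2 &v $2` as "push the 16-bit value stored at &v".
# V_ADDR, store_v and lit2_v are hypothetical names, not taken from nito.
mem = bytearray(0x10000)   # Uxn addresses 64 kB of memory
V_ADDR = 0x0101            # assumed address of the &v operand slot


def store_v(value):
    """What a store to &v does at run time: overwrite the operand bytes."""
    mem[V_ADDR] = (value >> 8) & 0xFF   # high byte first (big-endian short)
    mem[V_ADDR + 1] = value & 0xFF


def lit2_v(stack):
    """Compiled form of the literal: fixed opcode, operand read from memory."""
    stack.append((mem[V_ADDR] << 8) | mem[V_ADDR + 1])


stack = []
store_v(0x1234)
lit2_v(stack)
store_v(0xBEEF)   # the data is modified at run time...
lit2_v(stack)     # ...and the same compiled code picks up the new value
print([hex(x) for x in stack])  # ['0x1234', '0xbeef']
```

Because the instruction itself never changes, the code emitted for it can be decided at compile time.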

Also fundamentally, the compiler expects human-readable Uxntal code; in particular, it relies on the mnemonics to identify instructions. So while this is valid Uxntal code, it will not work:

80 06 80 07 1a

It is in general impossible for a compiler to distinguish between opcodes and data because of Uxntal's dynamic nature. In fact, a value can be used as both depending on a run-time condition. So the compiler needs the meta-information provided by the mnemonic notation.
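
As a rough illustration of the problem, here is a naive decoder in Python that simply assumes every byte is an instruction. It is a sketch, not part of nito: the opcode table is a small subset of the Uxn instruction set, and naive_decode is a hypothetical name.

```python
# Sketch only: decode raw bytes under the assumption that everything is code.
MNEMONICS = {
    0x00: "BRK", 0x02: "POP", 0x12: "LDR", 0x13: "STR",
    0x17: "DEO", 0x18: "ADD", 0x1A: "MUL",
}
LIT = 0x80


def naive_decode(rom):
    """Return a list of mnemonics, treating each byte as an opcode."""
    out = []
    i = 0
    while i < len(rom):
        byte = rom[i]
        if byte == LIT and i + 1 < len(rom):
            out.append(f"#{rom[i + 1]:02x}")   # LIT plus its operand
            i += 2
        else:
            out.append(MNEMONICS.get(byte, f"{byte:02x}"))
            i += 1
    return out


print(naive_decode(bytes.fromhex("80 06 80 07 1a")))  # ['#06', '#07', 'MUL']
# The same decoder gives an equally plausible answer for bytes that are
# meant as data, for example three values in a lookup table:
print(naive_decode(bytes([0x1A, 0x18, 0x02])))        # ['MUL', 'ADD', 'POP']
```

Nothing in the bytes themselves says which reading is intended, which is why the compiler relies on the mnemonics.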

For example, the following code uses 1a both as an instruction and as data (an unsigned integer):

|0100 
    LIT &v 1a
    POP
    ,&v LDR #02 MUL 
    #06 #07
    ,&v LDR 
    #00 STR 00 
    ADD
    #18 DEO
BRK

This example only illustrates the dynamic, self-modifying nature of Uxntal code; on its own it does not prove that it is impossible to statically determine whether something is an opcode or data. The reason this is impossible in general is that data used as code can only be executed at run time. We could of course build a runtime into the compiler, but then "compilation" would mean running the code; in other words, the compiler becomes an interpreter. And as the run-time behaviour can depend on input values, the compiler would have to generate code for all possible input values.
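
To make the example above concrete, the following Python sketch interprets it with a tiny Uxn-like machine. It is not the real Uxn VM: it implements only the opcodes the example needs, run() is a hypothetical name, and the byte sequence is hand-assembled, so the relative offsets 0xfb and 0xf1 are my assumption of what an assembler would emit for ,&v.

```python
# Sketch of a tiny Uxn-like interpreter; only the opcodes used by the example.
BRK, POP, LDR, STR, DEO, ADD, MUL, LIT = 0x00, 0x02, 0x12, 0x13, 0x17, 0x18, 0x1A, 0x80

# Hand-assembled bytes for the example, loaded at 0x0100 (assumed encoding).
PROGRAM = bytes([
    0x80, 0x1A,        # LIT &v 1a   (1a stored as data at 0x0101)
    0x02,              # POP
    0x80, 0xFB, 0x12,  # ,&v LDR     (load the byte at &v as data)
    0x80, 0x02,        # #02
    0x1A,              # MUL         (1a as an instruction)
    0x80, 0x06,        # #06
    0x80, 0x07,        # #07
    0x80, 0xF1, 0x12,  # ,&v LDR
    0x80, 0x00,        # #00
    0x13,              # STR         (write 1a over the next byte)
    0x00,              # 00          BRK when read statically, MUL at run time
    0x18,              # ADD
    0x80, 0x18,        # #18
    0x17,              # DEO
    0x00,              # BRK
])


def signed(byte):
    """Interpret a byte as a two's-complement relative offset."""
    return byte - 256 if byte > 127 else byte


def run(program, origin=0x0100):
    """Run the program; return (console output, final memory image of the program)."""
    mem = bytearray(0x10000)
    mem[origin:origin + len(program)] = program
    stack, console, pc = [], bytearray(), origin
    while True:
        op = mem[pc]
        pc += 1
        if op == BRK:
            break
        elif op == LIT:
            stack.append(mem[pc])
            pc += 1
        elif op == POP:
            stack.pop()
        elif op == LDR:
            stack.append(mem[pc + signed(stack.pop())])
        elif op == STR:
            offset = signed(stack.pop())
            mem[pc + offset] = stack.pop()
        elif op == DEO:
            port, value = stack.pop(), stack.pop()
            if port == 0x18:  # console write port
                console.append(value)
        elif op in (ADD, MUL):
            b, a = stack.pop(), stack.pop()
            stack.append((a + b if op == ADD else a * b) & 0xFF)
        else:
            raise ValueError(f"opcode {op:#04x} not handled in this sketch")
    return bytes(console), bytes(mem[origin:origin + len(program)])


output, image = run(PROGRAM)
print(output)                  # b'^'  (0x34 + 0x2a = 0x5e)
print(PROGRAM[19], image[19])  # 0 26: the placeholder byte became 1a (MUL)
```

Read statically, the byte after STR is 00, which is BRK, so the program appears to stop there. Only by executing the code does it become clear that the byte is MUL by the time it is reached.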