The ML1 Compiler (http://www.ml1compiler.org)
Outline For January 2008 SVFIG Meeting, By Steven D. Nichols
------------------------------------------------------------
All '.txt' files referred to are in the ML1 documentation.

1. What it is:
--------------
o The ML1 compiler is a DOS program that processes text macros created
  by the user (me and/or you) to produce ASM (assembly language) output.
  With the ML1 compiler you can program at any level, from ASM to very
  high level. The gap between BASIC and ASM inspired me to write the
  compiler; the first versions were written for the 6502 around 1989.

  Compiler input is a macro set defined in text file(s) and pulled into
  program sourcecode with an 'Include' command. Output is a text .ASM
  file, which is assembled by NASM (included with the compiler) or any
  other assembler. Assemblers with NO type checking are easiest to use
  because type checking is done by the compiler. The user can fully
  determine the type checking rules. Data coercion is supported,
  although the included macro sets do not use it.

  The ML1 compiler itself is a 45K executable that was written entirely
  in runtime ML1 and compiled with an optimized macro set.

What can I do with ML1?
-----------------------
You can create a custom programming language that compiles to ASM for
just about any CPU. The ML1 parser is hard wired (its rules are
immutable). Commands and output code are fully defined in user written
macros. Assembler pseudo opcodes are output in NASM assembler format
('db value' for byte definitions, etc.) and can be translated to another
assembler's format with a simple converter.

Why use the ML1 compiler rather than some common language?
----------------------------------------------------------
ML1 allows detailed control over the code generated, and it can produce
very compact, efficient programs. The ML1 license agreement allows you
to distribute executables created with the ML1 compiler as you see fit,
without having to pay fees or distribute your sourcecode.
The author would also like to see macros and sourcecode shared, to
provide a bigger pool of ML1 resources for everyone. ML1 is simple
minded (like its author).

o To install ML1, download it from 'www.ml1compiler.org', unzip the
  archive, then change your path statement. See 'install.txt'.
  CD to 'examp' and use 'mll filename' (no extension) to compile
  examples.

o Compile-time command set and its purpose:
  The compile-time commands control compiler operation and are used in
  macros to define the language compiled. Documentation is in
  'macrcmds.txt', 'compqref.txt', 'optimize.txt', and 'glossary.txt',
  in the DOC directory.

o Runtime command set and its purpose:
  The runtime command set is found in the INC directory and is a
  superset of the language the ML1 compiler itself is written in.
  Documentation is in 'ml1-lang.txt', 'commands.txt', and
  'glossary.txt'. The included runtime commands are simple commands,
  each designed to operate on a single data type.

  3 runtime macro sets come with the ML1 compiler:

    stdmacro.mac  Standard macro set, use first.
    stdopt.mac    Use after initial testing for smaller, faster
                  programs.
    stdmacd.mac   Use to convert all short branch statements to long
                  branch. This macro set also contains debugging macros.

  Runtime ML1 tends to use more sourcecode lines than C and fewer lines
  than ASM. Runtime ML1 is a structured low level language with
  excellent support for string operations.

o Compiler Input Files:
    RTL      'ML1N' directory runtime library files (or user defined
             directory).
    Classes  CLASS directory. The object oriented classes and
             superclasses.
    Include  INC directory, which holds the macros that define the
             language to compile.
    Library  Common routines in the 'ILIBN' (or user defined) directory.

o Compiler Output Files:
    ASM  Compiler output with comments (for debugging), which is
         assembled by an assembler.
    E    The .E file is the output of the preprocessor and the input of
         the command processor. It is useful for debugging macros.
    MER  Records compile errors (if present).
    ERR  Records assembly errors.

  The user can change all of these because the compile is controlled by
  a batch file.

Elements common to any program compiled with the ML1 compiler:
Learn This First, the rest is just learning individual commands.
----------------------------------------------------------------
o The 'Define;' section in sourcecode:
  The define section is where variables, named constants, aliases, and
  macros are defined. The 'A' alias substitutes text in the ASM output.
  See the documentation file 'define.txt'.

  Local '#' and '?' (class) suffixed variables and Global variables can
  be defined, along with user created local variable types.

  The '$' object oriented prefix is used to create classes (subroutines)
  that can have multiple instances of themselves compiled into a
  program. See 'objects.txt'.

  Custom variable types of any data size can be defined. See
  'custtype.txt'.

o !S, !LS, and !T are used to flag String, Long String, and Table
  constants used in the main body of a program (all of the sourcecode
  outside of Define sections).

o Control structures are written between brackets '[]' and are
  different from C's '{}' compound statements. IF, Else, Exit and Exit-
  commands determine control flow in structures. Control structures can
  be nested to 16 levels deep. 'Goto' philosophy: 'Use it when it makes
  sense'. See 'ml1_lang.txt' for details.

o A function is any code segment of 254 characters or less that returns
  a value. The returned value replaces the function where it appears as
  a parameter to a command, function, or expression. Functions are
  documented in 'express.txt'. 'bignum1.ml1' is an example in the EXAMP
  directory.

o An expression can replace any runtime macro parameter, and the
  expression processor itself can be user programmed. Expressions are
  documented in 'express.txt' and 'expprog.txt'. See also 'bignum1.ml1'.

o There are 'reserved aliases' which are used by the compiler to set up
  segments in the ASM output file. See 'define.txt'.

Examples of these program elements can be found in 'bignum?.ml1', in the
EXAMP directory. Also see 'elements.txt'.

Some high level runtime output commands
---------------------------------------
Write   Writes parameters to any subroutine that accepts a single byte
        in the ML1 'A' register (which is the 386 AL register).
hwrite  Writes parameters to an open file handle. 'stdin', 'stdout', and
        'stderr' are kept already open by DOS.
fwrite  Writes to an open file which was set up and opened using the
        'fileio.sup' superclass. (See routines in the CLASS directory
        and 'inputout.txt'.)

Directories used by the compiler
--------------------------------
DOC    Documentation.
INC    Macro sets and include files.
ML1N   Executable programs used by the compiler, and ASM RTL files.
       (The RTL directory can be changed.)
ILIBN  Runtime ML1 library code.
CLASS  Classes and Superclasses to allow multiple instances of code.
       A 'class' is usually a link file that can be linked more than
       one time. A superclass can control, link, include, and set up
       superclasses and classes.

Quick summary of internal compiler parts
----------------------------------------
o The command processor determines whether a command is built in or
  user defined. For a built in command it calls the built in command
  processor to execute the command. For a user defined command (a
  macro) it calls the macro processor.

    16 banks (for 16 parameters) of 3 registers:
      reg1 = parameter name
      reg2 = parameter initial value
      reg3 = parameter type
    16 banks of 2 control structure registers

  These registers are accessed by built in commands and user created
  macros to access command parameters and make use of control
  structures.

o The preprocessor handles expressions, functions, and multi line
  escapes by breaking them down into simple macro and/or compile time
  command calls, which are processed by the command processor.

o The expression processor converts expressions into user defined macro
  calls to process expressions.
  See 'express.txt' and 'expprog.txt'.

o The function processor converts functions into user defined macro
  calls to process functions and return a value.

o \\ is a multi line escape that allows long sourcecode lines to span
  multiple lines.

o The macro processor accesses the 16 x 3 command processor registers
  and uses the contents to determine the output code to generate. The
  text '!A' and '!X' are used in macros to output ASM to the output file
  and to exit ASM mode, respectively. Macros can be nested to 12 levels
  deep. All documentation covers macros because ML1 is a macro compiler.
  See 'macros.txt'.

o The Optimizer is programmable and has a 32 x 3 register set. The ML1
  compiler can track the contents of up to 32 CPU registers. User
  created macros test these registers and perform optimizations based
  on optimizer register content. See 'optimize.txt'.
  Runtime ML1 supports IF argument optimization, removal of redundant
  CPU loads, constant folding (LMATH.INC), conversion of 0 register
  loads to 'xor reg,reg', reuse of already loaded registers, and various
  other techniques.

o The Input macro processor substitutes text or symbol table data for a
  name. See 'inpmacro.txt'.

Multi processor support
-----------------------
After creating macros to compile language 'Y' for CPU 'X', you include
your macros (using 'Include') in your program sourcecode and compile the
program. After the program is compiled, the ASM pseudo opcodes are
converted (using a separate program or script) to the target CPU
format, and the result is assembled into an executable by a cross
assembler.

Steven D. Nichols
P.O. Box 2
Cupertino, CA 95014-0002
email: sdn@ml1compiler.org or sdn@svpal.org
ML1 compiler: http://www.ml1compiler.org
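
Sketch: pseudo opcode conversion
--------------------------------
The "separate program or script" mentioned under 'Multi processor
support' is not part of this outline. The following is only a minimal
Python sketch of what such a converter might look like. The target
spellings ('.byte', '.word', '.long') and the names DIRECTIVE_MAP,
convert_line, and convert_text are assumptions made for illustration;
they do not come from ML1. Only the directive name is rewritten, so a
real converter would likely also need to handle label and number
formats for its target assembler.

```python
import re

# Hypothetical mapping from NASM data directives to the spelling used by
# an imagined target assembler. Adjust for the real cross assembler.
DIRECTIVE_MAP = {
    "db": ".byte",
    "dw": ".word",
    "dd": ".long",
}

# Leading whitespace, an optional label, a data directive, then operands.
_LINE = re.compile(
    r"^(\s*(?:[A-Za-z_.$?@][\w.$?@]*:?\s+)?)(db|dw|dd)\b(\s+.*)$",
    re.IGNORECASE,
)


def convert_line(line: str) -> str:
    """Rewrite one NASM data directive; return other lines unchanged."""
    match = _LINE.match(line)
    if match is None:
        return line
    prefix, directive, rest = match.groups()
    return prefix + DIRECTIVE_MAP[directive.lower()] + rest


def convert_text(text: str) -> str:
    """Convert every line of an ASM listing held in a string."""
    return "\n".join(convert_line(line) for line in text.splitlines())


if __name__ == "__main__":
    sample = "msg db 72, 105\n    mov al, 5\n  dw 1000h"
    print(convert_text(sample))
```

Instructions and comment lines pass through untouched because the
pattern only matches when a data directive is the first token or
directly follows a single label.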