The Virtual Tcode Machine

Release 2

(3rd revised issue)

Copyright © 1997-2000 Nils M Holm

mail: nmh@t3x.org
home: http://www.t3x.org/

Table Of Contents

1. The Architecture

The Tcode machine is a virtual 16-bit machine basically consisting of the following parts:

+---------------+FFFFh                   FFFFh+---------------+
|               |                             |     Stack     |
|               |                             |      and      |
|  U n u s e d  |       +-------------+  +--->|    Dynamic    |
|               |   +---|     IP      |  |    |    Storage    |
|               |   |   +-------------+  | +->|- - - - - - - -|
|- - - - - - - -|   |   |     RR      |  | |  |               |
|               |   |   +-------------+  | |  |    F r e e    |
|               |<--+   |     FP      |--+ |  |               |
| Tcode Program |       +-------------+    |  |  M e m o r y  |
|               |       |     SP      |----+  |               |
| Instructions  |       +-------------+       |- - - - - - - -|
|               |          REGISTERS          |    Static     |
|               |                             |     Data      |
|               |                             |               |
+---------------+0000h                   0000h+---------------+
   CODE ARRAY                                    DATA ARRAY

There are two byte-addressable memory regions called the code array and the data array. The code array holds the Tcode program which is to be executed and the data array is used to hold the data used by the program. Each cell in one of the arrays is completely addressable using a 16-bit pointer. Therefore, the maximum capacity per array is 65536 bytes.

Machine words -- which are always 2 bytes wide -- are stored with the least significant byte in the cell with the higher address: 1234h = 34h 12h (little endian byte ordering).

The Tcode machine has four 16-bit wide special purpose registers which are outlined in the following table.

RegisterDescription
FP Frame Pointer.
The frame pointer always points to the stack frame (aka context) of the currently running procedure. FP is implicitly used by the instructions LDL, LDLV, SAVL, INCL which address local objects. FP is modified only by HDR and END instructions. See also calling conventions.
IP Instruction Pointer.
This register always points to the instruction which will be interpreted next. IP is always interpreted as an offset into the code array. It cannot be accessed directly, but only through jump, call, and branch instructions.
RR Return Register.
This register is used to transport procedure results back to the caller. The return register is loaded by a POP instruction and saved by CLEAN. See also calling conventions.
SP Stack Pointer.
The stack pointer points to the object most recently placed on the stack. Moving an object to the stack implicitly decreases SP by 2. Removing an object increases it by 2. SP may be explicitly modified using the STACK instruction.

The Tcode machine instructions can be divided into four groups: declarations, load\sand\sstore instructions, branches, and arithmetic instructions.

Declarations will be processed only once (therefore, they may be resolved in a preprocessing step). This means that an instruction like

STR 5 H e l l o

will not create a new string literal each time it is interpreted, but only at the first time. (One also might think of this behaviour as creating the same object each time a declaration executed).

Arithmetic instructions expect their arguments on the runtime stack and also place their results there. Since there are no general purpose registers, most operations are performed on stack elements.

2. Calling Conventions

A procdure should always begin with a HDR instruction which saves the caller's context and creates a new stack frame and end with an END instruction which restores the saved stack frame and jumps back to the caller.

A procedure call

P(a, b, c)

where a, b and c be global variables, is coded as follows:

LDG a LDG b LDG c CALL LP CLEAN 3

Each LDG instruction loads the value of a global variable onto the stack. CALL performs the procedure call which returns with its result in the return register RR. LP denotes the label tagging the procedure P. The final CLEAN instruction removes the three arguments from the stack and replaces them with the value returned in RR so that the top stack element finally holds the procedure return value.

Each called procedure may expect the following stack configuration:

FP+M -- Argument #1
FP+3 -- Argument #N-1
FP+2 -- Argument #N
FP+1 -- Return Address (saved by CALL or CALR)
FP+0 -- Old SP (saved by HDR)
FP-1 -- Local variable #1
FP-2 -- Local variable #2
FP-J -- Local variable #K
SP   -- ( Free Memory below )

Note that the arguments are passed to the procedure in reverse order with the first argument at the highest address. Both, arguments and local variables may be accessed using LDL instructions. Given the above context,

LDL M

would access the first argument.

LDL 2

always loads the value of the last argument, if any. Local storage is accessed using negative offets:

NUM 25 SAVL -2

would load the second local variable with the value 25.

3. Instruction Cycles

A cycle is the set of operations which is required to execute one single Tcode instruction. Each cycle consists of the following steps:

  1. Load the instruction pointed to by IP. Increment IP by 1.
  2. If the instruction has an operand (indicated by bit #7 set), load the machine word pointed to by IP into an internal register and increment IP by 2.
  3. If the loaded instruction is an INCG, INCL, or INIT command, load another machine word (pointed to by IP) into another internal register and increment IP by 2.
  4. If the loaded instruction is valid, execute it, otherwise signal an error and halt the machine.

These steps are repeated until the Tcode machine is halted by executing HALT.

If an instruction modifies stack elements, first all its operands are removed from the stack, then the operation denoted by the instruction is performed, and finally the result is placed back on the stack.

Declaration instructions may have more than two arguments. See the section about declarations for details.

4. Startup Conditions

This sections describes the state of the Tcode machine when it is started.

5. Conventions Used in the Following Sections

SymbolDenotes ...
M, N generic numeric values (16 bits wide).
.N an unsigned numeric value. For example, if N=-2, then .N=65534. Normally, the most significant bit of each machine word is interpreted as a sign flag (1 indicates a negative number). The leading dot indicates that the MSB should be treated as part of the value instead of a sign indicator.
L a label which may tag either a data word or a procedure. Label IDs are 16 bits wide.
A an address. An address is an offset into the code or data array where the base (code or data array) is implicitly determined by the associated instruction. (16 bits wide.)
C a character (an 8-bit wide value).
X1 ... XN a vector containing N elements of the type X.
memory[X] the content of the X'th machine word in the data array.
memory::X the content of the X'th byte in the data array.
S0 ... SN the N+1 elements most recently pushed onto the stack. When popping elements, S0 will be removed first and SN will be removed last.

6. Declarations

OpcodeInstruction FormatDescription
81 CLAB L Code LABel.
Define a label identified by the value L which tags the following procedure.
85 CREF L Code REFerence.
Define a word-size storage location holding the address of the procedure tagged by the label L.
84 DATA N DATA definition.
Define a word-size storage location containing the value N.
83 DECL N vector DECLaration.
Define a vector with a length of N machine words and undefined contents.
82 DLAB L Data LABel.
Define a label identified by the value L which tags the following data object.
86 DREF L Data REFerence.
Define a word-size storage location holding the address of the data object tagged by the label L.
BF EXT N C1 ... CN EXTernal reference.
Create an external reference to the symbol represented by the characters C1 through CN at the current position in the code array.
When interpreting a Tcode program, this instruction should lead to an error. It is limited to compiled programs.
89 INIT N V INITialize.
Originally used to initialize the Tcode environment -- hence its name. Each program must begin with this instruction. The argument N specifies the Tcode version the program complies to. This document describes version 2 of the Tcode language.
V is a bit vector describing some features required by the following program. See also the description of the feature vector, below.
88 PSTR N C1 ... CN define Packed STRing.
Define a vector with a length of
(N+2) / 2
machine words containing the characters C1 through CN. Each character is stored in a separate byte, not machine word. All trailing bytes of the vector are filled with zeroes so that a properly terminated packed string is created.
BE PUB N C1 ... CN PUBlic reference.
Create a public tag at the current position in the code array. The characters C1 through CN contain the name of the tag which may be used for external linkage. When interpreting Tcode programs, this instruction type may be ignored safely.
87 STR N M1 ... MN define STRing.
Define a vector with a length of N+1 machine words containing the values M1 through MN. The last vector member always holds the value zero. The primary use of this instruction is the creation of string literals.

6.1 The Feature Vector

As of realease 2, the INIT instruction contains an additional argument called the feature vector. This is a bit vector with the size of a machine word describing some features required by the Tcode program containing the vector. Currently, the following bits are used:

Bit #ValueDescription (if bit set)
01hBASIC Extensions required.
12hVIO (terminal I/O) Extensions required.
24hVector GRAPHICS Extensions required.

The Tcode interpreter must make sure that all required extensions are present before running a program containg a feature vector with any bits set. If it cannot provide an extension requested by the program, it must write an informational message describing the missing extension and exit. Although, a fully T3X compliant compiler may decide not to set any FV bits, even if some of the above extensions are used.

7. Load and Store Instructions

OpcodeInstr. Fmt.Description
15 DEREF DEREFerence.
Remove two elements S0 and S1 and compute the absolute address of the S0'th member of the vector pointed to by S1: S0*2 + S1. Push the computed address.
When used together with IND, this instruction dereferences the S0'th member of S1. Hence the name of this instruction.
14 DREFB DeREFerence Byte.
Remove two elements S0 and S1 and compute the absolute address of the S0'th byte of the vector pointed to by S1 (S0+S1). Push the computed address. See also DEREF.
13 IND INDirect access.
Replace the top element with the value stored in the storage cell pointed to by the top element:
S0 := memory[S0/2].
12 INDB INDirect Byte access.
Replace the top element with a machine word which contains the zero-extended byte located at the address contained in the top element: S0 := memory::S0.
A8 LDG A LoaD Global.
Push the value stored in the memory cell with the address A: memory[A/2].
A9 LDGV A LoaD Global Vector.
Push the address A.
AA LDL N LoaD Local.
Push the value stored at the N'th position `below' the stack frame base. The address of the cell is computed as follows:
FP - N*2
Consequently, negative values of N may be used to access locations `above' the frame base.
AC LDLAB L LoaD LABel.
Push the address which is tagged by the label L. Used to reference vectors and procedures.
AB LDLV N LoaD Local Vector.
Push the absolute address of a local object. N specifies the offset of the object to the stack frame base. The absolute address is computed using the formula FP - N*2.
AD NUM N load NUMber.
Push the value N.
AE SAVG A SAVe Global.
Pop one element and save it in the memory cell with the address A.
AF SAVL N SAVe Local.
Pop one element and save it in the storage cell with the address FP - N*2. See also LDL.
31 STORB STORe Byte.
Pop two elements S0 and S1 and store the least significant 8 bits of S0 in the byte pointed to by S1: memory::S1 := S0.
30 STORE Pop two elements S0 and S1 and store the value S0 in the memory cell pointed to by S1:
memory[S1/2] := S0.

8. Arithmetic Instructions

OpcodeInstr. Fmt.Description
1B ADD Remove two elements S0 and S1 and push their sum: S1+S0.
1D BAND Bitwise AND.
Remove two elements S0 and S1, perform a bitwise AND on them and push the result: S1 & S0.
0E BNOT Bitwise NOT.
Invert each bit in the top element: ~S0.
1E BOR Bitwise OR.
Remove two elements S0 and S1, perform a bitwise OR on them and push the result: S1 | S0.
20 BSHL Bitwise SHift Left.
Remove two elements S0 and S1, shift the bits of S1 to the left by S0 positions and push the result: S1 << S0.
21 BSHR Bitwise SHift Right.
Remove two elements S0 and S1, shift the bits of S1 to the right by S0 positions and push the result: S1 >> S0.
1F BXOR Bitwise eXclusive OR.
Remove two elements S0 and S1, perform a bitwise XOR on them and push the result: S1 ^ S0.
19 DIV integer DIVide.
Remove two elements S0 and S1, compute the (signed) integer part of their quotient and push it: S1 / S0.
If S0=0, signal a fatal error and halt.
22 EQU EQUal.
Remove two elements S0 and S1. Push true, if they are equal and otherwise false: S1=S0-> -1: 0.
25 GRTR GReaTeR than.
Remove two elements S0 and S1. Push true, if S1 is greater than S0 and otherwise false: S1>S0-> -1: 0.
S0 and S1 are both signed.
27 GTEQ Greater Than or EQual to.
Remove two elements S0 and S1. Push true, if S1 is greater than or equal to S0 and otherwise false: S1>=S0-> -1: 0.
S0 and S1 are both signed.
24 LESS LESS than
Remove two elements S0 and S1. Push true, if S1 is less than S0 and otherwise false: S1<S0-> -1: 0.
S0 and S1 are both signed.
0D LNOT Logical NOT.
If the top element is equal to zero, replace it with -1, otherwise with zero: S0=0-> -1: 0.
26 LTEQ Less Than or EQual to.
Remove two elements S0 and S1. Push true, if S1 is less than or equal to S0 and otherwise false: S1<=S0-> -1: 0.
S0 and S1 are both signed.
1A MOD MODulo.
Remove two elements S0 and S1, compute their division remainder and push it: S1 MOD S0. S1 MOD S0 is defined as S1 - S1/S0*S0 where `/' denotes an integer division.
If S0=0, signal a fatal error and halt.
18 MUL MULtiply.
Remove two elements S0 and S1, compute their (signed) product and push it: S1 * S0. Do not perform any overflow checking.
0C NEG NEGate.
Negate the top element: -S0.
23 NEQU Not EQUal.
Remove two elements S0 and S1. Push true, if they are not equal and otherwise false: S1\=S0-> -1: 0.
1C SUB SUBtract.
Remove two elements S0 and S1 and push their difference: S1-S0.
41 UDIV Unsigned integer DIVide.
Remove two elements S0 and S1, compute the unsigned integer part of their quotient and push it: .S1 / .S0.
If S0=0, signal a fatal error and halt.
43 UGRTR Unsigned GReaTeR than.
Remove two elements S0 and S1. Push true, if .S1 is greater than .S0 and otherwise false: .S1>.S0-> -1: 0.
45 UGTEQ Unsigned Greater Than or EQual to.
Remove two elements S0 and S1. Push true, if .S1 is greater than or equal to .S0 and otherwise false: .S1>=.S0-> -1: 0.
42 ULESS Unsigned LESS than
Remove two elements S0 and S1. Push true, if .S1 is less than .S0 and otherwise false: .S1<.S0-> -1: 0.
44 ULTEQ Unsigned Less Than or EQual to.
Remove two elements S0 and S1. Push true, if .S1 is less than or equal to .S0 and otherwise false: .S1<=.S0-> -1: 0.
40 UMUL Unsigned MULtiply
Remove two elements S0 and S1, compute their unsigned product and push it: .S1 * .S0. Do not perform any overflow checking.

9. Procedures and Flow Control

OpcodeInstr. Fmt.Description
B6 BRF L BRanch on False.
Remove one element S0 and branch to the label L, if S0 is false.
B7 BRT L BRanch on True.
Remove one element S0 and branch to the label L, if S0 is true.
B3 CALL L procedure CALL.
Push the current value of the instruction pointer IP and then perform a branch to the code array location tagged by the label L.
34 CALR CALl through Register.
Push the current value of the instruction pointer IP and then remove one element and perform a branch to the location it points to. The destination is implicitly located in the code array of the program.
90 CLEAN N CLEAN up arguments.
Remove N procedure arguments from the stack:
SP := SP + N*2
and then push the content of RR, the return register.
0B END END procedure.
Remove two elements S0 and S1. Restore the context of the calling procedure by loading FP with S0 and then perform a branch to S1. S1 is usually a return address which has been saved by a CALL or CALR instruction.
B5 EXEC N EXECute extension procedure.
Execute the extension procedure whose address is stored in slot number N of the extension vector. If N is not a valid slot number or points to an unused slot, signal a fatal error and halt. The exact semantics of EXEC depend on the called procedure.
EXEC may be used to provide Tcode programs with some extended functionality, like access to resources of the host system (files, devices, etc, ...).
BC DNEXT L Downward NEXT.
Remove two elements S0 and S1 and branch to the label L, if S1 <= S0. (*)
3D HALT Instantly halt the Tcode machine.
0A HDR HeaDeR.
Push the context of the calling procedure (FP) and create a fresh procedure context by loading FP with SP.
BA JUMP L Unconditionally jump to the label L.
B8 NBRF L Nondestructive BRanch on False.
Branch to the label L, if S0 is false. Do not remove S0.
B9 NBRT L Nondestructive BRanch on True.
Branch to the label L, if S0 is true. Do not remove S0.
0F POP Pop the top element S0 and load it into the return register RR.
B2 STACK N Move the stack pointer SP by N machine words:
SP := SP - N*2.
Moving the stack pointer `down' (N>=1) allocates space on the stack, moving it `up' (N<=-1) deallocates space. STACK is primarily used to allocate and release dynamic memory in procedures.
BB UNEXT L Upward NEXT.
Remove two elements S0 and S1 and branch to the label L, if S1 >= S0. (*)

(*) The instructions UNEXT and DNEXT have been designed for use in counting loops (FOR-NEXT loops in BASIC). The idea is as follows:
At the end of the loop, the current loop index and the loop limit are both pushed onto the stack. UNEXT compares the values and branches out of the loop, if index>=limit. DNEXT branches, if index<=limit. Therefore, UNEXT is used in upward counting loops and DNEXT is used in countdown loops.

10. Other Instructions

OpcodeInstr. Fmt.Description
46 DUP DUPlicate
Push the current top of the stack (S0), thereby duplicating it.
96 INCG A N INCrement Global.
Add the value N to the value of the memory cell located at the address A. This is exactly the same as
LDG A NUM N ADD SAVG A
but more efficient.
97 INCL M N INCrement Local.
Add the value N to the value of the memory cell whose absolute address is FP - M*2.
INCL M N
is equal to
LDL M NUM N ADD SAVL M
but more efficient.
11 NOP NO Operation.
Rest for a cycle.
47 SWAP Exchange the values of S0 and S1.

11. Source Level Debugging Support

OpcodeInstr. Fmt.Description
C9 GSYM N C1...CN A Global SYMbol.
Name a global symbol. C1 through CN contain the single characters of the symbol name. A is the ID of the label which marks the named symbol. GSYM instructions should be generated only for global variable names.
C8 LINE N LINE number.
The following instructions have been created from line N in the source code this Tcode program has been created from.
C9 LSYM N C1...CN A Local SYMbol.
Name a local symbol. C1 through CN contain the single characters of the symbol name. A is a signed number holding the position of the variable relative to the value of FP.

12. License

This document is part of the T3X compiler package which is subject to the following terms.

T3X -- A Compiler for the Procedural Language T, version 3X
Copyright (C) 1996-2000 Nils M Holm. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.