BIT: A Hardware Description Language and CPU

BIT is a hardware description language that I designed and wrote in 1992.  Sadly, the original C++ source code has been lost forever.  In 2011, I resurrected the project and re-wrote BIT in C#.  The new version includes a context sensitive editor, real time language parser,  optimizing compiler, and CPU simulator.  Download: Bit-V4.exe (Windows XP/Vista/7)

In addition to BIT (the hardware description language), I also designed a 32 bit CPU.  It currently has 41 instructions and 18 registers.  All instructions are 32 bits, optionally followed by an immediate 32 bit word, and execute in 1 to 4 cycles depending on the indexing mode.  The CPU currently has 11864 logic gates.

Context Sensitive Editor with Syntax Highlighting:

BIT Overview

The BIT hardware description language is based on Boolean algebra, where logic signals can carry two values: TRUE (which has a value of 1) or FALSE (which has a value of 0).  All logic circuits are boiled down to the basic Boolean operators:  NOT, AND, OR, and XOR (exclusive OR).  Visit the operators page for more information.

The basic building block in BIT is the BOX.  A box is a re-usable circuit that performs a Boolean calculation.  For instance:

// Single bit adder with carry in and out
box Add1(in a, in b, in c, out answer, out carry) is
    answer = a # b # c;        // Use XOR to calculate the result
    carry = a*b + a*c + b*c;   // Use AND and OR to calculate the carry
end

The circuit above takes three inputs (a, b, and c) and generates a two bit sum.  This is called a full adder.  For more information about a full adder, visit Wikipedia’s Full Adder page.

A 4 bit adder can be created by using 4 single bit full adders.  This circuit has 14 pins (9 inputs, and 5 outputs):

// A 4 bit ripple adder (very slow)
box RippleAdd4(in cin, in left[4], in right[4], out answer[4], out carryOut) is
    bit carrys[3];
    Add1(left[0], right[0], cin, answer[0], carrys[0]);
    Add1(left[1], right[1], carrys[0], answer[1], carrys[1]);
    Add1(left[2], right[2], carrys[1], answer[2], carrys[2]);
    Add1(left[3], right[3], carrys[2], answer[3], carryOut);
end

Try simulating a 32 bit ripple adder by running BIT, pressing F5, and selecting “RippleAdd32”.  Set the left parameter to “FFFFFFFF” and then set the cin parameter to “1”.  The output will slowly change from “FFFFFFFFF” to “00000000”, and you will see why it’s called a ripple adder.  It takes 224 logic gates to implement the circuit, and it takes 69 gate times to perform the addition.

Now try simulating a 32 bit look ahead adder by pressing F5 and selecting “LahAdd32”.  It takes 466 gates to implement this particular look ahead adder, but only 8 gate times to perform the same addition.  You can read more about look ahead adders at Wikipedia’s Look Ahead Adder page.

An RS flip flop (see Wikipedia’s RS flip flop page) can be created as follows:

// Standard RS flip-flop.
// 		s - Active low (when it goes low, the output is set)
//		r - Active low (when it goes low, the output is reset)
// s and r should never be low at the same time
// See http://en.wikipedia.org/wiki/Flip-flop_(electronics)
box RSFF(in s, in r, out q, out notQ) is
    q = !(s*notQ);
    notQ = !(r*q);
end

Visit the BIT Source Registers page to see how to build the latches and registers used by the CPU.

Here is an example of the if-elif-else-end statement, which uses the “sum of products” to calculate AlignBus32In.

// Align the bus input on an int/short/byte - BIG ENDIAN
//      byte: True if the reading a byte, false for short or int
//      word: True if reading a short, false if int
//      address: Last two bits of the address being read
//      dataIn: The 32 bit int read from the address (divisible by 4)
// RETURNS: The bus data re-aligned for the given type and address
// NOTE: Data must be aligned properly for the given type
// (int on 4 byte boundary, short on 2 byte boundary, byte anywhere)
box AlignBus32In[32](in byte, in word, in address[2], in dataIn[32]) is
    bit zWord[16] = 0[0:16];
    bit zByte[8] = 0[0:8];
    if (byte)
        // 8 bit value (byte)
        if (address == 0)
            AlignBus32In = set(zByte, zByte, zByte, dataIn[24:8]);
        elif (address == 1)
            AlignBus32In = set(zByte, zByte, zByte, dataIn[16:8]);
        elif (address == 2)
            AlignBus32In = set(zByte, zByte, zByte, dataIn[8:8]);
        else
            AlignBus32In = set(zByte, zByte, zByte, dataIn[0:8]);
        end
    elif (word)
        // 16 bit value (short)
        if (address[1] == 0)
            AlignBus32In = set(zWord, dataIn[16:16]);
        else
            AlignBus32In = set(zWord, dataIn[0:16]);
        end
    else
        // 32 bit value (int)
        AlignBus32In = dataIn;
    end
end

The CPU uses this circuit to re-align the data bus input for the given type.  For instance, a byte read from address 0xA9 translates to a word read from address 0xA8, and the output is re-aligned so the correct byte goes to the least significant bits.  If the bus data at 0xA8 is AABBCCDD, the output is 000000BB (in big endian format, AA is at location 0xA8, BB is at location 0xA9, CC is at location 0xAA, and DD is at location 0xAB)

BIT Operators

The BIT operators are listed in this table, from highest precedence to lowest:

Name Operator Description
Array [] The array operator is used to select a single bit (output = input[8]) or a range of bits (output = input[8..15])
NOT ! The NOT operator negates its input.  !0 is 1, and !1 is 0.  The operand may be a single bit, or an array of bits.
AND * The AND operator multiples its two inputs:
0*0=0,    0*1=0,    1*0=0,    1*1=1
The operands may be a single bit, or arrays of equal length.
XOR # The XOR operator performs an exclusive or on its two inputs:
0#0=0,    0#1=1,    1#0=1,    1#1=0
The operands may be a single bit, or arrays of equal length.
OR + The OR operator adds its two inputs:
0+0=0,    0+1=1,    1+0=1,    1+1=1
The operands may be a single bit, or arrays of equal length.
Equal ==, != The == operator translates to an XNOR, and the != operator translates to an XOR.  The operands may be a single pin, or arrays of equal length.
Ternary ?: The ternary operator (x ? y : z) uses the “sum of products” to translate to (x*y + !x*z).  Both Y and Z may be a single bit, or arrays of the same length.  X may be a single bit, or an array of the same length as Y and Z.
Range :, .. The range operator is used to select a range of bits from an array.  It works only on const int (not on bits, as most other operators do).  The “:” operator takes two operands (start_index  :  length) to form a range.  The “..” operator takes two indices (start_index .. end_index).  For example, if input is an array of 32 bits, the second byte can be selected using either of these two expressions: (output = input[8:8]) or (output = input[8..15])
Assign = The assignment operator assigns the left variable to the expression on the right.  Both operands must be a single bit, or arrays of the same length.
If if
elif
else
end
While not technically an operator, the “if-elif-else-end” statement combines a series of assignments using the “sum of products” (similar to the ternary operator above).  NOTE: The “if” statement can only be used for combinatorial logic.  Unlike VHDL, it can not be used to create a memory cell.

NOTE: Most operators can also be used with the const int type, and work as you would expect them to work.  In addition to the above operators, integers can be divided “/” and subtracted “-”

CPU Assembly Language Quick Reference

All instructions have been implemented, except for POP.

There are 16 general purpose 32 bit registers:
    0..13     		General purpose registers (R0..R13)
    14 = S		Stack (mostly by convention, not because it's very special)
    15 = PC		Program counter

Extended registers (when X bit is set):
    0 = CC		Condition Codes
    1 = LR		Link register (used to return from subroutine)
    2..12		Reserved (will be used for interrupts)
    13..15		Map to general purpose registers (R13, S, PC)
Push/Pop
0000001P ######## ######## ########
        02:Push   r0..r15,cc,link
	03:Pop    cc,link,r15..r0 (Pop Not Yet Implemented)
Conditional Branches, Jumps, and Jump to Subroutine
001CCCCC ######## ######## ######## [optional immediate word]
    C = 20:BRN          21:BRA          22:BEQ          23:BNE
        24:BCS/BULT     25:BCC/BUGE     26:BVS          27:BVC
        28:BMI          29:BPL          2A:BULE         2B:BUGT
        2C:BSLT         2D:BSGE         2E:BSLE         2F:BSGT
        30:JSR #        Jump to subroutine (absolute)
        31:BSR #        Branch to subroutine (relative to PC)
    # = 24 bit signed value times 4, or 0x800000 for immediate value follows
    NOTE: ALU instruction 0x56 is also a jump to subroutine (e.g LD.J PC,[R0+R2*4])

ALU integer op-code chart:
        40:LD           41:ST           42:ADD          43:ADC
        44:SUB          45:SUBC         46:SUBR         47:SUBRC
        48:CMP          49:BIT          4A:AND          4B:OR
        4C:XOR          4D:SL           4E:SLC          4F:SR
        50:USR          51:SRC          52:MIN          53:UMIN
        54:MAX          55:UMAX         56:LD.J

CPU Instruction Modes (for ALU instructions listed above):
01CCCCCC    MMMMDDDD    [see below for next 16 bits]
    C: Alu operation (see ALU integer chart above)
    M: Mode, the next 16 bits are defined as follows:
        0: ######## ########    D = D op #16u
        1: ######## ########    D = D op #~16u
        2: SSSS#### ########    D = S op #12
        3: SSSS#### TTTx####    D = S op #8             [#32 follows when 0x80]
        4: SSSSRRRR TTTx0nnn    D = S op (R<<n)
        5: SSSSRRRR TTTx0iii    D = S op [R]            iii = [++R],[--R],[R++],[R--]
        6: SSSSRRRR TTTx0nnn    D = D op [S+(R<<n)]
        7: SSSSRRRR TTT#####    D = S op [R+#~5ux]      [#32 follows when 0]
        8: SSSSRRRR TTT#####    D = S op [R+#5ux]
        9: SSSSRRRR TTT#####    D = S op [R+#5ux+32x]
        A: SSSSRRRR TTT#####    D = S op [R+#5ux+64x]
        B: SSSSRRRR TTT#####    D = S op [R+#5ux+96x]
        C: ####RRRR TTT#####    D = D op [R+#9ux]
        D: ####RRRR TTT#####    D = D op [R+#9ux+512x]
        E: ####RRRR TTT#####    D = D op ->[R+#9ux]     [#32 follows when 0x1F]
        F: Reserved for future
    D: Destination register (and source register 1 for two register operations)
    S: Source register 1
    R: Source register 2
    n: Three bit number to shift left (unused if the ALU operation is a shift)
    i: Pre/post inc/dec, i = (0:[++R], 1:[--R], 2:[R++], 3:[R--], 4:[R])
    x: Use extended registers (r0, r1, cc, link, ... s, pc)
    T: Type - sbyte,byte,short,ushort,int,(reserved for future: float,long,double)
              NOTE: The right hand operand is sign extended (or zero filled) before
              the ALU operation.  The ALU operation is always performed as a 32
              bit integer operation.  The ALU output can be sign extended (or
              zero filled) by setting the E bit.    #16u = unsigned 16 bit number
    #~16u = unsigned 16 bit number | 0xFFFF0000 - always negative
    #12 (and #7) = signed 12 bit (or 7 bit) number
    #~5ux = (5 bit unsigned number | 0xFFFFFFe0) * sizeof(type) - always negative
    #5ux = 5 bit unsigned number * sizeof(type)
        NOTE: +32x, +64x, +96x = the number * sizeof(type)
    #9ux = 9 bit unsigned number * sizeof(type)

CPU Assembly Language Example

// *** CRC Example (in C) ***
uint Crc32(uint crc, uint *table, byte *buffer, int length)
{
    while(--length >= 0)
        crc = table[((crc >> 24) ^ *buffer++)] ^ (crc << 8);
}

// *** CRC Example (in Assembly Language) ***
#define crc     r0
#define table   r1
#define buffer  r2
#define length  r3

44030001    sub     length,1
0Cxxxxxx    bslt    crc_done		// Branch signed less than
         crc_loop:
50240018    usr     r4,crc,24           // r4 = ((uint)crc >> 24)
4C544202    xor.b   r4,[buffer++]       // r4 = r4 ^ *buffer++
4D000008    sl      crc,8		
4C601482    xor     crc,[table+r4*4]
44040001    sub     length,1
0Dxxxxxx    bsge    crc_loop		// Branch signed greater than or equal
         crc_done:
404F4490    ld      pc,rlink            // Return to subroutine

BIT “Source Code” for CPU Registers

NOTE: You can see the BIT “Source Code” for the entire CPU by downloading BIT.

Visit Wikipedia’s Flip Flop page read about the RS flip flop.

// Standard RS flip-flop.
// 		s - Active low (when it goes low, the output is set)
//		r - Active low (when it goes low, the output is reset)
// s and r should never be low at the same time
// See http://en.wikipedia.org/wiki/Flip-flop_(electronics)
box RSFF(in s, in r, out q, out notQ) is
    q = !(s*notQ);
    notQ = !(r*q);
end

Notice that this latch returns a single bit output by appending “[1]” to the name of the box:

// Gated latch, with enable high data pass through.
// When enable is true, data passes through.  When enable
// is low, the data is locked and holds its previous value.
box Latch1[1](in enable, in data) is
    bit s = !(enable*data);
    bit r = !(enable*s);
    RSFF(s, r, Latch1[0], unused);
end

And this latch returns an array of “[4]” bits:

// A 4 bit gated latch, with enable high data pass through.
// When enable is true, data passes through.  When enable
// is low, the data is locked and holds its previous value.
box Latch4[4](in enable, in data[4]) is
    Latch4[0] = Latch1(enable, data[0]);
    Latch4[1] = Latch1(enable, data[1]);
    Latch4[2] = Latch1(enable, data[2]);
    Latch4[3] = Latch1(enable, data[3]);
end

These latches use the range operator to select a range of bits from an array of bits:

// An 8 bit gated latch, with enable high data pass through.
// When enable is true, data passes through.  When enable
// is low, the data is locked and holds its previous value.
box Latch8[8](in enable, in data[8]) is
    Latch8[0:4] = Latch4(enable, data[0:4]);
    Latch8[4:4] = Latch4(enable, data[4:4]);
end

// A 32 bit gated latch, with enable high data pass through.
// When enable is true, data passes through.  When enable
// is low, the data is locked and holds its previous value.
box Latch32[32](in enable, in data[32]) is
    Latch32[0:8] = Latch8(enable, data[0:8]);
    Latch32[8:8] = Latch8(enable, data[8:8]);
    Latch32[16:8] = Latch8(enable, data[16:8]);
    Latch32[24:8] = Latch8(enable, data[24:8]);
end

Visit Wikipedia’s Flip Flop page read about the classical positive edge triggered D flip flop.

// Classical edge triggered D flip-flop, with positive edge trigger.
// On the rising edge of the clock, the data is locked in and passed
// through to the output.  At all other times, the previous value is used.
box DFF[1](in clock, in data) is
    // Use three RS flip-flops to compose an edge triggered flip flop
    bit r;
    bit s;
    bit notD;
    // When clock is low, both S and R are high, and the 
    // last flip-flop retains its value.
    RSFF(notD, clock, unused, s);       // Generate active low S signal
    RSFF(clock*s, data, r, notD);       // Generate active low R signal
    RSFF(s, r, DFF[0], unused);         // Output flip flop
end

// An 8 bit register with positive edge trigger.  
// On the rising edge of the clock, the data is locked in (and passed through)
box Register8[8](in clock, in data[8]) is
    Register8[0] = DFF(clock, data[0]);
    Register8[1] = DFF(clock, data[1]);
    Register8[2] = DFF(clock, data[2]);
    Register8[3] = DFF(clock, data[3]);
    Register8[4] = DFF(clock, data[4]);
    Register8[5] = DFF(clock, data[5]);
    Register8[6] = DFF(clock, data[6]);
    Register8[7] = DFF(clock, data[7]);
end

// A 32 bit register with positive edge trigger.  
// On the rising edge of the clock, the data is locked in (and passed through)
box Register32[32](in clock, in data[32]) is
    Register32[0:8] = Register8(clock, data[0:8]);
    Register32[8:8] = Register8(clock, data[8:8]);
    Register32[16:8] = Register8(clock, data[16:8]);
    Register32[24:8] = Register8(clock, data[24:8]);
end

Leave a Reply