Load and Store

Opcode	Action	Flags	Description
`LD Rx,#`	Rx <= Imm	Z	Loads a register with an immediate value. There are two versions of this instruction: one loads 16 bits and the other loads 32 bits. The assembler chooses the correct one based on the size of the immediate value. The Z flag is set if the immediate value is 0, and cleared otherwise.
`LD Rx,Ry`	Rx <= Ry	Z	Moves the value from Ry to Rx. The Z flag is set if the value moved is 0, and cleared otherwise.
`PUSH Rx`	Stack[i] <= Rx	Increase Stack Number	Pushes Rx to the stack. The stack is 32 bits wide, and 16 words deep. Attempting to push more than 16 values will generate an error.
`POP Rx`	Rx <= Stack[i]	Decrease Stack Number	Pops from the stack to Rx. The stack is 32 bits wide, and 16 words deep. Attempting to pop from an empty stack will generate an error.
`LD.B Rx,(nnnn)` `LD.W Rx,(nnnn)` `LD.L Rx,(nnnn)`	Rx <= MEM(nnnn)	Z	Loads Rx with data from the absolute memory location specified by nnnn. The .B, .W, and .L suffixes specify whether a byte, word, or long word is loaded into the register. If a byte or word is loaded, the upper 24 or 16 bits (respectively) are zeroed. Z is updated in a similar fashion to the other load instructions. Attempting to read past the end of memory wraps around: for example, reading a long word from `0x1ffe` reads data from `0x1ffe`, `0x1fff`, `0x0000`, and `0x0001`.
`LD.B Rx,(Ry)` `LD.W Rx,(Ry)` `LD.L Rx,(Ry)`	Rx <= MEM(Ry)	Z	Loads Rx with data from the memory pointed to by Ry. Works the same as the absolute version, only using Ry as a pointer. All the same rules apply.
`LD.B (nnnn),Rx` `LD.W (nnnn),Rx` `LD.L (nnnn),Rx`	MEM(nnnn) <= Rx		Stores Rx to the absolute memory location specified by nnnn. The .B, .W, and .L suffixes specify whether a byte, word, or long word is stored to memory. The Z flag is not updated. Attempting to write past the end of memory causes a wraparound, as in the load version of this instruction.
`LD.B (Ry),Rx` `LD.W (Ry),Rx` `LD.L (Ry),Rx`	MEM(Ry) <= Rx		Stores Rx into the memory pointed to by Ry. Works the same as the absolute version, only using Ry as a pointer. All the same rules apply.
`PMPW Rx,Ry`	FPGA(Rx) <= Ry		Writes to PMP space on the FPGA. Rx holds the address to access, and Ry holds the data to be written. Addresses and data are both 32 bits wide.
`PMPR Rx,Ry`	Ry <= FPGA(Rx)		Reads from PMP space on the FPGA. Rx holds the address to access, and Ry receives the data read. Addresses and data are both 32 bits wide.
`PMPBW Rx,Ry`	FPGA((Rx & 0xf0000000) \| ((Rx & 0x03ffffff) << 2)) <= Ry	Z	Similar to PMPW, but writes bytes instead. It does this by left-shifting the address and still writing 32 bits, but only 8 of them matter. The address written is `(Rx & 0xf0000000) \| ((Rx & 0x03ffffff) << 2)`. This might look a little odd, but it lets the programmer set aside a 256 Mbyte space and write 64 Mbytes' worth of bytes into it. For example, the address range 1000_0000 to 1fff_ffff can be used for byte addressing. The code first strips off the bottom 28 bits, leaving 1000_0000. The low bits are then shifted left by 2 and OR'd back on. Bits 26 and 27 are ANDed off beforehand, because after the shift they would step on the upper 4 bits. Some address examples (VM address -> PMP address): `1000_0000 -> 1000_0000`, `1000_0001 -> 1000_0004`, `1000_0002 -> 1000_0008`, `1000_0003 -> 1000_000C`, `1000_0004 -> 1000_0010`, `1000_1234 -> 1000_48D0`, `13ff_ffff -> 1fff_fffc`, `1400_0000 -> 1000_0000` (bits 26 and 27 are ANDed off, so the address wraps around to 1000_0000 again), `1800_0001 -> 1000_0004` (wraps again because bits 26/27 are ANDed off), `1c00_0002 -> 1000_0008` (this wraps too), `1fff_ffff -> 1fff_fffc` (and wraps again). On the FPGA, these addresses can simply be aliased to another range to perform the byte writes, if you need to write to RAM byte-wise rather than word-wise as usual.
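
The two address rules described in the table (the PMPBW address translation and the memory wraparound on multi-byte accesses) can be checked with a short Python sketch. This is only an illustration, not the VM's actual implementation: the function names `pmpbw_address`, `touched_addresses`, and `fmt` are hypothetical, and the memory size of `0x2000` bytes is an assumption inferred from the `0x1ffe` wraparound example rather than a documented value.

```python
# Illustrative sketch only; names and MEM_SIZE are assumptions, not part of the VM.

MEM_SIZE = 0x2000  # assumed: inferred from the 0x1ffe -> 0x0000 wraparound example


def pmpbw_address(rx):
    """Return the PMP address that PMPBW would write for a given Rx value."""
    rx &= 0xFFFFFFFF  # treat Rx as a 32-bit register
    # Keep the top 4 bits, drop bits 26/27, shift the low 26 bits left by 2.
    return (rx & 0xF0000000) | ((rx & 0x03FFFFFF) << 2)


def touched_addresses(addr, size, mem_size=MEM_SIZE):
    """Return the byte addresses touched by a `size`-byte access at `addr`,
    wrapping around at the end of memory."""
    return [(addr + i) % mem_size for i in range(size)]


def fmt(value):
    """Format a 32-bit value in the xxxx_xxxx style used in the table."""
    return f"{value >> 16:04x}_{value & 0xFFFF:04x}"


if __name__ == "__main__":
    for vm in (0x10000001, 0x10001234, 0x13FFFFFF, 0x14000000, 0x1FFFFFFF):
        print(fmt(vm), "->", fmt(pmpbw_address(vm)))
    # A long-word (4-byte) access starting at 0x1ffe wraps to the start of memory.
    print([hex(a) for a in touched_addresses(0x1FFE, 4)])
```

Running the script prints the translated addresses for a few of the VM addresses listed in the table, followed by the four byte addresses touched by a long-word access at `0x1ffe`.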