Hi,
This is a year-old thread, but the project is still being developed by the OP (Herbert) in Another Thread. However, this one contains the main specification (in post #1), so it is a better place to continue.
To start with the last question: Yes, the C code for the CRC would need to be converted into PICaxe Basic. However, it's not essential to use the CRC: it can only detect that an error is present somewhere in the frame (which might not be in the part of the data actually being used); a CRC cannot locate or correct any error(s). So it may be better to tolerate a "possible error" than to "throw away" all the data in the frame. However, hippy did cover the overall structure, so it's worth evaluating the code, but I would never consider such code "finished" until it has been validated against real (example) data; it's far too easy to make errors (or oversights) in the coding, or in the original specification.
I believe a "literal" conversion of the CRC code into PICaxe Basic is:
Code:
symbol Value = w0
symbol ValueLow = b0
symbol ValueHigh = b1
symbol CRC = w1
symbol Polynomial = $1021
symbol i = b6
CalcCrcFromAtPtr:
    ValueLow = @ptr
    CRC = Value << 8 XOR CRC              ; Reversed sequence for L - R precedence, ie. Shift Value left by 8 bits then XOR
    FOR i = 0 TO 7 STEP 1                 ; For each bit of the last input byte
        IF CRC > $7FFF THEN               ; MS bit is set as a 1
            CRC = CRC << 1 XOR Polynomial ; = $1021
        ELSE
            CRC = CRC << 1                ; Only shift left by one bit-position
        ENDIF
    NEXT i
    Return
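Since the code should be validated against example data before it is trusted, a PC-side reference can be handy for comparing against whatever the PICaxe produces. The following is only a minimal Python sketch of the same shift-and-XOR algorithm; the function names are my own, and the start value of 0 and the absence of any final XOR are assumptions that must be checked against the specification in post #1.

```python
POLYNOMIAL = 0x1021  # same constant as in the PICaxe listing


def crc16_update(crc: int, byte: int) -> int:
    """Fold one input byte into a 16-bit CRC, most significant bit first."""
    crc ^= (byte & 0xFF) << 8              # same as "Value << 8 XOR CRC"
    for _ in range(8):                     # once for each bit of the byte
        if crc & 0x8000:                   # same test as "CRC > $7FFF"
            crc = ((crc << 1) & 0xFFFF) ^ POLYNOMIAL
        else:
            crc = (crc << 1) & 0xFFFF      # keep the result within 16 bits
    return crc


def crc16(data: bytes, initial: int = 0x0000) -> int:
    """CRC of a whole buffer; the start value of 0 is an assumption."""
    crc = initial
    for byte in data:
        crc = crc16_update(crc, byte)
    return crc


if __name__ == "__main__":
    print(hex(crc16(b"123456789")))
```

With a start value of 0 this is the variant usually catalogued as CRC-16/XMODEM, whose published check value for the ASCII string "123456789" is $31C3; with a start value of $FFFF (CRC-16/CCITT-FALSE) the check value is $29B1. Which of these (if either) the real data uses can only be settled with a captured frame.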
The inner loop is executed 8 times for every byte (i.e. once for each bit), so those instructions need to be executed up to 33 * 8 = 264 times. That may take a considerable time, so we should try to make the code as efficient as possible. Firstly, we can avoid the Value << 8 instruction by loading the byte directly into the High Byte, i.e. ValueLow = 0 (required only once) : ValueHigh = @ptr. Secondly, the FOR ... NEXT loop involves 8 "jump back"s (i.e. GOTOs), so it's significantly faster to "unroll" the loop by repeating the same instructions 8 times. Obviously this makes the program larger, but not enormously so, and it can be kept tidy by using a Macro. Also, the ELSE part in the IF ... ENDIF structure involves time-consuming "hidden" jumps, so is better avoided.
The algorithm reads the Most Significant Bit of the CRC, shifts CRC left by one bit and then XORs CRC with the "Polynomial" if that bit was set. This is actually easier to code in Assembler or "Machine Code" than in most higher-level languages, because we could use a simple "Shift Left" which moves the Top Bit into a "Carry" Flag, which can immediately determine whether the XOR should be executed or "Skipped". Unfortunately, PICaxe Basic (and, it appears, "C") does not support a "Carry" flag, so it's necessary to test the Top Bit of the CRC word and "Remember" (i.e. store) its value until the Shift has been executed. One method for doing this is to conditionally set a variable to either the Polynomial value or to Zero, and then always execute the XOR whether it is needed or not (since an XOR with zero does nothing). But this does "waste" a little time, so I believe the fastest (PICaxe) algorithm is probably:
Code:
; Other symbols as above
symbol TopBit = bit31 ; MSbit of w1
symbol EndPtr = 24 ; Or 32 (declare a variable for dynamic selection)
symbol Polynomial = w2 ; Or directly define as a constant $1021
Polynomial = $1021 ; Processing a variable should be slightly faster than a (word) constant
#Macro CRCcore                        ; \/ Estimated "Base PIC" Instruction Cycles
    Value = TopBit                    ; 500 ICs
    CRC = CRC + CRC                   ; 600 (For an X2, << 1 might be faster)
    If Value <> 0 then                ; 1200
        CRC = CRC XOR Polynomial      ; +200 ICs if executed
    Endif                             ; Average Total = 2400 ICs
#EndMacro
CalcCrcFromAtPtr:
    CRC = 0
    ptr = 0
    do                                ; Loop excluding the Macros takes approximately 3000 ICs per pass
        ValueLow = 0                  ; Or could use another variable for the Topbit flag
        ValueHigh = @ptrinc
        CRC = CRC XOR Value
        CRCcore                       ; Apply the left-shift and XOR algorithm
        CRCcore                       ; Approximately 2400 ICs each time
        CRCcore
        CRCcore
        CRCcore
        CRCcore
        CRCcore
        CRCcore                       ; Total ICs per loop = ~22,200 (~1.4ms @64 MHz)
    loop until ptr > endptr           ; 25 or 33 passes depending on mode (= 35 - 46 ms)
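Because the unrolled version rearranges the bit step (remember the top bit, double the CRC, then conditionally XOR), it is worth confirming on a PC that it gives the same result as the literal conversion. The Python model below is only a sketch with names of my own choosing; it also includes the "always XOR with Polynomial or zero" alternative mentioned earlier, and it assumes 16-bit wrap-around arithmetic as on the PICaxe.

```python
POLYNOMIAL = 0x1021


def step_literal(crc: int) -> int:
    """Bit step from the first listing: IF CRC > $7FFF ... ELSE ..."""
    if crc > 0x7FFF:
        return ((crc << 1) & 0xFFFF) ^ POLYNOMIAL
    return (crc << 1) & 0xFFFF


def step_macro(crc: int) -> int:
    """Bit step from CRCcore: store the top bit, double, then maybe XOR."""
    top_bit = crc >> 15                    # Value = TopBit
    crc = (crc + crc) & 0xFFFF             # CRC = CRC + CRC (16-bit wrap)
    if top_bit != 0:
        crc ^= POLYNOMIAL
    return crc


def step_masked(crc: int) -> int:
    """Alternative: choose Polynomial or zero first, then always XOR."""
    mask = POLYNOMIAL if crc & 0x8000 else 0
    return ((crc + crc) & 0xFFFF) ^ mask


def crc_unrolled(data: bytes, crc: int = 0) -> int:
    """Model of the do ... loop: one byte per pass, eight bit steps each."""
    for byte in data:
        crc ^= byte << 8                   # ValueHigh = byte, ValueLow = 0
        for _ in range(8):                 # stands in for the 8 CRCcore lines
            crc = step_macro(crc)
    return crc


if __name__ == "__main__":
    same = all(step_literal(c) == step_macro(c) == step_masked(c)
               for c in range(0x10000))
    print("all three bit steps agree:", same)
```

This only shows that the three formulations are arithmetically equivalent; it says nothing about their relative speed on a real chip.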
To be useful, the CRC calculation must be applied to ALL the received bytes, but this could take too long. The execution time of the CRC appears to be almost two complete frames, which suggests that it is not practical to calculate the CRC with a PICaxe in this application. Also, IMHO, hippy was a little "optimistic" in saying:
... 115200 baud is something most PICAXE can handle.
Yes, for literally only one or two bytes, or if the bytes have significant pauses between them, but if the bytes are almost "touching" (technically called "concatenated"), as appears to be the case in the specification above, then any M2 will very soon fail. Even an X2 at 64 MHz needs to use the HSERIN "Background Receive" mode and may be unable to keep up with processing a continuous stream of bytes. Therefore, although hippy's code above is technically correct, the CRC calculation seems too slow for a practical application, so this is better removed from the general GetWord routine (which might not need to be used for every pair of input bytes anyway). Note that the CRC routine reads bytes directly from the ScratchPad RAM buffer; it is NOT concerned that the bytes are paired into words. So the Macro becomes:
Code:
#Macro GetWord(wVar)
    Do : Loop While ptr = hSerPtr     ; 800 ICs to fall-through, 1200 for each loop
    wVar = @ptrInc                    ; ~600 Instruction Cycles
    Do : Loop While ptr = hSerPtr     ; 800 ICs to fall-through, 1200 for each loop
    wVar = wVar << 8 OR @ptrInc       ; Perhaps ~1000 Instruction Cycles
#EndMacro                             ; Minimum execution time ~3200 ICs (= 200us)
The first line is necessary to wait for the first serial byte to be received (which increments hSerPtr), and then the maximum code execution speed appears to be almost the same as the rate at which the bytes are received (100 us/byte). However, I've never measured X2 instruction timings, and I am also assuming that the Do : Loop While {not ready} executes in the same time as a label: IF {not ready} THEN GOTO label instruction (which should be the same, but is more difficult to code in a Macro).
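The cycle estimates quoted in the listing comments can be converted to times with a few lines of arithmetic. The Python sketch below assumes that one "Base PIC" instruction cycle is 4 oscillator cycles (so 16 million ICs per second at 64 MHz) and that each serial byte occupies 10 bit-times (8N1 framing, which is my assumption, not something confirmed by the specification). The IC counts themselves are the unmeasured estimates from above.

```python
CLOCK_HZ = 64_000_000
IC_RATE_HZ = CLOCK_HZ / 4          # PIC core: 4 oscillator cycles per IC
BAUD = 115_200
BITS_PER_BYTE = 10                 # assumption: start + 8 data + stop (8N1)

GETWORD_MIN_ICS = 3200             # estimate from the GetWord macro comments
CRC_LOOP_ICS = 3000 + 8 * 2400     # loop overhead + 8 x CRCcore = 22,200


def ics_to_us(ics: float) -> float:
    """Convert an instruction-cycle count to microseconds."""
    return ics / IC_RATE_HZ * 1e6


def byte_time_us() -> float:
    """Time for one back-to-back serial byte to arrive, in microseconds."""
    return BITS_PER_BYTE / BAUD * 1e6


if __name__ == "__main__":
    print(f"serial byte arrives every {byte_time_us():6.1f} us")
    print(f"GetWord, per byte         {ics_to_us(GETWORD_MIN_ICS) / 2:6.1f} us")
    print(f"CRC loop, per byte        {ics_to_us(CRC_LOOP_ICS):6.1f} us")
    for passes in (25, 33):
        total_ms = passes * ics_to_us(CRC_LOOP_ICS) / 1000
        print(f"CRC over {passes} bytes         {total_ms:6.1f} ms")
```

Under those assumptions GetWord needs about 100 us per byte while a back-to-back byte arrives roughly every 87 us, so the macro would gradually fall behind a truly continuous stream; the CRC loop comes to about 1.39 ms per byte, i.e. about 35 ms for 25 bytes and about 46 ms for 33.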
But there are several issues with these conditional instructions. Firstly, once all the characters (for the present frame) have been received (after a few ms) there is no need for the tests, so the code could run about twice as fast if they were removed. Conversely, the code assumes that the bytes are all being read in sequence, and it may not work if any are skipped (e.g. because they are not needed). Also, there might be a "bug" in the coding: in the Simulator the ptr variable has the correct number of bits (7 for a 20X2), but hSerPtr appears to be a normal 16-bit word (i.e. the same as S_W5). Thus the "=" test may fail after a few frames, but I'm unable to test a real chip.
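To illustrate the pointer-width concern, here is a small Python model. It assumes exactly what the Simulator appears to show (a read pointer that wraps at 7 bits and a write pointer that keeps counting as a 16-bit word); a real chip may well behave differently, and the function names are purely my own.

```python
PTR_BITS = 7                           # 20X2 figure quoted in the text
PTR_MASK = (1 << PTR_BITS) - 1         # 127


def buffer_empty_naive(ptr: int, hserptr: int) -> bool:
    """The test used in the macro: ptr = hSerPtr."""
    return ptr == hserptr


def buffer_empty_masked(ptr: int, hserptr: int) -> bool:
    """Same test, but with the write pointer reduced to the ptr width."""
    return ptr == (hserptr & PTR_MASK)


def first_disagreement(max_bytes: int = 1000):
    """Bytes received (and all consumed) when the two tests first differ."""
    for received in range(max_bytes + 1):
        ptr = received & PTR_MASK      # read pointer wraps at 7 bits
        hserptr = received & 0xFFFF    # write pointer modelled as a word
        if buffer_empty_naive(ptr, hserptr) != buffer_empty_masked(ptr, hserptr):
            return received
    return None


if __name__ == "__main__":
    print("tests first disagree after", first_disagreement(), "bytes")
```

In this model the plain "=" test goes wrong once 128 bytes have been received, which would be only a handful of frames. If the real hardware does behave like the Simulator, masking the write pointer before the comparison (or resetting it each frame) would be one way around it.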
Therefore, it may be better to manage the hSerPtr in a "foreground" (main program) task (i.e. not as a background circular buffer), by resetting it to zero before each frame (or maybe switching between two alternate "banks" to allow the processing to continue over the frame boundary). Then, the "waiting" loops can be skipped as soon as hSerPtr reaches the "last byte" at address 27 or 35.
UPDATE: I've just come across THIS OLD CODE SNIPPET THREAD (with its subsequent link) that might be of interest if it is decided to develop the CRC (de-)coding. It confirms that speed is likely to be a major issue, with the initial code around 4 times slower than my (provisional) version above. The "byte lookup" method might be faster, but it appears to require 512 Table bytes and I didn't see any description of how to calculate the table! M2s do have a table memory of 512 bytes, but X2s have only 256 bytes. There is also the 256 bytes of EEPROM (DATA) memory, so it might be just possible to code it into an X2, but it would be very far from an easy task, even if full test/example data were available.
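For what it's worth, the usual way to build such a lookup table is to run the bit-by-bit algorithm once for each of the 256 possible byte values and store the 16-bit results (256 words = 512 bytes). The Python sketch below shows that general technique for the $1021 polynomial; it is not taken from the linked thread, the names are my own, and the split into separate high-byte and low-byte halves is only a suggestion for how two 256-byte memories might be used.

```python
POLYNOMIAL = 0x1021


def make_table() -> list:
    """256 entries: the CRC of each single byte value, start value 0."""
    table = []
    for index in range(256):
        crc = index << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) & 0xFFFF) ^ POLYNOMIAL
            else:
                crc = (crc << 1) & 0xFFFF
        table.append(crc)
    return table


def crc16_table(data: bytes, table: list, crc: int = 0) -> int:
    """One table lookup per byte instead of eight shift/XOR steps."""
    for byte in data:
        crc = ((crc << 8) & 0xFFFF) ^ table[(crc >> 8) ^ byte]
    return crc


def split_table(table: list):
    """Hypothetical layout: 256 high bytes and 256 low bytes."""
    high = [entry >> 8 for entry in table]
    low = [entry & 0xFF for entry in table]
    return high, low


if __name__ == "__main__":
    tbl = make_table()
    print(len(tbl), "entries; CRC of test string:",
          hex(crc16_table(b"123456789", tbl)))
```

The values printed by make_table() could then be pasted into TABLE / DATA statements, but whether the lookup is actually faster on a PICaxe than the unrolled Macro would need to be measured.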
Cheers, Alan.