What you are trying to do is tricky, but very educational (if you are prepared to spend a lot of effort).
First, you must realise that this kind of PC-only (as opposed to PC+SP) task switching, which is the only thing you can do on a plain 12- or 14-bit PIC core, will only work when all the yield() statements in a task are in the same function: they can't be in a called function, and the compiler must not have messed with the function structure (as optimization might do).
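To make the "same function" restriction concrete, here is a minimal sketch in portable C. It is not your PIC code and not the only way to do this: it swaps in the switch/case trick used by protothread-style libraries, and every name in it (task_t, TASK_BEGIN, TASK_YIELD, TASK_END, count_task, count_yields) is made up for illustration. The saved line number plays the role of the saved PC.

```c
#include <assert.h>

/* Hypothetical task record: resume_line stands in for the saved PC. */
typedef struct {
    int resume_line;   /* 0 means "start from the top" */
    int counter;       /* task state must survive a yield, so it lives here,
                          not in a local variable */
} task_t;

/* The case labels belong to the switch opened by TASK_BEGIN, so every
   TASK_YIELD must sit in the same function body as TASK_BEGIN.
   Two yields on the same source line would produce duplicate labels. */
#define TASK_BEGIN(t)  switch ((t)->resume_line) { case 0:
#define TASK_YIELD(t)  do { (t)->resume_line = __LINE__; return 1; \
                            case __LINE__:; } while (0)
#define TASK_END(t)    } (t)->resume_line = 0; return 0

/* Returns 1 each time it yields, 0 once it has finished. */
int count_task(task_t *t)
{
    TASK_BEGIN(t);
    for (t->counter = 0; t->counter < 3; t->counter++) {
        TASK_YIELD(t);   /* fine: same function as TASK_BEGIN */
    }
    TASK_END(t);
}

/* Stand-in for the task switcher: keeps resuming one task and
   counts how often it yielded before finishing. */
int count_yields(task_t *t)
{
    int yields = 0;
    assert(t != 0);
    while (count_task(t)) {
        yields++;
    }
    return yields;
}
```

Starting from `task_t t = {0, 0};`, a call to `count_yields(&t)` resumes the task until it finishes. Moving a TASK_YIELD into a helper function called from count_task would not compile, because the case label would no longer be inside the switch; that is the C-level picture of why a PC-only switcher cannot yield from a called function.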
Next:
currentTask->pch = PCLATH;\
currentTask->pcl = PCL + 8;\
asm("goto _taskswitcher");
- You seem to assume that PCLATH is the upper bits of the program counter, as PCL is the lower bits. This is NOT the case. When you write to PCL the PCLATH bits are written to the PC, but the upper PC bits are never (implicitly) written to PCLATH. Re-read the relevant section of the datasheet.
- Even if PCLATH was the upper bits of the PC, this would get you into trouble when the instruction after the goto is not on the same 256-instruction 'page' as the first instruction.
- The plain goto will not work when _taskswitcher is not in the current PCLATH page; you will need an LGOTO or equivalent.
A solution to your PCLATH problem is to declare a label after the goto, and write the upper and lower bits of that label's address to your pch and pcl locations. But I am not sure you can declare a 'local' label in inline assembly. You sure can in plain MPASM (Olin will smile).
Lastly, to do this kind of context switching you must save and restore ALL context that the compiler might depend on, which might include:
- indirection register(s)
- status flags
- scratch memory locations
- local variables that might overlap in memory because the compiler does not realise that your tasks must be independent
- other things I can't imagine right now but the compiler author might use in the next version of the compiler (they tend to be very imaginative)
The PIC architecture is more problematic in this respect because a lot of resources are located all over the memory map, where more traditional architectures have them in registers or on the stack. As a consequence, PIC compilers often do not generate reentrant code, which is what you definitely need to do the things you want (again, Olin will probably smile and assemble along).
If you are into this for the joy of writing a task switcher, I suggest that you switch to a CPU that has a more traditional organization, like an ARM or Cortex. If you are stuck with your feet in a concrete plate of PICs, study existing PIC switchers (for instance Salvo/Pumpkin?).
Well, I'm really not sure.
This CPU description is incomplete: for example, there is no branch and no data memory access.
For the non-pipelined version, there is only one visible clocked element, the program counter. The next PC address can be calculated while the ALU operations are being done, so it does not lengthen the path; the propagation time is 6 + 3.5 + 4 + 4 + 1 + 6 = 24.5 ns.
All the decoders (source, operand...) work in parallel, so only the longest delay is part of the timing "critical path". There is no clear indication of the delay needed for writing into the register file; maybe 4 ns more.
For the pipelined version :
F : I-cache : 6ns ( in parallel with the program counter update )
ID1 : Instruction decoder : 3.5ns
ID2 : Destination decoder : 4ns ( parallel with the other decoders, which are faster) + register file : 4ns
EX : ALU + MUX : 7ns
WB : register update : ??? ns
The longest stage is ID2, so the max delay is around 8 ns (4 + 4).
Alternatively, do all the decoding in ID1 (3.5 + 4 = 7.5 ns) and the register access in ID2 (4 ns). Traditionally, the ALU is part of the EXECUTE stage.
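The arithmetic above boils down to "sum of delays" for the non-pipelined version versus "slowest stage" for the pipelined one. A small C sketch of that calculation follows; the function and array names are made up for illustration, the delay values are the ones quoted above, and the WB stage is left out because its delay is not given.

```c
#include <assert.h>
#include <stddef.h>

/* Non-pipelined case: the critical path is the sum of all delays in series. */
double critical_path_ns(const double *delays, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        total += delays[i];
    }
    return total;
}

/* Pipelined case: the clock period is bounded by the slowest stage. */
double slowest_stage_ns(const double *stages, size_t n)
{
    assert(n > 0);
    double worst = stages[0];
    for (size_t i = 1; i < n; i++) {
        if (stages[i] > worst) {
            worst = stages[i];
        }
    }
    return worst;
}

/* Series delays as quoted in the sum above, in ns. */
const double NON_PIPELINED_NS[] = {6.0, 3.5, 4.0, 4.0, 1.0, 6.0};

/* Stage delays in ns: F, ID1, ID2 (4 + 4), EX (ALU + MUX).
   WB is omitted: the exercise gives no delay for it (assumption). */
const double PIPELINED_NS[] = {6.0, 3.5, 8.0, 7.0};
```

With these numbers, `critical_path_ns(NON_PIPELINED_NS, 6)` gives 24.5 and `slowest_stage_ns(PIPELINED_NS, 4)` gives 8.0. If the register write turns out to take longer than 8 ns, WB would become the limiting stage instead.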
Anyway, I think that this exercise is really poorly written.
Best Answer
You are using the wrong control signal: ALE latches the address part, and I have no idea why you are enabling the output with the decoded signal.
I'd also switch to a 374 edge-triggered device, clocked by that decoded signal.