I learned on a 68HC11 in college. It is very simple to work with, but honestly most low-power microcontrollers will be similar (AVR, 8051, PIC, MSP430). The biggest thing that adds complexity to assembly programming for microcontrollers is the number and type of supported memory addressing modes. You should avoid more complicated devices at first, such as higher-end ARM processors.
I'd probably recommend the MSP430 as a good starting point. Maybe write a program in C and learn by replacing various functions with inline assembly. Start simple, x + y = z, etc.
After you've replaced a function or algorithm with assembly, compare how you coded it with what the C compiler generated. In my opinion this is one of the better ways to learn assembly, and at the same time you learn how a compiler works, which is incredibly valuable as an embedded programmer. Just make sure you turn off optimizations in the C compiler at first, or you'll likely be very confused by the compiler's generated code. Then gradually turn on optimizations and note what the compiler does.
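As a rough sketch of that exercise, here is the "x + y = z" case written twice: once in plain C and once with GCC-style inline assembly. The function names are made up for illustration, and the `__MSP430__` guard assumes a GCC-based MSP430 toolchain that predefines that macro. On any other target the second function just falls back to C so the file still builds.

```c
#include <assert.h>
#include <stdint.h>

/* Plain C version: the reference to compare against. */
int16_t add_c(int16_t x, int16_t y)
{
    return (int16_t)(x + y);
}

/* Hand-written version of the same function (hypothetical name). */
int16_t add_asm(int16_t x, int16_t y)
{
#if defined(__MSP430__)
    /* Assumption: GCC extended asm on an MSP430 target.
       MSP430 syntax is "ADD src, dst", meaning dst = dst + src.
       The "0" constraint places x in the same register as z. */
    int16_t z;
    __asm__ ("add %2, %0" : "=r"(z) : "0"(x), "r"(y) : "cc");
    return z;
#else
    /* Host fallback when not building for MSP430. */
    return (int16_t)(x + y);
#endif
}

/* Check the hand-written version against the C reference. */
void self_check(void)
{
    assert(add_asm(2, 3) == add_c(2, 3));
    assert(add_asm(-7, 4) == add_c(-7, 4));
}
```

Compiling this with GCC's `-S` option at `-O0` writes out the generated assembly, so you can put the compiler's version of `add_c` next to your own instruction in `add_asm`.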
RISC vs CISC
RISC means 'Reduced Instruction Set Computing'. It doesn't refer to a particular instruction set, just a design strategy that says the CPU has a minimal instruction set: few instructions that each do something basic. There is no strict technical definition of what it takes 'to be RISC'. CISC architectures, on the other hand, have lots of instructions, but each 'does more'.
The purported advantages of RISC are that your CPU design needs fewer transistors, which means less power usage (big for microcontrollers), cheaper manufacturing, and higher clock rates leading to greater performance. Lower power usage and cheaper manufacturing are generally true. Greater performance hasn't really lived up to the goal, as a result of design improvements in CISC architectures.
Almost all CPU cores today are RISC or 'middle ground' designs. That is true even of the most famous (or infamous) CISC architecture, x86: modern x86 CPUs are internally RISC-like cores with a decoder bolted on the front end that breaks x86 instructions down into one or more RISC-like instructions. I think Intel calls these 'micro-ops'.
As to which (RISC vs CISC) is easier to learn in assembly, I think it's a toss-up. Doing something with a RISC instruction set generally requires more lines of assembly than doing the same thing with a CISC instruction set. On the other hand, CISC instruction sets are more complicated to learn because of the greater number of available instructions.
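To make the "more lines" point concrete, here is a small sketch that increments a counter in memory. The function names are hypothetical, and the instructions in the comments are only typical of what a compiler might emit for x86 and for a load-store machine like ARM. Your compiler's actual output will depend on the target and settings.

```c
#include <assert.h>
#include <stdint.h>

/* One C statement. A CISC machine can do this as a single
   read-modify-write instruction, e.g. on x86-64:
       add dword ptr [rdi], 1                              */
void bump(uint32_t *counter)
{
    *counter += 1;
}

/* The same operation spelled out the way a load-store
   architecture has to do it: memory is only touched by
   explicit loads and stores.                              */
void bump_load_store(uint32_t *counter)
{
    uint32_t r1 = *counter;   /* ldr r1, [r0]     */
    r1 = r1 + 1;              /* add r1, r1, #1   */
    *counter = r1;            /* str r1, [r0]     */
}

/* Quick host-side check that both spellings do the same thing. */
void check_bump(void)
{
    uint32_t a = 41, b = 41;
    bump(&a);
    bump_load_store(&b);
    assert(a == 42 && b == 42);
}
```

Three simple instructions versus one more capable instruction is the trade-off in miniature: more lines to write on the RISC side, more instruction variants to learn on the CISC side.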
Most of the reason CISC gets a bad name is that x86 is by far the most common example and is a bit of a mess to work with. I think that's mostly a result of the x86 instruction set being very old and having been expanded half a dozen or more times while maintaining backward compatibility. Even your 4.5 GHz Core i7 can still run in 16-bit real mode, like the original 8086 (and it starts there at reset).
As for ARM being a RISC architecture, I'd consider that moderately debatable. It's certainly a load-store architecture. The base instruction set is RISC-like, but in recent revisions the instruction set has grown quite a bit, to the point where I'd personally consider it more of a middle ground between RISC and CISC. The Thumb instruction set is really the most 'RISCish' of the ARM instruction sets.
Best Answer
You are going about this the wrong way. Unlike programming large machines with an operating system, on microcontrollers you have to know how things work down to the hardware. That means a compiler now gets in the way by obscuring things, as opposed to handling all the machine details you don't have to care about on a large system. Writing in a HLL on a microcontroller can still be useful, but only if you understand the underlying hardware.
Blindly using a compiler and calling library routines to manage the hardware may sound like the easy way to go, but it's not good for learning, and since you're really not learning what is going on at the low levels, stuff will happen and you won't understand what is going on.
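As a rough illustration of what a library routine hides, here is roughly what a "set this pin" call comes down to: a read-modify-write of one hardware register. Everything here is hypothetical. `fake_port` is an ordinary variable standing in for a memory-mapped register so the sketch runs on a PC, and `pin_set`, `pin_clear` and `port_read` are made-up names, not any vendor's API. On a real part the register's address and behavior come from the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for an 8-bit memory-mapped output register.
   On real hardware this is a fixed address from the datasheet,
   not a variable. volatile tells the compiler every access
   matters and must not be optimized away.                     */
static volatile uint8_t fake_port = 0x00;

/* Set one bit: read the register, OR in the bit, write it back. */
void pin_set(uint8_t pin)
{
    assert(pin < 8);                       /* only 8 bits exist */
    fake_port |= (uint8_t)(1u << pin);
}

/* Clear one bit: read, AND with the inverted mask, write back. */
void pin_clear(uint8_t pin)
{
    assert(pin < 8);
    fake_port &= (uint8_t)~(1u << pin);
}

/* Read the whole register. */
uint8_t port_read(void)
{
    return fake_port;
}
```

Calling `pin_set(3)` on a cleared port leaves the register at 0x08. Once you know the call is a read, an OR and a write, you can reason about things the library name never tells you, such as what happens if an interrupt touches the same register between the read and the write.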
So to really answer your question, the best book for learning a PIC 18 is the datasheet for that PIC 18. If you are looking for something generic to start with, try the 18F2520. That's an easy-to-handle 28-pin part, and it comes with a good mix of general peripherals. It also has plenty of RAM and program memory to do lots of useful projects.
When you first read the datasheet, you need to look carefully at the parts that talk about the general architecture, like the memory model, instruction set, stack, pointer registers, etc. That will be the same for all PIC 18. Different PIC 18 models vary in the amount of memory and mix of peripherals. Once you can write basic code using the core, you can try out peripherals one at a time and learn them individually.
As for MPLAB, that is a separate document. Actually MPLAB isn't a big deal to learn. What takes more is the assembler, the linker, possibly the librarian, and how they interact and how to use them. Once you get all that, you can throw in a compiler, but now you'll understand what it's actually doing and therefore what the various gotchas and restrictions are that you just don't see on big systems.