Simply put, you use whatever tools and talents you have available to get the job done, in the time required, and within budget. In the end, that is all that matters.
While some tasks naturally lend themselves to a hardware or a software solution, there are many tasks for which it does not matter. For example, I could use a 555 timer to blink an LED, or I could use a US$0.30 MCU. There are pros and cons to either, but in the end it really doesn't matter so long as you get the job done. I have seen blinking LEDs done with either approach in commercial products.
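To make the MCU side of that comparison concrete, here is a minimal sketch of the logic such a part might run. It is hardware-independent on purpose: `blinker_t`, `blinker_init` and `blinker_tick` are names I made up for illustration, not anything from a vendor SDK, and I am assuming something (a timer interrupt, say) calls the tick function once per millisecond and writes the result to a GPIO pin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical blinker state; not taken from any real vendor SDK. */
typedef struct {
    uint32_t half_period_ms; /* time the LED spends in each state */
    uint32_t elapsed_ms;     /* milliseconds since the last toggle */
    bool     led_on;         /* current LED state */
} blinker_t;

void blinker_init(blinker_t *b, uint32_t half_period_ms)
{
    assert(half_period_ms > 0); /* a zero half-period is meaningless */
    b->half_period_ms = half_period_ms;
    b->elapsed_ms = 0;
    b->led_on = false;
}

/* Assumed to be called once per millisecond, e.g. from a timer interrupt.
 * Returns the state that the caller should write to the LED pin. */
bool blinker_tick(blinker_t *b)
{
    if (++b->elapsed_ms >= b->half_period_ms) {
        b->elapsed_ms = 0;
        b->led_on = !b->led_on;
    }
    return b->led_on;
}
```

With `blinker_init(&b, 500)` the LED would change state every 500 ticks, i.e. a 1 Hz blink if the tick really is 1 ms. The 555 does the same job with a couple of resistors and a capacitor; which one is "right" depends on what else is already on the board.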
I cannot give you a list of things that should be done in software, or a list for hardware. That list would be long, boring, and mostly meaningless. Technology progresses. Twenty years ago I would never have thought of using an MCU just to blink an LED. The set of tasks that are appropriate for hardware or for software is always changing. If I gave you a list now, it would be out of date tomorrow.
To make matters worse, the line between hardware and software is blurring. FPGAs come to mind: they are programmed in a way similar to writing software, but the end result is hardware. FPGAs also frequently have logic inside of them that resembles some form of CPU. Conversely, GPUs and some CPUs have FPGA-like features that may or may not remain hidden from the software programmer.
Knowledge and experience will have to guide you in knowing what is appropriate for hardware and what for software. Your first step is to know the problem that you are trying to solve and how to solve it. The second step is knowing the different ways to solve it: CPUs, FPGAs, analog circuits, etc. In many cases you will need to know several different CPUs, or several different FPGAs, in order to figure out what the best approach is.
There is no substitute for knowing your craft.
Yes, you've identified the subsystems correctly. On the main PCB, the "black areas" are individual chips.
"hynlx" 512A KOR (I guess Korea)
This is an SDRAM chip, used by the firmware on the SOC in general, and in particular as a frame buffer for the decoded video.
SAMSUNG 031 PCBO
This is a flash EEPROM chip, which contains both the firmware for the SOC and the video file.
ATJ2273B-C
This is the main processor (system on chip, or SOC).
It contains a general-purpose CPU, along with USB, SDRAM, audio and video display controllers, and it "runs the show".
Then came a few 2D graphic primitives :
These hardware acceleration features were usually implemented with fixed hardware logic. Sometimes DSPs or special-purpose CPUs (for example, Texas Instruments TIGA chipsets) were used to offload the main CPU. These efforts were sometimes defeated when the cost of transferring data or programming the graphics accelerator exceeded the time the CPU would need to do the work itself, especially when special effects (like transparency) were needed. 2D rotation is not used often enough to justify hardware acceleration; in games, pre-rendered rotated sprites were stored in memory instead.
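The break-even point described above can be written down as a small check. This is only a sketch under my own assumptions: the struct and function names are invented for illustration, the timings are numbers you would have to measure or estimate yourself, and I assume the accelerator runs concurrently with the CPU, so only the setup and transfer time count against it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-operation timings in microseconds, supplied by the caller. */
typedef struct {
    double cpu_us;      /* CPU time to do the operation in software */
    double setup_us;    /* CPU time spent programming the accelerator */
    double transfer_us; /* CPU time spent moving data to the accelerator */
} offload_cost_t;

/* CPU-side overhead of handing the operation to the accelerator. */
double offload_overhead_us(const offload_cost_t *c)
{
    assert(c->cpu_us >= 0.0 && c->setup_us >= 0.0 && c->transfer_us >= 0.0);
    return c->setup_us + c->transfer_us;
}

/* True only when offloading costs the CPU less than doing the work itself. */
bool offload_saves_cpu_time(const offload_cost_t *c)
{
    return offload_overhead_us(c) < c->cpu_us;
}
```

With made-up numbers: a tiny fill that takes the CPU 5 us but needs 8 us of setup and 2 us of transfer is a loss, while a large blit that would take 400 us in software against 48 us of overhead is a clear win. That is why small operations often stayed on the CPU.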
The hardware complexity and cost must be balanced against the time saved for the main CPU. Blitting and filling are very useful, whereas drawing shapes such as anti-aliased lines and Bezier curves is often beyond the capability of 2D hardware.
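For reference, this is roughly what filling and blitting look like when the CPU does them in software. It is a minimal sketch, not any particular chip's or library's API: `surface_t`, `fill_rect` and `blit` are hypothetical names, and I am assuming 16-bit pixels (RGB565, for instance), row-major storage, rectangles that lie fully inside the surfaces, and source and destination buffers that do not overlap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical framebuffer description; 16-bit pixels assumed. */
typedef struct {
    uint16_t *pixels; /* row-major pixel data */
    int width;        /* visible pixels per row */
    int height;       /* number of rows */
    int stride;       /* allocated pixels per row, >= width */
} surface_t;

/* Fill: set the w x h rectangle at (x, y) to a single colour. */
void fill_rect(surface_t *dst, int x, int y, int w, int h, uint16_t colour)
{
    assert(x >= 0 && y >= 0 && w >= 0 && h >= 0);
    assert(x + w <= dst->width && y + h <= dst->height);
    for (int row = 0; row < h; row++) {
        uint16_t *p = dst->pixels + (size_t)(y + row) * dst->stride + x;
        for (int col = 0; col < w; col++)
            p[col] = colour;
    }
}

/* Blit: copy a w x h rectangle from src at (sx, sy) to dst at (dx, dy).
 * Assumes the two surfaces do not overlap in memory. */
void blit(surface_t *dst, int dx, int dy,
          const surface_t *src, int sx, int sy, int w, int h)
{
    assert(dx >= 0 && dy >= 0 && sx >= 0 && sy >= 0 && w >= 0 && h >= 0);
    assert(dx + w <= dst->width && dy + h <= dst->height);
    assert(sx + w <= src->width && sy + h <= src->height);
    for (int row = 0; row < h; row++) {
        const uint16_t *s = src->pixels + (size_t)(sy + row) * src->stride + sx;
        uint16_t *d = dst->pixels + (size_t)(dy + row) * dst->stride + dx;
        memcpy(d, s, (size_t)w * sizeof *d);
    }
}
```

Both routines are nothing more than regular row-by-row memory writes with a fixed address pattern. An anti-aliased line or a Bezier curve needs per-pixel arithmetic and coverage decisions, which is a very different amount of logic.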
(For a recent example of basic 2D acceleration in small ARM SOCs, look for "ST Chrom-ART Accelerator")