Electronic – Processor Running C Natively

c microprocessor

A friend of mine came up with an idea for something dealing with a micro-processor running C natively. Problem is, we need to be able to know if there is a processor out there already before we spend our time and money on something. Does anybody have any clue about such a processor?

Best Answer

Of course, to look at this properly we must know what it means to "natively" execute anything. On the surface this seems like an easy question, but it isn't. Let me elaborate.

But first, let me say that I am massively simplifying this description! There is no way I can explain this in a reasonable number of words without some over-arching generalizations and simplifications. Deal with it.

Let's start with a bit-slice processor (BSP) design. These are the easiest processors to design, the hardest to program, the smallest in terms of logic size, and the worst in terms of code density. Essentially, an instruction word in a bit-slice processor never goes through an instruction decode step; the instruction word is effectively pre-decoded. The individual bits of the instruction go directly to latches, muxes, ALUs, etc. inside the processor. Consequently the instruction word can be very large. Instructions larger than 256 bits are not uncommon! BSPs are normally purpose-built for a single task and are not general-purpose CPUs. While BSPs sound somewhat exotic, they are used all over the place but are so deeply embedded that you probably don't notice.

One step up from a BSP is a RISC CPU. The overall data flow is changed to be more general purpose, and an instruction decode stage is added to the pipeline. Inside the RISC CPU there is still a giant instruction word, like the BSP, except that the instruction decode converts the 32-bit instruction into that giant instruction word. Fundamentally this instruction decode is like a giant lookup table that maps the 32-bit instruction to the giant instruction word used in the BSP. It is not literally a giant lookup table, but that is effectively what it is. This instruction decode limits what the instructions can do, but it greatly simplifies programming and is what turns this thing into a general-purpose CPU.
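To make the "decode is effectively a lookup table" idea concrete, here is a minimal sketch in C. Everything in it is an assumption for illustration: the `ControlWord` fields, the four opcodes, and the choice of putting the opcode in the top 6 bits of the instruction. A real decoder is combinational logic with far more control signals, not an array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wide control word: each field drives one part of the
 * datapath directly, the way the raw BSP instruction bits would. */
typedef struct {
    uint8_t alu_op;     /* which ALU operation to perform */
    uint8_t reg_write;  /* 1 = write the result back to a register */
    uint8_t mem_read;   /* 1 = read from data memory */
    uint8_t mem_write;  /* 1 = write to data memory */
} ControlWord;

enum { ALU_NOP, ALU_ADD, ALU_SUB };
enum { OP_ADD, OP_SUB, OP_LOAD, OP_STORE, NUM_OPCODES };

/* The "giant lookup table": one control word per opcode. */
static const ControlWord decode_table[NUM_OPCODES] = {
    [OP_ADD]   = { ALU_ADD, 1, 0, 0 },
    [OP_SUB]   = { ALU_SUB, 1, 0, 0 },
    [OP_LOAD]  = { ALU_ADD, 1, 1, 0 },  /* ALU computes the address */
    [OP_STORE] = { ALU_ADD, 0, 0, 1 },
};

/* Assumed encoding: the opcode sits in the top 6 bits of the word. */
ControlWord decode(uint32_t instruction)
{
    uint32_t opcode = instruction >> 26;
    assert(opcode < NUM_OPCODES);  /* this sketch has no illegal-opcode trap */
    return decode_table[opcode];
}
```

Calling `decode((uint32_t)OP_LOAD << 26)` returns a control word with `mem_read` set, which is the software analogue of the decode stage asserting the memory-read line.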

Next step up we get to a CISC CPU. The main difference is that the instruction decode (ID) becomes more complex. Instead of the ID being just a huge lookup table, the ID converts the 32-bit instruction into a series of BSP-like instructions. You can really think of each 32-bit instruction as being a small subroutine call inside a BSP.
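The "small subroutine call" view of CISC decode can be sketched the same way. This is only an illustration under assumed names: the `MicroOp` values, the two instructions, and the contents of each routine are made up, and real microcode sequencers are considerably more involved.

```c
#include <stddef.h>

/* Hypothetical BSP-like steps that the core can execute directly. */
typedef enum { UOP_LOAD, UOP_MUL, UOP_ADD, UOP_SATURATE, UOP_STORE } MicroOp;

/* Hypothetical architectural instructions seen by the programmer. */
typedef enum { INS_ADD, INS_MAC_SAT, NUM_INSTRUCTIONS } Instruction;

typedef struct {
    const MicroOp *ops;  /* the micro-op "subroutine" */
    size_t count;        /* number of steps in it */
} MicroRoutine;

static const MicroOp add_routine[] = { UOP_ADD };
static const MicroOp mac_sat_routine[] = {
    UOP_LOAD, UOP_MUL, UOP_ADD, UOP_SATURATE, UOP_STORE
};

/* One instruction maps to a whole sequence, not a single control word. */
static const MicroRoutine microcode[NUM_INSTRUCTIONS] = {
    [INS_ADD]     = { add_routine, sizeof add_routine / sizeof add_routine[0] },
    [INS_MAC_SAT] = { mac_sat_routine,
                      sizeof mac_sat_routine / sizeof mac_sat_routine[0] },
};

/* Copy the micro-ops for `ins` into `out` (at most `max` of them)
 * and return how many were written. */
size_t expand(Instruction ins, MicroOp *out, size_t max)
{
    const MicroRoutine *r = &microcode[ins];
    size_t n = r->count < max ? r->count : max;
    for (size_t i = 0; i < n; i++)
        out[i] = r->ops[i];
    return n;
}
```

Here a simple add expands to one step, while the multiply-accumulate-with-saturation instruction expands to five, which is the difference between a table lookup and a subroutine.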

Next, you have assembly language. This is the ASCII text that you write that gets converted into those 32-bit instructions by the assembler and linker. While this is the lowest level of programming that a human might do, there is not always a one-to-one relationship between what the human writes and what the CPU executes. Even here the assembler does some interpreting and manipulating of the final instructions. For example, MIPS assemblers will rearrange or add instructions to deal with pipeline hazards. I'm sure other assemblers do something similar.

Then you have a fully interpreted language. In this language, the interpreter has to parse the ASCII of each line or command every time that line is executed. This is what most scripting languages do.

There are also fully compiled languages, like C/C++, in which a compiler takes the ASCII source code and converts it into assembly language (or sometimes directly into the normal 32-bit opcodes).

Between interpreted and compiled languages there are "tokenized languages". These are most like interpreted languages, but the ASCII source code is parsed only once. The net effect is that execution is much quicker than in a fully interpreted language, but you still have the flexibility of an interpreted language and don't have the compile time of a compiled language. The term "tokenized" is used because the code is pre-parsed, or tokenized, into something that is easier to deal with than straight ASCII. Java is a good example of a tokenized language.
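The "parse once, run many times" idea behind tokenizing can be shown with a toy example. This is a rough sketch, not how any real language runtime works: the `Token` layout and the tiny grammar (non-negative integers joined by `+` and `-`) are assumptions chosen to keep it short.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_NUM, TOK_PLUS, TOK_MINUS } TokenKind;

typedef struct {
    TokenKind kind;
    int value;  /* only meaningful for TOK_NUM */
} Token;

/* Parse once: turn ASCII such as "12 + 30 - 2" into tokens.
 * Returns the number of tokens written (at most `max`). */
size_t tokenize(const char *src, Token *out, size_t max)
{
    size_t n = 0;
    while (*src != '\0' && n < max) {
        if (isdigit((unsigned char)*src)) {
            int v = 0;
            while (isdigit((unsigned char)*src))
                v = v * 10 + (*src++ - '0');
            out[n++] = (Token){ TOK_NUM, v };
        } else if (*src == '+') {
            out[n++] = (Token){ TOK_PLUS, 0 };
            src++;
        } else if (*src == '-') {
            out[n++] = (Token){ TOK_MINUS, 0 };
            src++;
        } else {
            src++;  /* skip whitespace and anything unrecognised */
        }
    }
    return n;
}

/* Run many times: evaluate left to right without touching the ASCII. */
int run(const Token *toks, size_t n)
{
    int acc = 0;
    int sign = 1;
    for (size_t i = 0; i < n; i++) {
        switch (toks[i].kind) {
        case TOK_NUM:   acc += sign * toks[i].value; sign = 1; break;
        case TOK_PLUS:  sign = 1;  break;
        case TOK_MINUS: sign = -1; break;
        }
    }
    return acc;
}
```

A fully interpreted language would in effect call `tokenize` every time the line executes; a tokenized one calls it once and then only ever calls `run`.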

There have also been "BASIC CPUs". Essentially these are CPUs that have a BASIC interpreter built into them: a normal MCU where the Flash EPROM contains a BASIC interpreter as well as the pre-tokenized BASIC program.

So, back to the question: What does it mean to natively execute a program? Does the program have to be down to the BSP level to be native? If so, then almost nothing is native. What about the 32-bit instruction level? OK, that's what most would call native, since that is what the "CPU block" is given to execute. Normally anything ASCII is not "native", since some level of interpretation needs to be done before it can be executed. How about those BASIC MCUs? Do they natively execute BASIC? Probably not.

But let's look more at those BASIC MCUs. The BASIC interpreter is stored in the Flash EPROM and is made up of that MCU's standard opcodes. But what if the interpreter was actually part of a CISC CPU's instruction decode? Instead of the instruction decode running some subroutine for a "Multiply and Add with Saturation" instruction, it ran a subroutine for "let X=5 + y". Would that CPU then be said to execute BASIC natively? I would say so!

But let's look at the C language specifically, and let's assume some crazy CISC processor that would interpret ASCII C source code directly. As you look at the tasks of managing files, parsing ASCII, and managing variables, you notice that one of two things happens: either the BSP at the core of our C-CPU becomes absolutely huge and unmanageable, or the BSP starts to look like what any other modern CPU has. But if the BSP looks similar to other CPUs, then the instruction decode must do all the hard work, which it is not well suited for either.

What you end up with if you follow this to its natural conclusion is something that looks like a normal RISC or CISC CPU that has a C interpreter already programmed into its Flash EPROM. Exactly like those BASIC MCUs I mentioned before!

The net result is that a CPU that runs C "natively" is not useful, even as an educational project. I could go on and on, but I'm almost late for a meeting now. Enjoy!