I vote for DMA. It's very flexible on Cortex-M3 and up: you can do all kinds of things, like automatically moving data from one place to another at a specified rate, or on certain events, without spending any CPU cycles. DMA is also much more reliable than a software copy loop.
But it might be quite hard to understand in detail.
Another option is a soft core on an FPGA, with the timing-critical parts implemented in hardware.
I'd say you're dreaming. The main problem will be the limited RAM.
In 2004, Eric Biederman managed to get a kernel booting with 2.5MB of RAM, with a lot of functionality removed.
However, that was on x86, and you're talking about ARM. So I tried to build the smallest possible ARM kernel, for the 'versatile' platform (one of the simplest). I turned off all configurable options, including the ones that you're looking for (USB, WiFi, SPI, I2C), to see how small it would get. Now, I'm just referring to the kernel here, and this does not include any userspace components.
The good news: it will fit in your flash. The resulting zImage is 383204 bytes.
The bad news: with 256kB of RAM, it won't be able to boot:
$ size obj/vmlinux
   text    data     bss     dec     hex filename
 734580   51360   14944  800884   c3874 obj/vmlinux
The .text segment is bigger than your available RAM, so the kernel can't decompress, let alone allocate memory to boot, let alone run anything useful.
One workaround would be to use the execute-in-place support (CONFIG_XIP_KERNEL), if your system supports that (i.e., it can fetch instructions directly from flash). However, that means your kernel needs to fit uncompressed in flash, and 734kB > 700kB. Also, the .data and .bss sections total 66kB, leaving about 190kB for everything else (i.e., all dynamically-allocated data structures in the kernel).
And that's just the kernel, without the drivers you need or any userspace.
So, yes, you're going to need a bit more RAM.
Overhead is not related to preemption. Preemption stops your process to run another one; if you disable it for, e.g., one CPU core, that core is dedicated to your process.
Still, there is overhead if you do I/O through the Linux I/O functions instead of, e.g., controlling the I/O lines directly from your process. But for any reasonably complicated I/O, you would have to implement a function with similar overhead yourself, and you can assume the Linux folks are better at that than you: more experience and more test cases.
So the only way you could win the overhead race is with very basic I/O functions (e.g. bitbanging an exotic, fast protocol) that you implement yourself, whether in "your kernel" or in "your userspace process", instead of running it through the various abstraction layers of the Linux GPIO subsystem.