Assembly – Were the First Assemblers Written in Machine Code?

I am reading the book The Elements of Computing Systems: Building a Modern Computer from First Principles, which contains projects encompassing the build of a computer from boolean gates all the way to high level applications (in that order). The current project I'm working on is writing an assembler using a high level language of my choice, to translate from Hack assembly code to Hack machine code (Hack is the name of the hardware platform built in the previous chapters). Although the hardware has all been built in a simulator, I have tried to pretend that I am really constructing each level using only the tools available to me at that point in the real process.

That said, it got me thinking. Using a high level language to write my assembler is certainly convenient, but for the very first assembler ever written (i.e. in history), wouldn't it need to be written in machine code, since that's all that existed at the time?

And a correlated question… how about today? If a brand new CPU architecture comes out, with a brand new instruction set, and a brand new assembly syntax, how would the assembler be constructed? I'm assuming you could still use an existing high level language to generate binaries for the assembler program, since if you know the syntax of both the assembly and machine languages for your new platform, then the task of writing the assembler is really just a text analysis task and is not inherently related to that platform (i.e. needing to be written in that platform's machine language)… which is the very reason I am able to "cheat" while writing my Hack assembler in 2012, and use some preexisting high level language to help me out.

Best Answer

for the very first assembler ever written (i.e. in history), wouldn't it need to be written in machine code

Not necessarily. Of course the very first version v0.00 of the assembler must have been written in machine code, but it would not be sufficiently powerful to be called an assembler. It would not support even half the features of a "real" assembler, but it would be sufficient to write the next version of itself. Then you could re-write v0.00 in the subset of the assembly language, call it v0.01, use it to build the next feature set of your assembler v0.02, then use v0.02 to build v0.03, and so on, until you get to v1.00. As the result, only the first version will be in machine code; the first released version will be in the assembly language.

I have bootstrapped development of a template language compiler using this trick. My initial version was using printf statements, but the first version that I put to use in my company was using the very template processor that it was processing. The bootstrapping phase lasted less than four hours: as soon as my processor could produce barely useful output, I re-wrote it in its own language, compiled, and threw away the non-templated version.

Best Answer

Related Solutions

Theory – How to Automatically Convert Code from Low-Level to High-Level Language

C Programming – Writing an Assembler in C for Low-Level Language