There's no relationship as such; it's like asking about the relationship between cars and nuts and bolts: one contains the other, but there's much more to it than that...
Have a look at www.nand2tetris.org, a course that starts at the basic building block (the NAND gate) and works up to a complete computer playing Tetris.
You can of course go further "back" than a NAND gate by breaking it down to its component transistors, etc., but that's probably not overly helpful to a software engineer unless the question is about the actual physics of system performance.
Yes, you can implement a UART in software. Here's one in 8051 assembly, which should work for your microcontroller. This technique is called bit-banging.
Assuming by "serial" you mean RS-232¹, and assuming you're talking to something that actually wants RS-232 and not "TTL serial", you will still need an external RS-232 level-shifter like a MAX232.
The problem with software UARTs is that RS-232 doesn't have a separate clock line². As such, reliable communication is dependent on the timing accuracy of the devices on both ends of the connection. You therefore don't want to try and provide a software UART if your instruction clock is inaccurate, as with many internal RC oscillators. The AT89S52 you're using doesn't have the option of an internal oscillator, but you do still need to make sure your external oscillator is accurate to within 1% of the nominal value for a software UART to achieve reliable communication.
On top of that, for the processor's core instruction cycle time to divide evenly into the bit time of RS-232, you need to pick odd oscillator frequencies. 11.0592 MHz is popular for this. Another option is 14.7456 MHz, as explained here.
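To see where 11.0592 MHz comes from, assume the classic 12-clocks-per-machine-cycle 8051 core (which the AT89S52 uses) and a 9,600 bps data rate:

$$ \frac{11,059,200\ \text{Hz}}{12 \times 9600\ \text{bps}} = 96\ \text{machine cycles per bit} $$

The result is an exact integer, so no fractional-cycle error accumulates from bit to bit; 14.7456 MHz works out just as cleanly, to 128 machine cycles per bit.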
The clock frequency affects the hard-coded delay loops you need in a bit-banging approach. That's what the DJNZ R0,$ bit is in the assembly code I pointed you to: it's a pure delay loop, doing nothing but decrementing a counter to burn time. Up at the top of the file, you see the BITTIM constant, which hard-codes this particular implementation so that its timing works with an 11.0592 MHz CPU clock. If you change the clock frequency, you have to change this constant, too, and then test carefully to make sure the timing is still good enough. Sometimes you find yourself needing to add NOP or similar instructions to pad the timing out with bit-banging approaches like this one.
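To make that concrete, here is a minimal transmit-only sketch of the same technique. This is not the code I linked, just an illustration: the pin assignment, the BITTIM value, and the exact cycle counts are my own assumptions and would need verifying against your assembler, crystal, and a known-good receiver.

    ; Software UART transmit sketch: 8N1, LSB first.
    ; Assumes a classic 12-clock 8051 core at 11.0592 MHz, so one
    ; machine cycle = 1.085 us and one bit at 9600 bps = 96 cycles.
    BITTIM  EQU  45             ; DJNZ R0,$ burns 2*BITTIM = 90 cycles;
                                ; the loop overhead supplies the other ~6
    TXPIN   BIT  P1.0           ; hypothetical pin driving the serial line

    PUTC:   CLR  TXPIN          ; start bit (line low)
            MOV  R0, #BITTIM
            DJNZ R0, $          ; burn one bit time
            MOV  R1, #8         ; 8 data bits
    PUTC1:  RRC  A              ; next data bit into carry
            MOV  TXPIN, C       ; drive it onto the pin
            MOV  R0, #BITTIM
            DJNZ R0, $          ; burn one bit time
            DJNZ R1, PUTC1
            SETB TXPIN          ; stop bit (line high)
            MOV  R0, #BITTIM
            DJNZ R0, $
            RET                 ; the byte in A has been shifted out

Note how every pass through the data-bit loop costs the same number of machine cycles; that invariant is exactly what breaks when you change the crystal, which is why BITTIM has to change with it.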
The lower your serial data rate, the more slack you can get away with. So, if you only need 1,200 bps, you might be able to get away with an internal RC oscillator or a "normal" μC instruction clock rate, particularly if you aren't transmitting or receiving continuously. Conversely, my experience is that if you need to run faster than 9,600 bps or so, you really need a hardware UART.
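To put rough numbers on that: the receiver resynchronizes on every start bit, so timing error only accumulates over one 10-bit frame (start + 8 data + stop). The last bit is still sampled correctly as long as the total drift stays under half a bit time:

$$ \frac{0.5\ \text{bit}}{10\ \text{bits}} = 5\%\ \text{total frequency error} $$

Split that allowance between the two ends and leave headroom for edge-detection jitter and drift over temperature, and you arrive at the roughly 1% per-device figure above.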
Footnotes
¹ There are many different forms of serial communication. When used generically, the term most often means RS-232, but based on the tags defined here on Electronics.SE, I²C, SPI, CAN, and RS-485 are all quite common.
RS-485: I believe everything above applies just as well to RS-485, as well as to its close relative RS-422.
I²C and SPI: These include a synchronous clock line, so most of what I've written above doesn't apply to these communication methods. Nevertheless, it is also possible to implement them in software via bit-banging (see the sketch after these footnotes).
CAN bus: Not possible to implement in software, due to the need for CSMA/BA: each node must monitor the bus while transmitting and react within a fraction of a bit time to detect that it has lost arbitration.
There are still other forms of serial communication which are simply too fast to implement in software for a typical μC, such as USB, Ethernet, PCI Express, and SATA. You can implement them in FPGAs, but that's a separate question.
² Well, not with DB-9 RS-232, or any of the common lower-pin-count variants. The original DB-25 based flavor of RS-232 did set aside a couple of pins for sender and receiver clocks (15 and 17), but in my decades of experience with RS-232, I've never used a device that actually depended on their presence.
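As promised in the I²C/SPI footnote, here is what bit-banging a synchronous protocol looks like: a mode-0 SPI master byte exchange on the same 8051, again an unverified sketch with hypothetical pin assignments.

    ; Bit-banged SPI master, mode 0, MSB first: exchange the byte in A.
    ; No timing padding is shown, so the clock rate is simply "as fast
    ; as the instructions run".
    SCK     BIT  P1.1           ; clock, idles low
    MOSI    BIT  P1.2           ; master out
    MISO    BIT  P1.3           ; master in

    SPIXFR: MOV  R1, #8
    SPILP:  RLC  A              ; next outgoing bit (MSB first) -> carry
            MOV  MOSI, C        ; drive it while SCK is low
            SETB SCK            ; rising edge: both ends sample
            MOV  C, MISO        ; capture the slave's bit
            CLR  SCK            ; falling edge: shift phase
            DJNZ R1, SPILP      ; the carry re-enters A on the next RLC
            RLC  A              ; pull in the final received bit
            RET                 ; A now holds the received byte

Because the clock is explicit, none of the oscillator-accuracy concerns above apply; the slave just follows whatever edges the master produces.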
Best Answer
Pretty much just transistors. Lots of them. Starting with a couple of thousand for the 4004 (the first commercially successful microprocessor) in 1971, to billions in the latest chips. Transistors are used to create logic gates, which in turn are used to create the basic building blocks of the processor: registers, the ALU, the instruction decoder, and so on.
Microcontrollers in addition have program and data memory, which is also constructed from transistors, along with analog circuitry and I/O ports.
High-level languages are compiled or interpreted; in the first case they are eventually translated into machine instructions, which are decoded by the instruction decoder. The opcode in each instruction dictates which operation takes place. Arithmetic operations (add, subtract, and so forth) and logic operations like AND, OR, etc. are handled by the ALU. So yes, there are gates buried within the ALU that perform AND and OR operations, corresponding to the AND and OR operations in your program.
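For example, on an 8051 (picking a concrete instruction set; the variable and its address here are hypothetical), a C statement like flags = (flags & 0x0F) | 0x80; might compile to:

    FLAGS   EQU  20H            ; hypothetical variable in internal RAM

            MOV  A, FLAGS       ; load the variable into the accumulator
            ANL  A, #0FH        ; bitwise AND, performed by AND gates in the ALU
            ORL  A, #80H        ; bitwise OR, likewise a gate-level ALU operation
            MOV  FLAGS, A       ; store the result back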
However, such operations need not actually be done by AND or OR gates. Any logic function can be performed using either all NAND gates (an AND followed by an inverter) or all NOR gates (an OR followed by an inverter), so no other type of gate is needed. The guidance computer for the Apollo 11 spacecraft, which landed on the Moon in 1969, was built entirely from 2,800 ICs, each containing dual three-input NOR gates.
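For instance, starting from nothing but NAND:

$$ \begin{aligned} \overline{A} &= \overline{A \cdot A} && \text{(a NAND with its inputs tied together is an inverter)} \\ A \cdot B &= \overline{\overline{A \cdot B}} && \text{(a NAND plus that inverter is an AND)} \\ A + B &= \overline{\overline{A} \cdot \overline{B}} && \text{(De Morgan: three NANDs make an OR)} \end{aligned} $$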
Here is the transistor count for a selected number of microprocessors over the years:
It's astonishing that the number of transistors has increased by more than six orders of magnitude in 41 years (1971-2012). This increase has almost exactly matched Moore's Law, which states that the number of transistors in an integrated circuit doubles every two years. Starting from the 4004's 2,300 transistors and doubling every two years for 41 years gives

$$ 2300(2^{41/2}) \approx 3,410,693,920 $$

right in line with the billions of transistors in the latest chips.
Some notes about the table:
The earliest microprocessors, such as the 6502 used in the very popular Apple ][, had low enough transistor counts that with only a little magnification you can see the individual transistors. Here is a 6502 simulator that actually shows the data paths through the chip as it executes a program. Just click on the Play button on the right side. You can zoom in and get more detail. The simulator was created by exposing the die, photographing the surface and substrates at high resolution, and then creating a complete digital model of the chip.
The very first computers in the 1940's were made up of either vacuum tubes or relays, but in either case they performed the same Boolean algebra as the digital circuits of today. Discrete-transistor computers came along in the late 1950's/early 1960's. They were superseded by processors using SSI (small-scale integrated circuits). This was largely driven by the aerospace industry, both the space race and defense, to minimize weight. The microprocessors in the table above are an example of LSI (large-scale integration).