Transistors are switches, yes, but switches are good for more than just turning lights on and off.
Switches are grouped together into logic gates. Logic gates are grouped together into logic blocks. Logic blocks are grouped together into logic functions. Logic functions are grouped together into chips.
For example, a TTL NAND gate typically uses 2 transistors (NAND gates are considered one of the fundamental building blocks of logic, along with NOR):
As the technology transitioned from TTL to CMOS (which is now the de facto standard), the number of transistors per gate roughly doubled. For instance, the NAND gate went from 2 transistors to 4:
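To make the 4-transistor count concrete, here is a minimal switch-level sketch of a CMOS NAND gate in Python. It assumes the usual arrangement of two PMOS transistors in parallel pulling the output high and two NMOS transistors in series pulling it low. The function names are made up for illustration, and each transistor is modelled as an ideal switch.

```python
def nmos(gate: bool) -> bool:
    """An NMOS transistor, modelled as a switch that conducts when its gate is high."""
    return gate


def pmos(gate: bool) -> bool:
    """A PMOS transistor, modelled as a switch that conducts when its gate is low."""
    return not gate


def cmos_nand(a: bool, b: bool) -> bool:
    """Switch-level model of a 2-input CMOS NAND gate (4 transistors)."""
    # Pull-up network: two PMOS in parallel between the supply and the output.
    pull_up = pmos(a) or pmos(b)
    # Pull-down network: two NMOS in series between the output and ground.
    pull_down = nmos(a) and nmos(b)
    # Exactly one of the two networks should conduct for any input.
    assert pull_up != pull_down
    return pull_up


if __name__ == "__main__":
    # Print the truth table: the output is low only when both inputs are high.
    for a in (False, True):
        for b in (False, True):
            print(int(a), int(b), int(cmos_nand(a, b)))
```

Each call to nmos() or pmos() stands for one physical transistor, which is where the count of 4 comes from.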
A latch (such as an SR latch) can be made using 2 CMOS NAND gates, so 8 transistors. A 32-bit register could therefore be made using 32 of these one-bit storage elements (loosely, flip-flops), so 64 NAND gates, or 256 transistors. An ALU may have multiple registers, plus lots of other gates as well, so the number of transistors grows rapidly.
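The arithmetic above can be sketched in a few lines of Python, together with a toy model of an SR latch built from two cross-coupled NAND gates. This is only an illustration: the names are hypothetical, the latch inputs are assumed to be active-low (as is usual for the NAND form), and the count follows the simplified one-latch-per-bit model from this answer. Practical clocked flip-flops generally need more gates than a bare latch.

```python
TRANSISTORS_PER_CMOS_NAND = 4  # figure used in the text
NANDS_PER_LATCH = 2            # two cross-coupled NAND gates


def nand(a: bool, b: bool) -> bool:
    return not (a and b)


def sr_latch(set_n: bool, reset_n: bool, q: bool = False, q_n: bool = True):
    """Active-low SR latch: evaluate the two NAND gates until the outputs settle."""
    for _ in range(4):  # a few passes are enough for this tiny feedback loop
        new_q = nand(set_n, q_n)
        new_q_n = nand(reset_n, new_q)
        if (new_q, new_q_n) == (q, q_n):
            break
        q, q_n = new_q, new_q_n
    return q, q_n


def register_transistors(bits: int) -> int:
    """Transistor count for a register built from one NAND latch per bit."""
    return bits * NANDS_PER_LATCH * TRANSISTORS_PER_CMOS_NAND


if __name__ == "__main__":
    q, q_n = sr_latch(set_n=False, reset_n=True)             # pulse "set"
    q, q_n = sr_latch(set_n=True, reset_n=True, q=q, q_n=q_n)  # hold
    print("stored bit:", int(q))
    print("32-bit register:", register_transistors(32), "transistors")
```

With both inputs held high the latch keeps whatever value it had, which is what makes it usable as one bit of storage.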
The more complex the functions the chip performs, the more gates are needed, and thus the more transistors.
Your average CPU these days is considerably more complex than, say, a Z80 chip from 30 years ago. It not only uses registers that are 8 times the width, but the actual operations it performs (complex 3D transformations, vector processing, etc.) are far more complex than the older chips can perform. A single instruction in a modern CPU may take many seconds (or even minutes) of computation on an old 8-bitter, and all that is done, ultimately, by having more transistors.
Since you have a good grasp of the other components, transistors should be no problem at all to understand. Doing a quick search around here, I found a post that I think sums it up very well.
Basics of Transistors
Think of an NPN transistor this way: You put a little current thru B-E, and that allows a lot of current thru C-E. The ratio of a lot to a little is the transistor gain, sometimes known as beta and sometimes hFE.
To sum it up, a common use of transistors is as an amplifier, or even simpler, a switch.
A good example would be powering a motor from a microcontroller in a robot. You want to be able to turn the motors on and off, which is what the microcontroller will do. If you were to hook the motor straight up to an MCU digital pin, you would destroy the MCU because the pin cannot handle the currents a motor typically needs. Instead, you use a transistor that draws a small amount of current through B-E but allows a much larger current to flow through C-E.
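As a rough illustration of the gain relationship (collector current is about gain times base current), here is a small Python sketch that estimates the base current and base resistor for switching a load from an MCU pin. All the numbers are assumptions chosen for the example and do not come from any datasheet: a 100 mA load, a gain of 100, a 5 V pin, a 0.7 V base-emitter drop, and an overdrive factor of 5 so the transistor is driven firmly on as a switch.

```python
def base_drive(load_current_a: float, gain: float, pin_voltage_v: float,
               v_be: float = 0.7, overdrive: float = 5.0):
    """Estimate base current (A) and base resistor (ohms) for an NPN switch.

    v_be and overdrive are assumed rule-of-thumb values, not datasheet figures.
    """
    # Minimum base current is load / gain; multiply by the overdrive factor
    # so the transistor is fully on rather than sitting at the edge.
    base_current_a = overdrive * load_current_a / gain
    # The resistor drops whatever is left of the pin voltage after B-E.
    resistor_ohms = (pin_voltage_v - v_be) / base_current_a
    return base_current_a, resistor_ohms


if __name__ == "__main__":
    i_b, r_b = base_drive(load_current_a=0.100, gain=100, pin_voltage_v=5.0)
    print(f"base current: {i_b * 1000:.1f} mA, base resistor: {r_b:.0f} ohms")
```

With these assumed values the pin supplies only a few milliamps while the transistor carries the full load current, which is the point of the arrangement described above.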
Best Answer
The 2N3904 and 2N3906 are very common NPN and PNP transistors, and can be used in many circuits that call for small-signal (as opposed to power) transistors. Both are rated for up to 200 mA of collector current.
There are hundreds (or more) of sites with circuits using these transistors. Google "2N3904 circuits" or "2N3906 circuits". Some are for amplifiers, some are for timers (blinking LEDs), and some are for making various kinds of sounds: sirens, musical notes, etc.
Here are a couple to get you started; the first is a Class A amplifier using a 2N3904 in a common emitter configuration:
and the second is a Class B amplifier using both the 2N3904 and 2N3906 in what's called a complementary configuration:
The values do not have to be exact; in fact, part of the fun is changing them and seeing what effect that has. You can learn about how the initial values are determined by reading about "biasing" in articles like this.
You may also run into circuits using the 2N2222, which is like a 2N3904 but can switch about three times as much current. So if you find a circuit using a 2N2222, as long as it doesn't need to handle more than 200 mA, you should be able to get by with a 2N3904 instead.
You can get the datasheets for any of these by Googling the part number along with the word "datasheet"; several will show up. You can also try looking them up on a distributor's website, such as Digi-Key.