The 0.7V drop is from the Base-Emitter junction being a PN junction (in an NPN transistor), which is the same as a diode (a silicon diode has a forward drop of ~0.7V).
A bipolar transistor is either NPN or PNP.
The reason it has current gain is that the base current turns the transistor on, allowing current from the collector (which is connected to V+) to flow to the emitter.
It doesn't have voltage gain because of the "negative feedback" effect from Re.
Let's run through an example of why the emitter stays 0.7V below Vb and does not reach V+.
Let's say we have this setup, and Q1 has a current gain of 200:
Now say we apply 3V to the base.
We know that the transistor begins to turn on when the base is ~0.7V higher than the emitter, so at this point current starts to flow from V+ into the collector and out through the emitter through Re to ground.
Now here's the important bit - when the current flows through Re a voltage appears across Re.
Now for argument's sake let's say the transistor "tries" to turn on fully, and since we have a rising current flowing through Re, the voltage across Re rises too.
What happens when the voltage across Re reaches 2.3V?
Well, you should see where this is going now - the base is still at 3V. When the emitter was at 0V, the base-emitter (b-e) voltage was >0.7V and the transistor was on. Now, however, the b-e voltage is at 3V - 2.3V = 0.7V! So if the voltage across Re were to rise any further, the b-e voltage would drop below ~0.7V and the transistor would start to turn off, reducing the current and pulling the voltage across Re back down. So the circuit has a natural limiting mechanism, and what happens is that the emitter always sits at ~0.7V below the base voltage. It would not matter if the current gain were infinite; the emitter voltage cannot rise above this point without the transistor "turning itself off".
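To put rough numbers on that limiting mechanism, here is a minimal Python sketch that solves for the emitter voltage using the simple exponential diode model of the b-e junction. The saturation current, thermal voltage and the 1k value for Re are all assumptions chosen for illustration (the text above does not give them), and the collector is assumed to have enough supply headroom that the transistor never saturates.

```python
import math

V_T = 0.02585   # thermal voltage near room temperature, volts (assumed)
I_S = 1e-14     # saturation current, amps (assumed, typical order of magnitude)
R_E = 1000.0    # emitter resistor, ohms (assumed, not given in the text)


def emitter_voltage(v_base, beta=200.0, r_e=R_E):
    """Find Ve such that Ve = Re * Ie, where Ie depends on Vbe = Vb - Ve."""

    def mismatch(v_e):
        # Collector current from the exponential junction model.
        i_c = I_S * (math.exp((v_base - v_e) / V_T) - 1.0)
        # Emitter current is collector current plus base current.
        i_e = i_c * (1.0 + 1.0 / beta)
        # Positive when the resistor drop would exceed the guessed Ve.
        return r_e * i_e - v_e

    # mismatch() falls steadily as Ve rises, so plain bisection works.
    lo, hi = 0.0, v_base
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for v_b in (1.0, 2.0, 3.0):
    v_e = emitter_voltage(v_b)
    print(f"Vb = {v_b:.1f} V -> Ve = {v_e:.3f} V, Vbe = {v_b - v_e:.3f} V")

# An enormous current gain barely moves the result.
print(emitter_voltage(3.0, beta=200.0), emitter_voltage(3.0, beta=1e9))
```

With these assumed values the emitter settles a little under 0.7V below the base at each step, and changing the current gain from 200 to a billion shifts it by well under a millivolt, which is the "no voltage gain" behaviour described above.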
Here is a simulation of the above circuit, with the base voltage gradually ramped up from 0V to 3V:
Here's another simulation with a capacitor added in to prevent the emitter voltage from rising too quickly, so we can see how the transistor turns on fully at first to charge the cap as quickly as possible, then (almost) turns off again as the cap reaches ~2.3V and only the resistor current is left as things settle:
Simulation:
9V batteries are great for smoke alarms, but pretty shocking for anything else. A standard alkaline PP3 has a capacity of about 400mAh and an internal resistance of around 5 Ohms (in fact, some low-duty ones have internal resistances closer to 20+ Ohms). So basically not a lot of energy, poor current capability, and lots of energy wasted internally. If you use a linear regulator as you are, you are wasting 63% of that energy as heat in the regulator. At a 200mA load you'd end up with a run time of only about 2 hours before your battery dies. The regulator will be dissipating over 1 Watt, which is quite a lot without a heat sink.
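Those figures can be checked with a few lines of Python. This is only a back-of-envelope sketch: it assumes a 3.3V output at 200mA, a battery that stays at exactly 9V, and a regulator with no quiescent current; the function name is made up for illustration.

```python
def linear_regulator_budget(v_in, v_out, i_load_a, capacity_mah):
    """Rough energy budget for a linear regulator fed from a battery."""
    # A linear regulator passes the load current straight through,
    # so the fraction of input power burned as heat is (Vin - Vout) / Vin.
    wasted_fraction = (v_in - v_out) / v_in
    dissipation_w = (v_in - v_out) * i_load_a
    # Battery current equals load current, so run time is capacity / current.
    runtime_h = capacity_mah / (i_load_a * 1000.0)
    return wasted_fraction, dissipation_w, runtime_h


wasted, heat_w, hours = linear_regulator_budget(9.0, 3.3, 0.2, 400.0)
print(f"wasted: {wasted:.0%}, regulator heat: {heat_w:.2f} W, run time: {hours:.1f} h")
```

In practice the run time would be shorter still, since this ignores the internal resistance and the falling battery voltage.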
Using a switching regulator would help - you reduce the current draw from the battery which reduces the losses on the internal resistance, whilst also massively reducing the losses in the regulator, so you claw back some of that wasted energy for a longer running time. But it still isn't a lot - at 200mA draw on the output, and assuming you have a 90+% efficient switching regulator, theoretically that would give you around 5 hours running time (factoring in the internal resistance). And that is not accounting for the discharge curve - the voltage will drop off much sooner than 5 hours, and as it does the switching regulator will start drawing more and more current from the battery to try and maintain the output voltage meaning more losses internally in the battery.
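As a rough check on the 5 hour figure, the sketch below solves for the battery current when the converter needs a fixed input power and the battery loses some voltage across its internal resistance. It assumes a constant 9V open-circuit voltage, 5 Ohms internal resistance, exactly 90% efficiency and a capacity that does not depend on load, none of which holds precisely for a real PP3.

```python
import math


def battery_current(v_batt, r_int, p_out_w, efficiency):
    """Battery current (A) needed to deliver p_out_w through the converter.

    Solves v_batt * I - r_int * I**2 = p_out_w / efficiency for the
    smaller root, which is the physically sensible operating point.
    """
    p_in = p_out_w / efficiency
    if r_int == 0.0:
        return p_in / v_batt
    disc = v_batt ** 2 - 4.0 * r_int * p_in
    if disc < 0.0:
        raise ValueError("battery cannot deliver the requested power")
    return (v_batt - math.sqrt(disc)) / (2.0 * r_int)


p_out = 3.3 * 0.2  # 3.3 V at 200 mA
i_batt = battery_current(9.0, 5.0, p_out, 0.9)
print(f"battery current: {i_batt * 1000:.1f} mA")
print(f"run time: {400.0 / (i_batt * 1000):.1f} h")
```

With these assumptions the battery current comes out somewhat under half of the 200mA output current, and the run time lands a little below 5 hours.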
A standard alkaline AA battery by comparison has a capacity of around 2500mAh or more and an internal resistance of <0.5 Ohm. The trouble is the voltage - 1.5V. There are, however, many simple switcher ICs and modules out there that can run off a single AA or AAA cell and step up to a 3.3V output. One AA cell with a boost converter would give you around 4.7 hours. The main reason for the lower run time is that a boost converter draws a much higher current from the battery than it delivers at the output, so more is lost in the internal resistance. Even so, you get a similar run time to a PP3 in less space, and with batteries that tend to be far cheaper. Or you could use the same approach but go for a D cell, for example; those have a lower resistance and much higher capacity (closer to 15Ah), so with one of those and a boost converter you would have a run time of around 30 hours - 6 times longer than a PP3.
If you were to use more than one cell, i.e. go for say 3 AA batteries in series, your voltage will go up (capacity stays the same) and so will the run time. With 3 in series and a step-down converter, you would have a run time of closer to 16 hours (at 200mA). You could even simply use a low-dropout linear regulator to bring that down to 3.3V - it would hurt efficiency, but you would still be able to get a running time of around 12 hours - so still much more than the PP3.
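The different cell options can be compared side by side with a small sketch. It assumes a 3.3V, 200mA load and a 90% efficient converter, uses the nominal voltages and capacities quoted above, and deliberately ignores internal resistance and the discharge curve, so treat the results as ballpark figures only.

```python
V_OUT = 3.3        # volts
I_OUT = 0.2        # amps
EFFICIENCY = 0.9   # assumed converter efficiency


def switcher_runtime_h(v_batt, capacity_mah):
    """Run time with a switching converter (boost or buck)."""
    # Input power = output power / efficiency, so battery current
    # scales inversely with battery voltage.
    i_batt = V_OUT * I_OUT / (EFFICIENCY * v_batt)
    return capacity_mah / (i_batt * 1000.0)


def ldo_runtime_h(capacity_mah):
    """Run time with a linear regulator: battery current equals load current."""
    return capacity_mah / (I_OUT * 1000.0)


options = {
    "1 x AA + boost": switcher_runtime_h(1.5, 2500.0),
    "1 x D + boost": switcher_runtime_h(1.5, 15000.0),
    "3 x AA + buck": switcher_runtime_h(4.5, 2500.0),
    "3 x AA + LDO": ldo_runtime_h(2500.0),
}
for name, hours in options.items():
    print(f"{name:15s} {hours:5.1f} h")
```

The results come out close to the run times quoted above. The single AA case is a little optimistic here because the roughly 0.5A battery current makes the ignored internal resistance matter most.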
Best Answer
You have a load where you want \$3.3V_{DC}\$ and a compliance current of up to \$500mA\$. The design is linear and sources its power from a \$12V_{DC}\$ supply. It's not clear to me (because I may have missed reading it, or for other reasons) if this is a lead acid battery operating in a car or a laboratory power supply on a bench. You have some questions about \$V_{BE}\$ as a function of temperature and its impact on the circuit you are considering. You have an all-too-hot PNP BJT. You have BJTs, no MOSFETs. You are currently using a resistor divider to set your output voltage.
Let me start by just thinking out loud about the design you already show.

\$Q_2\$ will be sourcing most of the current. Luckily, it's not operating saturated, as \$V_{CE} > 1V\$. So you can expect \$\beta \ge 50\$ for the PNP and a reasonable base current. Unluckily, it's not operating saturated, with \$V_{CE} > 8V\$, so it's dissipating like crazy -- likely at more than 4W. That's probably more than a TO220 package will comfortably dissipate into air. So that's a problem identified. Remember it for later.

\$Q_1\$ is just providing base current to \$Q_2\$. That's likely to be \$I_{C_{Q_1}} < 10mA\$. And luckily, \$Q_1\$ is also not operating saturated, so once again you can expect \$\beta \ge 80\$ for the NPN and a very reasonable base current that is probably \$I_{B_{Q_1}} \le 150\mu A\$. That's not a bad load current to draw away from whatever is setting the voltage (the resistor divider). But this does reflect on your resistor divider, if you intend on keeping it, in terms of stiffness, and you need to carefully consider the implications. (You could also consider a zener here, of course. But I'll stick with your resistor divider.)
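For reference, here is the arithmetic behind those estimates, taking the full \$500mA\$ load and the worst-case \$\beta\$ figures mentioned above, and treating \$Q_1\$'s collector current as equal to \$Q_2\$'s base current (a simplification):

$$
\begin{aligned}
P_{Q_2} &\approx (12V - 3.3V)\cdot 500mA = 4.35W \\
I_{C_{Q_1}} \approx I_{B_{Q_2}} &\le \frac{500mA}{50} = 10mA \\
I_{B_{Q_1}} &\le \frac{10mA}{80} = 125\mu A
\end{aligned}
$$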
So let's pencil out a design and ignore heating problems for now. You'd do something like this:
Well, there's a rough idea. You can see a lot of power in the PNP BJT.
Now, you don't actually have to burn off all that power in the PNP. You can distribute it somewhere else, if you want. It does have to be burned somewhere. But you can insert a resistor. It turns out that an easy place would be in the collector leg of the PNP (the \$V_{CE_{Q_1}}\$ stays the same then.) That PNP only needs about \$2V \le V_{CE} \le 4V\$ in order to keep both itself and the NPN out of saturation. And a TO220 package probably can dissipate 2W into air. So let's split the difference and figure \$V_{CE_{Q_2}} = 3V\$, so that \$Q_2\$ is burning only 1.5W or so, and shove the rest into some other resistor.
The new schematic looks like this:
\$R_3\$ will dissipate about \$3W\$, worst case. (The above circuit is really targeted for a maximum of \$485mA\$, but I figured you would be okay with that in order to get a standard resistor value there.) \$Q_2\$, as predicted, will be about \$1.5W\$.
If the current is, let's say, \$250mA\$, then what happens? Well, the PNP BJT will stretch out its collector and will have to drop another \$3V\$, for a total of about \$6V\$. But the current is now only \$250mA\$, too. So it will still dissipate about \$1.5W\$. The resistor will reduce its dissipation, though.
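The split between \$Q_2\$ and \$R_3\$ at different load currents can be sketched in a few lines of Python. The resistor value used here, 11.4 Ohms, is an assumption: it is simply the ideal value that leaves 3V across \$Q_2\$ at 500mA, not the standard value used in the schematic. The sketch also ignores the small current through \$Q_1\$.

```python
V_SUPPLY = 12.0
V_OUT = 3.3
R3_OHMS = 11.4  # assumed ideal value: (12 - 3.3 - 3) V / 0.5 A


def dissipation(i_load_a, r3=R3_OHMS):
    """Return (Vce of Q2, power in Q2, power in R3) at a given load current."""
    v_r3 = i_load_a * r3                 # drop across the collector resistor
    v_ce = V_SUPPLY - V_OUT - v_r3       # Q2 takes up whatever is left over
    return v_ce, v_ce * i_load_a, v_r3 * i_load_a


for i_load in (0.5, 0.25):
    v_ce, p_q2, p_r3 = dissipation(i_load)
    print(f"{i_load * 1000:.0f} mA: Vce = {v_ce:.2f} V, "
          f"Q2 = {p_q2:.2f} W, R3 = {p_r3:.2f} W")
```

Under these assumptions \$Q_2\$ stays near 1.5W at both currents, while the resistor's share falls from about 2.85W to about 0.7W.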
In either case, you can get away with a small signal NPN. You just need to get a TO220 packaged PNP and these are fairly cheap and easy to get.
Regulation isn't all that good, still. We did, after all, allow a \$200mV\$ range for the divider in the calculations. You could go even stiffer for the resistor divider. But another approach would be to use a zener of an appropriate value.
Where did I get the 4.025V node value for the divider? Well, the NPN BJT is a small signal device. Stuck in my head is that they have \$V_{BE} = 0.7V\$ when \$I_C = 4mA\$, and that \$V_{BE}\$ rises by about \$60mV\$ for each tenfold increase in collector current. So I figured \$3.3V + V_{BE} = 3.3V + 700mV + 60mV\cdot \log_{10}\left(\frac{10mA}{4mA}\right) \approx 4.025V\$ and that's where the number came from.
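The same rule of thumb is easy to turn into a small Python helper if you want to see how the divider node should move for other collector currents. The 0.7V at 4mA reference point and the 60mV per decade slope are the rough figures used above, not datasheet values, and the function name is just for illustration.

```python
import math


def vbe_estimate(i_c_a, vbe_ref=0.7, i_ref_a=4e-3, mv_per_decade=60.0):
    """Rule-of-thumb V_BE: reference value plus a fixed slope per decade of I_C."""
    return vbe_ref + (mv_per_decade / 1000.0) * math.log10(i_c_a / i_ref_a)


for i_c in (1e-3, 4e-3, 10e-3):
    node = 3.3 + vbe_estimate(i_c)
    print(f"Ic = {i_c * 1000:.0f} mA -> divider node = {node:.3f} V")
```

At 10mA this gives a node voltage within a couple of millivolts of the figure above.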