In a nutshell, bipolar junction transistors work because of the physical geometry of the two junctions. The base layer is very thin, and the charge carriers flowing from the emitter into the base do not recombine right away. Most of them pass right through the base and enter the depletion region of the reverse-biased base-collector junction. Once this happens, the strong field in this region quickly sweeps them the rest of the way to the collector terminal, where they become the collector current.
The voltage divider rule between your two resistors does not work like you think because the base-emitter junction of the BJT rises to about 0.7V and then goes little higher, even while the current into the base keeps increasing. In other words, the BE junction clamps the voltage between the two resistors to about 0.7V.
When the R1 value is increased to a certain level, the voltage at the BJT base drops below the 0.6 to 0.7V level and the transistor starts to shut off. At some point the voltage divider begins to behave normally again, as the current into the base approaches zero.
ADDITIONAL INFORMATION
Since the OP is not yet quite getting it, let me be specific with the examples that were posted. It is correct that at a voltage in the range of 0.6 to 0.7V the transistor will begin to turn on.
Let's look at the 20K//1K case in the left picture. Assume for a moment that the transistor base is not connected to the two resistors. By the voltage divider equations the divider voltage is:
Vb = (Vsupply * R6)/(R5 + R6) = (12V * 1K)/(20K + 1K) = 0.571V
This voltage is less than the voltage needed to turn on a transistor, so if you reconnect the transistor base to the divider, virtually no current will flow into the base and the divider voltage will remain near this 0.571V value.
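For anyone who wants to check the arithmetic, here is a minimal sketch of the unloaded divider calculation above. The 0.6V constant is only the lower edge of the turn-on range mentioned in this answer, not a datasheet value, and the function name is made up for illustration.

```python
def divider_voltage(v_supply, r_top, r_bottom):
    """Voltage across r_bottom of an unloaded divider (no current drawn from the tap)."""
    return v_supply * r_bottom / (r_top + r_bottom)

V_BE_ON = 0.6  # assumed lower edge of the 0.6 to 0.7V turn-on range

# The 20K//1K case: R5 = 20K on top, R6 = 1K on the bottom, 12V supply.
vb = divider_voltage(12.0, 20e3, 1e3)
below_turn_on = vb < V_BE_ON

print(f"Vb = {vb:.3f} V")
print("below turn-on range" if below_turn_on else "in or above turn-on range")
```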
Next step is to visualize what happens in the above equation when the R5 value is decreased. The divider voltage will increase slowly as the R5 value is decreased.
As R5 decreases more and more, the Vb divider voltage will rise to the point where the transistor wants to begin turning on. That will be in the 0.6 to 0.7V range. At this point the transistor base begins allowing some of the current from R5 to flow into the base of the transistor.
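To put rough numbers on where that happens, the unloaded divider equation can be rearranged to give the R5 value at which the tap would reach a given voltage. This is only a sketch: it ignores the small base current that already flows near the threshold, and the function name is hypothetical.

```python
def r_top_for_tap_voltage(v_supply, r_bottom, v_tap):
    """R5 value that puts an unloaded divider tap at v_tap.

    Rearranged from v_tap = v_supply * r_bottom / (r_top + r_bottom).
    """
    return r_bottom * (v_supply / v_tap - 1.0)

# 12V supply and R6 = 1K, as in the example circuit.
r5_at_0v6 = r_top_for_tap_voltage(12.0, 1e3, 0.6)
r5_at_0v7 = r_top_for_tap_voltage(12.0, 1e3, 0.7)

print(f"Tap reaches 0.6 V at R5 = {r5_at_0v6 / 1e3:.2f}K")
print(f"Tap reaches 0.7 V at R5 = {r5_at_0v7 / 1e3:.2f}K")
```

So with these values the transition region sits somewhere between roughly 19K and 16K for R5, which is why the 20K case stays off.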
Be aware that transistors are current mode devices and are actually turned on when the current into the base starts to flow. Below the Vbe threshold the current is nearly zero. As the divider gets past the Vbe threshold the current into the base increases and the transistor starts to turn on.
OK, let's go back and decrease the value of R5 a little more. The lower resistance of R5 allows more current from the 12V supply to flow to R6 and the base of the transistor. The voltage of the R5//R6 divider will no longer follow the above equation because the base of the transistor is placing a load on the divider and stealing current, so that R6 does not get as much. The nature of the transistor base-emitter junction is that the current into the base can increase more and more whilst the voltage of the base will change only a little.
As I said before, the base of the transistor begins to act like a clamp on the voltage divider, not allowing Vb to increase much above the 0.7V level as R5 is made smaller and smaller. Instead the base current increases to the point that the collector current starts to flow and the transistor eventually turns fully on.
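The clamping can be seen numerically with a rough model that treats the base-emitter junction as an ideal diode hanging on the divider tap. This is a sketch under stated assumptions, not a model of any particular transistor: the diode saturation current `I_S_BASE` and the thermal voltage `V_T` are assumed values, the emitter is taken as grounded, and the node equation is solved by simple bisection.

```python
import math

V_SUPPLY = 12.0
R6 = 1e3
V_T = 0.02585      # thermal voltage near room temperature (assumed)
I_S_BASE = 1e-16   # assumed saturation current of the base-emitter diode

def unloaded_vb(r5):
    """Divider voltage with the base disconnected."""
    return V_SUPPLY * R6 / (r5 + R6)

def base_current(vb):
    """Ideal-diode estimate of the current into the base at voltage vb."""
    return I_S_BASE * (math.exp(vb / V_T) - 1.0)

def loaded_vb(r5):
    """Solve (Vs - Vb)/R5 = Vb/R6 + Ib(Vb) for Vb by bisection."""
    def residual(vb):
        return (V_SUPPLY - vb) / r5 - vb / R6 - base_current(vb)
    lo, hi = 0.0, V_SUPPLY  # residual is positive at lo and negative at hi
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for r5 in (20e3, 10e3, 5e3, 2e3, 1e3):
    vb = loaded_vb(r5)
    print(f"R5 = {r5 / 1e3:5.1f}K  unloaded = {unloaded_vb(r5):5.3f} V  "
          f"loaded = {vb:5.3f} V  Ib = {base_current(vb) * 1e3:7.4f} mA")
```

With these assumed values the loaded tap voltage stays well under 1V even when the unloaded divider would sit at 6V, while the base current keeps growing as R5 shrinks.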
The amount of base current needed to turn the transistor full ON will depend on how much collector current is allowed to flow which is limited by components in the collector circuit. The relationship between the base current and the collector current is called the transistor gain or Beta. If the collector current is limited then the transistor will saturate to a Vce of near zero volts when the base current has reached a sufficient level.
It is possible to keep lowering the value of R5 more and more causing the base current to increase more. But beyond the level that caused saturation (Vce near zero) the Vb will only increase slightly and no additional collector current will flow because it has reached the level limited by the components in the collector circuit.
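Here is a small sketch of that limit. The collector circuit is not specified in this answer, so the 1K collector resistor, the Beta of 100, and the 0.2V saturation voltage below are all assumed values chosen only to illustrate the idea.

```python
V_SUPPLY = 12.0
R_C = 1e3        # assumed collector resistor (not from the original circuit)
BETA = 100.0     # assumed current gain
V_CE_SAT = 0.2   # assumed "near zero" Vce at saturation

# The collector circuit sets the largest collector current that can flow.
ic_sat = (V_SUPPLY - V_CE_SAT) / R_C
# Base current at which the transistor just reaches saturation.
ib_sat = ic_sat / BETA

def collector_current(ib):
    """Ic follows Beta * Ib until the collector circuit limits it."""
    return min(BETA * ib, ic_sat)

for ib in (20e-6, 50e-6, ib_sat, 1e-3, 5e-3):
    ic = collector_current(ib)
    state = "saturated" if ic >= ic_sat else "active"
    print(f"Ib = {ib * 1e6:8.1f} uA  Ic = {ic * 1e3:6.2f} mA  ({state})")
```

Past `ib_sat` the extra base current buys nothing in collector current, which is the point made above.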
Best Answer
There is a precise definition and a sloppy one for saturation. I'll start with the precise one.
That's pretty much it. The saturation region is precisely defined here.
The sloppy one comes about because the practical behaviors of the different BJT parameters don't all fall neatly on those lines. Besides, those voltages aren't the only thing that is important. Temperature certainly has a large effect on some parameters, so you could imagine extending a 3rd axis, in and out of the paper here, to add in that dimension. And then mapping the practical details onto that new view would be even more complex.
The sloppy idea of saturation is a practical one. If you are considering operating the BJT as a switch, you have already made the decision to operate in some part of the saturation region in the chart. But you also will be operating with a substantial forward biased \$V_{bc}\$ and not close to zero and certainly not reverse biased. This isn't a precise definition and different people will use different thresholds. So part of that region isn't useful. If you are considering operating the BJT as an amplifier, then you probably want to keep \$V_{bc}\$ reverse biased and perhaps add a small margin to that, as well. So once again, varying ideas of "out of saturation" will apply here for an amplifier. And once again, it doesn't fall precisely on the chart as shown.
EDIT: The distinction shown in the above chart is an arbitrary demarcation using the difference value of \$0V\$ as the place to draw lines, but it is also an objective, measurable, quantitative, and exact one. Whether or not it is a physical one depends on the physical parameters you care about, I suppose. But if you seek some physical point of demarcation, such as was found for Pluto vs the other planets where 5 orders of magnitude in natural demarcation was found based upon some well-reasoned physical ideas, then we'd have to get into a discussion about what physical ideas you think are appropriate. Only then could anyone attempt to decide about where these points of demarcation occur. And I don't believe that kind of discussion is appropriate here.
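As a sketch of the precise definition, here is the 0V demarcation written out as code. It assumes an NPN sign convention (a junction is forward biased when its voltage is positive) and the usual textbook names for the four quadrants; the function name is mine, and points exactly on an axis are lumped in with the reverse-biased side.

```python
def bjt_region(v_be, v_bc):
    """Classify an NPN operating point by the signs of Vbe and Vbc."""
    be_forward = v_be > 0.0
    bc_forward = v_bc > 0.0
    if be_forward and bc_forward:
        return "saturation"
    if be_forward:
        return "forward-active"
    if bc_forward:
        return "reverse-active"
    return "cutoff"

for v_be, v_bc in ((0.7, -5.0), (0.7, 0.5), (-1.0, -1.0), (-1.0, 0.6)):
    print(f"Vbe = {v_be:5.2f} V, Vbc = {v_bc:5.2f} V -> {bjt_region(v_be, v_bc)}")
```

The sloppy definitions amount to moving these thresholds away from zero by some margin that depends on who you ask.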
For the seminal paper which provides the mathematical models and expresses them using exactly the two axes I show above, see: J. J. Ebers and J. L. Moll, "Large-Signal Behavior of Junction Transistors," Proc. IRE, Vol. 42, pp. 1761-1772, December 1954. The earliest publication I happen to have on the shelf, showing the above chart, is "Modeling the Bipolar Transistor," by Ian Getreu, 1976. It appears in the first few pages of the book. His book was the result of his working at Tektronix in their STS (semiconductor test systems) group in the late 1960s and early 1970s and was initially published by Tektronix. It is currently available via Lulu.
If you want to see the original Ebers-Moll equations, which use \$V_{bc}\$ and \$V_{be}\$, then I conveniently posted them here, "Why is Vbc absent from bjt equations?," as a response to that question. You don't have to go back to the original paper if such a summary is okay. Also, if your question is an historical one, I could attempt to re-contact Ian and see if he remembers where he got his chart. He may recall.
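For a quick feel of how both junction voltages enter, here is a sketch of the Ebers-Moll equations in their common textbook (injection) form for an NPN. It is not copied from the linked post, and the parameter values (`I_S`, `ALPHA_F`, `ALPHA_R`, `V_T`) are assumptions picked for illustration only. Emitter current is taken as flowing out of the emitter, collector and base currents as flowing in.

```python
import math

V_T = 0.02585      # thermal voltage near room temperature (assumed)
I_S = 1e-14        # assumed transport saturation current
ALPHA_F = 0.99     # assumed forward common-base gain
ALPHA_R = 0.5      # assumed reverse common-base gain
I_ES = I_S / ALPHA_F   # emitter diode saturation current (reciprocity)
I_CS = I_S / ALPHA_R   # collector diode saturation current (reciprocity)

def ebers_moll(v_be, v_bc):
    """Return (Ie, Ic, Ib) for the given junction voltages."""
    f = math.exp(v_be / V_T) - 1.0   # base-emitter diode term
    r = math.exp(v_bc / V_T) - 1.0   # base-collector diode term
    i_e = I_ES * f - ALPHA_R * I_CS * r
    i_c = ALPHA_F * I_ES * f - I_CS * r
    i_b = i_e - i_c
    return i_e, i_c, i_b

for label, v_be, v_bc in (("forward-active", 0.65, -5.0),
                          ("saturation", 0.70, 0.65)):
    i_e, i_c, i_b = ebers_moll(v_be, v_bc)
    print(f"{label:15s} Ic = {i_c * 1e3:8.4f} mA  Ib = {i_b * 1e3:8.4f} mA  "
          f"Ic/Ib = {i_c / i_b:6.1f}")
```

With the base-collector junction reverse biased, Ic/Ib sits near the forward Beta implied by `ALPHA_F`; once both junctions are forward biased the ratio drops, which is the saturation behavior discussed above.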
EDIT AGAIN: I'm adding a chart taken from a 1979 edition of Jacob Millman's "Microelectronics: Digital and Analog Circuits and Systems." This is from the top of page 61, Section 3-2:
Hopefully, that helps more. Such a chart should be readily available in nearly any introduction text on semiconductors.
You now have both a quantitative description that you can get from this post where I provide three separate, but equivalent, quantitative DC views of the BJT and also a qualitative description in the above diagram, as well, which illustrates both minority and majority carriers. It doesn't get more complete than that in a post on EE.SE.