The first key, so they say, to understanding BJT behaviour is to understand that it's driven by minority-carrier behaviour. In an NPN device, that means that electrons in the p-type base region control the behaviour.
I think you captured that in your description, but most of the rest of what you wrote doesn't fit the usual way of describing the physics.
Since the base is very thin in relation to the collector and emitter, ... there are not many holes available to be recombined with emitter electrons. The emitter on the other hand is a heavily doped N+ material with many,many electrons in the conduction band.
This is the only part of what you wrote that makes sense. The forward bias on the b-e junction injects excess electrons (minority carriers) into the base region. There are not enough holes to recombine with those electrons instantaneously, so the region of excess electrons extends some distance from the edge of the depletion region associated with the b-e junction. If it extends far enough, it will reach the opposite depletion region (for the c-b junction). Any electrons that get to that depletion region are quickly swept away by its electric field, and that creates the collector current.
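To put a rough number on "if it extends far enough", here is a minimal sketch using the textbook base transport factor, 1/cosh(W/L_n). It assumes a uniformly doped base, low-level injection and purely diffusive transport; the names `base_width` and `diffusion_length` are mine, not anything from the question.

```python
import math

def base_transport_factor(base_width, diffusion_length):
    """Fraction of injected electrons that cross the base without recombining.

    Textbook result for a uniformly doped base with pure diffusion:
    alpha_T = 1 / cosh(W / L_n). Both arguments must use the same unit.
    """
    return 1.0 / math.cosh(base_width / diffusion_length)

# Sweep the ratio W / L_n: a thin base (small ratio) lets almost every
# injected electron reach the c-b depletion region.
for ratio in (0.05, 0.1, 0.5, 1.0, 3.0):
    print(f"W/L_n = {ratio:4.2f}  ->  alpha_T = {base_transport_factor(ratio, 1.0):.4f}")
```

With a base much thinner than the diffusion length, nearly all of the injected electrons become collector current, which is why the thin base matters.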
OK, so how is entropy involved?
A key point is that the spread of excess electrons away from the b-e junction is described by diffusion. And diffusion is, in some sense, a process that takes a low-entropy situation (a large number of particles segregated in one part of a volume) and turns it into a high-entropy situation (particles spread evenly across a volume).
So when you talk about "a high entropy of electrons", you actually have it backwards. Diffusion actually acts to increase entropy, not reduce it.
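If you want to see the direction of the entropy change for yourself, here is a toy sketch: a 1-D explicit finite-difference diffusion of particles that start bunched up at one end of a box, with the Shannon entropy of the normalized distribution tracked over time. The grid size, the step factor `alpha` and the no-flux walls are illustrative assumptions, not a model of a real base region.

```python
import math

def diffuse(density, steps, alpha=0.25):
    """Explicit finite-difference diffusion with no-flux (reflecting) walls.

    alpha = D * dt / dx**2 must be <= 0.5 for the scheme to be stable.
    """
    d = list(density)
    for _ in range(steps):
        new = d[:]
        for i in range(len(d)):
            left = d[i - 1] if i > 0 else d[i]             # mirror at the left wall
            right = d[i + 1] if i < len(d) - 1 else d[i]   # mirror at the right wall
            new[i] = d[i] + alpha * (left - 2 * d[i] + right)
        d = new
    return d

def shannon_entropy(density):
    """Entropy (in nats) of the density treated as a probability distribution."""
    total = sum(density)
    return -sum((p / total) * math.log(p / total) for p in density if p > 0)

# All particles start in the first 5 of 50 cells (a low-entropy arrangement).
start = [1.0] * 5 + [0.0] * 45
step_counts = (0, 10, 100, 1000)
entropies = [shannon_entropy(diffuse(start, n)) for n in step_counts]

for n, s in zip(step_counts, entropies):
    print(f"after {n:4d} steps: entropy = {s:.3f} (maximum possible = {math.log(50):.3f})")
```

The entropy only ever climbs towards the uniform-distribution limit as the particles spread out; it never goes the other way.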
The idea that excess electrons are "effectively doping and shrinking the base/collector depletion region into N-type material" also doesn't make any sense. The excess carriers don't affect the extent of the c-b depletion region much. Electrons that reach the c-b depletion region are simply swept through by the electric field.
Best Answer
Exceeding absolute maximum ratings is always bad; that's why they are called absolute. And almost every datasheet I've ever seen states something like this:
The amount of damage can vary from a negligible (for your application) degradation of the performance of the device to a complete fireworks show. Of course it's intuitive that the larger the overstress and the longer it is applied, the worse the consequences.
Anyway, once you go beyond the maximum ratings, even by the tiniest amount and for the shortest time interval, you can no longer trust the datasheet. This doesn't mean it is 100% certain that your device has been damaged, but it may well be. You could be lucky and have a part which, due to manufacturing spread, is more capable than the average of its kind, but there is no way to know. That is, for small overstresses such a "stronger specimen" could survive with no damage whatsoever, but to be sure it really wasn't damaged you'd have to fully characterize the device again, i.e. repeat the same procedure the manufacturer follows when collecting the data for the datasheet.
Of course, at hobbyist level such a procedure is practically infeasible, so you are left with simple tests, such as checking whether the junctions of a BJT still behave as diodes using a multimeter with a diode-check function. Even a multimeter with a BJT tester is almost useless here, since it measures the actual hFE under some unknown conditions, and that tells you nothing about whether the performance was better before the overstress "accident".
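As a rough illustration of that diode check, here is a small sketch that takes the readings you would note down from a multimeter in diode-test mode and flags junctions that no longer look like diodes. The function names and the 0.45 V to 0.90 V "plausible silicon junction" window are my own assumptions, so adjust them for your meter and part. Passing this check only rules out gross damage; it says nothing about degraded parameters.

```python
def junction_looks_ok(forward_v, reverse_v, fwd_range=(0.45, 0.90)):
    """Return True if a junction reads like a healthy silicon diode.

    forward_v / reverse_v are diode-test readings in volts; use None when
    the meter shows "OL" (open circuit). fwd_range is an assumed window.
    """
    forward_ok = forward_v is not None and fwd_range[0] <= forward_v <= fwd_range[1]
    reverse_ok = reverse_v is None  # a good junction blocks in reverse
    return forward_ok and reverse_ok

def bjt_quick_check(be, bc):
    """be and bc are (forward, reverse) reading pairs for the two junctions."""
    return {
        "base-emitter": junction_looks_ok(*be),
        "base-collector": junction_looks_ok(*bc),
    }

# Example readings: a healthy-looking part, then one with a shorted b-e junction.
print(bjt_quick_check(be=(0.68, None), bc=(0.66, None)))
print(bjt_quick_check(be=(0.02, 0.02), bc=(0.66, None)))
```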
You could do a simple functional test, i.e. plug the BJT into a simple circuit that you know is not critical, meaning it doesn't rely on tight part specifications (such as using the BJT to switch on an LED), and see if it works. That will tell you that the BJT is still "usable", for some foggy definition of usable, of course.
The exact mechanisms by which a device gets damaged are a very broad topic: you may want to google for "BJT failure modes" and see what pops up.
And as for "why the damage is permanent": why would you expect a physical object not to have limits? If I load a nylon rope too much, it breaks. Leaving aside how exactly the rupture develops, would you expect a rope to be indestructible? The same goes for electronic devices. The maximum ratings are just that: the limits beyond which the device is likely to break.