This is a simplified view of the problem:
In normal operation (Vin is positive, and Vout has reached its target value), the (high) current flows through the channels of Q2 and Q1. No current flows through the body diode of Q2 in normal operation (it does flow during startup, as The Photon says). This is exactly the reason to have two MOSFETs (instead of one MOSFET (Q1) and one Schottky diode in place of Q2): to avoid the voltage drop that we would otherwise have across that Schottky diode.
Why does the current flow through both channels, in normal operation? Because both are on. The LT4356 uses an internal charge pump to generate a voltage Vg higher than Vgs_th for those MOSFETs (which is 2.5 V max). Imagine Vg is around 10 V (actually, between 4.5 V and 14 V). Vg1 is 10 V above Vout. So, Vgs(Q1)=10 V > 2.5 V=Vgs_th, and Q1 is on.
Q3 is on only for negative voltages below -1.4 V. So, in normal operation, Q3 is off. No current flows through the 1 Mohm resistor, and Q2 sees exactly the same Vg as Q1. So, Vg1=Vg2. How about Vgs for Q2? How much is it? Well, if Vout is (for instance) designed to be +12 V, and Vg is 10 V above it, then Vg1=Vg2=22 V. Q2 is on if its Vgs is higher than 2.5 V. For Q2 not to be on, Vin should be higher than Vg2-Vgs_th=22-2.5=19.5 V (!), which will never happen, in normal conditions. In normal conditions, Vin will be only slightly above Vout. So, Q2 is on in normal operation, and its body diode is just short circuited, contributing to zero voltage drop (which was the reason to put there a second MOSFET).
When Vin is reversed, and below -1.4 V, Q3 is on, which makes Vgs(Q2)=0, so there is no way that Q2 can conduct. Also, its body diode will be reverse biased, so it won't conduct, either. Since Q2 is in series with Q1, it does not matter what Q1 does, because no current will flow through either of them, and the load will be safe.
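The gate-voltage reasoning above can be checked with a few lines of arithmetic. This is only a sketch of the simplified model in this answer, not of the LT4356 itself: the function name, the 10 V charge-pump boost and the way Q3 is modelled (as an ideal clamp of Q2's gate to Vin) are assumptions made for illustration.

```python
def mosfet_states(vin, vout, boost=10.0, vgs_th=2.5, q3_on_below=-1.4):
    """Return (q1_on, q2_on) for the simplified two-MOSFET model.

    Assumptions (illustrative only): the charge pump holds the gate node
    'boost' volts above Vout, Q1's source is at Vout, Q2's source is at Vin,
    and Q3 is an ideal clamp that shorts Q2's gate to Vin when it is on.
    """
    vg1 = vout + boost
    if vin < q3_on_below:
        # Q3 is on: Q2's gate is pulled to its source, so Vgs(Q2) = 0.
        vg2 = vin
    else:
        # Q3 is off: no current in the 1 Mohm resistor, so Vg2 = Vg1.
        vg2 = vg1
    vgs_q1 = vg1 - vout
    vgs_q2 = vg2 - vin
    return vgs_q1 > vgs_th, vgs_q2 > vgs_th


# Normal operation: Vin slightly above a 12 V output.
print(mosfet_states(vin=12.1, vout=12.0))
# Reversed input: Q2 is held off, so the series path is blocked.
print(mosfet_states(vin=-12.0, vout=0.0))
```

In the reversed case the model still reports Q1 as on, which matches the point made above: Q1's state is irrelevant once Q2 and its body diode both block.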
More: the reason for this complexity is that a silicon MOSFET is a device that can carry current in both directions, but can block in only one direction (due to the unavoidable body diode). If that body diode weren't there, a MOSFET would be an ideal switch (able to carry and block in both directions), and a single MOSFET would be enough. Given that the diode is there, the only way to build a bidirectional-carrying, bidirectional-blocking switch with them is by placing two of them in anti-series, with their gates tied together and also either a) (ideally) their sources tied together, or b) their drains tied together (as is the case here).
GaAs MOSFETs don't have the body diode, and therefore a single device works as an ideal switch.
Summary:
Remove all diodes except maybe D1.
Use a lower Rdson FET if you can.
Change R6 to as low as you can for now - 100 Ohms would not be too low, but ...
Provide an active FET gate pulldown - just an emitter follower, for much improved turn off times.
Run PWM as slow as you can tolerate.
You do not need D1 and D2 - remove either one.
If PWM+ is always positive wrt PWM- you do not need either of D1 or D2.
If PWM+ - PWM- is AC then placing D2 across the opto input with reverse polarity will place about equal load on PWM with both polarities of input. This may or may not matter.
D4 is not needed.
D3 is not needed.
Now for the hot stuff. You can like to use whatever you wish, but you may have to rewrite the laws of physics. The IRF540 has more Rdson than you want even when driven well - and the 4.7K turnoff resistor ensures that it has a slow and horribly hot turn off. Changing R6 to as low as you can stand will help heaps. With eg R6 = 100 R, the R5-R6 current when on is <= about 100 mA, which is sad but only maybe 1% of your load current. Adding an emitter follower pulldown driver for turnoff will help immensely.
And it will still almost certainly be too hot. See below.
Changing to a MOSFET with Rdson more like 10 milliOhm or lower will help immensely and actually allow you to do what you want IF you do it properly.
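To get a feel for why the 4.7K turnoff resistor is so slow, here is a rough RC estimate. The gate capacitance below is an assumed placeholder of about 2 nF, not a figure from this answer, so read the real Ciss or gate charge from the datasheet. The 12 V supply is taken from the load figures further down, and the R6 current is an upper bound that ignores R5.

```python
C_GATE = 2e-9      # ASSUMED gate capacitance in farads (placeholder value)
V_SUPPLY = 12.0    # supply voltage, from the 84 W at 12 V load figure
I_LOAD = 7.0       # load current in amps


def gate_time_constant(r_ohms, c_gate=C_GATE):
    """RC time constant of the gate discharging through r_ohms."""
    return r_ohms * c_gate


def resistor_current(r_ohms, v=V_SUPPLY):
    """Upper bound on the on-state current wasted in the resistor."""
    return v / r_ohms


for r in (4700.0, 100.0):
    tau_us = gate_time_constant(r) * 1e6
    waste = resistor_current(r)
    print(f"R = {r:6.0f} ohm: tau ~ {tau_us:.2f} us, "
          f"on-current <= {waste * 1e3:.0f} mA "
          f"({100 * waste / I_LOAD:.1f}% of load)")
```

Whatever the real capacitance is, the ratio of the two time constants is fixed at 47:1, which is the point of lowering R6. An active pulldown shortens the turn off further without the standing current.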
IRF540 data sheet and another like unto it - almost
At 84 Watts load at 12V Il = 7A.
If you have say about 10V Vgs then at 10A an IRF540 has
at 25C by fig 1 - about Vds = 0.35V TYPICAL
and at 175C by fig 2 - about Vds = 1V TYPICAL
In the first case at 7A you have about 2.5W dissipation at 25C TYPICAL and in the second case at 175C about 7 Watts.
Both of those are TYPICAL and both are with 20 uS pulses. ie reality will usually be worse.
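The arithmetic above can be reproduced as follows. The first pair of figures multiplies the Vds read at 10A by the 7A load current, as done in the text, which is on the pessimistic side. The second pair is an alternative sketch that assumes the FET behaves as a fixed resistance Vds/10A, which is an assumption of this example rather than something stated above.

```python
P_LOAD = 84.0      # watts
V_SUPPLY = 12.0    # volts
I_TEST = 10.0      # amps, current at which Vds was read from the curves

i_load = P_LOAD / V_SUPPLY   # load current in amps


def loss_from_vds(vds, i):
    """Dissipation using the 10 A Vds figure directly (pessimistic)."""
    return vds * i


def loss_from_rds(vds, i, i_test=I_TEST):
    """Dissipation assuming a fixed resistance of vds / i_test (ASSUMED model)."""
    return i ** 2 * (vds / i_test)


for label, vds in (("25C", 0.35), ("175C", 1.0)):
    print(f"{label}: Vds x I = {loss_from_vds(vds, i_load):.2f} W, "
          f"I^2 x Rds = {loss_from_rds(vds, i_load):.2f} W")
```

Either way these are typical, pulsed figures, so real dissipation will usually be worse.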
The TO220 pkg has 62 C/W Rja (junction to ambient, no heatsink) and
the D^2Pak claims 40 C/W Rja with 1" square FR4 PCB.
So TO220 rise with no heatsink
>= 62 C/W x 0.35 W at 25C = 22C rise
so Tj ~= 47C.
This is enough to start it up the pernicious -> hotter -> more Rdson -> more temperature rise -> more Rdson ... curve.
At 1 Watt you'd have 62C rise, so Tj ~= 90 C.
So it should not get that hot TYPICALLY as the 1 Watt is when Tj = 175C
were it not for the fact that this data sheet was influenced by marketers and tobacco salesmen and they use a 20 uS pulse width. And it's also typical. So say 100-120 C would be expectable. Your very very very slow turn off will add the coup de grace.
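The hotter -> more Rdson -> hotter loop can be sketched numerically. This is a rough model under stated assumptions, not a datasheet calculation: Rdson is taken as a straight line between 35 milliohm at 25C and 100 milliohm at 175C (the values implied by 0.35V and 1V at 10A), dissipation is I^2 x Rdson at 7A, and the 5 C/W heatsinked figure is a hypothetical value chosen only for comparison.

```python
def settle_junction_temp(i_load, r_ja, t_amb=25.0, t_max=175.0, steps=200):
    """Iterate Tj = Ta + Rja * I^2 * Rdson(Tj).

    Returns the settled junction temperature in C, or None if the
    iteration climbs past t_max (treated here as thermal runaway).
    """
    def rds_on(tj):
        # ASSUMED linear fit: 35 mohm at 25C rising to 100 mohm at 175C.
        return 0.035 + (0.100 - 0.035) * (tj - 25.0) / 150.0

    tj = t_amb
    for _ in range(steps):
        tj_next = t_amb + r_ja * i_load ** 2 * rds_on(tj)
        if tj_next > t_max:
            return None
        if abs(tj_next - tj) < 0.01:
            return tj_next
        tj = tj_next
    return tj


print(settle_junction_temp(7.0, r_ja=62.0))  # TO220, no heatsink
print(settle_junction_temp(7.0, r_ja=5.0))   # hypothetical heatsinked case
```

Under these assumptions the no-heatsink case does not settle below the 175C limit at all, which is harsher than the estimate above but points the same way: at 7A an unheatsinked IRF540 will be too hot.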
The circuit below shows two emitter followers as a gate driver.
Added:
Related only - high side driving 'trick'.
This is Olin's P Channel high side FET gate driver cct.
He says it achieves 200 nano-second switching.
Note the special magic from having R14 present, what that does to how Q2 works, and the roles of R15 and R14. You should understand it yourself, but:
R14 makes Q2 an emitter follower "sort of". R14 is now driven to 1 Vbe below Vin high. Note the "no series resistor" note.
HOWEVER, R15 is (here) 5 x R14, so as the R14 voltage rises R15 will drop 5 x as much.
So if Vin rises from 0V to 3.3V, the R14 voltage rises to ~= (3.3-Vbe) = say 2.7V, and R15 will drop 5 x 2.7 ~= 13.5V. This provides all the drive needed, but the gate of the FET does not need a zener clamp to prevent overdrive.
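The gate drive produced by this stage is simple to estimate. The sketch below assumes the resistor ratio of 5 used in the worked figure (5 x 2.7 ~= 13.5V), a Vbe of 0.6V, and that Q2's collector current equals its emitter current (base current ignored). The function name is made up for this example, and the result cannot exceed what the supply rail allows, which is not modelled here.

```python
def gate_drive_swing(v_in, r15_over_r14=5.0, v_be=0.6):
    """Voltage dropped across R15 for a given logic-high input.

    R14 sits one Vbe below Vin, and the same current flows in R15
    (base current ignored), so the drop scales by R15 / R14.
    """
    v_r14 = max(v_in - v_be, 0.0)   # emitter follower: Vin minus one Vbe
    return r15_over_r14 * v_r14


# 3.3 V logic high gives roughly 13.5 V of gate drive.
print(round(gate_drive_swing(3.3), 2))
# Input low: Q2 is off and there is no drop across R15.
print(gate_drive_swing(0.0))
```

Because the drive is set by a resistor ratio and the logic level rather than by the supply voltage, it stays bounded as the supply rises, which is why no zener clamp is needed.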
The idea of the transistors is that:
So the left transistor acts as a matched diode for the right transistor. The exact component values may hinge a bit on the chosen MOSFET and PNP matched pair. Similar tricks are available in other ways, but this is the most well known one.
If you tie the MOSFET's gate directly to ground, like this:
You are effectively creating an always-on-link, with possibly some adjusted start-up behaviour. Usually this start-up behaviour is enhanced using capacitors and/or resistors on the gate-path.
Because if the left is high, and the right isn't, the right will get lifted up by the body diode, then the source becomes higher than the gate, causing the FET to turn on. If the right goes high, the source goes up relative to the gate right away and again the FET turns on. Not much for diode action.
In either case usually you'd seek a FET that has a very low On-Resistance at a gate voltage at least 10 to 20 percent below the minimum operating voltage. So if you're using it on 3.3V, you'd want a FET that's fully on at 2.5V or so, which would probably mean 1.2V or less threshold, but that's down to datasheets.