Your circuit is a bit odd. The purpose of an instrumentation amplifier is to provide very high input impedance at the input stage and a good common-mode rejection ratio (CMRR). In your case the components at the input stage are not symmetric, which will degrade CMRR. You have also placed a resistor between the two inputs. Why?
Vref needs to be driven from a low impedance, so connecting an opamp output to the Vref input is the way to go. There is another strange thing, though: you have one measurement named Setpoint and the other named Sense. In control theory you would need (Setpoint-Sense), but what you'll get is (Setpoint+Sense). You can still swap the Sense inputs to get a negative value.
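To make the sign issue concrete, here is a minimal sketch of an ideal instrumentation amplifier's transfer function. The function name and the example voltages are made up for illustration and are not taken from your schematic; it only shows that the error term comes out as (Setpoint-Sense) when Sense goes to the inverting input.

```python
def inamp_output(v_plus, v_minus, gain=1.0, v_ref=0.0):
    """Ideal instrumentation amplifier: Vout = gain * (V+ - V-) + Vref.

    Hypothetical helper for illustration; a real part adds offset,
    gain error and finite CMRR on top of this.
    """
    return gain * (v_plus - v_minus) + v_ref


# Assumed example values: Setpoint on the non-inverting input,
# Sense on the inverting input gives the control error directly.
setpoint, sense = 2.5, 2.0
error = inamp_output(setpoint, sense)
print(f"error = {error:.3f} V")
```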
It's simple: manufacturers make what customers will buy! It's the same reason why Ferrari won't put a trailer attachment at the rear of their cars...
Price is a very important part of this, of course, and price is tied to silicon area, process, yield, and of course packaging.
For example, an opamp with +/-12V supplies and 1A output current will dissipate several watts, so it can't go in a standard opamp package like SO-8. Its target audience thus shrinks to customers who are willing to use this specific package, which won't be standard, so there won't be a second source in case the first manufacturer is out of stock. The non-standard package also means it can't be marketed to customers who want a standard opamp, unless it's something like a SO-8 with a thermal pad, which has its own set of issues: it's a lot worse thermally than the TO-220-5 LM1875, for example.
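A rough back-of-the-envelope sketch of where the "several watts" comes from. It assumes a DC output that sources current from the positive rail and ignores quiescent current; the function name and the output voltages are illustrative choices, not datasheet figures.

```python
def output_stage_dissipation(v_supply, v_out, i_out):
    """Power burned in the conducting output transistor, in watts.

    Assumption: DC output sourcing i_out from the positive rail, so the
    transistor drops (v_supply - v_out). Quiescent current is ignored.
    """
    return (v_supply - v_out) * i_out


# +12 V rail, 1 A load current, a few assumed output voltages.
for v_out in (2.0, 6.0, 10.0):
    p = output_stage_dissipation(12.0, v_out, 1.0)
    print(f"Vout = {v_out:4.1f} V -> {p:.1f} W in the output stage")
```

The dissipation is worst when the output sits far from the rail while still delivering full current, which is exactly the case a small package cannot survive.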
Contrast this with, say, a NE5532. In the unlikely case it's out of stock, there are tons of equivalents. Package is standard... it's a jellybean part.
We haven't gotten into actual tech limitations yet, and already the economics favor the chip that will cater to the largest audience, in this case small signal, low current circuits. Even if the thermally enhanced package adds only 5c to the cost of the beefier opamp, it's 5c too much if you don't need the capability.
Now, higher output current requires larger output transistors, thus more silicon area: it will be more expensive. It will also be slower.
Say you make an ADA4898, a top of the line, ultra-everything opamp. The process for that is likely to be damn expensive, and all the chips that fail the stringent spec tests are trashed, so yield can be an issue. The folks at AD aren't going to enlarge the output transistors if this only interests 5% of the customers for this product, because it would make the part more expensive for everyone else, and they'd pick another opamp...
Maybe the manufacturers can't do it? Well, nope. If there is a market, they will rush into it. Check ADA4311 for example, it's a driver for PLC/DSL/Whatever twisted pair, it's fast, low noise, high current, etc. But it's a current feedback amp with high offset and crummy DC specs because these don't matter to the application, so going for a precision design would only increase cost.
Now check LM1875, an audio power opamp. It's old, crummy and slow, so it can probably be manufactured on a cheap old process with high yield. It could be better, sure... but it's good enough for the application.
If you want an opamp with good small-signal specs (DC offset, noise, etc) and high current drive, the simplest is to stick a power output stage on it, or make a compound with the first opamp driving a beefier one. If the second opamp is faster, no extra compensation is required. If it is slower, compensation is required.
Best Answer
Off the top of my head:
There are basically a few things you'd want from a "perfect" amplifier, but which are hard to realize within a single one:
In a three-opamp differential amplifier (and I'd assume that things like the INA128 actually are made of three opamps!), the input impedance of the output opamp doesn't really matter, so you can use something with a lower input impedance but a high output drive strength. In fact, I'd speculate that it might even make sense to use BJTs for the input stage differential amplifier of that third opamp: you'd be sinking exactly what you need, and:
That third opamp would ideally have a high CMRR, and I've been told[citation needed] it's a bit easier to use laser-trimmed on-die resistors to make this thing a little more symmetrical if these resistors are lower in value, because more current flows through them.
So, wild guess: Third opamp input differential stage: BJTs, rest FET, with a relatively fat FET pair at the output.
The two input Opamps wouldn't need as much CMRR (in fact, none, as long as they react identically), but a high input impedance – an ideal use case for FET inputs.
Friis' noise formula tells us that these two mostly define the noise figure of the overall circuit, so it's at least likely the stages after the input stage are also BJTs. A significant amount of the overall voltage gain might happen here (for exactly Friis' noise formula reasons).
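For reference, a small sketch of the Friis cascade formula, F = F1 + (F2-1)/G1 + (F3-1)/(G1*G2) + ..., with noise factors and power gains as linear ratios (not dB). The stage numbers are made up to show the effect and are not measurements of any real instrumentation amplifier.

```python
def friis_noise_factor(stages):
    """Total noise factor of a cascade.

    stages: list of (noise_factor, power_gain) tuples, both linear,
    ordered from input to output.
    """
    total = 0.0
    gain_so_far = 1.0  # product of the gains of all preceding stages
    for index, (noise_factor, gain) in enumerate(stages):
        if index == 0:
            total = noise_factor
        else:
            total += (noise_factor - 1.0) / gain_so_far
        gain_so_far *= gain
    return total


# Assumed numbers: a quiet first stage with gain 10, then a noisy stage.
quiet_first = friis_noise_factor([(2.0, 10.0), (11.0, 10.0)])
noisy_first = friis_noise_factor([(11.0, 10.0), (2.0, 10.0)])
print(f"quiet stage first: F = {quiet_first:.2f}")
print(f"noisy stage first: F = {noisy_first:.2f}")
```

With gain up front, the later stage's contribution is divided down, which is the argument for putting a good chunk of the voltage gain in the input stage.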
I mentioned laser-trimmed resistors: Since you need to really get the resistors in an instrumentation amplifier right, the streaks of weakly doped silicon that make up resistors on ICs are in this case designed to be "zappable" with a laser during production – meaning they can be adjusted after / while being measured by calibrated equipment.
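To get a feel for why trimming matters, here is a sketch that computes the CMRR of the classic four-resistor difference stage around an ideal opamp. The resistor naming is my own assumption (R1 input and R2 feedback on the inverting side, R3 input and R4 to the reference on the non-inverting side), and the 10k values are picked for illustration only.

```python
import math


def cmrr_db(r1, r2, r3, r4):
    """CMRR in dB of a four-resistor difference amplifier, ideal opamp.

    Assumed topology: Vout = V2 * r4/(r3+r4) * (1 + r2/r1) - V1 * r2/r1.
    """
    gain_pos = (r4 / (r3 + r4)) * (1.0 + r2 / r1)  # gain from V2
    gain_neg = r2 / r1                              # magnitude of gain from V1
    a_diff = 0.5 * (gain_pos + gain_neg)
    # Common-mode gain in closed form, exactly zero when r1*r4 == r2*r3.
    a_cm = (r1 * r4 - r2 * r3) / (r1 * (r3 + r4))
    if a_cm == 0.0:
        return math.inf
    return 20.0 * math.log10(abs(a_diff / a_cm))


# One resistor off by 1% versus off by 0.1%, everything else nominal.
for r4 in (10.1e3, 10.01e3):
    print(f"R4 = {r4:.0f} ohm -> CMRR = {cmrr_db(10e3, 10e3, 10e3, r4):.1f} dB")
```

A tenfold improvement in matching buys roughly 20 dB of CMRR in this model, so adjusting the resistors against calibrated equipment pays off directly.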
Because I can:
The three opamps for which I could find die shots (which I'm not competent to interpret):