At one point in my career, I ran the USB business for a big semiconductor company. The best result I remember was an NEC SATA controller capable of pushing 320 Mbps of actual data throughput for mass storage; current SATA drives are probably capable of this or slightly more. This was using BOT (Bulk-Only Transport, a mass storage protocol that runs on USB).
I could give a detailed technical answer, but I guess you can deduce it yourself. What you need to see is that this is an ecosystem play: any significant improvement would require somebody like Microsoft to change and optimize their stack, which is not going to happen. Interoperability is far more important than speed. Existing stacks carefully cover for the mistakes of the slew of devices out there. When the USB2 spec came out, the initial devices probably didn't conform to it very well, since the spec was buggy, the certification system was buggy, and so on. If you build a home-brew system using Linux, or custom USB host drivers for Windows, together with a fast device controller, you can probably get close to the theoretical limits.
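To put the 320 Mbps figure next to those theoretical limits, here is a rough back-of-the-envelope sketch. It assumes the figure refers to USB 2.0 high-speed bulk transfers and that "Mbps" means megabits per second; neither is stated outright above. The ceiling uses the spec's upper bound of thirteen 512-byte bulk packets per 125 µs microframe, and the function names are made up for illustration.

```python
# Rough comparison of an observed bulk throughput against the USB 2.0
# high-speed bulk ceiling. Assumes high-speed mode and megabits per second.

HS_SIGNALING_BPS = 480_000_000      # raw high-speed signaling rate, bits/s
MICROFRAMES_PER_SEC = 8000          # one microframe every 125 us
BULK_MAX_PACKET = 512               # bytes, high-speed bulk max packet size
BULK_PACKETS_PER_MICROFRAME = 13    # spec upper bound for 512-byte bulk packets


def bulk_ceiling_bytes_per_sec():
    """Best-case bulk payload rate on an otherwise idle high-speed bus."""
    return BULK_PACKETS_PER_MICROFRAME * BULK_MAX_PACKET * MICROFRAMES_PER_SEC


def mbps_to_bytes_per_sec(mbps):
    """Convert megabits per second to bytes per second."""
    return mbps * 1_000_000 / 8


ceiling = bulk_ceiling_bytes_per_sec()
observed = mbps_to_bytes_per_sec(320)

print(f"bulk ceiling : {ceiling / 1e6:.3f} MB/s")
print(f"observed     : {observed / 1e6:.3f} MB/s")
print(f"of ceiling   : {observed / ceiling:.1%}")
print(f"of raw rate  : {320_000_000 / HS_SIGNALING_BPS:.1%}")
```

Under these assumptions the observed rate is about three quarters of the bulk payload ceiling, which is consistent with the point that the remaining gap is mostly a matter of host stack and controller behaviour.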
In terms of streaming, isochronous (ISO) transfers are supposed to be very fast, but controllers do not implement them very well, since 95% of apps use bulk transfers.
As a bonus insight: if you build a hub IC today and follow the spec to the letter, you will sell practically zero chips. If you know all the bugs in the market and make sure your hub IC can tolerate them, you can probably get into the market. I am still amazed at how well USB works, given the amount of bad software and the number of bad chips out there.
The generic answer to this question is yes, the VBUS (+5V from cable) must be connected to the device even if it is self-powered. The reason is as follows:
To start the connect process on the host side, the device must pull up D+ (for an FS/HS device) or D- (for an LS device).
However, the USB specification has a mandatory requirement that no USB device should source any current on any interface pin unless it is connected to a cable. See section 7.1.5.1, which reads:
The voltage source on the pull-up resistor must be derived from or
controlled by the power supplied on the USB cable such that when VBUS
is removed, the pull-up resistor does not supply current on the data
line to which it is attached.
If a USB device doesn't have this control, one of the data lines will be a source of current. Premature assertion of pull-ups was a source of problems for some legacy USB hosts. That's why this rule was instituted, and there is a special test for it in the USB-IF certification program.
Therefore, VBUS is an important "side-band" signal in the USB connect protocol. As such, normal USB device ICs have a separate input pin to sense the presence of a USB host. Some ICs (e.g. FT232H, MCP2221) skip this requirement, on the assumption that the chip will only be used in bus-powered configurations, where the pull-up control requirement is satisfied automatically. When these chips are used in self-powered designs, some extra circuitry is needed to tie the enabling of the pull-up to the presence of VBUS on the USB port.
Regarding the USB connect "handshake" protocol, USB doesn't rely on current drawn from VBUS. The protocol is this: the host port must have VBUS active; VBUS is connected to the device; the device sees VBUS and connects a 1.5k pull-up to one of the D+/D- wires; the host sees this connect and, after a 100 ms delay, asserts USB reset signaling (SE0 etc.).
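The sequence above, together with the rule that the pull-up must follow VBUS, can be sketched as a toy model. This is only an illustration of the ordering of events, not firmware for any particular chip; the `Device` class, `connect_sequence` function, and event strings are all made up for the example.

```python
# Toy model of the USB connect sequence with a VBUS-gated pull-up.
# All names are illustrative; real devices do this in hardware or firmware.
from dataclasses import dataclass

PULLUP_OHMS = 1500   # the 1.5k pull-up mentioned above
DEBOUNCE_MS = 100    # delay before the host issues reset, as described above


@dataclass
class Device:
    low_speed: bool = False
    vbus_sensed: bool = False

    def pullup_line(self):
        """Return the pulled-up data line, or None while VBUS is absent."""
        if not self.vbus_sensed:
            return None  # no VBUS: the device must not source current on D+/D-
        return "D-" if self.low_speed else "D+"


def connect_sequence(host_vbus_on, dev):
    """Return the ordered list of events for one attach attempt."""
    events = []
    dev.vbus_sensed = host_vbus_on
    if host_vbus_on:
        events.append("host: VBUS active")
        events.append("device: VBUS sensed")
    line = dev.pullup_line()
    if line is None:
        events.append("device: pull-up disabled, host sees no connect")
        return events
    events.append(f"device: {PULLUP_OHMS} ohm pull-up on {line}")
    events.append(f"host: connect detected on {line}")
    events.append(f"host: wait {DEBOUNCE_MS} ms")
    events.append("host: drive USB reset (SE0)")
    return events


for event in connect_sequence(True, Device()):
    print(event)
```

A self-powered device that skips the VBUS check would behave as if `vbus_sensed` were always true, which is exactly the premature pull-up case the certification test is meant to catch.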
I believe the original 1 ms (1 kHz) start-of-frame interrupt was meant to give devices cheap but accurate timing. The accuracy spec on it is rather tight.
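As a small illustration of using start of frame as a time base: each SOF packet carries an 11-bit frame number that increments once per 1 ms frame on a full-speed bus, so the difference between two frame numbers gives elapsed milliseconds as long as the wraparound is handled. How firmware reads the frame number is hardware-specific, so this sketch simply takes the two values as integers; `elapsed_frames` is a made-up helper name.

```python
# Elapsed time from SOF frame numbers, assuming a full-speed bus (1 ms frames).
# Only valid for intervals shorter than one full wrap (2048 frames).

FRAME_NUMBER_BITS = 11
FRAME_WRAP = 1 << FRAME_NUMBER_BITS   # frame numbers run 0..2047


def elapsed_frames(start, now):
    """Frames elapsed between two SOF frame numbers, handling wraparound."""
    return (now - start) % FRAME_WRAP


# The counter wrapped from 2047 back to 0 between the two readings.
print(elapsed_frames(2040, 5))
```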
If you don't need accurate timing, or you have your own crystal, then you don't need to know when USB frames start. The host initiates all transactions, even when you are the one sending, so your hardware and firmware will know from the host's requests when to do what.
I've created a bunch of USB devices, and have not used start of frame detection at all so far. If you are using a canned library, then possibly the library uses SOF for some internal timing or to know when to do housekeeping. If the library docs say you need SOF, then you need SOF.