The use of 4 x AA Alkaline would usually be safe BUT does exceed the USB spec (VBUS is specified at 5.25V maximum) and damage may occur in some cases. I have seen ICs in this role with max operating voltages of 5.5V (which is ludicrous) - you'd hope designers had more sense, but it can't be guaranteed.
While some devices may use converters between charge input and battery, many don't (probably most). A LiIon battery has a max charging voltage of 4.2V, so a 5V nominal USB input will usually meet this need with enough headroom for a linear regulator.
A fresh Alkaline cell can be nearly 1.6V - about 1.55V is common, or 6.2V for 4 cells, and up to 6.4V may be seen. There is not much energy in this initial high voltage "tail" and voltage falls to 1.5V or below very quickly.
So, you should be safe, but YMMV, alas.
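The arithmetic above can be checked with a few lines of Python. This is only a sketch: the 5.25V figure is my assumption for the upper USB VBUS limit, the 5.5V figure is the IC limit mentioned above, and the function name is made up for illustration.

```python
# Pack voltage of four series alkaline cells compared with two limits.
CELLS = 4
USB_VBUS_MAX = 5.25  # assumed upper VBUS limit, in volts
IC_MAX = 5.5         # the 5.5V operating limit mentioned above

def pack_voltage(v_cell, cells=CELLS):
    """Open-circuit voltage of `cells` identical cells in series."""
    return v_cell * cells

for v_cell in (1.5, 1.55, 1.6):
    v = pack_voltage(v_cell)
    print(f"{v_cell:.2f} V/cell -> {v:.2f} V pack, "
          f"above USB max: {v > USB_VBUS_MAX}, above 5.5 V: {v > IC_MAX}")
```

Even at 1.5V/cell the pack sits above both limits, which is why the safety argument rests on design margin in the device rather than on the numbers.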
A solution would be to use an LDO (low dropout voltage) regulator, OR a clamp regulator which takes the peak energy out of the battery, OR a series diode to drop 0.4 to 0.8V (Schottky / Silicon).
An LDO is the best solution, but you want as little drop as possible.
A clamp to drain peak battery voltage is unusual but viable. A zener could be used but is too inexact. A TL431 clamp regulator, for example, in a TO92 or other largish package (to get OK dissipation capability) would do. A TL431 plus a transistor would be safer.
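As a rough sketch of how a TL431 clamp threshold might be set, the usual divider relation is V_clamp = V_ref x (1 + R1/R2). The 2.495V reference value, the 10k lower resistor and the 5.25V target are all assumptions for illustration, and the reference input current is ignored.

```python
V_REF = 2.495  # assumed typical TL431 reference voltage, in volts

def upper_resistor(v_clamp, r_lower, v_ref=V_REF):
    """R1 such that V_clamp = V_ref * (1 + R1 / R2), ignoring reference input current."""
    return r_lower * (v_clamp / v_ref - 1.0)

def clamp_voltage(r_upper, r_lower, v_ref=V_REF):
    """Clamp threshold produced by a given divider."""
    return v_ref * (1.0 + r_upper / r_lower)

r2 = 10_000.0                  # hypothetical lower resistor, in ohms
r1 = upper_resistor(5.25, r2)  # hypothetical 5.25V clamp target
print(f"R1 = {r1:.0f} ohm, clamp at {clamp_voltage(r1, r2):.3f} V")
```

A real design would also need to check how much current the clamp has to sink, which is where the package dissipation and the extra transistor come in.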
Series diode is cheap and easy but prevents full battery use. Say the minimum usable battery voltage is 4.6V (it may be higher). At 1.15V/cell there is still some battery capacity left. Adding a Schottky diode raises the minimum battery voltage to 4.6 + 0.4 = 5V, or 1.25V/cell, so some capacity is wasted. At the top end, a 0.4V diode drop gives a maximum voltage at the device of, say, (1.55V x 4 - 0.4) = 5.8V, equivalent to 1.45V/cell. "Almost certainly safe".
Using NiMH works but is more marginal at the bottom end and safer at the top end. At 4.6V, voltage per cell is 1.15V, where NiMH still has modest energy left. At the top end Vmax is, say, 1.35V per cell, maybe 1.4V for short periods at the start. 4 x 1.4V = 5.6V. Very probably safe.
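The series diode trade-off can be put into a small helper. The function name is hypothetical; the 4.6V minimum, 1.55V/cell maximum and the 0.4V / 0.8V drops are the figures used in this answer.

```python
def with_series_diode(v_device_min, v_cell_max, v_diode, cells=4):
    """Return (cell voltage at cutoff, maximum voltage seen by the device)."""
    v_pack_min = v_device_min + v_diode            # pack must supply the diode drop too
    v_device_max = v_cell_max * cells - v_diode    # fresh pack minus the diode drop
    return v_pack_min / cells, v_device_max

for name, v_diode in (("Schottky", 0.4), ("Silicon", 0.8)):
    v_cell_cutoff, v_dev_max = with_series_diode(4.6, 1.55, v_diode)
    print(f"{name}: cutoff at {v_cell_cutoff:.2f} V/cell, "
          f"device sees at most {v_dev_max:.2f} V")
```

The silicon diode gives more protection at the top end but raises the cutoff to 1.35V/cell, which throws away much more of the battery.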
My understanding of your design is that the entire device is on a single PCB, is within a single enclosure, and is connected to the host by a single USB cable. You've integrated a hub onto the PCB to allow both the devices to communicate with the PC. The following answer hinges on these assumptions; if it's made of several separate devices connected by disconnectable cables, then that changes things.
In this case, I suggest that you simply configure the hub to enumerate as a high-power device, and share the resulting 500 mA among the whole board. Interestingly enough, TI's ganged-port sample schematic shows the devices all connected together, even when using their power management IC:
The incoming 5V power supply line (highlighted in blue, as it's one of two nets that we're interested in on this complicated schematic) is connected to a TPS2041 power management IC (a generous description, it's really just a FET that shuts down when it detects 500mA of current being passed). However, all of the inputs are shorted together, and all of the outputs are shorted together as well, and then distributed to each of the downstream ports (the net shown in red).
Basically, they're doing overcurrent protection for all of the downstream sections in a single IC. They have no way of detecting whether they have three low-power (100mA) units, a single high-power unit, or two low-power units and one 300 mA unit. All these options are acceptable based on this reference design. You wrote:
According to the USB specification, a bus-powered hub can provide only one unit per downstream port while drawing max 5 units...
but, to directly answer your question, this design from Texas Instruments (a USB group member and major implementor) shows that you only have to guarantee that the total current is less than 5 units.
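The "total under 5 units" reading can be sketched as a simple sum check. The function name and the last (failing) case are mine, added only to show what would exceed the budget.

```python
UNIT_LOAD_MA = 100   # one USB unit load
BUDGET_UNITS = 5     # what a high-power bus-powered device may draw

def within_budget(loads_ma, budget_ma=UNIT_LOAD_MA * BUDGET_UNITS):
    """True if the ganged downstream loads fit in the shared budget."""
    return sum(loads_ma) <= budget_ma

cases = {
    "three low-power units": [100, 100, 100],
    "one high-power unit": [500],
    "two low-power + one 300 mA": [100, 100, 300],
    "one high-power + one low-power": [500, 100],  # hypothetical failing case
}
for name, loads in cases.items():
    print(f"{name}: {sum(loads)} mA, ok = {within_budget(loads)}")
```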
To solve your problem, the rules state (taken from the excellent USB in a nutshell document):
High power bus powered functions will draw all its power from the bus and cannot draw more than one unit load until it has been configured, after which it can then drain 5 unit loads (500 mA Max) provided it asked for this in its descriptor.
If you can guarantee that your driver stage will not begin drawing current until the device has been configured (which might be as simple as a timed delay in the host controller), you can simply wire everything together. Because your entire circuit is on a single PCB and has no user-accessible downstream ports, you can probably also leave out the TPS2041 and simply design the system to not require more than 500 mA of current in any state.
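One way to sanity-check such a design is to list every operating state with its worst-case draw and compare it with the limit that applies in that state. The state names and currents below are invented for illustration; only the 100 mA and 500 mA limits come from the rule quoted above.

```python
UNCONFIGURED_LIMIT_MA = 100  # one unit load before configuration
CONFIGURED_LIMIT_MA = 500    # five unit loads after configuration

def violations(states):
    """Return the names of states whose draw exceeds the applicable limit.

    `states` maps a state name to (configured, draw_ma).
    """
    bad = []
    for name, (configured, draw_ma) in states.items():
        limit = CONFIGURED_LIMIT_MA if configured else UNCONFIGURED_LIMIT_MA
        if draw_ma > limit:
            bad.append(name)
    return bad

# Hypothetical board: the driver stage only turns on after configuration.
design = {
    "attached": (False, 60),
    "configured_idle": (True, 120),
    "driver_on": (True, 450),
}
# Same board, but with the driver stage starting before configuration.
early_driver = dict(design, driver_on_early=(False, 450))

print(violations(design))
print(violations(early_driver))
```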
Another benefit of enumerating as a high-power device is improved input voltage specifications. When you have enumerated as a low-power device, the host is only required to produce 4.40 V at the upstream port (which will be lower at your device due to the resistance of the cable). When you have enumerated as a high-power device, the specification guarantees that you'll get 4.75 V, which is more likely to be within the operating range of any 5V components you may be using.
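To get a feel for what actually arrives at the board, the cable drop can be estimated with Ohm's law. The 0.25 ohm round-trip cable and connector resistance is a made-up illustrative value; the 4.40V and 4.75V figures are the ones quoted above.

```python
def voltage_at_device(v_port, current_a, r_cable_ohm):
    """Port voltage minus the I*R drop across cable and connectors."""
    return v_port - current_a * r_cable_ohm

R_CABLE = 0.25  # hypothetical round-trip resistance, in ohms

low = voltage_at_device(4.40, 0.1, R_CABLE)    # low-power case, 100 mA
high = voltage_at_device(4.75, 0.5, R_CABLE)   # high-power case, 500 mA
print(f"low-power: {low:.3f} V, high-power: {high:.3f} V")
```

With these assumed numbers the high-power case still ends up higher at the device, despite the larger drop from the higher current.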
Best Answer
First you should try to understand USB Suspend. It isn't so much "when no data is being transferred" as "when the data lines are idle for an extended period of time". In my experience, the bus only suspends when the PC is suspended. Otherwise, there's almost always some traffic flowing on the bus (Start of Frame markers, for example). Now, there are other types of suspend, but they are generally optional.
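A toy model makes the "idle for an extended period" condition concrete. The 3 ms idle threshold and the 1 ms Start of Frame interval are my assumptions from the USB 2.0 full-speed rules, not something stated in this answer, and the function is only an illustration.

```python
SUSPEND_IDLE_MS = 3.0  # assumed idle time after which a device must suspend
SOF_PERIOD_MS = 1.0    # assumed full-speed Start of Frame interval

def is_suspended(last_activity_ms, now_ms, idle_limit_ms=SUSPEND_IDLE_MS):
    """True once the bus has been idle for longer than the limit."""
    return (now_ms - last_activity_ms) > idle_limit_ms

# While the host keeps sending SOFs, the idle time never exceeds one period.
print(is_suspended(last_activity_ms=10.0, now_ms=10.0 + SOF_PERIOD_MS))

# Once the host stops (for example, the PC goes to sleep), the device suspends.
print(is_suspended(last_activity_ms=10.0, now_ms=13.5))
```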
So, like Passerby said, you really don't have to worry about not going into suspend mode. There's always 5V on VBus when the computer is suspended - otherwise, remote wake up would not be possible - and most host controllers don't really enforce the Suspend current restrictions. If they did enforce the current restrictions, someone's not-quite-good-enough device might work on one host controller but not another, and the consumer is likely to blame the computer/host controller for not working with a bad device, rather than blaming the manufacturer of the bad device.
With that out of the way, the next thing you will probably need is something like an auto-switching power mux. TI has an excellent selection of muxes that will automatically switch between two voltage sources, depending on which are available. The key point here is that you don't want self powered or bus powered, you want dual power. In a dual-power role, these muxes will prefer self power, and only switch over to bus power when there is no self power, and they are able to indicate to the MCU which power source is in use right now. I haven't used any, but I think some are also designed to work with batteries.
http://www.ti.com/lsds/ti/power-management/power-multiplexer-mux-products.page
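The priority behaviour described above can be modelled in a few lines. This is a behavioural sketch only, with invented names; it is not the interface of any particular TI part.

```python
from enum import Enum

class Source(Enum):
    SELF = "self"
    BUS = "bus"
    NONE = "none"

def select_source(self_present, bus_present):
    """Prefer self power; fall back to bus power only when self power is absent."""
    if self_present:
        return Source.SELF
    if bus_present:
        return Source.BUS
    return Source.NONE

# The returned value stands in for the status signal the MCU would read.
for self_ok, bus_ok in ((True, True), (False, True), (True, False), (False, False)):
    print(self_ok, bus_ok, select_source(self_ok, bus_ok).value)
```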