The generic answer to this question is yes, the VBUS (+5V from cable) must be connected to the device even if it is self-powered. The reason is as follows:
To start the connect process on the host side, the device must pull up D+ (for an FS/HS device) or D- (for an LS device).
However, the USB specification has a mandatory requirement that no USB device source any current on any interface pin unless it is connected to a cable; see section 7.1.5.1, which reads:
The voltage source on the pull-up resistor must be derived from or
controlled by the power supplied on the USB cable such that when VBUS
is removed, the pull-up resistor does not supply current on the data
line to which it is attached.
If a USB device doesn't have this control, one of the data lines will be a source of current. Premature assertion of pull-ups was a source of problems for some legacy USB hosts. That's why this rule was instituted, and there is a special test for it in the USB-IF certification program.
Therefore, USB VBUS is an important "side-band" signal in the USB connect protocol. As such, normal USB device ICs have a separate input pin to sense the presence of a USB host. Some ICs (e.g. FT232H, MCP2221, etc.) skip this requirement; their manufacturers assume that the chip will be used solely in bus-powered configurations, where the pull-up control requirement is automatically satisfied. When designing these chips into self-powered designs, however, some extra circuitry is needed to link the enabling of the pull-up to the presence of VBUS on the USB port.
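On a device with a VBUS sense input, the rule from section 7.1.5.1 boils down to "pull-up enabled only while VBUS is present". A minimal sketch of that logic follows, assuming a hypothetical firmware model: `usb_pullups_t`, `usb_speed_t` and `usb_update_pullups` are illustrative names, not any vendor's API, and on real hardware the two flags would map to a register bit or a GPIO that feeds the pull-up resistor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical speed selector: decides which data line carries the pull-up. */
typedef enum { USB_SPEED_LOW, USB_SPEED_FULL } usb_speed_t;

/* Hypothetical model of the two pull-up enables. */
typedef struct {
    bool dp_pullup; /* 1.5k pull-up on D+ (FS/HS device) */
    bool dm_pullup; /* 1.5k pull-up on D- (LS device) */
} usb_pullups_t;

/* Call at start-up and whenever the VBUS sense input changes.
 * With VBUS absent, both pull-ups are off, so neither data line
 * can source current into an unpowered host port. */
static void usb_update_pullups(usb_pullups_t *p, bool vbus_present,
                               usb_speed_t speed)
{
    assert(p != NULL);
    p->dp_pullup = vbus_present && (speed == USB_SPEED_FULL);
    p->dm_pullup = vbus_present && (speed == USB_SPEED_LOW);
}
```

For chips without a sense pin, the same condition has to be implemented in external circuitry rather than in firmware.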
Regarding the USB connect "handshake" protocol, USB doesn't rely on current drawn from VBUS. The protocol is this: the host port must have VBUS active; VBUS is connected to the device; the device sees VBUS and pulls one of the D+/D- wires up through 1.5k; the host sees this connect and, after a 100ms delay, asserts USB_RESET signaling (SE0 etc.).
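The host side of that sequence can be sketched as a small state machine. This is a toy model under stated assumptions, not host controller code: `host_port_t` and `host_port_tick` are hypothetical names, the function is assumed to be called once per millisecond with the sampled line levels, and the model stops at the point where the host would begin driving reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ATTACH_DEBOUNCE_MS 100u /* the 100ms delay mentioned above */

typedef enum {
    PORT_DISCONNECTED, /* both data lines low: no pull-up seen */
    PORT_DEBOUNCING,   /* pull-up seen, waiting for it to stay stable */
    PORT_RESETTING     /* host would now assert USB reset (SE0) */
} port_state_t;

typedef struct {
    port_state_t state;
    unsigned stable_ms; /* consecutive ms with a pull-up present */
} host_port_t;

/* Advance the model by 1 ms; dp/dm are the sampled D+/D- levels. */
static port_state_t host_port_tick(host_port_t *port, bool dp, bool dm)
{
    assert(port != NULL);
    bool pulled_up = (dp != dm); /* exactly one line high: device pull-up */

    if (!pulled_up) {
        port->state = PORT_DISCONNECTED;
        port->stable_ms = 0;
    } else if (port->state == PORT_DISCONNECTED) {
        port->state = PORT_DEBOUNCING;
        port->stable_ms = 1;
    } else if (port->state == PORT_DEBOUNCING &&
               ++port->stable_ms >= ATTACH_DEBOUNCE_MS) {
        port->state = PORT_RESETTING;
    }
    return port->state;
}
```

Note that VBUS current never appears in the model: the only input the host acts on is the pull-up on a data line.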
Interrupts are your friend for timing-sensitive tasks, but only if you put the timing-critical aspects into the interrupt, and there are no other interrupts happening that have a higher priority. The microcontrollers on the "AVR-based" Arduino (e.g. the ATmega328P) have fixed interrupt priorities as detailed on page 58ff of the datasheet. So if you use TIMER2 COMPA as your critical timing interrupt and no other interrupts, you should be OK (among the timer interrupts it has the highest priority; only the external, pin-change and watchdog interrupts rank above it). If you also want to use lower-priority interrupts, you need to make sure that all of them re-enable global interrupts when entering their interrupt service routine:
When an interrupt occurs, the Global Interrupt Enable I-bit is cleared
and all interrupts are disabled. The user software can write logic one
to the I-bit to enable nested interrupts. All enabled interrupts can
then interrupt the current interrupt routine.
(p. 14 of the datasheet)
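The effect of the I-bit can be illustrated with a host-side model. This is a simplified sketch, not AVR code: `avr_irq_model_t` and `avr_next_vector` are hypothetical names, the vector numbers are taken from the ATmega328P interrupt vector table, and the model assumes that the lowest pending vector number wins and that entering a handler clears the I-bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Vector numbers as listed in the ATmega328P interrupt vector table. */
enum { VEC_TIMER2_COMPA = 8, VEC_TIMER1_COMPA = 12, VEC_TIMER0_OVF = 17 };

typedef struct {
    uint32_t pending; /* bit n set: vector n is enabled and flagged */
    bool i_bit;       /* Global Interrupt Enable bit */
} avr_irq_model_t;

/* Returns the vector that would be dispatched next, or -1 if none can
 * be taken. Lower vector number means higher priority. */
static int avr_next_vector(avr_irq_model_t *m)
{
    assert(m != NULL);
    if (!m->i_bit) {
        return -1; /* inside an ISR that has not re-enabled interrupts */
    }
    for (int v = 1; v < 32; v++) {
        uint32_t bit = UINT32_C(1) << v;
        if (m->pending & bit) {
            m->pending &= ~bit;
            m->i_bit = false; /* cleared on ISR entry */
            return v;
        }
    }
    return -1;
}
```

In the model, a pending TIMER2 COMPA cannot be taken while a lower-priority handler is running until that handler sets `i_bit` again, which is why every other ISR has to re-enable interrupts on entry if the timing-critical one must not be delayed.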
This is slightly different on ARM-based Arduinos, as their Cortex-M3 core has a "Nested Vectored Interrupt Controller", where the priorities aren't fixed (they can be set in software) and nested interrupt handling is the norm. So for timing-critical applications, the ARM-based Arduino would give you more flexibility. However, I don't think that's really necessary for your application.
The bigger question is really how easily these things can be implemented with the Arduino libraries. To get the best performance you will probably have to code outside the libraries to some degree, at least for the timing-critical bits, i.e. avoid things like delay() or millis() altogether.
Whether or not you need to split depends on how much processing you intend to do. Again, going outside the libraries can potentially give you better performance.
There is no chip that supports USB MIDI in hardware (except the QinHeng CH345 and the MFM0860, both of which are buggy).
You can use any general-purpose USB microcontroller for USB MIDI. However, you have to write all of the firmware yourself, or modify the software for some existing protocol (like CDC).
In the case of the MSP430, you would not be able to use the Descriptor Tool but would have to construct the descriptors by hand.
There are also several open-source USB MIDI implementations for 8051-based microcontrollers, as well as the LUFA library for AVR and NXP chips. Cypress has a USB MIDI library for their PSoC chips.
If your device is generating the MIDI commands (as opposed to receiving MIDI data from somewhere else), you do not need to parse the MIDI stream to convert it into USB MIDI event packets, and your implementation becomes easier.
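For a device that generates its own messages, building an event packet is a few lines. The sketch below assumes the 4-byte USB-MIDI event packet layout from the USB MIDI class definition (cable number in the high nibble of byte 0, Code Index Number in the low nibble, then the MIDI bytes); `usb_midi_pack_note_on` is a hypothetical helper name, not part of any of the libraries mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill a 4-byte USB-MIDI event packet with a Note On message.
 * cable:   virtual cable number, 0..15
 * channel: MIDI channel, 0..15
 * note, velocity: 7-bit MIDI data bytes */
static void usb_midi_pack_note_on(uint8_t cable, uint8_t channel,
                                  uint8_t note, uint8_t velocity,
                                  uint8_t packet[4])
{
    assert(packet != NULL);
    uint8_t status = (uint8_t)(0x90u | (channel & 0x0Fu));

    /* For channel voice messages the Code Index Number equals the
     * high nibble of the status byte (0x9 for Note On). */
    packet[0] = (uint8_t)(((cable & 0x0Fu) << 4) | (status >> 4));
    packet[1] = status;
    packet[2] = (uint8_t)(note & 0x7Fu);
    packet[3] = (uint8_t)(velocity & 0x7Fu);
}
```

Because the message type is known at the call site, no running-status or SysEx parsing is needed, which is the simplification described above.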