Log and data volumes have different access patterns that conflict with each other (at least in theory) when they share a drive.
Log Writes
Log access consists of a very large number of small sequential writes. Somewhat simplistically, DB logs are ring buffers containing a list of instructions to write data items out to particular locations on the disk. These writes must be guaranteed to complete, so they are written out to disk.
Ideally, logs should be on a quiet (i.e. not shared with anything else) RAID-1 or RAID-10 volume. Logically, you can view the process as the main DBMS writing out log entries and one or more log reader threads that consume the logs and write the changes out to the data disks (in practice, the process is optimised so that the data writes are written out immediately where possible). If there is other traffic on the log disks, the heads are moved around by these other accesses and the sequential log writes become random log writes. These are much slower, so busy log disks can create a hotspot which acts as a bottleneck on the whole system.
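The effect of other traffic on a log disk can be illustrated with a toy model. This is only a sketch: it assumes a single spindle with linear block addresses and a seek cost proportional to the distance the head travels, and all function names are made up for illustration.

```python
import random

def total_head_travel(positions):
    """Sum of absolute head movements when visiting positions in order."""
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))

def log_only(n_writes, start=0):
    # Sequential log appends: each write lands on the next block.
    return [start + i for i in range(n_writes)]

def log_shared_with_random_io(n_writes, disk_blocks, seed=0, start=0):
    # Same log appends, but one random access from another workload is
    # interleaved after each append, dragging the head away from the log.
    rng = random.Random(seed)
    positions = []
    for i in range(n_writes):
        positions.append(start + i)
        positions.append(rng.randrange(disk_blocks))
    return positions

quiet = total_head_travel(log_only(1000))
shared = total_head_travel(log_shared_with_random_io(1000, disk_blocks=1_000_000))
print("head travel, quiet log disk: ", quiet)
print("head travel, shared log disk:", shared)
```

On a quiet disk the head barely moves between log writes; once random accesses are interleaved, almost every log write is preceded by a long seek.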
Data Writes
(updated) Log writes must be committed to the disk (referred to as stable media) for a transaction to be valid and eligible to commit. One can logically view this as log entries being written and then used as instructions to write data pages out to the disk by an asynchronous process. In practice the disk page writes are actually prepared and buffered at the time the log entry is made, but they do not need to be written immediately for the transaction to be committed. The disk buffers are written out to stable media (disk) by the Lazy Writer process (thanks to Paul Randal for pointing this out), which this TechNet article discusses in a bit more detail.
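The ordering described above (log record forced to stable media at commit, data page only buffered and written later) can be sketched as follows. This is a toy illustration, not how SQL Server is implemented: the class `TinyWal`, its methods and the record format are all hypothetical, and a dictionary stands in for the data files.

```python
import os
import tempfile

class TinyWal:
    """Toy write-ahead log: commit waits for the log, not for the data pages."""

    def __init__(self, log_path):
        self.log = open(log_path, "ab")
        self.dirty_pages = {}  # page_id -> bytes, buffered in memory only
        self.data_pages = {}   # stands in for the data files on disk

    def commit(self, page_id, payload):
        record = f"{page_id}:{payload.hex()}\n".encode()
        self.log.write(record)
        self.log.flush()
        os.fsync(self.log.fileno())          # log record must reach stable media
        self.dirty_pages[page_id] = payload  # the data page is only buffered

    def lazy_write(self):
        # Runs later and asynchronously in a real engine: flush buffered pages.
        self.data_pages.update(self.dirty_pages)
        self.dirty_pages.clear()

    def close(self):
        self.log.close()

with tempfile.TemporaryDirectory() as tmp:
    wal = TinyWal(os.path.join(tmp, "toy.log"))
    wal.commit(7, b"hello")   # returns once the log write is durable
    wal.lazy_write()          # data page reaches "disk" some time afterwards
    wal.close()
```

The point of the sketch is that only the `os.fsync` on the log sits on the commit path, which is why log write latency matters so much more than data write latency.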
This is a heavily random access pattern, so sharing the same physical disks with logs can create an artificial bottleneck on system performance. The log entries must be written for the transaction to commit, so random seeks that interrupt this process (random I/O is much slower than sequential log I/O) effectively turn the log from a sequential into a random access device. This creates a serious performance bottleneck on a busy system and should be avoided. The same applies when sharing temporary areas with log volumes.
The role of caching
SAN controllers tend to have large RAM caches, which can absorb the random access traffic to a certain extent. However, for transactional integrity it is desirable to have disk writes from a DBMS guaranteed to complete. When a controller is set to use write-back caching, the dirty blocks are cached and the I/O call is reported as complete to the host.
This can smooth out a lot of contention problems as the cache can absorb a lot of I/O that would otherwise go out to the physical disk. It can also optimise the parity reads and writes for RAID-5, which lessens the effect on performance that RAID-5 volumes have.
These are the characteristics that drive the 'Let the SAN deal with it' school of thought, although this view has some limitations:
Write-back caching still has failure modes that can lose data, and the controller has fibbed to the DBMS, saying blocks have been written out to disk when in fact they haven't. For this reason, you may not want to use write-back caching for a transactional application, particularly something holding mission-critical or financial data where data integrity problems could have serious consequences for the business.
SQL Server (in particular) uses I/O in a mode where a flag (called FUA, or Force Unit Access) forces physical writes to the disk before the call returns. Microsoft has a certification program and many SAN vendors produce hardware that honours these semantics (requirements summarised here). In this case no amount of cache will optimise disk writes, which means that log traffic will thrash if it is sitting on a busy shared volume.
If the application generates a lot of disk traffic its working set may overrun the cache, which will also cause the write contention issues.
If the SAN is shared with other applications (particularly on the same disk volume), traffic from other applications can generate log bottlenecks.
Some applications (e.g. data warehouses) generate large transient load spikes that make them quite anti-social on SANs.
Even on a large SAN, separate log volumes are still recommended practice. You may get away with not worrying about layout on a lightly used application. On really large applications, you may even get a benefit from multiple SAN controllers. Oracle publish a series of data warehouse layout case studies where some of the larger configurations involve multiple controllers.
Put responsibility for performance where it belongs
On something with large volumes or where performance could be an issue, make the SAN team accountable for the performance of the application. If they are going to ignore your recommendations for configuration, then make sure that management are aware of this and that responsibility for system performance lies in the appropriate place. In particular, establish acceptable guidelines for key DB performance statistics like I/O waits or page latch waits, or acceptable application I/O SLAs.
Note that having responsibility for performance split across multiple teams creates an incentive to finger-point and pass the buck to the other team. This is a known management anti-pattern and a formula for issues that drag out for months or years without ever being resolved. Ideally, there should be a single architect with authority to specify application, database and SAN configuration changes.
Also, benchmark the system under load. If you can arrange it, secondhand servers and direct-attach arrays can be purchased quite cheaply on eBay. If you set up a box like this with one or two disk arrays you can frig with the physical disk configuration and measure the effect on performance.
As an example, I have done a comparison between an application running on a large SAN (an IBM Shark) and a two-socket box with a direct-attach U320 array. In this case, £3,000 worth of hardware purchased off eBay outperformed a £1M high-end SAN by a factor of two, on a host with roughly equivalent CPU and memory configuration.
From this particular incident, it might be argued that having something like this lying around is a very good way to keep SAN administrators honest.
Logs
Logs are a mostly sequential access structure. Simplistically, you can view them as a ring buffer of entries that say 'Write this data to this block'. When the DB engine issues a write, it actually writes one of these entries to the log. A log reader process then follows up asynchronously and writes the blocks out to the disk.
Because there is relatively little disk head activity, the logs are relatively quick. However, if the log writing activity is in contention with random access activity on the same disk the contention can significantly affect log writing performance, which affects overall DB performance.
Additionally, writing the logs to a separate disk gives you a measure of redundancy. If you back up the data, log entries since the backup can be rolled forward on a restored database. This means that a disaster has to take out both the log and data volumes to cause data loss.
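The roll-forward idea can be sketched in a few lines. This is a simplified illustration only: pages are dictionary entries, each log entry is a hypothetical `(sequence_number, page_id, value)` tuple, and `backup_lsn` is an assumed name for the sequence number the backup was taken at.

```python
def roll_forward(backup_pages, log_entries, backup_lsn):
    """Rebuild current state from a backup plus log entries written after it."""
    pages = dict(backup_pages)  # start from the restored backup
    for lsn, page_id, value in sorted(log_entries):
        if lsn > backup_lsn:    # entries up to the backup are already in it
            pages[page_id] = value
    return pages

backup = {1: "a", 2: "b"}                           # state at sequence number 10
log = [(10, 1, "a"), (11, 3, "c"), (12, 2, "b2")]   # survives on the log volume
restored = roll_forward(backup, log, backup_lsn=10)
print(restored)
```

As long as either the data volume or the combination of backup and log volume survives, the committed state can be recovered.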
For these reasons, you should have logs on a separate array (different physical disks) to the data volumes. A single mirrored pair can handle quite a large amount of log traffic if there is no contention, so you probably don't need anything more than that unless you have very large transaction volumes. However, make sure nothing else is generating disk activity on the log volumes.
TempDB
If you have processes that make heavy use of TempDB, you will get a performance benefit from having this on a RAID-10 volume, as this provides better write performance than RAID-5 (see the notes on write-back caching below). If you are putting your data and indexes on RAID-5 volumes and have processes that make heavy use of TempDB, you may get a benefit from putting TempDB on a separate RAID-10 volume. If you are using RAID-10 on the data volumes as well, this won't matter.
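The write performance gap comes from the commonly quoted rule-of-thumb write penalties for small random writes with no cache help: two physical I/Os per logical write on RAID-10 (one per mirror side), four on RAID-5 (read data, read parity, write data, write parity) and six on RAID-6. A rough sketch of the arithmetic, where the disk count and per-disk IOPS are made-up illustrative figures:

```python
# Physical I/Os per logical small random write (rule of thumb, uncached).
WRITE_PENALTY = {"RAID-10": 2, "RAID-5": 4, "RAID-6": 6}

def random_write_iops(disks, iops_per_disk, level):
    """Approximate sustainable random-write IOPS for an array."""
    return disks * iops_per_disk / WRITE_PENALTY[level]

# Hypothetical array: 8 spindles at 150 IOPS each.
for level in WRITE_PENALTY:
    print(level, random_write_iops(8, 150, level))
```

Real arrays will differ because of caching, stripe size and full-stripe writes, so treat the numbers as a starting estimate rather than a prediction.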
Data and Indexes
If you have really large data volumes you may put data on a RAID 5, 6, 50 (striped RAID-5 blocks) or 60. If your data volumes are smaller you might go with a RAID-10. If you have RAID-5 (or similar) on your data volumes you will probably want a separate TempDB volume; otherwise the system will probably work OK with data, indexes and TempDB sharing a single RAID-10 volume.
Write-Back Caching
If the RAID controllers on your SAN support battery back up, you can set up write-back caching on them. This means that the controllers cache writes in RAM and optimise writes back to the disk. Write-back caching can be quite a performance win on RAID-5 volumes, but adds extra failure modes to the system.
If power fails, the battery back-up will retain the cached data for a few days. When you reactivate the SAN, it will write out the cached entries. In theory this is fairly reliable and probably will work. However, there are plenty of anecdotal stories about write-back caches failing.
If you can activate write-back caching on a per-volume basis, consider having it inactive on the logs. That way, a cache failure can be recovered without data corruption by restoring the database and rolling forward the logs, as described above. Benchmark your system to make sure the performance is satisfactory.
Conclusion
Separate log and data volumes. If you are using RAID 4, 5, 6, 40, 50 or 60 and have heavy activity in TempDB, consider putting TempDB on a separate RAID-10 volume. Unless you have extremely large data volumes, you can share data, indexes and TempDB on a RAID-10 volume. Log volumes should not share disks with sources of heavy disk activity.
Best Answer
After some more research (checking waits and running the site with peak traffic), the "problem" described above was actually not a problem.
The problem appeared when we removed a bottleneck (the old log storage). When we got faster disks for the transaction logs, the data disks could handle more transactions/sec and therefore the queue length increased.
It also explains why disk idle time was good.
The "disk time"-counter seems to be pretty much useless for a fast disk system (using cache etc).
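The queue length observation is consistent with Little's law (average number in the system = arrival rate × time in the system). A rough sketch, where the IOPS and latency figures are invented for illustration and not measurements from this system:

```python
def avg_queue_length(iops, latency_ms):
    """Little's law: outstanding I/Os = throughput * time each I/O spends in the system."""
    return iops * latency_ms / 1000

before = avg_queue_length(iops=500, latency_ms=4)    # data disks throttled by slow logs
after = avg_queue_length(iops=2000, latency_ms=4)    # log bottleneck removed
print(before, after)
```

With per-I/O latency unchanged, four times the throughput means four times the average queue length, without the data disks being any less healthy.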