Sql-server – SQL Server 2005 standard filegroups / files for performance on SAN

database-performancegroupsperformance-tuningsql serversql-server-2005

I submitted this to stack overflow (here) but realised it should really be on serverfault. so apologies for the incorrect and duplicate posting:

Ok so I've just been on a SQL Server course and we discussed the usage scenarios of multiple filegroups and files when in use over local RAID and local disks but we didn't touch SAN scenarios so my question is as follows;

I currently have a 250 gig database running on SQL Server 2005 where some tables have a huge number of writes and others are fairly static. The database and all objects reside in a single file group with a single data file. The log file is also on the same volume. My interpretation is that separate data files should be used across different disks to lessen disk contention and that file groups should be used for partitioning of data. However, with a SAN you obviously don't really have the same issue of disk contention that you do with a small RAID setup (or at least we don't at the moment), and standard edition doesn't support partitioning.

So in order to improve parallelism what should I do?

My understanding of various Microsoft publications is that if I increase the number of data files, separate threads can act across each file separately. Which leads me to the question how many files should I have. One per core? Should I be putting tables and indexes with high levels of activity in separate file groups, each with the same number of data files as we have cores?

Thank you

Best Answer

  • Every file group should have X data and log files, with X being the number of task manager visible cores - this allows optimal IO behavior.

  • This is particularly important for the tempdb - files are sometimes completely locked for the SQL Server, when extents (groups of 8 pages) are allocated / freed. Tempdb allcoates a LOT of objects.

  • Distributing for multiple discs makes only sense for better IO. A good SAN - may saturate the driver's outstanding IO capabilities (Queue length often maxes at 256 PER DISC), so a good SAN may require multiple discs to keep enough IO outstanding to fully utilize it.

  • But even without a good SAN, multiple files avoid the bottleneck of having single file access when doing inserts etc.

  • Having more than the mentioned X makes no sense - as maximum every core can run one thread at a given time. ALlocation is atomic, so every core will not switch threads while doing that. But X cores may all want extends at the same time ;)

  • As important: properly format the discs. Make sure (pre 2008 server) partitions are aligned, make sure you format the discs with a 64kb node size - otherwise you may waste up to 40% of your IO performance right there.

  • Partitioning does not enter into this game at all ;)