Some home truths about storage, or why is enterprise storage so f-ing expensive?
Consumer hard drives offer large volumes of space so that even the most discerning user of *cough* streaming media *cough* can buy enough to store a collection of several terabytes. In fact, disk capacity has been growing faster than the transistor counts on silicon for a couple of decades now.
'Enterprise' storage is a more complex issue, as the data has performance and integrity requirements that dictate a rather more heavyweight approach. The data must have some guarantee of availability in the event of hardware failures, and it may have to be shared with a large number of users, which generates many more read/write requests than a single user would.
The technical solutions to this problem can be many, many times more expensive per gigabyte than consumer storage solutions. They also require physical maintenance; backups must be taken and often stored off-site so that a fire does not destroy the data. This process adds ongoing costs.
Performance
On your 1TB consumer or even enterprise near-line drive you have just one set of read/write heads, all mounted on a single actuator. The disk rotates at 7,200 RPM, or 120 revolutions per second. This means that you can get at most 120 random-access I/O operations per second in theory* and somewhat less in practice. Thus, copying a large file on a single 1TB volume is relatively slow, as the heads must seek back and forth between the source and destination blocks.
On a disk array with 14x 72GB disks, you have 14 sets of heads over disks spinning at (say) 15,000 RPM, or approximately 250 revolutions per second. This gives you a theoretical maximum of 3,500 random I/O operations per second* (again, somewhat less in practice). All other things being equal, a file copy will be many, many times faster.
*
You could get more than one random access per revolution if the geometry of the reads allows the drive to move the heads and pick up a sector that happens to be available within a single revolution. If the accesses are widely dispersed you will probably average less than one. Where a disk array is formatted in a striped layout (see below), you will get a maximum of one stripe read per revolution of the disk in most circumstances and (depending on the RAID controller) possibly less than one on average.
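To put numbers on that footnote, here is a rough back-of-the-envelope sketch (in Python, my choice of notation rather than anything from a vendor) that applies the one-random-I/O-per-revolution rule of thumb to the two configurations quoted above. It is arithmetic, not a benchmark.

    # Rough random-I/O ceiling: assume at most ~1 random I/O per revolution per disk.
    def max_random_iops(rpm, disks=1):
        revs_per_second = rpm / 60.0
        return revs_per_second * disks

    print(max_random_iops(7200))        # single 7,200 RPM drive  -> 120 IOPS
    print(max_random_iops(15000, 14))   # 14x 15,000 RPM drives   -> 3,500 IOPS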
The 7,200 RPM 1TB drive will probably be reasonably quick on sequential I/O. Disk arrays formatted in a striped scheme (RAID-0, RAID-5, RAID-10 etc.) can typically read at most one stripe per revolution of the disk. With a 64K stripe we can read 64K × 250 ≈ 16MB or so of data per second off a 15,000 RPM disk. This gives a sequential throughput of around 220MB per second on an array of 14 disks, which is not that much faster on paper than the 150MB/sec or so quoted for a modern 1TB SATA disk.
For video streaming (for example), an array of 4 SATA disks in RAID-0 with a large stripe size (some RAID controllers will support stripe sizes up to 1MB) has quite a lot of sequential throughput. This example could theoretically stream about 480MB/sec, which is comfortably enough to do real-time uncompressed HD video editing. Thus, owners of Mac Pros and similar hardware can do HD video compositing tasks that would have required a machine with a direct-attach fibre array just a few years ago.
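The same napkin arithmetic covers the sequential cases above; a quick sketch using the same one-stripe-per-revolution assumption and the figures quoted in the text, not measured numbers:

    # Rough sequential-throughput ceiling: at most one stripe read per revolution per disk.
    def max_seq_mb_per_sec(rpm, stripe_kb, disks):
        revs_per_second = rpm / 60.0
        return revs_per_second * stripe_kb * disks / 1024.0   # KB/s -> MB/s

    print(max_seq_mb_per_sec(15000, 64, 14))   # 14x 15K disks, 64K stripe     -> ~219 MB/sec
    print(max_seq_mb_per_sec(7200, 1024, 4))   # 4x 7,200 RPM SATA, 1MB stripe -> 480 MB/sec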
The real benefit of a disk array shows up on database work, which is characterised by large numbers of small, scattered I/O requests. On this type of workload, performance is constrained by the physical latency of bits of metal in the disk going round-and-round and back-and-forth. The relevant metric is known as IOPS (I/O operations per second). The more physical disks you have - regardless of capacity - the more IOPS you can theoretically do. More IOPS means more transactions per second.
Data integrity
Additionally, most RAID configurations give you some data redundancy - which by definition requires more than one physical disk. The combination of a storage scheme with such redundancy and a larger number of drives gives a system the ability to serve a large transactional workload reliably.
The infrastructure for disk arrays (and SANs in the more extreme case) is not exactly a mass-market item. In addition, it is one of the bits that really, really cannot fail. That combination of build standard and smaller market volume doesn't come cheap.
Total storage cost including backup
In practice, the largest cost of maintaining 1TB of data is likely to be backup and recovery. A tape drive and 34 sets of SDLT or Ultrium tapes for a full grandfather cycle of backup and recovery will probably cost more than a 1TB disk array did. Add the costs of off-site storage and the salary of even a single tape-monkey, and suddenly your 1TB of data isn't quite so cheap.
The cost of the disks themselves is often a fair way down the hierarchy of storage costs. At one bank I had occasion to work for, SAN storage was costed at £900/GB for a development system and £5,000/GB for a disk on a production server. Even at enterprise vendor prices, the physical cost of the disks was only a tiny fraction of that. Another example I am aware of involves a (relatively) modestly configured IBM Shark SAN that cost somewhere in excess of £1 million. Just the physical storage on this is charged out at around £9/gigabyte, or about £9,000 for space equivalent to your 1TB consumer HDD.
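For what it's worth, those charge-out figures reduce to straight multiplication; a trivial sketch using only the per-GB rates quoted above (these are internal charge-out rates, not hardware costs):

    # What 1TB (taken as 1,000 GB) of space costs at the quoted charge-out rates.
    def cost_of_1tb(cost_per_gb):
        return cost_per_gb * 1000

    print(cost_of_1tb(9))      # Shark SAN physical storage  -> £9,000
    print(cost_of_1tb(900))    # development SAN at the bank -> £900,000
    print(cost_of_1tb(5000))   # production SAN at the bank  -> £5,000,000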
Regarding Backplanes
It varies from vendor to vendor, but in general backplanes are not compatible with off-the-shelf hard drives. Many need some kind of drive carrier that has the built-in interface between the SAS connector and the backplane connector. This is because these kinds of systems are hot-plug, and that requires special bits.
Regarding RAID controllers
Hardware RAID provides a level of parallel processing that can come in very handy, as well as handling certain tasks better than software RAID can. One area is the on-adapter cache, which allows the RAID card to virtualize the underlying storage more effectively, so it performs better. Software RAID can do some of that, but hardware RAID still performs better these days. Also, in my experience HW RAID handles failures more gracefully than SW RAID. Your mileage may vary.
Regarding RAID and ZFS
This is going to sound a bit odd, but I run into the same issues with NetWare's NSS file-system (which looks a lot like ZFS, as it happens). In my case I trust the hardware vendors to handle complex storage configs more than I trust the software vendors to provide solid solutions. This may be misplaced trust, but I'd rather have a storage management system with several largish RAID arrays than one with 48 individual disk drives. This allows me to leverage the best of both environments.
I can go into some detail about load leveling on hardware and software, but that's a bit beyond the scope of this article ;)
Regarding attaching external SAS arrays
If I'm reading that Sun unit correctly, it's a JBOD unit by itself. Attach it to a SAS RAID controller with external ports and you can use hardware RAID on it. Or attach it to a stand-alone SAS card and have up to 48 individual drives presented to the operating system. Either method will work. Whether or not the SAS RAID card can be configured for JBOD is up to the RAID card manufacturer; I've seen it go both ways over the years.
Regarding "4 (x4-wide) SAS host/uplink ports (48 Gb/sec bandwidth)"
This means that the unit has multiple SAS ports on it, and it can do link aggregation for increased bandwidth. To make full use of this, you'll need 4 free ports on the card you attach it to. These ports can also be used to attach two hosts to the unit, if you're so minded.
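Incidentally, the 48 Gb/sec headline figure is just lane arithmetic; a one-liner sketch, assuming 3 Gb/sec SAS lanes (the assumption that makes the quoted total come out):

    # How the quoted 48 Gb/sec decomposes (assuming 3 Gb/sec per SAS lane).
    ports = 4             # "4 (x4-wide) SAS host/uplink ports"
    lanes_per_port = 4    # "x4-wide"
    gbit_per_lane = 3.0   # assumed SAS 1.0 signalling rate
    print(ports * lanes_per_port * gbit_per_lane)   # -> 48.0 Gb/sec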
The 'Expansion ports' on the spec are for attaching additional SAS shelves to the first unit. You'd attach your RAID card to the first unit, and then attach additional units to the first over those expansion ports. I think. Through this you can get silly amounts of direct-attach storage.
Regarding standard ports
Some of this varies from vendor to vendor, but in general 1U-2U servers these days do not ship with external storage connectors as standard. The 4U servers may be different, but I don't play with those that often, so I don't know first-hand. To get the ability to use external storage, you'll need an adapter card of some kind. Whether that's a simple SAS adapter or a smarter version of the built-in RAID adapter is up to you.
Best Answer
The HP StorageWorks MSA60 is a JBOD enclosure intended to be used with a server. It's an extension of a Smart Array controller, so it will require a P800 or P812 controller inside a server to interface with the enclosure. The MSA60 is meant to connect to a single server. You haven't provided information on your operating systems or choice of virtualization platform, but I'm pretty sure this isn't the solution you're looking for.
1). If you're looking to extend the amount of storage available on a single virtual host server, then the MSA60 will work. There would be no need for iSCSI, as you'd be directly connected to the storage array via SAS. This provides no real benefit other than having more room for disk drives than the server itself has.
2). If you're looking to have two or more virtual host servers in a cluster, tied to the same shared storage via iSCSI (or SAS or fibre channel), then you should look at the entry-level MSA P2000 storage arrays. These look like the MSA60, but provide support to connect to multiple servers. They can also be expanded using additional JBOD enclosures.