First, some clarifications to help clear up the confusion.
- Your Linux instances are considered Virtual Machines. These are different from the existing VM Role: the former is a VHD you manage in your storage account and supports several OS variants; the latter is something you create locally, upload to Windows Azure (into storage not managed by you), and then spawn one or more instances of (it works similarly to web/worker roles). Just wanted to clarify, as you refer to your Linux VM as a "VM Role."
- Web roles (or worker roles, or VM Roles) are not more expensive than Virtual Machines. They're all metered at an hourly rate, per core, at a list price of $0.12/hour (or $0.02/hour for XS).
- Storage is a service, not a role. Web/worker/VM "roles" are essentially templates (or scaffolding) for the virtual machine instances that run your code; they define everything aside from the code you deploy. Storage is a REST-accessible service.
Ok, having said all that: the instructions you found about mounting a drive to a Linux Virtual Machine show how to do this via the portal (you can do the same thing with command-line scripts). You can mount up to 16 drives in total (2 per core, and 1 on an XS). Each mounted drive is treated like an entire file system.
If you wanted each Virtual Machine to have its own drive, you can mount the appropriate drive(s) to each (again, up to 16 per Virtual Machine). Once a drive is mounted, that Virtual Machine has exclusive write access to it (no drive-sharing). This is independent of OS: the same restriction applies in Win2K8, Linux, or even a web/worker/VM role. So: in the model where each Virtual Machine serves only one website, this helps. In the model where each Virtual Machine serves all websites, it doesn't really help with what you're trying to do. So...
If you're load-balancing traffic between two Virtual Machines, and they both need to access the same static content (e.g. website content), one thing to consider: Store static content directly in a blob (as a zip/tar) or series of blobs in a container. Then, upon bootup (or some type of signal), have the Virtual Machine(s) download said blob(s) to local storage. This method provides a central place for you to store your web content. You could also store them in an Azure drive, but I don't really see the value in doing so: You'd have to then worry about taking read-only snapshots of the drive, then mounting the snapshot. Seems like lots of extra work, vs just grabbing files from blob storage.
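If it helps, here's a minimal sketch of that boot-time "pull from blob storage" step, written against the current azure-storage-blob Python package (the 2012-era SDK looks different). The container name, blob name, connection-string environment variable, and local paths are all illustrative assumptions:

```python
# Minimal sketch: download packaged web content from blob storage at boot and
# unpack it to local storage. Names and paths below are hypothetical.
import os
import tarfile

from azure.storage.blob import BlobClient

blob = BlobClient.from_connection_string(
    conn_str=os.environ["AZURE_STORAGE_CONNECTION_STRING"],
    container_name="site-content",      # hypothetical container
    blob_name="website.tar.gz",         # hypothetical packaged content
)

local_archive = "/tmp/website.tar.gz"
with open(local_archive, "wb") as f:
    blob.download_blob().readinto(f)    # stream the blob to local disk

# Unpack into the web root (hypothetical path)
with tarfile.open(local_archive, "r:gz") as tar:
    tar.extractall("/var/www/site")
```

Each Virtual Machine behind the load balancer runs the same script, so they all end up serving identical content without any shared, writable drive.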
By the way: The copy operations are going to be throttled by Virtual Machine size. Network bandwidth on the Virtual Machine is 100Mbps per core, but for XS, only 5Mbps. Depending on how much data you'd be copying from blob storage to local disk, this could seem a bit slow with XS. Oh, and bandwidth between blob storage and your Virtual Machine is free within the same data center.
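To get a feel for what that throttling means in practice, here's a rough back-of-envelope calculation; the payload size is just an illustrative assumption:

```python
# Rough transfer-time estimate at the quoted NIC allocations
# (100 Mbps per core vs. 5 Mbps on XS). Payload size is assumed.
payload_gb = 2                                   # hypothetical content size
payload_megabits = payload_gb * 1024 * 8

for label, mbps in [("XS (5 Mbps)", 5), ("1 core (100 Mbps)", 100), ("4 cores (400 Mbps)", 400)]:
    seconds = payload_megabits / mbps
    print(f"{label}: ~{seconds / 60:.1f} minutes")

# XS (5 Mbps): ~54.6 minutes
# 1 core (100 Mbps): ~2.7 minutes
# 4 cores (400 Mbps): ~0.7 minutes
```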
Hopefully this answers your question...
The first article covers how failover works within a single data center, stating "Other than the loss of an entire data center all other failures are mitigated by the service." This covers you if the hardware or storage backing your database(s) fails.
The second article covers how to use geo-replication to handle the case where the data center itself fails: "In this case the application deployment topology is optimized for handling regional disasters when all application components are impacted and need to failover as a unit."
They are both correct and cover two different scenarios using the same service.
Virtual Machine VHDs are backed by Windows Azure Storage; to be more precise, page blobs. A blob is triple-replicated within the data center itself, then geo-replicated to a neighboring data center (unless you opt out of geo-replication). In the US, those pairs are East <--> West and North Central <--> South Central. In Europe, it's Dublin <--> Amsterdam, and in Asia it's Hong Kong <--> Singapore.
Storage SLA (detailed here): 99.9% availability.
Regarding backups: it's trivial to create a copy of a blob. Until recently, the API only supported blob copy within the same storage account (and therefore the same data center). With the Spring 2012 release, the API was updated and now supports cross-account blob copy, so you can very easily make periodic backups anywhere you want. Announcement and full description here.
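As a rough illustration of the cross-account copy idea (using the current azure-storage-blob Python package rather than the exact 2012 API the announcement covers), here's a sketch; the account, container, and blob names, plus the SAS token placeholder, are assumptions:

```python
# Sketch: start a server-side copy of a source VHD blob into a blob in a
# different (backup) storage account. All names below are hypothetical.
import os

from azure.storage.blob import BlobClient

# Destination: a blob in the backup storage account (possibly another data center).
backup = BlobClient.from_connection_string(
    conn_str=os.environ["BACKUP_STORAGE_CONNECTION_STRING"],
    container_name="vhd-backups",
    blob_name="webserver-backup.vhd",
)

# Source: a URL the storage service can read, e.g. one carrying a SAS token.
source_url = "https://sourceaccount.blob.core.windows.net/vhds/webserver.vhd?<sas-token>"

# The copy runs server-side; this call just kicks it off and returns its status.
copy_props = backup.start_copy_from_url(source_url)
print(copy_props["copy_status"])
```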
I'm not saying you need to back up your VHDs; that's up to you. But given the relatively inexpensive cost of storage (full details here): assuming a 50GB VHD, that would run you under $3 monthly (or less than $2 if you turned off geo-replication). Keeping a backup somewhere else would just add another few bucks to your monthly storage cost.
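Purely as arithmetic on the figures quoted above (actual per-GB rates change; check the pricing page for current numbers), those monthly totals imply a per-GB ceiling of roughly:

```python
# Back out the per-GB rate implied by the quoted monthly figures for a 50 GB VHD.
# These are ceilings derived from "under $3" / "less than $2", not published prices.
vhd_gb = 50
print(f"geo-replicated: under ${3 / vhd_gb:.3f}/GB/month")       # under $0.060
print(f"geo-replication off: under ${2 / vhd_gb:.3f}/GB/month")  # under $0.040
```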
Regarding your second question: first, let me point you to an article Michael Washam posted with a detailed overview of Virtual Machines; it hopefully answers some of your questions. Let me call out a few items.
All hardware and related infrastructure support is taken care of: Front-end load balancer, hardware health, Host OS updates, availability sets (when load-balancing across multiple VMs), etc.
Note that the article mentions a new single-instance SLA of 99.9%. So, while you will have a short period of downtime if you're only running one VM, the Windows Azure fabric will bring your VM back up fairly quickly if, say, the hardware running your Guest VM fails. To avoid single-instance downtime, consider an Availability Set, where you can load-balance an endpoint across multiple Virtual Machines.