Totally agree. A lot of people don't know this, or they substitute alternatives that are not necessarily viable. Among the tenets of reliability is isolation. Amazon's services isolate at the datacenter level. One should isolate at the level at which one is comfortable taking on failures. Once there is an active dependency, a la EBS, the number of subsystems involved increases many-fold, and the likelihood of failure and of cascading failure increases dramatically.
Where getting a bit from disk to memory used to be: platter -> diskcontroller -> cpu -> memory,
now with SANs & NFS & virtualized block storage, it's:
platter -> diskcontroller -> cpu -> memory -> nic -> wire -> switch(es)/router(s)/network configs(human config item) -> wire -> nic -> cpu -> memory.
Not to say that centralized storage doesn't have its benefits, but the scope of isolation is now drastically larger. Comparing the combinatorial possibilities of failure in the former scenario against the latter, the latter has a significantly higher chance of failure and many more failure modes, and failover for it is significantly harder to automate programmatically.
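To put a rough number on the difference, here is a minimal sketch in Python. It assumes every hop in the path fails independently and has the same availability of 0.999. Both are simplifications I picked for illustration, not measured figures, and the function and constant names are made up. Under those assumptions the local path has about a 0.4% chance of some hop being down, while the networked path has about 1.1%.

```python
from math import prod

# The two data paths described above, one entry per hop.
LOCAL_PATH = ["platter", "diskcontroller", "cpu", "memory"]
NETWORKED_PATH = [
    "platter", "diskcontroller", "cpu", "memory",
    "nic", "wire", "switches/routers/network config", "wire",
    "nic", "cpu", "memory",
]

# Assumed per-hop availability (illustrative only, not a real measurement).
ASSUMED_AVAILABILITY = 0.999


def path_availability(path, per_hop=ASSUMED_AVAILABILITY):
    """Probability that every hop works, assuming independent failures."""
    return prod(per_hop for _ in path)


def failure_probability(path, per_hop=ASSUMED_AVAILABILITY):
    """Probability that at least one hop in the path is down."""
    return 1.0 - path_availability(path, per_hop)


if __name__ == "__main__":
    for name, path in (("local", LOCAL_PATH), ("networked", NETWORKED_PATH)):
        print(f"{name}: {len(path)} hops, "
              f"P(failure) = {failure_probability(path):.4%}")
```

This only counts hops in series. It says nothing about cascading failures or the human-config item, both of which make the real picture worse than independent multiplication suggests.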
TLDR: With Amazon, the scope of isolation is the datacenter. To be on Amazon, one must architect and design to handle failure at the datacenter level, rather than at the host or cluster level.