Scatterling DBA: January 2012

Sunday, 15 January 2012

Framed VS Frameless SAN

It's been a while since I posted on storage theory - this time I'm briefly covering the difference between framed SANs like Compellent, 3Par and EMC, and frameless SANs like Equallogic, Lefthand and Solidfire.

Framed

There are now two major SAN design philosophies. The traditional design is called framed and is characterised by head units with disks attached. This stems from the principle of taking a pair of servers, attaching a lot of disks and sharing them out over a SAN.

The upside is that it’s been around for a long time and vendors have gotten extremely good at building them. However, that reputation has generally led to vendors charging a premium when you want additional functionality such as replication.

They can potentially be scaled up to the biggest storage clusters and can have extensive connectivity options. The biggest downside is that it is very easy to spec the head unit incorrectly and buy too small or too big. Buy too small and the head units become a bottleneck, with excessively high upgrade costs that in some cases mean complete replacement. Buy too big and you have massively overspent on a unit that will be underutilised for its entire life.

There are exceptions to this, where a vendor offers only one head unit option no matter what capacity you buy. In general, performance on all framed SANs can only scale by adding spindles, and once the head unit becomes the bottleneck, any further scalability can be costly.
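To put the head unit bottleneck in numbers, here is a rough sketch in Python. The function name, the 180 IOPS per spindle and the 30,000 IOPS head unit ceiling are all made-up figures for illustration, not the specs of any real array.

```python
def framed_iops(spindles: int, iops_per_spindle: int, head_limit_iops: int) -> int:
    """Deliverable IOPS on a framed SAN: the spindle total, capped by the head unit."""
    return min(spindles * iops_per_spindle, head_limit_iops)


# Hypothetical numbers: 180 IOPS per spindle, head unit tops out at 30,000 IOPS.
for spindles in (48, 96, 192, 384):
    print(spindles, "spindles ->", framed_iops(spindles, 180, 30_000), "IOPS")
```

With these assumed figures the array scales nicely up to 96 spindles, but at 192 and 384 it stays stuck at the head unit's ceiling, which is the point where a bigger head unit is the only way forward.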

Frameless

Frameless SANs are a newer generation, characterised by groups of self-contained units, each with its own controllers, disk, cache and connectivity. As more units are added, both capacity and performance scale. Groups of these units are usually managed as a single SAN, and volumes usually span several units to gain performance or redundancy - sometimes both.

The biggest advantage of this model is that you are never overspending on a head unit, and the head unit does not become a bottleneck for the disks.

The downside is that adding capacity generally gets more expensive the larger you grow, as you are buying more than a shelf of disks each time you add to the SAN. You are buying controllers, cache and capacity, and this all adds to the incremental cost. Maximum scalability is also capped at the maximum number of units in a group.
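The same kind of back-of-the-envelope sketch works for the frameless model. Everything here is hypothetical: the function name, the 24 TB and 5,000 IOPS per unit and the 16 unit group limit are placeholders, not vendor specs.

```python
def frameless_group(units: int, tb_per_unit: int, iops_per_unit: int,
                    max_units_per_group: int) -> tuple[int, int]:
    """Return (capacity_tb, iops) for a group of identical self-contained units."""
    if not 1 <= units <= max_units_per_group:
        # Scalability stops at the group limit, no matter how much budget is left.
        raise ValueError("unit count outside what a single group supports")
    # Every unit brings its own controllers, cache and disks,
    # so capacity and performance grow together.
    return units * tb_per_unit, units * iops_per_unit


# Hypothetical unit: 24 TB and 5,000 IOPS each, at most 16 units per group.
for units in (1, 4, 16):
    print(units, "units ->", frameless_group(units, 24, 5_000, 16))
```

There is no head unit term in the calculation at all, which is both the advantage (nothing to outgrow) and the cost (every step up buys another set of controllers and cache).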

As this model is relatively new, the vendors that make these units needed to add to the value proposition, and several of them throw in all of the software value-adds for the base price. This means that from a cost point of view, they can be far more attractive than the old framed SAN model if you want all the functionality they can provide, even if the incremental expansion cost is higher.

The bottom line - neither solution is perfect, and both have use cases in almost every level of business.

Thursday, 12 January 2012

Why Compellent is going to be more efficient than Equallogic … for me…

I tweeted about EQL and Compellent efficiency the other day, and thought I would elaborate a little, to clarify my point.

First off, our Equallogic units are great, especially our SUMOs. We are still going to be using them for the next few years and I do think that we made the correct choice in buying them just over a year ago.

Going forward we will be using Compellent and these are a few of the efficiency reasons why:

Block size

Equallogic uses a 15MB block size compared to Compellent’s 2MB or 512KB. For structured data, where small changes are scattered across the volume, that means the EQL is roughly an order of magnitude less efficient when it comes to snapshots. I have seen a log of a few hundred MB generate a snapshot of hundreds of GB on the EQL, which becomes expensive on their SSD units. On the flip side, if your app is sequentially adding data to a file, the block size will be completely irrelevant, so it does come down to your application.

I do not believe this is something they can change on a whim, and it is something you buy into when you buy EQL.
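To show how a few hundred MB of changes can turn into hundreds of GB of snapshot, here is a worst-case sketch. It assumes that a snapshot has to preserve a whole block for every block touched, and that every small write lands in a different block. The function name, the 8KB write size, the 300MB of changes and the 2TB volume are my own illustrative assumptions, not measurements.

```python
KB_PER_MB = 1024
KB_PER_GB = 1024 * 1024


def worst_case_snapshot_kb(changed_kb: int, write_kb: int,
                           page_kb: int, volume_kb: int) -> int:
    """Upper bound on snapshot space if every write dirties a different block."""
    writes = changed_kb // write_kb        # number of small scattered writes
    total_pages = volume_kb // page_kb     # blocks available on the volume
    pages_touched = min(writes, total_pages)
    return pages_touched * page_kb         # each touched block is kept whole


# Hypothetical workload: 300MB of 8KB writes scattered over a 2TB volume.
changed = 300 * KB_PER_MB
volume = 2048 * KB_PER_GB
for label, page_kb in (("15MB", 15 * KB_PER_MB), ("2MB", 2 * KB_PER_MB), ("512KB", 512)):
    snap_gb = worst_case_snapshot_kb(changed, 8, page_kb, volume) / KB_PER_GB
    print(f"{label} blocks: up to {snap_gb:.2f} GB of snapshot space")
```

Under these assumptions the upper bound works out to 562.50 GB for 15MB blocks, 75.00 GB for 2MB and 18.75 GB for 512KB. Real workloads will cluster their writes and land well below the worst case, but the ratio between the block sizes stays the same.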

Thin provisioning

On EQL right now there is no space reclamation on thin provisioned volumes. As a result, once you have written to a LUN, the array doesn’t know when you have deleted data (under Windows anyway), so the LUN eventually becomes fat provisioned. Compellent has an agent that communicates with the array and tells it which blocks can be freed up, allowing thin provisioning to reclaim space. In addition, Compellent only writes thin, so there are no fat provisioned LUNs, and if it sees the OS zeroing out a large amount of disk space, it doesn’t commit those zeros to disk. No more accidental full formats on LUNs…
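A toy model makes the difference easier to see. This is only a sketch of the idea: the ThinVolume class and its methods are hypothetical and have nothing to do with how either vendor actually implements reclamation.

```python
class ThinVolume:
    """Toy thin-provisioned LUN that tracks which blocks the array has allocated."""

    def __init__(self, reclaims_space: bool) -> None:
        self.reclaims_space = reclaims_space
        self.allocated: set[int] = set()

    def write(self, blocks) -> None:
        # Any block written to gets backed by real disk on the array.
        self.allocated.update(blocks)

    def delete(self, blocks) -> None:
        # The filesystem frees the blocks, but the array only finds out
        # if something (an agent, for example) tells it which ones.
        if self.reclaims_space:
            self.allocated.difference_update(blocks)


for reclaims in (False, True):
    lun = ThinVolume(reclaims_space=reclaims)
    lun.write(range(0, 100))     # write 100 blocks
    lun.delete(range(0, 100))    # delete that data in the OS
    lun.write(range(100, 200))   # new data lands on fresh blocks
    print("reclaim" if reclaims else "no reclaim", "->", len(lun.allocated), "blocks allocated")
```

Without reclamation the LUN holds 200 blocks for 100 blocks of live data, and it only ever grows. With reclamation it drops back to 100.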

Pre-allocation

On the EQL you have to pre-allocate LUN space and snapshot space among other things. This is great for knowing that you won’t run out of capacity, but does reduce the efficiency of your space utilisation. Space allocated to snapshots on one LUN cannot be temporarily used by another LUN. On the Compellent arrays you don’t need to pre-allocate snapshot space per LUN but do run a greater risk of running out of capacity on badly managed systems.
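The trade-off is simple arithmetic. In the sketch below the three LUNs, their sizes, their snapshot reserves and their actual snapshot usage are invented numbers, and the function names are my own.

```python
# (lun_size_gb, snapshot_reserve_gb, snapshot_used_gb) - hypothetical LUNs
luns = [
    (500, 500, 50),
    (1000, 1000, 400),
    (200, 200, 10),
]


def committed_with_preallocation(luns) -> int:
    """Space set aside when every LUN carries its own snapshot reserve."""
    return sum(size + reserve for size, reserve, _used in luns)


def consumed_with_shared_pool(luns) -> int:
    """Space consumed when snapshots only take what they actually use."""
    return sum(size + used for size, _reserve, used in luns)


prealloc = committed_with_preallocation(luns)
pooled = consumed_with_shared_pool(luns)
print("pre-allocated:", prealloc, "GB")
print("shared pool:  ", pooled, "GB")
print("stranded:     ", prealloc - pooled, "GB")
```

With these made-up figures, 3400 GB is committed under pre-allocation against 2160 GB actually consumed from a shared pool, leaving 1240 GB stranded in reserves that no other LUN can borrow. The flip side is that the shared pool has no guarantee built in, so someone has to watch it.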

Tiering / Data Progression

EQL's implementation of auto tiering is great for consistent workloads, and fantastic at balancing loads on similar arrays. Where it’s not as good is with inconsistent workloads, or in giving you granular control over how data is tiered down to lower performance disks. I will do a post at some stage about Compellent’s tiering, but for now let's just say that it’s far more robust and has some efficiency advantages, like restriping snapshot data from RAID 10 to RAID 5. This robust tiering will allow us to use our SSD tier on the Compellent more effectively than we are able to right now on the EQL.

For me these enhancements are theoretical, as our arrays have arrived but have not gone into production yet. Over the next few months I’m pretty sure I’m going to be giving the Compellents some stick when I get frustrated by their quirks, but for now I’m just really excited to see if they can live up to my expectations!