Wednesday 22 February 2012

Compellent iSCSI Optimisation on Windows


We now have our shiny new pair of Compellent Storage Centers installed and up and running.
This means that I get to spend some quality time testing, benchmarking and evaluating them before they go into production.
Previously I mentioned that the Compellents were too smart for SQLIO, which disappointed me no end.
Not only did we use it as a tool to compare arrays during the purchase decision, but I also find it significantly easier to test with than IOMeter.

Thanks to a suggestion from @tonyholland00 (thank you Tony!!), it turns out that if you pre-allocate the disk and turn off caching, you can get realistic results from SQLIO again. Now most other array admins would cry out in horror at the idea of turning off caching on their arrays, but that's because their arrays are completely dependent on their write cache to deliver their advertised performance. Compellent arrays are different - they have only a tiny (512MB) write cache, are properly architected around the number of spindles required for their target workload, and don't need cache as a crutch. In addition, when testing the SSD tier we have, the SSDs are actually faster than the cache. Pre-allocation is slightly out of the ordinary, but if that last 1-2% is important enough to you for bulk loading, you would be pre-allocating your LUN anyway!
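
For context, the sort of SQLIO run I'm talking about looks something like this - 8KB random writes, 4 threads, 8 outstanding IOs per thread, for 2 minutes, run against a test file that already exists on the Compellent volume. The drive letter and file name are just examples, and note that -BN only turns off Windows-side buffering; the array write cache itself is disabled per volume in Storage Center, not from SQLIO:

    sqlio -kW -t4 -s120 -o8 -b8 -frandom -LS -BN E:\testfile.dat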

So there I was, merrily testing away, wanting both to confirm the performance I was promised in our POC and to see how far I could push one of these arrays. I'll definitely do some more posts on my findings, but I came across a bottleneck that may be common in many iSCSI configurations.

We are using 10GbE and I configured a blade with 4 ports, but I was only able to get 2.4GB/sec even at bigger block sizes. Interestingly, the performance data looked almost identical to the POC data I received from Dell, but I assumed they had only used 2 HBAs.
All targets were connected, but I couldn't get as much information on the connections as I had been getting with my EQL arrays, as there is no connections tab comparable to what you get with the EQL HIT Kit.
After a bit of investigation, it turned out that we were effectively getting one session per fault domain (VLAN) - two ports' worth, or just over 20Gb/s of the 40Gb/s available, which lines up with the 2.4GB/sec I was seeing.

The solution is to manually create a session from each NIC to each target. Through the GUI, you can go to each connection, open its properties and add a new session (remembering to enable multi-pathing both when you create the initial connection and when you add the additional session).
Now that's great if you are setting up a single server going to a single array, but it's going to get very tedious when you are doing large numbers of servers and connecting to more than one array.



We have 2 arrays, dual controller, each with 6 ports - 12 targets in all, and with 4 NICs per server that works out to 48 sessions to be manually configured. That's 10 minutes of tedium per server that I want to avoid. Thankfully, I found this link:
http://mrshannon.wordpress.com/2010/01/08/making-iscsi-targets-via-cmd/
You configure your list of targets once, and then on each server you simply choose which NICs you want to use. Run the batch file, reboot, and you should be good to go.
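
I won't rehash the whole script, but the shape of it is roughly the sketch below. Everything in it is a placeholder - the target IQNs and portal IPs come from Storage Center, and it's worth double-checking the positional parameters (including how a session gets tied to a particular initiator NIC) against iscsicli's built-in help and the post above before running anything:

    @echo off
    rem Rough sketch only - substitute your own target IQNs and portal IPs.
    rem Login flags 0x2 enables multi-pathing for the session, "*" accepts the
    rem initiator defaults, and the trailing 0 is the mapping count.

    set TGT1=iqn.2002-03.com.compellent:5000d31000aaaa01
    set PORTAL1=10.10.10.11

    rem One persistent session to TGT1 via PORTAL1 - repeat per NIC/target/portal.
    iscsicli PersistentLoginTarget %TGT1% T %PORTAL1% 3260 * * * 0x2 * * * * * * * * * 0

Once the sessions are in place, iscsicli SessionList (or the iSCSI initiator GUI) is a quick way to confirm that you really do have one session per NIC to each target.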

After configuring the additional sessions, I was able to max out bandwidth on all 4 10GbE ports using a block size as low as 128K on a single LUN.
I suspect this oversight is common, as I think the same happened even in our POC setup, and it's not intuitive that you need to do it.
Coming from EQL, you expect it to be done automatically... the HIT Kit makes you lazy... :-)
Hopefully this will help you saturate all the bandwidth you have available, even if you have fewer NICs.


Friday 3 February 2012

Compellent too clever for SQLIO


I was doing some testing during my Compellent admin training yesterday and came across some rather interesting behaviour.
My IO testing tool of choice is SQLIO, as it allows for very quick and easy performance baselining of your storage.
It's not SQL-specific as the name would suggest; it's command-line driven and can easily be adapted to test almost any IO profile.
The best way to test is to create a test file larger than the cache on the SAN, to ensure that the test is not completely cached.
(Unless of course you want to test cache performance...)
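
For illustration, a parameter file and run along these lines is the sort of thing I mean (the path and sizes are only examples). The fields in the parameter file are the test file path, number of threads, affinity mask and file size in MB, and SQLIO will create the file at that size on its first run - full of zeros, as it turns out, which matters below:

    param.txt:
    E:\testfile.dat 4 0x0 10240

    sqlio -kR -s120 -frandom -o8 -b8 -LS -BN -Fparam.txt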

In this instance I created a 10GB file, which is significantly bigger than the 3.5GB read cache on the Compellent.
I ran my tests, but considering that the underlying disks were only 5 x 15K drives, the performance was WAY too high (30K IOPS - that would save a lot on SSDs).
In addition, RAID 5 was writing as fast as RAID 10 - so something was fishy...

Having a look at the disk utilisation on the SAN, I could see why straight away: the SAN was only storing changed data!
My 10GB file was using less than 150MB, because the rest was filled with zeros and the Compellent does not write large blocks of sequential zeros...
[Screenshot: Volume with 10 GB data file...]
As a result, my test file was easily fitting into cache - it was basically just metadata - which also explains why RAID 5 and RAID 10 were performing the same: 100% cache hits for both reads and writes!!

Disabling the cache allowed me to get a bit further but the fantastic thin write technology means that I'm going to have to get creative if I want to continue using SQLIO in future.

So lessons learned (by me) are:
1) When doing testing, always have expectations of what the outcome should be, both as a sanity check and to make sure the results are valid.
2) Make sure that your synthetic test works for the platform you are testing!


P.S. I subsequently found a good explanation by Grant Fritchey of how SQLIO works and of similar phenomena, which confirms my observations:
http://www.simple-talk.com/community/blogs/scary/archive/2011/06/28/102058.aspx