SQL Storage 101
Having completed my session at SQLBits, I felt that there was too much information that I didn't cover as comprehensively as I would have liked. So what I'm going to do is break up some of my session into a series of posts to try to flesh out a high-level understanding of SQL storage.
Welcome to the first part of SQL Storage for the Real World – Storage 101.
Disk IO
Disk IOs are the physical reads and writes that make their way to the storage medium of your SQL Server. The two types are writes and reads, and as their names imply, data is either written to or read from the drive, from or into memory on the server. SQL Server predominantly uses 8K – 64K block sizes, but these can be bigger when the server is doing large sequential operations such as read-aheads, backups, CHECKDBs etc. In reality you will want to test your storage subsystem at 8K, 64K, 256K and possibly 1024K block sizes to get a feel for what it is capable of. I will expand on testing your storage, especially with SQLIO, in the near future.
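In the meantime, here's a minimal sketch of what that block-size sweep might look like, driving SQLIO from Python. The switches shown (-kR for reads, -frandom for random access, -b for block size in KB) are standard SQLIO flags, but the test file path, thread count and outstanding-IO depth are assumptions you'd tune for your own environment:

```python
import subprocess

# Assumed path to a pre-created SQLIO test file - adjust for your environment.
TEST_FILE = r"E:\sqlio\testfile.dat"

# Block sizes (KB) worth testing for SQL Server workloads.
BLOCK_SIZES_KB = [8, 64, 256, 1024]

for kb in BLOCK_SIZES_KB:
    cmd = [
        "sqlio",
        "-kR",          # read test (-kW for writes)
        "-frandom",     # random access (-fsequential for sequential)
        f"-b{kb}",      # block size in KB
        "-t4",          # 4 worker threads (assumption - tune per host)
        "-o8",          # 8 outstanding IOs per thread (assumption)
        "-s120",        # run each test for 120 seconds
        "-BH",          # hardware buffering only (bypass the OS cache)
        "-LS",          # capture latency statistics
        TEST_FILE,
    ]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)
```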
Read / Write Ratio
To understand the requirements that your storage subsystem will have to fulfil, you will need to find out the ratio between reads and writes under peak load. The reason is that, depending on the storage you choose, the number of writes the system can sustain will differ drastically with this ratio. For example, SSDs generally have different specifications for reads and writes. Some can be drastically slower to write data, and some can be significantly faster; this comes down to manufacturer and model. EqualLogic SSDs, for example, have much lower write speeds than read speeds, and as a result the specified IO performance of a member is highly dependent on the R/W ratio. FusionIO PCIe SSDs, on the other hand, are able to write much faster than they read, so read speed may well be the limiting criterion.
Individual hard drives, by contrast, have more symmetrical R/W performance, but are generally used in RAID configurations. Depending on the type of RAID, this can have a massive effect on the number of writes a set number of disks can handle. For example, a RAID 0 stripe has no write overhead, so its write throughput should be roughly the sum of the individual drives' throughput, and equal to its read throughput. RAID 10 arrays, on the other hand, have to write each block twice, and hence can only sustain half as many write IOs as read IOs. This gets progressively worse for RAID 5 and RAID 6, which cost roughly 4 and 6 physical IOs per write respectively because of parity reads and writes. This phenomenon is termed the RAID Factor. Some vendors have put in place mechanisms to mitigate the impact of this; NetApp is an example that comes to mind.
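To make the RAID Factor concrete, here is a small sketch of the commonly used effective-IOPS formula. The per-level write penalties (1 for RAID 0, 2 for RAID 10, 4 for RAID 5, 6 for RAID 6) are the standard textbook values; the raw IOPS figure is a made-up illustration, not a measurement:

```python
# Standard write penalties: how many physical IOs each logical write costs.
WRITE_PENALTY = {"RAID0": 1, "RAID10": 2, "RAID5": 4, "RAID6": 6}

def effective_iops(raw_iops: float, read_ratio: float, raid_level: str) -> float:
    """Usable IOPS once the RAID write penalty is accounted for.

    raw_iops   - combined raw IOPS of all spindles in the array
    read_ratio - fraction of the workload that is reads (0.0 to 1.0)
    """
    write_ratio = 1.0 - read_ratio
    penalty = WRITE_PENALTY[raid_level]
    return raw_iops / (read_ratio + write_ratio * penalty)

# Illustrative only: 8 drives x 150 IOPS each, under a 70/30 read/write mix.
raw = 8 * 150
for level in WRITE_PENALTY:
    print(f"{level}: {effective_iops(raw, 0.70, level):.0f} effective IOPS")
```

Running this shows how quickly writes eat into capacity: the same 8 drives that deliver 1200 IOPS in RAID 0 deliver only around 630 in RAID 5 and 480 in RAID 6 at a 70/30 mix.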
Determining your read/write ratio is relatively easy: simply run Perfmon over the busiest periods of disk activity, capturing Disk Reads/sec and Disk Writes/sec, pull the data into Excel and work out a 95th percentile. It is common to see a 70/30 read/write ratio on an OLTP or mixed workload. However, this can skew massively in either direction; on one of my systems I have seen a sustained 45/55 ratio, meaning that there were more writes than reads.
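If you'd rather script it than use Excel, a minimal sketch along these lines works too. The column headers below are assumptions - Perfmon exports the full counter path (e.g. \\MYSERVER\PhysicalDisk(_Total)\Disk Reads/sec) as the CSV header, so match them to your own log:

```python
import csv
import statistics

# Assumed column headers - Perfmon writes full counter paths, so adjust these.
READS_COL = r"\\MYSERVER\PhysicalDisk(_Total)\Disk Reads/sec"
WRITES_COL = r"\\MYSERVER\PhysicalDisk(_Total)\Disk Writes/sec"

reads, writes = [], []
with open("disk_counters.csv", newline="") as f:
    for row in csv.DictReader(f):
        try:
            reads.append(float(row[READS_COL]))
            writes.append(float(row[WRITES_COL]))
        except (KeyError, ValueError):
            continue  # skip blank or malformed samples

# quantiles with n=100 yields percentile cut points; index 94 is the 95th.
p95_reads = statistics.quantiles(reads, n=100)[94]
p95_writes = statistics.quantiles(writes, n=100)[94]

total = p95_reads + p95_writes
print(f"95th percentile: {p95_reads:.0f} reads/sec, {p95_writes:.0f} writes/sec")
print(f"Read/write ratio: {100 * p95_reads / total:.0f}/{100 * p95_writes / total:.0f}")
```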
Random vs Sequential
There are 4 main pillars of disk performance:
· Random Read
· Random Write
· Sequential Read
· Sequential Write
Each of these profiles needs to be tested and validated against what you will be expecting from your system. Depending on your requirements, some types of data access will be more valuable than others. For example, sequential writes are extremely important for log files, while random reads are more important for OLTP data files; data warehouses, on the other hand, generally favour sequential reads. The good / bad news is that no load is 100% random and no load is 100% sequential, and I'm not aware of an accurate way of measuring random vs sequential. You can argue that some access types will be more prevalent in certain types of databases, but remember that almost all disk access is abstracted or virtualised in some way. For example, some SSD controllers cache and then compress incoming writes before they are stored in NAND, and more than likely your SAN is not storing your writes on a single disk in the exact order they came in. In the world of hypervisors, almost all access will end up being random in any case.
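Covering all four pillars is just a matter of crossing the two SQLIO switches from the earlier sketch: -kR/-kW for reads vs writes, and -frandom/-fsequential for the access pattern. The same caveats apply - the file path and tuning parameters are assumptions:

```python
import itertools
import subprocess

TEST_FILE = r"E:\sqlio\testfile.dat"  # assumed path, as in the earlier sketch

# The four pillars: every combination of operation and access pattern.
for op, pattern in itertools.product(["-kR", "-kW"], ["-frandom", "-fsequential"]):
    cmd = ["sqlio", op, pattern, "-b64", "-t4", "-o8", "-s120", "-BH", "-LS", TEST_FILE]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)
```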
As a result, all you can really do is understand what your system is capable of, so that you can firstly check that it is performing as you expect, and secondly test your application to see if its performance is sufficient for your needs. How do you know if it is sufficient? That's where IO latency comes in, and that will be covered in the next post.