ORIGINALLY POSTED 9th August 2007
12,805 views on developerworks
John and the OSSG have been discussing various aspects of SVC and Virtual SANs and I thought it was worth answering some of these questions in depth as well as clarifying some of OSSG’s answers.
“It’s essentially a Linux server cluster”
While this statement is kind of true, it has some nasty implications without further clarification. SVC does use a cut-down version of Linux, but the Linux kernel is simply used to bootstrap the box and then map all hardware into user space, giving the SVC code direct access to the hardware. Other than providing a filesystem internally to the nodes and some kernel services, the rest of the running code is 100% SVC controlled. We even wrote our own memory management to provide what was needed. As for clustering, I read the above to imply it was using some Linux-based clustering. This is not the case. The cluster management code came from a research project in IBM’s Almaden labs and was designed as a real-time, highly available, scale-out storage clustering service. The architecture is designed to scale out many times further than we currently support, but the doubling of the base node hardware every 12-18 months has provided more than enough extra oomph for eight nodes to still satisfy most customers’ needs. The cluster management forms the framework for all other I/O and Copy Services code that is written for SVC.
Each component within SVC that sits in what we call the ‘I/O fast path’ has a strict time-slice allocation to guarantee minimal overhead inside the SVC code from top to bottom of the product. The framework above facilitates this, and only with the entire code path under our control could we guarantee minimal latency and maximum control.
“you create latency”
As John correctly commented, all writes are essentially write-hits, as completion is returned to the host at the point the SVC cache has mirrored the write to its partner node. When the workload is heavy, the cache will destage write data based on an LRU algorithm, thus ensuring new host writes continue to be serviced as quickly as possible. The rate of destage is ramped up to free space more quickly when the cache reaches certain thresholds. This avoids any nasty cache-full situations.
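To make the destage behaviour a little more concrete, here is a toy sketch in Python. It is not SVC code and does not reflect the real implementation: the class name, the thresholds (50% and 80% full) and the destage rates (1 and 4 pages per tick) are all made-up values chosen purely for illustration. It only shows the general idea of an LRU write cache whose destage rate steps up as the cache fills.

```python
from collections import OrderedDict


class ToyWriteCache:
    """Illustrative write-back cache: LRU destage, rate ramps at thresholds."""

    def __init__(self, capacity, thresholds=((0.5, 1), (0.8, 4))):
        self.capacity = capacity
        # Pairs of (fullness fraction, pages destaged per tick).
        # These numbers are hypothetical, not SVC's real thresholds.
        self.thresholds = sorted(thresholds)
        self.pages = OrderedDict()  # least recently written page first
        self.destaged = []          # stands in for writes sent to the backend

    def destage_rate(self):
        """Return pages to destage per tick for the current fullness."""
        fullness = len(self.pages) / self.capacity
        rate = 0
        for level, pages_per_tick in self.thresholds:
            if fullness >= level:
                rate = pages_per_tick
        return rate

    def write(self, lba, data):
        """Accept a host write into cache and report completion."""
        if lba not in self.pages and len(self.pages) >= self.capacity:
            raise RuntimeError("cache full")
        self.pages[lba] = data
        self.pages.move_to_end(lba)  # mark as most recently written
        return "complete"

    def tick(self):
        """Destage the least recently written pages at the current rate."""
        for _ in range(min(self.destage_rate(), len(self.pages))):
            lba, _data = self.pages.popitem(last=False)
            self.destaged.append(lba)


cache = ToyWriteCache(capacity=10)
for lba in range(8):
    cache.write(lba, b"x")
cache.write(0, b"y")  # rewriting LBA 0 makes it the most recently written
cache.tick()          # cache is 80% full, so the higher rate applies
```

In this sketch the rewrite of LBA 0 protects it from the first destage pass, and the cache drops back below the lower threshold after a single tick, which is the "free up space before it gets full" effect described above.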
Reads will either be read-hits or read-misses. Sequential workloads get the benefit of both controller prefetch and SVC prefetch algorithms, giving the latest SVC nodes the ability to show more than 10GB/s on large-transfer sequential read-miss workloads. Random reads are again at the mercy of the storage, and this is where the ‘fast path’ comes in: it adds only tens of microseconds of latency on a read-miss. The chances are this will also be a read-miss on the controller, where a high-end system will respond in around 10 milliseconds. Additional latency of the order of magnitude introduced by SVC is therefore ‘lost in the noise’.
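As a rough back-of-the-envelope check on the ‘lost in the noise’ claim, the few lines of Python below compare an added latency in the tens of microseconds against a 10 millisecond controller read-miss. The specific overhead values (20, 50 and 90 microseconds) are assumptions picked to span "tens of microseconds"; they are not measured SVC figures, and the function name is just for this example.

```python
def added_latency_fraction(overhead_us, backend_latency_ms):
    """Return the added latency as a fraction of the backend response time."""
    backend_us = backend_latency_ms * 1000.0  # convert ms to microseconds
    return overhead_us / backend_us


# Hypothetical overheads spanning "tens of microseconds".
for overhead_us in (20, 50, 90):
    fraction = added_latency_fraction(overhead_us, 10.0)
    print(f"{overhead_us} us on a 10 ms read-miss adds {fraction:.1%}")
```

Under these assumptions the added latency stays below 1% of the controller's own response time.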
“are there any companies who create competitive products for SVC”
There are no direct competitors that provide the hardware and software in a single tested and validated SAN appliance package. DataCore and FalconStor will claim they do, but as I described above, unless you have complete control of the hardware you are using as your base for the software, you are at the mercy of the underlying operating system, specifically its I/O interfaces and the many layers of abstraction you have to go through before you can actually get your hands on an I/O. The main reasons for the lack of competition in the appliance space are that EMC and Hitachi chose alternative routes, and that a true synchronous virtualization appliance (like SVC) is darn hard to write and get right! You have the added considerations of all data flowing through the device and ensuring you don’t become the bottleneck. Many of the competition’s sales pitches will hammer down on this point, but as I’ve described above, the additional latency introduced by SVC is practically non-existent.
The impact of the various virtualization strategies is a huge topic…
“IBM charges a ‘partition license’ fee for LUN masking”
I think the confusion here has come from talking at ‘cross-products’. IBM does indeed charge a partition license for its midrange DS products. This means you are charged more for attaching more hosts. When you add SVC above your storage, you are essentially only using one ‘host’ in the eyes of the controller, so the basic license covers you for this. SVC has its own licensing and is licensed in terms of capacity virtualized. So you do still end up having to pay license fees, but for small and medium installations the basic license may be sufficient. As OSSG stated, the IBM-supplied multi-pathing software (SDD and SDDPCM) is provided free of charge with SVC. SVC also has a huge, ever-growing support matrix and supports most of the common OS-specific multi-pathing software and products such as Veritas. Customers can of course submit an RPQ to get a specific unsupported item tested and supported.