Using Dedicated ports for inter-node communication and replication

ORIGINALLY POSTED 13th August 2017

I’m in Melbourne this week because I’m attending the Technical University here, which starts tomorrow. So I thought this would be a good time to get through one of my backlog of blog posts.

For a number of years now it has been recommended to have dedicated ports for inter-node communication on SVC. However, I’ve come across a couple of customers recently who haven’t quite configured it correctly, so I thought it would be a good idea to go through it here.

Let’s start with the basics…

What inter-node communication does SVC/Storwize do?

The nodes in a clustered system need to communicate with each other for a large number of reasons, including:

  • Mirroring data in cache
  • Keeping redundant copies of critical metadata in sync between the nodes, including FlashCopy metadata.
  • Making configuration changes
  • And many more.

What network is used for this communication?

The inter-node communication uses slightly different networks depending upon your configuration:

  • If you have a Storwize system
    • All communication between node canisters in the same IO group will occur across the PCI link on the midplane.  If the PCI midplane has a failure then a Fibre Channel network will be used as a back-up link
    • All communication between node canisters in different IO groups will occur on the Fibre Channel network
  • For SVC, all of this communication goes across the Fibre Channel network.

Should I do something special for this inter-node communication?

If you have a single IO group Storwize system, then you don’t need to do anything more.   However it is a good idea to create a zone in your fabric that allows any FC ports from node canister 1 to communicate with node canister 2, just in case there is a PCI failure in the midplane link.
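
As an illustration, here is roughly what that zone might look like on a Brocade-based fabric (this is only a sketch: the zone name, config name and WWPNs are placeholders, and only one WWPN per canister is shown, so substitute the real port WWPNs of both canisters and repeat the equivalent on each fabric):

zonecreate "storwize_canister1_canister2", "50:05:07:68:xx:xx:xx:01; 50:05:07:68:xx:xx:xx:02"
cfgadd "my_fabric_cfg", "storwize_canister1_canister2"
cfgenable "my_fabric_cfg"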

If you have a multi-IO group Storwize system, you can consider using dedicated inter-node communication ports if you have enough free FC ports. Take a read of the SVC advice below to understand whether you want to do it or not. However, the default Storwize configuration is designed to keep the communication between nodes in different IO groups to a minimum (e.g. one storage pool only uses arrays from a single control enclosure). Therefore you shouldn’t need to worry about this too much unless you have a very large Storwize system.

For SVC systems with 4 FC ports per node: SVC was always designed to use all of its ports for all of the communications (host, storage, inter-node and replication). If you still have SVC nodes with only 4 FC ports per node, then you should keep doing this. You might want to read the details below to decide whether you should upgrade to an 8 port (or higher) SVC node.

Finally, for SVC systems with 8 or more FC ports per node, we have been recommending that you start dedicating ports for different types of traffic.

Why do you recommend dedicating ports?

Keeping the inter-node communication flowing as fast as possible is critical to getting the best possible performance from your SVC or Storwize system. Our experience is that Fibre Channel can experience temporary slowdowns, based on traffic patterns, that can reduce SVC performance. By dedicating ports we can either avoid these slowdowns or keep the SVC inter-node communications isolated from them.

What are the recommendations?

For simplicity I’m going to say “SVC” in the rest of this blog post.  But the advice applies equally to Storwize.

I’ll also make reference to local and remote port masking here, and then I’ll explain it in more detail later.

1. Replication ports

If you are replicating your SVC to another SVC using Metro or Global Mirror replication across Fibre Channel (i.e. not using SVC’s IP replication feature) then this traffic should be isolated to its own SVC FC ports.

We recommend 2 ports per node (one for each fabric).

This is because the long distance link is the most likely link to get overloaded (especially if you are using an FCIP router). By keeping the replication traffic on its own ports, an overloaded long distance link cannot also overload the ports carrying other traffic.

IMPORTANT: You should make sure that the replication traffic does not share any ISLs with any host traffic, otherwise this isolation will not work correctly.  Ideally plug these SVC ports into the same director that’s connected to the long distance link.

  • You must use the same FC port ID on every node in the cluster when doing this type of isolation
  • You must use local port masking to ensure that the SVC does not use these replication ports for communication between nodes in the same cluster (see the command sketch after this list).
  • You can use remote port masking to ensure that the SVC will not use any other ports for replication, however Zoning is normally sufficient to achieve this isolation.
  • You must use zoning to ensure that no host or storage uses these physical ports
    • Remember that if you are using the NPIV feature then this means that you should not use the NPIV port associated with the physical port either.
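
As a preview of the port masking section later in this post, here is a hedged sketch of what the masking side of this looks like on the CLI. Assuming, purely for illustration, that fc_io_port_ids 5 and 6 are your dedicated replication ports, the partner (remote) mask enables only those two ports, and your local mask must not include them (some code levels may expect the full 64-digit mask rather than this shortened form):

chsystem -partnerfcportmask 110000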

2. Inter-node Ports

If you have more than 4 ports we recommend that you dedicate ports to doing nothing but inter-node communication (when we talk about inter-node, we almost always mean between nodes in the same cluster). The reason is that any problems on the FC ports that are talking to other clusters, hosts or storage then can’t affect the SVC performance.

We recommend 2 ports per node (one for each fabric). However, if you expect to exceed 3 GB/s (gigabytes, not gigabits) of write data rate then you will need more than 2 inter-node ports to achieve the bandwidth requirements.
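
As a rough sanity check on that number (my assumption here is 16Gb FC ports, which each carry roughly 1.6 GB/s of data): 2 ports × ~1.6 GB/s ≈ 3.2 GB/s, which is why a write data rate of around 3 GB/s is the point at which two dedicated inter-node ports stop being enough.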

I strongly recommend that all of these ports be attached to the same FC director or switch. Keeping this traffic inside a single director is the best way to get the best performance. Obviously with HyperSwap or Stretched Cluster you can’t achieve this.

  • You must use the same FC port ID on every node in the cluster when doing this type of isolation
  • You must use local port masking to ensure that the SVC does not use any other ports for communication between nodes in the same cluster (see the command sketch after this list).
    • IMPORTANT: If you haven’t configured a local port mask then you have not successfully dedicated ports for inter-node communication.
  • You can use remote port masking to ensure that the SVC will not use these ports for replication, however Zoning is normally sufficient to achieve this isolation.
  • You must use zoning to ensure that no host or storage uses these physical ports
    • Remember that if you are using the NPIV feature then this means that you should not use the NPIV port associated with the physical port either.
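
Again, as a preview of the port masking section below: assuming purely for illustration that fc_io_port_ids 3 and 4 are your dedicated inter-node ports, the local mask would enable only those two ports (and, as before, some code levels may expect the full 64-digit mask):

chsystem -localfcportmask 1100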

3. Host and Storage

SVC is quite happy sharing the same ports for host and backend storage traffic. If you have at least 8 ports left after the inter-node and replication traffic has been isolated then you could consider separating host and storage traffic as well. But don’t use fewer than 4 ports each for these purposes, as you are likely to create a bottleneck that is worse than simply leaving host and storage traffic on shared ports.

4. Dedicated ports for super fast storage

If you’ve got a lot of ports and you’re not quite sure what to do with all of them you could consider dedicating ports for your tier 0 storage to get the best possible response times from that storage.  If you do this then the SVC ports and the tier 0 ports should all be in a single director or switch.

5. Special hosts

Some of you probably have some hosts that are enormous or super critical.  You could consider putting these onto their own ports.

Alternatively maybe you still have some servers running with 2Gb FC ports that could cause a slowdown in the FC network.  It can often be a good idea to isolate these older slower hosts onto their own FC ports on the SVC.

What is this port masking feature you’ve been talking about?

When we decided that we needed to start isolating inter-node communication from all other traffic, it became obvious that doing this with zoning alone (preventing, say, port 1 on one SVC node from logging into any FC port on another SVC node) would have required single-initiator, single-target zoning everywhere, and we considered this too onerous an overhead for administrators. Also, just one incorrect host zone could mess everything up.

So we needed a way to tell the product to ignore available FC logins, rather than relying on zoning to fix this. And so the port masks were born.

Let me try and explain how they work with a worked example.  Here is an lssystem output (it’s actually the example from the knowledge center):

id 000002006C40A278
name cluster0
location local
[...]
easy_tier_acceleration off
local_fc_port_mask:0000000000000000000000000000000000000000000000000000000000001100
partner_fc_port_mask:0000000000000000000000000000000000000000000000000000000000110000

topology:standard
[...]

These look really intimidating the first time you see them – but they aren’t as scary as they look.

The local port mask is set to lots of zeros followed by 1100. 1 means enabled for this function (the local port mask configures communication between nodes in this cluster), and 0 means disabled. The other thing to know is that you need to read this backwards: the last digit represents the first port. So how do we read 1100?

  • Port 4 is enabled for local inter-node
  • Port 3 is enabled for local inter-node
  • Port 2 is disabled for local inter-node
  • Port 1 is disabled for local inter-node
  • All ports with higher numbers than 4 are disabled for local inter-node (because they are all zeros)

Equally, the remote (partner) port mask is lots of zeros followed by 110000:

  • Port 6 is enabled for replication
  • Port 5 is enabled for replication
  • Port 4 is disabled for replication
  • Port 3 is disabled for replication
  • Port 2 is disabled for replication
  • Port 1 is disabled for replication
  • All ports with higher numbers than 6 are disabled for replication (because they are all zeros)

And the next question you are probably asking is “which port is port 4?”. The answer is that it’s the port which has fc_io_port_id 4 in the lsportfc output. It’s important to remember that port 4 really means port 4 on every node in the cluster, not just port 4 on a single node.
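
To pull all of that together, here is a hedged sketch of the commands that would produce the example masks above, assuming fc_io_port_ids 3 and 4 are dedicated to inter-node traffic and 5 and 6 to replication. Check the output of lsportfc first to see which physical port on each node has which fc_io_port_id, run the two chsystem commands, and then re-run lssystem to confirm that local_fc_port_mask ends in 1100 and partner_fc_port_mask ends in 110000 (as with the earlier previews, some code levels may expect the full 64-digit masks):

lsportfc
chsystem -localfcportmask 1100
chsystem -partnerfcportmask 110000
lssystem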

My brain is hurting. Can you simplify this at all?

I agree. When I started writing this I thought it was going to be a nice short one, and it feels like I’ve been writing it forever.

When we launched the DH8s we put together some idiot’s guides describing what to use each port for and what port mask you should use for that configuration.

https://www.ibm.com/docs/en/sanvolumecontroller/8.4.x?topic=pc-planning-more-than-four-fabric-ports-per-node

The same idiot’s guide will work just fine for SV1 nodes as well as large Storwize clusters.

Note that the idiot’s guide tells you which ports to use, whereas here I’ve just told you how many ports to use. It doesn’t matter which ports are used for the different traffic, but if you want to keep your life simple then the idiot’s guide approach should work for *most* people without unnecessary reconfiguration of hosts.

I hope that this has been helpful, and that you still have some hair left when you get to the end.

Andrew

3 responses to “Using Dedicated ports for inter-node communication and replication”

  1. This link is broken:
    https://www.ibm.com/developerworks/community/blogs/storagevirtualization/entry/2145_dh8_port_designation_recommendations?lang=en

    And only takes me to your main page – I can’t find that article referenced as the “idiots guide”

    1. Thanks for letting me know about the broken link.

      That information did get moved into the documentation – so here’s a link that has it.

      https://www.ibm.com/docs/en/sanvolumecontroller/8.4.x?topic=pc-planning-more-than-four-fabric-ports-per-node

      You can see that there’s a table with a suggested use for each port

      1. Excellent! Thank you for the quick reply. I have seen that page before, but couldn’t remember where I had seen it or if the broken link had contained more information.
