vPC failures can be disruptive. However, if you have a good understanding of vPC and you follow the Cisco recommended vPC design, you can handle Virtual Port Channel (vPC) failure scenarios with confidence. In this lesson, I will discuss different vPC failure scenarios, their impact on the network, and how to recover from each one in the Cisco recommended way.
Depending on your requirements, vPC can be designed as a single-sided (regular) vPC, a double-sided vPC, or a multilayer (DCI) vPC. Cisco's vPC design guide is a good reference for these designs.
vPC Failure Scenarios
I will discuss the following failures.
- vPC keep-alive link failure
- vPC peer-link failure
- Member port failure
- vPC Peer switch failure
- Dual Failure Scenarios
++ Case 1: Peer-link failure, followed by Keep-alive link
++ Case 2: Keep-alive link failure, followed by Peer-link
vPC keep-alive link failure
If only the keep-alive link fails, there is no impact on traffic. The only thing lost is the heartbeat between the primary and secondary nodes.
Restore the link as early as possible. If the peer-link also fails while the keep-alive is down, you end up in the double failure described in Case 2 below.
vPC peer-link failure
If the peer-link fails, all the vPC member ports on the secondary node are suspended. It is important to note that the keep-alive link is still active in this scenario, so the two nodes keep exchanging heartbeats. That is how the secondary node knows the primary is still alive.
Restore the peer-link and make sure it is up and running.
The peer-link is critical, so it is a really good idea to build it as a port channel with multiple ports from multiple modules. That way, if a link or a module goes faulty, the other links remain up and active.
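The module diversity advice above is easy to check with a small script. Below is a minimal sketch, not a Cisco tool: the function names are my own, the interface names are made-up examples, and it assumes plain "module/port" naming such as Ethernet1/1 (breakout names are rejected).

```python
import re

def modules_used(interfaces):
    """Return the set of module (slot) numbers for names like 'Ethernet1/1'."""
    modules = set()
    for name in interfaces:
        # Assumption: names follow the plain <type><module>/<port> pattern.
        match = re.fullmatch(r"[A-Za-z-]+(\d+)/\d+", name)
        if match is None:
            raise ValueError(f"unexpected interface name: {name}")
        modules.add(int(match.group(1)))
    return modules

def peer_link_is_resilient(interfaces):
    """True if the peer-link has at least two ports spread over at least two modules."""
    return len(interfaces) >= 2 and len(modules_used(interfaces)) >= 2

# Hypothetical peer-link members, one port on each of two modules.
print(peer_link_is_resilient(["Ethernet1/1", "Ethernet2/1"]))
# Both ports on the same module: a single module failure takes the peer-link down.
print(peer_link_is_resilient(["Ethernet1/1", "Ethernet1/2"]))
```

You could feed it the member list of your own peer-link port channel to confirm that a single module failure cannot take down every member.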
Member port failure
If a member port fails for a particular end host, only that host is affected. All other member ports stay operational. If one of the host's two links fails, its traffic flows through the other link. If both links fail, that end host has a full outage.
Bring the failed member ports back up and verify that they are running.
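The member port logic can be written down as a tiny truth table. This is only an illustration with names I made up (host_impact and the three impact labels); it assumes a host that is dual-homed with one link to each vPC peer.

```python
def host_impact(link_to_peer1_up: bool, link_to_peer2_up: bool) -> str:
    """Describe the impact on one dual-homed host for a given pair of link states."""
    links_up = [link_to_peer1_up, link_to_peer2_up].count(True)
    if links_up == 2:
        return "normal"      # both member links forward traffic
    if links_up == 1:
        return "degraded"    # traffic moves to the surviving link
    return "outage"          # no path left for this host

for state in [(True, True), (True, False), (False, True), (False, False)]:
    print(state, host_impact(*state))
```

Other hosts are not part of the function on purpose: a member port failure for one host does not change the result for any other host.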
vPC Peer switch failure
If the primary switch fails, the secondary switch is promoted to operational primary and forwards all the traffic. If the secondary switch fails, the primary keeps forwarding traffic as before.
Bring the failed peer switch up. Then make sure the keep-alive link is up and operational. After that, move on to the peer-link, and lastly to the member ports.
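The takeover rule and the recovery order above can be captured in a few lines. This is a rough sketch for a runbook script, with hypothetical names (RECOVERY_ORDER, role_after_peer_failure, next_recovery_step); it does not talk to any switch.

```python
from typing import Optional, Set

# Recovery order from the text: switch first, member ports last.
RECOVERY_ORDER = ["peer switch", "keep-alive link", "peer-link", "member ports"]

def role_after_peer_failure(failed_role: str) -> str:
    """Describe what the surviving peer does when the other peer fails."""
    if failed_role == "primary":
        return "secondary becomes operational primary"
    if failed_role == "secondary":
        return "primary keeps forwarding"
    raise ValueError(f"unknown role: {failed_role}")

def next_recovery_step(restored: Set[str]) -> Optional[str]:
    """Return the first item in RECOVERY_ORDER that is not restored yet."""
    for step in RECOVERY_ORDER:
        if step not in restored:
            return step
    return None  # everything is restored

print(role_after_peer_failure("primary"))
print(next_recovery_step({"peer switch"}))
```

Because the function always returns the earliest missing step, it will not suggest working on the peer-link while the keep-alive link is still down.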
Dual Failure Scenarios
In the dual failure scenarios, the order of the two failures matters. I will discuss the following cases.
1. Peer-link failure, followed by keep-alive link failure
2. Keep-alive link failure, followed by peer-link failure
Case 1: Peer-link failure, followed by Keep-alive link
Here, the member ports on the secondary node are suspended first because the peer-link is down, while the heartbeat is still exchanged over the keep-alive link. Traffic flows through the primary peer switch. If the keep-alive link then fails as well, the suspended ports remain suspended and all the traffic keeps flowing through the primary node.
Bring up the keep-alive link first, and then work on the peer-link. You should maintain this order.
Case 2: Keep-alive link failure, followed by Peer-link
This failure is the most critical one. If the keep-alive link fails first, nothing happens, because the vPC peer roles are already decided. However, if the peer-link dies after the keep-alive link, the secondary vPC node receives no heartbeat from the primary node and assumes that the primary is completely down. So the secondary node becomes operational primary. In this case, both vPC nodes forward traffic as primary. This is called a split-brain scenario in vPC.
Make sure all the member ports on the secondary switch are down. Then bring up the keep-alive link. After the heartbeat (keep-alive) is restored, bring the peer-link up. Once the vPC has formed, bring the member ports back up.
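To see why the order of the two failures matters, here is a toy model of the secondary node's decisions in Python. It is a simplified sketch of the behaviour described in this lesson, not how NX-OS is implemented, and the class and attribute names (VpcPair, secondary_ports_suspended and so on) are my own.

```python
from dataclasses import dataclass

@dataclass
class VpcPair:
    peer_link_up: bool = True
    keepalive_up: bool = True
    secondary_ports_suspended: bool = False
    secondary_acts_as_primary: bool = False

    def fail(self, link: str) -> None:
        """Apply one link failure and update the secondary node's state."""
        if link == "keepalive":
            # Heartbeat is lost, but roles and ports do not change.
            self.keepalive_up = False
        elif link == "peer-link":
            self.peer_link_up = False
            if self.keepalive_up:
                # Heartbeat still arrives: the primary is alive, so the
                # secondary suspends its member ports.
                self.secondary_ports_suspended = True
            else:
                # No heartbeat: the secondary assumes the primary is dead
                # and takes over as operational primary.
                self.secondary_acts_as_primary = True
        else:
            raise ValueError(f"unknown link: {link}")

    @property
    def split_brain(self) -> bool:
        # In this model the real primary never dies, so a secondary that
        # takes over means two active primaries.
        return self.secondary_acts_as_primary

case1 = VpcPair()
case1.fail("peer-link")
case1.fail("keepalive")
print("Case 1 split brain:", case1.split_brain)

case2 = VpcPair()
case2.fail("keepalive")
case2.fail("peer-link")
print("Case 2 split brain:", case2.split_brain)
```

The same two failures give different results: in Case 1 the ports are already suspended when the heartbeat disappears, while in Case 2 the secondary has no heartbeat to rely on at the moment the peer-link drops.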