This guide outlines a streamlined upgrade process for network fabric infrastructure, designed to help users modernize and manage their network environments efficiently. It provides step-by-step instructions using both the Azure portal and Azure CLI, covering lifecycle management of Nexus Fabric network devices. Regular updates are crucial for maintaining system integrity and accessing the latest product improvements.
Overview
Runtime bundle components: These components require operator consent for upgrades that may affect traffic behavior or necessitate device reboots. The network fabric's design allows for updates to be applied while maintaining continuous data traffic flow.
Runtime changes are categorized as follows:
- Operating system updates: Necessary to support new features or resolve issues.
- Base configuration updates: Initial settings applied during device bootstrapping.
- Configuration structure updates: Generated based on user input for configurations like isolation domains and ACLs. These updates accommodate new features without altering user input.
By following this guide, users can ensure a consistent, scalable, and secure approach to upgrading their network fabric components.
Required Pre-Upgrade Validations
Before initiating the Network Fabric (NF) Runtime Upgrade process, users are required to validate the resource states listed below. These proactive validation steps help prevent upgrade failures and service interruptions. If the required resource states are not met, the NNF upgrade process should be stopped.
| Check | Expectation | Post Upgrade Check Applicable? | RT Upgrade Failure Phase |
|---|---|---|---|
| Check for NFC provisioning state | Provisioning state must be in "Succeeded" | No | Fabric upgrade start step will fail |
| Check for Administrative lock status of Network Fabric resource | Must be in unlocked state (see "How to Use Administrative Lock or Unlock Network fabric" in the Operator Nexus documentation) | No | Fabric upgrade start step will fail |
| Network Fabric resource state checks | Resource states must be validated: • Administrative state is "Enabled" • Provisioning state is "Succeeded" • Configuration state is "Provisioned" | Yes | Fabric upgrade start command will fail |
| Fabric Devices - NPB, TOR, CE, Mgmt switch | Resource states must be validated: • Administrative state is "Enabled" • Provisioning state is "Succeeded" • Configuration state is "Succeeded" | Yes | Device upgrade command will fail for corresponding device |
| NNF device disk space | Minimum 3.0 GB free space in the /mnt directory of every network device being upgraded | No | Device upgrade command will fail for corresponding device |
| BGP Summary Validation | Ensure BGP sessions are established across all VRFs (show ip bgp summary vrf all runro command on CEs) | Yes | CE Device upgrade command will fail (probable connectivity issue with PE) |
| GNMI Metrics Emission | Confirm GNMI metrics are being emitted for subscribed paths | Yes | Device upgrade command will fail for corresponding device (Probable connectivity issue) |
| Terminal Server | The Terminal Server must be confirmed to be accessible and running | No | Fabric upgrade start command will fail |
| • NetworkToNetworkConnect (NNI) • Network Interfaces referred in NNI • Network Monitor (BMP) • ACLs & Associated resources • Ingress ACLs, CPU & CP TP ACLs • L2ISD Resources • L3ISD Resources • Route Policies • IPPrefixes & TrustedIpPrefixes • IP Communities • IP Extended Communities | When the resource's Administrative state is "Enabled": • Provisioning state must be "Succeeded" • Configuration state must be "Succeeded" When the resource's Administrative state is "Disabled", the resource has no impact on the runtime upgrade | No | Fabric upgrade start command will fail |
| Internal and External Networks referred in L3 ISD | When the L3 ISD Administrative state is "Enabled": • Internal & External Networks Administrative state is "Enabled" • Provisioning state is "Succeeded" • Configuration state is "Succeeded" | No | Fabric upgrade start command will fail |
| Network Tap | When the resource's Administrative state is "Enabled": • Provisioning state must be "Succeeded" • Configuration state must be "Succeeded" or "Accepted" | No | Fabric upgrade start command will fail |
| Network Tap Rule, NNI and Internal network associated with Network Tap | When the parent Network Tap's Administrative state is "Enabled": • Provisioning state must be "Succeeded" • Configuration state must be "Succeeded" or "Accepted" | No | Fabric upgrade start command will fail |
| Neighbour Group associated to Network Tap | When the parent Network Tap's Administrative state is "Enabled": • Provisioning state must be "Succeeded" | No | Fabric upgrade start command will fail |
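The conditional rule shared by several rows above (a resource whose Administrative state is "Disabled" has no impact on the upgrade, while an "Enabled" one must have provisioning "Succeeded" and an accepted configuration state) can be sketched as a small Python check. This is a minimal sketch, not part of the product: the function name `blocking_issues` is hypothetical, and it assumes the resource JSON exposes `administrativeState`, `provisioningState`, and `configurationState` keys, as suggested by the CLI output excerpts and queries in this guide.

```python
def blocking_issues(resource, allowed_config_states=("Succeeded",)):
    """Return the reasons a resource would block the runtime upgrade.

    `resource` is a dict parsed from the resource's JSON representation.
    The key names are assumptions based on the output excerpts in this guide.
    """
    # Per the table, a Disabled resource has no impact on the runtime upgrade.
    if resource.get("administrativeState") == "Disabled":
        return []

    issues = []
    provisioning = resource.get("provisioningState")
    if provisioning != "Succeeded":
        issues.append(f"provisioningState is {provisioning!r}")

    configuration = resource.get("configurationState")
    if configuration not in allowed_config_states:
        issues.append(f"configurationState is {configuration!r}")
    return issues
```

For a Network Tap and its associated resources, the table also accepts the "Accepted" configuration state, which would correspond to calling `blocking_issues(tap, allowed_config_states=("Succeeded", "Accepted"))`. An empty list means the resource does not block the upgrade under these assumptions.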
Recommended Pre-Upgrade Validations
Before initiating the Network Fabric (NF) Runtime Upgrade process, it is recommended that users validate the resource states listed below. These checks do not block the upgrade, but they should be run before and after it to confirm that the state remains consistent.
| NNF Resource | Expectation |
|---|---|
| Cable validation of Network Fabric | All link connections should be up and stable per BOM description (see "Validate Cables for Nexus Network Fabric" in the Operator Nexus documentation) |
NNF Upgrade Procedure
Step 0: Network Fabric Status
az networkfabric fabric show -g xxxxxx --resource-name xxxxxxx
Excerpts of the Expected output:
**"administrativeState": "Enabled",**
**"configurationState": "Provisioned"**
"fabricASN": 65025,
"fabricVersion": "5.0.0",
"fabricLocks": [ { "lockState": "Disabled", "lockType": "Configuration" } ]
Step 1: Trigger Upgrade
The Nexus Network Fabric customer triggers the upgrade POST action on the NetworkFabric resource through the Azure CLI or Azure portal, supplying the target version as the request payload:
Sample az CLI command
az networkfabric fabric upgrade -g xxxx --resource-name xxxx --action start --version "6.1.0"
As part of the above POST action request, Managed Network Fabric Resource Provider (RP) performs a validation check to determine whether a version upgrade is permissible from the current fabric version.
The above command places the Network Fabric in "Under Maintenance" mode and prevents any create or update operation within the Network Fabric instance.
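Before triggering the start action, a local sanity check on the version strings can catch obvious mistakes such as requesting the version the fabric already runs. The sketch below is only an illustration: this guide does not describe the rules the Resource Provider actually applies, so "target must be strictly newer than current" is an assumption, as are the function names and the expectation that versions are purely numeric and dot-separated (as in "5.0.0" and "6.1.0").

```python
def parse_version(text):
    """Turn a dotted version such as "6.1.0" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_newer(current, target):
    """Return True when `target` is strictly newer than `current`.

    This is a local sanity check only; the service performs its own
    validation, whose rules are not covered by this guide.
    """
    # Tuples compare element by element, so 6.10.0 correctly ranks above 6.2.0.
    return parse_version(target) > parse_version(current)
```

Here `is_newer("5.0.0", "6.1.0")` is true and `is_newer("6.1.0", "6.1.0")` is false. Comparing integer tuples rather than raw strings avoids lexical-ordering surprises with multi-digit components.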
Step 2: Trigger Upgrade Per Device
The Nexus Network Fabric customer triggers upgrade POST actions per device. Before each one, the resource states of the NNF device must be validated using either the Azure portal or Azure CLI:
- Provisioning state is in Succeeded state.
- Configuration state is in Provisioned state.
- Administrative state is in Enabled state.
Each NNF device enters maintenance mode once its upgrade is triggered. Traffic is drained and route advertisements are stopped.
NNF Upgrade sequence
- Odd-numbered TORs (in parallel).
- Even-numbered TORs (in parallel).
- Compute rack management switches (in parallel).
- CEs, one after the other in a serial manner. Stop the upgrade procedure if any CE upgrade operation fails. After each CE upgrade, wait five minutes to ensure that the recovery process is complete before proceeding to the next CE.
- Network Packet Broker (NPB) devices, in a serial manner.
- Aggregate rack management switches, one after the other in a serial manner.
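The sequence above can be sketched as a function that groups devices into ordered batches, where devices in the same batch may be upgraded in parallel and batches run one after another. This is a hedged illustration: the device records (`name`, `role`, `index`) and the role labels `TOR`, `CompMgmt`, `CE`, `NPB`, and `AggMgmt` are hypothetical and not part of any Azure API; `index` is assumed to be the TOR number used to tell odd from even.

```python
def plan_batches(devices):
    """Group devices into ordered upgrade batches.

    Each batch is a list of device names that may run in parallel;
    roles that must be upgraded serially get one device per batch.
    """
    def names(role, keep=lambda device: True):
        return sorted(d["name"] for d in devices
                      if d["role"] == role and keep(d))

    batches = [
        names("TOR", lambda d: d["index"] % 2 == 1),  # odd-numbered TORs
        names("TOR", lambda d: d["index"] % 2 == 0),  # even-numbered TORs
        names("CompMgmt"),                            # compute rack mgmt switches
    ]
    # CEs, NPBs, and aggregate rack management switches are upgraded serially.
    for role in ("CE", "NPB", "AggMgmt"):
        batches.extend([name] for name in names(role))

    # Drop empty batches so the plan only lists work that exists.
    return [batch for batch in batches if batch]
```

Keeping the plan as data makes it easy to review the order before anything is triggered. The five-minute wait between CE upgrades is not modeled here and would be applied by whatever executes the plan.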
As with the pre-upgrade validation steps, it is recommended to validate the NNF device resource states after triggering the upgrade, at the following checkpoints:
- After odd numbered TORs complete, prior to even numbered TORs upgrade.
- After CE1 upgrade, prior to CE2 upgrade.
- After Agg switch1 upgrade, prior to Agg switch2 upgrade.
Sample az CLI command
az networkfabric device upgrade --version 6.1.0 -g xxxx --resource-name xxx-CompRack1-TOR1 --debug
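For devices that must be upgraded serially, such as the CEs, the command above could be wrapped in a loop that stops at the first failure and waits between devices. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, it assumes the `az` command returns only once the device upgrade has finished, and it treats a non-zero exit code as a failure. The `run` and `sleep` parameters are injectable so the logic can be exercised without an Azure session.

```python
import subprocess
import time


def device_upgrade_argv(resource_group, device_name, version):
    """Build the per-device upgrade command from this guide as an argv list."""
    return ["az", "networkfabric", "device", "upgrade",
            "--version", version, "-g", resource_group,
            "--resource-name", device_name]


def upgrade_serially(resource_group, device_names, version,
                     wait_seconds=300, run=subprocess.run, sleep=time.sleep):
    """Upgrade devices one at a time, stopping at the first failure."""
    upgraded = []
    for position, name in enumerate(device_names):
        if position > 0:
            # Give the previous device time to recover (five minutes for CEs).
            sleep(wait_seconds)
        result = run(device_upgrade_argv(resource_group, name, version))
        if result.returncode != 0:
            raise RuntimeError(f"upgrade failed for {name}; stopping")
        upgraded.append(name)
    return upgraded
```

Raising on the first failure mirrors the instruction to stop the procedure if a CE upgrade fails, leaving the remaining devices untouched for investigation.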
Post validation for Step 2
After all Network Fabric device upgrades are completed, the user must ensure that none of the NNF devices are "Under Maintenance" and that every device's runtime version shows 6.1.0 by running the following command.
Sample az CLI command:
az networkfabric device list -g <resource-group> --query "[].{name:name,version:version}" -o table
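The version part of this check can be automated by parsing the list output. The sketch below assumes the same query is run with `-o json` instead of `-o table`, so that each row is an object with the `name` and `version` keys defined in the query; the function name `devices_off_version` is hypothetical. It does not cover the maintenance-state check, because this guide does not show which field reports it.

```python
import json


def devices_off_version(list_output_json, expected_version):
    """Return the names of devices whose version differs from the target.

    `list_output_json` is the JSON text produced by the device list query
    above when run with `-o json` (an assumption of this sketch).
    """
    rows = json.loads(list_output_json)
    return [row["name"] for row in rows
            if row.get("version") != expected_version]
```

An empty result means every listed device reports the expected runtime version; any returned name identifies a device to investigate before completing the fabric upgrade.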
Step 3: Complete Upgrade
Once all the NNF devices are successfully upgraded to the latest version (6.1.0), the Nexus Network Fabric customer runs the following command to take the network fabric out of maintenance state and complete the upgrade procedure.
Sample az CLI command
az networkfabric fabric upgrade --action complete --version "6.1.0" -g "<resource-group>" --resource-name "<fabric-name>" --debug
Once the Fabric upgrade is done, we can verify the status of the network fabric by executing the following az cli commands:
az networkfabric fabric show -g <resource-group> --resource-name <fabric-name>
az networkfabric fabric list -g <resource-group> --query "[].{name:name,fabricVersion:fabricVersion,configurationState:configurationState,provisioningState:provisioningState}" -o table
Step 4: Credential rotation (optional step).
The customer performing this action must validate each device's maintenance mode status after every cycle of credential rotation completes. No device should remain in the under-maintenance state after credential rotation.
Post Upgrade validation steps
| Post NNF RT Upgrade action | Expectation |
|---|---|
| Version compliance | All Network Fabric devices must be on RT version 6.1.0 |
| Maintenance status check | Ensure TOR and CE devices maintenance status is "NOT under Maintenance" (show maintenance runro command) |
| Connectivity Validation | Verify CE ↔ PE connections are stable or similar to the pre-upgrade status (show ip interface brief runro command) |
| Reachability Checks | Confirm all NF devices are reachable via jump server (ping <MA1_IP>, ping6 <Loopback6_IP>) |
| BGP Summary Validation | Ensure BGP sessions are established across all VRFs (show ip bgp summary vrf all runro command on CEs) |
| GNMI Metrics Emission | Confirm GNMI metrics are being emitted for subscribed paths (check via dashboards or CLI) |
Appendix
The following table outlines the step-by-step procedures for selected pre- and post-upgrade actions referenced earlier in this guide.
Each entry in the table corresponds to a specific action and gives detailed instructions, relevant parameters, and operational notes. This appendix serves as a practical reference for users carrying out the NNF upgrade procedure.
| Action | Detailed steps |
|---|---|
| Device image validation | Confirm the latest image version is installed by executing the "show version" runro command on each NF device: az networkfabric device run-ro -g xxxx --resource-name xxxx --ro-command "show version". The output must reflect the latest image version as per the release documentation. |
| Maintenance status check | Ensure TOR and CE devices are not under maintenance by executing the "show maintenance" runro command. The output must report "Maintenance mode is disabled". |
| Connectivity Validation | Verify CE ↔ PE connections are stable by executing the "show ip interface brief" runro command. |
| Reachability Checks | Confirm all NF devices are reachable via jump server: • MA1 address: ping <MA1_IP> • Loopback6 address: ping6 <Loopback6_IP> |
| BGP Summary Validation | Ensure BGP sessions are established across all VRFs by executing the "show ip bgp summary vrf all" runro command on CE devices. All peers must be in the Established state, consistent with the pre-upgrade state. |
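The "consistent with the pre-upgrade state" requirement for BGP sessions amounts to a before-and-after comparison. The sketch below illustrates that comparison only. It assumes the session states have already been collected into dictionaries keyed by (VRF, neighbor address); parsing the "show ip bgp summary vrf all" output is deliberately left out because this guide does not show its format. The function name `lost_sessions` and the sample addresses are hypothetical.

```python
def lost_sessions(pre, post):
    """Return peers that were Established before the upgrade but are not after.

    `pre` and `post` map (vrf, neighbor) tuples to a session state string.
    A peer missing from `post` is treated as no longer Established.
    """
    return sorted(peer for peer, state in pre.items()
                  if state == "Established"
                  and post.get(peer) != "Established")


# Illustrative data only; the VRF names and addresses are made up.
pre_upgrade = {("default", "192.0.2.1"): "Established",
               ("vrf-a", "192.0.2.5"): "Established"}
post_upgrade = {("default", "192.0.2.1"): "Established",
                ("vrf-a", "192.0.2.5"): "Active"}
regressions = lost_sessions(pre_upgrade, post_upgrade)
```

Peers that were already down before the upgrade are ignored on purpose, so the result highlights only sessions that regressed during the upgrade.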