Tracking Broadcast Storms

SPOTO Cisco Expert

Few events are as disruptive as a broadcast storm. It can quickly saturate network links, pin switch CPUs, and bring user connectivity to a grinding halt. When this happens in a large campus environment, especially with a high-density switch stack like the Catalyst 9300 series, finding the source can feel like searching for a needle in a haystack. The challenge is to diagnose and resolve the issue quickly without taking the entire stack—and all its connected users—offline.

Recently, a community member raised this exact issue, sparking a helpful discussion on how to tackle the problem. Let’s break down that conversation and build a comprehensive guide to hunt down and prevent broadcast storms.

A Summary of the Community Discussion

A user with a network of stacked Cisco Catalyst 9300s asked for a method to trace the source of frequent broadcast storms without a full system reboot. The community provided several key suggestions:

  1. Check Interface Counters: The first piece of advice was to use the show interfaces command to look for interfaces with a rapidly increasing input packet count, focusing specifically on the broadcast counter. This is the primary reactive method for identifying the port where the storm is entering the network.
  2. Enable BPDU Guard: Another user suggested enabling BPDU Guard but disabling the automatic recovery of error-disabled ports. This is a proactive measure to prevent Layer 2 loops (a common cause of broadcast storms) by shutting down any access port that incorrectly receives Spanning Tree Protocol (STP) BPDUs. Disabling auto-recovery ensures a network administrator must manually investigate the port before it comes back online.
  3. General Investigation: Finally, it was noted that a thorough investigation requires checking logs, understanding the specific network configuration, and using Cisco’s configuration guides to implement appropriate traffic control measures.

These points are an excellent starting point. Now, let’s expand on them to create a systematic workflow for troubleshooting and prevention.

A Step-by-Step Guide to Hunting Down Broadcast Storms

This guide is structured to help you move from immediate reaction to long-term prevention.

Step 1: Immediate Triage - Find the Ingress Port

When the network is slow or users are reporting outages, your first priority is to find which port is receiving the flood of broadcast traffic. The command line is your best friend here.

The show interfaces command provides a wealth of information, but we can filter it to find what we need quickly.

  1. Connect to the primary switch in the stack via SSH or console.

  2. Run the following command:

    show interfaces | include is up|input rate|broadcasts
    
    • is up limits the output to the status line of interfaces that are up.
    • input rate includes each interface's 5-minute input rate line.
    • broadcasts shows the line containing the broadcast packet counters. Note that the include filter is case-sensitive, so the pattern is lowercase to match the "Received ... broadcasts" counter line.
  3. Analyze the output. You will see a list of interfaces and their broadcast counters. Run the command two or three times, a few seconds apart.

    Example Output (first run):

    GigabitEthernet1/0/24 is up, line protocol is up (connected)
      5 minute input rate 3000 bits/sec, 5 packets/sec
      Received 251346 broadcasts (250100 multicasts)
    GigabitEthernet2/0/15 is up, line protocol is up (connected)
      5 minute input rate 987000 bits/sec, 45000 packets/sec
      Received 89473210 broadcasts (1024 multicasts)
    

    Example Output (5 seconds later):

    GigabitEthernet1/0/24 is up, line protocol is up (connected)
      5 minute input rate 3000 bits/sec, 5 packets/sec
      Received 251371 broadcasts (250125 multicasts)
    GigabitEthernet2/0/15 is up, line protocol is up (connected)
      5 minute input rate 995000 bits/sec, 46200 packets/sec
      Received 89699210 broadcasts (1024 multicasts)
    

    In this example, the broadcast counter for GigabitEthernet1/0/24 barely changed, which is normal. The counter for GigabitEthernet2/0/15, however, jumped by 226,000 in just 5 seconds, roughly 45,000 broadcasts per second, which matches its input packet rate. This is our problem port.
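
If you prefer a tabular view of the same information, Catalyst switches also support show interfaces counters, which breaks out received broadcast traffic per port. The commands below are a quick sketch; the interface is the one from our example, and the exact column names (such as InBcastPkts) can vary slightly between software releases.

    ! Optionally clear the counters first so the deltas are easier to read
    clear counters GigabitEthernet2/0/15
    
    ! Per-port receive counters, including the input broadcast column
    show interfaces GigabitEthernet2/0/15 counters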

Step 2: Isolate the Problem and Identify the Source

Now that you’ve identified the port (GigabitEthernet2/0/15 in our example), you need to find out what’s connected to it. But first, stop the bleeding.

  1. Temporarily Disable the Port: To restore network stability immediately, shut down the offending interface.
    configure terminal
     interface GigabitEthernet2/0/15
      shutdown
     end
    
  2. Identify the Connected Device: With the immediate crisis averted, you can investigate.
    • Check CDP/LLDP: If the connected device is another network device (like a switch, IP phone, or access point), Cisco Discovery Protocol (CDP) or Link Layer Discovery Protocol (LLDP) will likely identify it.
    show cdp neighbors GigabitEthernet2/0/15 detail
    show lldp neighbors GigabitEthernet2/0/15 detail
    
    • Check the MAC Address Table: See which MAC address(es) were learned on this port. Ideally capture this before you shut the port down, since shutting it down flushes its entries from the table. If it was a single device, this can help you track it down, while a flood of MAC addresses suggests another switch is connected.
    show mac address-table interface GigabitEthernet2/0/15
    
    • Physical Tracing: Check your network documentation or physically trace the cable from the port to the end device. Common culprits include:
      • A small, unmanaged “desktop” switch plugged into the wall, with two of its ports connected back to the wall plate, creating a loop.
      • A misconfigured server with a NIC team that is malfunctioning.
      • A faulty network interface card (NIC) on a user’s PC.
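
The switch logs are also worth checking, as the community discussion pointed out. A Layer 2 loop usually leaves a trail of MAC-flapping messages (%SW_MATM-4-MACFLAP_NOTIF) showing the same host bouncing between two ports. The exact message text can vary by platform and release, so treat the filter below as a starting point rather than an exact match.

    ! Look for MAC addresses flapping between ports - a classic symptom of a loop
    show logging | include MACFLAP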

Step 3: Proactive Hardening and Prevention

Once you’ve resolved the immediate issue, you must configure your switches to prevent it from happening again. This is where proactive features are essential.

  1. Storm Control: This feature monitors the rate of broadcast, multicast, and unknown-unicast traffic on a port. If the traffic exceeds a configured threshold, it can shut down the port or send an SNMP trap. It’s highly recommended on all user-facing (access) ports.

    Configuration Example:

    configure terminal
     interface range GigabitEthernet1/0/1-48
      storm-control broadcast level pps 500
      storm-control multicast level pps 1000
      storm-control action shutdown
     exit
    

    This configuration will shut down (error-disable) a port if it receives more than 500 broadcast packets per second (pps) or 1000 multicast pps. You can confirm the policy with the verification commands shown after this list.

  2. Spanning Tree Protocol (STP) Hardening: Most broadcast storms are caused by Layer 2 loops. Hardening STP is your best defense.

    • BPDU Guard: This should be enabled on all access ports where you never expect to see another switch. If a BPDU packet (the language of STP) is received on a BPDU Guard-enabled port, the port is immediately put into an err-disable state, breaking the loop.
    ! Globally enable on all PortFast ports
    spanning-tree portfast bpduguard default
    
    ! Or apply per-interface
    interface GigabitEthernet1/0/1
     spanning-tree bpduguard enable
    
    • Disabling Auto-Recovery (as asked in the forum): If errdisable recovery has been enabled for BPDU Guard, the port will come back up on its own after the recovery interval (300 seconds by default). For a critical issue like a loop, you may want to disable this so an administrator must investigate and manually bring the port back online.
    ! To disable automatic recovery for a BPDU Guard violation
    no errdisable recovery cause bpduguard
    

    To see the current recovery settings, use show errdisable recovery.

    • Root Guard: Enable this on ports where the STP root bridge should not appear. For example, on ports leading to other access layer switches. This prevents a misconfigured switch from hijacking the STP topology.
    interface GigabitEthernet1/0/48
     description Trunk_to_Access_Switch
     spanning-tree guard root
    
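
After applying these protections, it is worth confirming that they are actually in effect. The commands below are standard verification steps; the interface shown is just the example access port used earlier.

    ! Confirm the storm-control thresholds configured on an access port
    show storm-control GigabitEthernet1/0/1 broadcast
    
    ! Confirm whether the global PortFast / BPDU Guard defaults are enabled
    show spanning-tree summary
    
    ! List any ports that are currently err-disabled and why
    show interfaces status err-disabled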

Final Workflow

To summarize, here is a complete workflow for handling broadcast storms:

  1. React: Use show interfaces | include is up|input rate|broadcasts to find the port with rapidly increasing broadcast counters.
  2. Isolate: Immediately shutdown the offending interface to restore network stability.
  3. Investigate: Use show cdp/lldp neighbors, show mac address-table, the switch logs, and physical tracing to identify the source device and the root cause (e.g., loop, faulty NIC).
  4. Remediate: Address the root cause. Remove the looping cable, replace the faulty device, or correct the misconfiguration.
  5. Harden: Proactively configure Storm Control and BPDU Guard on all access ports to automatically prevent future incidents.
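
To tie the hardening step together, a per-port baseline for user-facing access ports might look like the sketch below. Treat it as a starting template, not a drop-in configuration: the port range and thresholds are carried over from our earlier example and should be tuned to your own traffic profile, and some newer IOS-XE releases use the keyword form spanning-tree portfast edge instead of spanning-tree portfast.

    configure terminal
     interface range GigabitEthernet1/0/1-48
      switchport mode access
      spanning-tree portfast
      spanning-tree bpduguard enable
      storm-control broadcast level pps 500
      storm-control multicast level pps 1000
      storm-control action shutdown
     end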

By adopting this methodical approach, you can move from a reactive fire-fighting mode to a state of proactive network stability, confidently tracking down and eliminating broadcast storms without resorting to a full stack outage.
