Thursday, February 14, 2008

Troubleshooting Multicast RPF Failure

 

Hi Brian,

I enjoy the new blog feature. Lots of valuable information condensed in a small space. Could you explain in a nutshell how to troubleshoot multicast RPF failures? I understand the concept, just figuring out what shows and/or debugs to use always seems to take me 30 minutes rather than 5 minutes.

First let’s talk briefly about what the RPF check, or Reverse Path Forwarding check, is for multicast. PIM is known as Protocol Independent Multicast routing because it does not exchange its own information about the network topology. Instead it relies on the accuracy of an underlying unicast routing protocol like OSPF or EIGRP to maintain a loop-free topology. When a multicast packet is received by a router running PIM, the device first looks at the source IP address of the packet. Next the router does a unicast lookup on the source, as in “show ip route w.x.y.z”, where “w.x.y.z” is the source. The outgoing interface in the unicast routing table is then compared with the interface on which the multicast packet was received. If the incoming interface of the packet and the outgoing interface for the route are the same, it is assumed that the packet is not looping, and the packet is a candidate for forwarding. If the incoming interface and the outgoing interface are *not* the same, a loop-free path cannot be guaranteed, and the RPF check fails. All packets for which the RPF check fails are dropped.
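To make the check concrete, here is a minimal Python sketch of the comparison described above. It is an illustration only, not how IOS implements it: the ROUTES table, the interface names in it, and the function names rpf_interface and rpf_check are all made up for the example, and it assumes a single outgoing interface per prefix (no equal-cost paths).

```python
import ipaddress

# Hypothetical unicast routing table: prefix -> outgoing interface.
ROUTES = {
    "150.1.124.0/24": "Serial1/3",
    "150.1.37.0/24": "Ethernet0/0",
}


def rpf_interface(source, routes):
    """Return the outgoing interface of the longest match for source, or None."""
    addr = ipaddress.ip_address(source)
    best = None  # (network, interface) of the most specific match so far
    for prefix, interface in routes.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, interface)
    return best[1] if best else None


def rpf_check(source, incoming_interface, routes):
    """True if the packet arrived on the interface used to reach its source."""
    return rpf_interface(source, routes) == incoming_interface


if __name__ == "__main__":
    # Packet from the source arriving on the expected interface, then on another one.
    print(rpf_check("150.1.124.4", "Serial1/3", ROUTES))
    print(rpf_check("150.1.124.4", "Serial1/2", ROUTES))
```

With this table, a packet from 150.1.124.4 passes the check when it arrives on Serial1/3 and fails it on any other interface. Because the lookup is a longest match, adding a more specific route for the source that points out a different interface changes the interface the check expects.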

Now as far as troubleshooting the RPF check goes, there are a couple of useful show and debug commands available to you in IOS. Suppose the following topology:

multicast.rpf.failure.gif

We will be running EIGRP on all interfaces, and PIM dense mode on R2, R3, R4, and SW1. Note that PIM will not be enabled on R1. R4 is a multicast source that is attempting to send a feed to the client, SW1. On SW1 we will be generating an IGMP join message to R3 by issuing the “ip igmp join” command on SW1’s interface Fa0/3. On R4 we will be generating multicast traffic with an extended ping. First let’s look at the topology with a successful transmission:

SW1#conf t
Enter configuration commands, one per line. End with CNTL/Z.
SW1(config)#int fa0/3
SW1(config-if)#ip igmp join 224.1.1.1
SW1(config-if)#end
SW1#

R4#ping
Protocol [ip]:
Target IP address: 224.1.1.1
Repeat count [1]: 5
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Interface [All]: Ethernet0/0
Time to live [255]:
Source address: 150.1.124.4
Type of service [0]:
Set DF bit in IP header? [no]:
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 224.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 150.1.124.4

Reply to request 0 from 150.1.37.7, 32 ms
Reply to request 1 from 150.1.37.7, 28 ms
Reply to request 2 from 150.1.37.7, 28 ms
Reply to request 3 from 150.1.37.7, 28 ms
Reply to request 4 from 150.1.37.7, 28 ms

Now let’s trace the traffic flow starting at the destination and working our way back up the reverse path.

SW1#show ip mroute 224.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report, Z - Multicast Tunnel
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.1.1), 00:03:05/stopped, RP 0.0.0.0, flags: DCL
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    FastEthernet0/3, Forward/Dense, 00:03:05/00:00:00

(150.1.124.4, 224.1.1.1), 00:02:26/00:02:02, flags: PLTX
  Incoming interface: FastEthernet0/3, RPF nbr 150.1.37.3
  Outgoing interface list: Null

SW1’s RPF neighbor for (150.1.124.4, 224.1.1.1) is 150.1.37.3, which means that SW1 received the packet from R3.

R3#show ip mroute 224.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.1.1), 00:03:12/stopped, RP 0.0.0.0, flags: DC
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial1/3, Forward/Dense, 00:03:12/00:00:00
    Serial1/2, Forward/Dense, 00:03:12/00:00:00
    Ethernet0/0, Forward/Dense, 00:03:12/00:00:00

(150.1.124.4, 224.1.1.1), 00:02:33/00:01:46, flags: T
  Incoming interface: Serial1/3, RPF nbr 150.1.23.2
  Outgoing interface list:
    Ethernet0/0, Forward/Dense, 00:02:34/00:00:00
    Serial1/2, Prune/Dense, 00:02:34/00:00:28

R3’s RPF neighbor is 150.1.23.2, which means the packet came from R2.

R2#show ip mroute 224.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.1.1), 00:02:44/stopped, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    FastEthernet0/0, Forward/Dense, 00:02:44/00:00:00
    Serial0/1, Forward/Dense, 00:02:44/00:00:00

(150.1.124.4, 224.1.1.1), 00:02:44/00:01:35, flags: T
  Incoming interface: FastEthernet0/0, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0/1, Forward/Dense, 00:02:45/00:00:00

R2 has no RPF neighbor, meaning the source is directly connected. Now let’s compare the unicast routing table from the client back to the source.

SW1#show ip route 150.1.124.4
Routing entry for 150.1.124.0/24
  Known via "eigrp 1", distance 90, metric 20540160, type internal
  Redistributing via eigrp 1
  Last update from 150.1.37.3 on FastEthernet0/3, 00:11:23 ago
  Routing Descriptor Blocks:
  * 150.1.37.3, from 150.1.37.3, 00:11:23 ago, via FastEthernet0/3
      Route metric is 20540160, traffic share count is 1
      Total delay is 21100 microseconds, minimum bandwidth is 128 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 2

R3#show ip route 150.1.124.4
Routing entry for 150.1.124.0/24
  Known via "eigrp 1", distance 90, metric 20514560, type internal
  Redistributing via eigrp 1
  Last update from 150.1.13.1 on Serial1/2, 00:11:47 ago
  Routing Descriptor Blocks:
  * 150.1.23.2, from 150.1.23.2, 00:11:47 ago, via Serial1/3
      Route metric is 20514560, traffic share count is 1
      Total delay is 20100 microseconds, minimum bandwidth is 128 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 1
    150.1.13.1, from 150.1.13.1, 00:11:47 ago, via Serial1/2
      Route metric is 20514560, traffic share count is 1
      Total delay is 20100 microseconds, minimum bandwidth is 128 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 1

R2#show ip route 150.1.124.4
Routing entry for 150.1.124.0/24
  Known via "connected", distance 0, metric 0 (connected, via interface)
  Redistributing via eigrp 1
  Routing Descriptor Blocks:
  * directly connected, via FastEthernet0/0
      Route metric is 0, traffic share count is 1

Based on this output we can see that SW1 sees the source as reachable via R3, which was the neighbor the multicast packet came from. R3 sees the source as reachable via R1 and R2 due to equal-cost load balancing, with R2 as the neighbor that the multicast packet came from. Finally R2 sees the source as directly connected, which is where the multicast packet came from. This means that the RPF check succeeds at each hop as traffic transits the network, hence we had a successful transmission.

Now let’s modify the routing table on R3 so that the route to R4 points to R1. Since the multicast packet on R3 comes from R2 and the unicast route will now point back towards R1, there will be an RPF failure, and the packet transmission will not be successful. Again note that R1 is not routing multicast in this topology.

R3#conf t
Enter configuration commands, one per line. End with CNTL/Z.
R3(config)#ip route 150.1.124.4 255.255.255.255 serial1/2
R3(config)#end
R3#

R4#ping
Protocol [ip]:
Target IP address: 224.1.1.1
Repeat count [1]: 5
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: y
Interface [All]: Ethernet0/0
Time to live [255]:
Source address: 150.1.124.4
Type of service [0]:
Set DF bit in IP header? [no]:
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 224.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 150.1.124.4
.....
R4#

We can now see that on R4 we do not receive a response back from the final destination… so where do we start troubleshooting? First we want to look at the first hop away from the source, which in this case is R2. On R2 we want to look in the multicast routing table to see whether the incoming interface and the outgoing interface list are correctly populated. Ideally we will see the incoming interface as FastEthernet0/0, which is directly connected to the source, and the outgoing interface as Serial0/1, which is the downstream interface facing R3.

R2#show ip mroute 224.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report, Z - Multicast Tunnel
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.1.1), 00:07:27/stopped, RP 0.0.0.0, flags: D
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0/1, Forward/Dense, 00:07:27/00:00:00
    FastEthernet0/0, Forward/Dense, 00:07:27/00:00:00

(150.1.124.4, 224.1.1.1), 00:07:27/00:01:51, flags: T
  Incoming interface: FastEthernet0/0, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial0/1, Forward/Dense, 00:04:46/00:00:00

This is the correct output we should see on R2. Two more verifications we can do are with the “show ip mroute count” command and the “debug ip mpacket” command. “show ip mroute count” will show all currently active multicast feeds, and whether packets are getting dropped:

R2#show ip mroute count
IP Multicast Statistics
3 routes using 1864 bytes of memory
2 groups, 0.50 average sources per group
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)

Group: 224.1.1.1, Source count: 1, Packets forwarded: 4, Packets received: 4
  Source: 150.1.124.4/32, Forwarding: 4/1/100/0, Other: 4/0/0

Group: 224.0.1.40, Source count: 0, Packets forwarded: 0, Packets received: 0

“debug ip mpacket” will show the packet trace in real time, similar to the “debug ip packet” command for unicast packets. One caveat of using this verification is that only process-switched traffic can be debugged. This means that we need to disable fast or CEF switching of multicast traffic by issuing the “no ip mroute-cache” command on the interfaces running PIM. Once this debug is enabled we’ll generate traffic from R4 again and we should see the packets correctly routed through R2.

R4#ping 224.1.1.1 repeat 100 

Type escape sequence to abort.
Sending 100, 100-byte ICMP Echos to 224.1.1.1, timeout is 2 seconds:


R2(config)#int f0/0
R2(config-if)#no ip mroute-cache
R2(config-if)#int s0/1
R2(config-if)#no ip mroute-cache
R2(config-if)#end
R2#debug ip mpacket
IP multicast packets debugging is on
R2#
IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=231, prot=1, len=100(100), mforward
IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=232, prot=1, len=100(100), mforward
IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=233, prot=1, len=100(100), mforward
IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=234, prot=1, len=100(100), mforward
IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) id=235, prot=1, len=100(100), mforward
R2#undebug all
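
If you capture this debug output to a file, a few lines of Python can pull the fields out of each line so they are easier to scan. This is only a sketch: the regular expression is written against the forwarded-packet lines shown above, the names MPACKET and parse_mpacket are made up for the example, and other kinds of “debug ip mpacket” lines may use a different layout, in which case the function simply returns None.

```python
import re

# Field layout taken from the "mforward" lines shown in this post.
MPACKET = re.compile(
    r"IP\(\d+\):\s*s=(?P<source>\S+)\s+\((?P<in_if>[^)]+)\)\s+"
    r"d=(?P<group>\S+)\s+\((?P<out_if>[^)]+)\)\s+id=(?P<id>\d+),"
    r"\s*prot=(?P<prot>\d+),\s*len=\d+\(\d+\),\s*(?P<action>\w+)"
)


def parse_mpacket(line):
    """Return the fields of one debug line as a dict, or None if it does not match."""
    match = MPACKET.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    line = (
        "IP(0): s=150.1.124.4 (FastEthernet0/0) d=224.1.1.1 (Serial0/1) "
        "id=231, prot=1, len=100(100), mforward"
    )
    fields = parse_mpacket(line)
    print(fields["source"], fields["in_if"], fields["out_if"], fields["action"])
```

For the first line of R2’s debug this gives the source 150.1.124.4, the incoming interface FastEthernet0/0, the outgoing interface Serial0/1, and the action mforward.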

Now that we see that R2 is correctly routing the packets let’s look at all three of these verifications on R3.

R3#show ip mroute 224.1.1.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Received Source Specific Host Report,
       Z - Multicast Tunnel, z - MDT-data group sender,
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched, A - Assert winner
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 224.1.1.1), 00:00:01/stopped, RP 0.0.0.0, flags: DC
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Serial1/3, Forward/Dense, 00:00:01/00:00:00
    Ethernet0/0, Forward/Dense, 00:00:01/00:00:00

(150.1.124.4, 224.1.1.1), 00:00:01/00:02:58, flags:
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Ethernet0/0, Forward/Dense, 00:00:02/00:00:00
    Serial1/3, Forward/Dense, 00:00:02/00:00:00

From R3’s “show ip mroute” output we can see that the incoming interface is listed as Null. This is an indication that for some reason R3 is not correctly routing the packets, and is instead dropping them as they are received. For more information let’s look at the “show ip mroute count” output.

R3#show ip mroute count
IP Multicast Statistics
3 routes using 2174 bytes of memory
2 groups, 0.50 average sources per group
Forwarding Counts: Pkt Count/Pkts(neg(-) = Drops) per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)

Group: 224.1.1.1, Source count: 1, Packets forwarded: 0, Packets received: 15
  Source: 150.1.124.4/32, Forwarding: 0/0/0/0, Other: 15/15/0

Group: 224.0.1.40, Source count: 0, Packets forwarded: 0, Packets received: 0

From this output we can see that packets for (150.1.124.4, 224.1.1.1) are getting dropped, and specifically that they are being dropped because of RPF failure. This is seen from the “Other: 15/15/0” output, where the second field is the count of RPF failed drops. For more detail let’s look at the packet trace.
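
If you want to check these counters across many sources without reading every line, the “Other:” field is easy to pick apart. The Python sketch below is just an illustration built around the Source line format shown in this post; the names SOURCE_LINE and rpf_failures are made up for the example, and other IOS versions may print the line differently.

```python
import re

# Matches the per-source line of "show ip mroute count" as it appears in this post.
SOURCE_LINE = re.compile(
    r"Source:\s*(?P<source>\S+),\s*Forwarding:\s*(?P<fwd>[\d/\-]+),\s*"
    r"Other:\s*(?P<total>\d+)/(?P<rpf_failed>\d+)/(?P<other_drops>\d+)"
)


def rpf_failures(output):
    """Return {source: (total, rpf_failed, other_drops)} for each Source line."""
    results = {}
    for match in SOURCE_LINE.finditer(output):
        results[match["source"]] = (
            int(match["total"]),
            int(match["rpf_failed"]),
            int(match["other_drops"]),
        )
    return results


if __name__ == "__main__":
    sample = "Source: 150.1.124.4/32, Forwarding: 0/0/0/0, Other: 15/15/0"
    for source, (total, rpf_failed, other) in rpf_failures(sample).items():
        print(source, "total:", total, "RPF failed:", rpf_failed, "other drops:", other)
```

Run against R3’s line it reports 15 packets in total, all 15 of them RPF failures; against R2’s line (“Other: 4/0/0”) the RPF failed count is 0.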

R3#conf t
Enter configuration commands, one per line. End with CNTL/Z.
R3(config)#int e0/0
R3(config-if)#no ip mroute-cache
R3(config-if)#int s1/3
R3(config-if)#no ip mroute-cache
R3(