I stumbled into an interesting issue the other day with ICMP inspection breaking MTR. After cutting traffic over to a Cisco ASR1001HX running the IOS-XE Zone-Based Firewall (ZBF), mtr running from behind the ZBF was showing 99.9% packet loss for all the hops between the ZBF and the last hop.
There was no real packet loss, and the last hop was always showing 0% loss as expected. However, monitoring systems that rely on MTR for hop-by-hop monitoring went completely crazy, thinking that we were having real packet loss.
We noticed messages similar to the following in the logs:
%IOSXE-6-PLATFORM: R0/0: cpp_cp: QFP:0.0 Thread:122 TS:00002775229457146791 %FW-6-DROP_PKT: Dropping icmp pkt from internal0/0/recycle:0 X.X.X.X:11 => 10.152.10.139:0(target:class)-(ZP-TRUST-TO-UNTRUST:CM-TRUST-TO-UNTRUST-ALLOW) due to ICMP ERR Pkt:exceed burst lmt with ip ident 0
%IOSXE-6-PLATFORM: R0/0: cpp_cp: QFP:0.0 Thread:025 TS:00002773939123637161 %FW-6-DROP_PKT: Dropping icmp pkt from Port-channel1 X.X.X.X:11 => 10.152.10.139:0(target:class)-(ZP-TRUST-TO-UNTRUST:CM-TRUST-TO-UNTRUST-ALLOW) due to ICMP ERR Pkt:exceed burst lmt with ip ident 4427
Our first thought on seeing ICMP ERR Pkt:exceed burst lmt was that some sort of ICMP rate limit on the box was kicking in, so we disabled ip icmp rate-limit unreachable, which is enabled by default and limits the number of ICMP unreachable messages sent in a given time interval.
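For reference, turning that rate limit off is a single line in global configuration mode, roughly:

```
configure terminal
 no ip icmp rate-limit unreachable
end
```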
Well, that didn’t work. Looking more closely at the logs, we noticed that every dropped packet was ICMP type 11, code 0, that is, TTL exceeded in transit. The ZBF was dropping the ICMP 11/0 responses, so the source running MTR never received them and assumed that packet loss was happening.
Solution? Add an ICMP Pass rule on your Policy-Map, making ICMP Stateless for the Firewall.
class-map type inspect match-any CM-UNTRUST-TO-TRUST-ICMP-ALLOW
 match protocol icmp
policy-map type inspect PM-UNTRUST-TO-TRUST
 class type inspect CM-UNTRUST-TO-TRUST-ICMP-ALLOW
  pass
 class type inspect CM-UNTRUST-TO-TRUST-ALLOW
  inspect
 class class-default
Pass means that the ZBF allows the traffic matched by your class-map without keeping any state for the connection, which makes the firewall stateless for ICMP traffic. Apply the policy to your Untrust-to-Trust zone-pair, and MTR should work fine again.
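Attaching the policy-map to the zone-pair might look like the sketch below. The zone names UNTRUST and TRUST and the zone-pair name ZP-UNTRUST-TO-TRUST are assumptions that simply mirror the naming seen in the logs, so substitute whatever your configuration actually uses.

```
! Zone and zone-pair names are hypothetical; only PM-UNTRUST-TO-TRUST comes from the example above
zone-pair security ZP-UNTRUST-TO-TRUST source UNTRUST destination TRUST
 service-policy type inspect PM-UNTRUST-TO-TRUST
```

If a zone-pair already exists for that direction, only the service-policy line should need to change.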
You can and probably should be more specific and match on ttl-exceeded instead of all icmp messages, but this should be a good enough example to illustrate the issue.
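A narrower version could match only TTL-exceeded messages through an extended ACL instead of the whole ICMP protocol. This is an untested sketch: the ACL name ACL-ICMP-TTL-EXCEEDED is made up for illustration, and it assumes your IOS-XE release accepts the ttl-exceeded keyword (ICMP type 11, code 0) in extended ACLs.

```
! Hypothetical ACL name; permits only ICMP type 11 code 0 (TTL exceeded in transit)
ip access-list extended ACL-ICMP-TTL-EXCEEDED
 permit icmp any any ttl-exceeded
!
! Same class-map as before, now matching the ACL instead of all ICMP
class-map type inspect match-any CM-UNTRUST-TO-TRUST-ICMP-ALLOW
 match access-group name ACL-ICMP-TTL-EXCEEDED
```

With this variant, any other ICMP traffic no longer matches the pass class and falls through to the remaining classes in the policy-map.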
I hope this helps someone having the same issue. I honestly couldn’t find much about this with my Google-Fu, and I still haven’t heard from Cisco with an official answer on whether it’s a bug or expected behavior for some reason. If you know more about it, please write a comment 🙂
Vinicius, congratulations on the article!
Thanks Fernando!