High CPU – ARP or Proxy-ARP Causing High CPU

arp cisco proxy-arp troubleshooting

We have a high-CPU issue on a Cisco A901-6CZ-F-A, caused by ARP requests arriving on Vlan11 from our customer.

We enabled 'debug arp' to try to diagnose it; a small snippet is below.

IP addresses and MAC addresses have been changed for the purpose of this post.

Feb 99 20:44:51.107 GMT: IP ARP: rcvd rep src 10.2.1.200 0004.f2aa.aaaa, dst 10.2.255.253 Vlan11
Feb 99 20:44:51.131 GMT: IP ARP: creating incomplete entry for IP address: 10.2.5.133 interface Vlan11
Feb 99 20:44:51.131 GMT: IP ARP: sent req src 10.2.255.253 ecbd.1daa.aaaa, dst 10.2.5.133 0000.0000.0000 Vlan11
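
With a longer capture of this output, a short script could tally which source/destination pairs show up most often. The sketch below is only an illustration: it assumes each debug message sits on a single line in the layout shown in this post, and the function name `tally_arp` and the regular expression are made up for this example, not part of any Cisco tooling.

```python
import re
from collections import Counter

# Matches lines such as
#   "IP ARP: rcvd rep src <ip> <mac>, dst <ip> ..."
#   "IP ARP: sent req src <ip> <mac>, dst <ip> ..."
# The layout is assumed from the 'debug arp' snippet in this post.
ARP_RE = re.compile(
    r"IP ARP: (?P<dir>rcvd|sent) (?P<op>req|rep) "
    r"src (?P<src_ip>\d+\.\d+\.\d+\.\d+) (?P<src_mac>[0-9a-f.]+), "
    r"dst (?P<dst_ip>\d+\.\d+\.\d+\.\d+)"
)


def tally_arp(lines):
    """Count (direction, operation, source IP, destination IP) tuples."""
    counts = Counter()
    for line in lines:
        m = ARP_RE.search(line)
        if m:  # lines like "creating incomplete entry" are skipped
            counts[(m["dir"], m["op"], m["src_ip"], m["dst_ip"])] += 1
    return counts


SAMPLE = """\
Feb 99 20:44:51.107 GMT: IP ARP: rcvd rep src 10.2.1.200 0004.f2aa.aaaa, dst 10.2.255.253 Vlan11
Feb 99 20:44:51.131 GMT: IP ARP: creating incomplete entry for IP address: 10.2.5.133 interface Vlan11
Feb 99 20:44:51.131 GMT: IP ARP: sent req src 10.2.255.253 ecbd.1daa.aaaa, dst 10.2.5.133 0000.0000.0000 Vlan11
"""

if __name__ == "__main__":
    for key, n in tally_arp(SAMPLE.splitlines()).most_common():
        print(n, *key)
```

Fed a saved log file instead of `SAMPLE`, the most frequent tuples would point at the hosts generating the bulk of the ARP traffic.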

On Vlan11 there is an ip/mask of 10.2.5.133 255.255.0.0

From reading about proxy-arp, which is on by default, it looks like our router is replying to the ARP requests with its own MAC address, ecbd.1daa.aaaa, from Vlan11.

Why these requests are coming to us is still a mystery. Could it be a misconfigured default route on the customer LAN, a wrong subnet mask, etc.?

I thought the router could be stopped from responding to these requests with:

no ip proxy-arp

But with 'debug arp' we still see the router responding, so perhaps this is not proxy-arp?
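
One way to sanity-check that hunch: proxy ARP is generally about a router answering for addresses that are not on the receiving interface's own subnet. A minimal sketch, using Python's standard `ipaddress` module and the (anonymised) addresses quoted in this post, checks whether the addresses in the debug output fall inside the Vlan11 subnet. The variable names are just for illustration.

```python
import ipaddress

# Vlan11 address and mask as stated in this post (anonymised values).
vlan11 = ipaddress.ip_interface("10.2.5.133/255.255.0.0")
subnet = vlan11.network

# Addresses that appear in the 'debug arp' snippet.
seen = ["10.2.1.200", "10.2.255.253", "10.2.5.133"]

# True for each address that lies inside the Vlan11 subnet.
on_link = {ip: ipaddress.ip_address(ip) in subnet for ip in seen}

if __name__ == "__main__":
    print(subnet)
    for ip, inside in on_link.items():
        print(ip, "on-link" if inside else "off-link")
```

If every address turns out to be on-link, the debug lines look more like ordinary ARP resolution on the local subnet than the router answering on behalf of another network, which would be consistent with 'no ip proxy-arp' making no visible difference.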

I wanted to capture the traffic that was causing these requests, and tried to do so with the following capture:

monitor capture buffer BUFFER
monitor capture point ip process-switched PROC both

I don't see the ARP requests in my capture. I also don't see them with:

monitor capture point ip cef CEF Vlan11 both

Have I misunderstood proxy-arp? Is there a better method to identify the traffic?

Best Answer

The fault is now resolved. We had no topology information for the network beyond our ASR901.

We know that in normal circumstances the ARP table and the MAC address table are unrelated; however, that is not the case on the 901.

In IOS on the ASR901, in 15.2(2)x and later, when the MAC address table is cleared (or times out) it takes the ARP table with it.

To mitigate the fault we used: mac-address-table aging-time 3600

This meant the fault recurred every 3600 seconds instead of every 300 seconds.

After that, the customer changed their internal network and the fault never came back.
