You're right, you wouldn't easily see the burstiness via SNMP. 1GE can send up to 1.48Mpps (at minimum frame size), so it takes very little time to congest the 45Mbps link, which can handle less than 75kpps.
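As a rough check on those packet rates, here is a minimal Python sketch. It assumes minimum-size 64B frames plus Ethernet's 20B of per-frame overhead (preamble and inter-frame gap); the 45Mbps link uses different framing in practice, so treat its figure as an approximate upper bound. The function name `max_pps` is just an illustration.

```python
# Packet-rate ceiling of a link for a given frame size.
# Assumption: each frame costs 20 extra bytes on the wire
# (8B preamble + 12B inter-frame gap), as on Ethernet.
ETHERNET_OVERHEAD_BYTES = 20


def max_pps(link_bps, frame_bytes=64, overhead_bytes=ETHERNET_OVERHEAD_BYTES):
    """Return the maximum packets per second the link can carry."""
    bits_per_frame = (frame_bytes + overhead_bytes) * 8
    return link_bps / bits_per_frame


gige_pps = max_pps(1_000_000_000)   # 1GE
slow_pps = max_pps(45_000_000)      # 45Mbps link

print(f"1GE:    {gige_pps / 1e6:.2f} Mpps")
print(f"45Mbps: {slow_pps / 1e3:.1f} kpps")
print(f"ratio:  {gige_pps / slow_pps:.1f}x")
```

The ratio is the point: the ingress can offer packets roughly 22 times faster than the egress can drain them.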
If your ingress is 1GE and egress is 45Mbps, then obviously the congestion point of 45Mbps will need to drop packets. This is normal and expected. If you increase buffers you'll introduce more delay.
1GE takes about 0.48ms to send 40 1500B IP packets, which is currently the amount of burst you can absorb. However, dequeueing them on the 45Mbps link already takes more than 10ms.
If you don't have any acute problem, I would probably not do anything about it. But if some traffic is more eligible for dropping than others, then you should replace FIFO with class-based queueing. Say you want to prioritize so that more FTP is dropped and less VoIP.
Then it'll also make more sense to add more buffering for the FTP traffic, as it's not really sensitive to delay.
If you want to try your luck with deeper buffers, something like this should suffice:
policy-map WAN-OUT
 class class-default
  fair-queue
  queue-limit 200 packets
!
interface Serial1/0
 service-policy output WAN-OUT
This would give you roughly 53ms of buffering on Serial1/0 and would let you absorb a burst of up to about 2.4ms from a single GigE interface.
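The buffer arithmetic can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch: it counts only 1500B IP packets and ignores layer-2 framing overhead, so the results land close to the rounded figures quoted above rather than matching them exactly. The helper name `serialization_ms` is my own.

```python
def serialization_ms(packets, link_bps, packet_bytes=1500):
    """Time in milliseconds to put `packets` packets on a link of `link_bps`."""
    return packets * packet_bytes * 8 / link_bps * 1000


GIGE_BPS = 1_000_000_000
WAN_BPS = 45_000_000

# 40 packets is the burst the answer says can be absorbed today;
# 200 is the queue-limit from the policy-map.
for depth in (40, 200):
    burst = serialization_ms(depth, GIGE_BPS)  # how long 1GE needs to fill the queue
    drain = serialization_ms(depth, WAN_BPS)   # worst-case queueing delay on the WAN
    print(f"{depth:>3} packets: fills in {burst:.2f} ms, drains in {drain:.1f} ms")
```

The drain time is also the worst-case delay a packet at the tail of the queue will see, which is why deeper buffers only make sense for delay-tolerant traffic.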
Interface Internal-Data0/0 "", is up, line protocol is up
    2749335943 input errors, 0 CRC, 0 frame, 2749335943 overrun, 0 ignored, 0 abort
                                             ^^^^^^^^^^^^^^^^^^
    0 output errors, 0 collisions, 0 interface resets
You show overruns on the Internal-Data interfaces, so you are dropping traffic through the ASA. With that many drops, it's not hard to imagine that this is contributing to the problem. Overruns happen when the internal Rx FIFO queues overflow (normally because of some problem with load).
EDIT to respond to a question in the comments:
I don't understand why the firewall is overloaded, it is not close to using 10Gbps. Can you explain why we are seeing overruns even when the CPU and bandwidth are low? The CPU is about 5% and the bandwidth either direction never goes much higher than 1.4Gbps.
I have seen this happen over and over when a link is seeing traffic microbursts that exceed the bandwidth, connections-per-second, or packets-per-second capacity of the device. So many people quote 1- or 5-minute statistics as if the traffic were relatively constant across that timeframe.
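To see why long averages hide this, consider a toy example. The numbers below are invented for illustration, not taken from this firewall: a steady 1Gbps with a single 3-second burst at 10Gbps line rate inside a 5-minute window.

```python
def average_bps(per_second_bps):
    """Mean rate over the window, which is what a 5-minute counter poll reports."""
    return sum(per_second_bps) / len(per_second_bps)


# Hypothetical per-second rates for a 300-second polling window.
samples = [1_000_000_000] * 300
for second in range(120, 123):          # a 3-second burst at line rate
    samples[second] = 10_000_000_000

avg = average_bps(samples)
peak = max(samples)

print(f"5-minute average: {avg / 1e9:.2f} Gbps")
print(f"peak second:      {peak / 1e9:.2f} Gbps")
```

The average barely moves while the device spends three seconds at line rate. Real microbursts are often much shorter than a second, which makes the effect even stronger.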
I would take a look at your firewall by running these commands every two or three seconds (run term pager 0 first to avoid paging issues)...
show clock
show traffic detail | i ^[a-zA-Z]|overrun|packets dropped
show asp drop
Now graph out how much traffic you're seeing every few seconds vs drops; if you see massive spikes in policy drops or overruns when your traffic spikes, then you're closer to finding the culprit.
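One way to line the numbers up is to turn successive counter readings into per-interval deltas. The sketch below assumes you have already pulled a timestamp, a cumulative input byte counter, and a cumulative overrun counter out of each poll; the `Sample` type and the sample values are made up for illustration, and counter wrap is not handled.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    t: float        # seconds since polling started
    rx_bytes: int   # cumulative input bytes at this poll
    overruns: int   # cumulative overruns at this poll


def intervals(samples):
    """Return (end_time, Mbps, new_overruns) for each pair of adjacent polls."""
    rows = []
    for prev, cur in zip(samples, samples[1:]):
        dt = cur.t - prev.t
        mbps = (cur.rx_bytes - prev.rx_bytes) * 8 / dt / 1e6
        rows.append((cur.t, mbps, cur.overruns - prev.overruns))
    return rows


# Hypothetical polls taken every 2 seconds.
polls = [
    Sample(0, 0, 100),
    Sample(2, 250_000_000, 100),
    Sample(4, 500_000_000, 100),
    Sample(6, 2_500_000_000, 5100),   # burst: traffic and overruns jump together
    Sample(8, 2_750_000_000, 5100),
]

rows = intervals(polls)
for t, mbps, new_overruns in rows:
    flag = "  <-- overruns" if new_overruns else ""
    print(f"t={t:>4}s  {mbps:8.0f} Mbps  +{new_overruns} overruns{flag}")
```

If the intervals with new overruns are the same ones where throughput spikes, bursts are the likely cause. If overruns grow while traffic is flat, look elsewhere.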
Don't forget that you can sniff directly on the ASA with this if you need help identifying what's killing the ASA... you have to be quick to catch this sometimes.
capture FOO circular-buffer buffer <buffer-size> interface <intf-name>
NetFlow on your upstream switches could help as well.
Best Answer
On the external interface of your router, use this ACL: