Cisco – QoS woes – managed IP VPN

avaya cisco qos

(First of all, I'm sorry for this wall of text. I don't know how to make it any shorter without losing important information. I originally wanted to use the chat room for this, like we do on Server Fault for this kind of question, but there is nobody in the network engineering room.)

We're a corporation with several daughter companies, where we have a rather large managed IP-VPN with about 70 different locations, varying from 2Mbps SHDSL to 100Mbps fiber. The IP-VPN carries multiple VPNs (or tunnels to be exact).

The priority of traffic is this, from a management and design standpoint:

  1. VoIP (Avaya and Lync)
  2. Video (Lync)
  3. RDP
  4. Internal services (fileservers, Active Directory, intranet etc)
  5. Non-prioritized internal services (proxy servers for internet usage, windows update services, system center configuration management, antivirus update proxies etc)
  6. Unmatched traffic (internet)

VoIP is only used at certain offices with few users. The biggest remote office that uses VoIP right now has a 4 Mbps SHDSL line with 5 employees and 5 Avaya IP phones running the G.711 A-law 64 kbps codec. This should never bring the VoIP traffic above 320 kbps.
I've verified that the phones use DSCP 46 for audio, so it is correctly matched as EF (see config below). The signaling, however, is marked as DSCP 24, and I'm not sure whether our QoS profile picks that up.
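One caveat on the 320 kbps figure: 5 × 64 kbps counts only the G.711 payload. A rough sketch of the IP-layer rate is below, assuming 20 ms packetization (a common default, not verified on these phones) and 40 bytes of IP/UDP/RTP headers per packet. Layer-2 overhead on the SHDSL link is not included, and the function name is only for illustration.

```python
def rtp_l3_kbps(codec_kbps: float, ptime_ms: float = 20.0, header_bytes: int = 40) -> float:
    """Per-call IP-layer bandwidth in kbps for a constant-bit-rate codec."""
    packets_per_second = 1000.0 / ptime_ms
    # Codec bytes carried in each RTP packet.
    payload_bytes = codec_kbps * 1000.0 / 8.0 / packets_per_second
    # Add the assumed IP + UDP + RTP headers, then convert back to kbps.
    return (payload_bytes + header_bytes) * 8.0 * packets_per_second / 1000.0


per_call = rtp_l3_kbps(64)   # one G.711 call
total = 5 * per_call         # five phones in simultaneous calls
print(per_call, total)
```

Under those assumptions, five simultaneous calls need roughly 400 kbps at the IP layer rather than 320 kbps.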

All remote locations use RDP against several RDS farms at our HQ (2x100 Mbps fiber). The bandwidth used for RDP is not easy to figure out, since it basically uses everything it gets. We do have certain limits set to make sure that it's not too resource hungry, but that is probably out of scope for this site. We have had some rather severe problems with RDP lately (https://serverfault.com/questions/515809/mouse-cursor-jumps-around-when-using-rdp), which is why I'm posting this on network engineering.

Lync uses DSCP 46 for audio and DSCP 34 for video. Internal services and non-prioritized internal services are just matched by subnets, and everything else is just match any.
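For reference, the decimal DSCP values mentioned above map onto the keyword names used by `match ip dscp`. Below is a small sketch of the standard encoding (class selectors are multiples of 8, AFxy is 8x + 2y, EF is 46). The helper name is made up for illustration.

```python
def dscp_name(value: int) -> str:
    """Return the conventional keyword for a 6-bit DSCP value."""
    if not 0 <= value <= 63:
        raise ValueError("DSCP is a 6-bit field (0-63)")
    if value == 46:
        return "ef"
    if value == 0:
        return "default"
    cls, low = divmod(value, 8)
    if low == 0:
        return f"cs{cls}"              # class selector, e.g. 24 -> cs3
    if 1 <= cls <= 4 and low in (2, 4, 6):
        return f"af{cls}{low // 2}"    # assured forwarding, e.g. 34 -> af41
    return str(value)                  # no standard name


for dscp in (46, 34, 24):
    print(dscp, dscp_name(dscp))
```

So the Avaya signaling at DSCP 24 is cs3, and the Lync video at DSCP 34 is af41.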

Here is a copy of the latest QoS config revision, which I have modified slightly to hide certain names and IP addresses:

!
class-map match-any INTERNAL-PRI
 match access-group name CUST-INT-PRI
 match access-group name CUST-DMZ
class-map match-any INTERNAL-NOPRI
 match access-group name CUST-INT-NOPRI
class-map match-any REMOTEDESKTOP
 match access-group name RDP
class-map match-any ALL
 match any
class-map match-any NETWORK
 match ip precedence 6
 match ip precedence 7
class-map match-any EF
 match ip dscp ef
 match ip dscp cs5
class-map match-any AF-HIGH
 match ip dscp af41
 match ip dscp cs4
class-map match-any AF-MEDHI
 match ip dscp af31
 match ip dscp cs3
class-map match-any AF-MEDIUM
 match ip dscp af21
 match ip dscp cs2
class-map match-any AF-LOW
 match ip dscp af11
 match ip dscp cs1
class-map match-any BE
 match ip dscp default
!
!
policy-map setTos
 class EF
 class REMOTEDESKTOP
  set ip dscp af31
 class INTERNAL-PRI
  set ip dscp af21
 class INTERNAL-NONPRI
  set ip dscp af11
 class class-default
  set ip dscp default
policy-map useTos
 class EF
  priority percent 10
 class AF-HIGH
  bandwidth remaining percent 35
 class AF-MEDHI
  bandwidth remaining percent 25
 class AF-LOW
  bandwidth remaining percent 20
 class BE
  bandwidth remaining percent 10
 class NETWORK
policy-map QOS
 class ALL
  shape average 4096000
  service-policy useTos
!
!         
ip access-list standard CUST-DMZ
 permit 123.123.123.0 0.0.0.255
!
ip access-list standard CUST-INT-PRI
 permit 10.50.0.0 0.0.0.255
 permit 10.51.0.0 0.0.0.255
!
ip access-list standard CUST-INT-NOPRI
 permit 10.50.10.0 0.0.0.255
 permit 10.51.10.0 0.0.0.255
!
ip access-list extended RDP
 permit tcp any eq 3389 any
 permit tcp any any eq 3389
!

As you can see, it's a rather large QoS configuration. Note that we did not create this config ourselves; it was all done by a previous employee at our IP-VPN provider. Note also that the shape value is changed according to the kind of connection (2 Mbps, 4 Mbps, 8 Mbps and 10 Mbps).

By now you're probably wondering – What's the question here? Here goes..

  1. Like I mentioned earlier, we are drowning in complaints from RDP users about lag and user input not being recognized. Are we not prioritizing it correctly? Is it possible to make sure that RDP gets minimal packet loss, latency and jitter, while still being restricted in bandwidth?
  2. I'm not seeing any mention of queues in this config. I've read some Microsoft documentation, and they recommend using a priority queue for VoIP and WRED for video. How do I make this happen?
  3. As the config shows, none of the AF classes use medium or high drop precedence. What kind of services are safe to drop? RDP, video and VoIP do not work well with drops.
  4. Are the bandwidth percentages in order? They sum up to 100% usage.

Any other suggestions are welcome, as I'm desperate to get this sorted out. If you think it's too much to answer on a Q&A site, I'll just bite the bullet and hire a consultant from our Cisco Gold partner, which is financially OK; I just want to learn this if I can.

Best Answer

To answer your questions:

  • RDP traffic should get up to 25% of the remaining bandwidth, where the already reserved bandwidth is 35% (class-default gets 25% by default and EF gets 10%). So, if I'm right, you have assigned ~665 kbps to RDP (25% of the remaining 65% of the 4096 kbps shaped rate). In any case, check whether you're dropping packets by issuing the command below:

show policy-map interface <your wan interface> output class REMOTEDESKTOP

and checking for dropped packets.
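The ~665 kbps estimate can be reproduced with a few lines of arithmetic. This is only a sketch of the reasoning above, taking the 4096000 bps shaper from the posted config and the 35% reserved figure as given. The function name is hypothetical.

```python
def remaining_share_bps(shape_bps: int, reserved_percent: float, remaining_percent: float) -> float:
    """Bandwidth a 'bandwidth remaining percent' class gets under congestion."""
    # What is left after the reserved portion (here EF 10% + class-default 25%).
    remaining_bps = shape_bps * (100 - reserved_percent) / 100
    # The class's slice of that remainder.
    return remaining_bps * remaining_percent / 100


rdp_bps = remaining_share_bps(4096000, reserved_percent=35, remaining_percent=25)
print(rdp_bps / 1000, "kbps")
```

On the 2 Mbps sites the same percentages leave RDP correspondingly less, so it is worth rerunning the numbers per shaper value.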

  • Cisco assigns a queue to each user-defined class that includes the bandwidth or priority commands. To make a long story short, those commands define the amount of bandwidth assigned to each queue during congestion.

  • In theory every TCP-based stream should be OK with drops. In practice some of them aren't. Drop precedence bits tell the routers which packets within a given class should be dropped first, before congestion sets in. Since RDP is the only type of traffic defined in your REMOTEDESKTOP class, you should not worry about it.

  • The bandwidth percentages are not in order (as Jeremy stated).

That said, there are a lot of things that I would change in your configuration:

  • Some classes in the useTos policy-map have no counterpart in setTos. For instance, the class named AF-HIGH is processing no packets, since no class in setTos sets DSCP to AF41.

  • The BE class in useTos is somewhat self-defeating, since it makes the class-default class useless. Note that class-default is the only system-defined class and gets 25% of the bandwidth by default (100 - max-reserved-bandwidth).

  • bandwidth remaining percent is not the best option (as Jeremy explained). I would change it to bandwidth percent.

  • I would prefer to mark EF packets myself rather than rely on the phones' settings.
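The mismatch between setTos and useTos from the first point in this list can be checked mechanically. The sketch below transcribes the DSCP values from the posted config by hand. It assumes setTos is applied to all traffic before useTos sees it (the question does not show where the policies are attached), and it leaves out the NETWORK class, which matches on IP precedence.

```python
# DSCP values leaving setTos: EF/CS5 pass through unchanged,
# everything else is re-marked to one of these.
SET_TOS_MARKS = {"ef", "cs5", "af31", "af21", "af11", "default"}

# Classes referenced in useTos and the DSCP values their class-maps match.
USE_TOS_CLASSES = {
    "EF": {"ef", "cs5"},
    "AF-HIGH": {"af41", "cs4"},
    "AF-MEDHI": {"af31", "cs3"},
    "AF-LOW": {"af11", "cs1"},
    "BE": {"default"},
}

# Queuing classes that can never see traffic marked by setTos.
idle_classes = sorted(
    name for name, matches in USE_TOS_CLASSES.items()
    if not (matches & SET_TOS_MARKS)
)

# Marks that no queuing class matches, so they land in class-default.
all_matched = set().union(*USE_TOS_CLASSES.values())
unqueued_marks = sorted(SET_TOS_MARKS - all_matched)

print("idle classes:", idle_classes)
print("marks without a class:", unqueued_marks)
```

Under those assumptions AF-HIGH comes out idle, and traffic marked af21 (INTERNAL-PRI) has no class of its own in useTos, since AF-MEDIUM is defined as a class-map but never referenced there.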