How does Cisco IOS check authentication server reachability for AAA?

Let's suppose we have a line like this:

aaa authentication login default group radius group tacacs+ local 

According to the IOS 12.2 documentation:

If R1 authenticates the user, it issues a PASS response to the network access server and the user is allowed to access the network. If R1 returns a FAIL response, the user is denied access and the session is terminated. If R1 does not respond, then the network access server processes that as an ERROR and queries R2 for authentication information.

It is important to remember that a FAIL response is significantly different from an ERROR. A FAIL means that the user has not met the criteria contained in the applicable authentication database to be successfully authenticated. Authentication ends with a FAIL response. An ERROR means that the security server has not responded to an authentication query. Because of this, no authentication has been attempted. Only when an ERROR is detected will AAA select the next authentication method defined in the authentication method list.

My question is: how exactly does a Cisco device check whether an authentication server is reachable? Is it possible to modify or extend this method?

The motivation behind this question is to ensure that the reachability criteria are based on specific L7 checks rather than L3/L4 ones.

Best Answer

The router sends the initial request and simply waits for a well-formed answer from the RADIUS/TACACS+ server. There are no active "keepalive"-style health checks; the router doesn't ping the server and look at response times or anything like that.

What the router does next depends on your configured criteria. In general, once the timeout expires (the default is 5 seconds) or a malformed response is received, the router tries the secondary server or falls through to the next configured authentication method.

RADIUS, unlike TACACS+, can also mark a server as "dead" and stop sending authentication requests to it for a preconfigured amount of time.

See the following settings to tweak this behavior:

Timeout Configuration:

TEST-1861(config)#tacacs-server timeout ?
  <1-1000>  Wait time (default 5 seconds)

TEST-1861(config)#radius-server timeout ?
  <1-1000>  Wait time (default 5 seconds)
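
As a rough sketch, assuming you want faster failover than the 5-second default, the timeouts could be lowered as shown here. The 3-second value is purely an illustrative assumption. The radius-server retransmit command does not appear in the help output in this answer; it is included on the assumption that it is available in your IOS release.

```
! Illustrative values only (assumptions, not recommendations)
! Wait 3 seconds for a TACACS+ reply before treating it as an ERROR
tacacs-server timeout 3
! Wait 3 seconds for a RADIUS reply before retransmitting
radius-server timeout 3
! Retransmit each RADIUS request up to 2 times before giving up on the server
radius-server retransmit 2
```

With values like these, an unresponsive RADIUS server should cost roughly timeout x (retransmit + 1) seconds, here about 9 seconds, before the router moves on to the next server or method.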

RADIUS Dead-timer Configuration:

TEST-1861(config)#radius-server deadtime ?
  <1-1440>  time in minutes

TEST-1861(config)#radius-server dead-criteria ?
  time   The time during which no properly formed 
         response must be received from the RADIUS server
  tries  The number of times the router must fail 
         to receive a response from the radius server
         to mark it as dead

TEST-1861(config)#radius-server dead-criteria time ?
  <1-120>  Time in seconds during which no response must
  be received from the RADIUS server in order to consider it dead

TEST-1861(config)#radius-server dead-criteria tries ?
  <1-100>  Number of transmits to radius server without 
  responses before marking server as dead
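
Putting the dead-server options together, a configuration might look like the following sketch. All values are assumptions chosen for illustration and would need tuning for a real network.

```
! Illustrative values only (assumptions, not recommendations)
! Mark a server dead once 10 seconds have passed without a valid response
! and 3 consecutive transmissions have gone unanswered
radius-server dead-criteria time 10 tries 3
! Skip a server marked dead for 5 minutes before trying it again
radius-server deadtime 5
```

Note that these criteria are still driven by real authentication traffic: a server is only marked dead after actual requests to it go unanswered, which matches the passive behavior described above.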