How to know which element replaces which for a cache

Tags: c, cache, mips

If I assume that the first element of the matrix fetched into the D-cache is a[0][0], then for associativity 4, which element of which matrix will overwrite a[0][0] in the D-cache? The formula for set-associative placement is

In a set-associative cache, the set containing a memory block is given
by (Block number) modulo (Number of sets in the cache)

How can I know, from this code compiled and run as MIPS assembly in a MIPS simulator, which element will overwrite a[0][0]?

/* matris.c */
#include <stdio.h>
#include <idt_entrypt.h>

#define MATRIXSIZE 16
#define MATRIXSIZE_ROWS 16
#define MATRIXSIZE_COLS 16

/*
 * add two matrices
 */
void matrisadd( int res[MATRIXSIZE_ROWS][MATRIXSIZE_COLS],
                int   a[MATRIXSIZE_ROWS][MATRIXSIZE_COLS],
                int   b[MATRIXSIZE_ROWS][MATRIXSIZE_COLS] )
{
  int i,j;

  for(i=0; i < MATRIXSIZE; ++i) /* vary row index */
    for(j=0; j < MATRIXSIZE; ++j) /* vary column index */
      res[i][j] = a[i][j] + b[i][j];
}

int main()
{
  static int   a[MATRIXSIZE_ROWS][MATRIXSIZE_COLS];
  static int   b[MATRIXSIZE_ROWS][MATRIXSIZE_COLS];
  static int res[MATRIXSIZE_ROWS][MATRIXSIZE_COLS];
  int i,j, Time;

  /*
   * initialize matrices a and b
   */
  for( i=0; i<MATRIXSIZE; ++i)
    for( j=0; j<MATRIXSIZE; ++j)
    {
      a[i][j] = i+j;
      b[i][j] = i-j;
    }

  flush_cache();              /* empty the cache */
  timer_start();              /* reset the timer */

  matrisadd( res, a, b);

  Time = timer_stop();                /* read the elapsed time */
  printf("Time: %d\n",Time);
  return 0;
}

Best Answer

I think ANSI C dictates that multi-dimensional arrays are laid out in memory in row-major order. That means the elements of a row are contiguous in memory, which is to say a[0][0] is adjacent to a[0][1] in memory.

The way set-associative caches work is that the low bits of the address give the byte offset within a line, the next bits select the set, and the remaining bits form the tag. So you need to know the line width (how many bytes are on a line), and you need to know how many lines (ways) each set contains.

Let's say you didn't have an associative cache (i.e. associativity = 1, a direct-mapped cache). Let's also say you had 32 bytes per line and 16 lines, so your total cache size would be 16 * 32 = 512 bytes. Your elements are 4-byte ints, so one line holds 32 / 4 = 8 of them. If a[0][0] was allocated to memory such that it sat on a cache-line boundary (i.e. its address was divisible by 32), then the line it brought in would contain a[0][0] through a[0][7]. If, on the other hand, the cardinality of the second dimension were smaller than that, say 4, a single line would span rows: a[0][0] ... a[0][3], a[1][0] ... a[1][3], and so on.

However your array is dimensinoed, after 512 bytes, you're going to be full and the 513th byte offset from a[0][0] is going to blow away the row containing a[0][0]. Unless your cache is associative. In that case, your 513th entry is going to go into the next set, and that's going to keep happening for as many ways associative your cache is. Only after exhausting the sets will you wrap back around and blow away a[0][0]. Now, obviously for the same size cache, increasing associativity will effectively reduce the number of lines per set. So the access pattern determines whether it's helpful or not. Mileage may vary.