## Memlat

Memlat is a tiny benchmark program that measures cache and memory access latencies. I wrote it based on an idea from the article "What Every Programmer Should Know About Memory" by Ulrich Drepper. If you find a bug or have a comment, you can email me at minlee (at) cc.gatech.edu

## Instructions

- (1) Download and untar the archive.
- (2) The STRIDE size is hardcoded as 64 bytes. For accuracy, it must equal your cache line size, so adjust it if yours differs.
- (3) Run 'make', then run the program.
- (4) Check that the cycle value (the first column in the example below) matches your processor speed. In the example below, the processor ran at 2992.500 MHz, which matches the cycle value. This matters because memlat uses the rdtsc instruction to measure cycles, and the processor must run at full speed for an accurate measurement. The cycle value is the number of cycles rdtsc measured over 1 second.
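
The check in step (4) can be scripted. The following is a minimal sketch, not part of memlat: the helper names, the 1% tolerance, and the sample `/proc/cpuinfo` text are assumptions made for illustration. It compares a one-second cycle count against the reported clock speed.

```python
import re

def cpu_mhz_from_cpuinfo(text):
    """Return the first 'cpu MHz' value in /proc/cpuinfo-style text, or None."""
    match = re.search(r"^cpu MHz\s*:\s*([0-9.]+)", text, re.MULTILINE)
    return float(match.group(1)) if match else None

def tsc_matches_clock(cycles_per_second, cpu_mhz, tolerance=0.01):
    """True if a one-second cycle count is within `tolerance` (relative) of the clock."""
    expected = cpu_mhz * 1e6  # MHz -> cycles per second
    return abs(cycles_per_second - expected) / expected <= tolerance

# Hypothetical cpuinfo excerpt using the clock speed quoted in step (4).
sample_cpuinfo = "model name\t: example\ncpu MHz\t\t: 2992.500\n"
mhz = cpu_mhz_from_cpuinfo(sample_cpuinfo)

# 2992305429 is one of the cycle values from the example run.
print(mhz, tsc_matches_clock(2992305429, mhz))
```

On a Linux machine the text could come from reading `/proc/cpuinfo`. If the check fails, the processor was probably not running at full speed during the measurement.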

## Example

[root@piquet memlat]# ./memlat
Usage: ./memlat size(KB) duration(second) random(0|1)
stride size is 64, random(0|1) is do_shuffle switch, so give 0 for sequential, 1 for random. each run is 1sec, and the final report shows min,max,average for both cycles,performance(unit:Million elements traversed).
For accuracy, make sure stride size (current 64) == your cache line size. Also set affinity to one cpu without running any other process. Note that 2 or 3 cycles are typically measurable minimum due to size of core loop, so for L1 cache you'll see them even if it has actually less latency.
[root@piquet memlat]#
[root@piquet memlat]#
[root@piquet memlat]#
[root@piquet memlat]# ./memlat 512 10 1
64(STRIDE size) * 8192(# of stride) = 512 KB
cycle: 2994414795, count:189324540, so, 15.816306 cycles/memref
cycle: 2992305429, count:189193313, so, 15.816127 cycles/memref
cycle: 2992371714, count:189202611, so, 15.815700 cycles/memref
cycle: 2992375665, count:189144438, so, 15.820585 cycles/memref
cycle: 2992366656, count:189190240, so, 15.816707 cycles/memref
cycle: 2992360986, count:189201527, so, 15.815734 cycles/memref
cycle: 2992376781, count:189206673, so, 15.815387 cycles/memref
cycle: 2992372398, count:189206898, so, 15.815345 cycles/memref
cycle: 2992380669, count:189208316, so, 15.815270 cycles/memref
cycle: 2992382568, count:189207564, so, 15.815343 cycles/memref
summary: cycle 15.815 15.821 15.816 perf 189.144 189.324 189.208
[root@piquet memlat]#
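
The numbers in the run above can be recomputed from the printed lines. This is a sketch for reading the output, not memlat's own code: cycles/memref is the cycle value divided by the count, and the working set is the stride times the number of strides. Treating "perf" as count divided by one million is an assumption based on the usage text. The function names are made up for this example.

```python
import re

STRIDE = 64  # bytes, as reported by the example run

LINE_RE = re.compile(r"cycle:\s*(\d+),\s*count:\s*(\d+)")

def parse_runs(output):
    """Return (cycles, count) pairs from memlat's per-second lines."""
    return [(int(c), int(n)) for c, n in LINE_RE.findall(output)]

def min_max_avg(values):
    """Return (min, max, average) of a non-empty list."""
    return min(values), max(values), sum(values) / len(values)

def summarize(runs):
    """Return min/max/avg of cycles per memref and of millions of elements."""
    cycles_per_ref = [cycles / count for cycles, count in runs]
    perf_millions = [count / 1e6 for _, count in runs]
    return min_max_avg(cycles_per_ref), min_max_avg(perf_millions)

# The ten per-second lines from the example run.
sample = """
cycle: 2994414795, count:189324540
cycle: 2992305429, count:189193313
cycle: 2992371714, count:189202611
cycle: 2992375665, count:189144438
cycle: 2992366656, count:189190240
cycle: 2992360986, count:189201527
cycle: 2992376781, count:189206673
cycle: 2992372398, count:189206898
cycle: 2992380669, count:189208316
cycle: 2992382568, count:189207564
"""

runs = parse_runs(sample)
cycle_stats, perf_stats = summarize(runs)
strides = 512 * 1024 // STRIDE  # 512 KB working set split into 64-byte strides
print(strides, ["%.3f" % v for v in cycle_stats])
```

The cycle statistics can be compared with the `summary:` line. The perf figures may differ in the last digit, since memlat's exact rounding is not known here.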

## Graph

Run memlat with various working set sizes, collect the cycle numbers, and plot them as a graph. I was able to get graphs similar to the one above.
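
One way to collect the data points is a small driver script. The sketch below rests on several assumptions: memlat is taken to print its `summary:` line to standard output, the range of 4 KB to 64 MB in powers of two is arbitrary, and the function names are invented for this example. It only gathers (size, average cycles) pairs; plotting is left to whatever tool you prefer.

```python
import re
import subprocess

SUMMARY_RE = re.compile(
    r"summary: cycle (\S+) (\S+) (\S+) perf (\S+) (\S+) (\S+)"
)

def parse_summary(output):
    """Return (min, max, avg) cycles per memref from the summary line, or None."""
    match = SUMMARY_RE.search(output)
    if match is None:
        return None
    return tuple(float(match.group(i)) for i in (1, 2, 3))

def working_set_sizes_kb(smallest_kb=4, largest_kb=65536):
    """Powers of two from smallest_kb up to and including largest_kb."""
    sizes = []
    kb = smallest_kb
    while kb <= largest_kb:
        sizes.append(kb)
        kb *= 2
    return sizes

def sweep(binary="./memlat", duration=3, random=1, sizes=None):
    """Run memlat once per working set size; return (size_kb, avg_cycles) rows."""
    rows = []
    for kb in sizes or working_set_sizes_kb():
        result = subprocess.run(
            [binary, str(kb), str(duration), str(random)],
            capture_output=True, text=True, check=True,
        )
        stats = parse_summary(result.stdout)
        if stats is not None:
            rows.append((kb, stats[2]))
    return rows
```

Calling `sweep()` from the memlat directory returns rows that can be written to a file and plotted, with a logarithmic x-axis for the working set size.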

## Download

memlat-0.1.tgz for x86_64.
*last updated : Jan 2012*