Source code for the cache management technique, GRASP, published in [Faldu et al., HPCA'20] and the evaluated benchmarks.
This repo contains two parts: (1) Benchmarks and (2) Simulator.
This repo implements a simple data structure optimization to improve cache efficiency for three Ligra applications -- PageRank, PageRankDelta and BellmanFord. To make this repo standalone, it retains a large chunk of the code from the original Ligra framework along with changes from Balaji et al. [IISWC'18] and Faldu et al. [2]. For detailed information on the Ligra framework, please consult their original repository. To know how to apply vertex reordering to improve cache efficiency, please consult the repository for DBG.
Specifically, we merge two arrays to improve spatial locality for the above mentioned applications. Files with Opt suffix contains our optimized version of the applications.
This repo contains a simple trace-based simulation infrastructure contianing four cache management techniques -- LRU, Belady, PIN and Grasp. The simulator expects a file containing a header and a trace of L2 miss addresses as specified in read_file() of cache.h. Simulator is very simple, and assumes that all addresses are memory read. Simulator provides a first order effect of various policies on cache efficiency. However, it may not provide accurate results for write-dominated applications.
Finally, note that in the paper [1], we used the Sniper simulator. We included the Sniper APIs used to instrument applications to pass the region bounds to the cache microarchitecture in the Ligra source code. Look for code under the macro _SNIPER_.
export DBG_ROOT='directory where this repo is cloned'
cd ${DBG_ROOT}/apps
make clean; make cleansrc; make -j
For more details on how to run, see DBG.
export DBG_ROOT='directory where this repo is cloned'
cd ${DBG_ROOT}/trace-based-simulators
make clean; make POLICY=grasp;
Normalized application runtime (lower is better) after data-structure optimization for above mentioned three applications when processing various datasets. The evaluation is done on a dual-socket server with two Broadwell based Intel Xeon CPU E5-2630 exposing total of 40 hardware threads.
Normalized Runtime | BellmanFordOpt-iters | PageRankDeltaOpt | PageRankOpt |
---|---|---|---|
dbpedia-link | 0.98 | 0.71 | 0.77 |
friendster | 1.02 | 0.88 | 0.66 |
pld-arc | 0.92 | 0.94 | 0.71 |
sd-arc | 0.94 | 0.93 | 0.70 |
soc-LiveJournal1 | 0.97 | 0.81 | 0.76 |
twitter_mpi | 0.99 | 0.65 | 0.67 |
twitter_rv | 0.95 | 0.69 | 0.69 |
web-Google | 0.92 | 0.78 | 0.79 |
Miss-rate for various cache management techniques for five graph applications processing the web-Google dataset using the traces provided in the ${DBG_ROOT}/datasets directory. Datasets are reorderd using DBG. Simulation assumes 16-way associative 1MB cache.
Miss-rate(%) | LRU | PIN | GRASP | Belady |
---|---|---|---|---|
BC | 60.2 | 62.4 | 36.5 | 33.3 |
SSSP | 92.1 | 91.3 | 85.7 | 61.3 |
PR | 87.9 | 75.7 | 64.7 | 56.5 |
PRD | 87.9 | 75.7 | 64.7 | 56.5 |
Radii | 84.2 | 72.4 | 62.7 | 51.0 |
The repo also contains code from multiple other repositories (e.g., Ligra, Graph-Reordering-IISWC18, GAP, SNIPER, DBG) and the original copyright and license constraints apply to their code.
Please cite the following if you use the source code from this repository in your research.
@inproceedings{DBGFalduHPCA20,
author={Priyank Faldu and Jeff Diamond and Boris Grot},
booktitle={International Symposium on High-Performance Computer Architecture (HPCA)},
title="{Domain-Specialized Cache Management for Graph Analytics}",
year={2020},
month=feb,
}
@inproceedings{DBGFalduIISWC19,
author={Priyank Faldu and Jeff Diamond and Boris Grot},
booktitle={International Symposium on Workload Characterization (IISWC)},
title="{A Closer Look at Lightweight Graph Reordering}",
year={2019},
month=nov,
}
[1] P. Faldu and J. Diamond and B. Grot, "Domain-Specialized Cache Management for Graph Analytics", in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA), February 2020.
[2] P. Faldu and J. Diamond and B. Grot, "A Closer Look at Lightweight Graph Reordering," in Proceedings of the International Symposium on Workload Characterization (IISWC), November 2019.