Description
Hi team, I'm running some benchmarks for a sparse matrix-matrix multiplication implementation of mine and comparing everything against Eigen CSR (that's the baseline). I'm getting these results for TACO in the BCSR configuration that @stephenchouca suggested here:
TACO is being compiled as follows: cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_POLICY_VERSION_MINIMUM=3.5 ... So no OpenMP or anything else.
The size and density are those of the block-sparse matrices (with uniform block sizes), and the speedup is calculated relative to Eigen.
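To make the terms above concrete, here is a minimal sketch of one way block density and speedup could be defined. It is not the actual benchmark generator: `BlockPattern`, `makeBlockPattern`, `blockDensity`, and `speedup` are hypothetical names, and the sketch assumes that density means the fraction of stored blocks and that speedup is baseline time divided by candidate time.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical description of a block-sparse pattern with uniform square blocks.
struct BlockPattern {
    std::size_t blockRows;   // number of block rows
    std::size_t blockCols;   // number of block columns
    std::size_t blockSize;   // each block is blockSize x blockSize
    std::vector<bool> mask;  // mask[i * blockCols + j] is true if block (i, j) is stored
};

// Keep each block independently with probability `density` (an assumption;
// the real generator may pick blocks differently).
BlockPattern makeBlockPattern(std::size_t blockRows, std::size_t blockCols,
                              std::size_t blockSize, double density,
                              unsigned seed) {
    std::mt19937 rng(seed);
    std::bernoulli_distribution keep(density);
    BlockPattern p{blockRows, blockCols, blockSize,
                   std::vector<bool>(blockRows * blockCols, false)};
    for (std::size_t k = 0; k < p.mask.size(); ++k) {
        p.mask[k] = keep(rng);
    }
    return p;
}

// Fraction of blocks that are stored (0.0 for an empty pattern).
double blockDensity(const BlockPattern& p) {
    std::size_t stored = 0;
    for (bool b : p.mask) {
        stored += b ? 1 : 0;
    }
    return p.mask.empty()
               ? 0.0
               : static_cast<double>(stored) / static_cast<double>(p.mask.size());
}

// Speedup relative to a baseline: values above 1 mean the candidate is faster.
double speedup(double baselineSeconds, double candidateSeconds) {
    return baselineSeconds / candidateSeconds;
}
```

Under this definition, a speedup below 1 means the candidate (here TACO) is slower than the Eigen baseline.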
Is it possible that TACO is supposed to work best with OpenMP for these cases? Or am I missing something?
Also, I run pack on the input matrices and call compile and assemble beforehand, and the main benchmark (using Google Benchmark) looks as follows:
```cpp
for (const auto& prepared : preparedCases) {
    const std::string benchName = makeBenchmarkName(prepared->config);
    benchmark::RegisterBenchmark(
        benchName.c_str(),
        [prepared](benchmark::State& state) {
            for (auto _ : state) {
                prepared->result.compute();
                benchmark::DoNotOptimize(prepared->result);
                benchmark::ClobberMemory();
            }
            const double total_mults = static_cast<double>(state.iterations());
            state.counters["mults"] = benchmark::Counter(total_mults);
            state.counters["mults_per_sec"] =
                benchmark::Counter(total_mults, benchmark::Counter::kIsRate);
        });
}
```
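As a cross-check that does not depend on Google Benchmark, the same measurement could be sketched with `std::chrono`. This is only an illustration under assumptions: `secondsPerCall` is a hypothetical helper, and the kernel passed to it stands in for a call such as `prepared->result.compute()`, with all setup (pack, compile, assemble) done before the call so that only the kernel is timed.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Average wall-clock seconds per call of `kernel` over `iterations` timed runs,
// preceded by `warmup` untimed runs. Returns 0.0 if `iterations` is zero.
double secondsPerCall(const std::function<void()>& kernel,
                      std::size_t warmup, std::size_t iterations) {
    // Untimed warm-up runs to populate caches and trigger any lazy initialization.
    for (std::size_t i = 0; i < warmup; ++i) {
        kernel();
    }
    if (iterations == 0) {
        return 0.0;
    }
    // steady_clock is monotonic, so it is not affected by system clock changes.
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        kernel();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

A hypothetical use would be `secondsPerCall([&] { prepared->result.compute(); }, 3, 100)`, run once per library, with the two averages then compared directly.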