Sample contended locks by overflowing interval bucket #982

krk · 2024-09-02T15:28:57Z

Description

Implement a sampling method for contended locks, controlled by the lock
threshold (lock=DURATION). Each contended lock's duration is added to a
bucket, and a sample is taken only when the bucket overflows.
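
The bucket idea can be sketched roughly as follows. This is an illustration in Java, not the profiler's native code: the class name `IntervalBucket` and the method `add` are hypothetical, and the rule that an interval of 0 or 1 samples every event is an assumption taken from the review discussion of the 0 and 1 test cases.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the native counter; not the actual C++ implementation.
class IntervalBucket {
    private final AtomicLong counter = new AtomicLong();
    private final long interval;

    IntervalBucket(long interval) {
        this.interval = interval;
    }

    // Adds one contended lock's duration to the bucket.
    // Returns true when the bucket overflows, i.e. this event should be sampled.
    boolean add(long duration) {
        if (interval <= 1) {
            return true; // assumption: 0 or 1 means "sample every event"
        }
        while (true) {
            long prev = counter.get();
            long next = prev + duration;
            if (next < interval) {
                // Bucket not full yet: store the new level, do not sample.
                if (counter.compareAndSet(prev, next)) {
                    return false;
                }
            } else if (counter.compareAndSet(prev, next % interval)) {
                // Bucket overflowed: keep the remainder and sample this event.
                return true;
            }
            // Another thread changed the counter in between; retry.
        }
    }
}
```

With an interval of 100, three consecutive durations of 40 would leave the first two unsampled and sample the third, keeping a remainder of 20 in the bucket.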

Also included are two commits to make testing reliable and easier:

Update PmuTests to have more cache misses
Allow setting TESTS variable from make cli

Related issues

#805

Motivation and context

The goal is to reduce overhead when profiling lock contention.

How has this been tested?

Different threshold values were tested with the included LockProfiling.java and Renaissance benchmarks.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Makes the test more reliable, passes in higher spec machines.

e.g. `make test TESTS=lock` will run tests under test/test/lock folder.

apangin · 2024-09-04T01:23:45Z

src/lockTracer.cpp

+    unsigned long tid = (unsigned long)syscall(SYS_gettid);
+
+    // _threshold is used both as a duration threshold and a bucket interval. When the counter overflows _threshold, the event is sampled.
+    if (_enabled && duration >= _threshold && enter_time >= _start_time && updateCounter(_total_duration, duration, _threshold)) {


The `duration >= _threshold` condition is exactly how the current algorithm works.
The idea of the counter was to replace this condition.

apangin · 2024-09-04T01:25:57Z

src/lockTracer.h

  private:
    static double _ticks_to_nanos;
    static jlong _threshold;
+    static jlong _interval;


I don't see where _interval is used.

apangin · 2024-09-04T01:27:20Z

test/one/profiler/test/Output.java


        try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
-             PrintStream out = new PrintStream(outputStream)) {
+                PrintStream out = new PrintStream(outputStream)) {


Please do not reformat untouched code, especially since the existing style already matches default IDEA settings.

I was using the VS Code default rules; I will not reformat. Thanks.

apangin · 2024-09-04T01:40:49Z

test/one/profiler/test/Test.java

    boolean enabled() default true;
+
+    // Optional inputs to the test method.
+    double[] inputs() default {};


double[] is probably not the best option for a generic case. String[] would likely be a better universal choice, similarly to main method accepting String[] args.
Optionally, the framework could automatically translate inputs to proper types derived from a method signature. This may unnecessarily complicate the code, though.

Will accept String[] and convert to double[] now, leaving the option to automatically support other types.
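
A minimal sketch of that conversion, assuming the inputs arrive as a plain `String[]`. The class name `TestInputs` and the helper `toDoubles` are hypothetical and not part of the test framework.

```java
import java.util.Arrays;

// Hypothetical helper; the real test framework may convert inputs differently.
class TestInputs {
    // Parses String inputs into doubles, much like handling main(String[] args).
    // Throws NumberFormatException if an input is not a valid number.
    static double[] toDoubles(String[] inputs) {
        return Arrays.stream(inputs)
                .mapToDouble(Double::parseDouble)
                .toArray();
    }
}
```

For example, `TestInputs.toDoubles(new String[] { "10", "75" })` yields an array holding 10.0 and 75.0. Keeping the annotation value as `String[]` leaves room for other target types later without changing the annotation itself.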

apangin · 2024-09-04T01:45:49Z

test/test/lock/LockTests.java

+    @Test(mainClass = LockProfiling.class, inputs = { 10, 75 })
+    @Test(mainClass = LockProfiling.class, inputs = { 100, 75 })
+    @Test(mainClass = LockProfiling.class, inputs = { 1000, 75 })
+    @Test(mainClass = LockProfiling.class, inputs = { 10000, 75 })


Too many options, IMO - this will significantly increase test run time. I'd reduce the number of tests to the required minimum (let's say, up to 10 seconds in total).

apangin · 2024-09-04T01:48:05Z

test/test/pmu/Dictionary.java

    public static void test8M() {
-        long[] array = new long[1024 * 1024];
-        testRandomRead(array, 1024 * 1024);
+        long[] array = new long[8 * 1024 * 1024];


8M meant 8 Megabytes, you made it 64. Perhaps, we can rename it to test8MB to avoid confusion.
Does the test fail with 8MB?

Tests fail with 8MB:

WARNING: PmuTests.cacheMisses failed
java.lang.AssertionError: Expected 0.6086956521739131 < 0.2
    at one.profiler.test.Assert.isLess(Assert.java:18)
    at test.pmu.PmuTests.cacheMisses(PmuTests.java:40)

I can rename the functions to test16KB/test8MB or testSmallRead/testLargeRead.
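
The 8 MB versus 64 MB difference follows from the element size: a Java `long` occupies 8 bytes, so the array length has to be multiplied by 8 to get the payload size. A small sketch of that arithmetic, with a hypothetical helper name and ignoring the JVM array header:

```java
// Hypothetical helper for illustration only.
class ArraySizes {
    // Payload size in bytes of a long[] with the given number of elements,
    // ignoring the small JVM array header.
    static long payloadBytes(long elements) {
        return elements * Long.BYTES;
    }
}
```

Here `ArraySizes.payloadBytes(1024 * 1024)` is 8,388,608 bytes (8 MB, the original array), while `ArraySizes.payloadBytes(8 * 1024 * 1024)` is 67,108,864 bytes (64 MB, the array after the change).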

apangin · 2024-09-04T18:27:24Z

src/arguments.cpp

 //     alloc[=BYTES]    - profile allocations with BYTES interval
 //     live             - build allocation profile from live objects only
-//     lock[=DURATION]  - profile contended locks longer than DURATION ns
+//     lock[=DURATION]  - profile contended locks overflowing the DURATION ns bucket (default: 10u)


apangin · 2024-09-04T18:29:22Z

src/lockTracer.cpp


-
 double LockTracer::_ticks_to_nanos;
 jlong LockTracer::_threshold;


Maybe, _interval? Similarly to other engines.

apangin · 2024-09-04T18:33:05Z

test/one/profiler/test/TestProcess.java

        if (args != null && !args.isEmpty()) {
            args = substituteFiles(args);
-            for (StringTokenizer st = new StringTokenizer(args, " "); st.hasMoreTokens(); ) {
+            for (StringTokenizer st = new StringTokenizer(args, " "); st.hasMoreTokens();) {


Extraneous formatting changes

apangin · 2024-09-04T18:40:28Z

test/one/profiler/test/Output.java

+            long samples = extractSamples(s);
+            matched1 += pattern1.matcher(s).find() ? samples : 0;
+            matched2 += pattern2.matcher(s).find() ? samples : 0;
+        }


Why not reuse the samples method?

apangin · 2024-09-04T18:44:00Z

test/one/profiler/test/Runner.java

+                if (p.inputs().length == 0) {
+                    m.invoke(holder, p);
+                } else {
+                    m.invoke(holder, p, p.inputs());


I'm not sure we need to pass inputs() explicitly.
Test method already accepts p and can extract inputs whenever it wants.

apangin · 2024-09-04T18:47:51Z

test/test/lock/LockProfiling.java

+public class LockProfiling {
+    final static int timeOutsideLock = 1_000_000; // 1 ms
+    final static ThreadLocal<Double> totalUsefulWork = ThreadLocal.withInitial(() -> 0.0);
+    final static ThreadLocal<Double> totalWait = ThreadLocal.withInitial(() -> 0.0);


The canonical order of modifiers is static final

apangin · 2024-09-04T18:50:38Z

test/test/lock/LockTests.java

+
+    // 0 is equivalent to disabling sampling of locks, so all profiles are included.
+    @Test(mainClass = LockProfiling.class, inputs = { "0", "75" })
+    @Test(mainClass = LockProfiling.class, inputs = { "1", "75" })


Technically, there is no difference between 0 and 1, you can leave only 0 case here.

apangin · 2024-09-04T18:51:29Z

test/test/lock/LockTests.java

+    @Test(mainClass = LockProfiling.class, inputs = { "10000", "75" })
+
+    // Large (for the specific paylod) interval value skews the sampled lock
+    // contention distribution.


Is it an expected behavior? Is it what we want to assert here?

This is an expected behavior. As the interval increases, lock contentions with longer durations will be favored.

Even though it is expected, it is probably not the intended behavior.
A test should verify an intended behavior.
What is the actual ratio in all these cases?

interval = 0, ratio = 75.84415584415585, minRatio = 70.0
interval = 10000, ratio = 75.125, minRatio = 70.0
interval = 1000000, ratio = 95.3307392996109, minRatio = 90.0
interval = 1000000000, ratio = NaN, minRatio = NaN
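
The skew toward longer contentions can be reproduced on synthetic data. The sketch below is not the LockProfiling workload and its "share" is not necessarily the ratio the test asserts on: it assumes short and long contended locks alternate, samples an event whenever a single-threaded bucket overflows, and counts what fraction of the samples belongs to the long lock. All names are hypothetical.

```java
// Synthetic illustration only; names and workload are assumptions.
class SkewDemo {
    // Fraction of samples taken on the long lock when short and long
    // contended locks alternate and sampling happens on bucket overflow.
    // Returns NaN if no event is sampled at all.
    static double longShare(long interval, long shortNs, long longNs, int pairs) {
        long counter = 0;
        long shortSamples = 0;
        long longSamples = 0;
        for (int i = 0; i < 2 * pairs; i++) {
            boolean isLong = (i % 2 == 1);
            long duration = isLong ? longNs : shortNs;
            boolean sampled;
            if (interval <= 1) {
                sampled = true; // assumption: 0 or 1 samples every event
            } else {
                counter += duration;
                sampled = counter >= interval;
                if (sampled) {
                    counter %= interval; // keep the remainder in the bucket
                }
            }
            if (sampled) {
                if (isLong) {
                    longSamples++;
                } else {
                    shortSamples++;
                }
            }
        }
        return (double) longSamples / (longSamples + shortSamples);
    }
}
```

With 1,000 ns and 9,000 ns locks, an interval of 0 samples every event, so the long lock gets half of the samples. With an interval of 100,000 ns the overflow lands on the long lock far more often, and its share rises well above one half. If the interval exceeds the total contended time, nothing is sampled and the share is NaN.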

apangin · 2024-09-09T12:52:35Z

test/one/profiler/test/Runner.java

 import java.lang.reflect.InvocationTargetException;
 import java.lang.reflect.Method;
 import java.lang.reflect.Modifier;
+import java.util.ArrayList;


Seems like there are no meaningful changes in this file - please revert it back.

thanks, fixed.

apangin · 2024-09-09T12:54:18Z

test/test/lock/LockTests.java


 public class LockTests {
+    private static void contendedLocks(TestProcess p, int interval, double minRatio) throws Exception {
+        Output out = p.profile("-e lock --lock " + interval + " --threads -o collapsed");


-e lock is an obsolete form. --lock option is enough.

thanks, fixed.

apangin · 2024-09-09T13:03:32Z

test/test/lock/LockTests.java

+    @Test(mainClass = LockProfiling.class, inputs = { "10000", "75" })
+
+    // Large (for the specific paylod) interval value skews the sampled lock
+    // contention distribution.


Even though it is expected, it is probably not the intended behavior.
A test should verify an intended behavior.
What is the actual ratio in all these cases?

krk added 2 commits August 27, 2024 18:07

Update PmuTests to have more cache misses.

d6d8633

Makes the test more reliable, passes in higher spec machines.

Allow setting TESTS variable from make cli.

8725043

e.g. `make test TESTS=lock` will run tests under test/test/lock folder.

apangin reviewed Sep 4, 2024

krk force-pushed the lock-mon branch from 68b27e1 to b33813b Compare September 4, 2024 11:16

apangin reviewed Sep 4, 2024

krk force-pushed the lock-mon branch from b33813b to e2e979f Compare September 5, 2024 17:04

apangin reviewed Sep 9, 2024

krk force-pushed the lock-mon branch from e2e979f to 82b5fcf Compare September 9, 2024 13:20

Minor formatting changes

29d7f39

apangin merged commit 25fa02e into async-profiler:master Sep 9, 2024
1 check passed

krk mentioned this pull request Sep 9, 2024

Lock monitoring overhead/accuracy #805

Closed

krk mentioned this pull request Oct 9, 2024

Assert on total wait time in lock contention tests #1017

Merged

apangin mentioned this pull request Nov 5, 2024

Restructure and update documentation #1029

Merged


