Fix --x-initial and --x-assign random stability (#2662) #5958

toddstrader · 2025-04-25T18:21:49Z

We recently noticed that seeds were not portable between builds w/ and w/o --trace. This is because the two have different numbers of VL_RAND_RESET_*() calls. This is a proof of concept which, while not fully fleshed out, does resolve the issue for us (at least in limited testing).

We can work around this by setting --x-initial and --x-assign to 0. However, this is obviously not ideal.

The basic idea is that since VL_RAND_RESET_*() call may be optimized away (or differently) between builds, we don't let these calls advance the PRNG stream. Instead we calculate salt values during verilation (based on the AST or RTL path or something stable) and run the PRNG algorithm on that without modifying its state.

Open items:

in *rand64(), if reset is true return the next value per the algorithm after the state has been salted, but do not update the sate (instead of the simple XOR which is there now)
compute salts for various places which emit hard-coded salts currently (should be stable between verilations and w/ and w/o --trace
tests

I recognize that this isn't foolproof since one could:

int value;
always_ff @(posedge clk) value = $random();

And if value isn't used then I assume Verilator can prune it when we're running w/o --trace. However, this scenario feels a lot more obvious / straightforward to deal with whereas x randomization is very pervasive and implicit.

Anyway, I wanted to get initial thoughts on this before I went further. Please advise.

wsnyder

Yes, makes sense, similar idea to #2662, thanks for working on this.

Please add a test that checks the randomization stability, e.g. that is calls compile and execute twice, and prints two random integers, then checks that the print outputs are the same.

wsnyder · 2025-04-25T20:55:33Z

include/verilated.cpp

+uint64_t VlRNG::rand64(bool reset, uint64_t salt) VL_MT_UNSAFE {
+    if (reset) { return (m_state[0] + m_state[1]) ^ salt; }


This appears a misuse of VlRNG::rand64. If you want a new seed you should be using srandom(...) instead. I presume this is because you used VL_RANDOM, I think instead we should remove use of VL_RANDOM and make a new function.

This will also be faster as existing VL_RANDOM outside of init won't need to initialize and hit the new branch you added here.

Good point, I'll lose the reset flag and just make a different method. But do be clear, I'm not trying to set the seed here, I'm trying to piggy back on the seed that was set and get a psuedo-random value that is a function of the seed and the variable (by way of salt) without disturbing the state of VlRNG.

Ah, ok. I think we should rely on only the original seed and hash of signal hierarchy.

Another thing to add to test is inset a new signal first in the second test.

toddstrader · 2025-04-28T15:11:25Z

Re: testing, is there a way to create multiple VlTest objects in a single t_foo.py? I'm trying to do something along these lines:

for suffix, flags in [
    ("", []),
    ("_trace", ["--trace"]),
    ("_added", ["-DADD_SIGNAL"]),
    ("_trace_added", ["--trace", "-DADD_SIGNAL"]),
]:
    test.obj_dir = obj_dir + suffix

    verilator_flags2 = ["--x-initial unique"] + flags
    test.compile(verilator_flags2=verilator_flags2)

Doing this fails because the suffixed obj dirs don't exist. If I work around that I run into other problems because -Mdir is wrong. It seems like what I really want to do is create new VlTest objects with different Args.obj_suffix. But I don't see this pattern anywhere and the bootstrapping code continues to confuse me. Is there a different way to go about this?

wsnyder · 2025-04-28T16:19:48Z

There are a couple of tests that compile/execute within a single test more than once, but generally this is to be avoided as is hard to debug and won't parallelize. Instead please make 4 separate .py tests referring to the same Verilog file.

toddstrader · 2025-04-28T17:16:12Z

There are a couple of tests that compile/execute within a single test more than once, but generally this is to be avoided as is hard to debug and won't parallelize. Instead please make 4 separate .py tests referring to the same Verilog file.

Gotcha. I was hoping to avoid the golden files and just compare logs with each other. But yeah, that'll work too.

toddstrader · 2025-04-28T20:43:52Z

Still a WIP, but the first commit is a test which can be run to demonstrate the issue.

Question: when I use AstVar::prettyName() in t_x_rand_stability.py I get fully scoped names (e.g. test.uninitialized) but when I add a verilator public in t_x_rand_stability_add.py prettyName() now returns just the variable name (uninitialized). Why is this and is there something else I should be doing to get the fully qualified variable name?

toddstrader · 2025-04-28T21:08:09Z

I see. It's an inlining difference. So I guess that's the question here. Should the random reset value be based on the longest possible Verilog path for the given variable? Or should it be based on just the variable name?

If we go with the former, then inlining changes will perturb the randomization like I'm seeing here. But if we go with the latter, there's less entropy on these reset values. Thoughts?

wsnyder · 2025-04-28T21:34:19Z

Maybe a hash of the module's scope, combined with a hash of the variable's name within module (not full scope). Accept that if inlining changes that may make it inconsistent; generally it'll take a pretty large change for that to flip.

toddstrader · 2025-04-29T14:46:40Z

Sorry, doesn't module's scope have the same maybe-inlined-maybe-not problem? I don't believe AstNodeModule knows or can know this information (aside from maybe someInstanceName()). E.g. if I instantiate a non-inlined sub module twice I get:

    1: MODULE 0x5555570886c0 <e1247> {f44ai}  sub  L3 [1ps]
    1:2: VAR 0x5555570887e0 <e1255> {f45ar} @dt=0x55555708b080@(G/w8)  no_init [VSTATIC]  VAR
    1:2: SCOPE 0x5555570ba1c0 <e1642#> {f39aj}  TOP.t__DOT__the_sub_1 [abovep=0x5555570ba0e0] [cellp=0x555557088480] [modp=0x5555570886c0]
    1:2:1: VARSCOPE 0x55555708be00 <e1644#> {f45ar} @dt=0x55555708b080@(G/w8)  TOP.t__DOT__the_sub_1->no_init [scopep=0x5555570ba1c0] -> VAR 0x5555570887e0 <e1255> {f45ar} @dt=0x55555708b080@(G/w8)  no_init [VSTATIC]  VAR
    1:2: SCOPE 0x5555570ba2a0 <e1646#> {f40aj}  TOP.t__DOT__the_sub_2 [abovep=0x5555570ba0e0] [cellp=0x5555570885a0] [modp=0x5555570886c0]
    1:2:1: VARSCOPE 0x55555708bec0 <e1648#> {f45ar} @dt=0x55555708b080@(G/w8)  TOP.t__DOT__the_sub_2->no_init [scopep=0x5555570ba2a0] -> VAR 0x5555570887e0 <e1255> {f45ar} @dt=0x55555708b080@(G/w8)  no_init [VSTATIC]  VAR

But there's only one VAR and one VL_RAND_RESET_I(). So what would the module's scope be here?

wsnyder · 2025-04-29T14:56:48Z

The scope's hash needs to be computed at runtime as "Hash(vlSelf->name())" this then gets combined with the verilation-time computed "Hash(AstVar->name())" that is presumably passed to runtime through VL_RESET_WITH_HASHED_NEW_WHATEVER_I.

BTW the other reason to do at runtime is there might be multiple instantiations of the verilated module, and each should hash differently.

So generated code is something like

    modhash = hash_func(vlSelf->name())
     ...
    sig1 = VL_RESET_HASHED(0x1234, modhash, RNG);
    sig2 = VL_RESET_HASHED(0x4567, modhash, RNG);

    where 0x1234 is compiletime from Hash(AstVar->name())
    and VL_RESET_HASH combines 0x1234 and modhash value with RNG

toddstrader · 2025-04-29T16:52:45Z

Ah, OK. Thanks, I was missing the runtime part. I believe I'm with you now.

toddstrader · 2025-05-02T15:45:51Z

Further along now but am seeking some advice on this:

Vt_class_packed_sub__03a__03aWrReqQ__Vclpkg__DepSet_hdfbb1e03__0.cpp: In member function âvoid Vt_class_packed_sub__03a__03aWrReqQ::_ctor_var_reset(Vt_class_packed__Syms*)â:
Vt_class_packed_sub__03a__03aWrReqQ__Vclpkg__DepSet_hdfbb1e03__0.cpp:19:54: error: âvlSelfâ was not declared in this scope
   19 |     uint64_t __VscopeHash = std::hash<std::string>{}(vlSelf->name());
      |                                                      ^~~~~~

Basically, VlClass doesn't know the scope in which an object of said class is constructed. I think I can have VlClassRef pass along a VerilatedModule* its m_objp but am not sure if there are any gotchas there. E.g. is it possible to construct a VlClass object without a VlClassRef?

Anyway, I'll try this next but any guidance is appreciated.

toddstrader · 2025-05-02T18:33:04Z

Actually, this is short sighted ^. If I am able to get the scope name into VlClass for hashing, that only gets us up to the scope where the new() happens. And since the new() is an expression and not a var, there's no name to hash. So:

Cls obj1 = new();
Cls obj2 = new();

would get the same x randomization.

I'm not immediately sure what to do about this but will continue poking at it.

wsnyder · 2025-05-03T01:11:01Z

Probably the "right" model is for it to be seeded similar to how IEEE specifies the RNG key be passed as classes are created (see 18.14).

However that might be a pain, it also requires a RNG in each class object which slows things down.

An alternative is to use the class package's name() this is static for the class (what $display %m uses). The downside is it will result in the same random setting for every class created.

Another complication is we'd like thread stability, new() could be called from different threads and want same randomization.

Perhaps then it's easiest to use the static class name and accept the same randomization for every object of a class. People that care about this should really be using either different seeds, or "rand" tags. Also the rand-reset stuff is really more for design randomization, classes are usually for the verif world where that design randomization doesn't apply and again people should be using "rand" and constraints for such classes.

toddstrader · 2025-05-05T17:56:57Z

I made the class --x-initial change as suggested above. Some other notes:

I didn't touch vlReadMem::setData() which also uses VL_RAND_RESET_I(). This one feels more self-inflicted because you have to provide a memory file with x's or z's for it to matter. I guess we could have this take in the scope and compute a stable random value based off of that and maybe the character offset in the file? But I'm inclined not to mess with this one for now.

I removed m_reset from AstRand and instead have EmitCFunc::emitVarResetRecurse() detect whether the variable is XTEMP or not. I think it's fine to do the --x-assign randomization during reset because there shouldn't really be a notion of --x-initial for verilator-constructed variables. In fact, we were doing both x initial and x assign randomization for these variables, so this gets rid of one of them.

Beyond that, everything is passing for me locally and I've run out of immediate things to do here. So I believe this is in a review-able state.

There are a couple concepts in here at this point. Please let me know if it's desirable to chop any of this up.

toddstrader · 2025-05-05T17:59:19Z

test_regress/t/t_x_assign.cpp

    // Evaluate one clock posedge
    top->clk = 0;
    top->eval();
    top->clk = 1;
    top->eval();

-#if defined(T_X_ASSIGN_UNIQUE_0) || defined(T_X_ASSIGN_UNIQUE_1)


I don't understand why this test was previously expecting a "random" value during randReset() of 0 or 1.

Not sure perhaps mis-copied from another test.

test_regress/t/t_x_rand_mt_stability.py

toddstrader · 2025-05-05T18:02:30Z

src/V3Options.cpp

@@ -2096,6 +2096,7 @@ V3Options::V3Options() {
    m_makeDir = "obj_dir";
    m_unusedRegexp = "*unused*";
    m_xAssign = "fast";
+    m_xInitial = "unique";


Just noticed that this parameter did not have the advertised default value.

Good catch. Looks like the code using it though treated "" as "unique".

test_regress/t/t_debug_emitv.py

wsnyder

Thanks, getting close.

I think the tests for stability need something to make sure that the outputs are compatible. My concern is we'll change something and update the golden files and not realize we've lost the similarity between them.

src/V3EmitCImp.cpp

include/verilated_funcs.h

wsnyder · 2025-05-05T22:29:02Z

include/verilated.cpp

+    if (Verilated::threadContextp()->randReset() == 0) return 0;
+    if (Verilated::threadContextp()->randReset() == 1) return ~0;
+    state[1] ^= state[0];
+    state[0] = (((state[0] << 55) | (state[0] >> 9)) ^ state[1] ^ (state[1] << 14));
+    state[1] = (state[1] << 36) | (state[1] >> 28);
+    return state[0] + state[1];


Can you use the VlRNG class instead?

Yeah. I switched to using VlRNG::rand64() which then makes VL_SCOPED_RAND_RESET_*() VL_MT_UNSAFE. I'm unsure of the implications here. Is this the right way to go or do I need to make something akin to VlRNG::vl_thread_rng() but for the global seed and which doesn't disturb the seed epoch?

wsnyder · 2025-05-05T23:25:03Z

src/V3EmitCFunc.cpp

+                out += ", " + varNameProtected + suffix;
+                if (!zeroit) {
+                    emitVarResetScopeHash();
+                    uint64_t salt = std::hash<std::string>{}(varp->prettyName());


Suggested change

uint64_t salt = std::hash<std::string>{}(varp->prettyName());

const uint64_t salt = std::hash<std::string>{}(varp->prettyName());

C++ "Hash functions are only required to produce the same result for the same input within a single execution of a program". So use VString functions instead, making a new one if needed.

That's a bummer.

I dropped a hash function in V3String and also the runtime library. Is there something better to do here? I don't see any hash functions already associated with VString. There is a SHA256 in there, but it speaks about 64B blocks and I don't think that's what we want for this application. There's also V3Hash but that is intrinsically 32b. Also, is there a reasonable way to share code between the compiler and the runtime? I'm not seeing any examples.

src/V3EmitCFunc.cpp

toddstrader · 2025-05-06T14:07:50Z

Thanks, getting close.

I think the tests for stability need something to make sure that the outputs are compatible. My concern is we'll change something and update the golden files and not realize we've lost the similarity between them.

Yeah, see t_x_rand{,_mt}_stability.py. It already compares the golden files between the different flavors of this test (modulo the _zero one which is clearly not going to be the same).

One thing to note is that the single and multi-threaded versions do not compare. The x randomization values do, but the $random() values do not. I'm guessing this is known / expected. In any case, I'm then not comparing the results between the single and multi-threaded tests (even though I could do so by stripping out the $random() values) because the entire point of this exercise is to preserve PRNG behavior and that's apparently already not going to happen between these two sets of tests.

toddstrader · 2025-05-16T12:41:08Z

@wsnyder just circling back on this one. Anything else to do here?

wsnyder

Thanks, missed this was ready. A few minor things then can merge.

include/verilated.cpp

src/V3String.cpp

….' See (#6018). Reverts 4581023 plus line in Changes file

…lator#2662) (verilator#5958).' See (verilator#6018)." This reverts commit 640339e.

… (#6025)

toddstrader mentioned this pull request Apr 25, 2025

PRNG stability w/ and w/o --trace #5959

Closed

wsnyder changed the title ~~POC: VL_RAND_RESET PRNG stability~~ Fix --x-initial and --x-assign random stability (#2662) Apr 25, 2025

wsnyder reviewed Apr 25, 2025

View reviewed changes

toddstrader force-pushed the rand_reset_stability branch from 2609ac3 to 3acdb15 Compare April 28, 2025 20:40

toddstrader marked this pull request as ready for review May 5, 2025 17:46

toddstrader commented May 5, 2025

View reviewed changes

test_regress/t/t_x_rand_mt_stability.py Show resolved Hide resolved

toddstrader commented May 5, 2025

View reviewed changes

toddstrader force-pushed the rand_reset_stability branch from 1db7971 to b40500a Compare May 5, 2025 19:38

toddstrader commented May 5, 2025

View reviewed changes

test_regress/t/t_debug_emitv.py Show resolved Hide resolved

wsnyder reviewed May 5, 2025

View reviewed changes

wsnyder reviewed May 16, 2025

View reviewed changes

include/verilated.cpp Outdated Show resolved Hide resolved

src/V3String.cpp Outdated Show resolved Hide resolved

src/V3String.cpp Outdated Show resolved Hide resolved

toddstrader added 3 commits May 16, 2025 11:15

x rand test

095eefc

POC: VL_RAND_RESET PRNG stability

c2549d1

runtime scope hashing -- WIP

f0005ca

toddstrader added 18 commits May 16, 2025 11:15

split rand reset into scoped and not

63e2e68

format

8f740ca

Handle classes

228a7c2

vltmt

76806e4

VL_MT_SAFE

30434fb

cleanup

fba7ab5

remove xtemp initial assign

d53da10

AstRand::m_reset rip out

31a388a

fix rand initial

52ba867

use --x-assign for XTEMP reset

5ad37ec

this / vlSelf

02925b3

fix t_debug_emitv

c2a5052

py fix

df85809

CR

cfb74a4

golden

8d691d5

unsafe

4680ade

fallthrough

2189af1

CR

8efb24b

toddstrader force-pushed the rand_reset_stability branch from 2cbc7d8 to 8efb24b Compare May 16, 2025 15:18

wsnyder merged commit 4581023 into verilator:master May 16, 2025
36 checks passed

toddstrader deleted the rand_reset_stability branch May 16, 2025 18:07

wsnyder mentioned this pull request May 17, 2025

Fix scope randomized failure #6018

Closed

wsnyder added a commit that referenced this pull request May 18, 2025

Revert 'Fix --x-initial and --x-assign random stability (#2662) (#5958)…

640339e

….' See (#6018). Reverts 4581023 plus line in Changes file

toddstrader added a commit to toddstrader/verilator that referenced this pull request May 19, 2025

Revert "Revert 'Fix --x-initial and --x-assign random stability (veri…

e581d71

…lator#2662) (verilator#5958).' See (verilator#6018)." This reverts commit 640339e.

toddstrader added a commit to toddstrader/verilator that referenced this pull request May 21, 2025

Revert "Revert 'Fix --x-initial and --x-assign random stability (veri…

8e76740

…lator#2662) (verilator#5958).' See (verilator#6018)." This reverts commit 640339e.

toddstrader added a commit to toddstrader/verilator that referenced this pull request May 27, 2025

Revert "Revert 'Fix --x-initial and --x-assign random stability (veri…

40e7d8d

…lator#2662) (verilator#5958).' See (verilator#6018)." This reverts commit 640339e.

wsnyder pushed a commit that referenced this pull request May 27, 2025

Fix --x-initial and --x-assign random stability (#2662) (#5958) (#6018)…

4b041c6

… (#6025)

		uint64_t VlRNG::rand64(bool reset, uint64_t salt) VL_MT_UNSAFE {
		if (reset) { return (m_state[0] + m_state[1]) ^ salt; }

	uint64_t salt = std::hash<std::string>{}(varp->prettyName());
	const uint64_t salt = std::hash<std::string>{}(varp->prettyName());

Fix --x-initial and --x-assign random stability (#2662) #5958

Fix --x-initial and --x-assign random stability (#2662) #5958

Uh oh!

Conversation

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!