Benchmarks #46

BulaBula-zy · 2025-03-11T02:55:43Z

No description provided.

Summary: RFC https://discourse.llvm.org/t/rfc-python-callback-for-source-file-resolution/83545 SBModule will be used as a Python function argument for the resolve source file callback. This diff allows an SBModule to be instantiated from SBPlatform and to be passed to/from Python. Test Plan: N/A. The next set of diffs in the stack have unittests and shell test validation. Co-authored-by: Rahul Reddy Chamala <rachamal@fb.com>

Our platform has some constraints that allow us to make assumptions that aren't generally applicable to other platforms. We keep an entirely separate .s file for the routines.

…UFPS instruction (llvm#121778) Avoid always assuming the worst for v4f32 2 input shuffles, and match the SHUFPS pattern where possible - each pair of output elements must come from the same source register.

…21680) clone, getNumOperands, and getOperand haven't been used for quite some time. The only remaining useful thing is the common implementation of getBit.

Refactors `analysisStates` to use two nested maps. This prevents `eraseState` from having to scan through every analysis state, which can be costly when there are many analysis states and/or `eraseState` is called frequently. Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
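A minimal sketch of the nested-map idea described above, not the actual MLIR code. The names `StateStore`, `Anchor`, and `StateKind` are hypothetical stand-ins, and plain `int` payloads replace real analysis-state objects. The point it illustrates is that keying the outer map by anchor turns an erase into one lookup instead of a scan over every stored state.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins: an anchor (e.g. a program point) and a state kind
// (e.g. an analysis type ID). The real framework uses richer types.
using Anchor = int;
using StateKind = std::string;

class StateStore {
public:
  // Outer map is keyed by anchor, inner map by state kind.
  void addState(Anchor anchor, const StateKind &kind, int value) {
    states[anchor][kind] = value;
  }

  // Erases every state attached to one anchor with a single outer-map
  // lookup; states of unrelated anchors are never visited.
  // Returns how many states were removed.
  std::size_t eraseState(Anchor anchor) {
    auto it = states.find(anchor);
    if (it == states.end())
      return 0;
    std::size_t erased = it->second.size();
    states.erase(it);
    return erased;
  }

  std::size_t numAnchors() const { return states.size(); }

private:
  std::unordered_map<Anchor, std::unordered_map<StateKind, int>> states;
};
```

With a single flat map keyed by an (anchor, kind) pair, the same erase would have to inspect every entry to find the ones belonging to the anchor.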

This allows us to use them in C++ code without needing to do a table lookup.

… bools. This is essentially a port of TargetLowering::scalarizeVectorStore(), which is used for the case where we have something like a store of <8 x s8> truncating to <8 x s1> in memory. The naive lowering is a sequence of extracts to compute a scalar value to store. AArch64's DAG implementation has some more smarts to improve this further which we can do later. Reviewers: topperc, davemgreen Pull Request: llvm#121169
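As a rough illustration of the naive lowering described above (a sequence of extracts that builds one scalar to store), the sketch below packs eight boolean lanes into a byte. `packBoolLanes` is a hypothetical helper, not an LLVM function, and placing lane 0 in bit 0 is an assumption about lane order.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: pack eight boolean lanes into one byte, roughly what
// a scalarized store of <8 x s8> truncated to <8 x s1> has to compute.
// Assumption: lane 0 lands in bit 0 (little-endian lane order).
std::uint8_t packBoolLanes(const std::array<bool, 8> &lanes) {
  std::uint8_t packed = 0;
  for (unsigned i = 0; i < lanes.size(); ++i) {
    // "Extract" lane i, truncate it to one bit, shift it into position,
    // and OR it into the scalar that will be stored.
    unsigned bit = lanes[i] ? 1u : 0u;
    packed = static_cast<std::uint8_t>(packed | (bit << i));
  }
  return packed;
}
```

Each loop iteration corresponds to one extract/shift/or group in the unoptimized sequence; a smarter lowering could compute the same byte with fewer operations.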

This case is different from the earlier <8 x i1> case handled because it triggers a legalization failure in lowerStore() that's intended for scalar code. It also was triggering incorrect bitcast actions in the AArch64 rules that weren't expecting truncating stores. With these two fixed, more cases are handled. The code is still bad, including some missing load promotion in our combiners that result in dead stores hanging around at the end of codegen. Again, we can fix these in separate changes. Reviewers: davemgreen, madhur13490, topperc, arsenm Reviewed By: davemgreen Pull Request: llvm#121185

- adding Flatten and Branch to if stmt. - adding dxil control flow hint metadata generation - modifying spirv OpSelectMerge to account for the specific attributes. Closes llvm#70112 --------- Co-authored-by: Joao Saffran <jderezende@microsoft.com> Co-authored-by: joaosaffran <joao.saffran@microsoft.com>

The error returned from the driver is actually "Could not scan", not "Could not parse". The reason that the test has been passing is that the FileCheck's regular expression "{{.*}}" was one of many sources of problems, and was quoted in the output. The "CHECK" line matched the quoted line instead of the actual error message.

Don't bother with separate getShiftAmountTy/getConstant calls.

…eType calls. NFC.

…119592) This PR resolves llvm#118845. I aimed to mirror the implementation of `m_Shuffle()` in [PatternMatch.h](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/IR/PatternMatch.h).

Updated [SDPatternMatch.h](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/CodeGen/SDPatternMatch.h)
- Added `struct m_Mask` to match masks (`ArrayRef<int>`)
- Added two `m_Shuffle` functions. One to match independently of mask, and one to match considering mask.
- Added `struct SDShuffle_match` to match `ISD::VECTOR_SHUFFLE` considering mask

Updated [SDPatternMatchTest.cpp](https://github.com/llvm/llvm-project/blob/main/llvm/unittests/CodeGen/SelectionDAGPatternMatchTest.cpp)
- Added `matchVecShuffle` test, which tests the behavior of both `m_Shuffle()` functions

- - -

I am not sure if my test coverage is complete. I am not sure how to test a `false` match: simply test against a different instruction? [Other tests](https://github.com/llvm/llvm-project/blob/main/llvm/unittests/CodeGen/SelectionDAGPatternMatchTest.cpp#L175), such as for `VSelect`, test against `Select`. I am not sure if there is an analogous instruction to compare against for `VECTOR_SHUFFLE`. I would appreciate some pointers in this area. In general, please liberally critique this PR!

---------

Co-authored-by: Aidan <aidan.goldfarb@mail.mcgill.ca>

This PR fixes the debug info generation for RWBuffer types. - This implements the [same fix as DXC](microsoft/DirectXShaderCompiler#6296). - Adds the HLSLAttributedResource debug info generation Closes llvm#118523 --------- Co-authored-by: Joao Saffran <jderezende@microsoft.com> Co-authored-by: joaosaffran <joao.saffran@microsoft.com>

The 'set' construct is another fairly simple one: it doesn't have an associated statement and allows only a handful of clauses. This patch implements it and all the rules for it, allowing 3 of its four clauses. The only exception is default_async, which will be implemented in a future patch, because it isn't just being enabled; it needs a completely new implementation.

A fairly simple one, only valid on the 'set' construct, this clause takes an int expression. Most of the work was already done as a part of parsing, so this patch ends up being a lot of infrastructure.

…m#121661) SearchableTable is the legacy version that does not appear to be well documented. Not sure if the plan was to delete it eventually. We can eventually use the PrimaryKey feature of GenericTable to remove one of the SearchIndex declarations. This will sort the generated table by the primary key and remove the separately generated indexing table to reduce .rodata size. This patch is just the mechanical migration. The size savings will be done in follow ups.

…vm#121630) Extract all compiler options used to build "release" versions of libc API functions into a separate helper function, instead of burying this logic inside libc_function() macro.

With this change, we further split two "flavors" of cc_library() produced for each libc public function:

* `<function>.__internal__` library used in unit tests is *not* built with release copts and is thus indistinguishable from regular libc_support_library(). Arguably, it's a good thing, because all sources in a unit test are built with the same set of compiler flags, instead of "franken-build" when a subset of sources is always built with -O3. If a user needs to run the tests in optimized mode, they should really be using Bazel invocation-level compile flags instead.
* `<function>` library that libc users can use to construct their own static archive *is* built with the same release copts as before. There is a pre-existing problem that its libc_support_library() dependencies are not built with the same copts. We're not addressing it here now.

) Summary: This is a holdover from when these targets were merged. They're basically the same but there's no reason they should be treated as identical. I think we will live with a little duplication.

Previously, registers and subregisters mapped to the same Dwarf encoding. We don't really have any way to refer to subregisters directly from Dwarf, the expression emitter should instead use DW_OPs to stencil out the subregister from the whole register. This was also confusing tools that need to map back to the llvm reg (e.g. dwarfdump), since getLLVMRegNum() would arbitrarily return the _LO16 register.

…vm#120991) Adds the ability to lookup and display all merged functions for an address in llvm-gsymutil. Now, when `--merged-functions` is used in combination with `--address/--addresses-from-stdin`, lookup results will contain information about merged functions, if available. To support printing merged function information when using the `--verbose` option, the `LookupResult` data structure also had to be extended with pointers to the raw function data and raw merged function data. This is because merged functions share the same address range, so it's not easy to look up the raw merged function data for a particular `LookupResult` that is based on a merged function.

Support true16 format for v_fma_f16 in MC. Since we are replacing v_fma_f16 with v_fma_f16_t16/v_fma_f16_fake16 post-GFX11, we have to update the CodeGen pattern for v_fma_f16_fake16 to get the CodeGen tests passing. No pattern is modified or created; v_fma_f16 is just replaced with the fake16 format.

The signature of `CheckTemplateArgValues` implements error handling via the `bool` return type, yet it always returned false. The single possible error case instead used `PrintFatalError`, which exits the program afterward. This behavior is undesirable: it prevents any further errors from being printed and makes TableGen less usable as a library, as it crashes the entire process (e.g. `tblgen-lsp-server`). This PR therefore fixes the issue by using `Error` instead and returning true if an error occurred. All callers already perform proper error handling. As `llvm-tblgen` exits on error, a test was also added to the LSP to ensure it exits normally despite the error.

Support true16 format for v_cvt_u32_u16 in MC

The file suffix .f95 remained after 7a07d8e, change it to .ll.

Since that benchmark is testing n*n inputs, the batch size reported to GoogleBenchmark should be that amount. Otherwise, GoogleBenchmark reports the timing for calling std::gcd on the whole sequence, which is misleading.

This optimizes `std::filesystem::copy_file` to use the `copy_file_range` syscall (Linux and FreeBSD) when available. It allows for reflinks on filesystems such as btrfs, zfs and xfs, and server-side copy for network filesystems such as NFS.

As feedback on llvm#119052, it was recommended I add a new bit to delineate internal and external progress events. This patch adds this new category, and sets up Progress.h to support external events via SBProgress.

…m#120858) In our implementation, failing these checks would result in a null pointer access rather than an out-of-bounds access.

This reverts commit 81fc3ad. This breaks some LLDB tests, e.g. SymbolFile/DWARF/x86/no_unique_address-with-bitfields.cpp: lldb: ../llvm-project/clang/lib/AST/Decl.cpp:4604: unsigned int clang::FieldDecl::getBitWidthValue() const: Assertion `isa<ConstantExpr>(getBitWidth())' failed.

The dependency file and the P1689 file are text files, but the open call misses the OF_Text flag. This PR adds the flag. Fixes regressions in test cases ClangScanDeps/modules-extern-unrelated.m and ClangScanDeps/P1689.cppm.

Need to check if the GEP bases are equal and return false early. Also, need to return false if the lookup is too deep, considering bases equal too. Fixes a crash in the assertion.

…e DAP object lifecycle. (llvm#120457)" This reverts commit 0d9cf26. Breaks the lldb-aarch64-windows buildbot.

This is in order to prepare for future MR where we will extend `ReachingDefAnalysis` to stack slots.

…"default<O3>" to allow DOS to correctly evaluate the RUN command Necessary for running update_test_checks.py on windows

The version string can be anything, don't restrict it to digits and dots. It's derived from the resource dir, so just check for that.

I think this is a false positive for a non-capturing lambda, but I can't find anything in the standard that guarantees that these have eternal lifetime.

…m#121366) This PR adds VALID_ELEMENT_ACCESS and VALID_INPUT_RANGE checks for vector<bool>.

Add baseline test for 64-bit adds when the low half of an operand is known 0.

If one of the inputs has all 0 bits, the low part cannot carry and we can just pass through the original value. Add case: https://alive2.llvm.org/ce/z/TNc7hf Sub case: https://alive2.llvm.org/ce/z/AjH2-J We could do this in the general case with computeKnownBits, but add is so common this could be potentially expensive for something which will fire infrequently. One potential concern is this could break the 64-bit add we expect to see for addressing mode matching, but these constants shouldn't appear often in addressing expressions. One test for large offset expressions changes but isn't worse. Fixes ROCm#237
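The reasoning above can be checked with a small sketch. It assumes the known-zero bits are the low 32 bits of one operand; `add64LowHalfZero` is a hypothetical name, not something from the patch. Under that assumption the low half of the result is the other operand's low half unchanged, and only a 32-bit add of the high halves is needed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes a + b under the assumption that the low
// 32 bits of b are zero. The low half of the sum cannot produce a carry,
// so it is passed through from a, and only the high halves are added.
std::uint64_t add64LowHalfZero(std::uint64_t a, std::uint64_t b) {
  assert((b & 0xFFFFFFFFull) == 0 && "low half of b must be zero");
  std::uint32_t lo = static_cast<std::uint32_t>(a);       // passed through
  std::uint32_t hi = static_cast<std::uint32_t>(a >> 32) +
                     static_cast<std::uint32_t>(b >> 32); // wraps mod 2^32
  return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```

The result matches a full 64-bit `a + b`, including when the high-half add wraps around.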

The argument is always zero now.

Use the spv version of the resource.getpointer intrinsic when targeting SPIR-V.

Detect cases where ABI attributes between the call-site and the called function differ. For now this only handles argument attributes. Inspired by https://discourse.llvm.org/t/difference-between-call-site-attributes-and-declaration-attributes/83902.

…ents (llvm#122073) Some functions of the deprecated 1:N dialect conversion were marked as `LLVM_DEPRECATED`. This caused compilation warnings because there are still test cases of the 1:N dialect conversion framework. (These test cases will be deleted at the same time when the 1:N driver is deleted.)

This significantly reduces the amount of debug information generated for codebases using libc++, without hurting the debugging experience.

While VALUES is not actually used by LLVM_MAKE_OPT_ID_WITH_ID_PREFIX threading the correct value through is clearer and avoids the potential for strange bugs if this ever changes.

…20995) LLD (and other Mach-O linkers), when preparing an encryptable binary, make space to leave all the load commands in a non-encrypted page (see [1]). When using objcopy on a small encryptable binary, the code was not respecting this fact, and the encryptable segments were not kept beyond the first page. This was obvious for small or empty binaries. The changes introduced here keep track of whether an `LC_ENCRYPTION_INFO` or `LC_ENCRYPTION_INFO_64` has been seen and, in that case, add a full page of offset in order to leave the load commands in their own page (similar to what LLD is doing). [1]: https://github.com/llvm/llvm-project/blob/d8e792931226b15d9d2424ecd24ccfe13adc2367/lld/MachO/SyntheticSections.cpp#L90-L93

…edICmps (llvm#121970)

…lvm#107983)"" (llvm#122022) Reverts llvm#121352

Triggers "vector type should not be a bool!" on:

```
bool a[100];
bool b[100];
auto t = std::mismatch(std::begin(a), std::end(a), std::begin(b), std::end(b));
```

https://godbolt.org/z/Y73s3sdef

…ubstitution"" (llvm#122130) Unfortunately that breaks some code on Windows when lambdas come into play, as reported in llvm#102857 (comment) This reverts commit 96eced6.

…xes for SysRegs. NFC (llvm#122001) Use PrimaryKeyReturnRange to get all of the registers with the same encoding. This allows AltName to be removed.

lookupExactFPImmByRepr is never called. The Name field in the table is unused. The Name is only used by the GenericEnum.

* remove code in memory access handler
* remove vector clock operations, i.e. release and acquire operations

* to measure the overhead of the function calls/instrumentation of memory accesses

* to measure the overhead of the interceptors

# Conflicts:
# compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
# compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp
# compiler-rt/lib/tsan/rtl/tsan_new_delete.cpp
# llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp

rchamala and others added 30 commits January 6, 2025 09:17

[AArch64][SME] Add Darwin specific SME ABI routines.

cb5d866

Our platform has some constraints that allow us to make assumptions that aren't generally applicable to other platforms. We keep an entirely separate .s file for the routines.

[CostModel][X86] Attempt to match cheap v4f32 shuffles that map to SH…

db88071

…UFPS instruction (llvm#121778) Avoid always assuming the worst for v4f32 2 input shuffles, and match the SHUFPS pattern where possible - each pair of output elements must come from the same source register.
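The "each pair of output elements must come from the same source register" condition can be sketched as a mask check. This is an illustration only, not the X86 cost model code. The helper names are hypothetical, and the mask convention (indices 0..3 read the first input, 4..7 the second, negative means undefined) is an assumption borrowed from the usual two-input shuffle mask layout.

```cpp
#include <array>
#include <cassert>

// Returns which input (0 or 1) a mask element reads from, or -1 if the
// element is undefined. Assumes 0..3 = first input, 4..7 = second input.
int sourceOf(int maskElt) {
  if (maskElt < 0)
    return -1;
  return maskElt < 4 ? 0 : 1;
}

// A pair is acceptable when its defined elements all read from one input.
bool pairFromOneSource(int a, int b) {
  int sa = sourceOf(a);
  int sb = sourceOf(b);
  return sa < 0 || sb < 0 || sa == sb;
}

// Hypothetical check: a v4f32 two-input mask is SHUFPS-like when the low
// pair of outputs and the high pair of outputs each use a single input.
bool isShufpsLikeMask(const std::array<int, 4> &mask) {
  return pairFromOneSource(mask[0], mask[1]) &&
         pairFromOneSource(mask[2], mask[3]);
}
```

An interleaving mask such as {0, 4, 1, 5} fails the check because each output pair mixes both inputs, while {0, 1, 4, 5} passes.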

[TableGen] Remove unused functionality from OpInit class. NFC (llvm#1…

d40235a

…21680) clone, getNumOperands, and getOperand haven't been used for quite some time. The only remaining useful thing is the common implementation of getBit.

[RISCV] Add Enum for CSR encodings. (llvm#121674)

1401703

This allows us to use them in C++ code without needing to do a table lookup.

[DAG] expandUINT_TO_FP - use getShiftAmountConstant helper. NFC.

112793a

Don't bother with separate getShiftAmountTy/getConstant calls.

[DAG] VectorLegalizer::ExpandUINT_TO_FLOAT- pull out repeated getValu…

9236751

…eType calls. NFC.

[OpenACC] Implement 'default_async' sema

ff24e9a

A fairly simple one, only valid on the 'set' construct, this clause takes an int expression. Most of the work was already done as a part of parsing, so this patch ends up being a lot of infrastructure.

[libc] Split AMDGPU and NVPTX configs into separate folders (llvm#120153

f4bab06

) Summary: This is a holdover from when these targets were merged. They're basically the same but there's no reason they should be treated as identical. I think we will live with a little duplication.

[gn] port 21edac2 (BuiltinsSPIRV)

40a00af

[AMDGPU][True16][MC] true16 for v_cvt_u32_u16 (llvm#120646)

4af3332

Support true16 format for v_cvt_u32_u16 in MC

[flang][test] One more fix in flang/test/Driver/parse-error.ll

6e6f89c

The file suffix .f95 remained after 7a07d8e, change it to .ll.

[LLDB] Add external progress bit category (llvm#120171)

774c226

As feedback on llvm#119052, it was recommended I add a new bit to delineate internal and external progress events. This patch adds this new category, and sets up Progress.h to support external events via SBProgress.

[libc++][hardening] Add checks to forward_list element access. (llv…

bda7c9a

…m#120858) In our implementation, failing these checks would result in a null pointer access rather than an out-of-bounds access.

tbaederr and others added 29 commits January 8, 2025 15:09

[MLIR][GPU] Fix gpu.printf test syntax after f50f969

0d7022e

[SLP]Fix a crash for very long GEP chains

1160994

Need to check if the GEP bases are equal and return false early. Also, need to return false if the lookup is too deep, considering bases equal too. Fixes a crash in the assertion.

Revert "[lldb-dap] Ensure the IO forwarding threads are managed by th…

81898ac

…e DAP object lifecycle. (llvm#120457)" This reverts commit 0d9cf26. Breaks the lldb-aarch64-windows buildbot.

Revert llvm#116331 & llvm#121852 (llvm#122105)

b66f6b2

[ReachingDefAnalysis][NFC] Rename PhysReg to Reg. (llvm#122112)

f37bee1

This is in order to prepare for future MR where we will extend `ReachingDefAnalysis` to stack slots.

[PhaseOrdering][AArch64] block_scaling_decompr_8bit.ll - use -passes=…

322ff42

…"default<O3>" to allow DOS to correctly evaluate the RUN command Necessary for running update_test_checks.py on windows

Make test more lenient for custom clang version strings

fe162be

The version string can be anything, don't restrict it to digits and dots. It's derived from the resource dir, so just check for that.

Fix -Wdangling-assignment-gsl in ClangdLSPServerTests

a3b4d91

I think this is a false positive for a non-capturing lambda, but I can't find anything in the standard that guarantees that these have eternal lifetime.

[libc++] Add missing hardening checks and tests for vector<bool> (llv…

b054289

…m#121366) This PR adds VALID_ELEMENT_ACCESS and VALID_INPUT_RANGE checks for vector<bool>.

AMDGPU: Add baseline test for add64 with constant test (llvm#122048)

6376418

Add baseline test for 64-bit adds when the low half of an operand is known 0.

[Loads] Drop dead Offset argument (NFC)

a5c3cbf

The argument is always zero now.

[HLSL] Add SPIR-V version of getPointer. (llvm#121963)

92e575d

Use the spv version of the resource.getpointer intrinsic when targeting SPIR-V.

[libc++] Put _LIBCPP_NODEBUG on all internal aliases (llvm#118710)

f695852

This significantly reduces the amount of debug information generated for codebases using libc++, without hurting the debugging experience.

[OptTable] Fix typo VALUE => VALUES (NFCI) (llvm#121523)

e540546

While VALUES is not actually used by LLVM_MAKE_OPT_ID_WITH_ID_PREFIX threading the correct value through is clearer and avoids the potential for strange bugs if this ever changes.

[InstCombine] move foldAndOrOfICmpsOfAndWithPow2 into foldLogOpOfMask…

d4182f1

…edICmps (llvm#121970)

Revert "[Clang] Implement CWG2369 "Ordering between constraints and s…

3972ed5

…ubstitution"" (llvm#122130) Unfortunately that breaks some code on Windows when lambdas come into play, as reported in llvm#102857 (comment) This reverts commit 96eced6.

[AArch64] Use GenericTable PrimaryKey to remove one of the SearchInde…

b05be2a

…xes for SysRegs. NFC (llvm#122001) Use PrimaryKeyReturnRange to get all of the registers with the same encoding. This allows AltName to be removed.

[AArch64] Simplify ExactFPImm GenericTable. NFC (llvm#121827)

29ed600

lookupExactFPImmByRepr is never called. The Name field in the table is unused. The Name is only used by the GenericEnum.

[EXPER] remove code to measure the overhead of FastTrack

949b563

* remove code in memory access handler
* remove vector clock operations, i.e. release and acquire operations

[EXPER] remove the instrumentation that calls the access handlers

b1ec8a2

* to measure the overhead of the function calls/instrumentation of memory accesses

[EXPER] remove calls to internal function

0a43028

* to measure the overhead of the interceptors

Merge branch 'monitor-clean' into benchmarks-merge

3d2dbe5

# Conflicts:
# compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
# compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp
# compiler-rt/lib/tsan/rtl/tsan_new_delete.cpp
# llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp

BulaBula-zy merged commit 412ba15 into monitor-clean Mar 11, 2025
50 checks passed
