Benchmarks by BulaBula-zy · Pull Request #46 · focs-lab/llvm-project · GitHub

Benchmarks #46

Merged
merged 10,000 commits into from
Mar 11, 2025

Conversation

BulaBula-zy
Collaborator

No description provided.

rchamala and others added 30 commits January 6, 2025 09:17
Summary:
RFC
https://discourse.llvm.org/t/rfc-python-callback-for-source-file-resolution/83545

SBModule will be passed as an argument to the Python callback that
resolves source files. This diff makes that possible by allowing an
SBModule to be:

- instantiated from SBPlatform
- passed to and from Python

Test Plan:
N/A. The next set of diffs in the stack have unittests and shell test
validation

Co-authored-by: Rahul Reddy Chamala <rachamal@fb.com>
Our platform has some constraints that allow us to make assumptions that
aren't generally applicable to other platforms. We keep an entirely separate
.s file for the routines.
…UFPS instruction (llvm#121778)

Avoid always assuming the worst for v4f32 2 input shuffles, and match the SHUFPS pattern where possible - each pair of output elements must come from the same source register.
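
The pairing condition described above can be sketched as a standalone check. This is only an illustration, not the code from the patch: the helper names are hypothetical, and the mask convention (0-3 select from the first input, 4-7 from the second, -1 marks an undefined lane) is an assumption made for the example. The real lowering likely applies further conditions.

```cpp
#include <array>

// Hypothetical helper, not LLVM's code. Returns 0 or 1 for the source
// register shared by both lanes, -1 if both lanes are undefined, and -2 if
// the two lanes read from different sources.
inline int sourceOfPair(int a, int b) {
  int srcA = a < 0 ? -1 : a / 4;
  int srcB = b < 0 ? -1 : b / 4;
  if (srcA == -1)
    return srcB;
  if (srcB == -1 || srcA == srcB)
    return srcA;
  return -2;
}

// A 4-lane, 2-input mask is a candidate for the SHUFPS pattern only if lanes
// 0-1 agree on a source and lanes 2-3 agree on a source.
inline bool pairsShareSource(const std::array<int, 4> &mask) {
  return sourceOfPair(mask[0], mask[1]) != -2 &&
         sourceOfPair(mask[2], mask[3]) != -2;
}
```

For example, the mask `{0, 1, 4, 5}` passes the check, while `{0, 4, 1, 5}` does not because each pair mixes both inputs.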
…21680)

clone, getNumOperands, and getOperand haven't been used for quite some
time. The only remaining useful thing is the common implementation of
getBit.
Refactors `analysisStates` to use two nested maps. This prevents
`eraseState` from having to scan through every analysis state which can
be costly when there are many analysis states and/or `eraseState` is
called frequently.

Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
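
A minimal sketch of the nested-map idea follows. It is not the MLIR implementation: `Anchor`, `TypeID`, `State`, and the class itself are hypothetical stand-ins, and keying the outer map by anchor is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the real key and state types.
using Anchor = int;
using TypeID = std::string;
struct State {
  int value = 0;
};

class AnalysisStates {
public:
  // Finds or creates the state for (anchor, type).
  State &getOrCreate(Anchor anchor, const TypeID &type) {
    return states[anchor][type];
  }

  // Erasing every state attached to an anchor is a single outer-map erase
  // instead of a scan over all (anchor, type) pairs.
  void eraseState(Anchor anchor) { states.erase(anchor); }

  bool contains(Anchor anchor, const TypeID &type) const {
    auto it = states.find(anchor);
    return it != states.end() && it->second.count(type) != 0;
  }

  std::size_t numAnchors() const { return states.size(); }

private:
  std::unordered_map<Anchor, std::unordered_map<TypeID, State>> states;
};
```

With a single flat map keyed by the pair, the same erase would have to visit every entry to find those that mention the anchor.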
This allows us to use them in C++ code without needing to do a table
lookup.
… bools.

This is essentially a port of TargetLowering::scalarizeVectorStore(), which
is used for the case where we have something like a store of <8 x s8> truncating
to <8 x s1> in memory. The naive lowering is a sequence of extracts to compute
a scalar value to store.

AArch64's DAG implementation has some more smarts to improve this further which
we can do later.

Reviewers: topperc, davemgreen

Pull Request: llvm#121169
This case is different from the earlier <8 x i1> case handled because it triggers
a legalization failure in lowerStore() that's intended for scalar code.

It also was triggering incorrect bitcast actions in the AArch64 rules that weren't
expecting truncating stores.

With these two fixed, more cases are handled. The code is still bad, including
some missing load promotion in our combiners that results in dead stores hanging
around at the end of codegen. Again, we can fix these in separate changes.

Reviewers: davemgreen, madhur13490, topperc, arsenm

Reviewed By: davemgreen

Pull Request: llvm#121185
- adding Flatten and Branch to if stmt.
- adding DXIL control flow hint metadata generation
- modifying SPIR-V OpSelectMerge to account for the specific attributes.

Closes llvm#70112

---------

Co-authored-by: Joao Saffran <jderezende@microsoft.com>
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
The error returned from the driver is actually "Could not scan", not
"Could not parse". The reason that the test has been passing is that
the FileCheck's regular expression "{{.*}}" was one of many sources
of problems, and was quoted in the output. The "CHECK" line matched
the quoted line instead of the actual error message.
Don't bother with separate getShiftAmountTy/getConstant calls.
…119592)

This PR resolves llvm#118845. I aimed to mirror the implementation of
`m_Shuffle()` in
[PatternMatch.h](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/IR/PatternMatch.h).

Updated
[SDPatternMatch.h](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/CodeGen/SDPatternMatch.h)
- Added `struct m_Mask` to match masks (`ArrayRef<int>`)
- Added two `m_Shuffle` functions. One to match independently of mask,
and one to match considering mask.
- Added `struct SDShuffle_match` to match `ISD::VECTOR_SHUFFLE`
considering mask

Updated
[SDPatternMatchTest.cpp](https://github.com/llvm/llvm-project/blob/main/llvm/unittests/CodeGen/SelectionDAGPatternMatchTest.cpp)
- Added `matchVecShuffle` test, which tests the behavior of both
`m_Shuffle()` functions

- - -

I am not sure if my test coverage is complete. I am not sure how to test
a `false` match, simply test against a different instruction? [Other
tests](https://github.com/llvm/llvm-project/blob/main/llvm/unittests/CodeGen/SelectionDAGPatternMatchTest.cpp#L175),
such as for `VSelect`, test against `Select`. I am not sure if there is
an analogous instruction to compare against for `VECTOR_SHUFFLE`. I
would appreciate some pointers in this area. In general, please
liberally critique this PR!

---------

Co-authored-by: Aidan <aidan.goldfarb@mail.mcgill.ca>
This PR fixes the debug info generation for RWBuffer types.
- This implements the [same fix as
DXC](microsoft/DirectXShaderCompiler#6296).
- Adds the HLSLAttributedResource debug info generation

Closes llvm#118523

---------

Co-authored-by: Joao Saffran <jderezende@microsoft.com>
Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
The 'set' construct is another fairly simple one: it doesn't have an
associated statement and has only a handful of allowed clauses. This patch
implements it and all the rules for it, allowing 3 of its four clauses.
The only exception is default_async, which will be implemented in a
future patch because it isn't just being enabled; it needs a completely
new implementation.
A fairly simple one, only valid on the 'set' construct, this clause
takes an int expression.  Most of the work was already done as a part of
parsing, so this patch ends up being a lot of infrastructure.
…m#121661)

SearchableTable is the legacy version that does not appear to be well
documented. Not sure if the plan was to delete it eventually.

We can eventually use the PrimaryKey feature of GenericTable to remove
one of the SearchIndex declarations. This will sort the generated table
by the primary key and remove the separately generated indexing table to
reduce .rodata size.

This patch is just the mechanical migration. The size savings will be
done in follow ups.
…vm#121630)

Extract all compiler options used to build "release" versions of libc
API functions into a separate helper function, instead of burying this
logic inside libc_function() macro.

With this change, we further split two "flavors" of cc_library()
produced for each libc public function:

* `<function>.__internal__` library used in unit tests is *not* built
with release copts and is thus indistinguishable from regular
libc_support_library(). Arguably, it's a good thing, because all sources
in a unit test are built with the same set of compiler flags, instead of
"franken-build" when a subset of sources is always built with -O3. If a
user needs to run the tests in optimized mode, they should really be
using Bazel invocation-level compile flags instead.
* `<function>` library that libc users can use to construct their own
static archive *is* built with the same release copts as before. There
is a pre-existing problem that its libc_support_library() dependencies
are not built with the same copts. We're not addressing it here now.
)

Summary:
This is a holdover from when these targets were merged. They're
basically the same but there's no reason they should be treated as
identical. I think we will live with a little duplication.
Previously, registers and subregisters mapped to the same DWARF
encoding. We don't really have any way to refer to subregisters directly
from DWARF; the expression emitter should instead use DW_OPs to stencil
out the subregister from the whole register. This was also confusing
tools that need to map back to the LLVM reg (e.g. dwarfdump), since
getLLVMRegNum() would arbitrarily return the _LO16 register.
…vm#120991)

Adds the ability to lookup and display all merged functions for an
address in llvm-gsymutil.

Now, when `--merged-functions` is used in combination with
`--address/--addresses-from-stdin`, lookup results will contain
information about merged functions, if available.

To support printing merged function information when using the
`--verbose` option, the `LookupResult` data structure also had to be
extended with pointers to the raw function data and raw merged function
data. This is because merged functions share the same address range, so
it's not easy to look up the raw merged function data for a particular
`LookupResult` that is based on a merged function.
Support true16 format for v_fma_f16 in MC.

Since we are replacing v_fma_f16 with v_fma_f16_t16/v_fma_f16_fake16
post-GFX11, we have to update the CodeGen pattern for v_fma_f16_fake16 to
keep the CodeGen tests passing. No pattern is modified or created; v_fma_f16
is simply replaced with the fake16 format.
The signature of `CheckTemplateArgValues` implements error handling via
the `bool` return type, yet always returned false. The single possible
error case instead used `PrintFatalError`, which exits the program
afterward.

This behavior is undesirable: It prevents any further errors from being
printed and makes TableGen less usable as a library as it crashes the
entire process (e.g. `tblgen-lsp-server`).

This PR therefore fixes the issue by using `Error` instead and returning
true if an error occurred. All callers already perform proper error
handling.

As `llvm-tblgen` exits on error, a test was also added to the LSP to
ensure it exits normally despite the error.
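
The convention described above (report the error, return true, let the caller decide what to do) can be sketched as follows. This is a generic illustration, not TableGen code: `Diagnostics`, `checkArgValues`, and the "non-negative" rule are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical diagnostic sink standing in for the tool's error reporting.
struct Diagnostics {
  std::vector<std::string> errors;
};

// Hypothetical check: every argument must be non-negative.
// Returns true if an error occurred. It never terminates the process, so a
// host such as a language server can keep running and report all errors.
inline bool checkArgValues(const std::vector<int> &args, Diagnostics &diags) {
  bool hadError = false;
  for (std::size_t i = 0; i < args.size(); ++i) {
    if (args[i] < 0) {
      diags.errors.push_back("argument " + std::to_string(i) +
                             " is negative");
      hadError = true; // keep going so later errors are also reported
    }
  }
  return hadError;
}
```

A command-line driver can still exit with a failure status when the function returns true, while a library user can simply surface the collected messages.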
Support true16 format for v_cvt_u32_u16 in MC
The file suffix .f95 remained after 7a07d8e; change it to .ll.
Since that benchmark is testing n*n inputs, the batch size reported to
GoogleBenchmark should be that amount. Otherwise, GoogleBenchmark
reports the timing for calling std::gcd on the whole sequence, which is
misleading.
This optimizes `std::filesystem::copy_file` to use the `copy_file_range`
syscall (Linux and FreeBSD) when available. It allows for reflinks on
filesystems such as btrfs, zfs and xfs, and server-side copy for network
filesystems such as NFS.
As feedback on llvm#119052, it was recommended I add a new bit to delineate
internal and external progress events. This patch adds this new
category, and sets up Progress.h to support external events via
SBProgress.
…m#120858)

In our implementation, failing these checks would result in a null
pointer access rather than an out-of-bounds access.
tbaederr and others added 29 commits January 8, 2025 15:09
This reverts commit 81fc3ad.

This breaks some LLDB tests, e.g.
SymbolFile/DWARF/x86/no_unique_address-with-bitfields.cpp:

lldb: ../llvm-project/clang/lib/AST/Decl.cpp:4604: unsigned int clang::FieldDecl::getBitWidthValue() const: Assertion `isa<ConstantExpr>(getBitWidth())' failed.
The dependency file and the P1689 file are text files, but the
open call misses the OF_Text flag. This PR adds the flag.
Fixes regressions in test cases ClangScanDeps/modules-extern-unrelated.m
and ClangScanDeps/P1689.cppm.
Need to check if the GEP bases are equal and return false early. Also,
need to return false if the lookup is too deep, considering bases equal
too. Fixes a crash in the assertion.
…e DAP object lifecycle. (llvm#120457)"

This reverts commit 0d9cf26. Breaks the
lldb-aarch64-windows buildbot.
This is in order to prepare for future MR where we will extend
`ReachingDefAnalysis` to stack slots.
…"default<O3>" to allow DOS to correctly evaluate the RUN command

Necessary for running update_test_checks.py on Windows
The version string can be anything, don't restrict it to digits and
dots. It's derived from the resource dir, so just check for that.
I think this is a false positive for a non-capturing lambda, but I can't
find anything in the standard that guarantees that these have eternal
lifetime.
…m#121366)

This PR adds VALID_ELEMENT_ACCESS and VALID_INPUT_RANGE checks for vector<bool>.
Add baseline test for 64-bit adds when the low half of
an operand is known 0.
If one of the inputs has all 0 bits, the low part cannot
carry and we can just pass through the original value.

Add case: https://alive2.llvm.org/ce/z/TNc7hf
Sub case: https://alive2.llvm.org/ce/z/AjH2-J

We could do this in the general case with computeKnownBits,
but add is so common this could be potentially expensive for
something which will fire infrequently.

One potential concern is this could break the 64-bit add
we expect to see for addressing mode matching, but these
constants shouldn't appear often in addressing expressions.
One test for large offset expressions changes but isn't worse.

Fixes ROCm#237
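
The carry argument above can be illustrated with a 64-bit add split into 32-bit halves. This is a plain C++ sketch of the arithmetic, not the combine itself; the `Halves` struct and function names are hypothetical.

```cpp
#include <cstdint>

// A 64-bit value represented as two 32-bit halves.
struct Halves {
  std::uint32_t lo, hi;
};

// General case: the low add may carry into the high add.
inline Halves addGeneral(Halves a, Halves b) {
  std::uint32_t lo = a.lo + b.lo;
  std::uint32_t carry = lo < a.lo ? 1u : 0u;
  return {lo, a.hi + b.hi + carry};
}

// When the low half of b is known to be 0, the low add cannot carry:
// a.lo passes through unchanged and only the high halves are added.
inline Halves addKnownZeroLow(Halves a, std::uint32_t bHi) {
  return {a.lo, a.hi + bHi};
}

inline std::uint64_t join(Halves h) {
  return (static_cast<std::uint64_t>(h.hi) << 32) | h.lo;
}
```

Both functions agree whenever `b.lo == 0`, which is why the low-half add and its carry chain can be dropped in that case.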
The argument is always zero now.
Use the spv version of the resource.getpointer intrinsic when targeting
SPIR-V.
Detect cases where ABI attributes between the call-site and the called
function differ. For now this only handles argument attributes.

Inspired by
https://discourse.llvm.org/t/difference-between-call-site-attributes-and-declaration-attributes/83902.
…ents (llvm#122073)

Some functions of the deprecated 1:N dialect conversion were marked as
`LLVM_DEPRECATED`. This caused compilation warnings because there are
still test cases of the 1:N dialect conversion framework. (These test
cases will be deleted at the same time when the 1:N driver is deleted.)
This significantly reduces the amount of debug information generated
for codebases using libc++, without hurting the debugging experience.
While VALUES is not actually used by LLVM_MAKE_OPT_ID_WITH_ID_PREFIX,
threading the correct value through is clearer and avoids the potential
for strange bugs if this ever changes.
…20995)

LLD (and other Mach-O linkers), when preparing an encryptable binary, make
space to leave all the load commands in a non-encrypted page (see [1]).

When using objcopy of a small encryptable binary, the code was not
respecting this fact, and the encryptable segments were not kept beyond
the first page. This was obvious for small or empty binaries.

The changes introduced here keep track of whether an `LC_ENCRYPTION_INFO` or
`LC_ENCRYPTION_INFO_64` has been seen and, in that case, add a full
page of offset in order to leave the load commands in their own page
(similar to what LLD is doing).

[1]:
https://github.com/llvm/llvm-project/blob/d8e792931226b15d9d2424ecd24ccfe13adc2367/lld/MachO/SyntheticSections.cpp#L90-L93
…lvm#107983)"" (llvm#122022)

Reverts llvm#121352

Triggers "vector type should not be a bool!" on:
```
  #include <algorithm>
  #include <iterator>

  bool a[100];
  bool b[100];
  auto t = std::mismatch(std::begin(a), std::end(a), std::begin(b), std::end(b));
```

https://godbolt.org/z/Y73s3sdef
…ubstitution"" (llvm#122130)

Unfortunately that breaks some code on Windows when lambdas come into
play, as reported in
llvm#102857 (comment)

This reverts commit 96eced6.
…xes for SysRegs. NFC (llvm#122001)

Use PrimaryKeyReturnRange to get all of the registers with the same
encoding. This allows AltName to be removed.
lookupExactFPImmByRepr is never called. The Name field in the table is
unused. The Name is only used by the GenericEnum.
* remove code in memory access handler
* remove vector clock operations, i.e. release and acquire operations
* to measure the overhead of the function calls/instrumentation of
memory accesses
* to measure the overhead of the interceptors
# Conflicts:
#	compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
#	compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp
#	compiler-rt/lib/tsan/rtl/tsan_new_delete.cpp
#	llvm/lib/Transforms/Instrumentation/ThreadSanitizer.cpp
@BulaBula-zy BulaBula-zy merged commit 412ba15 into monitor-clean Mar 11, 2025
50 checks passed