Rollup of 9 pull requests by Dylan-DPC · Pull Request #94425 · rust-lang/rust · GitHub

Closed · wants to merge 24 commits
24 commits (the diff below shows changes from 2 of them):
- 261765f Fix show error message when literal overflows in match patterns (light4, Feb 25, 2022)
- 5b58c6f fix E0081 error message due to lit_to_const behavior change (light4, Feb 25, 2022)
- b0c4db3 Fix ci test issue-94239.rs (light4, Feb 25, 2022)
- 2bf56b9 Document that pre-expansion lint passes are deprecated (xFrednet, Feb 26, 2022)
- ca2ad69 Make more use of `let_chains` (c410-f3r, Feb 26, 2022)
- 915740c Add test for #79465 to prevent regression (GuillaumeGomez, Feb 26, 2022)
- 530e2dd added TcpListener::bind_with_backlog to both sys_common and std::net (BartMassey, Feb 25, 2022; see the usage sketch after this list)
- 6f03b95 added UnixListener::bind_with_backlog and UnixListener::bind_addr_wit… (BartMassey, Feb 26, 2022)
- a6ca55b modified TcpListener and UnixListener tests to use with_backlog varia… (BartMassey, Feb 26, 2022)
- 8643947 added issue numbers for with_backlog (BartMassey, Feb 26, 2022)
- b98c009 unpublished listener DEFAULT_BACKLOGs (BartMassey, Feb 26, 2022)
- 35e3aaf avoid rebuilding bootstrap when PATH changes (RalfJung, Feb 27, 2022)
- b582bd3 For MIRI, cfg out the swap logic from 94212 (scottmcm, Feb 27, 2022)
- 7ad4297 Use the first codegen backend in the config.toml as default (bjorn3, Feb 27, 2022)
- 0eb8b32 3 - Make more use of let_chains (c410-f3r, Feb 27, 2022)
- 02c614b Rollup merge of #94354 - light4:issue-94239, r=Dylan-DPC (Dylan-DPC, Feb 27, 2022)
- fdcf90f Rollup merge of #94396 - c410-f3r:yet-more-let-chains, r=Dylan-DPC (Dylan-DPC, Feb 27, 2022)
- a87dc77 Rollup merge of #94397 - xFrednet:69838-deprecate-pre-expansion, r=cj… (Dylan-DPC, Feb 27, 2022)
- 8622ee2 Rollup merge of #94399 - GuillaumeGomez:regression-test-79465, r=matt… (Dylan-DPC, Feb 27, 2022)
- c7844d7 Rollup merge of #94407 - BartMassey-upstream:with-backlog, r=joshtrip… (Dylan-DPC, Feb 27, 2022)
- 715f90a Rollup merge of #94409 - RalfJung:path, r=Mark-Simulacrum (Dylan-DPC, Feb 27, 2022)
- 92ad84d Rollup merge of #94412 - scottmcm:cfg-out-miri-from-swap, r=oli-obk (Dylan-DPC, Feb 27, 2022)
- a20392e Rollup merge of #94415 - bjorn3:cfg_default_backend, r=Mark-Simulacrum (Dylan-DPC, Feb 27, 2022)
- 56f14ca Rollup merge of #94420 - c410-f3r:more-let-chains, r=Dylan-DPC (Dylan-DPC, Feb 27, 2022)
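The `with_backlog` commits above come from #94407, whose diff isn't shown on this page. As context, here is a purely hypothetical usage sketch: the method name comes from the commit messages, but the exact signature, the `usize` backlog parameter, and the feature-gate name are assumptions, not taken from the PR.

```rust
// Hypothetical sketch of the unstable API added in #94407. The signature
// and the feature-gate name below are assumptions, not confirmed here.
#![feature(bind_with_backlog)] // assumed feature name

use std::net::TcpListener;

fn main() -> std::io::Result<()> {
    // Ask for a listen(2) queue of up to 1024 pending connections rather
    // than relying on the default backlog (whose value, per commit b98c009,
    // is deliberately not published).
    let listener = TcpListener::bind_with_backlog("127.0.0.1:8080", 1024)?;
    for stream in listener.incoming() {
        let _conn = stream?; // handle each connection here
    }
    Ok(())
}
```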
27 changes: 22 additions & 5 deletions library/core/src/mem/mod.rs
@@ -708,7 +708,10 @@ pub const fn swap<T>(x: &mut T, y: &mut T) {
     // understanding `mem::replace`, `Option::take`, etc. - a better overall
     // solution might be to make `ptr::swap_nonoverlapping` into an intrinsic, which
     // a backend can choose to implement using the block optimization, or not.
-    #[cfg(not(target_arch = "spirv"))]
+    // NOTE(scottmcm) MIRI is disabled here as reading in smaller units is a
+    // pessimization for it. Also, if the type contains any unaligned pointers,
+    // copying those over multiple reads is difficult to support.
+    #[cfg(not(any(target_arch = "spirv", miri)))]
     {
         // For types that are larger multiples of their alignment, the simple way
         // tends to copy the whole thing to stack rather than doing it one part
@@ -737,12 +740,26 @@ pub const fn swap<T>(x: &mut T, y: &mut T) {
 #[rustc_const_unstable(feature = "const_swap", issue = "83163")]
 #[inline]
 pub(crate) const fn swap_simple<T>(x: &mut T, y: &mut T) {
+    // We arrange for this to typically be called with small types,
+    // so this reads-and-writes approach is actually better than using
+    // copy_nonoverlapping as it easily puts things in LLVM registers
+    // directly and doesn't end up inlining allocas.
+    // And LLVM actually optimizes it to 3×memcpy if called with
+    // a type larger than it's willing to keep in a register.
+    // Having typed reads and writes in MIR here is also good as
+    // it lets MIRI and CTFE understand them better, including things
+    // like enforcing type validity for them.
+    // Importantly, read+copy_nonoverlapping+write introduces confusing
+    // asymmetry to the behaviour where one value went through read+write
+    // whereas the other was copied over by the intrinsic (see #94371).
+
     // SAFETY: exclusive references are always valid to read/write,
-    // are non-overlapping, and nothing here panics so it's drop-safe.
+    // including being aligned, and nothing here panics so it's drop-safe.
     unsafe {
-        let z = ptr::read(x);
-        ptr::copy_nonoverlapping(y, x, 1);
-        ptr::write(y, z);
+        let a = ptr::read(x);
+        let b = ptr::read(y);
+        ptr::write(x, b);
+        ptr::write(y, a);
     }
 }

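To make the new comment concrete: below is a minimal standalone sketch (not the library code itself) of the typed read/read/write/write pattern that `swap_simple` now uses, with the same safety argument spelled out. Drop-safety matters here, which is why the example swaps a type with a destructor.

```rust
use std::ptr;

// Demonstration of swapping via two typed reads followed by two typed
// writes, mirroring `swap_simple` above.
fn swap_via_reads_and_writes<T>(x: &mut T, y: &mut T) {
    // SAFETY: exclusive references are valid, aligned, and non-overlapping,
    // and nothing between the reads and the writes can panic, so no value
    // is duplicated, dropped twice, or leaked.
    unsafe {
        let a = ptr::read(x);
        let b = ptr::read(y);
        ptr::write(x, b);
        ptr::write(y, a);
    }
}

fn main() {
    let mut a = String::from("first");
    let mut b = String::from("second");
    swap_via_reads_and_writes(&mut a, &mut b);
    assert_eq!((a.as_str(), b.as_str()), ("second", "first"));
}
```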
23 changes: 15 additions & 8 deletions library/core/src/ptr/mod.rs
@@ -419,6 +419,7 @@ pub const unsafe fn swap<T>(x: *mut T, y: *mut T) {
 #[stable(feature = "swap_nonoverlapping", since = "1.27.0")]
 #[rustc_const_unstable(feature = "const_swap", issue = "83163")]
 pub const unsafe fn swap_nonoverlapping<T>(x: *mut T, y: *mut T, count: usize) {
+    #[allow(unused)]
     macro_rules! attempt_swap_as_chunks {
         ($ChunkTy:ty) => {
             if mem::align_of::<T>() >= mem::align_of::<$ChunkTy>()
@@ -437,15 +438,21 @@ pub const unsafe fn swap_nonoverlapping<T>(x: *mut T, y: *mut T, count: usize) {
         };
     }
 
-    // Split up the slice into small power-of-two-sized chunks that LLVM is able
-    // to vectorize (unless it's a special type with more-than-pointer alignment,
-    // because we don't want to pessimize things like slices of SIMD vectors.)
-    if mem::align_of::<T>() <= mem::size_of::<usize>()
-        && (!mem::size_of::<T>().is_power_of_two()
-            || mem::size_of::<T>() > mem::size_of::<usize>() * 2)
+    // NOTE(scottmcm) MIRI is disabled here as reading in smaller units is a
+    // pessimization for it. Also, if the type contains any unaligned pointers,
+    // copying those over multiple reads is difficult to support.
+    #[cfg(not(miri))]
     {
-        attempt_swap_as_chunks!(usize);
-        attempt_swap_as_chunks!(u8);
+        // Split up the slice into small power-of-two-sized chunks that LLVM is able
+        // to vectorize (unless it's a special type with more-than-pointer alignment,
+        // because we don't want to pessimize things like slices of SIMD vectors.)
+        if mem::align_of::<T>() <= mem::size_of::<usize>()
+            && (!mem::size_of::<T>().is_power_of_two()
+                || mem::size_of::<T>() > mem::size_of::<usize>() * 2)
+        {
+            attempt_swap_as_chunks!(usize);
+            attempt_swap_as_chunks!(u8);
+        }
     }

     // SAFETY: Same preconditions as this function
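A simplified, safe-code illustration of what `attempt_swap_as_chunks!` is going for, over byte slices instead of the generic raw-pointer version above: swap `usize`-wide chunks first so LLVM can use wide loads and stores, then finish the tail a byte at a time. This is a sketch of the idea only, not the library implementation.

```rust
use std::convert::TryInto;

// Sketch of the chunked-swap idea: process usize-wide chunks first
// (amenable to vectorization), then the remaining tail bytes.
fn swap_bytes_chunked(x: &mut [u8], y: &mut [u8]) {
    assert_eq!(x.len(), y.len());
    const WORD: usize = std::mem::size_of::<usize>();
    let chunked = x.len() / WORD * WORD;
    for i in (0..chunked).step_by(WORD) {
        let r = i..i + WORD;
        // Whole-word loads and stores, like `attempt_swap_as_chunks!(usize)`.
        let a = usize::from_ne_bytes(x[r.clone()].try_into().unwrap());
        let b = usize::from_ne_bytes(y[r.clone()].try_into().unwrap());
        x[r.clone()].copy_from_slice(&b.to_ne_bytes());
        y[r].copy_from_slice(&a.to_ne_bytes());
    }
    for i in chunked..x.len() {
        // Tail handling, like `attempt_swap_as_chunks!(u8)`.
        std::mem::swap(&mut x[i], &mut y[i]);
    }
}

fn main() {
    let mut a = *b"0123456789";
    let mut b = *b"abcdefghij";
    swap_bytes_chunked(&mut a, &mut b);
    assert_eq!(&a, b"abcdefghij");
    assert_eq!(&b, b"0123456789");
}
```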
27 changes: 27 additions & 0 deletions src/test/codegen/swap-large-types.rs
@@ -39,6 +39,9 @@ pub fn swap_std(x: &mut KeccakBuffer, y: &mut KeccakBuffer) {
     swap(x, y)
 }
 
+// Verify that types with usize alignment are swapped via vectored usizes,
+// not falling back to byte-level code.
+
 // CHECK-LABEL: @swap_slice
 #[no_mangle]
 pub fn swap_slice(x: &mut [KeccakBuffer], y: &mut [KeccakBuffer]) {
@@ -50,6 +53,8 @@ pub fn swap_slice(x: &mut [KeccakBuffer], y: &mut [KeccakBuffer]) {
     }
 }
 
+// But for a large align-1 type, vectorized byte copying is what we want.
+
 type OneKilobyteBuffer = [u8; 1024];
 
 // CHECK-LABEL: @swap_1kb_slices
@@ -62,3 +67,25 @@ pub fn swap_1kb_slices(x: &mut [OneKilobyteBuffer], y: &mut [OneKilobyteBuffer])
         x.swap_with_slice(y);
     }
 }
+
+// This verifies that the 2×read + 2×write optimizes to just 3 memcpys
+// for an unusual type like this. It's not clear whether we should do anything
+// smarter in Rust for these, so for now it's fine to leave these up to the backend.
+// That's not as bad as it might seem, as for example, LLVM will lower the
+// memcpys below to VMOVAPS on YMMs if one enables the AVX target feature.
+// Eventually we'll be able to pass `align_of::<T>` to a const generic and
+// thus pick a smarter chunk size ourselves without huge code duplication.
+
+#[repr(align(64))]
+pub struct BigButHighlyAligned([u8; 64 * 3]);
+
+// CHECK-LABEL: @swap_big_aligned
+#[no_mangle]
+pub fn swap_big_aligned(x: &mut BigButHighlyAligned, y: &mut BigButHighlyAligned) {
+    // CHECK-NOT: call void @llvm.memcpy
+    // CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 64 dereferenceable(192)
+    // CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 64 dereferenceable(192)
+    // CHECK: call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 64 dereferenceable(192)
+    // CHECK-NOT: call void @llvm.memcpy
+    swap(x, y)
+}
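As a cross-check of the numbers in the CHECK lines, using only guaranteed layout rules for `#[repr(align(64))]`: the struct is 192 bytes at alignment 64, matching the `align 64 dereferenceable(192)` operands on the three memcpys.

```rust
use std::mem::{align_of, size_of};

#[repr(align(64))]
pub struct BigButHighlyAligned([u8; 64 * 3]);

fn main() {
    // 64 * 3 = 192 bytes, already a multiple of the requested alignment,
    // so no tail padding is added.
    assert_eq!(align_of::<BigButHighlyAligned>(), 64);
    assert_eq!(size_of::<BigButHighlyAligned>(), 192);
}
```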