Trying to bypass bounds checks with unsafe to see how much overhead bounds checking adds



This topic was created 455 days ago; the information in it may have since evolved or changed.

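Results on my M2 Mac mini:

With bounds checks: 1152 ms
Without bounds checks: 1084 ms

That is roughly a 6% time overhead (I'm not sure whether my wrapper adds extra overhead of its own).

Source code attached: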

    struct Array<T>(*mut T);

    impl<T> From<*const T> for Array<T> {
        fn from(ptr: *const T) -> Self {
            Self(ptr as *mut _)
        }
    }

    impl<T> std::ops::Index<usize> for Array<T> {
        type Output = T;

        fn index(&self, index: usize) -> &Self::Output {
            unsafe {
                let ptr = self.0.offset(index as isize);
                &*ptr
            }
        }
    }

    impl<T> std::ops::IndexMut<usize> for Array<T> {
        fn index_mut(&mut self, index: usize) -> &mut Self::Output {
            unsafe {
                let ptr = self.0.offset(index as isize);
                &mut *ptr
            }
        }
    }

    fn main() {
        const SIZE: usize = 1024 * 1024;
        const LOOP: usize = 2_000_000;

        let mut arr = vec![0u32; SIZE];
        let start = std::time::Instant::now();
        // array indexing with boundary check
        {
            for _ in 0..LOOP {
                let index = rand::random::<usize>() % SIZE;
                arr[index] += 1;
            }
        }
        let elapsed = start.elapsed();
        println!("Array indexing with boundary check runtime: {}ms", elapsed.as_millis());

        // to avoid cache, use a different raw array.
        let mut arr = Array::from(vec![0u32; SIZE].as_ptr());
        let start = std::time::Instant::now();
        // array indexing without boundary check
        {
            for _ in 0..LOOP {
                let index = rand::random::<usize>() % SIZE;
                arr[index] += 1;
            }
        }
        let elapsed = start.elapsed();
        println!("Array indexing without boundary check runtime: {}ms", elapsed.as_millis());
    }

Unsafe

Bounds checking

Performance

13 replies 2024-09-02 15:13:24 +08:00

nagisaushio

2024-08-30 21:59:08 +08:00

Why not just use get_mut?

nagisaushio

2024-08-30 22:03:19 +08:00

The way you wrote this is broken; it will fail in a release build.

let mut arr = Array::from(vec![0u32; SIZE].as_ptr());

Once this statement finishes, the vec has already been dropped.
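The lifetime problem described above can be sketched as follows. This is a minimal illustration, not the original poster's code: the function name `count_hits` and the small `SIZE` are assumptions made for the example. The `Vec` is bound to a local variable so it outlives every use of the raw pointer, and the pointer comes from `as_mut_ptr` because it is written through.

```rust
// Illustrative size only; the thread uses 1024 * 1024.
const SIZE: usize = 1024;

/// Counts how often each slot is hit, writing through a raw pointer.
/// Hypothetical helper: it shows a buffer that stays alive while the
/// pointer derived from it is in use.
fn count_hits(indices: &[usize]) -> Vec<u32> {
    // Bound to a local, so the buffer lives until it is returned.
    // `Array::from(vec![..].as_ptr())` instead creates a temporary that is
    // freed at the end of that statement, leaving the pointer dangling.
    let mut storage = vec![0u32; SIZE];
    let ptr = storage.as_mut_ptr();

    for &i in indices {
        let index = i % SIZE;
        // SAFETY: `index < SIZE == storage.len()`, `storage` is still alive,
        // and nothing else touches `storage` while `ptr` is in use.
        unsafe { *ptr.add(index) += 1 };
    }

    storage
}
```

Whether the dangling version appears to work is down to luck: the freed memory may simply not have been reused yet, which would explain a release build that runs without any visible error.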

Kaleidopink

2024-08-30 22:29:53 +08:00

@nagisaushio True, but on my side build --release did not report any error and I don't know why, which is why I never noticed.

nagisaushio

2024-08-30 22:39:18 +08:00

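This is actually UB; a release build is free to drop it anywhere.

I also ran it several times in both debug and release. The two sections take about the same time. In fact rand::random costs far more than IndexMut here and accounts for most of the time, so the tiny difference in IndexMut is basically unobservable. So the benchmark really measured nothing.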

Kaleidopink

2024-08-30 22:49:27 +08:00

@nagisaushio Really? Even when I push the loop count up to 200 million, I can still observe a gap between the two: 115 ms versus 85 ms.

Kaleidopink

2024-08-30 22:51:40 +08:00

@nagisaushio Of course, I only just discovered that both slice and Vec already have a get_unchecked_mut method, so I really did write all of this for nothing.
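For reference, the standard-library methods mentioned in this thread can be compared side by side. This is a minimal sketch; the three function names are hypothetical helpers introduced for illustration, while `get_mut` and `get_unchecked_mut` are the real slice methods.

```rust
/// Plain indexing: panics if `index` is out of bounds.
fn bump_checked(data: &mut [u32], index: usize) {
    data[index] += 1;
}

/// `get_mut` still checks the bound, but reports failure as `None`
/// instead of panicking. Returns whether the slot existed.
fn bump_get_mut(data: &mut [u32], index: usize) -> bool {
    match data.get_mut(index) {
        Some(slot) => {
            *slot += 1;
            true
        }
        None => false,
    }
}

/// `get_unchecked_mut` skips the check entirely.
///
/// # Safety
/// The caller must guarantee `index < data.len()`; anything else is
/// undefined behavior.
unsafe fn bump_unchecked(data: &mut [u32], index: usize) {
    // SAFETY: the bound is guaranteed by the caller (see above).
    unsafe { *data.get_unchecked_mut(index) += 1 };
}
```

Note that `get_mut` does not remove the check, so only `get_unchecked_mut` is a like-for-like replacement for the hand-written pointer wrapper.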

nagisaushio

2024-08-30 23:16:36 +08:00

https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=e7fe53cb9006e2ce5406778cdcc16296

Try running this a few times. I switched to a lighter rng with a fixed seed to eliminate extra noise.
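The idea of a lighter, fixed-seed generator can be sketched as below. The contents of the linked playground are not reproduced here, so this is an assumption about the general approach rather than the code behind the link: a xorshift64 generator, with the type name `XorShift64` and its methods chosen for illustration.

```rust
/// A tiny xorshift64 generator. Not suitable for cryptography; it is only
/// meant to be cheap and reproducible for benchmarking.
struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    /// Creates a generator from a fixed seed. xorshift must not start at
    /// zero, so a zero seed is replaced with an arbitrary nonzero constant.
    fn new(seed: u64) -> Self {
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        Self { state }
    }

    /// Advances the state and returns the next 64-bit value.
    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }

    /// Returns an index in `0..len`. Panics if `len` is zero.
    fn next_index(&mut self, len: usize) -> usize {
        (self.next_u64() % len as u64) as usize
    }
}
```

With a fixed seed both timed loops see the same index sequence, so any difference between them is not caused by different memory access patterns.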

gwy15

2024-08-30 23:28:11 +08:00 via iPhone

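1. You can just use get_unchecked
2. There is a use-after-drop; this is not unsafe, it is unsound
3. The name Array is wrong; it should be called Pointer, since it does not store a length
4. You can look at the asm directly to determine whether the bounds check was optimized away
There is too much to pick on here... I suggest reading more code and documentation; thinking without learning does not work either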
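Points 1 to 3 above can be combined into one sketch: a wrapper that borrows the buffer (so it cannot outlive it), keeps the length, and delegates to `get_unchecked`. The type name `UncheckedSlice` is hypothetical, and this is one possible design under those assumptions rather than a recommended replacement for plain slices.

```rust
use std::ops::{Index, IndexMut};

/// Borrows a slice and indexes it without bounds checks in release builds.
/// The lifetime `'a` ties the wrapper to the buffer, so a use-after-drop
/// is rejected at compile time.
struct UncheckedSlice<'a, T> {
    data: &'a mut [T],
}

impl<'a, T> UncheckedSlice<'a, T> {
    /// # Safety
    /// Every index later used with `[]` on this wrapper must be
    /// `< data.len()`.
    unsafe fn new(data: &'a mut [T]) -> Self {
        Self { data }
    }
}

impl<'a, T> Index<usize> for UncheckedSlice<'a, T> {
    type Output = T;

    fn index(&self, index: usize) -> &T {
        // The stored length still allows a check in debug builds.
        debug_assert!(index < self.data.len());
        // SAFETY: promised by the caller of `new`.
        unsafe { self.data.get_unchecked(index) }
    }
}

impl<'a, T> IndexMut<usize> for UncheckedSlice<'a, T> {
    fn index_mut(&mut self, index: usize) -> &mut T {
        debug_assert!(index < self.data.len());
        // SAFETY: promised by the caller of `new`.
        unsafe { self.data.get_unchecked_mut(index) }
    }
}
```

Making the constructor `unsafe` puts the obligation on whoever creates the wrapper, which keeps the safe `Index` implementations from being callable with an unverified contract.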

gwy15

2024-08-30 23:29:34 +08:00 via iPhone

Also, a proper benchmark should use black_box to keep the compiler from optimizing things away, rather than putting an rng on the timed path...
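A rough sketch of that suggestion, assuming `std::hint::black_box` (stable since Rust 1.66): the indices are generated before the clock starts, and `black_box` hides each index and the final buffer from the optimizer. The function names and the LCG used to fill the index list are assumptions for illustration; a dedicated benchmarking harness would give more reliable numbers than hand-rolled timing.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Precomputes `count` pseudo-random indices in `0..len` with a fixed-seed
/// LCG, so no rng work happens on the timed path. Panics if `len` is zero.
fn make_indices(count: usize, len: usize) -> Vec<usize> {
    let mut state: u64 = 0x2545_F491_4F6C_DD1D;
    (0..count)
        .map(|_| {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            ((state >> 33) as usize) % len
        })
        .collect()
}

/// Times bounds-checked indexing.
fn time_checked(data: &mut [u32], indices: &[usize]) -> Duration {
    let start = Instant::now();
    for &i in indices {
        data[black_box(i)] += 1;
    }
    black_box(&*data); // keep the writes observable
    start.elapsed()
}

/// Times unchecked indexing over the same indices.
fn time_unchecked(data: &mut [u32], indices: &[usize]) -> Duration {
    // Validate once, outside the timed region.
    assert!(indices.iter().all(|&i| i < data.len()));
    let start = Instant::now();
    for &i in indices {
        // SAFETY: every index was verified to be in bounds above.
        unsafe { *data.get_unchecked_mut(black_box(i)) += 1 };
    }
    black_box(&*data);
    start.elapsed()
}
```

Calling both functions with the same `indices` slice and separate buffers, several times each, gives a fairer comparison than a single run of each loop.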

Kaleidopink

2024-08-31 11:19:27 +08:00

@nagisaushio On the playground, the version without bounds checks actually came out slower, while on my own machine the result is as expected. Very strange.

Kaleidopink

2024-08-31 11:20:03 +08:00

@gwy15 Learned something. I really have never written a benchmark before.

Kauruus

2024-08-31 12:06:57 +08:00

Someone has tested this before: with C as the baseline, Rust's runtime checks cost roughly 1.77x, of which bounds checks account for about 50%.

https://dl.acm.org/doi/fullHtml/10.1145/3551349.3559494

whoami9894

2024-09-02 15:13:24 +08:00

The compiler isn't stupid: with `index % SIZE`, the bounds check has definitely been optimized away.
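This claim can be checked rather than argued. The sketch below uses two hypothetical functions for comparison. In `bump_mod` the index is reduced modulo the array length `N`, which is known at compile time, so an optimizing build can typically prove the check redundant. In `bump_raw` nothing is known about the index, so the check has to stay. Whether the same reasoning applies to the `Vec` in the original post depends on whether the optimizer can see that its length equals `SIZE`.

```rust
/// Index is always `< N`, so the bounds check is provably redundant.
/// Panics if `N` is zero (remainder by zero).
pub fn bump_mod<const N: usize>(arr: &mut [u32; N], raw: usize) {
    arr[raw % N] += 1;
}

/// Nothing is known about `index`, so the bounds check must remain.
pub fn bump_raw(arr: &mut [u32], index: usize) {
    arr[index] += 1;
}
```

Compiling with `rustc -O --emit asm` and searching the output for calls to `panic_bounds_check` shows which functions kept their check.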