C++ 中文周刊 2024-03-30 第153期

周刊项目地址

公众号

qq群点击进入

RSS https://github.com/wanghenshui/cppweeklynews/releases.atom

欢迎投稿，推荐或自荐文章/软件/资源等，评论区留言

本期文章由 HNY 赞助

最近博客内容较少，所以基本整合起来发

资讯

标准委员会动态/ide/编译器信息放在这里

最近的热门事件无疑是xz被植入后门了，埋木马的哥们主动参与社区贡献，骗取信任拿到直接commit权限

趁主要维护人休假期间埋木马，但是木马有问题回归测试被安全人员发现sshd CPU升高，找到xz是罪魁祸首

无间道搁这

文章

How can I tell C++ that I want to discard a nodiscard value?

std::ignore 或者 decltype(std::ignore) _; 然后用 _，或者不自己写，等c++26

Step-by-Step Analysis of Crash Caused by Resize from Zero

resize 的参数为负数会异常(比如参数溢出意外负数)

有句讲句，标准库里的异常有时候很奇怪，大动干戈，副作用还是异常应该有明显的区分。但是目前来看显然是一股脑全异常了

比如stoi异常，这些场景里expect<T>更合适，或者c传统的返回值处理更合理一些

RDMA性能优化经验浅谈（一）

科普

GCC 14 Boasts Nice ASCII Art For Visualizing Buffer Overflows

告警更明显一些

Improvements to static analysis in the GCC 14 compiler

gcc14加了个-fanalyzer

包括上面的buffer溢出分析，死循环分析，比如 https://godbolt.org/z/vn55nn43z

void test (int m, int n) {
  float arr[m][n];
  for (int i = 0; i < m; i++)
    for (int j = 0; j < n; i++)
      arr[i][j] = 0.f;
  /* etc */
}

这里里面的循环条件一直没变，所以一直是死的

编译器能分析出问题

<source>: In function 'test':
<source>:5:23: warning: infinite loop [CWE-835] [-Wanalyzer-infinite-loop]
    5 |     for (int j = 0; j < n; i++)
      |                     ~~^~~
  'test': events 1-5
    |
    |    5 |     for (int j = 0; j < n; i++)
    |      |                     ~~^~~  ~~~
    |      |                       |     |
    |      |                       |     (4) looping back...
    |      |                       (1) infinite loop here
    |      |                       (2) when 'j < n': always following 'true' branch...
    |      |                       (5) ...to here
    |    6 |       arr[i][j] = 0.f;
    |      |       ~~~~~~~~~        
    |      |             |
    |      |             (3) ...to here
    |
ASM generation compiler returned: 0
<source>: In function 'test':
<source>:5:23: warning: infinite loop [CWE-835] [-Wanalyzer-infinite-loop]
    5 |     for (int j = 0; j < n; i++)
      |                     ~~^~~
  'test': events 1-5
    |
    |    5 |     for (int j = 0; j < n; i++)
    |      |                     ~~^~~  ~~~
    |      |                       |     |
    |      |                       |     (4) looping back...
    |      |                       (1) infinite loop here
    |      |                       (2) when 'j < n': always following 'true' branch...
    |      |                       (5) ...to here
    |    6 |       arr[i][j] = 0.f;
    |      |       ~~~~~~~~~        
    |      |             |
    |      |             (3) ...to here
    |
Execution build compiler returned: 0
Program returned: 255

这个功能非常有用 gcc14已经发布，能体验到赶紧用起来，免费的静态检查了属于是

A case in API ergonomics for ordered containers

range的问题，如果range 的顺序颠倒，可能会产生未定义行为

举个例子，正常的范围使用

std::set<int> x=...;

// elements in [a,b]
auto first = x.lower_bound(a);
auto last  = x.upper_bound(b);
 
while(first != last) std::cout<< *first++ <<" ";

// elements in [a,b)
auto first = x.lower_bound(a);
auto last  = x.lower_bound(b);

// elements in (a,b]
auto first = x.upper_bound(a);
auto last  = x.upper_bound(b);

// elements in (a,b)
auto first = x.upper_bound(a);
auto last  = x.lower_bound(b);

这里的用法的潜在条件是a < b，如果不满足，就完蛋了, 似乎没有办法预防写错，手动assert？

这也容易引起错误，能不能让使用者不要用接口有隐形成本？

boost multiindex设计了一种接口

template<typename LowerBounder,typename UpperBounder>
std::pair<iterator,iterator>
range(LowerBounder lower, UpperBounder upper);

显然，不同的类型，隐含一层检查，看上去不好用，但是结合boost lambda2，非常直观

// equivalent to std::set<int>
boost::multi_index_container<int> x=...;

using namespace boost::lambda2;

// [a,b]
auto [first, last] = x.range(_1 >= a, _1 <= b);

// [a,b)
auto [first, last] = x.range(_1 >= a, _1 < b);

// (a,b]
auto [first, last] = x.range(_1 > a,  _1 <= b);

// (a,b)
auto [first, last] = x.range(_1 > a,  _1 < b);

倾向于返回range处理，而不是手动拿到range，即使出现a>b的场景，顶多返回空range

这样要比上面的用法更安全一些

唉，API设计的问题还是有很多需要关注的地方

C++ left arrow operator

幽默代码一例(别这么写)

#include <iostream>
 
template<class T>
struct larrow {
    larrow(T* a_) : a(a_) { }
    T* a;
};
 
template <class T, class R>
R operator<(R (T::* f)(), larrow<T> it) {
    return (it.a->*f)();
}
 
template<class T>
larrow<T> operator-(T& a) {
    return larrow<T>(&a);
}
 
struct C {
    void f() { std::cout << "foo\n"; }    
};
 
int main() {
    C x;
    (&C::f)<-x;
}

Upgrading the compiler: undefined behaviour uncovered

TLDR enum没指定默认值的bug，类似int不指定默认值

Trivial, but not trivially default constructible

一个例子

template<class T>
struct S {
    S() requires (sizeof(T) > 3) = default;
    S() requires (sizeof(T) < 5) = default;
};

static_assert(std::is_trivial_v<S<int>>);
static_assert(not std::is_default_constructible_v<S<int>>);

是trivial的，但是构造函数有点多，就没法默认构造

你问这有什么用，确实没用。当不知道好了

今天也和读者聊天问push back T构造异常了咋办，

那我只能说这个T的实现很没有素质，除了bad alloc别的老子不想管

希望大家都做一个有素质的人

Understanding and implementing fixed point numbers

看不懂

Random distributions are not one-size-fits-all (part 1)

Random distributions are not one-size-fits-all (part 2)

随机数生成和场景关联程度太大了，lemire的算法省掉了取余% 但是部分场景性能并不能打败使用取余%的版本

How fast is rolling Karp-Rabin hashing?

其实就是滚动hash，比如这种

uint32_t hash = 0;
for (size_t i = 0; i < len; i++) {
  hash = hash * B + data[i];
}
return hash;

这个B可能是个质数，比如31，不过不重要

考虑一个字符串子串匹配场景，这种场景下得计算字串hash，比如长字符串内长度为N的子串，代码类似这样

for(size_t i = 0; i < len-N; i++) {
  uint32_t hash = 0;
  for(size_t j = 0; j < N; j++) {
    hash = hash * B + data[i+j];
  }
  //...
}

这个代码的问题是效率低，有没有什么优化办法？

显然这里面有重复计算，到N之前的hash计算完全可以提前算出来

后面变动的减掉就行

uint32_t BtoN = 1;
for(size_t i = 0; i < N; i++) { BtoN *= B; }

uint32_t hash = 0;
for(size_t i = 0; i < N; i++) {
  hash = hash * B + data[i];
}
// ...
for(size_t i = N; i < len; i++) {
  hash = hash * B + data[i] - BtoN * data[i-N];
  // ...
}

不知道你看懂没？

这样提前算好性能翻个五倍没啥问题

代码在这里 https://github.com/lemire/clhash/

还有这个 https://github.com/lemire/rollinghashcpp

视频

C++ Weekly - Ep 421 - You’re Using optional, variant, pair, tuple, any, and expected Wrong!

不要直接从原类型返回optional这种盒子类型，会破坏RVO 手动make_optional就行了

上一期

看到这里或许你有建议或者疑问或者指出错误，请留言评论! 多谢! 你的评论非常重要！也可以帮忙点赞收藏转发！多谢支持！

觉得写的不错那就给点吧, 在线乞讨

This site is open source. Improve this page.