C++ 中文周刊第122期

周刊项目地址

公众号

RSS https://github.com/wanghenshui/cppweeklynews/releases.atom

欢迎投稿，推荐或自荐文章/软件/资源等

请提交 issue

资讯

标准委员会动态/ide/编译器信息放在这里

#include cleanup in Visual Studio

支持提示删除没用的头文件，我记得clion是不是早就支持了？

Xmake v2.8.1 发布，大量细节特性改进

支持grpc-plugin了，那么别的支持也能实现，感觉更强了，可以用大项目试试

文章

Did you know that C++26 added @, $, and ` to the basic character set?

auto $dollar_sign = 42;
auto @commerical_at = 42;
auto `grave_accent = 42;

没懂有啥用

字符串截断问题

在某群和群友讨论 std::string能不能存带 ‘/0’的字符，我之前遇到过坑，就想当然的认为不能，然后被群友教育了一种用法

#include <iostream>
#include <vector>
#include <string>

int main() {
    std::string s1 = std::string{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0123"};
    std::cout<<"---\n";
    std::cout<< s1.size()<<"\n";
    std::cout<<"---\n";
    std::string s2;
    s2.append("\0");
    s2.append("1");
    std::cout<<"---\n";
    std::cout<< s2.size()<<"\n";
    std::cout<< s2<<"\n";
    std::cout<<"---\n";

    std::vector<char> v = {'\0','\0','2','3','\0','5','6'};
    std::string s3(v.begin(), v.end());
    std::cout<<"---\n";
    std::cout<< s3.size()<<"\n";
    std::cout<< s3<<"\n";
    std::cout<<"---\n";
    std::string s4 = std::string{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0a23", 17};
    std::cout<<"---\n";
    std::cout<< s4.size()<<"\n";
    std::cout<< s4<<"\n";
    std::cout<<"---\n";
    return 0;
}

s1 s2都是截断的，但是s3不是，想要std::string包含/0只能通过迭代器构造，不能从c字符串构造，因为c字符串复制判断结尾用的/0

同理，s4加上了长度构造，就包含/0不会截断了。c字符串的缺陷，没有长度信息

我这里先入为主觉得所有构造都这样了。稍为想的远一点

Inside boost::concurrent_flat_map

boost::concurrent_flat_map开链的并发hashmap速度不输tbb boost 1.83发布

为什么新版内核将进程pid管理从bitmap替换成了radix-tree？

有点意思

The C++ Type Loophole (C++14)

经典的友元函数注入

template<int N> struct tag{};

template<typename T, int N>
struct loophole_t {
  friend auto loophole(tag<N>) { return T{}; };
};
auto loophole(tag<0>);
sizeof(loophole_t<std::string, 0> );
statc_assert(std::is_same<std::string, decltype(loophole(tag<0>{})) >::value, "same");

这玩意属于缺陷，说不定以后就修了，为啥要讲这个老文章？看下面这个

Concept到底，来自te::conceptify的救赎！

看代码

//Conceptify it (Requires C++20)
struct Drawable {
  void draw(std::ostream &out) const {
    te::call([](auto const &self, auto &out)
      -> decltype(self.draw(out)) { self.draw(out); }, *this, out);
  }
};

struct Square {
  void draw(std::ostream &out) const { out << "Square"; }
};

template<te::conceptify<Drawable> TDrawable>
void draw(TDrawable const &drawable) { 
  drawable.draw(std::cout);
}

int main() {
  auto drawable = Square{};
  draw(drawable);  // prints Square
}

我咋感觉te::conceptify<Drawable>看上去和实现concept没区别？

这种实现能定义te::poly<Drawable>保存在容器里然后遍历？

实现原理是通过一个mapping类注册类型和typelist，typelist绑上lambda, 我说的不精确，可以看原文

call注册

template <...>
constexpr auto call(const TExpr expr, const I &interface, Ts &&... args) {
  ...
  return detail::call_impl<I>(...);
}

template <...>
constexpr auto call_impl(...) {
  void(typename mappings<I, N>::template set<type_list<TExpr, Ts...> >{});
  return ...;
}

mapping是这样的

template <class, std::size_t>
struct mappings final {
  friend auto get(mappings);
  template <class T>
  struct set {
    friend auto get(mappings) { return T{}; }
  };
};

通过这个友元注入get 来记住T类型，也就是之前保存的typelist，构造出lambda

template <class T, class TExpr, class... Ts>
constexpr auto requires_impl(type_list<TExpr, Ts...>)
    -> decltype(&TExpr::template operator()<T, Ts...>);

template <class I, class T, std::size_t... Ns>
constexpr auto requires_impl(std::index_sequence<Ns...>) -> type_list<
    decltype(requires_impl<I>(decltype(get(mappings<T, Ns + 1>{})){}))...>;
}  // namespace detail


template <class I, class T>
concept bool conceptify = requires {
  detail::requires_impl<I, T>(
      std::make_index_sequence<detail::mappings_size<T, I>()>{});
};

挺巧妙，友元函数注入+typelist绑定

问题在于lambda每次都是构造的，可能有小对象问题，但愿编译器能优化掉

A find-Function to Append .and_then(..).or_else(..) in C++

find还要判断结果，很烦，类似optional，封装一下

#include <iostream>
#include <vector>
#include <algorithm>
#include <optional>

namespace cwt {    
    // first we craete a type which will hold our find result
    template<typename T> 
    class find_result {
        public: 
            find_result() = default;
            find_result(T value) : m_value{value} {}

            // this is the and_then method which gets a callback
            template<typename Func>
            const find_result<T>& and_then(const Func&& func) const {
                // we only call the callback if we have a value 
                if (m_value.has_value()) {
                    func(m_value.value());
                }
                // and to further add or_else we return *this
                return *this;
            }

            // almost same here, just with return type void
            template<typename Func>
            void or_else(const Func&& func) const {
                if (!m_value.has_value()) {
                    func();
                }
            }
        private:
            // and since we don't know if we found a value
            // we hold possible one as optional
            std::optional<T> m_value{std::nullopt}; 
    };

    // this my find function, where try to find a value in a vector
    template<typename T>
    find_result<T> find(const std::vector<T>& v, const T value) {
        // like before we use the iterator
        auto it = std::find(v.begin(), v.end(), value);
        // and if we didn't find the value we return
        // find_result default constructed
        if (it == v.end()) {
            return find_result<T>();
        } else {
            // or with the found value
            return find_result<T>(*it);
        }
    }
} // namespace cwt

int main() {
    // lets create a simple vector of int values
    std::vector<int> v = {1,2,3,4};

    // we use our find function
    // and since we return find_result<int> 
    // we can append or call and_then or_else directly
    cwt::find(v, 2)
    .and_then([](int result){ std::cout << "found " << result << '\n'; })
    .or_else([](){ std::cout << "found nothing\n"; })
    ;

    cwt::find(v, 10)
    .and_then([](int result){ std::cout << "found " << result << '\n'; })
    .or_else([](){ std::cout << "found nothing\n"; })
    ;

    return 0;
}

看个乐

Packing a string of digits into an integer quickly

SIMD环节,需求，把 “20141103 012910”转成数字0x20141103012910

#include <x86intrin.h> // Windows: <intrin.h>
#include <string.h>

// From "20141103 012910", we want to get
// 0x20141103012910
uint64_t extract_nibbles(const char* c) {
  uint64_t part1, part2;
  memcpy(&part1, c, sizeof(uint64_t));
  memcpy(&part2 , c + 7, sizeof(uint64_t));
  part1 = _bswap64(part1);
  part2 = _bswap64(part2);
  part1 = _pext_u64(part1, 0x0f0f0f0f0f0f0f0f);
  part2 = _pext_u64(part2, 0x0f000f0f0f0f0f0f);
  return (part1<<24) | (part2);
}

汇编

movbe rax, QWORD PTR [rdi]
movbe rdx, QWORD PTR [rdi+7]
movabs rcx, 1085102592571150095
pext rax, rax, rcx
movabs rcx, 1080880467920490255
sal rax, 24
pext rdx, rdx, rcx
or rax, rdx

pext在amd zen3架构上开销很大，但本身也非常小巧了

ARM平台

#include <arm_neon.h>
// From "20141103 012910", we want to get
// 0x20141103012910
uint64_t extract_nibbles(const char *c) {
  const uint8_t *ascii = (const uint8_t *)(c);
  uint8x16_t in = vld1q_u8(ascii);
  // masking the high nibbles,
  in = vandq_u8(in, vmovq_n_u8(0x0f));
  // shuffle the bytes
  const uint8x16_t shuf = {14, 13, 12, 11, 10, 9, 7, 6,
    5, 4, 3, 2, 1, 0, 255, 255};
  in = vqtbl1q_u8(in, shuf);
  // then shift/or
  uint16x8_t ins =
    vsraq_n_u16(vreinterpretq_u16_u8(in),
    vreinterpretq_u16_u8(in), 4);
  // then narrow (16->8),
  int8x8_t packed = vmovn_u16(ins);
  // extract to general register.
  return vget_lane_u64(vreinterpret_u64_u16(packed), 0);
}

汇编

adrp x8, .LCPI0_0
ldr q1, [x0]
movi v0.16b, #15
ldr q2, [x8, :lo12:.LCPI0_0]
and v0.16b, v1.16b, v0.16b
tbl v0.16b, { v0.16b }, v2.16b
usra v0.8h, v0.8h, #4
xtn v0.8b, v0.8h
fmov x0, d0

Recognizing string prefixes with SIMD instructions

SIMD环节，在一个字符串数组里找子串是否存在

普通实现，bsearsh

std::string *lookup_symbol(const char *input) {
  return bsearch(input, strings.data(), strings.size(),
  sizeof(std::string), compare);
}

或者trie

或者SIMD

代码我就不贴了，https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/blob/master/2023/07/13/benchmarks/benchmark.cpp

天书

参考阅读 https://trent.me/is-prefix-of-string-in-table/ 天书

Software Performance and Class Layout

这个讲的是局部性问题

比如

class my_class {
   int a;
   int b;
   ...
   int z;
};
int sum_all(my_class* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].a + m[i].z;
    }
    return sum;
}

循环用到了a和z，那么a和z就应该靠近点

class my_class {
   int a;
   int z;
   int b;
   ...
};

用不到的拆出来

class my_class {
   int m1;
   int m2;
   int m3;
};
int sum_all(my_class* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].m1 + m[i].m2;
    }
    return sum;
}

没用到m3，那就把它拿出来

class my_class_base {
   int m1;
   int m2;
};
class my_class_aux {
   int m3;
};
int sum_all(my_class_base* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].m1 + m[i].m2;
    }
    return sum;
}

同理，如果两个类有互相使用，就放在一起

class my_class1 {
    int m1;
    int m2;
};
class my_class2 {
    int a1;
    int a2;
}
int sum_all(my_class1* m1, my_class2* m2, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m1[i].m1 + m1[i].m2 + m2[i].a1;
    }
    return sum;
}

改成

class my_class1 {
    int m1;
    int m2;
    int a1;
};
class my_class2 {
    int a2;
};
int sum_all(my_class1* m1, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m1[i].m1 + m1[i].m2 + m1[i].a1;
    }
    return sum;
}

不要循环中访问指针

class my_class {
   int m1;
   int* p_a1;
};
int sum_all(my_class* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].m1 + *m[i].p_a1;
    }
    return sum;
}

这个pa1非常不合理，应该改成值

再比如这种猥琐的公用

class shared {
    int a1;
};
class my_class_1 {
   int m1;
   shared* s;
};
class my_class_2 {
   int m1;
   shared* s;
};
int sum_all_1(my_class_1* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].m1 + m[i].s->a1;
    }
    return sum;
}
int sum_all_2(my_class_2* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].m1 + m[i].s->a1;
    }
    return sum;
}

更新s省事了但是实际上循环访问低效，也得改成值

Structure Of Arrays (SOA)结构体数组改成数组结构体

把数据集改小，比如

class big_class {
   int index;
   ...
};
void my_sort(std::vector<big_class>& v) {
    std::sort(v.begin(), v.end(), [](const big_class& l, const big_class& r) { return l.index < r.index; });
}

一堆不相关的数据参与了数据加载，可以改成这个

class small_class {
    int index;
    int pointer;
};
void my_sort(std::vector<big_class>& v) {
    std::vector<small_class> tmp;
    tmp.reserve(v.size());
    for (int i = 0; i < v.size(); i++) {
        tmp.push_back({v[i].index, i});
    }
    std::sort(tmp.begin(), tmp.end(), [](const small_class& l, const small_class& r) { return l.index < r.index; });
    std::vector<big_class> result;
    result.reserve(tmp.size());
    for (int i = 0; i < tmp.size(); i++) {
        result.push_back(v[tmp[i].index]);
    }
    v = std::move(result);
}

这种改法得测量一下，未必有收益，可能有，但不多，主要取决于数据结构，只要bigclass比smallclass大很多的话，收益肯定是有的

Why does the compiler complain about a missing constructor when I’m just resizing my std::vector to a smaller size?

你用std::vector::resize缩容，提示构造告警

std::vector<Thing> things;
things.resize(n); // keep only the first n

但你是缩容，不需要构造啊。问题在于resize包括扩容和缩容，不知道你是缩容

还是erase吧

things.erase(things.begin() + n, things.end());

看不懂

开源项目需要人手

asteria 一个脚本语言，可嵌入，长期找人，希望胖友们帮帮忙，也可以加群753302367和作者对线
Unilang deepin的一个通用编程语言，点子有点意思，也缺人，感兴趣的可以github讨论区或者deepin论坛看一看。这里也挂着长期推荐了
gcc-mcf 懂的都懂

新项目介绍/版本更新

https://github.com/MikePopoloski/boost_unordered 把boost unorderd抽出来了 boost很重
https://github.com/intel/x86-simd-sort/releases/tag/v2.0 simd sort。能快点

本文永久链接

如果有疑问评论最好在上面链接到评论区里评论，这样方便搜索，微信公众号有点封闭/知乎吞评论

看到这里或许你有建议或者疑问或者指出错误，请留言评论! 多谢! 你的评论非常重要！也可以帮忙点赞收藏转发！多谢支持！

觉得写的不错那就给点吧, 在线乞讨

This site is open source. Improve this page.

C++ 中文周刊 第122期

资讯

文章

开源项目需要人手

新项目介绍/版本更新

C++ 中文周刊第122期