C++ 中文周刊 第122期

周刊项目地址

公众号

RSS https://github.com/wanghenshui/cppweeklynews/releases.atom

欢迎投稿,推荐或自荐文章/软件/资源等

提交 issue


资讯

标准委员会动态/ide/编译器信息放在这里

#include cleanup in Visual Studio

支持提示删除没用的头文件,我记得clion是不是早就支持了?

编译器信息最新动态推荐关注hellogcc公众号 本周更新 2023-07-12 第210期

Xmake v2.8.1 发布,大量细节特性改进

支持grpc-plugin了,那么别的支持也能实现,感觉更强了,可以用大项目试试

文章

auto $dollar_sign = 42;
auto @commerical_at = 42;
auto `grave_accent = 42;

没懂有啥用

在某群和群友讨论 std::string能不能存带 ‘/0’的字符,我之前遇到过坑,就想当然的认为不能,然后被群友教育了一种用法

#include <iostream>
#include <vector>
#include <string>

int main() {
    std::string s1 = std::string{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0123"};
    std::cout<<"---\n";
    std::cout<< s1.size()<<"\n";
    std::cout<<"---\n";
    std::string s2;
    s2.append("\0");
    s2.append("1");
    std::cout<<"---\n";
    std::cout<< s2.size()<<"\n";
    std::cout<< s2<<"\n";
    std::cout<<"---\n";

    std::vector<char> v = {'\0','\0','2','3','\0','5','6'};
    std::string s3(v.begin(), v.end());
    std::cout<<"---\n";
    std::cout<< s3.size()<<"\n";
    std::cout<< s3<<"\n";
    std::cout<<"---\n";
    std::string s4 = std::string{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0a23", 17};
    std::cout<<"---\n";
    std::cout<< s4.size()<<"\n";
    std::cout<< s4<<"\n";
    std::cout<<"---\n";
    return 0;
}

s1 s2都是截断的,但是s3不是,想要std::string包含/0只能通过迭代器构造,不能从c字符串构造,因为c字符串复制判断结尾用的/0

同理,s4加上了长度构造,就包含/0不会截断了。c字符串的缺陷,没有长度信息

我这里先入为主觉得所有构造都这样了。稍为想的远一点

boost::concurrent_flat_map开链的并发hashmap速度不输tbb boost 1.83发布

有点意思

经典的友元函数注入

template<int N> struct tag{};

template<typename T, int N>
struct loophole_t {
  friend auto loophole(tag<N>) { return T{}; };
};
auto loophole(tag<0>);
sizeof(loophole_t<std::string, 0> );
statc_assert(std::is_same<std::string, decltype(loophole(tag<0>{})) >::value, "same");

这玩意属于缺陷,说不定以后就修了,为啥要讲这个老文章?看下面这个

看代码

//Conceptify it (Requires C++20)
struct Drawable {
  void draw(std::ostream &out) const {
    te::call([](auto const &self, auto &out)
      -> decltype(self.draw(out)) { self.draw(out); }, *this, out);
  }
};

struct Square {
  void draw(std::ostream &out) const { out << "Square"; }
};

template<te::conceptify<Drawable> TDrawable>
void draw(TDrawable const &drawable) { 
  drawable.draw(std::cout);
}

int main() {
  auto drawable = Square{};
  draw(drawable);  // prints Square
}

我咋感觉te::conceptify<Drawable>看上去和实现concept没区别?

这种实现能定义te::poly<Drawable>保存在容器里然后遍历?

实现原理是通过一个mapping类注册类型和typelist,typelist绑上lambda, 我说的不精确,可以看原文

call注册

template <...>
constexpr auto call(const TExpr expr, const I &interface, Ts &&... args) {
  ...
  return detail::call_impl<I>(...);
}

template <...>
constexpr auto call_impl(...) {
  void(typename mappings<I, N>::template set<type_list<TExpr, Ts...> >{});
  return ...;
}

mapping是这样的

template <class, std::size_t>
struct mappings final {
  friend auto get(mappings);
  template <class T>
  struct set {
    friend auto get(mappings) { return T{}; }
  };
};

通过这个友元注入get 来记住T类型,也就是之前保存的typelist,构造出lambda

template <class T, class TExpr, class... Ts>
constexpr auto requires_impl(type_list<TExpr, Ts...>)
    -> decltype(&TExpr::template operator()<T, Ts...>);

template <class I, class T, std::size_t... Ns>
constexpr auto requires_impl(std::index_sequence<Ns...>) -> type_list<
    decltype(requires_impl<I>(decltype(get(mappings<T, Ns + 1>{})){}))...>;
}  // namespace detail


template <class I, class T>
concept bool conceptify = requires {
  detail::requires_impl<I, T>(
      std::make_index_sequence<detail::mappings_size<T, I>()>{});
};

挺巧妙,友元函数注入+typelist绑定

问题在于lambda每次都是构造的,可能有小对象问题,但愿编译器能优化掉

find还要判断结果,很烦,类似optional,封装一下

#include <iostream>
#include <vector>
#include <algorithm>
#include <optional>

namespace cwt {    
    // first we craete a type which will hold our find result
    template<typename T> 
    class find_result {
        public: 
            find_result() = default;
            find_result(T value) : m_value{value} {}

            // this is the and_then method which gets a callback
            template<typename Func>
            const find_result<T>& and_then(const Func&& func) const {
                // we only call the callback if we have a value 
                if (m_value.has_value()) {
                    func(m_value.value());
                }
                // and to further add or_else we return *this
                return *this;
            }

            // almost same here, just with return type void
            template<typename Func>
            void or_else(const Func&& func) const {
                if (!m_value.has_value()) {
                    func();
                }
            }
        private:
            // and since we don't know if we found a value
            // we hold possible one as optional
            std::optional<T> m_value{std::nullopt}; 
    };

    // this my find function, where try to find a value in a vector
    template<typename T>
    find_result<T> find(const std::vector<T>& v, const T value) {
        // like before we use the iterator
        auto it = std::find(v.begin(), v.end(), value);
        // and if we didn't find the value we return
        // find_result default constructed
        if (it == v.end()) {
            return find_result<T>();
        } else {
            // or with the found value
            return find_result<T>(*it);
        }
    }
} // namespace cwt

int main() {
    // lets create a simple vector of int values
    std::vector<int> v = {1,2,3,4};

    // we use our find function
    // and since we return find_result<int> 
    // we can append or call and_then or_else directly
    cwt::find(v, 2)
    .and_then([](int result){ std::cout << "found " << result << '\n'; })
    .or_else([](){ std::cout << "found nothing\n"; })
    ;

    cwt::find(v, 10)
    .and_then([](int result){ std::cout << "found " << result << '\n'; })
    .or_else([](){ std::cout << "found nothing\n"; })
    ;

    return 0;
}

看个乐

SIMD环节,需求,把 “20141103 012910”转成数字0x20141103012910

#include <x86intrin.h> // Windows: <intrin.h>
#include <string.h>

// From "20141103 012910", we want to get
// 0x20141103012910
uint64_t extract_nibbles(const char* c) {
  uint64_t part1, part2;
  memcpy(&part1, c, sizeof(uint64_t));
  memcpy(&part2 , c + 7, sizeof(uint64_t));
  part1 = _bswap64(part1);
  part2 = _bswap64(part2);
  part1 = _pext_u64(part1, 0x0f0f0f0f0f0f0f0f);
  part2 = _pext_u64(part2, 0x0f000f0f0f0f0f0f);
  return (part1<<24) | (part2);
}

汇编

movbe rax, QWORD PTR [rdi]
movbe rdx, QWORD PTR [rdi+7]
movabs rcx, 1085102592571150095
pext rax, rax, rcx
movabs rcx, 1080880467920490255
sal rax, 24
pext rdx, rdx, rcx
or rax, rdx

pext在amd zen3架构上开销很大,但本身也非常小巧了

ARM平台

#include <arm_neon.h>
// From "20141103 012910", we want to get
// 0x20141103012910
uint64_t extract_nibbles(const char *c) {
  const uint8_t *ascii = (const uint8_t *)(c);
  uint8x16_t in = vld1q_u8(ascii);
  // masking the high nibbles,
  in = vandq_u8(in, vmovq_n_u8(0x0f));
  // shuffle the bytes
  const uint8x16_t shuf = {14, 13, 12, 11, 10, 9, 7, 6,
    5, 4, 3, 2, 1, 0, 255, 255};
  in = vqtbl1q_u8(in, shuf);
  // then shift/or
  uint16x8_t ins =
    vsraq_n_u16(vreinterpretq_u16_u8(in),
    vreinterpretq_u16_u8(in), 4);
  // then narrow (16->8),
  int8x8_t packed = vmovn_u16(ins);
  // extract to general register.
  return vget_lane_u64(vreinterpret_u64_u16(packed), 0);
}

汇编

adrp x8, .LCPI0_0
ldr q1, [x0]
movi v0.16b, #15
ldr q2, [x8, :lo12:.LCPI0_0]
and v0.16b, v1.16b, v0.16b
tbl v0.16b, { v0.16b }, v2.16b
usra v0.8h, v0.8h, #4
xtn v0.8b, v0.8h
fmov x0, d0

SIMD环节,在一个字符串数组里找子串是否存在

普通实现,bsearsh

std::string *lookup_symbol(const char *input) {
  return bsearch(input, strings.data(), strings.size(),
  sizeof(std::string), compare);
}

或者trie

或者SIMD

代码我就不贴了,https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/blob/master/2023/07/13/benchmarks/benchmark.cpp

天书

参考阅读 https://trent.me/is-prefix-of-string-in-table/ 天书

这个讲的是局部性问题

比如

class my_class {
   int a;
   int b;
   ...
   int z;
};
int sum_all(my_class* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].a + m[i].z;
    }
    return sum;
}

循环用到了a和z,那么a和z就应该靠近点

class my_class {
   int a;
   int z;
   int b;
   ...
};

用不到的拆出来

class my_class {
   int m1;
   int m2;
   int m3;
};
int sum_all(my_class* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].m1 + m[i].m2;
    }
    return sum;
}

没用到m3,那就把它拿出来

class my_class_base {
   int m1;
   int m2;
};
class my_class_aux {
   int m3;
};
int sum_all(my_class_base* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].m1 + m[i].m2;
    }
    return sum;
}

同理,如果两个类有互相使用,就放在一起

class my_class1 {
    int m1;
    int m2;
};
class my_class2 {
    int a1;
    int a2;
}
int sum_all(my_class1* m1, my_class2* m2, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m1[i].m1 + m1[i].m2 + m2[i].a1;
    }
    return sum;
}

改成

class my_class1 {
    int m1;
    int m2;
    int a1;
};
class my_class2 {
    int a2;
};
int sum_all(my_class1* m1, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m1[i].m1 + m1[i].m2 + m1[i].a1;
    }
    return sum;
}

不要循环中访问指针

class my_class {
   int m1;
   int* p_a1;
};
int sum_all(my_class* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].m1 + *m[i].p_a1;
    }
    return sum;
}

这个pa1非常不合理,应该改成值

再比如这种猥琐的公用

class shared {
    int a1;
};
class my_class_1 {
   int m1;
   shared* s;
};
class my_class_2 {
   int m1;
   shared* s;
};
int sum_all_1(my_class_1* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].m1 + m[i].s->a1;
    }
    return sum;
}
int sum_all_2(my_class_2* m, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        sum += m[i].m1 + m[i].s->a1;
    }
    return sum;
}

更新s省事了但是实际上循环访问低效,也得改成值

Structure Of Arrays (SOA)结构体数组改成数组结构体

把数据集改小,比如

class big_class {
   int index;
   ...
};
void my_sort(std::vector<big_class>& v) {
    std::sort(v.begin(), v.end(), [](const big_class& l, const big_class& r) { return l.index < r.index; });
}

一堆不相关的数据参与了数据加载,可以改成这个

class small_class {
    int index;
    int pointer;
};
void my_sort(std::vector<big_class>& v) {
    std::vector<small_class> tmp;
    tmp.reserve(v.size());
    for (int i = 0; i < v.size(); i++) {
        tmp.push_back({v[i].index, i});
    }
    std::sort(tmp.begin(), tmp.end(), [](const small_class& l, const small_class& r) { return l.index < r.index; });
    std::vector<big_class> result;
    result.reserve(tmp.size());
    for (int i = 0; i < tmp.size(); i++) {
        result.push_back(v[tmp[i].index]);
    }
    v = std::move(result);
}

这种改法得测量一下,未必有收益,可能有,但不多,主要取决于数据结构,只要bigclass比smallclass大很多的话,收益肯定是有的

你用std::vector::resize缩容,提示构造告警

std::vector<Thing> things;
things.resize(n); // keep only the first n

但你是缩容,不需要构造啊。问题在于resize包括扩容和缩容,不知道你是缩容

还是erase吧

things.erase(things.begin() + n, things.end());

看不懂

开源项目需要人手

新项目介绍/版本更新


本文永久链接

如果有疑问评论最好在上面链接到评论区里评论,这样方便搜索,微信公众号有点封闭/知乎吞评论

看到这里或许你有建议或者疑问或者指出错误,请留言评论! 多谢! 你的评论非常重要!也可以帮忙点赞收藏转发!多谢支持! 觉得写的不错那就给点吧, 在线乞讨 微信转账