公众号
RSS https://github.com/wanghenshui/cppweeklynews/releases.atom
欢迎投稿,推荐或自荐文章/软件/资源等
标准委员会动态/ide/编译器信息放在这里
#include cleanup in Visual Studio
支持提示删除没用的头文件,我记得clion是不是早就支持了?
编译器信息最新动态推荐关注hellogcc公众号 本周更新 2023-07-12 第210期
支持grpc-plugin了,那么别的支持也能实现,感觉更强了,可以用大项目试试
auto $dollar_sign = 42;
auto @commerical_at = 42;
auto `grave_accent = 42;
没懂有啥用
在某群和群友讨论 std::string能不能存带 ‘/0’的字符,我之前遇到过坑,就想当然的认为不能,然后被群友教育了一种用法
#include <iostream>
#include <vector>
#include <string>
int main() {
std::string s1 = std::string{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0123"};
std::cout<<"---\n";
std::cout<< s1.size()<<"\n";
std::cout<<"---\n";
std::string s2;
s2.append("\0");
s2.append("1");
std::cout<<"---\n";
std::cout<< s2.size()<<"\n";
std::cout<< s2<<"\n";
std::cout<<"---\n";
std::vector<char> v = {'\0','\0','2','3','\0','5','6'};
std::string s3(v.begin(), v.end());
std::cout<<"---\n";
std::cout<< s3.size()<<"\n";
std::cout<< s3<<"\n";
std::cout<<"---\n";
std::string s4 = std::string{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0a23", 17};
std::cout<<"---\n";
std::cout<< s4.size()<<"\n";
std::cout<< s4<<"\n";
std::cout<<"---\n";
return 0;
}
s1 s2都是截断的,但是s3不是,想要std::string包含/0只能通过迭代器构造,不能从c字符串构造,因为c字符串复制判断结尾用的/0
同理,s4加上了长度构造,就包含/0不会截断了。c字符串的缺陷,没有长度信息
我这里先入为主觉得所有构造都这样了。稍为想的远一点
boost::concurrent_flat_map
开链的并发hashmap速度不输tbb boost 1.83发布
有点意思
经典的友元函数注入
template<int N> struct tag{};
template<typename T, int N>
struct loophole_t {
friend auto loophole(tag<N>) { return T{}; };
};
auto loophole(tag<0>);
sizeof(loophole_t<std::string, 0> );
statc_assert(std::is_same<std::string, decltype(loophole(tag<0>{})) >::value, "same");
这玩意属于缺陷,说不定以后就修了,为啥要讲这个老文章?看下面这个
看代码
//Conceptify it (Requires C++20)
struct Drawable {
void draw(std::ostream &out) const {
te::call([](auto const &self, auto &out)
-> decltype(self.draw(out)) { self.draw(out); }, *this, out);
}
};
struct Square {
void draw(std::ostream &out) const { out << "Square"; }
};
template<te::conceptify<Drawable> TDrawable>
void draw(TDrawable const &drawable) {
drawable.draw(std::cout);
}
int main() {
auto drawable = Square{};
draw(drawable); // prints Square
}
我咋感觉te::conceptify<Drawable>
看上去和实现concept没区别?
这种实现能定义te::poly<Drawable>
保存在容器里然后遍历?
实现原理是通过一个mapping类注册类型和typelist,typelist绑上lambda, 我说的不精确,可以看原文
call注册
template <...>
constexpr auto call(const TExpr expr, const I &interface, Ts &&... args) {
...
return detail::call_impl<I>(...);
}
template <...>
constexpr auto call_impl(...) {
void(typename mappings<I, N>::template set<type_list<TExpr, Ts...> >{});
return ...;
}
mapping是这样的
template <class, std::size_t>
struct mappings final {
friend auto get(mappings);
template <class T>
struct set {
friend auto get(mappings) { return T{}; }
};
};
通过这个友元注入get 来记住T类型,也就是之前保存的typelist,构造出lambda
template <class T, class TExpr, class... Ts>
constexpr auto requires_impl(type_list<TExpr, Ts...>)
-> decltype(&TExpr::template operator()<T, Ts...>);
template <class I, class T, std::size_t... Ns>
constexpr auto requires_impl(std::index_sequence<Ns...>) -> type_list<
decltype(requires_impl<I>(decltype(get(mappings<T, Ns + 1>{})){}))...>;
} // namespace detail
template <class I, class T>
concept bool conceptify = requires {
detail::requires_impl<I, T>(
std::make_index_sequence<detail::mappings_size<T, I>()>{});
};
挺巧妙,友元函数注入+typelist绑定
问题在于lambda每次都是构造的,可能有小对象问题,但愿编译器能优化掉
find还要判断结果,很烦,类似optional,封装一下
#include <iostream>
#include <vector>
#include <algorithm>
#include <optional>
namespace cwt {
// first we craete a type which will hold our find result
template<typename T>
class find_result {
public:
find_result() = default;
find_result(T value) : m_value{value} {}
// this is the and_then method which gets a callback
template<typename Func>
const find_result<T>& and_then(const Func&& func) const {
// we only call the callback if we have a value
if (m_value.has_value()) {
func(m_value.value());
}
// and to further add or_else we return *this
return *this;
}
// almost same here, just with return type void
template<typename Func>
void or_else(const Func&& func) const {
if (!m_value.has_value()) {
func();
}
}
private:
// and since we don't know if we found a value
// we hold possible one as optional
std::optional<T> m_value{std::nullopt};
};
// this my find function, where try to find a value in a vector
template<typename T>
find_result<T> find(const std::vector<T>& v, const T value) {
// like before we use the iterator
auto it = std::find(v.begin(), v.end(), value);
// and if we didn't find the value we return
// find_result default constructed
if (it == v.end()) {
return find_result<T>();
} else {
// or with the found value
return find_result<T>(*it);
}
}
} // namespace cwt
int main() {
// lets create a simple vector of int values
std::vector<int> v = {1,2,3,4};
// we use our find function
// and since we return find_result<int>
// we can append or call and_then or_else directly
cwt::find(v, 2)
.and_then([](int result){ std::cout << "found " << result << '\n'; })
.or_else([](){ std::cout << "found nothing\n"; })
;
cwt::find(v, 10)
.and_then([](int result){ std::cout << "found " << result << '\n'; })
.or_else([](){ std::cout << "found nothing\n"; })
;
return 0;
}
看个乐
SIMD环节,需求,把 “20141103 012910”转成数字0x20141103012910
#include <x86intrin.h> // Windows: <intrin.h>
#include <string.h>
// From "20141103 012910", we want to get
// 0x20141103012910
uint64_t extract_nibbles(const char* c) {
uint64_t part1, part2;
memcpy(&part1, c, sizeof(uint64_t));
memcpy(&part2 , c + 7, sizeof(uint64_t));
part1 = _bswap64(part1);
part2 = _bswap64(part2);
part1 = _pext_u64(part1, 0x0f0f0f0f0f0f0f0f);
part2 = _pext_u64(part2, 0x0f000f0f0f0f0f0f);
return (part1<<24) | (part2);
}
汇编
movbe rax, QWORD PTR [rdi]
movbe rdx, QWORD PTR [rdi+7]
movabs rcx, 1085102592571150095
pext rax, rax, rcx
movabs rcx, 1080880467920490255
sal rax, 24
pext rdx, rdx, rcx
or rax, rdx
pext在amd zen3架构上开销很大,但本身也非常小巧了
ARM平台
#include <arm_neon.h>
// From "20141103 012910", we want to get
// 0x20141103012910
uint64_t extract_nibbles(const char *c) {
const uint8_t *ascii = (const uint8_t *)(c);
uint8x16_t in = vld1q_u8(ascii);
// masking the high nibbles,
in = vandq_u8(in, vmovq_n_u8(0x0f));
// shuffle the bytes
const uint8x16_t shuf = {14, 13, 12, 11, 10, 9, 7, 6,
5, 4, 3, 2, 1, 0, 255, 255};
in = vqtbl1q_u8(in, shuf);
// then shift/or
uint16x8_t ins =
vsraq_n_u16(vreinterpretq_u16_u8(in),
vreinterpretq_u16_u8(in), 4);
// then narrow (16->8),
int8x8_t packed = vmovn_u16(ins);
// extract to general register.
return vget_lane_u64(vreinterpret_u64_u16(packed), 0);
}
汇编
adrp x8, .LCPI0_0
ldr q1, [x0]
movi v0.16b, #15
ldr q2, [x8, :lo12:.LCPI0_0]
and v0.16b, v1.16b, v0.16b
tbl v0.16b, { v0.16b }, v2.16b
usra v0.8h, v0.8h, #4
xtn v0.8b, v0.8h
fmov x0, d0
SIMD环节,在一个字符串数组里找子串是否存在
普通实现,bsearsh
std::string *lookup_symbol(const char *input) {
return bsearch(input, strings.data(), strings.size(),
sizeof(std::string), compare);
}
或者trie
或者SIMD
代码我就不贴了,https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/blob/master/2023/07/13/benchmarks/benchmark.cpp
天书
参考阅读 https://trent.me/is-prefix-of-string-in-table/ 天书
这个讲的是局部性问题
比如
class my_class {
int a;
int b;
...
int z;
};
int sum_all(my_class* m, int n) {
int sum = 0;
for (int i = 0; i < n; i++) {
sum += m[i].a + m[i].z;
}
return sum;
}
循环用到了a和z,那么a和z就应该靠近点
class my_class {
int a;
int z;
int b;
...
};
用不到的拆出来
class my_class {
int m1;
int m2;
int m3;
};
int sum_all(my_class* m, int n) {
int sum = 0;
for (int i = 0; i < n; i++) {
sum += m[i].m1 + m[i].m2;
}
return sum;
}
没用到m3,那就把它拿出来
class my_class_base {
int m1;
int m2;
};
class my_class_aux {
int m3;
};
int sum_all(my_class_base* m, int n) {
int sum = 0;
for (int i = 0; i < n; i++) {
sum += m[i].m1 + m[i].m2;
}
return sum;
}
同理,如果两个类有互相使用,就放在一起
class my_class1 {
int m1;
int m2;
};
class my_class2 {
int a1;
int a2;
}
int sum_all(my_class1* m1, my_class2* m2, int n) {
int sum = 0;
for (int i = 0; i < n; i++) {
sum += m1[i].m1 + m1[i].m2 + m2[i].a1;
}
return sum;
}
改成
class my_class1 {
int m1;
int m2;
int a1;
};
class my_class2 {
int a2;
};
int sum_all(my_class1* m1, int n) {
int sum = 0;
for (int i = 0; i < n; i++) {
sum += m1[i].m1 + m1[i].m2 + m1[i].a1;
}
return sum;
}
不要循环中访问指针
class my_class {
int m1;
int* p_a1;
};
int sum_all(my_class* m, int n) {
int sum = 0;
for (int i = 0; i < n; i++) {
sum += m[i].m1 + *m[i].p_a1;
}
return sum;
}
这个pa1非常不合理,应该改成值
再比如这种猥琐的公用
class shared {
int a1;
};
class my_class_1 {
int m1;
shared* s;
};
class my_class_2 {
int m1;
shared* s;
};
int sum_all_1(my_class_1* m, int n) {
int sum = 0;
for (int i = 0; i < n; i++) {
sum += m[i].m1 + m[i].s->a1;
}
return sum;
}
int sum_all_2(my_class_2* m, int n) {
int sum = 0;
for (int i = 0; i < n; i++) {
sum += m[i].m1 + m[i].s->a1;
}
return sum;
}
更新s省事了但是实际上循环访问低效,也得改成值
Structure Of Arrays (SOA)结构体数组改成数组结构体
把数据集改小,比如
class big_class {
int index;
...
};
void my_sort(std::vector<big_class>& v) {
std::sort(v.begin(), v.end(), [](const big_class& l, const big_class& r) { return l.index < r.index; });
}
一堆不相关的数据参与了数据加载,可以改成这个
class small_class {
int index;
int pointer;
};
void my_sort(std::vector<big_class>& v) {
std::vector<small_class> tmp;
tmp.reserve(v.size());
for (int i = 0; i < v.size(); i++) {
tmp.push_back({v[i].index, i});
}
std::sort(tmp.begin(), tmp.end(), [](const small_class& l, const small_class& r) { return l.index < r.index; });
std::vector<big_class> result;
result.reserve(tmp.size());
for (int i = 0; i < tmp.size(); i++) {
result.push_back(v[tmp[i].index]);
}
v = std::move(result);
}
这种改法得测量一下,未必有收益,可能有,但不多,主要取决于数据结构,只要bigclass比smallclass大很多的话,收益肯定是有的
你用std::vector::resize缩容,提示构造告警
std::vector<Thing> things;
things.resize(n); // keep only the first n
但你是缩容,不需要构造啊。问题在于resize包括扩容和缩容,不知道你是缩容
还是erase吧
things.erase(things.begin() + n, things.end());
看不懂
如果有疑问评论最好在上面链接到评论区里评论,这样方便搜索,微信公众号有点封闭/知乎吞评论