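C++ Chinese Weekly, Issue 49

A digest of C++ news gathered from reddit/hackernews/lobsters/meetingcpp1 meetingcpp2

Weekly project repository | Online version | Zhihu column

Tencent Cloud+ Community

Submissions are welcome: to recommend (or self-recommend) articles, software, or other resources, please open an issue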

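News

Standards committee updates and IDE/compiler news go here

For the latest compiler news, follow the hellogcc WeChat official account. This week's updates: 2022-02-02 Issue 135, 2022-02-09 Issue 136

Articles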

Did you know that C++23 added std::byteswap to swap bytes?

#include <bit>
#include <cstdint>
#include <iostream>

int main() {
   constexpr auto value = std::uint16_t(0xCAFE);
   std::cout << std::hex << value << '\n';                // prints cafe
   std::cout << std::hex << std::byteswap(value) << '\n'; // prints feca
}

Nothing much to add.
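For toolchains that do not ship std::byteswap yet, the same effect can be sketched by hand. The helper names byteswap16 and byteswap32 below are hypothetical, not part of any library; this is only a minimal illustration for two fixed widths.

```cpp
#include <cstdint>

// Hypothetical helpers (not standard): swap the bytes of fixed-width unsigned integers.
constexpr std::uint16_t byteswap16(std::uint16_t v) {
    // The operands are promoted to int, so cast back to 16 bits at the end.
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

constexpr std::uint32_t byteswap32(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

// Checked at compile time, mirroring the 0xCAFE example above.
static_assert(byteswap16(0xCAFE) == 0xFECA);
static_assert(byteswap32(0xDEADBEEFu) == 0xEFBEADDEu);
```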

Did you know that C++20 added __VA_OPT__ for comma omission and comma deletion?

#define VARIADIC(...) __VA_OPT__(__LINE__)

VARIADIC()     // `empty`
VARIADIC(a)    // `line` 4
VARIADIC(a, b) // `line` 5

Demo: https://godbolt.org/z/rsj9ax7xY

Consider this scenario:

#define FOO(...)       printf(__VA_ARGS__)
#define BAR(fmt, ...)  printf(fmt, __VA_ARGS__)

FOO("this works fine");
BAR("this breaks!");

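In the last line the expansion leaves a trailing comma, so the call does not compile. How do we swallow that comma?

With a GCC extension: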

#define BAR(fmt, ...)  printf(fmt "\n", ##__VA_ARGS__)

BAR("here is a log message");
BAR("here is a log message with a param: %d", 42);

Or use __VA_OPT__. I have a feeling Boost.PP has something like this too.
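As a minimal sketch of the __VA_OPT__ variant: the macro below emits the comma only when extra arguments are present. BAR_TO_BUFFER and the two helper functions are hypothetical names, and the macro formats into a buffer instead of calling printf only so the result is easy to check. It assumes a compiler with C++20 __VA_OPT__ support.

```cpp
#include <cstdio>
#include <string>

// Hypothetical macro: like BAR above, but writes into a char array.
// __VA_OPT__(,) expands to a comma only if __VA_ARGS__ is non-empty.
#define BAR_TO_BUFFER(buf, fmt, ...) \
    std::snprintf(buf, sizeof(buf), fmt "\n" __VA_OPT__(,) __VA_ARGS__)

// No variadic arguments: no trailing comma is generated.
inline std::string log_without_param() {
    char buf[64];
    BAR_TO_BUFFER(buf, "here is a log message");
    return buf;
}

// One variadic argument: the comma is generated.
inline std::string log_with_param() {
    char buf[64];
    BAR_TO_BUFFER(buf, "here is a log message with a param: %d", 42);
    return buf;
}
```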

Also, how do you check whether the compiler supports this macro? See here:

#define PP_THIRD_ARG(a,b,c,...) c
#define VA_OPT_SUPPORTED_I(...) PP_THIRD_ARG(__VA_OPT__(,),true,false,)
#define VA_OPT_SUPPORTED VA_OPT_SUPPORTED_I(?)

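PP_THIRD_ARG picks out only its third argument. If __VA_OPT__ is supported, PP_THIRD_ARG(__VA_OPT__(,),true,false,) expands to PP_THIRD_ARG(,,true,false,), whose third argument is true. If __VA_OPT__ is not expanded, __VA_OPT__(,) stays a single argument and the third argument is false. Quite neat.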

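Implementing an automatically determined number of virtual functions with the Boost libraries

Just for fun. This is very much using a sledgehammer to crack a nut.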

SWAR explained: parsing eight digits

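We covered this before: it is about how to parse a string of digits faster. Click to review.

SWAR is one of those methods, and here the author walks through the principle again.

The simplest possible version: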

#include <cstddef>
#include <cstdint>

uint32_t parse_eight_digits(const unsigned char *chars) {
  uint32_t x = chars[0] - '0';
  for (size_t j = 1; j < 8; j++)
    x = x * 10 + (chars[j] - '0');
  return x;
}

Validity is not considered here: the input is not checked.

Typically the compiler unrolls the loop into something like this:

        movzx   eax, byte ptr [rdi]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 1]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 2]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 3]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 4]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 5]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 6]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 7]
        lea     eax, [rcx + 2*rax]
        add     eax, -533333328

Many of these instructions are the same. A tidier version:

        imul    rax, qword ptr [rdi], 2561
        movabs  rcx, -1302123111085379632
        add     rcx, rax
        shr     rcx, 8
        movabs  rax, 71777214294589695
        and     rax, rcx
        imul    rax, rax, 6553601
        shr     rax, 16
        movabs  rcx, 281470681808895
        and     rcx, rax
        movabs  rax, 42949672960001
        imul    rax, rcx
        shr     rax, 32

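How can we write code that directly produces the latter assembly?

SWAR stands for SIMD within a register. The idea is to use the full width of a register so that each instruction does more work, as the assembly above shows.

To save on computation, we need to do arithmetic on all 8 bytes at once.

Next comes the construction.

Rather than subtracting '0' from each byte one at a time, subtract 0x30 from all eight bytes at once: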

val = val - 0x3030303030303030;

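If your digit string is "12345678", the value after the subtraction is 0x0807060504030201 in hexadecimal: on a little-endian machine the first character sits in the lowest byte, and each byte now holds one digit.

Then multiply and shift so that adjacent digits are combined into two-digit values: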

val = (val * 10) + (val >> 8);

The final result looks like this:

uint32_t  parse_eight_digits_unrolled(uint64_t val) {
  const uint64_t mask = 0x000000FF000000FF;
  const uint64_t mul1 = 0x000F424000000064; // 100 + (1000000ULL << 32)
  const uint64_t mul2 = 0x0000271000000001; // 1 + (10000ULL << 32)
  val -= 0x3030303030303030;
  val = (val * 10) + (val >> 8); // val = (val * 2561) >> 8;
  val = (((val & mask) * mul1) + (((val >> 16) & mask) * mul2)) >> 32;
  return val;
}
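To try the function on an actual string, the eight characters first have to be loaded into a 64-bit integer. The sketch below does that with std::memcpy and repeats the same arithmetic. The name parse_eight_digits_swar is hypothetical, and the code assumes a little-endian machine and exactly eight valid ASCII digits, with no validation.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical wrapper: load 8 ASCII digits and parse them with the SWAR arithmetic above.
// Assumes little-endian byte order and valid input.
inline std::uint32_t parse_eight_digits_swar(const char* chars) {
    std::uint64_t val;
    std::memcpy(&val, chars, sizeof(val)); // unaligned-safe 8-byte load

    const std::uint64_t mask = 0x000000FF000000FF;
    const std::uint64_t mul1 = 0x000F424000000064; // 100 + (1000000ULL << 32)
    const std::uint64_t mul2 = 0x0000271000000001; // 1 + (10000ULL << 32)

    val -= 0x3030303030303030;     // '0'..'9' -> 0..9 in every byte
    val = (val * 10) + (val >> 8); // even bytes now hold two-digit values (12, 34, 56, 78)
    // Combine the four two-digit values; the result lands in the upper 32 bits.
    val = (((val & mask) * mul1) + (((val >> 16) & mask) * mul2)) >> 32;
    return static_cast<std::uint32_t>(val);
}
```

For "12345678" the even bytes after the second step are 12, 34, 56 and 78, and the last step computes 12 * 1000000 + 34 * 10000 + 56 * 100 + 78.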

The same kind of trick shows up elsewhere, for example:

public static int bitCount(int i) {
    // HD, Figure 5-2
    i = i - ((i >>> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
    i = (i + (i >>> 4)) & 0x0f0f0f0f;
    i = i + (i >>> 8);
    i = i + (i >>> 16);
    return i & 0x3f;
}

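How it works: HD refers to the book Hacker's Delight.

The construction is really the hardest part: you have to work out how to piece the result together with arithmetic.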
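The Java snippet above ports to C++ almost line for line, since unsigned right shifts in C++ are already logical. The sketch below is such a port; the name bit_count is hypothetical, and C++20 code would normally call std::popcount instead.

```cpp
#include <cstdint>

// Hypothetical C++ port of the bitCount snippet above.
constexpr int bit_count(std::uint32_t i) {
    i = i - ((i >> 1) & 0x55555555u);                 // counts per 2-bit field
    i = (i & 0x33333333u) + ((i >> 2) & 0x33333333u); // counts per 4-bit field
    i = (i + (i >> 4)) & 0x0f0f0f0fu;                 // counts per byte
    i = i + (i >> 8);                                 // fold the bytes together
    i = i + (i >> 16);
    return static_cast<int>(i & 0x3fu);               // the total fits in 6 bits
}

static_assert(bit_count(0u) == 0);
static_assert(bit_count(0xCAFEu) == 11);
static_assert(bit_count(0xFFFFFFFFu) == 32);
```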

On finding the average of two unsigned integers without overflow

Finding the average without overflowing:

unsigned average(unsigned a, unsigned b)
{
    return (a + b) / 2;
}

This can obviously overflow.

If you know which of the two numbers is larger:

unsigned average(unsigned low, unsigned high)
{
    return low + (high - low) / 2;
}

There is also a way that does not need to know which is larger:

unsigned average(unsigned a, unsigned b)
{
    return (a / 2) + (b / 2) + (a & b & 1);
}

The SWAR approach, of course, takes fewer operations:

unsigned average(unsigned a, unsigned b)
{
    return (a & b) + (a ^ b) / 2;
}

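The principle: a + b = ((a & b) << 1) + (a ^ b). The two terms are the carries and the carry-less sum, respectively, so halving both gives (a & b) + (a ^ b) / 2.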
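The identity can be checked at compile time on values where the naive (a + b) / 2 wraps around. This is a small sketch; average_swar and kMax are names made up for the illustration.

```cpp
#include <limits>

// (a & b) holds the bits set in both inputs (each contributes fully to the average);
// (a ^ b) holds the bits set in exactly one input (each contributes half).
constexpr unsigned average_swar(unsigned a, unsigned b) {
    return (a & b) + (a ^ b) / 2;
}

constexpr unsigned kMax = std::numeric_limits<unsigned>::max();

static_assert(average_swar(3, 4) == 3);                  // rounds down
static_assert(average_swar(kMax, kMax) == kMax);         // no overflow
static_assert(average_swar(kMax, kMax - 2) == kMax - 1);
static_assert((kMax + kMax) / 2 != kMax);                // the naive form wraps around
```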

The author also discusses implementations on different platforms. Worth a read if you enjoy digging into details.

Technique: Compile Time Code Generation and Optimization

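The author wrote a brainfuck compiler that leans on constexpr and compiler optimizations. If you do not know brainfuck, look it up first; the code is pasted directly below.

One implementation:

#include <cstddef>
#include <cstdio>

enum class op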
{
    ptr_inc,     // >
    ptr_dec,     // <
    data_inc,    // +
    data_dec,    // -
    write,       // .
    read,        // ,
    jmp_ifz,     // [, jump if zero
    jmp,         // ], unconditional jump
};

template <std::size_t InstructionCapacity>
struct program
{
    std::size_t inst_count;
    op          inst[InstructionCapacity];
    std::size_t inst_jmp[InstructionCapacity];
};



template <std::size_t InstructionCapacity>
void execute(const program<InstructionCapacity>& program,
             unsigned char* data_ptr)
{
    auto inst_ptr = std::size_t(0);
    while (inst_ptr < program.inst_count)
    {
        switch (program.inst[inst_ptr])
        {
        case op::ptr_inc:
            ++data_ptr;
            ++inst_ptr;
            break;
        case op::ptr_dec:
            --data_ptr;
            ++inst_ptr;
            break;
        case op::data_inc:
            ++*data_ptr;
            ++inst_ptr;
            break;
        case op::data_dec:
            --*data_ptr;
            ++inst_ptr;
            break;
        case op::write:
            std::putchar(*data_ptr);
            ++inst_ptr;
            break;
        case op::read:
            *data_ptr = static_cast<unsigned char>(std::getchar());
            ++inst_ptr;
            break;
        case op::jmp_ifz:
            if (*data_ptr == 0)
                inst_ptr = program.inst_jmp[inst_ptr];
            else
                ++inst_ptr;
            break;
        case op::jmp:
            inst_ptr = program.inst_jmp[inst_ptr];
            break;
        }
    }
}



template <std::size_t N>
constexpr auto parse(const char (&str)[N])
{
    program<N> result{};

    std::size_t jump_stack[N] = {};
    std::size_t jump_stack_top = 0;

    for (auto ptr = str; *ptr; ++ptr)
    {
        if (*ptr ==  '>')
            result.inst[result.inst_count++] = op::ptr_inc;
        else if (*ptr ==  '<')
            result.inst[result.inst_count++] = op::ptr_dec;
        else if (*ptr ==  '+')
            result.inst[result.inst_count++] = op::data_inc;
        else if (*ptr ==  '-')
            result.inst[result.inst_count++] = op::data_dec;
        else if (*ptr ==  '.')
            result.inst[result.inst_count++] = op::write;
        else if (*ptr ==  ',')
            result.inst[result.inst_count++] = op::read;
        else if (*ptr == '[')
        {
            jump_stack[jump_stack_top++] = result.inst_count;
            result.inst[result.inst_count++] = op::jmp_ifz;
        }
        else if (*ptr == ']')
        {
            auto open = jump_stack[--jump_stack_top];
            auto close = result.inst_count++;

            result.inst[close] = op::jmp;
            result.inst_jmp[close] = open;

            result.inst_jmp[open] = close + 1;
        }
    }

    return result;
}

How to use it:

// `x = std::getchar(); y = x + 3; std::putchar(y);`
static constexpr auto add3 = parse(",>+++<[->+<]>.");

// Use this array for our data_ptr.
unsigned char memory[1024] = {};
execute(add3, memory);

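Nothing too difficult.

If you want to play with JIT optimization, see Eli Bendersky's articles.

That is not the point here, though. The point is constexpr.

Using tail recursion: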

template <std::size_t InstructionCapacity>
void execute(const program<InstructionCapacity>& program,
             unsigned char* data_ptr,
             std::size_t inst_ptr = 0)
{
    if (inst_ptr >= program.inst_count)
        return; // Execution is finished.

    switch (program.inst[inst_ptr])
    {
    case op::ptr_inc:
        ++data_ptr;
        return execute(program, data_ptr, inst_ptr + 1);
    case op::ptr_dec:
        --data_ptr;
        return execute(program, data_ptr, inst_ptr + 1);
    case op::data_inc:
        ++*data_ptr;
        return execute(program, data_ptr, inst_ptr + 1);
    case op::data_dec:
        --*data_ptr;
        return execute(program, data_ptr, inst_ptr + 1);
    case op::write:
        std::putchar(*data_ptr);
        return execute(program, data_ptr, inst_ptr + 1);
    case op::read:
        *data_ptr = static_cast<unsigned char>(std::getchar());
        return execute(program, data_ptr, inst_ptr + 1);
    case op::jmp_ifz:
        if (*data_ptr == 0)
            return execute(program, data_ptr, program.inst_jmp[inst_ptr]);
        else
            return execute(program, data_ptr, inst_ptr + 1);
    case op::jmp:
        return execute(program, data_ptr, program.inst_jmp[inst_ptr]);
    }
}

Constexpr + template

template <const auto& Program, std::size_t InstPtr = 0>
constexpr void execute(unsigned char* data_ptr)
{
    if constexpr (InstPtr >= Program.inst_count)
    {
        // Execution is finished.
        return;
    }
    else if constexpr (Program.inst[InstPtr] == op::ptr_inc)
    {
        ++data_ptr;
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::ptr_dec)
    {
        --data_ptr;
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::data_inc)
    {
        ++*data_ptr;
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::data_dec)
    {
        --*data_ptr;
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::write)
    {
        std::putchar(*data_ptr);
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::read)
    {
        *data_ptr = static_cast<unsigned char>(std::getchar());
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::jmp_ifz)
    {
        if (*data_ptr == 0)
            return execute<Program, Program.inst_jmp[InstPtr]>(data_ptr);
        else
            return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::jmp)
    {
        return execute<Program, Program.inst_jmp[InstPtr]>(data_ptr);
    }
}

The usage then becomes:

    // `x = std::getchar(); y = x + 3; std::putchar(y);`
    static constexpr auto add3 = parse(",>+++<[->+<]>.");

    // Use this array for our data_ptr.
    unsigned char memory[1024] = {};
    execute<add3>(memory);

Old version on godbolt: https://godbolt.org/z/MaPffqxGT

New version on godbolt: https://godbolt.org/z/Gd3zKWvKE

With -O2 the compiler can work out the result directly. The benefit of constexpr is quite evident.
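To make the compile-time angle concrete, here is a much smaller sketch than the article's code: a constexpr interpreter for brainfuck programs that do no I/O, evaluated entirely inside a static_assert. The function run_without_io is hypothetical and not the author's design; it assumes a well-formed program that stays within 64 cells.

```cpp
#include <cstddef>

// Hypothetical helper: run a brainfuck program without ',' or '.' and return
// the cell under the data pointer when the program ends.
template <std::size_t N>
constexpr unsigned char run_without_io(const char (&src)[N]) {
    unsigned char memory[64] = {};
    std::size_t data = 0;
    for (std::size_t pc = 0; pc + 1 < N; ++pc) { // pc + 1 < N skips the trailing '\0'
        switch (src[pc]) {
        case '>': ++data; break;
        case '<': --data; break;
        case '+': ++memory[data]; break;
        case '-': --memory[data]; break;
        case '[':
            if (memory[data] == 0) { // jump forward to the matching ']'
                for (std::size_t depth = 1; depth > 0;) {
                    ++pc;
                    if (src[pc] == '[') ++depth;
                    else if (src[pc] == ']') --depth;
                }
            }
            break;
        case ']':
            if (memory[data] != 0) { // jump back to the matching '['
                for (std::size_t depth = 1; depth > 0;) {
                    --pc;
                    if (src[pc] == ']') ++depth;
                    else if (src[pc] == '[') --depth;
                }
            }
            break;
        default: break; // every other character is ignored
        }
    }
    return memory[data];
}

// Same shape as add3 above, with the input hard-coded to 4: 4 + 3 == 7.
static_assert(run_without_io("++++>+++<[->+<]>") == 7);
```

Because the call is a constant expression, the compiler has to run the whole program while compiling. A malformed program that walks out of bounds becomes a compile error instead of undefined behavior at run time.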

CTRE works on essentially the same principle: CTRE library

A refresher on std::condition_variable_any

std::condition_variable_any cond;
boost::shared_mutex m;

void foo() {
    boost::shared_lock<boost::shared_mutex> lk(m);
    while(!some_condition()) {
        cond.wait(lk);
    }
}

C++20:

 void testInterruptibleCVWait()
 {
     bool ready = false;
     std::mutex readyMutex;
     std::condition_variable_any readyCV;
 
     std::jthread t([&ready, &readyMutex, &readyCV] (std::stop_token st)
     {
         while (...)
         {
 
             ...
             {
                 std::unique_lock lg{readyMutex};
                 readyCV.wait(lg, st, [&ready] {return ready; });
                 // also ends wait on stop request for st
             }
             ...
         }
    });
 ...
 } // jthread destructor signals stop request and therefore unblocks the CV wait and ends the started thread

volatile means it really happens

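Unless you know exactly what you need it for, do not use volatile. Marking an object volatile tells the compiler that its value may change outside the program's control and that every access counts as a side effect, so the compiler cannot optimize those accesses away.

One scenario: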

void blink_twice() {
    volatile int *off = reinterpret_cast<int*>(0x40225190);
    volatile int *on = reinterpret_cast<int*>(0x40225194);
    int flag = (1 << 22);

    *on = flag;
    sleep(1);
    *off = flag;
    sleep(1);
    *on = flag;
    sleep(1);
    *off = flag;
}

Writing to a GPIO interface (driving a microcontroller board or the like).

A minimally interesting typo-bug

struct MoveOnlyWidget {
    MoveOnlyWidget(int);
    MoveOnlyWidget(const MoveOnlyWidget&) = delete;
    MoveOnlyWidget(MoveOnlyWidget&&) = default;
    MoveOnlyWidget& operator=(const MoveOnlyWidget&) = delete;
    MoveOnlyWidget& operator=(MoveOnlyWidget&&) = default;
    MoveOnlyWidget() = default;
};

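There are now two constructors: the last line, MoveOnlyWidget() = default, declares a default constructor where a destructor, ~MoveOnlyWidget() = default, was presumably intended.

The suggestion is to use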

static_assert(!std::is_constructible_v<MoveOnlyWidget>);
static_assert(!std::is_default_constructible_v<MoveOnlyWidget>);
static_assert(!std::default_initializable<MoveOnlyWidget>);

compile-time unit tests like these to catch such small mistakes right away.
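Put together, the corrected class and its compile-time tests might look like the sketch below. It assumes the last line was meant to be the destructor, and it gives the int constructor an empty body only so the snippet is complete.

```cpp
#include <type_traits>

struct MoveOnlyWidget {
    MoveOnlyWidget(int) {}
    MoveOnlyWidget(const MoveOnlyWidget&) = delete;
    MoveOnlyWidget(MoveOnlyWidget&&) = default;
    MoveOnlyWidget& operator=(const MoveOnlyWidget&) = delete;
    MoveOnlyWidget& operator=(MoveOnlyWidget&&) = default;
    ~MoveOnlyWidget() = default; // assumed intent: destructor, not default constructor
};

// With the typo these first two assertions would fail to compile.
static_assert(!std::is_constructible_v<MoveOnlyWidget>);
static_assert(!std::is_default_constructible_v<MoveOnlyWidget>);

// The properties the class is meant to have.
static_assert(std::is_constructible_v<MoveOnlyWidget, int>);
static_assert(!std::is_copy_constructible_v<MoveOnlyWidget>);
static_assert(std::is_move_constructible_v<MoveOnlyWidget>);
```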

C++ Templates: How to Iterate through std::tuple: the Basics

std::tuple tp { 10, 20, 3.14, 42, "hello"};
printTuple(tp);

How would you implement printTuple?

First, try getting the elements one at a time:

std::tuple tp {42, 10.5, "hello"};
std::cout << std::get<0>(tp) << ", ";
std::cout << std::get<1>(tp) << ", ";
std::cout << std::get<2>(tp) << ", ";

Iterating over a tuple means calling std::get several times, which boils down to generating the template argument N for each call:

template <typename T>
void printElem(const T& x) {
    std::cout << x << ',';
}

template <typename TupleT, std::size_t... Is>
void printTupleManual(const TupleT& tp) {
    (printElem(std::get<Is>(tp)), ...);
}


std::tuple tp { 10, 20, "hello"};
printTupleManual<decltype(tp), 0, 1, 2>(tp);

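The remaining problem is how to generate the Is list, so that the explicit template arguments of printTupleManual can go away.

std::index_sequence can generate it, so all we need is the size: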

template <typename T>
void printElem(const T& x) {
    std::cout << x << ',';
}

template <typename TupleT, std::size_t... Is>
void printTupleManual(const TupleT& tp, std::index_sequence<Is...>) {
    (printElem(std::get<Is>(tp)), ...);
}


std::tuple tp { 10, 20, "hello"};
printTupleManual(tp, std::make_index_sequence<3>{});
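To see what the second argument actually is, the type produced by std::make_index_sequence can be inspected at compile time. The helper countIndices is a hypothetical name used only for this illustration.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// std::make_index_sequence<3> is an alias for std::index_sequence<0, 1, 2>.
static_assert(std::is_same_v<std::make_index_sequence<3>,
                             std::index_sequence<0, 1, 2>>);

// Passing the object lets the compiler deduce the pack Is... from its type,
// which is how printTupleManual above receives its indices.
template <std::size_t... Is>
constexpr std::size_t countIndices(std::index_sequence<Is...>) {
    return sizeof...(Is);
}

static_assert(countIndices(std::make_index_sequence<3>{}) == 3);
```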

And the tuple itself knows its size:

printTupleManual(tp, std::make_index_sequence<std::tuple_size_v<decltype(tp)>>{});

So combine the two:

template <typename TupleT, std::size_t TupSize = std::tuple_size_v<TupleT>>
void printTupleGetSize(const TupleT& tp) {
    printTupleManual(tp, std::make_index_sequence<TupSize>{});
}

std::tuple tp { 10, 20, "hello"};
printTupleGetSize(tp);

Tidying up the code:

template <typename TupleT, std::size_t... Is>
void printTupleImp(const TupleT& tp, std::index_sequence<Is...>) {
    size_t index = 0;
    auto printElem = [&index](const auto& x) {
        if (index++ > 0) 
            std::cout << ", ";
        std::cout << x;
    };

    std::cout << "(";
    (printElem(std::get<Is>(tp)), ...);
    std::cout << ")";
}

template <typename TupleT, std::size_t TupSize = std::tuple_size_v<TupleT>>
void printTuple(const TupleT& tp) {
    printTupleImp(tp, std::make_index_sequence<TupSize>{});
}

Supporting the stream operator <<:

#include <iostream>
#include <ostream>
#include <tuple>

template <typename TupleT, std::size_t... Is>
std::ostream& printTupleImp(std::ostream& os, const TupleT& tp, std::index_sequence<Is...>) {
    size_t index = 0;
    auto printElem = [&index, &os](const auto& x) {
        if (index++ > 0) 
            os << ", ";
        os << x;
    };

    os << "(";
    (printElem(std::get<Is>(tp)), ...);
    os << ")";
    return os;
}

template <typename TupleT, std::size_t TupSize = std::tuple_size<TupleT>::value>
std::ostream& operator <<(std::ostream& os, const TupleT& tp) {
    return printTupleImp(os, tp, std::make_index_sequence<TupSize>{}); 
}

int main() {
    std::tuple tp { 10, 20, "hello"};
    std::cout << tp << '\n';
}

One more refinement: add the index:

#include <iostream>
#include <ostream>
#include <tuple>

template <typename TupleT, std::size_t... Is>
std::ostream& printTupleImp(std::ostream& os, const TupleT& tp, std::index_sequence<Is...>) {
    auto printElem = [&os](const auto& x, size_t id) {
        if (id > 0) 
            os << ", ";
        os << id << ": " << x;
    };

    os << "(";
    (printElem(std::get<Is>(tp), Is), ...);
    os << ")";
    return os;
}

template <typename TupleT, std::size_t TupSize = std::tuple_size<TupleT>::value>
std::ostream& operator <<(std::ostream& os, const TupleT& tp) {
    return printTupleImp(os, tp, std::make_index_sequence<TupSize>{}); 
}

int main() {
    std::tuple tp { 10, 20, "hello"};
    std::cout << tp << '\n';
}
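As an alternative to spelling out the index sequence by hand, std::apply (C++17) can unpack the tuple into a parameter pack directly. The sketch below is a different technique from the one walked through above, and printTupleApply and tupleToString are hypothetical names.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <tuple>

// Hypothetical alternative: let std::apply expand the tuple elements into a pack.
template <typename... Ts>
void printTupleApply(std::ostream& os, const std::tuple<Ts...>& tp) {
    os << "(";
    std::apply([&os](const auto&... elems) {
        std::size_t index = 0;
        // Fold over the comma operator: print a separator before every element but the first.
        ((os << (index++ > 0 ? ", " : "") << elems), ...);
    }, tp);
    os << ")";
}

// Convenience wrapper so the output can be inspected as a string.
template <typename... Ts>
std::string tupleToString(const std::tuple<Ts...>& tp) {
    std::ostringstream oss;
    printTupleApply(oss, tp);
    return oss.str();
}
```

For std::tuple{10, 20, "hello"} this yields "(10, 20, hello)", the same format as printTuple above.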

Centralizing Resource Cleanup Paths in C

The author discusses how to write cleaner driver code.

“Universal reference” or “forwarding reference”?

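A T&& parameter whose T is deduced is sometimes called a "universal reference", a name Scott Meyers introduced in his book. The author argues that such references are not really universal; the C++ standard calls them forwarding references.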
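The difference between a forwarding reference and a plain rvalue reference shows up in template argument deduction. The sketch below uses two hypothetical functions, deduced_lvalue and takes_rvalue_only, to make it visible.

```cpp
#include <type_traits>
#include <utility>

// T&& with a deduced T is a forwarding reference:
// T deduces to U& for an lvalue argument and to U for an rvalue argument.
template <typename T>
constexpr bool deduced_lvalue(T&&) {
    return std::is_lvalue_reference_v<T>;
}

// int&& involves no deduction: a plain rvalue reference that binds only to rvalues.
// Calling takes_rvalue_only(x) with an int lvalue x would not compile.
constexpr bool takes_rvalue_only(int&&) {
    return true;
}

static_assert(!deduced_lvalue(42)); // rvalue: T = int
static_assert(takes_rvalue_only(42));
```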

A Tour of C++ - Reading Notes (Part 1) A Tour of C++ - Reading Notes (Part 2)

Reading notes.

Tips

template<typename Base, typename T>
inline bool instanceof(T *ptr) {
  return dynamic_cast<Base*>(ptr) != nullptr;
}
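A short usage sketch for the helper: Shape, Circle and Square are hypothetical types made up for the illustration. dynamic_cast needs a polymorphic source type, hence the virtual destructor, and the code assumes RTTI is enabled.

```cpp
// Hypothetical class hierarchy; the virtual destructor makes Shape polymorphic.
struct Shape { virtual ~Shape() = default; };
struct Circle : Shape {};
struct Square : Shape {};

// The helper from the notes: true if ptr points to an object that is a Base.
template<typename Base, typename T>
inline bool instanceof(T *ptr) {
  return dynamic_cast<Base*>(ptr) != nullptr;
}

// Example queries on a Circle viewed through a Shape pointer.
inline bool circle_is_circle() {
  Circle c;
  Shape *s = &c;
  return instanceof<Circle>(s); // the dynamic type is Circle
}

inline bool circle_is_square() {
  Circle c;
  Shape *s = &c;
  return instanceof<Square>(s); // Circle is not a Square
}
```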

explicit: this is also recommended by the Google C++ style guide.

class Vector {
 public:
  explicit Vector(int sz); // avoid implicit conversion from int to Vector
  /* ... */
};
Vector vec1(7);  // OK
Vector vec2 = 7; // NOT OK

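A string literal defaults to a C-style string (an array of const char that decays to const char*). You need the s suffix to get a std::string.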

/*
 * The "s" suffix needs one of the following namespace to be included:
 *   - using namespace std::literals 
 *   - using namespace std::string_literals 
 *   - using namespace std::literals::string_literals
 */
auto s1 = "Ian";  // const char* (C-style)
auto s2 = "Pan"s; // std::string
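The deduced types can be confirmed at compile time. This is a small sketch: std::decay_t mirrors what auto does to the array type of a plain string literal.

```cpp
#include <string>
#include <type_traits>

using namespace std::string_literals;

// A plain literal is an array of const char; auto (like decay) turns it into const char*.
static_assert(std::is_same_v<std::decay_t<decltype("Ian")>, const char*>);

// With the s suffix the literal is a std::string.
static_assert(std::is_same_v<decltype("Pan"s), std::string>);
```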