C++ 中文周刊 第49期

reddit/hackernews/lobsters/meetingcpp1 meetingcpp2摘抄一些c++动态

周刊项目地址在线地址知乎专栏 腾讯云+社区

欢迎投稿,推荐或自荐文章/软件/资源等,请提交 issue


资讯

标准委员会动态/ide/编译器信息放在这里

编译器信息最新动态推荐关注hellogcc公众号 本周更新 2022-02-02 第135期 2022-02-09 第136期

文章

#include <bit>

int main() {
   constexpr auto value = std::uint16_t(0xCAFE);
   std::cout << std::hex << value; // pritns cafe
   std::cout << std::hex << std::byteswap(value); // prints feca
}

没啥说的

#define VARIADIC(...) __VA_OPT__(__LINE__)

VARIADIC()     // `empty`
VARIADIC(a)    // `line` 4
VARIADIC(a, b) // `line` 5

效果 https://godbolt.org/z/rsj9ax7xY

考虑这么个场景

#define FOO(...)       printf(__VA_ARGS__)
#define BAR(fmt, ...)  printf(fmt, __VA_ARGS__)

FOO("this works fine");
BAR("this breaks!");

最后一行,会多出一个逗号,导致调用失败,如何吃掉这个逗号?

gcc拓展

#define BAR(fmt, ...)  printf(fmt "\n", ##__VA_ARGS__)

BAR("here is a log message");
BAR("here is a log message with a param: %d", 42);

或者用这个__VA_OPT__ 感觉boost.pp 里有这玩意。

另外,如何检查这个宏的编译器支持?看这里

#define PP_THIRD_ARG(a,b,c,...) c
#define VA_OPT_SUPPORTED_I(...) PP_THIRD_ARG(__VA_OPT__(,),true,false,)
#define VA_OPT_SUPPORTED VA_OPT_SUPPORTED_I(?)

PP_THIRD_ARG只要第三个参数,__VA_OPT__支持的话展开PP_THIRD_ARG(__VA_OPT__(,),true,false,)变成PP_THIRD_ARG(,,true,false,)第三个就是true,不展开VA_OPT第三个就是false,挺有意思的

看个乐,杀鸡用牛刀了属于是

之前也介绍过,就是解析数字字符串的方法,如何做更快,点击回顾

这是其中之一SWAR,这里老博士重新讲一遍原理

最最简单版本

uint32_t parse_eight_digits(const unsigned char *chars) {
  uint32_t x = chars[0] - '0';
  for (size_t j = 1; j < 8; j++)
    x = x * 10 + (chars[j] - '0');
  return x;
}

这里不考虑合法性,不校验

一般来说,编译器循环展开会这样

        movzx   eax, byte ptr [rdi]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 1]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 2]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 3]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 4]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 5]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 6]
        lea     eax, [rcx + 2*rax]
        lea     eax, [rax + 4*rax]
        movzx   ecx, byte ptr [rdi + 7]
        lea     eax, [rcx + 2*rax]
        add     eax, -533333328

很多都是相同的指令,整理一下

        imul    rax, qword ptr [rdi], 2561
        movabs  rcx, -1302123111085379632
        add     rcx, rax
        shr     rcx, 8
        movabs  rax, 71777214294589695
        and     rax, rcx
        imul    rax, rax, 6553601
        shr     rax, 16
        movabs  rcx, 281470681808895
        and     rcx, rax
        movabs  rax, 42949672960001
        imul    rax, rcx
        shr     rax, 32

我们的代码如何直接生成后面这种汇编?

SWAR SIMD within a register.其实就是让寄存器尽可能利用上,做更多的计算,从上面这个汇编就能看出来

为了达到省计算的目标,就要8个byte同时做算术

接下来就是构造了

与其一个一个的减 ‘\0’ 不如直接整个减0x30

val = val - 0x3030303030303030;

如果你的数字串是12345678,那对应16进值就是0x0807060504030201,那

然后乘

val = (val * 10) + (val >> 8);

最终效果是这样

uint32_t  parse_eight_digits_unrolled(uint64_t val) {
  const uint64_t mask = 0x000000FF000000FF;
  const uint64_t mul1 = 0x000F424000000064; // 100 + (1000000ULL << 32)
  const uint64_t mul2 = 0x0000271000000001; // 1 + (10000ULL << 32)
  val -= 0x3030303030303030;
  val = (val * 10) + (val >> 8); // val = (val * 2561) >> 8;
  val = (((val & mask) * mul1) + (((val >> 16) & mask) * mul2)) >> 32;
  return val;
}

其实别的地方也有这种技巧,比如

public static int bitCount(int i) {
    // HD, Figure 5-2
    i = i - ((i >>> 1) & 0x55555555);
    i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);
    i = (i + (i >>> 4)) & 0x0f0f0f0f;
    i = i + (i >>> 8);
    i = i + (i >>> 16);
    return i & 0x3f;
}

原理 HD就是Hacker’s Delight这本书的意思

其实主要构造是最难的要考虑用算数拼出来

找平均值,且不溢出

unsigned average(unsigned a, unsigned b)
{
    return (a + b) / 2;
}

这个明显会溢出

如果你知道两个数的大小的话

unsigned average(unsigned low, unsigned high)
{
    return low + (high - low) / 2;
}

也有一种不需要知道大小的方法

unsigned average(unsigned a, unsigned b)
{
    return (a / 2) + (b / 2) + (a & b & 1);
}

当然SWAR方法更快

unsigned average(unsigned a, unsigned b)
{
    return (a & b) + (a ^ b) / 2;
}

原理a + b = ((a & b) « 1) + (a ^ b) 两部分分别是头和尾

作者还讨论了不同平台下的实现方法。喜欢扣细节的可以看看

另外推荐观看CppCon 2019: Marshall Clow “std::midpoint? How Hard Could it Be?”

写了个brainfuck编译器,用上constexpr和编译器优化,不知道brainfuck的先百度下,这里直接贴代码

一个实现

enum class op
{
    ptr_inc,     // >
    ptr_dec,     // <
    data_inc,    // +
    data_dec,    // -
    write,       // .
    read,        // ,
    jmp_ifz,     // [, jump if zero
    jmp,         // ], unconditional jump
};

template <std::size_t InstructionCapacity>
struct program
{
    std::size_t inst_count;
    op          inst[InstructionCapacity];
    std::size_t inst_jmp[InstructionCapacity];
};



template <std::size_t InstructionCapacity>
void execute(const program<InstructionCapacity>& program,
             unsigned char* data_ptr)
{
    auto inst_ptr = std::size_t(0);
    while (inst_ptr < program.inst_count)
    {
        switch (program.inst[inst_ptr])
        {
        case op::ptr_inc:
            ++data_ptr;
            ++inst_ptr;
            break;
        case op::ptr_dec:
            --data_ptr;
            ++inst_ptr;
            break;
        case op::data_inc:
            ++*data_ptr;
            ++inst_ptr;
            break;
        case op::data_dec:
            --*data_ptr;
            ++inst_ptr;
            break;
        case op::write:
            std::putchar(*data_ptr);
            ++inst_ptr;
            break;
        case op::read:
            *data_ptr = static_cast<unsigned char>(std::getchar());
            ++inst_ptr;
            break;
        case op::jmp_ifz:
            if (*data_ptr == 0)
                inst_ptr = program.inst_jmp[inst_ptr];
            else
                ++inst_ptr;
            break;
        case op::jmp:
            inst_ptr = program.inst_jmp[inst_ptr];
            break;
        }
    }
}



template <std::size_t N>
constexpr auto parse(const char (&str)[N])
{
    program<N> result{};

    std::size_t jump_stack[N] = {};
    std::size_t jump_stack_top = 0;

    for (auto ptr = str; *ptr; ++ptr)
    {
        if (*ptr ==  '>')
            result.inst[result.inst_count++] = op::ptr_inc;
        else if (*ptr ==  '<')
            result.inst[result.inst_count++] = op::ptr_dec;
        else if (*ptr ==  '+')
            result.inst[result.inst_count++] = op::data_inc;
        else if (*ptr ==  '-')
            result.inst[result.inst_count++] = op::data_dec;
        else if (*ptr ==  '.')
            result.inst[result.inst_count++] = op::write;
        else if (*ptr ==  ',')
            result.inst[result.inst_count++] = op::read;
        else if (*ptr == '[')
        {
            jump_stack[jump_stack_top++] = result.inst_count;
            result.inst[result.inst_count++] = op::jmp_ifz;
        }
        else if (*ptr == ']')
        {
            auto open = jump_stack[--jump_stack_top];
            auto close = result.inst_count++;

            result.inst[close] = op::jmp;
            result.inst_jmp[close] = open;

            result.inst_jmp[open] = close + 1;
        }
    }

    return result;
}

如何使用?

// `x = std::getchar(); y = x + 3; std::putchar(y);`
static constexpr auto add3 = parse(",>+++<[->+<]>.");

// Use this array for our data_ptr.
unsigned char memory[1024] = {};
execute(add3, memory);

不是很难

如果想玩jit优化可以看 Eli Bendersky的文章.

这里不是重点,重点是constexpr

使用尾递归

template <std::size_t InstructionCapacity>
void execute(const program<InstructionCapacity>& program,
             unsigned char* data_ptr,
             std::size_t inst_ptr = 0)
{
    if (inst_ptr >= program.inst_count)
        return; // Execution is finished.

    switch (program.inst[inst_ptr])
    {
    case op::ptr_inc:
        ++data_ptr;
        return execute(program, data_ptr, inst_ptr + 1);
    case op::ptr_dec:
        --data_ptr;
        return execute(program, data_ptr, inst_ptr + 1);
    case op::data_inc:
        ++*data_ptr;
        return execute(program, data_ptr, inst_ptr + 1);
    case op::data_dec:
        --*data_ptr;
        return execute(program, data_ptr, inst_ptr + 1);
    case op::write:
        std::putchar(*data_ptr);
        return execute(program, data_ptr, inst_ptr + 1);
    case op::read:
        *data_ptr = static_cast<unsigned char>(std::getchar());
        return execute(program, data_ptr, inst_ptr + 1);
    case op::jmp_ifz:
        if (*data_ptr == 0)
            return execute(program, data_ptr, program.inst_jmp[inst_ptr]);
        else
            return execute(program, data_ptr, inst_ptr + 1);
    case op::jmp:
        return execute(program, data_ptr, program.inst_jmp[inst_ptr]);
    }
}

Constexpr + template

template <const auto& Program, std::size_t InstPtr = 0>
constexpr void execute(unsigned char* data_ptr)
{
    if constexpr (InstPtr >= Program.inst_count)
    {
        // Execution is finished.
        return;
    }
    else if constexpr (Program.inst[InstPtr] == op::ptr_inc)
    {
        ++data_ptr;
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::ptr_dec)
    {
        --data_ptr;
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::data_inc)
    {
        ++*data_ptr;
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::data_dec)
    {
        --*data_ptr;
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::write)
    {
        std::putchar(*data_ptr);
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::read)
    {
        *data_ptr = static_cast<char>(std::getchar());
        return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::jmp_ifz)
    {
        if (*data_ptr == 0)
            return execute<Program, Program.inst_jmp[InstPtr]>(data_ptr);
        else
            return execute<Program, InstPtr + 1>(data_ptr);
    }
    else if constexpr (Program.inst[InstPtr] == op::jmp)
    {
        return execute<Program, Program.inst_jmp[InstPtr]>(data_ptr);
    }
}

最后用法也改成了

    // `x = std::getchar(); y = x + 3; std::putchar(y);`
    static constexpr auto add3 = parse(",>+++<[->+<]>.");

    // Use this array for our data_ptr.
    unsigned char memory[1024] = {};
    execute<add3>(memory);

旧版godbolt https://godbolt.org/z/MaPffqxGT

新版godbolt https://godbolt.org/z/Gd3zKWvKE

开O2能直接算出结果,constexpr的好处还是很明显的

CTRE基本上也是这个原理 CTRE library

std::condition_variable_any cond;
boost::shared_mutex m;

void foo() {
    boost::shared_lock<boost::shared_mutex> lk(m);
    while(!some_condition()) {
        cond.wait(lk);
    }
}

c++20

 void testInterruptibleCVWait()
 {
     bool ready = false;
     std::mutex readyMutex;
     std::condition_variable_any readyCV;
 
     std::jthread t([&ready, &readyMutex, &readyCV] (std::stop_token st)
     {
         while (...)
         {
 
             ...
             {
                 std::unique_lock lg{readyMutex};
                 readyCV.wait_until(lg, [&ready] {return ready; }, st);
                 // also ends wait on stop request for st
             }
             ...
         }
    });
 ...
 } // jthread destructor signals stop request and therefore unblocks the CV wait and ends the started thread

除非你知道具体用途,否则别用volatile 用volatile就意味着被修饰的值是经常变的,所以会阻止软件上的优化

一个场景

void blink_twice() {
    volatile int *off = reinterpret_cast<int*>(0x40225190);
    volatile int *on = reinterpret_cast<int*>(0x40225194);
    int flag = (1 << 22);

    *on = flag;
    sleep(1);
    *off = flag;
    sleep(1);
    *on = flag;
    sleep(1);
    *off = flag;
}

写GPIO接口(操控单片机之类的板子)

struct MoveOnlyWidget {
    MoveOnlyWidget(int);
    MoveOnlyWidget(const MoveOnlyWidget&) = delete;
    MoveOnlyWidget(MoveOnlyWidget&&) = default;
    MoveOnlyWidget& operator=(const MoveOnlyWidget&) = delete;
    MoveOnlyWidget& operator=(MoveOnlyWidget&&) = default;
    MoveOnlyWidget() = default;
};

两个构造函数

建议用

static_assert(!std::is_constructible_v<MoveOnlyWidget>);
static_assert(!std::is_default_constructible_v<MoveOnlyWidget>);
static_assert(!std::default_initializable<MoveOnlyWidget>);

类似的编译期单测直接干掉这种小错误

std::tuple tp { 10, 20, 3.14, 42, "hello"};
printTuple(tp);

实现printTuple,怎么做

首先试着get单条

std::tuple tp {42, 10.5, "hello"};
std::cout << std::get<0>(tp) << ", ";
std::cout << std::get<1>(tp) << ", ";
std::cout << std::get<2>(tp) << ", ";

遍历tuple就是把std::get调用多次,就是生成get的模版参数N

template <typename T>
void printElem(const T& x) {
    std::cout << x << ',';
};

template <typename TupleT, std::size_t... Is>
void printTupleManual(const TupleT& tp) {
    (printElem(std::get<Is>(tp)), ...);
}


std::tuple tp { 10, 20, "hello"};
printTupleManual<decltype(tp), 0, 1, 2>(tp);

现在的问题就是如何生成Is列表,去掉printTupleManual 的模版参数

index_sequence能生成,所以有size就可以了

template <typename T>
void printElem(const T& x) {
    std::cout << x << ',';
};

template <typename TupleT, std::size_t... Is>
void printTupleManual(const TupleT& tp, std::index_sequence<Is...>) {
    (printElem(std::get<Is>(tp)), ...);
}


std::tuple tp { 10, 20, "hello"};
printTupleManual(tp, std::make_index_sequence<3>{});

而size tuple本身是有的

printTupleManual(tp, std::make_index_sequence<std::tuple_size_v<decltype(tp)>>{});

所以组合一下就可以了

template <typename TupleT, std::size_t TupSize = std::tuple_size_v<TupleT>>
void printTupleGetSize(const TupleT& tp) {
    printTupleManual(tp, std::make_index_sequence<TupSize>{});
}

std::tuple tp { 10, 20, "hello"};
printTupleGetSize(tp);

整理一下代码

template <typename TupleT, std::size_t... Is>
void printTupleImp(const TupleT& tp, std::index_sequence<Is...>) {
    size_t index = 0;
    auto printElem = [&index](const auto& x) {
        if (index++ > 0) 
            std::cout << ", ";
        std::cout << x;
    };

    std::cout << "(";
    (printElem(std::get<Is>(tp)), ...);
    std::cout << ")";
}

template <typename TupleT, std::size_t TupSize = std::tuple_size_v<TupleT>>
void printTuple(const TupleT& tp) {
    printTupleImp(tp, std::make_index_sequence<TupSize>{});
}

支持流式 «

#include <iostream>
#include <ostream>
#include <tuple>

template <typename TupleT, std::size_t... Is>
std::ostream& printTupleImp(std::ostream& os, const TupleT& tp, std::index_sequence<Is...>) {
    size_t index = 0;
    auto printElem = [&index, &os](const auto& x) {
        if (index++ > 0) 
            os << ", ";
        os << x;
    };

    os << "(";
    (printElem(std::get<Is>(tp)), ...);
    os << ")";
    return os;
}

template <typename TupleT, std::size_t TupSize = std::tuple_size<TupleT>::value>
std::ostream& operator <<(std::ostream& os, const TupleT& tp) {
    return printTupleImp(os, tp, std::make_index_sequence<TupSize>{}); 
}

int main() {
    std::tuple tp { 10, 20, "hello"};
    std::cout << tp << '\n';
}

再优化一下,加上索引

#include <iostream>
#include <ostream>
#include <tuple>

template <typename TupleT, std::size_t... Is>
std::ostream& printTupleImp(std::ostream& os, const TupleT& tp, std::index_sequence<Is...>) {
    auto printElem = [&os](const auto& x, size_t id) {
        if (id > 0) 
            os << ", ";
        os << id << ": " << x;
    };

    os << "(";
    (printElem(std::get<Is>(tp), Is), ...);
    os << ")";
    return os;
}

template <typename TupleT, std::size_t TupSize = std::tuple_size<TupleT>::value>
std::ostream& operator <<(std::ostream& os, const TupleT& tp) {
    return printTupleImp(os, tp, std::make_index_sequence<TupSize>{}); 
}

int main() {
    std::tuple tp { 10, 20, "hello"};
    std::cout << tp << '\n';
}

作者讨论怎么写driver代码更干净

右值引用有时候又被叫做万能引用,mayer在书里开始叫的,作者觉得右值引用并不万能

读书笔记

小技巧

template<typename Base, typename T>
inline bool instanceof(T *ptr) {
  return dynamic_cast<Base*>(ptr) != nullptr;
}

explict这个也是google代码规范推荐

class Vector {
 public:
  explicit Vector(int sz); // avoid implicit conversion from int to Vector
  /* ... */
}
Vector vec1(7);  // OK
Vector vec2 = 7; // NOT OK

字符串默认char *,要用后缀才能是string类型

/*
 * The "s" suffix needs one of the following namespace to be included:
 *   - using namespace std::literals 
 *   - using namespace std::string_literals 
 *   - using namespace std::literals::string_literal
 */
auto s1 = "Ian";  // const char* (C-style)
auto s2 = "Pan"s; // std::string

避免转义可以使用R”(blahblah)”

constexpr能用就用

视频

用小的int并没有啥好处 uint8之类的,并且如果用auto,都会提成成int。循环里用uint8是不如int的。这些琐碎的东西不需要研究,总之不要耍小聪明用uint8觉得省空间就行了,你哪怕用位域呢

在嵌入式领域,相关的程序员很爱用u8 来做各种临时计算之类了。他都是临时变量了,u8 u32真的没那么重要

PS : 水友地球补充,实际上有uint_fast8_t这种东西,背后就是unsigned int

手把手实现c++ rest api

提案在这里n4895 就是hazard pointer和rcu

开源项目需要人手

新项目介绍/版本更新

工作招聘

暂无推荐


本文永久链接

看到这里或许你有建议或者疑问或者指出错误,请留言评论! 多谢! 你的评论非常重要!也可以帮忙点赞收藏转发!多谢支持! 觉得写的不错那就给点吧, 在线乞讨 微信转账