October reading list

https://unixism.net/2019/04/linux-applications-performance-introduction/

https://www.moritz.systems/blog/mastering-unix-pipes-part-1/

https://neilmadden.blog/2020/11/25/parse-dont-type-check/

https://my.oschina.net/evilunix/blog/3003736

https://danlark.org/2020/11/11/miniselect-practical-and-generic-selection-algorithms/

https://github.com/y123456yz/reading-and-annotate-mongodb-3.6

https://zhuanlan.zhihu.com/p/265701877

https://blog.csdn.net/baijiwei/article/details/80504715

https://blog.csdn.net/baijiwei/article/details/78070355

qbe, a small compiler backend: https://c9x.me/compile/

https://github.com/jsoysouvanh/Refureku

https://github.com/foonathan/lexy

http://stlab.cc/2020/12/01/forest-introduction.html

https://github.com/stlab/libraries/blob/develop/stlab/forest.hpp

https://medium.com/build-and-learn/fun-with-text-generation-pt-1-markov-models-in-awk-5e1b55fe560c

I probably won't ever get around to this one…

CLI program design guidelines

https://clig.dev/

https://danlark.org/2020/06/14/128-bit-division/

https://brevzin.github.io/c++/2020/12/01/tag-invoke/


Why std::future never got a then continuation

The Concurrency TS did implement future::then; the plan was to ship it in <experimental/future>.

The Asio author also had an implementation, but it was never merged.

As reference link 1 mentions, the design was ultimately abandoned.

In the talk in reference link 2, Eric explains the drawbacks of the then-continuation approach: future/promise simply demands too much.

Because a future and its promise must communicate and share state, they need resources:

  • condvar/mutex synchronization
  • heap allocation
  • shared state (shared_future) may even need reference counting
  • storing futures of different kinds requires type erasure (à la std::any) — another cost

The author's take is the lazy future: defer every resource action to the very end and specify the continuations up front, which yields a generalized std::then function.

Lazy future advantages

  • Async tasks can be composed…
    • … without allocation
    • … without synchronization
    • … without type-erasure
  • Composition is a generic algorithm
  • Blocking is a generic algorithm

The demo code contains no futures at all — just combinations of lambdas, execute, and then.

Eric's project is here:

https://github.com/facebookexperimental/libunifex — still under development. Very interesting.


ref

  • https://stackoverflow.com/questions/63360248/where-is-stdfuturethen-and-the-concurrency-ts
  • https://www.youtube.com/watch?v=tF-Nz4aRWAM
    • slides: https://github.com/CppCon/CppCon2019/blob/master/Presentations/a_unifying_abstraction_for_async_in_cpp/a_unifying_abstraction_for_async_in_cpp__eric_niebler_david_s_hollman__cppcon_2019.pdf
  • The Asio author's design: http://chriskohlhoff.github.io/executors/ — quite pleasant to use. post replaces std::async for producing futures, you can target different executors, and switching executors goes through wrap, the equivalent of folly's via; the basic feature set is roughly on par with folly.
  • A Concurrency TS future implementation: https://github.com/jaredhoberock/future
  • The executor design is still in flight; the plan appears to be C++23, with changes probably close to what Eric described: https://github.com/executors/executors
    • http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0443r14.html
    • Can't get through the paper? Here's a well-written introduction: https://cor3ntin.github.io/posts/executors/
  • A talk on implementing futures without type erasure: C++Now 2018: Vittorio Romeo "Futures Without Type Erasure" https://www.youtube.com/watch?v=Avvhs3PLP7o — in short, the call-chain structure is fixed at compile time with templates.
    • And a written walkthrough: https://www.maxpagani.org/2018/07/31/it18-zero-allocation-and-no-type-erasure-futures/

If you've read this far and have suggestions, questions, or corrections, please leave a comment or email mailto:wanghenshui@qq.com. Thanks!

Liked the post? You can scan the QR code to tip a few cents via WeChat.

gRPC: introduction and internals

An introductory slide deck; it assumes familiarity with protobuf concepts.

The overall architecture is shown in the talk's diagram (not reproduced here).

Transport is HTTP/2 underneath, so streaming comes for free — unidirectional and bidirectional streams just require the stream modifier on the proto message:

service Greeter {
  rpc SayHello3 (stream HelloRequest) returns (stream HelloReply) {}
}

Is HTTP/2 actually necessary? Or is Google simply pushing HTTP/2 across all internal services — switching everything to HTTP? What about legacy services that can't? What does HTTP/2 actually buy you?

  • Versus HTTP/1: connection sharing, header compression, flow control, low latency, plus streaming support.

The official answer:

HTTP2 is used for many good reasons: HTTP2 is a standard and HTTP protocol is well known to proxies, firewalls and many software tools. The streaming nature of HTTP2 suits our needs very well, so no need to reinvent the wheel.

So HTTP/2 it is; Google doesn't have the problem of countless heterogeneous internal protocols talking to each other.

Raw TCP sockets or WebSocket would perform just as well; my guess is that Google pushes HTTP for all internal integrations and never considered those options.

API surface

  • Synchronous: unary calls / streaming
    • A unary call is one-shot request/response; a stream feels like a watered-down TCP stream.
  • Asynchronous: you need a completion queue to receive results.
// bookkeeping for each AsyncSayHello call
struct AsyncClientCall {
    HelloReply reply;
    ClientContext context;
    Status status;
    std::unique_ptr<ClientAsyncResponseReader<HelloReply>> response_reader;
};
class GreeterClient 
{
public:
    GreeterClient(std::shared_ptr<Channel> channel)
        : stub_(Greeter::NewStub(channel)) {}
    void SayHello(const std::string& user) 
    {
        HelloRequest request;
        request.set_name(user);
        AsyncClientCall* call = new AsyncClientCall;
        // asynchronous call, non-blocking
        call->response_reader = stub_->AsyncSayHello(&call->context, request, &cq_);
        // when the RPC completes, gRPC fills the result into the AsyncClientCall
        // and pushes the AsyncClientCall's address onto the completion queue
        call->response_reader->Finish(&call->reply, &call->status, (void*)call);
    }
    void AsyncCompleteRpc() 
    {
        void* got_tag;
        bool ok = false;
        // pop an AsyncClientCall address off the queue; blocks
        while (cq_.Next(&got_tag, &ok)) 
        {
            AsyncClientCall* call = static_cast<AsyncClientCall*>(got_tag);
            if (call->status.ok())
                std::cout << "Greeter received: " << call->reply.message() << std::endl;
            else
                std::cout << "RPC failed" << std::endl;
			
            delete call;  // destroy the call object
        }
    }
private:
    std::unique_ptr<Greeter::Stub> stub_;
    CompletionQueue cq_;    // completion queue
};
int main()
{
    auto channel = grpc::CreateChannel("localhost:5000", grpc::InsecureChannelCredentials());
    GreeterClient greeter(channel);
    // spawn a thread to drain the completion queue and process results
    std::thread thread_ = std::thread(&GreeterClient::AsyncCompleteRpc, &greeter);
    for (int i = 0; i < 100; i++) {
        auto user = std::string("hello-world-") + std::to_string(i);
        greeter.SayHello(user);
    }
    thread_.join();  // without the join, ~thread on a joinable thread calls std::terminate;
                     // in this demo cq_.Next never returns, so join blocks forever
    return 0;
}

Unlike trpc, gRPC does not generate future/promise-style asynchronous client code.

RPC, from simple to complex

An RPC framework mainly does three things:

  • Let the server determine which function the client wants to call.
    • Client and server each maintain a table mapping a function-name id to a function; the id is unique across all processes. The client attaches the id to each remote call; the server looks it up to find the target function and executes it.
  • Serialization and deserialization.
    • When client and server talk, parameters and results are converted to byte streams for transport; turning data into bytes, and bytes back into a readable fixed format, is (de)serialization, and its speed affects call efficiency.
  • Network transport (which protocol to use).
    • Most RPC frameworks pick TCP; some pick HTTP — gRPC uses HTTP/2. Each has trade-offs: TCP is more efficient, HTTP more flexible in practice.

"Server registers functions, client calls them by name" evolves into "define the interface in an IDL file; the server implements it, the client calls it" — the name-to-function registry disappears behind the framework, and interface parameters grow richer.

protobuf or thrift provides the codegen/parsing for those interface parameters.

The simplest RPC — take nanorpc as the example.

Server: register name → function (stored in a map):

auto server = nanorpc::http::easy::make_server("0.0.0.0", "55555", 8, "/api/",
                                               std::pair{"test", [] (std::string const &s) { return "Tested: " + s; } }
                                              );

std::cout << "Press Enter for quit." << std::endl;
std::cin.get();

Client: call the function by name:

auto client = nanorpc::http::easy::make_client("localhost", "55555", 8, "/api/");
std::string result = client.call("test", std::string{"test"});
std::cout << "Response from server: " << result << std::endl;

In the middle sits a parsing component: it extracts the function name and arguments the client sent, fetches the function from the map, invokes it, done.

How do frameworks like gRPC handle these steps?

First, the function table lives inside the framework, which generates interface functions. If the client just calls interface functions directly, is it even still RPC?

A framework client call mostly looks like this (gRPC C++ as the example):

 Status status = stub_->SayHello(&context, request, &reply);

The generated client interface differs from the server's implementation interface: codegen produces a proxy class (the stub), and the framework uses the registration info and the context to invoke the server-side implementation. Only request and response are identical on both sides; the functions are not.

Drawbacks

  • gRPC has no service-governance capability and no overall plugin architecture; it doesn't support plugin injection.

In a company setting, frameworks are combined with many other components:

say, a company-wide logging framework, a company-wide naming service, a company-wide metrics platform, and a unified config management/distribution platform —

all consulted through a unified platform, rather than ssh-ing into machines; once you're logging into boxes, things have already gotten out of hand.

Suppose you're building a new RPC framework. You need:

  • proto files under management
  • generated files wired into the build scripts — the smf RPC framework's codegen is interesting and worth studying sometime
  • service-governance hooks: a reserved logger interface, reserved metrics-collection interfaces
  • how do you do RPC message dyeing (request tagging)?
  • circuit breaking / rate limiting
  • protocol-agnostic RPC or not? TCP and HTTP/2 each have advantages, and some scenarios need special protocols, e.g. SIP, with its own seq management
  • the RPC I/O model: future/promise? merge? coroutines?

Suppose you're building a service that uses RPC. You need:

  • a flexible config-reload component that detects config changes and reloads (possibly a company-wide config distribution platform)
  • a domain naming service, preferably the unified one
  • logger and metrics interfaces left pluggable for injection
  • how to divert traffic for canarying? tcpcopy?

ref

  • A Zhihu column with lots of material: https://www.zhihu.com/column/c_1099707347118718976

  • https://colobu.com/2017/04/06/dive-into-gRPC-streaming/

  • An analysis of grpc-go: https://segmentfault.com/a/1190000019608421

  • A protobuf guide: https://blog.csdn.net/u011518120/article/details/54604615

  • A future/promise RPC: https://github.com/loveyacper/ananas/blob/master/docs/06_protobuf_rpc.md

  • Tencent's phxrpc: https://github.com/Tencent/phxrpc

  • Tencent's TARS RPC — Tencent really does have a lot of RPC frameworks: https://github.com/TarsCloud/TarsCpp

  • https://github.com/pfan123/Articles/issues/76 — an introduction to thrift




Type erasure and the design and implementation of std::function


This post is a digest of "type erased printable" and "design space for std::function".

"Type erasure" is less a distinct technique than another form of polymorphism.

A few ways to get polymorphic behavior through a function boundary:

  • void* — the traditional universal parameter
  • inheritance-based interface polymorphism, with dynamic_cast
  • value-semantic polymorphism, i.e. type erasure
    • std::function, boost::any_range, boost::any

An example: a type-erased printable.

Printing an owned value (godbolt link):

#include <memory>
#include <ostream>

struct PrintableBase {
    virtual void print(std::ostream& os) const = 0;
    virtual ~PrintableBase() = default;
};

template<class T>
struct PrintableImpl : PrintableBase {
    T t_;
    explicit PrintableImpl(T t) : t_(std::move(t)) {}
    void print(std::ostream& os) const override { os << t_; }
};

class UniquePrintable {
    std::unique_ptr<PrintableBase> p_;
public:
    template<class T>
    UniquePrintable(T t) : p_(std::make_unique<PrintableImpl<T>>(std::move(t))) { }

    friend std::ostream& operator<<(std::ostream& os, const UniquePrintable& self) {
        self.p_->print(os);
        return os;
    }
};

#include <iostream>

void printit(UniquePrintable p) {
    std::cout << "The printable thing was: " << p << "." << std::endl;
}

int main() {
    printit(42);
    printit("hello world");
}

Printing the value directly (by reference). Godbolt.

#include <ostream>

class PrintableRef {
    const void *data_;
    void (*print_)(std::ostream&, const void *);
public:
    template<class T>
    PrintableRef(const T& t) : data_(&t), print_([](std::ostream& os, const void *data) {
        os << *(const T*)data;
    }) { }

    friend std::ostream& operator<<(std::ostream& os, const PrintableRef& self) {
        self.print_(os, self.data_);
        return os;
    }
};

#include <iostream>

void printit(PrintableRef p) {
    std::cout << "The printable thing was: " << p << "." << std::endl;
}

int main() {
    printit(42);
    printit("hello world");
}

Of these two type erasures, one unifies the interface: it remembers the value's type, binding the print method to that type.

The other instantiates the print method up front and stores the value's address as a void*.

These are exactly the two techniques from the list above, spelled out.

The first, virtual-function approach has a cost.

Speaking of std::function and std::any: the standard library optimizes this virtual dispatch away with the small buffer optimization (SBO).

Start from the design space of std::function:

  • Should the function be stored? Stored: std::function; not stored: function_ref (a view type, proposed).

    • The question is whether you need to own the callable and manage its lifetime; function_ref only uses it.
  • Do you need copies? Copyable: std::function; non-copyable: std::unique_function (a unique-ownership variant, proposed), i.e. move-only.

  • Do you need sharing? Sharing brings side effects into the callable:

    auto f = [i=0]() mutable { return ++i; };
    F<int()> alpha = f;
    F<int()> beta = alpha;
    F<int()> gamma = f;
    assert(alpha() == 1);
    assert(beta() == 2);  // beta shares alpha's heap-managed state
    assert(gamma() == 1);  
    

    There might end up being a shared_function for this (which I find redundant).

  • SBO, similar to SSO: keep a buffer inside the object and avoid heap allocation.

    • What buffer size? Customizable or not? Standard libraries currently differ: clang's libc++ uses 24 bytes, gcc's libstdc++ 16.

    Designing your own, you might make it customizable:

    template<class Signature, size_t Capacity = 24, size_t Align = alignof(std::max_align_t)>
    class F;
      
    using SSECallback = F<int(), 32, 32>;  // suitable for lambdas that capture MMX vector type
    

    Has nobody thought of this before? Of course they have — it's implemented as inplace_function.

  • What if SBO can't apply? Should an allocator interface be supported?

  • Forced SBO: anything that can't use SBO fails to compile — inplace_function does this.

  • SBO requires the object to be nothrow-movable:

    • static_assert(std::is_nothrow_move_constructible_v<T>) inside the constructor of F.
  • How should SBO treat types that are not trivially copyable?

    • libc++ still guarantees SBO for them; libstdc++ does not.
    • Control it with static_assert(is_trivially_relocatable_v && sizeof(T) <= Capacity && alignof(T) <= Align) inside the constructor of F.
  • Can the function be empty? Can it be constructed from nullptr?

  • Can functions convert between one another's types?

Plenty of corners remain that I don't feel like covering — which is exactly why function-related proposals keep appearing.


ref

  • https://www.newsmth.net/nForum/#!article/Programming/3083 — stumbled on a 2002 post introducing boost::any; a trace of history
  • https://akrzemi1.wordpress.com/2013/11/18/type-erasure-part-i/
  • More history: any_iterator http://thbecker.net/free_software_utilities/type_erasure_for_cpp_iterators/any_iterator.html
  • A gcc-source-level walkthrough of std::function: https://www.cnblogs.com/jerry-fuyi/p/std_function_interface_implementation.html
  • A std::function implementation walkthrough, from basics on up: https://zhuanlan.zhihu.com/p/142175297
  • A well-written article — I was halfway through writing my own when I found it; just read this one: https://fuzhe1989.github.io/2017/10/29/cpp-type-erasure/


Several ways to implement reflection in C++, plus a few libraries

People's requirements really are complicated: they want name information and a generic access interface at the same time.

Approaches to implementing reflection:

  • Preprocess a pass over the source
    • Representatives: Qt and Unreal. Declare the fields with macros, then have a pre-preprocessor in the build pipeline expand the markers first.
    • Or do it with libclang: metareflect, cpp-reflection; there's also an introduction to the principle.
  • Registration, in several flavors
    • macro-assembled methods plus registration: Rttr
    • hand-written registration: meta
    • compiler deduction (limited): magic_get

ref

  • This restates the content of this post: https://blog.csdn.net/D_Guco/article/details/106744416
  • Rttr — the macro-registration approach
  • ponder — also has a registry, binding strings to function pointers
  • cista — see its site; the approach resembles magic_get, and it also offers macro injection; the author credits the inspiration to a forum post


boost.pfr (Precise and Flat Reflection), a.k.a. magic_get: usage and how it works

Purpose: the most basic reflection capability — accessing fields without naming them.

The design provides tuple-style access while keeping name information, achieved through static reflection.

Limitation: only simple aggregate types are supported; inheritance breaks it, even from an empty base:

struct simple_aggregate {  // SimpleAggregate
    std::string name;
    int age;
    boost::uuids::uuid uuid;
};

struct empty {             // SimpleAggregate
};

struct aggregate : empty { // not a SimpleAggregate
    std::string name;
    int age;
    boost::uuids::uuid uuid;
};

Usage:

#include <iostream>
#include <boost/pfr.hpp>
 
struct  Record
{
  std::string name;
  int         age;
  double      salary;
};
 
struct Point
{
  int x;
  int y;
};
 
int main()
{
  Point pt{2, 3};
  Record rec {"Baggins", 111, 999.99};
   
  auto print = [](auto const& member) {
    std::cout << member << " ";
  };  
  
  boost::pfr::for_each_field(rec, print);
  boost::pfr::for_each_field(pt, print);
}

The documentation also explains how it works:

  • at compile-time: use aggregate initialization to detect fields count in user-provided structure
    • BOOST_PFR_USE_CPP17 == 1:
      • at compile-time: structured bindings are used to decompose a type T to known amount of fields
    • BOOST_PFR_USE_CPP17 == 0 && BOOST_PFR_USE_LOOPHOLE == 1:
      • at compile-time: use aggregate initialization to detect fields count in user-provided structure
      • at compile-time: make a structure that is convertible to anything and remember types it has been converted to during aggregate initialization of user-provided structure
      • at compile-time: using knowledge from previous steps create a tuple with exactly the same layout as in user-provided structure
      • at compile-time: find offsets for each field in user-provided structure using the tuple from previous step
      • at run-time: get pointer to each field, knowing the structure address and each field offset
      • at run-time: a tuple of references to fields is returned => all the tuple methods are available for the structure
    • BOOST_PFR_USE_CPP17 == 0 && BOOST_PFR_USE_LOOPHOLE == 0:
      • at compile-time: let I be is an index of current field, it equals 0
      • at run-time: T is constructed and field I is aggregate initialized using a separate instance of structure that is convertible to anything
      • at compile-time: I += 1
      • at compile-time: if I does not equal fields count goto step c. from inside of the conversion operator of the structure that is convertible to anything
      • at compile-time: using knowledge from previous steps create a tuple with exactly the same layout as in user-provided structure
      • at compile-time: find offsets for each field in user-provided structure using the tuple from previous step
      • at run-time: get pointer to each field, knowing the structure address and each field offset
  • at run-time: a tuple of references to fields is returned => all the tuple methods are available for the structure

It's the C++17–C++20 era now, so consider BOOST_PFR_USE_CPP17 == 1: structured bindings plus expansion.

The prototype is roughly:

template <typename T, typename F>
 // requires std::is_aggregate_v<T>
void for_each_member(T const & v, F f);

First, we need to detect how many fields the struct has:

template <typename T>
constexpr auto size_() 
  -> decltype(T{ {}, {}, {}, {} }, 0u)
{ return 4u; }
 
template <typename T>
constexpr auto size_() 
  -> decltype(T{ {}, {}, {} }, 0u)
{ return 3u; }
 
template <typename T>
constexpr auto size_() 
  -> decltype(T{ {}, {} }, 0u)
{ return 2u; }
 
template <typename T>
constexpr auto size_() 
  -> decltype(T{ {} }, 0u)
{ return 1u; }
 
template <typename T>
constexpr auto size_() 
  -> decltype(T{}, 0u)
{ return 0u; }
 
template <typename T>
constexpr size_t size() 
{ 
  static_assert(std::is_aggregate_v<T>);
  return size_<T>(); 
}

The key is decltype(T{ {}, {} }, 0u): it's a comma expression, so the left operand's value doesn't matter — the deduced type is always unsigned.

But whether T can be list-initialized from that many initializers reveals the field count: whichever overload matches, its return value is the number of fields.

We've assumed every field can be value-initialized; some fields can't be initialized that way, so add a cast helper that can force a conversion to anything:

struct init
{
  template <typename T>
  operator T(); // never defined
};
template <typename T>
constexpr auto size_() 
  -> decltype(T{init{}, init{}, init{}, init{}\
               }, 0u)
{ return 4u; }
 
template <typename T>
constexpr auto size_() 
  -> decltype(T{init{}, init{}, init{}\
               }, 0u)
{ return 3u; }
 
template <typename T>
constexpr auto size_() 
  -> decltype(T{init{}, init{}}, 0u)
{ return 2u; }
 
template <typename T>
constexpr auto size_() 
  -> decltype(T{init{}}, 0u)
{ return 1u; }
 
template <typename T>
constexpr auto size_() 
  -> decltype(T{}, 0u)
{ return 0u; }
 
template <typename T>
constexpr size_t size() 
{ 
  static_assert(std::is_aggregate_v<T>);
  return size_<T>(); 
}

It looks done, but size<Point>() still fails to compile: a simple type doesn't require every field to be initialized, so several overloads can match at once.

Introduce tag dispatch:

template <unsigned I>
struct tag : tag<I - 1> {};
 
template <>
struct tag<0> {};
template <typename T>
constexpr auto size_(tag<4>) 
  -> decltype(T{init{}, init{}, init{}, init{}}, 0u)
{ return 4u; }
 
template <typename T>
constexpr auto size_(tag<3>) 
  -> decltype(T{init{}, init{}, init{}}, 0u)
{ return 3u; }
 
template <typename T>
constexpr auto size_(tag<2>) 
  -> decltype(T{init{}, init{}}, 0u)
{ return 2u; }
 
template <typename T>
constexpr auto size_(tag<1>) 
  -> decltype(T{init{}}, 0u)
{ return 1u; }
 
template <typename T>
constexpr auto size_(tag<0>) 
  -> decltype(T{}, 0u)
{ return 0u; }
 
template <typename T>
constexpr size_t size() 
{ 
  static_assert(std::is_aggregate_v<T>);
  return size_<T>(tag<4>{}); // highest supported number 
}

Now there is no ambiguous match.

The corresponding for_each is just structured bindings:

template <typename T, typename F>
void for_each_member(T const& v, F f)
{
  static_assert(std::is_aggregate_v<T>);
 
  if constexpr (size<T>() == 4u)
  {
    const auto& [m0, m1, m2, m3] = v;
    f(m0); f(m1); f(m2); f(m3);
  }
  else if constexpr (size<T>() == 3u)
  {
    const auto& [m0, m1, m2] = v;
    f(m0); f(m1); f(m2);
  }
  else if constexpr (size<T>() == 2u)
  {
    const auto& [m0, m1] = v;
    f(m0); f(m1);
  }
  else if constexpr (size<T>() == 1u)
  {
    const auto& [m0] = v;
    f(m0);
  }
}

Once you know the size, generalizing is easy.

boost.pfr's real implementation is far more general — worth studying when there's time.





How to use std::exchange

There isn't much to this function — it exists mainly to steal things. The implementation is trivial:

template<class T, class U = T>
constexpr // since C++20
T exchange(T& obj, U&& new_value)
{
    T old_value = std::move(obj);
    obj = std::forward<U>(new_value);
    return old_value;
}

For example, the move constructor in reference link 1:

struct S
{
  int n;
 
  S(S&& other) noexcept : n{std::exchange(other.n, 0)}
  {}
 
  S& operator=(S&& other) noexcept 
  {
    if(this != &other)
        n = std::exchange(other.n, 0); // move n, leaving zero in other.n
    return *this;
  }
};

A use I've seen in the wild:

template <promise_base::urgent Urgent>
void promise_base::make_ready() noexcept {
    if (_task) {
        if (Urgent == urgent::yes) {
            ::seastar::schedule_urgent(std::exchange(_task, nullptr));
        } else {
            ::seastar::schedule(std::exchange(_task, nullptr));
        }
    }
}

The natural question is how it compares to std::swap. Straight to the conclusion: its ceiling is std::swap's performance, and unless you need the stealing behavior, don't use it.

A Stack Overflow answer ran a simple benchmark; see reference link 2.

Ben Deane also has a set of std::exchange idioms, covered in reference links 3 and 4. Briefly:

use std::exchange to eliminate unnecessary temporaries. The slides in link 3 are worth a look — very elegant. He calls it the "swap-and-iterate" pattern.

I'll paste the code from reference link 4.

Before, with swap:

class Dispatcher {
    // We hold some vector of callables that represents
    // events to dispatch or actions to take
    using Callback = /* some callable */;
    std::vector<Callback> callbacks_;
 
    // Anyone can register an event to be dispatched later
    void defer_event(const Callback& cb) {
        callbacks_.push_back(cb);
    }
 
    // All events are dispatched when we call process
    void process() {
        std::vector<Callback> tmp{};
        using std::swap; // the "std::swap" two-step
        swap(tmp, callbacks_);
        for (const auto& callback : tmp) {
            std::invoke(callback);
        }
    }
  
    void post_event(Callback& cb) {
        Callback tmp{};
        using std::swap;
        swap(cb, tmp);
        PostToMainThread([this, cb_ = std::move(tmp)] {
            callbacks_.push_back(cb_);
        });
    }
};

Rewritten with exchange:

class Dispatcher {
    // ...
 
    // All events are dispatched when we call process
    void process() {
        for (const auto& callback : std::exchange(callbacks_, {})) {
            std::invoke(callback);
        }
    }
    
    void post_event(Callback& cb) {
        PostToMainThread([this, cb_ = std::exchange(cb, {})] {
            callbacks_.push_back(cb_);
        });
    }
};

You might ask: why not just std::move and be done with it? The author's point here is interface flexibility:

std::move does not guarantee the source becomes empty or cleared — it may still hold a value, e.g. std::optional.

Combining with a lock

The original std::swap version looks like this:

class Dispatcher {
    // ...
 
    // All events are dispatched when we call process
    void process() {
        std::vector<Callback> tmp{};
        {
            using std::swap;
            std::scoped_lock lock{mutex_};
            swap(tmp, callbacks_);
        }
        for (const auto& callback : tmp) {
            std::invoke(callback);
        }
    }
};

Rewritten with exchange, saving a vector:

class Dispatcher {
    // ...
 
    // All events are dispatched when we call process
    void process() {
        std::scoped_lock lock{mutex_};
        for (const auto& callback : std::exchange(callbacks_, {})) {
            std::invoke(callback);
        }
    }
};

Can the lock block be elided too? A temporary's lifetime lasts to the end of the full expression — one line is enough:

class Dispatcher {
    // ...
 
    // All events are dispatched when we call process
    void process() {
        const auto tmp = (std::scoped_lock{mutex_}, std::exchange(callbacks_, {}));
        for (const auto& callback : tmp) {
            std::invoke(callback);
        }
    }
};

ref

  • https://zh.cppreference.com/w/cpp/utility/exchange
  • https://stackoverflow.com/questions/20807938/stdswap-vs-stdexchange-vs-swap-operator
  • https://github.com/CppCon/CppCon2017/blob/master/Lightning%20Talks%20and%20Lunch%20Sessions/std%20exchange%20idioms/std%20exchange%20idioms%20-%20Ben%20Deane%20-%20CppCon%202017.pdf
  • https://www.fluentcpp.com/2020/09/25/stdexchange-patterns-fast-safe-expressive-and-probably-underused/


Probing whether a pointer address is valid


This post is excerpted from here.

Scenario: touching an invalid address segfaults on the spot. How can we probe an address to avoid that?

Two approaches:

  • catch the SIGSEGV signal
  • check the address ranges in /proc/self/maps and validate against them
    • problem: races with other threads can cause misjudgment — not viable

The author wrote a small program; it's listed here just to show the principle:

#define _GNU_SOURCE
#include <stdint.h>
#include <signal.h>
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>
#include <ucontext.h>

#ifdef __i386__
typedef uint32_t word_t;
#define IP_REG REG_EIP
#define IP_REG_SKIP 3
#define READ_CODE __asm__ __volatile__(".byte 0x8b, 0x03\n"  /* mov (%ebx), %eax */ \
                                       ".byte 0x41\n"        /* inc %ecx */ \
                                       : "=a"(ret), "=c"(tmp) : "b"(addr), "c"(tmp));
#endif

#ifdef __x86_64__
typedef uint64_t word_t;
#define IP_REG REG_RIP
#define IP_REG_SKIP 6
#define READ_CODE __asm__ __volatile__(".byte 0x48, 0x8b, 0x03\n"  /* mov (%rbx), %rax */ \
                                       ".byte 0x48, 0xff, 0xc1\n"  /* inc %rcx */ \
                                       : "=a"(ret), "=c"(tmp) : "b"(addr), "c"(tmp));
#endif

static void segv_action(int sig, siginfo_t *info, void *ucontext) {
    (void) sig;
    (void) info;
    ucontext_t *uctx = (ucontext_t*) ucontext;
    uctx->uc_mcontext.gregs[IP_REG] += IP_REG_SKIP;
}

struct sigaction peek_sigaction = {
    .sa_sigaction = segv_action,
    .sa_flags = SA_SIGINFO,
    .sa_mask = 0,
};

word_t peek(word_t *addr, int *success) {
    word_t ret;
    int tmp, res;
    struct sigaction prev_act;

    res = sigaction(SIGSEGV, &peek_sigaction, &prev_act);
    assert(res == 0);

    tmp = 0;
    READ_CODE

    res = sigaction(SIGSEGV, &prev_act, NULL);
    assert(res == 0);

    if (success) {
        *success = tmp;
    }

    return ret;
}

int main() {
    int success;
    word_t number = 22;
    word_t value;

    number = 22;
    value = peek(&number, &success);
    printf("%d %d\n", success, value);

    value = peek(NULL, &success);
    printf("%d %d\n", success, value);

    value = peek((word_t*)0x1234, &success);
    printf("%d %d\n", success, value);

    return 0;
}

Just for fun: this manipulates the instruction-pointer register in the signal context, and correctness isn't guaranteed (almost certainly wrong with multiple threads — sigaction is process-wide, so it would belong at the outermost level).

Also, unless you're writing some shared-memory program, just let the segfault kill the process; don't try to rescue it.



(CppCon) Practical memory pool based allocators for Modern C++

Yet another talk on memory pool implementation; a memory pool is really a pool of blocks.

The basic unit is the bucket; a bucket has two properties: BlockSize and BlockCount.

A bucket's main interface: construct/destroy, allocate/deallocate:

class bucket {
public:
	const std::size_t BlockSize;
	const std::size_t BlockCount;
	bucket(std::size_t block_size, std::size_t block_count);
	~bucket();
	// Tests if the pointer belongs to this bucket
	bool belongs(void * ptr) const noexcept;
	// Returns nullptr if failed
	[[nodiscard]] void * allocate(std::size_t bytes) noexcept;
	void deallocate(void * ptr, std::size_t bytes) noexcept;
private:
	// Finds n free contiguous blocks in the ledger and returns the first block’s index or BlockCount on failure
	std::size_t find_contiguous_blocks(std::size_t n) const noexcept;
	// Marks n blocks in the ledger as “in-use” starting at ‘index’
	void set_blocks_in_use(std::size_t index, std::size_t n) noexcept;
	// Marks n blocks in the ledger as “free” starting at ‘index’
	void set_blocks_free(std::size_t index, std::size_t n) noexcept;
	// Actual memory for allocations
	std::byte* m_data{nullptr};
	// Reserves one bit per block to indicate whether it is in-use
	std::byte* m_ledger{nullptr};
};

bucket::bucket(std::size_t block_size, std::size_t block_count)
: BlockSize{block_size}
, BlockCount{block_count}
{
	const auto data_size = BlockSize * BlockCount;
	m_data = static_cast<std::byte*>(std::malloc(data_size));
	assert(m_data != nullptr);
	const auto ledger_size = 1 + ((BlockCount - 1) / 8);
	m_ledger = static_cast<std::byte*>(std::malloc(ledger_size));
	assert(m_ledger != nullptr);
	std::memset(m_data, 0, data_size);
	std::memset(m_ledger, 0, ledger_size);
}
bucket::~bucket() {
	std::free(m_ledger);
	std::free(m_data);
}


void * bucket::allocate(std::size_t bytes) noexcept {
	// Calculate the required number of blocks
	const auto n = 1 + ((bytes - 1) / BlockSize);
	const auto index = find_contiguous_blocks(n);
	if (index == BlockCount) {
		return nullptr;
	}
	set_blocks_in_use(index, n);
	return m_data + (index * BlockSize);
}

void bucket::deallocate(void * ptr, std::size_t bytes) noexcept {
	const auto p = static_cast<const std::byte *>(ptr);
	const std::size_t dist = static_cast<std::size_t>(p - m_data);
	// Calculate block index from pointer distance
	const auto index = dist / BlockSize;
	// Calculate the required number of blocks
	const auto n = 1 + ((bytes - 1) / BlockSize);
	// Update the ledger
	set_blocks_free(index, n);
}

Buckets are then assembled into a pool, each parameterized by BlockSize and BlockCount:

// The default implementation defines a pool with no buckets
template<std::size_t id>
struct bucket_descriptors {
	using type = std::tuple<>;
};

struct bucket_cfg16 {
	static constexpr std::size_t BlockSize = 16;
	static constexpr std::size_t BlockCount = 10000;
};
struct bucket_cfg32{
	static constexpr std::size_t BlockSize = 32;
	static constexpr std::size_t BlockCount = 10000;
};
struct bucket_cfg1024 {
	static constexpr std::size_t BlockSize = 1024;
	static constexpr std::size_t BlockCount = 50000;
};
template<>
struct bucket_descriptors<1> {
	using type = std::tuple<bucket_cfg16, bucket_cfg32, bucket_cfg1024>;
};

template<std::size_t id>
using bucket_descriptors_t = typename bucket_descriptors<id>::type;

template<std::size_t id>
static constexpr std::size_t bucket_count = std::tuple_size<bucket_descriptors_t<id>>::value;


template<std::size_t id>
using pool_type = std::array<bucket, bucket_count<id>>;

template<std::size_t id, std::size_t Idx>
struct get_size
    : std::integral_constant<std::size_t, std::tuple_element_t<Idx, bucket_descriptors_t<id>>::BlockSize> {};

template<std::size_t id, std::size_t Idx>
struct get_count
    : std::integral_constant<std::size_t, std::tuple_element_t<Idx, bucket_descriptors_t<id>>::BlockCount> {};

template<std::size_t id, std::size_t... Idx>
auto & get_instance(std::index_sequence<Idx...>) noexcept {
	static pool_type<id> instance{{ {get_size<id, Idx>::value, get_count<id, Idx>::value} ... }};
	return instance;
}
template<std::size_t id>
auto & get_instance() noexcept {
	return get_instance<id>(std::make_index_sequence<bucket_count<id>>());
}

Now the concrete allocation policy: how do we find the blocks we need?

Scan for the first bucket with room — a bit like open addressing in a hash map. Wasteful:

// Assuming buckets are sorted by their block sizes
template<std::size_t id>
[[nodiscard]] void * allocate(std::size_t bytes) {
	auto & pool = get_instance<id>();
	for (auto & bucket : pool) {
		if(bucket.BlockSize >= bytes) {
			if(auto ptr = bucket.allocate(bytes); ptr != nullptr) {
				return ptr;
			}
		}
	}
	throw std::bad_alloc{};
}

Better, with extra bookkeeping — compute each bucket's waste first and try the least wasteful one:

template<std::size_t id>
[[nodiscard]] void * allocate(std::size_t bytes) {
	auto & pool = get_instance<id>();
	std::array<info, bucket_count<id>> deltas;
	std::size_t index = 0;
	for (const auto & bucket : pool) {
		deltas[index].index = index;
		if (bucket.BlockSize >= bytes) {
			deltas[index].waste = bucket.BlockSize - bytes;
			deltas[index].block_count = 1;
		} else {
			const auto n = 1 + ((bytes - 1) / bucket.BlockSize);
			const auto storage_required = n * bucket.BlockSize;
			deltas[index].waste = storage_required - bytes;
			deltas[index].block_count = n;
		}
		++index;
	}

    sort(deltas.begin(), deltas.end()); // std::sort() is allowed to allocate
    
	for (const auto & d : deltas)
		if (auto ptr = pool[d.index].allocate(bytes); ptr != nullptr)
			return ptr;
	
    throw std::bad_alloc{};
}

What about fragmentation?

Implementing the allocator interface

Not covered in detail — just read the code:

template<typename T = std::uint8_t, std::size_t id = 0>
class static_pool_allocator {
public:
	// rebind probably doesn't need implementing; I recall it's deprecated
	template<typename U>
	static_pool_allocator(const static_pool_allocator<U, id> & other) noexcept
		: m_upstream_resource{other.upstream_resource()} {}
	template<typename U>
	static_pool_allocator & operator=(const static_pool_allocator<U, id> & other) noexcept {
		m_upstream_resource = other.upstream_resource();
		return *this;
	}
	static bool initialize_memory_pool() noexcept { return memory_pool::initialize<id>(); }
private:
	pmr::memory_resource * m_upstream_resource;
};

Afterwards the talk introduces a tool for analyzing allocations, built on clang:

  • compile to LLVM bitcode with -g -O0 -emit-llvm -DNDEBUG, then link with llvm-link
  • write an LLVM pass
    • that prints the calls
  • run the pass with the llvm opt command

The pass looks like this:

class AllocListPass : public llvm::FunctionPass {
public:
	static char ID;
	AllocListPass() : llvm::FunctionPass(ID) {}

    bool runOnFunction(llvm::Function & f) override {
		const auto pretty_name = boost::core::demangle(f.getName().str().c_str());
		static const std::regex call_regex{R"(void instrument::type_reg<([^,]+),(.+),([^,]+)>\(\))"};
		std::smatch match;
		if (std::regex_match(pretty_name, match, call_regex)) {
			if (match.size() == 4) {
				const auto pool_id = std::atoi(match[1].str().c_str());
				const auto type = match[2].str();
				const auto size = std::atoi(match[3].str().c_str());
				std::cout << "Pool ID: " << pool_id << ", Size: " << size << ", Type: " << type << "\n";
			}
		}
		return false; // does not alter the code, a read-only pass
	}
};
char AllocListPass::ID = 0;
static llvm::RegisterPass<AllocListPass> dummy("alloc-list", "This pass lists memory pool allocations");

How the llvm::ModulePass works:

  • DFS from the entry points (main, etc.)

  • find type_reg<>
    • record the allocation info
    • print the call chain
  • detect recursion, skip some branches

The output looks like this:

Call graph for: Pool ID: 3, Size: 24, Type: std::__1::__list_node<int, void*>:
1. static_pool_allocator<std::__1::__list_node<int, void*>, 3ul>::allocate(unsigned long, void const*) called at /usr/include/c++/v1/memory:1547
2. std::__1::allocator_traits<static_pool_allocator<std::__1::__list_node<int, void*>, 3ul>>::allocate(static_pool_allocator<std::__1::__list_node<int, void*>, 3ul>&,
unsigned long) called at /usr/include/c++/v1/list:1079
3. std::__1::list<int, static_pool_allocator<int, 3ul>>::__allocate_node(static_pool_allocator<std::__1::__list_node<int, void*>, 3ul>&) called at
/usr/include/c++/v1/list:1569
4. std::__1::list<int, static_pool_allocator<int, 3ul>>::push_back(int const&) called at /home/program.cpp:12
5. x() called at /home/program.cpp:7
6. f() called at /home/program.cpp:2

The llvm opt invocation:

opt -load alloc-analyzer.so -alloc-analyze -gen-hdr my_defs.hpp -entry-point "main"< home/program.bc -o /dev/null

ref

  • https://github.com/CppCon/CppCon2020/blob/main/Presentations/practical_memory_pool_based_allocators_for_modern_cpp/practical_memory_pool_based_allocators_for_modern_cpp__misha_shalem__cppcon_2020.pdf

