Version 5.5.0 is released!


Changes in this release:

  • Added the gpu_bfloat16 type, available with the CUDA backend if the C++ compiler supports std::bfloat16_t.
  • Many performance optimisations, especially for gpu_half and gpu_bfloat16 types.
  • Added the gpu_return() and allow_preload() functions.
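
Since gpu_bfloat16 depends on compiler support for std::bfloat16_t, it may help to check that support up front. The sketch below is not part of this library and does not show how the library itself performs detection. It relies only on the C++23 predefined macro __STDCPP_BFLOAT16_T__, and the helper names has_std_bfloat16() and bfloat16_support_message() are hypothetical, chosen for illustration.

```cpp
// Sketch: detect whether the compiler provides std::bfloat16_t (C++23).
// Helper names are hypothetical and unrelated to the library's own API.
#if defined(__has_include)
#  if __has_include(<stdfloat>)
#    include <stdfloat>  // declares std::bfloat16_t where supported
#  endif
#endif

// True when the implementation predefines __STDCPP_BFLOAT16_T__,
// which C++23 specifies for compilers that support std::bfloat16_t.
constexpr bool has_std_bfloat16() {
#if defined(__STDCPP_BFLOAT16_T__)
    return true;
#else
    return false;
#endif
}

#if defined(__STDCPP_BFLOAT16_T__)
// bfloat16 is a 16-bit format, so the type should occupy two bytes.
static_assert(sizeof(std::bfloat16_t) == 2, "unexpected bfloat16 size");
#endif

// Human-readable summary of the detection result.
inline const char* bfloat16_support_message() {
    return has_std_bfloat16()
        ? "std::bfloat16_t is supported by this compiler"
        : "std::bfloat16_t is not supported by this compiler";
}
```

Printing bfloat16_support_message() from a small test program, compiled with the same compiler and flags as the project (for example in C++23 mode), gives a quick indication of whether this prerequisite is met. CUDA backend availability is a separate requirement that this check does not cover.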