v1.5.0

github-actions released this 10 Apr 23:43

1.5.0 (2026-04-10)

Features

  • compute: add AllocDeviceFloat32 and CopyToDevice to FusedEncoderProvider (8d6c90b)
  • compute: add fused PatchTST encoder layer CUDA kernels (4dfd46e)

Bug Fixes

  • compute: GPUEngine.Reshape honors dst argument (18a53fe)
  • compute: reuse dst GPU memory instead of allocating per call (#84) (26bbd49)
  • kernels: rename kernel_add in fused_encoder_bwd to avoid symbol clash (716bbd6)