Pull requests: pytorch/FBGEMM
Pull requests list
carve out sm (#2024)
cla signed
fb-exported
meta-exported
#5011
opened Oct 16, 2025 by
henrylhtsang
A quick fix of "No such file or directory"
cla signed
fb-exported
meta-exported
#5010
opened Oct 16, 2025 by
Frederick-Zhu
add feature flag to gate detailed memory breakdown
cla signed
fb-exported
meta-exported
#5009
opened Oct 15, 2025 by
ashuaibi7
report scuba events for detailed sparse static memory info
cla signed
fb-exported
meta-exported
#5008
opened Oct 15, 2025 by
ashuaibi7
refactor function to get tensor bytes and add logging
cla signed
fb-exported
meta-exported
#5007
opened Oct 15, 2025 by
ashuaibi7
style: refactor function to single line for linter compliance (nfc)
cla signed
fb-exported
meta-exported
#5006
opened Oct 15, 2025 by
ashuaibi7
Blackwell decode Op
cla signed
fb-exported
meta-exported
#5004
opened Oct 15, 2025 by
Aya-ZIbra
pass stream for initialize
cla signed
fb-exported
meta-exported
#5003
opened Oct 14, 2025 by
henrylhtsang
Build time optimization (part 2)
cla signed
fb-exported
meta-exported
#5000
opened Oct 13, 2025 by
gchalump
Add Paged Attention to FMHA Cutlass Blackwell Forward kernel for fixed length
cla signed
fb-exported
meta-exported
#4999
opened Oct 13, 2025 by
sarithad-meta
add monitoring metrics for dram cache perf -- metadata read & write
cla signed
fb-exported
meta-exported
#4996
opened Oct 10, 2025 by
kathyxuyy
FP8 Convolution Kernel
cla signed
fb-exported
meta-exported
#4994
opened Oct 10, 2025 by
jwfromm
Try adding device sync before attention. (#2008)
cla signed
fb-exported
meta-exported
#4992
opened Oct 9, 2025 by
jwfromm
Adding python api to support sync trigger evict
cla signed
fb-exported
meta-exported
#4984
opened Oct 7, 2025 by
EddyLXJ
Add double type to be supported by permute_1D_sparse_data
cla signed
fb-exported
meta-exported
#4968
opened Oct 2, 2025 by
Shuchangd
Build time optimization
cla signed
fb-exported
meta-exported
#4954
opened Sep 30, 2025 by
gchalump
Switch back to NVIDIA/cutlass, and upgrade to v4.2.1
cla signed
#4949
opened Sep 30, 2025 by
jasl
Gate invalid triton autotune configs in AOTInductor for GFX95+
cla signed
fb-exported
meta-exported
#4940
opened Sep 26, 2025 by
JChunX
Back out "Update to use Python 3.9 syntax"
cla signed
fb-exported
meta-exported
#4928
opened Sep 24, 2025 by
q10
forward performance tuning for MI350
cla signed
module: rocm
#4925
opened Sep 24, 2025 by
liligwu
more hipify v2 fixes (#4854)
cla signed
fb-exported
meta-exported
module: rocm
#4921
opened Sep 23, 2025 by
q10
Support bf16 in blackwell cutlass decode attention kernel
cla signed
fb-exported
meta-exported
#4916
opened Sep 23, 2025 by
Aya-ZIbra