mirror of
https://github.com/NVIDIA/cutlass.git
synced 2026-03-24 00:57:33 +00:00
Commit Graph
Select branches
Hide Pull Requests
2.11
Deepseek
cutlass-3.5.0
cutlass_api
feature/2.10/updates_before_tagging
feature/3.0.0
feature/enable-mxfp-group-gemm-sm120
hwu36-patch-1
hwu36-patch-2
main
oss_ci
redirect
release/3.2.x
release/4.2
release/4.3
release/4.4
strided_output_conv
thakkarV-patch-1
thakkarv/4.0-changelog
v4
v0.1.0
v0.1.1
v1.0.0
v1.0.1
v1.1.0
v1.2.0
v1.3.0
v1.3.2
v1.3.3
v2.0.0
v2.1.0
v2.10.0
v2.11.0
v2.2.0
v2.3.0
v2.4.0
v2.5.0
v2.6.0
v2.6.1
v2.7.0
v2.8.0
v2.9.0
v2.9.1
v3.0.0
v3.1.0
v3.2.0
v3.2.1
v3.2.2
v3.3.0
v3.4.0
v3.4.1
v3.5.0
v3.5.1
v3.6.0
v3.7.0
v3.8.0
v3.9.0
v3.9.1
v3.9.2
v4.0.0
v4.1.0
v4.2.0
v4.2.1
v4.3.0
v4.3.1
v4.3.2
v4.3.3
v4.3.4
v4.3.5
v4.4.0
v4.4.1
v4.4.2
-
982748aa73
[Hopper CuTeDSL] Add grouped GEMM persistent kernel and tests (#3091)
main
Johnsonms
2026-03-17 21:40:15 -07:00 -
da5e086dab
v4.4.2 update. (#3105)
v4.4.2
release/4.4
Junkai-Wu
2026-03-17 12:58:41 +08:00 -
1b741cabaa
v4.4.2 update. (#3104)
Junkai-Wu
2026-03-17 12:58:19 +08:00 -
772fbb264e
[CLI] add cutedsl fp16 gemm tutorial from 2 to 6 (#3106)
Linfeng Zheng
2026-03-17 10:11:55 +08:00 -
087c84df83
docs: Fix float16 documentation in elementwise_add notebook (#2949) (#3047)
Blake Ledden
2026-03-11 19:29:46 -07:00 -
6a188a33cb
Placeholder change.
oss_ci
Zekun Fan
2025-11-14 16:09:03 -08:00 -
73c59c055c
Support for Group GEMM in CUTLASS Profiler for Geforce and Spark (#3092)
dePaul Miller
2026-03-06 17:36:29 -08:00 -
e5fcd125a5
[fix] Boolean.__dsl_and__ emits arith.andi directly for i1 operands (#3087)
Johnsonms
2026-03-05 01:20:26 -08:00 -
a93d86ec83
Fix finding cuDNN (#2890)
TLescoatTFX
2026-03-05 02:51:37 +01:00 -
49e54f2b23
fix: add_help=False in temporary parser (#2721)
David W.H. Swenson
2026-03-02 01:33:42 -06:00 -
b9847690c5
Merge pull request #3028 from SzymonOzog/patch-3
drazi
2026-02-28 10:11:05 +08:00 -
4370102f9d
v4.4.1 update (#3080)
v4.4.1
Junkai-Wu
2026-02-28 03:01:08 +08:00 -
3bb6e28d3c
v4.4.1 update (#3079)
Junkai-Wu
2026-02-28 02:59:21 +08:00 -
c651d660d2
fix typo (#3012)
Tianqi Zhang (张天启)
2026-02-27 16:25:35 +08:00 -
518327d631
Fix error in Blackwell document of referring to Mxf4 format as NVF4 (#2977)
Ziang Li
2026-02-27 00:25:16 -08:00 -
de67bb7a42
Fix example in CuTe tutorials (#2752)
StevenYangCC
2026-02-27 16:24:34 +08:00 -
edf2f82c00
Fix register index bug in mma.sync.aligned.m16n8k16 (#2740)
Neil Kichler
2026-02-27 09:24:18 +01:00 -
79345359a7
Fix debug typo in sgemm_2.cu and sgemm_sm70.cu (#2678)
mnehete32
2026-02-27 13:53:59 +05:30 -
8b9b3d78df
fix typo in documentation (#2671)
zkyue
2026-02-27 16:23:37 +08:00 -
fc5bbc2dab
Fix typo in cute.nvgpu.warpgroup.mma doc (#2548)
Gabriel Wu
2026-02-27 16:22:55 +08:00 -
057635de5c
Remove redundant dsl example. (#3074)
Junkai-Wu
2026-02-26 21:10:59 +08:00 -
c213bfdfc1
Remove redundant dsl examples. (#3071)
v4.4.0
Junkai-Wu
2026-02-26 11:42:01 +08:00 -
954503d44c
Bump version to 4.4.0
Haicheng Wu
2026-02-25 00:04:04 -05:00 -
6c4200f1bc
Bump version from 4.3.5 to 4.4.0
Haicheng Wu
2026-02-25 00:03:23 -05:00 -
de93e8a4ac
Bump version from 4.3.5 to 4.4.0
Haicheng Wu
2026-02-25 00:03:04 -05:00 -
b92b9f0d37
Bump version from 4.3.5 to 4.4.0
Haicheng Wu
2026-02-25 00:02:41 -05:00 -
2aedca6f5e
Bump CUTLASS version to 4.4.0
Haicheng Wu
2026-02-25 00:01:56 -05:00 -
ae5ed7361b
Bump CUTLASS version to 4.4.0
hwu36-patch-2
Haicheng Wu
2026-02-25 00:01:26 -05:00 -
6450964b57
Update README
Haicheng Wu
2026-02-24 23:55:55 -05:00 -
284449fa5b
Revise chagnelog
Haicheng Wu
2026-02-24 23:54:56 -05:00 -
0853d81d70
Revise README
Haicheng Wu
2026-02-24 15:32:17 -05:00 -
3476ddb7bd
remove mixed_input_fmha_prefill (#3041)
Linfeng Zheng
2026-02-18 20:59:01 +08:00 -
291300ffff
[CuTeDSL] implment a cta-level norm example (both layernorm and rmsnorm) (#3009)
Yihan Chen
2026-02-14 17:54:03 +08:00 -
f9a5f76b7a
Replace fence proxy to the latest routine code in examples/distributed/all_reduce_tma.py (#3027)
aragorn-guan
2026-02-14 17:51:20 +08:00 -
ec7e6cb17b
Merge pull request #2971 from rsmallblue/tvm-ffi
drazi
2026-02-14 14:14:10 +08:00 -
395ab575f6
Merge branch 'main' into tvm-ffi
Yuan Xiaolan
2026-02-14 13:35:28 +08:00 -
d4bbf728ca
v4.4 tag release update. (#3032)
Junkai-Wu
2026-02-14 12:27:58 +08:00 -
beb80e04e1
Add option to not suffix prints with new line
Szymon Ożóg
2026-02-13 15:56:50 +01:00 -
01687cfba1
Merge pull request #3004 from tridao/add-sub-packed-f32x2
drazi
2026-02-13 20:46:26 +08:00 -
5c42d0f28c
Merge pull request #3021 from tridao/clc_no_multicast
drazi
2026-02-13 20:45:52 +08:00 -
1d36152f34
Merge pull request #3022 from tridao/nvvm_fmin
drazi
2026-02-13 20:45:08 +08:00 -
244e8d00d5
[Cute-DSL] Add cute.arch.fmin by calling nvvm
Tri Dao
2026-02-11 14:23:09 -05:00 -
5b83b34afd
[Cute-DSL] Add option for issue_clc_query without multicast
Tri Dao
2026-02-11 14:19:29 -05:00 -
8dbce01473
[CuTeDSL] Distributed example, using TMA load to access remote memory rank-by-rank, reducing in cta, broadcast result to all ranks by multimem TMA store (#2970)
aragorn-guan
2026-02-11 11:54:00 +08:00 -
71aa7a0abc
Merge pull request #2919 from pbelevich/patch-1
drazi
2026-02-11 11:48:58 +08:00 -
51935551fb
[CuTeDSL] Add sub_packed_f32x2 operation
Tri Dao
2026-02-04 21:18:46 +07:00 -
6b3e607b85
v4.4 release update v2. (#2999)
Junkai-Wu
2026-02-04 09:48:31 +08:00 -
de161925a5
pass in stream=-1
yuanxiaolan
2026-02-03 11:55:54 +08:00 -
de198b2419
fix tvm-ffi path in from_dlpack
yuanxiaolan
2026-01-22 13:47:45 +08:00 -
1cfbb53a23
[CuTeDSL] Fix: SM100 block-scale gemm overlapping accumulator (#2995)
Hua Huang
2026-02-03 11:01:41 +08:00 -
a4eb0e05f6
fix performance inssues in cute-dsl examples for 4.4-ctk13.1 release (#2988)
dongxiao
2026-01-30 13:31:04 +08:00 -
d252b01300
fix performance regression in cute-dsl examples for 4.4-ctk13.1 release (#2990)
myu-guo
2026-01-30 13:30:49 +08:00 -
acb45938e9
Update nvvm API call from nvvm enum to str (#2985)
Xiao Song
2026-01-27 17:28:29 +08:00 -
7a14467776
update api usage (#2969)
Xiao Song
2026-01-27 15:33:22 +08:00 -
51f82812ec
Merge pull request #2891 from ColinPeppler/main
drazi
2026-01-26 17:38:27 -08:00 -
9fba3195f9
v4.4 update. (#2979)
Junkai-Wu
2026-01-25 00:46:17 +08:00 -
2fafefb7b9
[Bug Fix]Set NumSplitsM to 1 when TileShapeM < 128 in sm90 fp8 blockwise scaling CollectiveMma (#2965)
Qi Yuhang
2026-01-23 15:56:52 +08:00 -
0edaa6e47d
Fix out-of-bounds TMA access in wgmma_tma_sm90 tutorial (#2945)
Johnsonms
2026-01-22 20:54:12 -08:00 -
431d070fcb
[docs] Add additional tip for generating less kernels in blockwise (#2940)
Aidan Do
2026-01-22 20:53:51 -08:00 -
667446a9dd
[Doc]Fix Mode Name and Stride in 0t_mma_atom.md (#2910)
Qi Yuhang
2026-01-23 12:53:30 +08:00 -
3f5bafb326
[Cutlass profiler] Fix SM100 FP8 nosmem epilogue shape_div “Divisibility Condition” for non‑multiple‑of‑64 N tiles (#2946)
Aidan Do
2026-01-19 23:27:34 -08:00 -
1e6da09275
[DOCS] Update docs to precisely describe env stream scenario (#2824)
Tianqi Chen
2026-01-19 20:16:37 -05:00 -
e594def95e
Don't access data_ptr of fake tensor. Fix EFC w/o epilogue
cutlass_api
jkosaian
2026-01-14 18:00:08 -08:00 -
8debf77437
fix: 2305 omissions (#2957)
Benjamin Leff
2026-01-13 23:55:05 -06:00 -
e222b2a9b9
Update TVM FFI version
jkosaian
2026-01-13 07:58:48 -08:00 -
87cab7bae2
2026-01-12 updates
jkosaian
2026-01-12 18:51:25 -08:00 -
147f5673d0
New RMS Norm example with unit tests (#2917)
Brian K. Ryu
2026-01-12 17:05:31 -08:00 -
8c52459504
Fix incorrect tensor layout strides in Blackwell MMA tutorial comments (#2921)
Johnsonms
2026-01-08 22:02:41 -08:00 -
0deda34b9f
fix typo (#2884)
kf-zhang
2026-01-09 13:57:06 +08:00 -
0d2b201e8c
v4.3.5 update. (#2934)
Junkai-Wu
2026-01-09 04:02:56 +08:00 -
4faf1a1568
v4.3.5 update. (#2935)
v4.3.5
release/4.3
Junkai-Wu
2026-01-09 04:02:14 +08:00 -
f86feb0aa8
Fix idx2crd docstring (#2914)
Wenxuan Tan
2026-01-07 10:11:38 -08:00 -
eb61c91147
Fix CUDA version checking in examples (#2894)
Andrew Yooeun Chun
2026-01-07 14:20:37 +09:00 -
670480df3a
Fix SFB Layout scale granularity representation (#2924)
Aidan Do
2026-01-06 20:55:21 -08:00 -
61b560983a
remove useless line (#2926)
veritas-Qiu
2026-01-07 12:54:08 +08:00 -
7c09485e25
2026-01-06 updates
jkosaian
2026-01-06 04:25:33 -08:00 -
7127592069
Replace CUDA driver API with runtime API (#2928)
dePaul Miller
2026-01-05 10:50:44 -08:00 -
2aee73922c
Minor fix for testing of blockscaled dense GEMM with TMA prefetch (#2930)
questa-quan-wang
2026-01-05 16:36:03 +08:00 -
3d9de19bb7
add constexpr specifier to make_tiled_copy (#2875)
tsu-bin
2026-01-04 04:39:43 +08:00 -
b6d7703e02
Refactor binary_op functions to remove unused result parameter
Pavel Belevich
2026-01-02 11:23:43 -05:00 -
f9bedd9096
Fix print statement for floor division result
Pavel Belevich
2026-01-02 11:15:15 -05:00 -
1810164f27
Update driver bug workaround description in CHANGELOG
v4.3.4
Haicheng Wu
2025-12-24 00:34:25 -05:00 -
709ccc7b92
Update README.md
Haicheng Wu
2025-12-24 00:33:54 -05:00 -
853ad93d60
Update README.md
Haicheng Wu
2025-12-24 00:21:59 -05:00 -
34a81f0497
Update driver bug workaround description in CHANGELOG
Haicheng Wu
2025-12-24 00:20:21 -05:00 -
3f4c086d09
new example with TMA prefetch feature targeting for DRAM latency bound cases (#2881)
questa-quan-wang
2025-12-23 15:29:48 +08:00 -
9a9dbab522
v4.3.4 update v2. (#2899)
Junkai-Wu
2025-12-23 11:29:06 +08:00 -
b7ecaa605d
v4.3.4 update v2. (#2898)
Junkai-Wu
2025-12-23 11:28:26 +08:00 -
7233a05f24
v4.3.4 update. (#2893)
Junkai-Wu
2025-12-22 00:49:35 +08:00 -
7f5fe3edf1
v4.3.4 update. (#2892)
Junkai-Wu
2025-12-22 00:49:12 +08:00 -
4b52d37ecd
docs: note when DSL dumps are populated
Colin Peppler
2025-12-19 17:05:12 -08:00 -
331e2f451c
add missing condition for sync (#2889)
dongxiao
2025-12-19 11:00:30 +08:00 -
ebf3165efb
[Bug Fix]Bypass launch grids for SM120 Kernel with SM90 Mainloop & SM100 TileScheduler (#2865)
Qi Yuhang
2025-12-18 08:51:38 +08:00 -
dfcb55de16
Fix batch adding for EFC
jkosaian
2025-12-16 14:08:23 -08:00 -
ead2fbfe13
Initial commit
jkosaian
2025-12-16 10:00:46 -08:00 -
d55f6beeeb
Bump version from 4.3.2 to 4.3.3
v4.3.3
Haicheng Wu
2025-12-11 23:59:52 -05:00 -
d4e16f5d4e
Bump version from 4.2.1 to 4.3.3
Haicheng Wu
2025-12-11 23:58:38 -05:00 -
6d4cf6d915
Update version of cutlass_library to 4.3.3
hwu36-patch-1
Haicheng Wu
2025-12-11 23:58:12 -05:00 -
d3a5492381
v4.3.3 update. (#2868)
Junkai-Wu
2025-12-11 13:26:58 +08:00 -
5873443bb6
v4.3.3 update (#2869)
Junkai-Wu
2025-12-11 13:26:17 +08:00