ik_llama.cpp/ggml/src/ggml-cuda/conv2d.cuh at c6a3903571ea0d686be2cdb2208b1cab5da1bc3d - ik_llama.cpp - Public git mirror

ikawrakow/ik_llama.cpp

mirror of https://github.com/ikawrakow/ik_llama.cpp.git synced 2026-04-25 17:09:22 +00:00


Kawrakow 87e4762720 Port mtmd from mainline + Qwen2/2.5-VL support (#798)

* Add mtmd: the beginning

* Add mtmd: mtmd.cpp compiles

* Add mtmd: clip initialization compiles

* Add mtmd: clip.cpp compiles

* Add mtmd: builds successfully

* Add CPU implementation for GGML_OP_GLU

* Add CUDA implementation for GGML_OP_GLU

* Add CPU implementation for GGML_OP_CONV_2D and GGML_OP_CONV_2D_DW

* Add CUDA implementation for GGML_OP_CONV_2D and GGML_OP_CONV_2D_DW

* Add mtmd: refresh CPU rope

* Add mtmd: refresh CUDA rope

* Add mtmd: add Qwen2-VL

* Add mtmd: Qwen2.5-VL text seems to work with this change

* Add mtmd: fix swiglu

* Add mtmd: use LOG_TEE so generated tokens show up in terminal

* Add mtmd: do not attempt to load a GPU backend if none are available

* GLU, not GPU

* Fix typo

* Fix new/free mismatch

* LOG stuff

* Add mtmd: this fixes gibberish on second image

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

2025-09-27 08:45:29 +02:00

6 lines

149 B

Plaintext


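#pragma once

#include "common.cuh"

// Thread-block size for the conv2d CUDA kernel launch.
#define CUDA_CONV2D_BLOCK_SIZE 256

// Runs GGML_OP_CONV_2D on the CUDA backend; the inputs are taken from dst->src.
void ggml_cuda_op_conv2d(ggml_backend_cuda_context & ctx, ggml_tensor * dst);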
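
The header only exposes a block size and an entry point; the kernel itself lives elsewhere and is not shown here. As a rough host-side sketch of how a constant like CUDA_CONV2D_BLOCK_SIZE is typically used, the C++ below computes the output extent of a 2D convolution and the number of thread blocks needed to cover it. It assumes one thread per output element and a 1D launch grid, which may not match the actual kernel. The helpers `conv_out_size` and `conv2d_num_blocks` are hypothetical names for illustration and are not part of ggml.

```cpp
#include <cassert>
#include <cstdint>

// Same value as the macro in the header above.
#define CUDA_CONV2D_BLOCK_SIZE 256

// Hypothetical helper (not part of ggml): output extent along one axis for a
// strided, padded, dilated convolution, using the standard formula.
constexpr int64_t conv_out_size(int64_t in, int64_t kernel,
                                int64_t stride, int64_t pad, int64_t dilation) {
    return (in + 2 * pad - dilation * (kernel - 1) - 1) / stride + 1;
}

// Hypothetical helper (not part of ggml): number of thread blocks so that
// every output element gets one thread. This is a ceiling division by the
// block size, so the last block may be only partially used.
// Assumption: one thread per output element, 1D grid.
constexpr int64_t conv2d_num_blocks(int64_t total_outputs) {
    return (total_outputs + CUDA_CONV2D_BLOCK_SIZE - 1) / CUDA_CONV2D_BLOCK_SIZE;
}
```

For instance, a 224x224 input with a 14x14 kernel, stride 14, and no padding gives `conv_out_size(224, 14, 14, 0, 1)` outputs per axis, and the product of both axes, output channels, and batch size would then be passed to `conv2d_num_blocks` to size the launch. A kernel launched this way needs a bounds check on the global thread index, since the ceiling division can create more threads than output elements.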