PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
image-captioning
image-text-retrieval
vision-and-language-pre-training
vision-language
vision-language-transformer
visual-question-answering
visual-reasoning
Updated 2026-03-03 12:43:11 +00:00