Mirror of https://github.com/salesforce/BLIP.git (synced 2026-02-27 14:33:56 +00:00, commit f5eacc9f082f53372b5534d0b0b9a7228dd17caf)
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
This is the PyTorch implementation of the BLIP paper.
Catalog:
- Inference demo
- Pre-trained and finetuned checkpoints
- Pre-training code
- Finetuning code for Image-Text Retrieval, Image Captioning, VQA, and NLVR2
- Download of bootstrapped image-text dataset
Inference demo (Image Captioning and VQA):
Run our interactive demo in a Colab notebook (no GPU needed).