CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
Updated 2024-06-04 19:47:22 +00:00
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Updated 2023-06-12 21:34:17 +00:00