Tag: cs.Multimodal
Total 10 articles

2023-04-04 WACV-2023 MixGen: A New Multi-Modal Data Augmentation
2023-01-15 NIPS-2022 Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks
2023-01-15 NIPS-2022 CoCa: Contrastive Captioners are Image-Text Foundation Models
2023-01-14 ICML-2022 BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
2022-12-14 NIPS-2021 VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
2022-12-12 NIPS-2021 Align before Fuse: Vision and Language Representation Learning with Momentum Distillation
2022-09-17 DeepAI-2022 Can Language Understand Depth?
2022-09-17 arXiv-2021 How Much Can CLIP Benefit Vision-and-Language Tasks?
2022-09-05 arXiv-2022 GLIPv2: Unifying Localization and Vision-Language Understanding
2022-09-05 ICLR-2022 Open-vocabulary Object Detection via Vision and Language Knowledge Distillation