KRISP answers questions about images with two models. The implicit knowledge model is a multimodal transformer, initialized from BERT pretraining, that processes each question and image pair. A separate explicit knowledge model builds a knowledge graph from symbols found in the question and the image and uses that graph to predict answers.
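
The two-branch idea can be illustrated with a toy sketch. This is not KRISP's implementation: the triples, the edge-count scoring rule, the additive fusion in `predict`, and every function name below are assumptions made for illustration, and the implicit branch is replaced by a hand-written score dictionary standing in for a transformer's output.

```python
from collections import defaultdict

# Hypothetical toy triples (head, relation, tail); not KRISP's real knowledge sources.
TRIPLES = [
    ("banana", "is_a", "fruit"),
    ("banana", "has_color", "yellow"),
    ("monkey", "eats", "banana"),
    ("fire hydrant", "has_color", "red"),
]


def build_subgraph(symbols, triples):
    """Keep only edges that touch at least one question or image symbol."""
    symbols = set(symbols)
    return [t for t in triples if t[0] in symbols or t[2] in symbols]


def explicit_scores(symbols, triples):
    """Score each non-symbol node by the number of retained edges touching it.

    Assumed scoring rule for this sketch; the real model learns its scores.
    """
    symbols = set(symbols)
    scores = defaultdict(float)
    for head, _relation, tail in build_subgraph(symbols, triples):
        for node in (head, tail):
            if node not in symbols:
                scores[node] += 1.0
    return dict(scores)


def predict(implicit, explicit):
    """Fuse both branches by adding their scores (an assumption) and take the best answer."""
    combined = dict(implicit)
    for answer, score in explicit.items():
        combined[answer] = combined.get(answer, 0.0) + score
    return max(combined, key=combined.get)


# Symbols as they might come from the question ("monkey") and the image ("banana").
symbols = ["monkey", "banana"]
# Stand-in for the implicit model's answer scores.
implicit = {"yellow": 0.4, "apple": 0.3}
answer = predict(implicit, explicit_scores(symbols, TRIPLES))
```

Here the explicit branch contributes candidates (`fruit`, `yellow`) that are reachable from the input symbols in the graph, while the implicit branch supplies scores that do not depend on the graph; the fused prediction draws on both.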