Datasets

This page links to datasets made by my lab. Please see the individual project pages linked below to download or learn more about each dataset.

BloomVQA


BloomVQA enables comprehensive evaluation of multi-modal large language models' ability to understand stories. Its questions are created based on Bloom’s Taxonomy to assess a range of comprehension capabilities.

[Dataset Website] [ACL Findings Paper]

COCO-OOC


COCO-OOC is a dataset for evaluating how well a system can detect whether any objects in a scene are out-of-context (OOC).

[Dataset Website] [IJCAI Paper]

Biased-MNIST

Biased-MNIST is a dataset designed to assess the robustness of deep learning algorithms to multiple spurious correlations.

[Dataset Website] [ECCV Paper]

Stream-51

Stream-51 is a dataset designed for streaming learning and open-set recognition across 51 categories. It comes with ordering protocols that enable direct comparison of results.

[Dataset Website] [CLVISION Paper]

Visual Query Detection v1 (VQDv1)

VQDv1 is a dataset for a novel language-and-vision task in which a system must produce zero or more bounding boxes in an image that answer a query prompt. It was published at NAACL-2019.

[Dataset Website] [NAACL Paper]

TallyQA

TallyQA is a VQA dataset for open-ended counting. As of July 2018, it is the largest dataset for counting. Unlike earlier datasets, it emphasizes complex counting questions rather than questions that can be answered by object detection alone. More information can be found on the project webpage. It was published at AAAI-2019.

[Dataset Website] [AAAI Paper]

AeroRIT

AeroRIT is a dataset for hyperspectral semantic segmentation, with baselines provided for several deep learning algorithms. It was published in the journal TGRS.

[Dataset Website] [TGRS Paper]

DVQA

DVQA is a VQA dataset for data visualizations, and demands optical character recognition and the ability to handle out-of-vocabulary inputs and outputs. More information can be found on the project webpage. The dataset was published at CVPR-2018.

[Dataset Website] [CVPR Paper]

TDIUC: Task-Driven Image Understanding Challenge

As of October 2017, TDIUC is the largest VQA dataset with natural images. TDIUC includes 12 kinds of questions, enabling the capabilities of VQA algorithms, including multi-modal large language models, to be studied in greater depth. More information can be found on the project webpage.

[Dataset Website] [ICCV Paper]

RIT-18

RIT-18 is a high-resolution multispectral dataset for semantic segmentation. We collected the data using a UAV, and it was annotated with 18 object categories. It was published in the ISPRS Journal of Photogrammetry and Remote Sensing (JPRS).

[Dataset Website] [JPRS Paper]

VAIS

VAIS contains simultaneously acquired, unregistered thermal and visible images of ships captured from piers. It is suitable for multi-modal object classification research.

[Download Dataset] [PBVS Paper]