Refereed Publications

89. Harun, M.Y., Lee, K., Gallardo, J., Krishnan, G., Kanan, C. (2024) What Variables Affect Out-of-Distribution Generalization in Pretrained Models? In: Neural Information Processing Systems (NeurIPS). [25.8% accept rate]

Key Words: Out-of-Distribution Generalization, Tunnel Effect Hypothesis, Neural Collapse

[Paper] [Project Website]

88. Harun, M.Y., Kanan, C. (2024) Overcoming the Stability Gap in Continual Learning. Transactions on Machine Learning Research (TMLR).

Key Words: Continual Learning, Catastrophic Forgetting, ML Systems

[Paper] [Project Website]

87. Vorontsov, E., Bozkurt, A., Casson, A., Shaikovski, G., Zelechowski, M., Severson, K., Zimmermann, E., Hall, J., Tenenholtz, N., Fusi, N., Yang, E., Mathieu, P., van Eck, A., Lee, D., Viret, J., Robert, E., Wang, Y. K., Kunz, J. D., Lee, M. C. H., Bernhard, J. H., Godrich, R. A., Oakley, G., Millar, E., Hanna, M., Wen, H., Retamero, J. A., Moye, W. A., Yousfi, R., Kanan, C., Klimstra, D. S., Rothrock, B., Liu, S., Fuchs, T. J. (2024). Virchow: A Foundation Model for Clinical-Grade Computational Pathology. Nature Medicine. doi: 10.1038/s41591-024-03141-0

Key Words: Computational Pathology, Foundation Model

[Paper] [Project Website]

86. Pareja, F., Dopeso, H., Wang, Y. K., Gazzo, A. M., Brown, D. N., Banerjee, M., Selenica, P., Bernhard, J. H., Derakhshan, F., da Silva, E. M., Colon-Cartagena, L., Basili, T., Marra, A., Sue, J., Ye, Q., Da Cruz Paula, A., Yeni, S., Pei, X., Safonov, A., Green, H., Gill, K., Zhu, Y., Lee, M. C. H., Godrich, R. A., Casson, A., Weigelt, B., Riaz, N., Wen, H. Y., Brogi, E., Mandelker, D., Hanna, M. G., Kunz, J. D., Rothrock, B., Chandarlapaty, S., Kanan, C., Oakley, J., Klimstra, D. S., Fuchs, T. J., Reis-Filho, J. S. (2024) Genomics-driven artificial intelligence-based model applied to whole slide images accurately classifies invasive lobular carcinoma of the breast. Cancer Research. doi: 10.1158/0008-5472.CAN-24-1322

Key Words: AI in Genomics, Cancer Classification, Whole Slide Images

[Paper]

85. Gong, Y., Shrestha, R., Claypoole, J., Cogswell, M., Ray, A., Kanan, C., Divakaran, A. (2024) BloomVQA: Assessing Hierarchical Multi-modal Comprehension. In: Findings of the Annual Conference of the Association for Computational Linguistics (ACL Findings).

Key Words: Multi-modal Large Language Models, Visual Question Answering

[Paper] [Dataset]

84. Harun, M.Y., Gallardo, J., Chen, J., Kanan, C. (2024) GRASP: A Rehearsal Policy for Efficient Online Continual Learning. In: Conference on Lifelong Learning Agents (CoLLAs).

Key Words: Online Continual Learning

[Paper] [Project Website]

83. Chen, J., An, J., Lyu, H., Kanan, C., Luo, J. (2024) Learning to Evaluate the Artness of AI-generated Images. IEEE Transactions on Multimedia (TMM).

Key Words: AI-generated Images, Artistic Evaluation

[Paper]

82. Chen, J., An, J., Lyu, H., Kanan, C., Luo, J. (2024) Holistic Visual-Textual Sentiment Analysis with Prior Models. In: IEEE International Conference on Multimedia Information Processing and Retrieval (MIPR).

Key Words: Visual-Textual Sentiment Analysis

[Paper]

81. Ejaz, R., Gopalaswamy, V., Lees, A., Kanan, C., Cao, D., Betti, R. (2024) Deep Learning Based Predictive Models for Laser Direct Drive on the Omega Laser Facility. Physics of Plasmas. doi: 10.1063/5.0195675

Key Words: Deep Learning, Inertial Confinement Nuclear Fusion

[Paper]

80. Gallardo, J., Savur, C., Sahin, F., Kanan, C. (2024) Human Emotion Estimation through Physiological Data with Neural Networks. In: System of Systems Engineering Conference (SoSE).

Key Words: Continual Learning, Robotics

79. Verwimp, E., Ben-David, S., Bethge, M., Cossu, A., Gepperth, A., Hayes, T.L., Hüllermeier, E., Kanan, C., Kudithipudi, D., Lampert, C.H., Mundt, M., Pascanu, R., Popescu, A., Tolias, A.S., van de Weijer, J., Liu, B., Lomonaco, V., Tuytelaars, T., van de Ven, G. M. (2024) Continual learning: Applications and the road forward. Transactions on Machine Learning Research (TMLR).

Key Words: Deep Continual Learning

[Paper]

78. Harun, M.Y., Gallardo, J., Hayes, T.L., Kemker, R., Kanan, C. (2023) SIESTA: Efficient Online Continual Learning with Sleep. Transactions on Machine Learning Research (TMLR).

Key Words: Deep Continual Learning

[Paper] [Project Website]

77. Harun, M.Y., Gallardo, J., Hayes, T.L., Kanan, C. (2023) How efficient are today’s continual learning algorithms? In: CVPR Workshop on Continual Learning in Computer Vision (CLVISION).

Key Words: Continual Learning

[Paper]

76. Subramanian, K., Singh, S., Namba, J., Heard, J., Kanan, C., Sahin, F. (2023) Spatial and Temporal Attention-based emotion estimation on HRI-AVC dataset. In: IEEE International Conference on Systems, Man, and Cybernetics (SMC).

Key Words: Robotics, Deep Learning

[Paper]

75. Casson, A., Liu, S., Godrich, R.A., Aghdam, H., Lee, D., Malfroid, K., Rothrock, B., Kanan, C., Retamero, J., Hanna, M., Millar, E., Klimstra, D., Fuchs, T. (2023) Multi-resolution network trained on large-scale H&E whole slide images with weak labels. In: Medical Imaging with Deep Learning (MIDL). [Oral]

Key Words: Computational Pathology, Oncology

[Paper]

74. Rangnekar, A., Kanan, C., Hoffman, M. (2023) Semantic Segmentation with Active Semi-Supervised Learning. In: Winter Conference on Applications of Computer Vision (WACV).

Key Words: Active Learning, Semantic Segmentation, Semi-Supervised Learning

[Paper]

73. Rangnekar, A., Kanan, C., Hoffman, M. (2022) Semantic Segmentation with Active Semi-Supervised Representation Learning. In: British Machine Vision Conference (BMVC).

Key Words: Semantic Segmentation, Active Learning, Semi-Supervised Learning

[Paper]

72. Hayes, T.L., Nickel, M., Kanan, C., Denoyer, L., Szlam, A. (2022) Can I see an Example? Active Learning the Long Tail of Attributes and Relations. In: British Machine Vision Conference (BMVC).

Key Words: Active Learning, Scene Graphs, Continual Learning

[Paper]

71. Rangnekar, A., Ientilucci, E., Kanan, C., Hoffman, M. (2022) SpecAL: Towards Active Learning for Semantic Segmentation of Hyperspectral Imagery. In: Dynamic Data Driven Applications Systems Conference (DDDAS).

Key Words: Semantic Segmentation, Remote Sensing, Active Learning

[Paper]

70. Sur, I., Daniels, Z., Rahman, A., Faber, K., Gallardo, J., Hayes, T., Taylor, C., Gurbuz, M., Smith, J., Joshi, S., Japkowicz, N., Baron, M., Kira, Z., Kanan, C., Corizzo, R., Divakaran, A., Piacentino, M., Hostetler, J., Raghavan, A. (2022) System Design for an Integrated Lifelong Reinforcement Learning Agent for Real-Time Strategy Games. In: International Conference on AI-ML Systems (AIMLSys).

Key Words: Continual Learning, StarCraft 2, Reinforcement Learning

[Paper]

69. Shrestha, R., Kafle, K., Kanan, C. (2022) OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses. In: European Conference on Computer Vision (ECCV).

Key Words: Deep Learning, Bias Mitigation

Summary: We achieve state-of-the-art results on biased datasets using a novel neural network architecture, an approach completely orthogonal to existing methods. Our idea is to incorporate architectural inductive biases into the neural network that combat dataset bias.

[Paper]

68. Raciti, P., Sue, J., Retamero, J., Ceballos, R., Godrich, R., Kunz, J., Casson, A., Thiagarajan, D., Ebrahimzadeh, Z., Viret, J., Lee, D., Schüffler, P., DeMuth, G., Gulturk, E., Kanan, C., Rothrock, B., Reis-Filho, J., Klimstra, D.S., Reuter, V., Fuchs, T.J. (2022) Clinical Validation of Artificial Intelligence Augmented Pathology Diagnosis Demonstrates Significant Gains in Diagnostic Accuracy in Prostate Cancer Detection. Archives of Pathology and Laboratory Medicine. doi: 10.5858/arpa.2022-0066-OA

Key Words: Computational Pathology

[Paper]

67. Hayes, T.L., Kanan, C. (2022) Online Continual Learning for Embedded Devices. In: Conference on Lifelong Learning Agents (CoLLAs).

Key Words: Deep Learning, Online Continual Learning

Summary: One of the major real-world applications for continual learning is learning on embedded devices, but surprisingly there has been little research in this area. We study the behavior of mobile network architectures when combined with continual learning algorithms. We argue that many continual learning algorithms do not meet the requirements of this application domain.

[Paper] [Talk]

66. Acharya, M., Roy, A., Koneripalli, K., Jha, S., Kanan, C., Divakaran, A. (2022) Detecting out-of-context objects using contextual cues. In: International Joint Conference on Artificial Intelligence (IJCAI).

Key Words: Deep Learning, Computer Vision, Object Detection

[Paper]

65. Mahmood, U., Bates, D., Erdi, Y., Mannelli, L., Corrias, G., Kanan, C. (2022) Deep learning and domain specific knowledge to segment the liver from synthetic dual energy CT iodine scans. Diagnostics. doi: 10.3390/diagnostics12030672

Key Words: Deep Learning, Radiology

Summary: We demonstrate the benefits of using synthetic dual-energy CT scans, produced with pix2pix, for liver segmentation in single energy scans.

[Paper]

64. Kothari, R.S., Bailey, R.J., Kanan, C., Pelz, J.B., Diaz, G.J. (2022) EllSeg-Gen, towards Domain Generalization for Head-Mounted Eyetracking. In: ACM Symposium on Eye Tracking Research and Applications (ETRA).

Key Words: Deep Learning, Eye Tracking

[Paper]

63. Shrestha, R., Kafle, K., Kanan, C. (2022) An Investigation of Critical Issues in Bias Mitigation Techniques. In: IEEE Winter Applications of Computer Vision Conference (WACV).

Key Words: Deep Learning, Dataset Bias, Bias Mitigation

Summary: We perform a head-to-head fair comparison of methods for combating dataset bias in deep learning algorithms. Unfortunately, we find that most methods scale poorly and fail to generalize across datasets. We also propose a new dataset for analyzing dataset bias in a controlled manner.

[Paper] [Project Website]

62. Zhang, Y., Hayes, T.L., Kanan, C. (2022) Disentangling Transfer and Interference in Multi-Domain Learning. In: AAAI Workshop on Practical Deep Learning in the Wild (PracticalDL).

Key Words: Deep Learning, Transfer Learning

[Paper]

61. Gallardo, J., Hayes, T.L., Kanan, C. (2021) Self-Supervised Training Enhances Online Continual Learning. In: British Machine Vision Conference (BMVC).

Key Words: Deep Learning, Self-supervised Learning, Continual learning

Summary: We demonstrate that self-supervised learning produces feature representations that work better in continual learning compared to supervised pre-training across multiple deep learning algorithms.

[Paper]

60. Acharya, M., Kanan, C. (2021) 2nd Place Solution for SODA10M Challenge 2021 – Continual Detection Track. In: ICCV 2021 Workshop: Self-supervised Learning for Next-Generation Industry-level Autonomous Driving. [Placed 2nd in competition]

Key Words: Continual Learning

[Paper]

59. Hayes, T.L., Krishnan, G.P., Bazhenov, M., Siegelmann, H.T., Sejnowski, T.J., Kanan, C. (2021) Replay in deep learning: Current approaches and missing biological elements. Neural Computation. doi:10.1162/neco_a_01433

Key Words: Deep Learning, Continual Learning, Neuroscience

Summary: We review the mechanisms that underlie replay in the brain and in deep learning to facilitate learning over time and memory consolidation.

[Journal Paper] [arXiv Paper]

58. Mahmood, U., Shrestha, R., Bates, D., Mannelli, L., Corrias, G., Erdi, Y. Kanan, C. (2021) Detecting Spurious Correlations with Sanity Tests for Artificial Intelligence Guided Radiology Systems. Frontiers in Digital Health. doi:10.3389/fdgth.2021.671015

Key Words: Deep Learning, Radiology, Dataset Bias

Summary: We describe a set of tests to determine if a deep learning system is using spurious covariates and is unlikely to generalize to other datasets.

[Paper]

57. Mahmood, U., Apte, A., Kanan, C., Bates, D., Corrias, G., Mannelli, L., Oh, J., Erdi, Y., Nguyen, J., Deasy, J., Shukla-Dave, A. (2021) Quality control of radiomic features using 3D printed CT phantoms. Journal of Medical Imaging. doi: 10.1117/1.JMI.8.3.033505

Key Words: Radiology, CT, Bias

[Paper]

56. Khanal, B., Kanan, C. (2021) How does heterogeneous label noise impact generalization in neural networks? In: International Symposium on Visual Computing (ISVC).

Key Words: Label Noise, Deep Learning

[Paper]

55. Hayes, T., Kanan, C. (2021) Selective Replay Enhances Learning in Online Continual Analogical Reasoning. In: CVPR Workshop on Continual Learning in Computer Vision (CLVISION).

Key Words: Continual Learning, Replay, Reasoning

Summary: We pioneer continual learning for visual reasoning systems.

[Paper] [Code]

54. Lomonaco, V., Pellegrini, L., Cossu, A., Carta, A., Graffieti, G., Hayes, T., De Lange, M., Masana, M., Pomponi, J., van de Ven, G., Mundt, M., She, Q., Cooper, K., Forest, J., Belouadah, E., Calderara, S., Parisi, G., Cuzzolin, F., Tolias, A., Scardapane, S., Antiga, L., Ahmad, S., Popescu, A., Kanan, C., van de Weijer, J., Tuytelaars, T., Bacciu, D., Maltoni, D. (2021) Avalanche: An End-to-End Library for Continual Learning. In: CVPR Workshop on Continual Learning in Computer Vision (CLVISION).

Key Words: Deep Learning, Continual Learning

[Paper] [Software Library]

53. da Silva, L.M., Pereira, E.M., Salles, P.G.O., Godrich, R., Ceballos, R., Kunz, J.D., Casson, A., Viret, J., Chandarlapaty, S., Ferreira, C.G., Ferrari, B., Rothrock, B., Raciti, P., Reuter, V., Dogdas, B., DeMuth, G., Sue, J., Kanan, C., Grady, L., Fuchs, T.J., Reis-Filho, J.S. (2021) Independent real-world application of a clinical-grade automated prostate cancer detection system. Journal of Pathology. doi:10.1002/path.5662

Key Words: Computational Pathology, Clinical Validation

Summary: We demonstrate the efficacy of Paige Prostate when used by pathologists.

[Paper]

52. Rangnekar, A., Ientilucci, E., Kanan, C., Hoffman, M. J. (2020) Uncertainty estimation for semantic segmentation of hyperspectral imagery. In: International Conference on Dynamic Data Driven Application Systems (DDDAS).

Key Words: Deep Learning, Remote Sensing

[Paper]

51. Teney, D., Kafle, K., Shrestha, R., Abbasnejad, E., Kanan, C., van den Hengel, A. (2020) On the Value of Out-of-Distribution Testing: An Example of Goodhart’s Law. In: Neural Information Processing Systems (NeurIPS).

Key Words: VQA, Dataset Bias, Metrics

Summary: We show that many systems that use the VQA-CP dataset, which is intended to assess the ability of VQA systems to generalize to out-of-distribution data, have flaws in their methodologies.

[Paper]

50. Roady, R., Hayes, T.L., Kemker, R., Gonzales, A., Kanan, C. (2020) Are open set classification methods effective on large-scale datasets? PLOS ONE. doi: 10.1371/journal.pone.0238302

Key Words: Deep Learning, Open Set Classification

Summary: We assess the ability of open-set classification algorithms to work on high-resolution datasets with hundreds of categories.

[Paper]

49. Acharya, M., Hayes, T., Kanan, C. (2020) RODEO: Replay for Online Object Detection. In: British Machine Vision Conference (BMVC).

Key Words: Online Streaming Learning, Object Detection

Summary: We created the first algorithm for online learning for object detection.

[Video] [Code] [Paper]

48. Hayes, T., Kafle, K., Shrestha, R., Acharya, M., Kanan, C. (2020) REMIND Your Deep Neural Network to Prevent Catastrophic Forgetting. In: European Conference on Computer Vision (ECCV).

Key Words: Streaming Learning, Deep Learning, Representation Learning

Summary: We created a state-of-the-art method for online learning in CNNs and assess its efficacy on incremental learning on ImageNet and on VQA datasets.

[Video] [Code] [Paper]

47. Roady, R., Hayes, T., Kanan, C. (2020) Improved Robustness to Open Set Inputs via Tempered Mixup. In: ECCV Workshop on Adversarial Robustness in the Real World (AROW).

Key Words: Open Set Classification, Deep Learning, Regularization

Summary: We created a state-of-the-art method for open set classification using a new form of mixup regularization.

[Paper]

46. Shrestha, R., Kafle, K., Kanan, C. (2020) A negative case analysis of visual grounding methods for VQA. In: Annual Conference of the Association for Computational Linguistics (ACL).

Key Words: Deep Learning, Dataset Bias, VQA

Summary: We demonstrate that state of the art methods for VQA-CP are achieving their efficacy by acting as regularizers, and not due to the reasons proposed by their authors. We then provide a simple approach that rivals the state of the art.

[Code] [Paper]

45. Roady, R., Hayes, T.L., Vaidya, H., Kanan, C. (2020) Stream-51: Streaming Classification and Novelty Detection from Videos. In: CVPR Workshop on Continual Learning in Computer Vision (CLVISION).

Key Words: Streaming Learning, Open Set Classification

Summary: We created a new dataset and protocols for streaming learning of temporally correlated images with out-of-distribution detection.

[Project and Dataset] [Paper]

44. Hayes, T.L., Kanan, C. (2020) Lifelong Machine Learning with Deep Streaming Linear Discriminant Analysis. In: CVPR Workshop on Continual Learning in Computer Vision (CLVISION).

Key Words: Deep Learning, Streaming Learning

Summary: We created a simple method for streaming learning that rivals state of the art methods on ImageNet using a fraction of the computational resources.

[Video] [Code] [Paper]

43. Raciti, P., Sue, J., Ceballos, R., Godrich, R., Kunz, J., Kapur, S., Reuter, V.E., Grady, L., Kanan, C., Klimstra, D., Fuchs, T. (2020) Novel Artificial Intelligence System Increases the Detection of Prostate Cancer in Whole Slide Images of Core Needle Biopsies. Modern Pathology. doi: 10.1038/s41379-020-0551-y

Key Words: Deep Learning, Computational Pathology

Summary: We created a clinical-grade AI-based system that helps pathologists diagnose prostate cancer.

[Paper]

42. Rangnekar, A., Mokashi, N., Ientilucci, E., Kanan, C., Hoffman, M.J. (2020) AeroRIT: A New Scene for Hyperspectral Image Analysis. IEEE Transactions on Geoscience and Remote Sensing (TGRS). doi: 10.1109/TGRS.2020.2987199

Key Words: Deep Learning, Semantic Segmentation

[Code] [Paper]

41. Kothari, R., Yang, Z., Kanan, C., Bailey, R., Pelz, J., Diaz, G. (2020) Gaze-in-wild: A dataset for studying eye and head coordination in everyday activities. Scientific Reports. doi: 10.1038/s41598-020-59251-5

Key Words: Machine Learning, Eye Tracking

Summary: Using a mobile eye tracker equipped with an inertial measurement unit and a 3D stereo camera, we created a dataset of naturalistic gaze movements while subjects performed everyday tasks. We also developed machine learning algorithms for gaze event classification.

[Paper]

40. Kafle, K., Shrestha, R., Cohen, S., Price, B., Kanan, C. (2020) Answering Questions about Data Visualizations using Efficient Bimodal Fusion. In: IEEE Winter Applications of Computer Vision Conference (WACV-2020).

Key Words: Deep Learning, Visual Question Answering, Chart Question Answering

Summary: We created a new model that greatly surpasses the state-of-the-art and human baselines on the chart question answering datasets FigureQA and DVQA.

[Paper]

39. Seedat, N., Kanan, C. (2019) Towards calibrated and scalable uncertainty representations for neural networks. In: NeurIPS-2019 Workshop on Bayesian Deep Learning.

Key Words: Deep Learning, Uncertainty Estimation

[Paper]

38. Kafle, K., Shrestha, R., Kanan, C. (2019) Challenges and Prospects in Vision and Language Research. Frontiers in Artificial Intelligence.

Key Words: Deep Learning, Language and Vision

Summary: In this review and position paper, we critically assess the state of vision and language research in AI, with a focus on the challenges of creating good datasets and performing meaningful evaluation.

[Paper]

37. Chaudhary, A.K., Kothari, R., Acharya, M., Dangi, S., Nair, N., Bailey, R., Kanan, C., Diaz, G., Pelz, J.B. (2019) RITnet: Real-time Semantic Segmentation of the Eye for Gaze Tracking. In: The 2019 OpenEDS Workshop: Eye Tracking for VR and AR.

Key Words: Semantic Segmentation, Deep Learning, Eye Tracking

Summary: Winner of the Facebook eye semantic segmentation challenge.

[Winning Announcement]

36. Shrestha, R., Kafle, K., Kanan, C. (2019) Answer Them All! Toward Universal Visual Question Answering Models. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR-2019).

Key Words: Deep Learning, Visual Question Answering, Vision and Language

Summary: We demonstrate that the best models for reasoning VQA datasets (e.g., CLEVR) and natural image datasets (e.g., VQAv2) do not transfer across datasets, and we present a new model called RAMEN that works well across both. We ensure all models use the same visual features and answer vocabularies.

[Code] [Paper]

35. Acharya, M., Jariwala, K., Kanan, C. (2019) VQD: Visual Query Detection In Natural Scenes. In: Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-2019).

Key Words: Object Detection, Natural Language Processing, Visual Question Answering

Summary: We generalize the referring expression recognition problem to multiple boxes in a new dataset called VQD.

[Project Website and Dataset]

34. Parisi, G.I., Kemker, R., Part, J.L., Kanan, C., Wermter, S. (2019) Continual lifelong learning with neural networks: A review. Neural Networks. doi: 10.1016/j.neunet.2019.01.012

Key Words: Deep Learning, Lifelong Machine Learning

Summary: We comprehensively review lifelong learning in artificial and biological neural networks.

[Journal Version] [arXiv Version]

33. Hayes, T., Cahill, N., Kanan, C. (2019) Memory Efficient Experience Replay for Streaming Learning. In: International Conference on Robotics and Automation (ICRA-2019).

Key Words: Deep Learning, Streaming Learning, Catastrophic Forgetting

Summary: We explore how to efficiently perform experience replay to facilitate online learning in a single pass in neural network models without catastrophic forgetting.

[Code and Project Webpage]

32. Acharya, M., Kafle, K., Kanan, C. (2019) TallyQA: Answering Complex Counting Questions. In: AAAI-2019.

Key Words: Deep Learning, Visual Question Answering, Vision and Language

Summary: We created TallyQA for open-ended counting. TallyQA emphasizes questions that require more than just object detection, including attributes and relations. Our algorithm demonstrates how to scale relational neural networks to real-world imagery.

[Code and Dataset] [Project Webpage]

31. Birmingham, E., Svärd, J., Kanan, C., Fischer, H. (2018) Exploring Emotional Expression Recognition in Aging Adults using the Moving Window Technique. PLOS ONE. doi:10.1371/journal.pone.0205341

30. Kafle, K., Cohen, S., Price, B., Kanan, C. (2018) DVQA: Understanding Data Visualizations via Question Answering. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR-2018).

Key Words: Deep Learning, Visual Question Answering, Vision and Language

Summary: We created a new dataset for answering questions about bar charts. We demonstrate that existing VQA algorithms could not solve this problem, and we propose new methods capable of handling out-of-vocabulary answers.

[Demo] [Project Webpage]

29. Hayes, T., Kemker, R., Cahill, N., Kanan, C. (2018) New Metrics and Experimental Paradigms for Continual Learning. In: Real-World Challenges and New Benchmarks for Deep Learning in Robotic Vision (CVPRW).

Key Words: Deep Learning, Lifelong Learning, Streaming Learning

28. Binaee, K., Starynska, A., Pelz, J., Kanan, C., Diaz, G. (2018) Characterizing the Temporal Dynamics of Information in Visually Guided Predictive Control Using LSTM Recurrent Neural Networks. In: Proc. 40th Annual Conference of the Cognitive Science Society (CogSci-2018).

27. Kemker, R., Luu, R., Kanan, C. (2018) Low-Shot Learning for the Semantic Segmentation of Remote Sensing Imagery. IEEE Transactions on Geoscience and Remote Sensing (TGRS).

Key Words: Deep Learning, Semantic Segmentation, Remote Sensing, Low-Shot Learning

Summary: We use self-taught feature learning and semi-supervised learning to do semantic segmentation of hyperspectral imagery.

[Journal Version] [Accepted arXiv Preprint]

26. Kemker, R., Kanan, C. (2018) FearNet: Brain-Inspired Model for Incremental Learning. In: International Conference on Learning Representations (ICLR-2018).

Key Words: Incremental Learning, Deep Learning, Catastrophic Forgetting

Summary: We created a new dual-memory neural network that achieves state-of-the-art results on incremental learning tasks.

25. Kemker, R., McClure, M., Abitino, A., Hayes, T., Kanan, C. (2018) Measuring Catastrophic Forgetting in Neural Networks. In: AAAI-2018.

Key Words: Deep Learning, Incremental Learning, Catastrophic Forgetting

Summary: We develop methods for measuring catastrophic forgetting and use them to show that methods purported to prevent catastrophic forgetting often fail.

24. Kemker, R., Salvaggio, C., Kanan, C. (2018) Algorithms for Semantic Segmentation of Multispectral Remote Sensing Imagery using Deep Learning. ISPRS Journal of Photogrammetry and Remote Sensing. doi: 10.1016/j.isprsjprs.2018.04.014

Key Words: Deep Learning, Semantic Segmentation, Remote Sensing, Low-Shot Learning

Summary: We created the RIT-18 dataset for remote sensing imagery. We then show that pre-training semantic segmentation algorithms on synthetic imagery enables them to be used successfully when the amount of actual data is scarce.

[RIT-18 Dataset] [Journal Version] [Accepted arXiv Preprint]

23. Kleynhans, T., Montanaro, M., Gerace, A., Kanan, C. (2017) Predicting Top-of-Atmosphere Thermal Radiance Using MERRA-2 Atmospheric Data with Deep Learning. Remote Sensing, 9(11), 1133; doi:10.3390/rs9111133.

Key Words: Deep Learning, Remote Sensing

22. Graham, D., Langroudi, S., Kanan, C., Kudithipudi, D. (2017) Convolutional Drift Networks for Spatio-Temporal Processing. In: IEEE International Conference on Rebooting Computing 2017.

Key Words: Egocentric Video Activity Recognition, Echo-state Networks, Deep Learning

Summary: We combine echo-state networks with CNNs for egocentric video recognition.

21. Kafle, K., Yousefhussien, M., Kanan, C. (2017) Data Augmentation for Visual Question Answering. In: International Natural Language Generation Conference (INLG-2017).

Key Words: Visual Question Answering, Natural Language Generation

Summary: We pioneer two methods for data augmentation for VQA.

20. Kafle, K., Kanan, C. (2017) An Analysis of Visual Question Answering Algorithms. In: International Conference on Computer Vision (ICCV-2017).

Key Words: Deep Learning, Image Reasoning, Dataset Bias, Dataset Creation

Summary: We explore methods for compensating for dataset bias, and propose 12 different kinds of VQA questions. Using our new TDIUC dataset, we assess state-of-the-art VQA algorithms and discover what kind of questions are easy and what kinds are hard.

[Project Webpage]

19. Kumra, S., Kanan, C. (2017) Robotic Grasp Detection using Deep Convolutional Neural Networks. In: Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2017).

Key Words: Deep Learning, Robotics, Grasping

18. Hezaveh, M.M., Kanan, C., Salvaggio, C. (2017) Roof damage assessment using deep learning. In: IEEE Applied Imagery Pattern Recognition Workshop (AIPR).

Key Words: Deep Learning

17. Kafle, K., Kanan, C. (2017) Visual Question Answering: Datasets, Algorithms, and Future Challenges. Computer Vision and Image Understanding (CVIU). doi:10.1016/j.cviu.2017.06.005

Key Words: Visual Question Answering, Deep Learning, Review

Summary: We critically review the state of Visual Question Answering.

[Journal Version] [Accepted arXiv Preprint]

16. Kemker, R., Kanan, C. (2017) Self-Taught Feature Learning for Hyperspectral Image Classification. IEEE Transactions on Geoscience and Remote Sensing (TGRS), 55(5): 2693-2705.

Key Words: Deep Learning, Self-taught Learning, Hyperspectral Remote Sensing

Summary: We achieved state-of-the-art results on several hyperspectral remote sensing datasets by using deep convolutional autoencoders and independent component analysis to learn features from unlabeled datasets.

15. Kafle, K., Kanan, C. (2016) Answer-Type Prediction for Visual Question Answering. In: Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR-2016).

Key Words: Visual Question Answering, Deep Learning

Summary: We combined deep learning with a conditional version of Quadratic Discriminant Analysis to do Visual Question Answering.

14. Yousefhussien, M., Browning, N.A., Kanan, C. (2016) Online Tracking using Saliency. In: Proc. IEEE Winter Applications of Computer Vision Conference (WACV-2016).

Key Words: Deep Learning, Gnostic Fields, Online Tracking

Summary: We combined deep learning with gnostic fields to do online tracking of vehicles in videos.

13. Wang, P., Cottrell, G., Kanan, C. (2015) Modeling the Object Recognition Pathway: A Deep Hierarchical Model Using Gnostic Fields. In: Proceedings of the Cognitive Science Society Conference (CogSci-2015).

Key Words: Object recognition, Feature learning, Brain-inspired

Summary: We used hierarchical Independent Component Analysis (ICA) to learn a visual representation with multiple levels, and then we combined this with gnostic fields.

12. Zhang, M.M., Choi, J., Daniilidis, K., Wolf, M.T., Kanan, C. (2015) VAIS: A Dataset for Recognizing Maritime Imagery in the Visible and Infrared Spectrums. In: Proc. of the 11th IEEE Workshop on Perception Beyond the Visible Spectrum (PBVS-2015).

Key Words: Autonomous ships, Object recognition, Infrared

Summary: This paper describes work at JPL to build a dataset for recognizing ships in the visible and infrared spectrums. VAIS is now part of the OTCBVS Benchmark Dataset Collection.

[Download the VAIS Dataset]

11. Kanan, C., Bseiso, D., Ray, N., Hsiao, J., Cottrell, G. (2015) Humans Have Idiosyncratic and Task-specific Scanpaths for Judging Faces. Vision Research. doi:10.1016/j.visres.2015.01.013

Key Words: Eye Tracking, Face Perception, Multi-Fixation Pattern Analysis (MFPA)

Summary: We describe algorithms that can make inferences about a person from their eye movements, which we call Multi-Fixation Pattern Analysis (MFPA). We used MFPA to show that humans have scanpath routines for different face judgment tasks. Beyond addressing questions in psychology, the technology could be used for other applications such as medical diagnosis and biometrics.

[Journal Version] [Accepted Preprint]

10. Khosla, D., Huber, D.J., Kanan, C. (2014) A Neuromorphic System for Visual Object Recognition. Biologically Inspired Cognitive Architectures, 8: 33-45.

Key Words: Object Recognition, Object Localization, Brain-Inspired

Summary: This paper is based on work that I did back in 2005-2006 with colleagues at HRL Labs. It describes a system that can localize and classify multiple objects in a scene, and it does so by combining attention algorithms with brain-inspired classifiers.

9. Kanan, C. (2014) Fine-Grained Object Recognition with Gnostic Fields. In: Proceedings of the IEEE Winter Applications of Computer Vision Conference (WACV-2014).

Key Words: Object Recognition, Computer Vision

Summary: I show that Gnostic Fields surpass state-of-the-art methods for fine-grained object categorization of dogs and birds. I also show that they can classify images in real-time.

[Project Webpage]

8. Kanan, C., Ray, N., Bseiso, D., Hsiao, J., Cottrell, G.W. (2014) Predicting an Observer’s Task Using Multi-Fixation Pattern Analysis. In: Proceedings of the ACM Symposium on Eye Tracking Research and Applications (ETRA-2014).

Key Words: Eye Movements, Machine Learning

Summary: I re-analyze a data set gathered by Jeremy Wolfe’s group using new techniques that I developed.

7. Kanan, C. (2013) Active Object Recognition with a Space-Variant Retina. ISRN Machine Vision, 2013:138057. doi:10.1155/2013/138057

Key Words: Object Recognition, Active Vision, Eye Movements, Computational Neuroscience

Summary: I developed a brain-inspired space-variant vision model that achieves near state-of-the-art accuracy on object recognition problems. The model acquires evidence using sequential fixations, uses foveated ICA filters, and uses a gnostic field to integrate evidence acquired from the fixations.

6. Kanan, C. (2013) Recognizing Sights, Smells, and Sounds with Gnostic Fields. PLoS ONE: e54088. doi:10.1371/journal.pone.0054088

Key Words: Stimulus Classification, Music Classification, Electronic Nose, Image Recognition, Computer Vision

Summary: I developed a new kind of “localist” neural network called a gnostic field that is easy to implement as well as being fast to train and run. The model is tested on its ability to classify images (Caltech-256 and CUB-200), musical artists, and odors. Gnostic fields exceeded the best methods across modalities and datasets.

[Project Webpage]

5. Birmingham, E., Meixner, T., Iarocci, G., Kanan, C., Smilek, D., Tanaka, J. (2012) The Moving Window Technique: A Window into Age-Related Changes in Attention to Facial Expressions of Emotion. Child Development, 84: 1407-1424. doi:10.1111/cdev.12039

Key Words: Face Processing, Emotion Recognition

Summary: We develop a new computer mouse-driven technique for assessing attention, and the approach is used in a developmental study of facial expression recognition.

4. Kanan, C., Cottrell, G.W. (2012) Color-to-Grayscale: Does the Method Matter in Image Recognition? PLoS ONE, 7(1): e29740. doi:10.1371/journal.pone.0029740.

Key Words: Color-to-grayscale, Image Recognition, Computer Vision

Summary: We tested 13 color-to-grayscale algorithms in a modern descriptor based image recognition framework with 4 feature types: SIFT, SURF, Geometric Blur, and Local Binary Patterns (LBP). We discovered that the method can have a significant influence on performance, even when using robust features.

3. Kanan, C., Cottrell, G.W. (2010) Robust Classification of Objects, Faces, and Flowers Using Natural Image Statistics. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR-2010).

Key Words: Object Recognition, Active Vision, Eye Movements, Computational Neuroscience

Summary: We used simulated eye movements with a model of V1 to achieve state-of-the-art results as of early 2010 on a number of challenging datasets in computer vision.

[Project Webpage] [MATLAB Demo] [MATLAB Code for Experiments] [Supplementary Materials]

2. Kanan, C., Flores, A., Cottrell, G.W. (2010) Color Constancy Algorithms for Object and Face Recognition. Lecture Notes in Computer Science, 6453 (International Symposium on Visual Computing 2010): 199-210.

Key Words: Object Recognition, Computer Vision

Summary: We examine the performance of color constancy algorithms in this paper. Our later work on color-to-grayscale algorithms is substantially more rigorous.

1. Kanan, C., Tong, M.H., Zhang, L., Cottrell, G.W. (2009) SUN: Top-down saliency using natural statistics. Visual Cognition, 17: 979-1003.

Key Words: Attention, Active Vision, Eye Movements, Computational Psychology

Summary: We modeled task-driven visual search, and demonstrated that appearance is predictive of human fixation locations.

[Project Webpage]