Publications
Research, scholarly, and creative achievements
Scholarly work organized by publication type, with external profiles available for citation metrics, indexing, and identifier-based verification.
Selected publications
Publications that define the research program
These selected entries connect the publication record to the research program: machine-usable scholarly knowledge, ETD infrastructure, retrieval evaluation, and ethical AI for archives.
- Learning from LLM Disagreement in Retrieval Evaluation · JCDL 2025 · LLM disagreement as evidence for retrieval evaluation.
- Building datasets to support information extraction and structure parsing from ETDs · International Journal on Digital Libraries, 2024 · Benchmark data for ETD structure extraction.
- Evaluating the Impact of Automated Labeling on Retrieval Instability in Neural IR · SIGIR Doctoral Consortium, 2025 · Retrieval stability under LLM-derived labels.
- Archives, Digital Search, and AI Ethics · Routledge Companion, 2024 · Ethical AI services for public archives.
Journal Articles
- International Journal on Digital Libraries, Vol. 25 (2), 2024
- Teaching Natural Language Processing through Big Data Text Summarization with Problem-Based Learning. Data and Information Management, Vol. 4 (1), 2020
- Cadernos BAD, Vol. 1, 2019
- The Code4Lib Journal, Issue 34, 2016
Book Chapters
- In The Routledge Companion to Libraries, Archives, and the Digital Humanities, 2024
Conference Papers
- In New Trends in Theory and Practice of Digital Libraries, 2026
- In Proceedings of the 2025 ACM/IEEE Joint Conference on Digital Libraries (JCDL ’25), Virtual Event, pp. 129–138. 2025
- In Proceedings of the 2025 ACM/IEEE Joint Conference on Digital Libraries (JCDL ’25), Virtual Event, pp. 197–206. 2025
- In Proceedings of the 2025 ACM/IEEE Joint Conference on Digital Libraries (JCDL ’25), Virtual Event, pp. 177–186. 2025
- In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’25), Padua, Italy, p. 4209. 2025
- In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2025
- In Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI 2024), Vancouver, Canada, pp. 22878–22884. 2024
- In Proceedings of the 2023 ACM/IEEE Joint Conference on Digital Libraries (JCDL ’23), Santa Fe, New Mexico, USA, pp. 13–24. 2023
- In Proceedings of the 2023 ACM/IEEE Joint Conference on Digital Libraries (JCDL ’23), Santa Fe, New Mexico, USA, pp. 61–65. 2023
- ScanBank: A Benchmark Dataset for Figure Extraction from Scanned Electronic Theses and Dissertations. In Proceedings of the 2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL ’21), Virtual Event, pp. 565–566. 2021
- In Proceedings of the 2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL ’21), Virtual Event, pp. 230–233. 2021
- A Heuristic Baseline Method for Metadata Extraction from Scanned Electronic Theses and Dissertations. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL ’20), Virtual Event, China, pp. 515–516. 2020
Workshop Papers
- In 36th ACM Conference on Hypertext and Social Media (HT 2025), Chicago, Illinois, USA. 2025
- In 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, Padua, Italy. 2025
- In 2024 IEEE International Conference on Big Data (BigData ’24), Washington, DC, USA, pp. 2400–2409. 2024
- In Companion Proceedings of the ACM Web Conference 2023 (WWW ’23 Companion), Austin, TX, USA, pp. 834–842. 2023
- In Companion Proceedings of the Web Conference 2022 (WWW ’22 Companion), Virtual Event, Lyon, France, pp. 784–788. 2022
- In 2022 IEEE International Conference on Big Data (Big Data ’22), Osaka, Japan, pp. 2473–2481. 2022
Extended Abstracts
- In 2024 IEEE International Conference on Big Data (BigData ’24), Washington, DC, USA, pp. 8677–8679. 2024
- In 2024 IEEE International Conference on Big Data (BigData ’24), Washington, DC, USA, pp. 8825–8827. 2024
- In 2024 IEEE International Conference on Big Data (BigData ’24), Washington, DC, USA, pp. 8620–8622. 2024
- In Proceedings of the 23rd ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL ’23), Santa Fe, New Mexico, USA, pp. 256–257. 2023
- In Proceedings of the 2023 ACM/IEEE Joint Conference on Digital Libraries (JCDL ’23), Santa Fe, New Mexico, USA, pp. 323–324. 2023
- Presented at GL24 — Twenty-Fourth International Conference on Grey Literature, GreyNet International. 2022
- Presented at 25th International Symposium on Electronic Theses and Dissertations, Novi Sad, Serbia. 2022
- In 2021 IEEE International Conference on Big Data (BigData ’21), Orlando, FL, USA, pp. 6043–6045. 2021
- Presented at 24th International Symposium on Electronic Theses and Dissertations, Abu Dhabi, UAE. 2021
- Presented at CNI: Coalition for Networked Information Spring 2021 Membership Meeting. 2021
- Presented at CNI: Coalition for Networked Information Fall 2020 Membership Meeting. 2020
- In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL ’20), Virtual Event, China, pp. 557–558. 2020
- Presented at CNI: Coalition for Networked Information Fall 2019 Membership Meeting. 2019
Tutorials
- In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL ’20), Virtual Event, China, pp. 565–566. 2020
- In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL ’20), Virtual Event, China, pp. 567–568. 2020
- Presented at 22nd International Symposium on Electronic Theses and Dissertations. 2019
Workshops (Hosting/Organizing)
Preprints, White Papers, and Other Miscellanies
- CoRR, Vol. abs/2509.04759, 2025
- Virginia Tech, 2021
- Virginia Tech, 2020