Research notes

Research Blog

Concise notes on digital libraries, machine learning, scholarly communication, and the systems that make research evidence easier to find, evaluate, and reuse.

3 posts · 10 topics · RSS feed

December 19, 2025

Learning from LLM Disagreement in Retrieval Evaluation

Our JCDL 2025 paper with Bipasha Banerjee and Edward A. Fox examines how model disagreement changes retrieval evaluation when large language models filter scholarly records before ranking. “Learning from LLM Disagreement in Retrieval Evaluation” shows that disagreement between relevance labelers can identify cases near the boundary of an information need. In thematic retrieval tasks, particularly ones involving Sustainable Development Goals (SDGs), those boundary cases determine which records remain available to a dashboard,...
6 min read
- AI
- Digital Libraries
- Information Retrieval
- SDGs
- JCDL
February 10, 2025

The VTechAGP Dataset: A Benchmark for Academic-to-General-Audience Paraphrasing

I recently collaborated with Ming Cheng and Jiaying Gong, two members of the Machine Learning Laboratory research team led by Dr. Hoda Eldardiry. We created the VTechAGP dataset to support research on text simplification and paraphrase generation.
2 min read
- NLP
- Datasets
- Digital Libraries
- Text Simplification
December 5, 2024

Small, Locally-Hosted LLMs for Sustainable Development Goal Classification

We are excited to announce that “Agentic AI for Improving Precision in Identifying Contributions to Sustainable Development Goals” has been accepted as a poster at the 2024 IEEE International Conference on Big Data (IEEE BigData 2024), which takes place December 15–18, 2024, in Washington, DC.
2 min read
- AI
- SDGs
- IEEE BigData
- Poster