Abstract:
Screening research papers for inclusion in a literature review is a time-consuming manual
process. We explore automating this process using OpenAI’s GPT-3.5 Turbo large language model (LLM).
Given text prompts specifying the inclusion/exclusion criteria, the LLM evaluated the abstract of each paper
and classified it into one of four categories: meeting both criteria, violating the first criterion, violating the
second criterion, or violating both criteria. Our Python code interfaced with the OpenAI API to pass paper
abstracts as prompts to the LLM. For 347 papers, the LLM flagged 173 as meeting the criteria, with 3
additional papers included after accounting for missing abstracts, yielding 176 papers selected for full-text
retrieval. A manual review of a sample suggested reasonable accuracy. While further validation is needed,
this demonstrates LLMs’ potential for accelerating systematic literature reviews.
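The following is a minimal sketch of the screening loop described above, not the authors' original code. It assumes the openai Python package (version 1.x) with an OPENAI_API_KEY environment variable; the function name screen_abstract, the prompt wording, and the placeholder criteria are illustrative reconstructions of the four-way classification.

```python
# Hedged sketch of per-abstract screening with GPT-3.5 Turbo (illustrative only).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CATEGORIES = [
    "meets both criteria",
    "violates the first criterion",
    "violates the second criterion",
    "violates both criteria",
]

# Placeholder criteria: the actual inclusion/exclusion criteria of the review
# are not reproduced here.
SYSTEM_PROMPT = (
    "You screen papers for a systematic literature review.\n"
    "Criterion 1: <first inclusion criterion>.\n"
    "Criterion 2: <second inclusion criterion>.\n"
    "Reply with exactly one of: " + "; ".join(CATEGORIES) + "."
)

def screen_abstract(abstract: str) -> str:
    """Ask GPT-3.5 Turbo to place one abstract into one of the four categories."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0,  # keep the classification as deterministic as possible
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": abstract},
        ],
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    verdict = screen_abstract("Example abstract text to be screened ...")
    print(verdict)  # e.g. "meets both criteria"
```

In practice such a loop would run over all 347 abstracts, with papers whose verdict is "meets both criteria" (plus those lacking abstracts) forwarded for full-text retrieval.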
Description:
1. Mintii, M.M., 2023. Exploring the landscape of STEM education and personnel training: a comprehensive systematic review. Educational Dimension, 9, pp.149–172. Available from: https://doi.org/10.31812/ed.583
2. Hamaniuk, V.A., 2021. The potential of Large Language Models in language education. Educational
Dimension, 5, pp.208–210. Available from: https://doi.org/10.31812/ed.650