Preprint / Version 2

The applicability of Large Language Model (LLM) techniques to legacy information retrieval systems such as OPAC

DOI:

https://doi.org/10.51094/jxiv.679

Keywords:

OPAC, large language model, LLM, generative AI, University Libraries, Online Public Access Catalogue

Abstract

This paper shows that, with recent advances in large language model (LLM) techniques such as GPT, LLMs can be applied to legacy retrieval systems such as a library's Online Public Access Catalogue (OPAC) for the following tasks: generating search questions, transforming them into search queries, semantically aware retrieval, and handling search results and evaluating their suitability. The results indicate that LLM techniques can be applied at each stage of the search process. The paper also examines whether the OPAC itself can serve as an information infrastructure for LLMs.
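The search stages listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the records, the stopword list, and the function names `question_to_query` and `search` are hypothetical, a simple stopword filter stands in for the LLM step that turns a question into a query, and term-overlap cosine similarity stands in for embedding-based semantic retrieval.

```python
import math
import re
from collections import Counter

# Hypothetical catalogue records; a real OPAC would supply these.
RECORDS = [
    {"id": 1, "title": "Introduction to information retrieval"},
    {"id": 2, "title": "Large language models and library catalogues"},
    {"id": 3, "title": "A history of medieval gardens"},
]

# Assumed stopword list, used only to imitate query generation.
STOPWORDS = {
    "a", "about", "an", "and", "any", "are", "books", "do", "for",
    "have", "i", "in", "of", "on", "the", "there", "to", "you",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def question_to_query(question):
    """Stand-in for the LLM step: reduce a question to keyword terms."""
    return [t for t in tokenize(question) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(question, records, top_k=2):
    """Rank records against the question and drop records with no overlap."""
    query_vec = Counter(question_to_query(question))
    scored = [
        (cosine(query_vec, Counter(tokenize(r["title"]))), r) for r in records
    ]
    scored = [(score, r) for score, r in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    for score, record in search(
        "Do you have books about large language models?", RECORDS
    ):
        print(f"{score:.2f}  {record['title']}")
```

In a setup closer to what the abstract describes, `question_to_query` would call an LLM and `cosine` would compare embedding vectors; the final evaluation stage (judging whether the returned records suit the question) is left out of this sketch.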

Conflicts of Interest Disclosure

The authors declare no conflicts of interest associated with this manuscript.

Posted


Submitted: 2024-04-27 13:19:21 UTC

Published: 2024-05-07 01:18:21 UTC — Updated on 2024-05-13 09:43:12 UTC

Reason(s) for revision

Corrected errors in reference numbers.

Section
Information Sciences