Exploring Open Large Language Models for the Japanese Language: A Practical Guide
DOI: https://doi.org/10.51094/jxiv.682
Keywords: Large Language Models, Japanese Language
Abstract
While large language models (LLMs) have demonstrated remarkable capabilities in handling Japanese, most are trained on English-centric corpora, which can leave them deficient in understanding and generating Japanese text. In response, researchers have been actively developing LLMs with a specific focus on Japanese, many of which have been made publicly available. This rapid growth has made it difficult to maintain a comprehensive overview of the field. To address this issue, this report reviews open LLMs for Japanese, including instruction-tuned and multimodal models. We also introduce existing evaluation benchmarks for Japanese LLMs, aiming to offer a practical guide to choosing the most suitable model. We continually update our work at https://github.com/llm-jp/awesome-japanese-llm.
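As a practical illustration (not part of the manuscript itself), most of the open Japanese LLMs catalogued in the accompanying list are distributed through the Hugging Face Hub and can be tried with the transformers library. The sketch below is a minimal, hypothetical example assuming the transformers and accelerate packages are installed; the model ID shown is only one of the surveyed models, and any entry from https://github.com/llm-jp/awesome-japanese-llm can be substituted.

    # Minimal sketch: load an open Japanese LLM from the Hugging Face Hub and generate text.
    # The model ID is an example; swap in any model from the awesome-japanese-llm list.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "elyza/ELYZA-japanese-Llama-2-13b"  # example open Japanese LLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "日本の首都はどこですか？"  # "What is the capital of Japan?"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Instruction-tuned variants generally expect a model-specific chat or prompt template, so consulting each model card before benchmarking is advisable.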
Conflicts of Interest Disclosure
The author declares no conflicts of interest associated with this manuscript.
Posted
Submitted: 2024-04-24 12:28:38 UTC
Published: 2024-04-26 11:22:08 UTC
License
Copyright (c) 2024 Kaito Sugimoto
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.