Prospects for Machine Translation in the Era of Large Language Models
DOI: https://doi.org/10.51094/jxiv.932
Keywords: machine translation, multilingual processing, large language models
Abstract
The development of large language models in recent years has been remarkable, and natural language processing technologies, in particular text generation technologies including machine translation, have made significant advances. This paper reviews the progress and main challenges of machine translation research using large language models and discusses future prospects.
Disclosure of Conflicts of Interest
There are no conflicts of interest to disclose regarding this paper.
Published
Submitted: 2024-10-15 02:45:26 UTC
Published: 2024-10-18 00:35:26 UTC
License
Copyright (c) 2024
東山, 翔平
This work is licensed under a Creative Commons Attribution 4.0 International License.