MOSS: An Open Conversational Large Language Model

Bibliographic Details
Published in:	International journal of automation and computing 2024-10, Vol.21 (5), p.888-905
Main Authors:	Sun, Tianxiang; Zhang, Xiaotian; He, Zhengfu; Li, Peng; Cheng, Qinyuan; Liu, Xiangyang; Yan, Hang; Shao, Yunfan; Tang, Qiong; Zhang, Shiduo; Zhao, Xingjian; Chen, Ke; Zheng, Yining; Zhou, Zhejian; Li, Ruixiao; Zhan, Jun; Zhou, Yunhua; Li, Linyang; Yang, Xiaogui; Wu, Lingling; Yin, Zhangyue; Huang, Xuanjing; Jiang, Yu-Gang; Qiu, Xipeng
Format:	Article
Language:	English
Subjects:	Artificial Intelligence; Chatbots; Computer Science; Effectiveness; Large language models; Research Article

Description
Summary:	Conversational large language models (LLMs) such as ChatGPT and GPT-4 have recently exhibited remarkable capabilities across various domains, capturing widespread public attention. To facilitate this line of research, we report the development of MOSS, an open-source conversational LLM that contains 16B parameters and can follow a variety of instructions in multi-turn interactions with humans. The base model of MOSS is pre-trained on large-scale unlabeled English, Chinese, and code data. To optimize the model for dialogue, we generate 1.1M synthetic conversations based on user prompts collected through the API of our earlier model versions. We then perform preference-aware training on preference data annotated from AI feedback. Evaluation results on real-world use cases and academic benchmarks demonstrate the effectiveness of the proposed approaches. In addition, we present an effective practice for augmenting MOSS with several external tools. Through the development of MOSS, we have established a complete technical roadmap for large language models, from pre-training and supervised fine-tuning to alignment, verifying the feasibility of building ChatGPT-like models under resource-limited conditions and providing a reference for both the academic and industrial communities. Model weights and code are publicly available at https://github.com/OpenMOSS/MOSS.
ISSN:	2731-538X, 1476-8186, 2731-5398, 1751-8520
DOI:	10.1007/s11633-024-1502-8