
OpenAssistant Conversations -- Democratizing Large Language Model Alignment

Aligning large language models (LLMs) with human preferences has proven to drastically improve usability and has driven rapid adoption as demonstrated by ChatGPT. Alignment techniques such as supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) greatly reduce the required skill and domain knowledge to effectively harness the capabilities of LLMs, increasing their accessibility and utility across various domains. However, state-of-the-art alignment techniques like RLHF rely on high-quality human feedback data, which is expensive to create and often remains proprietary. In an effort to democratize research on large-scale alignment, we release OpenAssistant Conversations, a human-generated, human-annotated assistant-style conversation corpus consisting of 161,443 messages in 35 different languages, annotated with 461,292 quality ratings, resulting in over 10,000 complete and fully annotated conversation trees. The corpus is a product of a worldwide crowd-sourcing effort involving over 13,500 volunteers. Models trained on OpenAssistant Conversations show consistent improvements on standard benchmarks over respective base models. We release our code and data under a fully permissive licence.
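The abstract describes the corpus as a set of conversation trees: each tree starts from an initial prompt, and every subsequent message is a prompter or assistant turn replying to its parent, with several competing replies allowed at each point. Below is a minimal sketch of how such trees might be reconstructed from a flat message table. The field names (message_id, parent_id, role, text) follow the layout of the publicly released corpus on Hugging Face (OpenAssistant/oasst1) and are assumptions here, not details taken from this record.

```python
from collections import defaultdict

def build_trees(messages):
    """Group a flat list of message dicts into conversation trees.

    A root prompt has parent_id None; every other message points at its
    parent via parent_id, so replies can branch (several candidate
    answers to the same prompt, as in the annotated corpus).
    """
    children = defaultdict(list)
    roots = []
    for msg in messages:
        if msg["parent_id"] is None:
            roots.append(msg)
        else:
            children[msg["parent_id"]].append(msg)

    def attach(node):
        # Recursively attach each child under its parent message.
        node["replies"] = [attach(child) for child in children[node["message_id"]]]
        return node

    return [attach(root) for root in roots]

# Tiny illustrative tree: one prompt with two competing assistant replies.
messages = [
    {"message_id": "m1", "parent_id": None, "role": "prompter", "text": "Hi!"},
    {"message_id": "m2", "parent_id": "m1", "role": "assistant", "text": "Hello, how can I help?"},
    {"message_id": "m3", "parent_id": "m1", "role": "assistant", "text": "Hey there!"},
]
trees = build_trees(messages)
print(len(trees), "tree;", len(trees[0]["replies"]), "candidate replies to the root")
```

With the real corpus, the same reconstruction would be applied to the released message table after loading it; the quality ratings mentioned in the abstract attach per message and can be used to rank sibling replies within a tree.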

Bibliographic Details
Published in: arXiv.org, 2023-10
Main Authors: Köpf, Andreas; Kilcher, Yannic; von Rütte, Dimitri; Anagnostidis, Sotiris; Tam, Zhi-Rui; Stevens, Keith; Barhoum, Abdullah; Nguyen, Minh Duc; Stanley, Oliver; Nagyfi, Richárd; ES, Shahul; Suri, Sameer; Glushkov, David; Dantuluri, Arnav; Maguire, Andrew; Schuhmann, Christoph; Nguyen, Huu; Mattick, Alexander
Format: Article
Language: English
Identifier: EISSN 2331-8422
Publisher: Ithaca: Cornell University Library, arXiv.org
Rights: Published under the Creative Commons Attribution 4.0 licence (http://creativecommons.org/licenses/by/4.0/)
Source: ProQuest - Publicly Available Content Database
Subjects: Alignment; Chatbots; Domains; Feedback; Large language models; Machine learning
Online Access: https://www.proquest.com/docview/2802666671