
OpenAssistant Conversations -- Democratizing Large Language Model Alignment

Aligning large language models (LLMs) with human preferences has proven to drastically improve usability and has driven rapid adoption as demonstrated by ChatGPT. Alignment techniques such as supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) greatly reduce the required skill and domain knowledge to effectively harness the capabilities of LLMs, increasing their accessibility and utility across various domains. However, state-of-the-art alignment techniques like RLHF rely on high-quality human feedback data, which is expensive to create and often remains proprietary. In an effort to democratize research on large-scale alignment, we release OpenAssistant Conversations, a human-generated, human-annotated assistant-style conversation corpus consisting of 161,443 messages in 35 different languages, annotated with 461,292 quality ratings, resulting in over 10,000 complete and fully annotated conversation trees. The corpus is a product of a worldwide crowd-sourcing effort involving over 13,500 volunteers. Models trained on OpenAssistant Conversations show consistent improvements on standard benchmarks over respective base models. We release our code and data under a fully permissive licence.
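The abstract describes the corpus as a set of conversation trees: each tree starts from an initial prompt, and every subsequent message is a prompter or assistant turn replying to its parent, with several competing replies allowed at each point. Below is a minimal sketch of how such trees might be reconstructed from a flat message table. The field names (message_id, parent_id, role, text) follow the layout of the publicly released corpus on Hugging Face (OpenAssistant/oasst1) and are assumptions here, not details taken from this record.

```python
from collections import defaultdict

def build_trees(messages):
    """Group a flat list of message dicts into conversation trees.

    A root prompt has parent_id None; every other message points at its
    parent via parent_id, so replies can branch (several candidate
    answers to the same prompt, as in the annotated corpus).
    """
    children = defaultdict(list)
    roots = []
    for msg in messages:
        if msg["parent_id"] is None:
            roots.append(msg)
        else:
            children[msg["parent_id"]].append(msg)

    def attach(node):
        # Recursively attach each child under its parent message.
        node["replies"] = [attach(child) for child in children[node["message_id"]]]
        return node

    return [attach(root) for root in roots]

# Tiny illustrative tree: one prompt with two competing assistant replies.
messages = [
    {"message_id": "m1", "parent_id": None, "role": "prompter", "text": "Hi!"},
    {"message_id": "m2", "parent_id": "m1", "role": "assistant", "text": "Hello, how can I help?"},
    {"message_id": "m3", "parent_id": "m1", "role": "assistant", "text": "Hey there!"},
]
trees = build_trees(messages)
print(len(trees), "tree;", len(trees[0]["replies"]), "candidate replies to the root")
```

With the real corpus, the same reconstruction would be applied to the released message table after loading it; the quality ratings mentioned in the abstract attach per message and can be used to rank sibling replies within a tree.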

Bibliographic Details
Published in: arXiv.org, 2023-10
Main Authors: Köpf, Andreas; Kilcher, Yannic; von Rütte, Dimitri; Anagnostidis, Sotiris; Tam, Zhi-Rui; Stevens, Keith; Barhoum, Abdullah; Nguyen, Minh Duc; Stanley, Oliver; Nagyfi, Richárd; ES, Shahul; Suri, Sameer; Glushkov, David; Dantuluri, Arnav; Maguire, Andrew; Schuhmann, Christoph; Nguyen, Huu; Mattick, Alexander
Format: Article
Language: English
Identifier: EISSN 2331-8422
Publisher: Ithaca: Cornell University Library, arXiv.org
Rights: Published under the Creative Commons Attribution 4.0 licence (http://creativecommons.org/licenses/by/4.0/)
Source: ProQuest - Publicly Available Content Database
Subjects: Alignment; Chatbots; Domains; Feedback; Large language models; Machine learning
Online Access: https://www.proquest.com/docview/2802666671