OpenAssistant Conversations -- Democratizing Large Language Model Alignment
Aligning large language models (LLMs) with human preferences has proven to drastically improve usability and has driven rapid adoption as demonstrated by ChatGPT. Alignment techniques such as supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) greatly reduce the required skill and domain knowledge to effectively harness the capabilities of LLMs, increasing their accessibility and utility across various domains. However, state-of-the-art alignment techniques like RLHF rely on high-quality human feedback data, which is expensive to create and often remains proprietary. In an effort to democratize research on large-scale alignment, we release OpenAssistant Conversations, a human-generated, human-annotated assistant-style conversation corpus consisting of 161,443 messages in 35 different languages, annotated with 461,292 quality ratings, resulting in over 10,000 complete and fully annotated conversation trees. The corpus is a product of a worldwide crowd-sourcing effort involving over 13,500 volunteers. Models trained on OpenAssistant Conversations show consistent improvements on standard benchmarks over respective base models. We release our code and data under a fully permissive licence.
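The corpus described in the abstract is organized as conversation trees: each message references a parent message, and quality ratings attach to individual messages. Below is a minimal sketch of how one such tree might be reconstructed from a flat message table, assuming the dataset is published on the Hugging Face Hub as `OpenAssistant/oasst1` and that the per-message fields `message_id`, `parent_id`, `text`, and `role` are named as assumed here (none of these names are confirmed by this record).

```python
from collections import defaultdict

from datasets import load_dataset  # pip install datasets

# Assumed Hub identifier and field names -- adjust if the release differs.
ds = load_dataset("OpenAssistant/oasst1", split="train")

children = defaultdict(list)  # parent_id -> list of reply messages
roots = []                    # prompter messages that start a tree
for msg in ds:
    if msg["parent_id"] is None:
        roots.append(msg)
    else:
        children[msg["parent_id"]].append(msg)

def walk(msg, depth=0):
    """Print one conversation tree, indenting replies by tree depth."""
    print("  " * depth + f"[{msg['role']}] {msg['text'][:60]!r}")
    for reply in children[msg["message_id"]]:
        walk(reply, depth + 1)

walk(roots[0])
```

Grouping replies by `parent_id` up front keeps the traversal linear in the number of messages, which matters for a corpus of 161,443 messages.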
Published in: | arXiv.org 2023-10 |
---|---|
Main Authors: | Köpf, Andreas; Kilcher, Yannic; von Rütte, Dimitri; Anagnostidis, Sotiris; Tam, Zhi-Rui; Stevens, Keith; Barhoum, Abdullah; Nguyen, Minh Duc; Oliver, Stanley; Nagyfi, Richárd; Shahul, E S; Suri, Sameer; Glushkov, David; Dantuluri, Arnav; Maguire, Andrew; Schuhmann, Christoph; Nguyen, Huu; Mattick, Alexander |
Format: | Article |
Language: | English |
Subjects: | Alignment; Chatbots; Domains; Feedback; Large language models; Machine learning |
container_title | arXiv.org |
---|---|
creator | Köpf, Andreas; Kilcher, Yannic; von Rütte, Dimitri; Anagnostidis, Sotiris; Tam, Zhi-Rui; Stevens, Keith; Barhoum, Abdullah; Nguyen, Minh Duc; Oliver, Stanley; Nagyfi, Richárd; Shahul, E S; Suri, Sameer; Glushkov, David; Dantuluri, Arnav; Maguire, Andrew; Schuhmann, Christoph; Nguyen, Huu; Mattick, Alexander |
description | Aligning large language models (LLMs) with human preferences has proven to drastically improve usability and has driven rapid adoption as demonstrated by ChatGPT. Alignment techniques such as supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) greatly reduce the required skill and domain knowledge to effectively harness the capabilities of LLMs, increasing their accessibility and utility across various domains. However, state-of-the-art alignment techniques like RLHF rely on high-quality human feedback data, which is expensive to create and often remains proprietary. In an effort to democratize research on large-scale alignment, we release OpenAssistant Conversations, a human-generated, human-annotated assistant-style conversation corpus consisting of 161,443 messages in 35 different languages, annotated with 461,292 quality ratings, resulting in over 10,000 complete and fully annotated conversation trees. The corpus is a product of a worldwide crowd-sourcing effort involving over 13,500 volunteers. Models trained on OpenAssistant Conversations show consistent improvements on standard benchmarks over respective base models. We release our code and data under a fully permissive licence. |
format | article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2023-10 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2802666671 |
source | ProQuest - Publicly Available Content Database |
subjects | Alignment; Chatbots; Domains; Feedback; Large language models; Machine learning |
title | OpenAssistant Conversations -- Democratizing Large Language Model Alignment |