Explain-and-Test: An Interactive Machine Learning Framework for Exploring Text Embeddings

Bibliographic Details
Main Authors: Raval, Shivam; Wang, Carolyn; Viegas, Fernanda; Wattenberg, Martin
Format: Conference Proceeding
Language: English
Description
Summary: Text embeddings (mappings of collections of text to points in high-dimensional space) are a common object of analysis. A classic method to visualize these embeddings is to create a nonlinear projection to two dimensions and look for clusters and other structures in the resulting map. Explaining why certain texts cluster together, however, can be difficult. In this paper, we introduce a human-in-the-loop framework for applying machine learning (ML) to this challenge. The framework has two stages: (1) explain, in which we use ML to produce a description of a pattern; and (2) test, in which the user can verify the explanation by entering new text that fits the pattern and seeing where it appears on the map. If the new text is mapped to the original cluster, that is evidence in favor of the ML-generated explanation. We illustrate this process with a visualization application that provides two kinds of explanations: Natural Language Explanations and Contrastive PhraseClouds. Scenarios on exploring academic papers and literary works showcase the benefit of our workflow in discovering related topics and analyzing thematic differences in text.
ISSN: 2771-9553
DOI: 10.1109/VIS54172.2023.00052
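
The "test" stage described in the summary can be illustrated with a small sketch. This is not the authors' implementation: the names `build_vocab`, `embed`, `nearest_cluster`, and `supports_explanation` are hypothetical, the bag-of-words vectors stand in for a real text embedding model, and the check is done by nearest centroid in the embedding space rather than on a two-dimensional projection, which is a simplification of the map-based check in the summary.

```python
import numpy as np

def build_vocab(texts):
    """Assign each distinct lowercase token an index (toy vocabulary)."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def embed(text, vocab):
    """Stand-in embedding: L2-normalised bag-of-words; unknown tokens are ignored."""
    vec = np.zeros(len(vocab))
    for token in text.lower().split():
        if token in vocab:
            vec[vocab[token]] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def nearest_cluster(text, clusters, vocab):
    """Return the name of the cluster whose centroid has the highest cosine similarity to text."""
    query = embed(text, vocab)
    best_name, best_score = None, -np.inf
    for name, members in clusters.items():
        centroid = np.mean([embed(m, vocab) for m in members], axis=0)
        denom = np.linalg.norm(centroid) * np.linalg.norm(query)
        score = float(centroid @ query / denom) if denom > 0 else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name

def supports_explanation(new_text, target_cluster, clusters, vocab):
    """Test stage: does user-written text that fits the explanation land in the original cluster?"""
    return nearest_cluster(new_text, clusters, vocab) == target_cluster

# Hypothetical toy data: two clusters of short texts.
clusters = {
    "ml": ["neural network training", "network training loss"],
    "sea": ["ocean whale ship", "whale ship captain"],
}
vocab = build_vocab([t for members in clusters.values() for t in members])

# Suppose the explanation for the "ml" cluster was "texts about training neural networks".
# The user writes new text fitting that description and checks where it lands.
result = supports_explanation("neural network loss", "ml", clusters, vocab)
```

A `True` result counts as evidence in favor of the explanation, in the sense used in the summary; a `False` result suggests the explanation does not capture what actually groups the cluster.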