Loading…

Automated directed fairness testing

Fairness is a critical trait in decision making. As machine-learning models are increasingly being used in sensitive application domains (e.g. education and employment) for decision making, it is crucial that the decisions computed by such models are free of unintended bias. But how can we automatic...

Full description

Saved in:
Bibliographic Details
Main Authors: Udeshi, Sakshi, Arora, Pryanshu, Chattopadhyay, Sudipta
Format: Conference Proceeding
Language:English
Subjects:
Citations: Items that cite this one
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Fairness is a critical trait in decision making. As machine-learning models are increasingly being used in sensitive application domains (e.g. education and employment) for decision making, it is crucial that the decisions computed by such models are free of unintended bias. But how can we automatically validate the fairness of arbitrary machine-learning models? For a given machine-learning model and a set of sensitive input parameters, our Aeqitas approach automatically discovers discriminatory inputs that highlight fairness violation. At the core of Aeqitas are three novel strategies to employ probabilistic search over the input space with the objective of uncovering fairness violation. Our Aeqitas approach leverages inherent robustness property in common machine-learning models to design and implement scalable test generation methodologies. An appealing feature of our generated test inputs is that they can be systematically added to the training set of the underlying model and improve its fairness. To this end, we design a fully automated module that guarantees to improve the fairness of the model. We implemented Aeqitas and we have evaluated it on six stateof- the-art classifiers. Our subjects also include a classifier that was designed with fairness in mind. We show that Aeqitas effectively generates inputs to uncover fairness violation in all the subject classifiers and systematically improves the fairness of respective models using the generated test inputs. In our evaluation, Aeqitas generates up to 70% discriminatory inputs (w.r.t. the total number of inputs generated) and leverages these inputs to improve the fairness up to 94%.
ISSN:2643-1572
DOI:10.1145/3238147.3238165