Simulating Organizational Data from Redacted Input for Inference Enterprise Modeling

Bibliographic Details
Published in: Digital threats (Print), 2022-03, Vol. 3 (1), p. 1-30
Main Authors: Sticha, Paul J., Diaz, Tirso E., Axelrad, Elise T., Vermillion, Sean D., Buede, Dennis M.
Format: Article
Language: English
Description
Summary: Organizations that use data to assess insider threats, or other workforce outcomes, need to evaluate the quality of their assessment methods. This evaluation relies on an accurate representation of the predictors and criterion variables within the organization's workforce. However, privacy concerns often limit the information that is available for evaluation. For example, the organization might anonymize identifying information of its workforce, or the evaluation might be restricted to use group statistics, such as marginal distributions of predictors and criteria, along with their intercorrelations. In this paper we demonstrate a hybrid approach for simulating workforce data to support inference-enterprise evaluation, including the crowdsourced elicitation of marginal distributions and correlations of predictors and the simulation of a workforce population from the statistical properties of a redacted set of predictor distributions. The methods provide a way to simulate a population that has statistical characteristics of the workforce, in order to assess the performance of the assessment methods. The statistical methods are supplemented by expert judgments for situations where required information is not available. We evaluate these methods using anonymized data from an actual organization.
ISSN: 2692-1626, 2576-5337
DOI: 10.1145/3457910
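
Illustrative sketch: the summary describes simulating a workforce population when only group statistics (marginal distributions and intercorrelations) are available. One generic way to do this is a Gaussian copula, shown below as a minimal sketch. It is not taken from the article and may differ from the authors' method. The function name `simulate_population`, the two example predictors, and all probabilities and correlations are hypothetical values chosen for illustration only.

```python
import math
import numpy as np

def simulate_population(marginals, corr, n, seed=0):
    """Draw n synthetic records with a Gaussian copula (illustrative only).

    marginals: list of (values, probs) pairs, one discrete marginal per predictor.
    corr: correlation matrix of the latent normal scores. Assumption: the
          correlations observed in the output will be close to, but usually
          somewhat weaker than, these latent values.
    """
    rng = np.random.default_rng(seed)
    corr = np.asarray(corr, dtype=float)
    # Correlated standard normal scores, one column per predictor.
    z = rng.multivariate_normal(np.zeros(len(marginals)), corr, size=n)
    # Standard normal CDF turns each score into a uniform value in [0, 1].
    u = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    columns = []
    for j, (values, probs) in enumerate(marginals):
        # Inverse CDF of the discrete marginal: find the first category whose
        # cumulative probability exceeds the uniform draw.
        cum = np.cumsum(probs)
        idx = np.searchsorted(cum, u[:, j], side="right")
        idx = np.minimum(idx, len(values) - 1)  # guard against rounding in cum
        columns.append(np.asarray(values)[idx])
    return np.column_stack(columns)

# Hypothetical inputs: a rare binary flag and a five-level ordinal rating.
marginals = [
    ([0, 1], [0.9, 0.1]),
    ([1, 2, 3, 4, 5], [0.1, 0.2, 0.4, 0.2, 0.1]),
]
corr = [[1.0, 0.5], [0.5, 1.0]]
pop = simulate_population(marginals, corr, n=20000)
print(pop.shape, pop[:, 0].mean(), np.corrcoef(pop[:, 0], pop[:, 1])[0, 1])
```

The simulated marginals match the supplied probabilities up to sampling error, while the correlation between the discretized columns is attenuated relative to the latent value. A practical application would need to calibrate the latent correlations so that the observed ones hit their targets.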