A Bayesian decision-theoretic framework for data privacy
Abstract
The scientific and economic value of data continues to increase alongside advances in technology. New hardware and software systems not only allow but often require richer and vaster data to operate. As the value of the input data to these systems is recognised, so too is the accompanying loss of privacy for data providers. In this landscape, data privacy is a fundamental issue for data science disciplines such as statistics and machine learning, and more generally for scientific and industrial pursuits that use sensitive data. We develop a framework for measuring privacy from a Bayesian decision-theoretic point of view. With our framework we can generate rigorously justified new privacy principles, assess existing privacy definitions using decision theory, and create new definitions that are fit for purpose. We pay special attention to assessing privacy for deterministic algorithms, which current privacy standards overlook, and for Monte Carlo samples from posterior distributions.