The Australian Census Data at Risk

The Australian Census Data at Risk: Vulnerability of the ABS perturbation method identified.

Today we are releasing work that identifies and demonstrates a vulnerability in the perturbation algorithm used by the Australian Bureau of Statistics (ABS) in TableBuilder, its online tool for querying Australian Census data.

In a nutshell, the algorithm, named TBE, perturbs answers to queries by adding noise distributed within a bounded (possibly undisclosed) range. It is faulty and puts the highly sensitive original census data at major risk of being revealed.

We demonstrated how an attacker, who may not know the perturbation parameters, can not only find any hidden parameters of the algorithm but also remove the noise to obtain the original answer to some query of choice. None of the attacks we presented depend on any background information.
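To give a feel for why bounded noise can be removed, here is a minimal sketch of the general averaging idea. It is not the TBE algorithm and not the attack from the paper. It assumes, purely for illustration, that the noise is an integer drawn uniformly from [-r, r], and that the attacker can obtain many answers that share the same true count but carry independently drawn noise. The names perturbed_answer and averaging_attack and the values used are hypothetical; see the paper for how the actual attacks obtain such answers.

```python
import random


def perturbed_answer(true_count, r, rng):
    """Toy stand-in for a bounded perturbation (an assumption, not TBE itself):
    add integer noise drawn uniformly from [-r, r]."""
    return true_count + rng.randint(-r, r)


def averaging_attack(true_count, r, num_queries, rng):
    """Average num_queries noisy answers that are assumed to share one true count.

    Returns the rounded average (an estimate of the true count) and a lower
    bound on the hidden noise bound r inferred from the spread of the answers.
    """
    answers = [perturbed_answer(true_count, r, rng) for _ in range(num_queries)]

    # Zero-mean noise cancels out in the average, so rounding the mean
    # lands on the true count once enough answers have been collected.
    estimate = round(sum(answers) / len(answers))

    # All answers lie in [true - r, true + r], so max - min <= 2r.
    # Half the observed spread, rounded up, is therefore a lower bound on r.
    r_lower_bound = (max(answers) - min(answers) + 1) // 2
    return estimate, r_lower_bound


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical values: a true count of 1234 and a hidden bound r = 2.
    estimate, r_bound = averaging_attack(1234, 2, 200, rng)
    print("estimated true count:", estimate)
    print("lower bound on r:", r_bound)
```

Under these assumptions, the error of the average shrinks as the number of answers grows, which is why a bounded noise range alone offers little protection once an attacker can collect many answers to the same underlying question.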

The implications of this attack go beyond re-identification risks: the attack makes it possible to reveal values that the TBE perturbation algorithm is intended to hide, and hence to reconstruct the original census data.

While the attack also applies to the actual Australian census data available through TableBuilder, for ethical reasons we only demonstrate it on synthetic data. We note, however, that the perturbation method used in the online ABS TableBuilder tool is proven vulnerable to this attack: the original answers behind perturbed results can be retrieved with a probability of more than 95% using only 200 queries.

Please see the paper to learn more. Since the TableBuilder tool allows access to some of the most sensitive and personal data, we believe it is important for the public to know what security and privacy measures protect the use of their data. Please take a moment to read the discussion in Section 7 of the paper, which clarifies a number of considerations regarding transparency in the design and implementation of security mechanisms, vulnerability disclosure, the communication of these results to the ABS, and better approaches to protecting data as sensitive as the Census data.

https://arxiv.org/abs/1902.06414

For the full version of the paper please see:

https://arxiv.org/pdf/1902.06414.pdf
