SIT 719
SIT719 Security and Privacy Issues in Analytics
Credit Task 8.2: k-anonymity for Sensitive Data Privacy
Overview
Data owners want a way to transform a dataset containing highly sensitive information into a
privacy-preserving, low-risk set of records that can be shared with anyone. This task examines
k-anonymity, a privacy model commonly applied to protect data subjects' privacy in data-sharing
scenarios, and the guarantees that k-anonymity can provide when used to anonymise data. Various
open-source and commercial tools use this privacy model to protect sensitive data.
Amnesia is a data anonymization tool that allows users to remove identifying information from data.
Amnesia not only removes direct identifiers such as names and SSNs, but also transforms secondary
identifiers (quasi-identifiers) such as birth date and zip code so that individuals cannot be
identified in the data. Amnesia supports k-anonymity.
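To make the idea concrete, the following minimal Python sketch generalizes two quasi-identifiers (birth year and zip code) and measures k as the size of the smallest group of records that share the same quasi-identifier values. The records, the function names generalize and k_of, and the chosen generalization levels (decade, 3-digit zip prefix) are all hypothetical and chosen only for illustration; this is not how Amnesia itself is implemented.

```python
from collections import Counter

# Hypothetical toy records: (birth_year, zip_code, diagnosis).
# birth_year and zip_code are treated as quasi-identifiers;
# diagnosis is the sensitive attribute.
records = [
    (1985, "30121", "flu"),
    (1987, "30125", "asthma"),
    (1989, "30128", "flu"),
    (1992, "30210", "diabetes"),
    (1994, "30215", "flu"),
    (1996, "30219", "asthma"),
]


def generalize(record):
    """Coarsen the quasi-identifiers: birth year to a decade, zip code to a 3-digit prefix."""
    birth_year, zip_code, diagnosis = record
    decade = f"{birth_year // 10 * 10}s"
    return (decade, zip_code[:3] + "**", diagnosis)


def k_of(rows):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter((row[0], row[1]) for row in rows)
    return min(groups.values())


k_before = k_of(records)                      # every record is unique on its quasi-identifiers
anonymized = [generalize(r) for r in records]
k_after = k_of(anonymized)                    # each record now shares its values with others

print("k before generalization:", k_before)
print("k after generalization:", k_after)
```

In this toy dataset every original record is unique on (birth year, zip code), so k is 1; after generalization the records fall into two groups of three, so the table is 3-anonymous with respect to those two attributes.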
Please see the task description for the detailed tasks.
This is a Credit task, so please make sure you are already up to date with all Pass tasks before
attempting this task.
Task Description
Instructions:
1. Write a 500-word summary addressing the following:
a) Quasi-identifiers
b) k-anonymity
c) How can k-anonymity help prevent privacy attacks?
2. Research commercial and open-source tools for data anonymization, then make a list of the
tools you identify.
Upload the summary report to the onTrack system.