

SIT719 Security and Privacy Issues in Analytics

Credit Task 8.2: k-anonymity for Sensitive Data Privacy
Overview

Data owners want a way to transform a dataset containing highly sensitive information into a
privacy-preserving, low-risk set of records that can be shared with anyone. This task covers
k-anonymity, a privacy model commonly applied to protect data subjects' privacy in data sharing
scenarios, and the guarantees that k-anonymity can provide when used to anonymise data. Several
open source and commercial tools use this privacy model to protect sensitive data.

Amnesia is a data anonymization tool that allows users to remove identifying information from data.
Amnesia not only removes direct identifiers such as names and SSNs but also transforms secondary
identifiers such as birth date and zip code so that individuals cannot be identified in the data.
Amnesia supports k-anonymity.
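
The two steps described above (dropping direct identifiers, then coarsening secondary identifiers) can be sketched in a few lines of Python. This is only an illustration of the idea, not Amnesia's API or algorithm; the field names, the sample records, and the choice to keep the birth year and the first three zip digits are all assumptions made for the example.

```python
# Hypothetical records; every field name and value is made up for illustration.
records = [
    {"name": "Alice", "ssn": "111-11-1111", "birth_date": "1985-03-12",
     "zip": "30301", "diagnosis": "flu"},
    {"name": "Bob", "ssn": "222-22-2222", "birth_date": "1985-11-02",
     "zip": "30309", "diagnosis": "asthma"},
]

# Assumption: these columns identify a person directly and are dropped.
DIRECT_IDENTIFIERS = {"name", "ssn"}


def anonymise(record):
    """Drop direct identifiers and generalise the secondary identifiers."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["birth_date"] = record["birth_date"][:4]   # keep the year only
    out["zip"] = record["zip"][:3] + "**"          # keep the first three digits
    return out


anonymised = [anonymise(r) for r in records]
for row in anonymised:
    print(row)
```

After this transformation both sample records share the same birth year and zip prefix, so those two columns no longer tell them apart.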

Please see the task description for the detailed tasks.

This is a Credit task, so please make sure you are already up to date with all Pass tasks before
attempting this task.

Task Description
Instructions:

1. Write a 500-word summary addressing the following:
a) Quasi-identifiers
b) k-anonymity
c) How can k-anonymity help prevent privacy attacks?

2. Do some research to identify some commercial and open-source tools for data
anonymization.

Then make a list of the tools.


Upload the summary report to the onTrack system.
Answered Same Day May 27, 2021 SIT719

Solution

Neha answered on Jun 03 2021
Quasi-Identifier
A quasi-identifier is a piece of information that is not a unique identifier on its own but that an intruder can use, usually together with other information, to single out a specific target individual among a large number of people (Zhang, X., Liu, C., Nepal, S. and Chen, J). The intruder may obtain such personal information about the target in the following ways:
· The target person is well known, and information about them is publicly available.
· Publicly available registries or the media.
· Information that individuals post about themselves on social media.
· Information that the individual has disclosed to multiple people.
It is important to know that a quasi-identifier can sometimes be predicted from another variable. In that case both variables are considered quasi-identifiers. There is no point in protecting variable A but not variable B if the intruder can easily predict A from B (Koot, M.R., Mandjes, M., van’t Noordende, G. and de Laat, C). It is therefore important to search for related variables in a data set. Examples of correlated variables are a baby's date of birth and date of discharge from hospital, date of death and date of autopsy, weight at birth and weight at discharge, and age and date of graduation.
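
A minimal sketch of why correlated variables must be protected together: if the date of birth is masked but the date of discharge is released, an intruder can narrow the date of birth to a handful of candidates. The function name and the assumed maximum hospital stay of three days are hypothetical choices for the example, not figures from the cited work.

```python
from datetime import date, timedelta


def birth_date_candidates(discharge, max_stay_days=3):
    """Dates of birth consistent with a known discharge date.

    Assumption: a newborn is discharged at most max_stay_days after birth.
    """
    return [discharge - timedelta(days=d) for d in range(max_stay_days + 1)]


# Hypothetical released value: the discharge date was left unprotected.
candidates = birth_date_candidates(date(2021, 5, 27))
for c in candidates:
    print(c.isoformat())
```

Under that assumption, the masked date of birth is reduced to four possible days, which is why both variables need the same treatment.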
K-Anonymity
k-anonymity is a privacy model that is applied to a data set to protect the privacy of data subjects in data sharing scenarios (LeFevre, K., DeWitt, D.J. and Ramakrishnan, R). It can provide privacy guarantees when it is used to anonymise data. Many privacy-preserving systems aim to provide k-anonymity for their data subjects. The basic idea is to use...
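
As a rough illustration, the property can be checked with a short Python sketch. It assumes the commonly stated definition of k-anonymity: every combination of quasi-identifier values that appears in the released data is shared by at least k records. The column names and sample rows are hypothetical, and this is a check only, not an anonymisation algorithm.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    return smallest_group(records, quasi_identifiers) >= k


# Hypothetical, already generalised data set.
released = [
    {"birth_year": "1985", "zip": "303**", "diagnosis": "flu"},
    {"birth_year": "1985", "zip": "303**", "diagnosis": "asthma"},
    {"birth_year": "1990", "zip": "606**", "diagnosis": "flu"},
    {"birth_year": "1990", "zip": "606**", "diagnosis": "diabetes"},
    {"birth_year": "1990", "zip": "606**", "diagnosis": "flu"},
]

QUASI_IDENTIFIERS = ["birth_year", "zip"]
print(smallest_group(released, QUASI_IDENTIFIERS))
print(is_k_anonymous(released, QUASI_IDENTIFIERS, 2))
print(is_k_anonymous(released, QUASI_IDENTIFIERS, 3))
```

In this sample the smallest group has two records, so the data is 2-anonymous but not 3-anonymous with respect to the chosen quasi-identifiers.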