Differentially Private Publication For Correlated Non-Numerical Data
Abstract
Differential privacy has made significant progress in protecting numerical data. Non-numerical data (e.g., entity objects) are also widely used in intelligent processing tasks, yet they may reveal more of a user's private information than numerical data do. Recently, researchers have attempted to address this challenge with the exponential mechanism of differential privacy. Nonetheless, the exponential mechanism has a drawback when protecting correlated data: it cannot achieve the expected degree of privacy. To remedy this issue, this paper proposes an effective release mechanism for correlated non-numerical data. We define the notion of Correlation-Indistinguishability and design a correlated exponential mechanism that realizes it in practice. Inspired by the concept of indistinguishability, Correlation-Indistinguishability guarantees that, from an adversary's point of view, the correlation of the output distribution is the same as that of the original data. In addition, rather than using independent exponential variables, we pass two white Gaussian samples through a designed filter to realize the definition of Correlation-Indistinguishability. Experimental evaluation demonstrates that our mechanism outperforms current schemes in terms of security and utility for frequent item mining.
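For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of the standard (independent) exponential mechanism, which selects a candidate with probability proportional to exp(ε · score / (2Δ)). It is not the correlated mechanism proposed in this paper. The function names, the item counts, and the choice of ε = 0.5 with sensitivity Δ = 1 are illustrative assumptions, not values from the paper.

```python
import math
import random


def exponential_mechanism_probs(scores, epsilon, sensitivity=1.0):
    """Return selection probabilities proportional to
    exp(epsilon * score / (2 * sensitivity)) for each candidate."""
    # Subtracting the maximum score avoids overflow and leaves the ratios unchanged.
    top = max(scores.values())
    weights = {
        item: math.exp(epsilon * (score - top) / (2.0 * sensitivity))
        for item, score in scores.items()
    }
    total = sum(weights.values())
    return {item: w / total for item, w in weights.items()}


def exponential_mechanism(scores, epsilon, sensitivity=1.0, rng=random):
    """Draw one candidate according to the probabilities above."""
    probs = exponential_mechanism_probs(scores, epsilon, sensitivity)
    items = list(probs)
    return rng.choices(items, weights=[probs[i] for i in items], k=1)[0]


# Hypothetical item counts used as the score function (assumed sensitivity 1).
counts = {"bread": 40, "milk": 35, "tea": 5}
probs = exponential_mechanism_probs(counts, epsilon=0.5)
released = exponential_mechanism(counts, epsilon=0.5)
```

Each draw here is made independently of any other record, which is the property the abstract identifies as problematic when the underlying data are correlated.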