IC theme seminar:
Differentially Private Sequential Data Publication
20 February 2013 at 15:00
Speaker: Claude Castelluccia
Dr., INRIA
E-mail: claude.castelluccia@inria.fr
Website: http://planete.inrialpes.fr/~ccastel/
Venue: LAAS-CNRS, Salle de Conférences (conference room)
7 avenue du Colonel Roche
31077 TOULOUSE Cedex 4
Abstract: Sequential data is increasingly used in a variety of applications, and publishing it is vital to the advancement of these applications. However, as shown by the re-identification attacks on the AOL and Netflix datasets, releasing sequential data may pose considerable threats to individual privacy. Recent research has indicated that existing sanitization techniques fail to provide their claimed privacy guarantees. It is therefore urgent to respond to this failure by developing new schemes with provable privacy guarantees. Differential privacy is one of the few models that can provide such guarantees. Because sequential data is inherently sequential and high-dimensional, applying differential privacy to it is challenging. In this work, we address this challenge by employing a variable-length n-gram model, which extracts the essential information of a sequential database as a set of variable-length n-grams. Our approach makes use of a carefully designed exploration tree structure and a set of novel techniques based on the Markov assumption in order to lower the magnitude of the added noise. The published n-grams are useful for many purposes. Furthermore, we develop a solution for generating a synthetic database, which enables a wider spectrum of data analysis tasks. Extensive experiments on real-life datasets demonstrate that our approach substantially outperforms the state-of-the-art techniques.
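To make the setting of the abstract concrete, here is a minimal sketch of a naive baseline, not the variable-length n-gram method presented in the talk: it has no exploration tree and uses none of the Markov-based techniques. It assumes a small known alphabet, truncates each sequence to `l_max` items, counts all n-grams up to length `n_max`, and adds Laplace noise to the count of every possible n-gram, with the noise scale set from the L1 sensitivity under add/remove-one-sequence neighbouring databases. All function and parameter names are hypothetical.

```python
import itertools
import random
from collections import Counter


def ngram_counts(sequences, n_max, l_max):
    """Count contiguous n-grams of length 1..n_max after truncating each sequence to l_max items."""
    counts = Counter()
    for seq in sequences:
        seq = tuple(seq[:l_max])  # truncation bounds one sequence's contribution
        for n in range(1, n_max + 1):
            for i in range(len(seq) - n + 1):
                counts[seq[i:i + n]] += 1
    return counts


def l1_sensitivity(n_max, l_max):
    """Largest number of n-gram occurrences a single truncated sequence can contribute."""
    return sum(max(l_max - n + 1, 0) for n in range(1, n_max + 1))


def noisy_ngram_counts(sequences, alphabet, n_max, l_max, epsilon, rng=None):
    """Return a noisy count for EVERY possible n-gram over the alphabet (assumption: all items are in it)."""
    if rng is None:
        rng = random.Random()
    counts = ngram_counts(sequences, n_max, l_max)
    scale = l1_sensitivity(n_max, l_max) / epsilon  # Laplace scale = sensitivity / epsilon
    noisy = {}
    for n in range(1, n_max + 1):
        # Enumerating the whole domain, not just observed n-grams, keeps the set of released keys data-independent.
        for gram in itertools.product(alphabet, repeat=n):
            # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
            noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
            noisy[gram] = counts[gram] + noise
    return noisy


if __name__ == "__main__":
    data = [["a", "b", "a"], ["b", "b"]]
    release = noisy_ngram_counts(data, alphabet=["a", "b"], n_max=2, l_max=3,
                                 epsilon=1.0, rng=random.Random(0))
    for gram, value in sorted(release.items()):
        print(gram, round(value, 2))
```

The number of candidate n-grams grows exponentially with `n_max`, and the noise scale grows with both `n_max` and `l_max`; this is the high-dimensionality problem the abstract refers to. The sketch is illustrative only: a real release would need a vetted noise sampler rather than Python's `random` module.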
Biography: Claude Castelluccia is a Directeur de Recherche at Inria Grenoble, where he leads the PRIVATICS research group (http://planete.inrialpes.fr/~ccastel/).