IJWMT Vol. 2, No. 3, 15 Jun. 2012
Full Text (PDF, 107KB), PP.28-33
Privacy-preserving, Data mining, Data Perturbation, Additive Noise
In the last decade, a growing body of research has focused on privacy-preserving data mining (PPDM). Previous work can be divided into two categories: data modification and data encryption. Data encryption is not used as widely as data modification because of its high computation and communication costs. Data perturbation, which includes additive noise, multiplicative noise, matrix multiplication, data swapping, data shuffling, k-anonymization, and blocking, is an important family of data modification techniques. PPDM has two targets, privacy and accuracy, and they are often at odds with each other. This paper first proposes two new noise addition methods for perturbing the original data, then discusses how they meet the two targets. Experiments show that, at the same privacy strength, the proposed methods achieve higher accuracy than existing ones.
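For readers unfamiliar with additive noise, the following is a minimal sketch of the generic baseline technique only, not of the two methods proposed in the paper, which the abstract does not specify. It assumes zero-mean Gaussian noise; the function name `add_gaussian_noise`, the sample values, and the choice of `sigma` are hypothetical and chosen purely for illustration.

```python
import random
import statistics


def add_gaussian_noise(values, sigma, seed=None):
    """Return a perturbed copy of `values`: each entry plus N(0, sigma^2) noise.

    Generic additive-noise perturbation (an illustrative assumption),
    not the specific methods proposed in the paper.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    return [x + rng.gauss(0.0, sigma) for x in values]


# Hypothetical sensitive attribute (for example, ages 20..69).
original = [float(v) for v in range(20, 70)]

# A larger sigma hides individual values better (privacy) but distorts
# the data more (accuracy), which is the trade-off the abstract describes.
perturbed = add_gaussian_noise(original, sigma=5.0, seed=42)

# Because the noise has zero mean, aggregate statistics such as the mean
# are approximately preserved even though individual values change.
mean_shift = abs(statistics.mean(perturbed) - statistics.mean(original))
```

In this sketch, `sigma` is the single knob that trades privacy against accuracy: individual records are masked, while `mean_shift` stays small relative to the spread of the data.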
Likun Liu, Liang Hu, Di Wang, Yanmei Huo, Lei Yang, Kexin Yang, "Two Noise Addition Methods For Privacy-Preserving Data Mining", IJWMT, vol.2, no.3, pp.28-33, 2012. DOI: 10.5815/ijwmt.2012.03.05