Privacy Preserving Data Mining using Random Decision Tree

Abstract
Data mining that balances data privacy with data utility has emerged as a way to manage distributed data efficiently. In this paper, we address this development in privacy-preserving data mining using an enhanced approach based on the Random Decision Tree (RDT). RDT provides higher efficiency and data privacy than existing privacy-preserving data mining techniques, which are too slow and impractical for very large-scale analytics in the era of big data. RDT can be used for multiple data mining tasks such as classification, regression, ranking, and multi-label classification. Privacy-preserving RDT uses both randomization and cryptographic techniques, which provides data privacy for several decision-tree-based learning tasks.
Keywords: Privacy Preserving, Data Mining, Random Decision Tree.
I. Introduction
Data mining is a fast-growing field in distributed environments. It is the process of discovering interesting patterns and knowledge from large data sets, and is also called the KDD process, i.e., Knowledge Discovery from Data. Privacy-preserving data mining permits data analysis while preserving data privacy. Preserving data privacy means preventing personal, secret, or private information from being unnecessarily distributed, becoming publicly known, or being misused by a person or an adversary. In privacy-preserving data mining, interesting and useful information is shared while the privacy of the underlying data is preserved. There are two stages in privacy-preserving data mining: data collection and data publishing. In data collection, the data holder stores data gathered from data owners. In data publishing, the data holder releases data to a data recipient, who mines the published, secured data.
Cryptographic techniques are often too slow to be practical, and they become computationally expensive as the size of the data set and the communication between the participating parties increase [1]. Cryptographic techniques alone therefore do not scale to big data. In this paper, we use privacy-preserving RDT, which applies privacy-preserving data mining to the Random Decision Tree developed by Fan et al. [3]. Privacy-preserving RDT combines randomization and cryptographic techniques. This solution provides an order of magnitude improvement in efficiency over existing solutions while providing more data privacy and data utility, which makes it an effective approach to privacy-preserving data mining for the big data challenge.
RDT has a useful structural property: only specific nodes (the leaves) in the classification tree need to be encrypted or decrypted. In addition, secure token passing prevents an adversary from using counting techniques to decipher instance classifications, because the branch structure of the tree is hidden from all parties. RDT generates trees that are random in structure, which gives us an end result similar to perturbation without the associated pitfalls. A random structure also provides security against an adversary leveraging a priori knowledge to discover the entire classification model or individual instances.
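The structural property described above can be illustrated with a minimal, non-private sketch of a random decision tree ensemble. It assumes binary features and plain (unencrypted) class counts, and it contains no cryptography or token passing. All names (build_structure, find_leaf, train, predict_proba, leaves) are hypothetical and are not taken from [1] or [3]. The point it tries to show is that the tree structure is chosen without looking at any record, so the only data-derived values are the leaf counts, which is where encryption would apply in the privacy-preserving setting.

```python
import random
from collections import Counter

def build_structure(features, depth, rng):
    """Build a random tree over binary features without reading any data."""
    if depth == 0 or not features:
        # Leaf: the only node type that will ever hold data-derived values.
        return {"counts": Counter()}
    f = rng.choice(features)
    rest = [g for g in features if g != f]  # a binary feature is tested once per path
    return {"feature": f,
            "children": [build_structure(rest, depth - 1, rng) for _ in (0, 1)]}

def find_leaf(node, x):
    """Follow the record x (a tuple of 0/1 values) down to its leaf."""
    while "feature" in node:
        node = node["children"][x[node["feature"]]]
    return node

def leaves(node):
    """Yield every leaf of a tree."""
    if "feature" not in node:
        yield node
    else:
        for child in node["children"]:
            yield from leaves(child)

def train(trees, records):
    """Training only updates class counts at the leaves."""
    for x, label in records:
        for tree in trees:
            find_leaf(tree, x)["counts"][label] += 1

def predict_proba(trees, x, classes):
    """Average the per-tree class distributions, skipping empty leaves."""
    totals = {c: 0.0 for c in classes}
    used = 0
    for tree in trees:
        counts = find_leaf(tree, x)["counts"]
        n = sum(counts.values())
        if n == 0:
            continue
        used += 1
        for c in classes:
            totals[c] += counts[c] / n
    if used == 0:
        # No tree has seen a similar record: fall back to a uniform guess.
        return {c: 1.0 / len(classes) for c in classes}
    return {c: totals[c] / used for c in classes}

# Toy data (an assumption for illustration): the label follows feature 0.
rng = random.Random(0)
trees = [build_structure([0, 1, 2], depth=2, rng=rng) for _ in range(10)]
empty_before = all(not leaf["counts"] for t in trees for leaf in leaves(t))
records = [((0, 0, 1), "no"), ((1, 0, 1), "yes"),
           ((1, 1, 0), "yes"), ((0, 1, 0), "no")]
train(trees, records)
probs = predict_proba(trees, (1, 0, 0), ["yes", "no"])
print(probs)
```

Because the branch structure is fixed before any record is seen, each party could in principle build or hold structure without revealing data; only the leaf counts would need protection. How those counts are encrypted and combined is left to the protocol in [1] and is not reproduced here.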

References:

  1. J. Vaidya, B. Shafiq, W. Fan, D. Mehmood, and D. Lorenzi, “A Random Decision Tree Framework for Privacy-Preserving Data Mining,” IEEE Trans. Dependable and Secure Computing, vol. 11, no. 5, Sept./Oct. 2014.
  2. J. Vaidya, C. Clifton, and M. Zhu, Privacy-Preserving Data Mining, ser. Advances in Information Security, first ed., vol. 19, Springer-Verlag, 2005.
  3. W. Fan, H. Wang, P.S. Yu, and S. Ma, “Is Random Model Better? On Its Accuracy and Efficiency,” Proc. Third IEEE Int’l Conf. Data Mining (ICDM ’03), pp. 51-58, 2003.
  4. W. Fan, J. McCloskey, and P. S. Yu, “A General Framework for Accurate and Fast Regression by Data Summarization in Random Decision Trees,” Proc. 12th ACM SIGKDD Int’l Conf. Knowledge Discovery and Data Mining (KDD ’06), pp. 136-146, 2006.
  5. X. Zhang, Q. Yuan, S. Zhao, W. Fan, W. Zheng, and Z. Wang, “Multi-Label Classification without the Multi-Label Cost,” Proc. SIAM Int’l Conf. Data Mining (SDM ’10), pp. 778-789, 2010.
  6. A. Dhurandhar and A. Dobra, “Probabilistic Characterization of Random Decision Trees,” J. Machine Learning Research, vol. 9, pp. 2321-2348, 2008.
  7. G. Jagannathan, K. Pillaipakkamnatt, and R.N. Wright, “A Practical Differentially Private Random Decision Tree Classifier,” Proc. IEEE Int’l Conf. Data Mining Workshops (ICDMW ’09), pp. 114-121, 2009.
  8. J. Vaidya, C. Clifton, M. Kantarcioglu, and A.S. Patterson, “Privacy-Preserving Decision Trees over Vertically Partitioned Data,” ACM Trans. Knowledge Discovery from Data, vol. 2, no. 3, pp. 1-27, 2008.
  9. O. Goldreich, “General Cryptographic Protocols,” The Foundations of Cryptography, vol. 2, pp. 599-764, Cambridge Univ. Press, 2004.