A COMPREHENSIVE SURVEY ON PRIVACY PRESERVING DISTRIBUTED DATA MINING WITH EVOLUTIONARY COMPUTING

J. Aruna Santhi

Abstract


Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data in databases. The term “knowledge discovery” is more general than the term “data mining”: data mining is usually seen as one step in the overall knowledge discovery process, although the two terms are often treated as synonyms in the computing literature. Publishing data about individuals without revealing sensitive information about them is an important problem. In many applications, data mining has to be done in distributed data scenarios. Distributed data mining (DDM) techniques have become necessary for large, multi-scenario datasets that require heterogeneous and distributed resources. In such situations, data owners may be concerned about misuse of their data and may therefore not want it to be mined, especially when it contains sensitive information.

Distributed data mining techniques use sensitive data from distributed databases held by different parties. This conflicts directly with an individual’s need for, and right to, privacy. It is therefore crucial to develop adequate security techniques for protecting the privacy of the individual values used in data mining. In this paper, we consider a privacy-preserving naïve Bayes classifier for horizontally partitioned distributed data and propose the data mining privacy by decomposition (DMPD) method, which uses a genetic algorithm to search for an optimal partitioning of the feature set subject to classification accuracy and k-anonymity constraints. We study how privacy can be maintained in distributed data mining and how two (or more) parties can find frequent item sets in distributed databases without revealing each party’s portion of the data to the others. The paper also examines privacy-preserving distributed data mining (PPDDM) from a wider perspective and investigates various approaches that can help protect sensitive information in distributed databases (DDBs).
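As an illustration of the k-anonymity constraint mentioned in the abstract, the following is a minimal Python sketch. It is not the DMPD implementation: the function names, the toy records, and the choice to test each feature subset independently are assumptions made only for this example. It checks whether every combination of values over a feature subset is shared by at least k records, which is the kind of condition a search over feature partitions would have to satisfy.

```python
from collections import Counter

def is_k_anonymous(records, features, k):
    """Return True if every value combination over `features`
    appears in at least k records (hypothetical helper)."""
    counts = Counter(tuple(rec[f] for f in features) for rec in records)
    return all(count >= k for count in counts.values())

def partition_is_k_anonymous(records, partition, k):
    """Check k-anonymity separately for each feature subset
    in a candidate partition (hypothetical helper)."""
    return all(is_k_anonymous(records, subset, k) for subset in partition)

# Toy, already-generalized records; purely illustrative data.
records = [
    {"age": "30-39", "zip": "130**", "sex": "F"},
    {"age": "30-39", "zip": "130**", "sex": "M"},
    {"age": "30-39", "zip": "148**", "sex": "F"},
    {"age": "30-39", "zip": "148**", "sex": "M"},
]

# All three features together: each combination is unique.
full_ok = is_k_anonymous(records, ["age", "zip", "sex"], k=2)

# Splitting the features into two subsets: each projection repeats.
split_ok = partition_is_k_anonymous(records, [["age", "zip"], ["sex"]], k=2)
```

In this toy data the full feature set is not 2-anonymous, while the two-subset partition is. A genetic search of the kind the abstract describes could use such a check as a feasibility test for candidate partitions, alongside a separate measure of classification accuracy.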


Keywords


Data Mining; Publishing Data; Data Mining Privacy by Decomposition (DMPD)







Copyright © 2012 - 2018, All rights reserved.| ijitr.com

Creative Commons License
International Journal of Innovative Technology and Research is licensed under a Creative Commons Attribution 3.0 Unported License. Based on a work at IJITR. Permissions beyond the scope of this license may be available at http://creativecommons.org/licenses/by/3.0/deed.en_GB.