INOCULATION IN ELEVATED EXTENT DATA ORGANIZATION

K. David Raju, G. Cherani Sarveswari, K. Likitha, B. Harika

Abstract


This paper proposes a new evaluation measure, the Q-statistic, for assessing the performance of a feature selection (FS) algorithm. The Q-statistic accounts for both the stability of the selected feature subset and the prediction accuracy. An FS algorithm that is judged only by prediction accuracy tends to be unstable under variations in the training set, particularly in high dimensional data. A significant intrinsic problem with forward selection, for example, is that a change in the choice of the initial feature can lead to a completely different feature subset, so the stability of the selected feature set can be very low even though the selection may yield high accuracy. The Q-statistic evaluates an FS algorithm together with a classifier: it is a hybrid measure of the prediction accuracy of the classifier and the stability of the selected features. The paper then proposes Booster, which is applied to the feature subset selection of a given FS algorithm and boosts the value of the Q-statistic of that algorithm. Estimating mutual information (MI) from data involves density estimation of high dimensional data. Although much research has been done on multivariate density estimation, high dimensional density estimation with a small sample size remains a formidable task.
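
The idea of a hybrid measure that rewards both accuracy and stability can be illustrated with a minimal sketch. The abstract does not give the definition of the Q-statistic, so the code below is not the authors' formula: it assumes, purely for illustration, that stability is the mean pairwise Jaccard similarity between the feature subsets selected on different resamples of the training set, and that the hybrid score is the mean accuracy multiplied by that stability. The names `jaccard`, `subset_stability`, and `hybrid_score`, and the feature labels, are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two feature subsets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def subset_stability(subsets):
    """Mean pairwise Jaccard similarity over all selected subsets.

    Assumption: this is one common stability proxy, not the paper's definition.
    """
    pairs = list(combinations(subsets, 2))
    if not pairs:
        return 1.0  # a single subset is trivially consistent with itself
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def hybrid_score(subsets, accuracies):
    """Illustrative hybrid: mean accuracy scaled by subset stability."""
    if not subsets or len(subsets) != len(accuracies):
        raise ValueError("need one accuracy per selected subset")
    mean_accuracy = sum(accuracies) / len(accuracies)
    return mean_accuracy * subset_stability(subsets)


# Same accuracy on every resample, but very different stability.
accs = [0.9, 0.9, 0.9]
stable = [{"g1", "g2", "g3"}, {"g1", "g2", "g3"}, {"g1", "g2", "g3"}]
unstable = [{"g1", "g2", "g3"}, {"g4", "g5", "g6"}, {"g7", "g8", "g9"}]

print("stable selection:  ", hybrid_score(stable, accs))
print("unstable selection:", hybrid_score(unstable, accs))
```

Under these assumptions, two selection procedures with identical accuracy receive very different scores when one of them picks disjoint feature subsets on each resample, which is the situation the abstract describes for forward selection.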


Keywords


Booster; Feature Selection; Q-Statistic; FS Algorithm; High Dimensional Data






Copyright © 2012 - 2020, All rights reserved.| ijitr.com

Creative Commons License
International Journal of Innovative Technology and Research is licensed under a Creative Commons Attribution 3.0 Unported License. Based on a work at IJITR. Permissions beyond the scope of this license may be available at http://creativecommons.org/licenses/by/3.0/deed.en_GB.