A MANUSCRIPT COMBINATION PLAN FOR CONTROLLING DATA REDUCTION ISSUES

Talari Raja Sekhar, Aparna Allada

Abstract


Mining software repositories is an interdisciplinary domain that applies data mining techniques to software engineering problems; here, the problem is automating the bug triage process. To avoid the high cost of manual bug triage, we propose an automatic bug triage approach that applies text classification techniques to predict developers for bug reports. In this approach, a bug report is mapped to a document and the related developer is mapped to the label of that document. Bug triage is thereby converted into a text classification problem and solved automatically with mature text classification techniques. Data reduction for bug triage aims to build a small-scale, high-quality set of bug data by removing bug reports and words that are redundant or non-informative. In our work, we combine existing techniques for instance selection and feature selection to simultaneously reduce the bug dimension and the word dimension. Our work provides an approach to leveraging data processing techniques to form reduced, high-quality bug data in software development and maintenance. To determine the order in which to apply instance selection and feature selection, we extract attributes from historical bug data sets and build a predictive model for a new bug data set. Our data reduction can effectively reduce the data scale and improve the accuracy of bug triage.
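The combination of feature selection (word dimension) and instance selection (bug dimension) described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy reports, the function names, the word-scoring rule, and the duplicate-removal rule are simple stand-ins, not the specific selection algorithms or the predictive model used in the paper.

```python
from collections import Counter

# Hypothetical toy data: (words of a bug report, developer who fixed it).
REPORTS = [
    (["crash", "on", "startup", "ui"], "alice"),
    (["ui", "button", "crash"], "alice"),
    (["crash", "ui", "on", "startup"], "alice"),   # same words as the first report
    (["database", "timeout", "on", "query"], "bob"),
    (["query", "database", "slow"], "bob"),
    (["the", "on"], "bob"),                         # carries little information
]


def select_features(reports, k):
    """Feature selection stand-in: keep the k words that are most
    concentrated in one developer's reports, drop the rest."""
    doc_freq = Counter()   # word -> number of reports containing it
    by_dev = Counter()     # (word, developer) -> number of reports
    for words, dev in reports:
        for w in set(words):
            doc_freq[w] += 1
            by_dev[(w, dev)] += 1
    best = Counter()       # word -> count for its dominant developer
    for (w, _dev), n in by_dev.items():
        best[w] = max(best[w], n)
    # Score: reports from the dominant developer minus all other reports.
    score = {w: 2 * best[w] - doc_freq[w] for w in doc_freq}
    ranked = sorted(score, key=lambda w: (-score[w], w))  # ties: alphabetical
    kept = set(ranked[:k])
    return [([w for w in words if w in kept], dev) for words, dev in reports]


def select_instances(reports):
    """Instance selection stand-in: drop empty reports and reports whose
    word set and developer duplicate an earlier report."""
    seen = set()
    kept = []
    for words, dev in reports:
        key = (frozenset(words), dev)
        if not words or key in seen:
            continue
        seen.add(key)
        kept.append((words, dev))
    return kept


def reduce_bug_data(reports, k, order="FS->IS"):
    """Apply both reductions in the requested order."""
    if order == "FS->IS":
        return select_instances(select_features(reports, k))
    if order == "IS->FS":
        return select_features(select_instances(reports), k)
    raise ValueError("order must be 'FS->IS' or 'IS->FS'")


def vocabulary(reports):
    """Set of distinct words remaining in the data set."""
    return {w for words, _dev in reports for w in words}


if __name__ == "__main__":
    for order in ("FS->IS", "IS->FS"):
        reduced = reduce_bug_data(REPORTS, k=5, order=order)
        print(order, len(reduced), sorted(vocabulary(reduced)))
```

On this toy data the two orders keep different numbers of reports, which is why the order of application is treated as something to be decided per data set rather than fixed in advance.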

Keywords


Data Reduction; Bug Triage; Bug Data; Data Processing; Software Development; Developers; Redundant; Feature Selection; Instance Selection; Word Dimension






Copyright © 2012 - 2020, All rights reserved. | ijitr.com

Creative Commons License
International Journal of Innovative Technology and Research is licensed under a Creative Commons Attribution 3.0 Unported License. Based on a work at IJITR. Permissions beyond the scope of this license may be available at http://creativecommons.org/licenses/by/3.0/deed.en_GB.