The original paper is in English. Non-English content has been machine-translated and may contain typographical errors or mistranslations. ex. Some numerals are expressed as "XNUMX".
Software developers may write several similar source code fragments that contain the same mistake. To remove such faulty fragments, developers inspect code clones when they find a bug in their code. Although various code clone detection methods have been proposed to identify clones of code blocks or functions, these tools do not always fit the code inspection task, because a faulty code fragment may be much smaller than a code block, for example a single line of code. To let developers search for clones of such a small faulty fragment in a large-scale software product, we propose a method using the Lempel-Ziv Jaccard Distance, which is an approximation of the Normalized Compression Distance. We conducted an experiment using an existing research dataset and a user survey in a company. The results show that our method efficiently reports cloned faulty code fragments and that its performance is acceptable to software developers.
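To make the idea in the abstract concrete, the following is a minimal sketch of a sliding-window search ranked by a Lempel-Ziv Jaccard style distance. It is an illustration under stated assumptions, not the NCDSearch implementation: the abstract does not specify tokenization, window sizes, or thresholds, so the regex tokenizer, the fixed window of as many tokens as the query, the default threshold of 0.5, and the function names `lz_set`, `lzjd`, `tokenize`, and `search` are all hypothetical. The distance here is the plain Jaccard distance between Lempel-Ziv phrase sets, without any hashing or sampling step.

```python
import re
from typing import List, Set, Tuple


def lz_set(data: bytes) -> Set[bytes]:
    """Collect the distinct phrases found by a simple Lempel-Ziv style parse."""
    seen: Set[bytes] = set()
    start = 0
    for end in range(1, len(data) + 1):
        piece = data[start:end]
        if piece not in seen:
            # New phrase: record it and start the next phrase after it.
            seen.add(piece)
            start = end
    return seen


def jaccard_distance(a: Set[bytes], b: Set[bytes]) -> float:
    """Return 1 - |a & b| / |a | b| (0.0 when both sets are empty)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def lzjd(a: bytes, b: bytes) -> float:
    """Jaccard distance between the Lempel-Ziv phrase sets of two byte strings."""
    return jaccard_distance(lz_set(a), lz_set(b))


def tokenize(code: str) -> List[str]:
    """Hypothetical tokenizer: words/numbers, or single punctuation characters."""
    return re.findall(r"\w+|[^\w\s]", code)


def search(query: str, source: str, threshold: float = 0.5) -> List[Tuple[float, int, str]]:
    """Slide a query-sized token window over source and keep close windows.

    Returns (distance, token_index, window_text) tuples, closest first.
    """
    q_tokens = tokenize(query)
    s_tokens = tokenize(source)
    k = len(q_tokens)
    if k == 0:
        return []
    q_set = lz_set(" ".join(q_tokens).encode("utf-8"))
    hits = []
    for i in range(len(s_tokens) - k + 1):
        window = " ".join(s_tokens[i:i + k])
        d = jaccard_distance(q_set, lz_set(window.encode("utf-8")))
        if d <= threshold:
            hits.append((d, i, window))
    return sorted(hits)


if __name__ == "__main__":
    faulty = "if (p = NULL) return;"
    code = "x = 1; if (p = NULL) return; y = 2; if (q = NULL) return;"
    for distance, index, text in search(faulty, code):
        print(f"{distance:.2f} at token {index}: {text}")
```

Because the query can be as small as one line, the window is sized to the query rather than to a block or function, which is the situation the abstract says block-level clone detectors handle poorly. Recomputing the phrase set for every window is quadratic in the worst case; a practical tool would need something more efficient.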
Takashi ISHIO
Nara Institute of Science and Technology
Naoto MAEDA
NEC Corporation
Kensuke SHIBUYA
NEC Corporation
Kenho IWAMOTO
NEC Corporation
Katsuro INOUE
Osaka University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Takashi ISHIO, Naoto MAEDA, Kensuke SHIBUYA, Kenho IWAMOTO, Katsuro INOUE, "NCDSearch: Sliding Window-Based Code Clone Search Using Lempel-Ziv Jaccard Distance" in IEICE TRANSACTIONS on Information, vol. E105-D, no. 5, pp. 973-981, May 2022, doi: 10.1587/transinf.2021EDP7222.
Abstract: Software developers may write a number of similar source code fragments including the same mistake in software products. To remove such faulty code fragments, developers inspect code clones if they found a bug in their code. While various code clone detection methods have been proposed to identify clones of either code blocks or functions, those tools do not always fit the code inspection task because a faulty code fragment may be much smaller than code blocks, e.g. a single line of code. To enable developers to search code clones of such a small faulty code fragment in a large-scale software product, we propose a method using Lempel-Ziv Jaccard Distance, which is an approximation of Normalized Compression Distance. We conducted an experiment using an existing research dataset and a user survey in a company. The result shows our method efficiently reports cloned faulty code fragments and the performance is acceptable for software developers.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2021EDP7222/_p
@ARTICLE{e105-d_5_973,
author={Takashi ISHIO and Naoto MAEDA and Kensuke SHIBUYA and Kenho IWAMOTO and Katsuro INOUE},
journal={IEICE TRANSACTIONS on Information},
title={NCDSearch: Sliding Window-Based Code Clone Search Using Lempel-Ziv Jaccard Distance},
year={2022},
volume={E105-D},
number={5},
pages={973-981},
abstract={Software developers may write a number of similar source code fragments including the same mistake in software products. To remove such faulty code fragments, developers inspect code clones if they found a bug in their code. While various code clone detection methods have been proposed to identify clones of either code blocks or functions, those tools do not always fit the code inspection task because a faulty code fragment may be much smaller than code blocks, e.g. a single line of code. To enable developers to search code clones of such a small faulty code fragment in a large-scale software product, we propose a method using Lempel-Ziv Jaccard Distance, which is an approximation of Normalized Compression Distance. We conducted an experiment using an existing research dataset and a user survey in a company. The result shows our method efficiently reports cloned faulty code fragments and the performance is acceptable for software developers.},
keywords={},
doi={10.1587/transinf.2021EDP7222},
ISSN={1745-1361},
month={May},}
TY - JOUR
TI - NCDSearch: Sliding Window-Based Code Clone Search Using Lempel-Ziv Jaccard Distance
T2 - IEICE TRANSACTIONS on Information
SP - 973
EP - 981
AU - Takashi ISHIO
AU - Naoto MAEDA
AU - Kensuke SHIBUYA
AU - Kenho IWAMOTO
AU - Katsuro INOUE
PY - 2022
DO - 10.1587/transinf.2021EDP7222
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E105-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2022
AB - Software developers may write a number of similar source code fragments including the same mistake in software products. To remove such faulty code fragments, developers inspect code clones if they found a bug in their code. While various code clone detection methods have been proposed to identify clones of either code blocks or functions, those tools do not always fit the code inspection task because a faulty code fragment may be much smaller than code blocks, e.g. a single line of code. To enable developers to search code clones of such a small faulty code fragment in a large-scale software product, we propose a method using Lempel-Ziv Jaccard Distance, which is an approximation of Normalized Compression Distance. We conducted an experiment using an existing research dataset and a user survey in a company. The result shows our method efficiently reports cloned faulty code fragments and the performance is acceptable for software developers.
ER -