Author Name | Affiliation |
Fanlong Zhang | School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China |
Xiaohong Su | School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China |
Wen Zhao | School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China |
Tiantian Wang | School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China |
|
Abstract: |
Software contains many code clones, that is, code fragments that are similar to one another. Over the past decades, researchers have proposed several state-of-the-art methods to detect clones. Code clones have also been shown to be related to the evolution of software. To explore the relationships between clones and their evolution, we propose a framework that clusters clones with the Fuzzy C-means clustering method. First, we detect all clones using NiCad and build clone genealogies across multiple versions of the software. Second, we extract metrics that describe the clones and their evolution. Finally, we cluster the clone vectors, which are generated from different metrics for different purposes. Experimental results on six open source software packages reveal relationships among clone lifetime, the number of changes, clone patterns, and other evolution characteristics, which can help developers understand clones. |
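To make the final clustering step concrete, the following is a minimal sketch of standard Fuzzy C-means in pure Python. It is not the authors' implementation: the two-dimensional clone vectors (clone lifetime in versions, number of changes), the fuzzifier `m=2.0`, the cluster count, and the function name `fuzzy_c_means` are all assumptions chosen for illustration, since the abstract does not list the exact metrics or parameters.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    U = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        U.append([u / total for u in row])

    def compute_centers(U):
        # Each center is the mean of all points, weighted by membership^m.
        centers = []
        for j in range(c):
            weights = [U[i][j] ** m for i in range(len(points))]
            total = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])
        return centers

    for _ in range(max_iter):
        centers = compute_centers(U)
        new_U = []
        for p in points:
            # Membership u_ij is proportional to d_ij^(-2/(m-1)).
            inv = [
                max(math.dist(p, ctr), 1e-12) ** (-2.0 / (m - 1.0))
                for ctr in centers
            ]
            total = sum(inv)
            new_U.append([v / total for v in inv])
        # Largest change in any membership value since the last iteration.
        change = max(
            abs(a - b) for old, new in zip(U, new_U) for a, b in zip(old, new)
        )
        U = new_U
        if change < tol:
            break
    return compute_centers(U), U


# Hypothetical clone vectors: [lifetime in versions, number of changes].
clone_vectors = [
    [1.0, 0.0], [2.0, 1.0], [1.0, 1.0],    # short-lived, rarely changed
    [9.0, 6.0], [10.0, 7.0], [9.0, 7.0],   # long-lived, frequently changed
]
centers, memberships = fuzzy_c_means(clone_vectors, c=2)
# Hard label per clone: the cluster with the highest membership degree.
labels = [row.index(max(row)) for row in memberships]
print(labels)
```

Unlike hard clustering, each clone receives a degree of membership in every cluster, so a clone whose evolution sits between two groups is not forced into one of them. Which cluster index ends up describing which group depends on the random initialisation.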
Key words: code clones; clone clustering; clone analysis; clone evolution; empirical study |
DOI:10.11916/j.issn.1005-9113.15316 |
Clc Number:TP311.5 |
Fund: |
|
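Descriptions in Chinese (translated into English): |
Research on Evolution-Based Clustering Analysis of Code Clones
Fanlong Zhang, Xiaohong Su, Wen Zhao, Tiantian Wang (School of Computer Science and Technology, Harbin Institute of Technology)
Innovations: (1) A clustering analysis framework for code clones is proposed, which can explore the relationships between code clones and their evolution; (2) metrics are extracted to represent code clones and their evolution, and clustering vectors are generated from them for clone analysis; (3) an empirical study is conducted on six experimental systems, revealing the relationships between code clones and their evolution.
Purpose: Software contains a large number of code clones, and these clones evolve together with the software. Code clones and their evolution conceal relationships that can help developers understand and maintain clones. To explore these relationships, this paper proposes a code clone analysis method based on Fuzzy C-means clustering.
Methods: In the proposed method, NiCad is first used to detect code clones in multiple versions of a software system, and the corresponding clone genealogies are built to describe clone evolution. Next, metrics are extracted to represent the code clones and their evolution, and the metric values are used to generate clone clustering vectors. Finally, the Fuzzy C-means clustering method is used to analyze the clone clustering vectors, and the relationships between code clones and their evolution are explored on the basis of the clustering results.
Results: An empirical study was conducted on six open source systems. By analyzing clone evolution characteristics such as clone lifetime, the number of clone changes, and clone patterns, the study reveals the relationships between code clones and their evolution.
Conclusions: The relationships obtained between code clones and their evolution can help developers understand code clones and how they evolve, and can further guide developers in maintaining and managing code clones.
Keywords: code clones, clone clustering, clone analysis, clone evolution, empirical study |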