Cite this article: PANG Long, WANG Tian-tian, SU Xiao-hong, MA Pei-jun. Retrieval method of static code information for multi-language[J]. Journal of Harbin Institute of Technology, 2011, 43(3): 62. DOI:10.11918/j.issn.0367-6234.2011.03.013
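Abstract:
To meet the demand of code analysis for multi-language static information extraction, and to overcome the low reuse and complex construction of current single-language extractors, a unified extraction interface is built by directly modifying the GCC source code at specific parsing stages. Depending on the static information required and following GCC's internal mechanisms, an extraction method is proposed that combines hook points with the reuse of GCC's internal helper functions. It covers the collection of type and function declaration information, the traversal of program statements within function bodies, and the acquisition of a unified multi-language intermediate representation. Reusing GCC's high-quality internal code reduces the repeated cost of building static information extraction. Comparative experiments show that the method parses programming languages stably, robustly, and efficiently, and that it can directly extract static information from large open-source programs.
Keywords: static information; GCC compiler; program intermediate representation; static code analysis
DOI:10.11918/j.issn.0367-6234.2011.03.013
Classification number: TP311.11
Funding: National Natural Science Foundation of China (60673035); Specialized Research Fund for the Doctoral Program of Higher Education (20092302110040)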
Retrieval method of static code information for multi-language
PANG Long, WANG Tian-tian, SU Xiao-hong, MA Pei-jun
School of Computer Science and Technology, Harbin Institute of Technology, 150001 Harbin, China
Abstract:
Multi-language static information retrieval is widely needed, yet building a dedicated front end for each language is wasteful and complex. To meet this need and overcome that weakness, we present a method that modifies the GCC source code to provide a uniform retrieval interface. According to the type of static information required and GCC's internal mechanisms, the method combines specific hook points with GCC's internal functions to gather the needed information. The information collected includes type and function declarations, the traversal of statements, and a uniform multi-language intermediate representation. The reuse inherent in this method reduces the duplicated cost of building a front end for each language. Comparison experiments show that the method parses multiple languages efficiently and robustly and can be applied directly to large-scale open-source code to retrieve static information.
Key words: static code information; GCC compiler; program intermediate representation; static code analysis