Semantic code browsing*

  • ISABEL GARCÍA-CONTRERAS (a1), JOSÉ F. MORALES (a1) and MANUEL V. HERMENEGILDO (a1) (a2)

Abstract

Programmers currently enjoy access to a very large number of code repositories and libraries of ever-increasing size. The ensuing potential for reuse is, however, hampered by the fact that searching within all this code becomes an increasingly difficult task. Most code search engines are based on syntactic techniques such as signature matching or keyword extraction. However, these techniques are inaccurate (because they rely largely on documentation) and at the same time do not offer very expressive code query languages. We propose a novel approach that focuses on querying for semantic characteristics of code obtained automatically from the code itself. Program units are pre-processed using static analysis techniques, based on abstract interpretation, obtaining safe semantic approximations. A novel, assertion-based code query language is used to express desired semantic characteristics of the code as partial specifications. Relevant code is found by comparing such partial specifications with the inferred semantics for program elements. Our approach is fully automatic and does not rely on user annotations or documentation. It is more powerful and flexible than signature matching because it is parametric on the abstract domain and properties, and does not require type definitions. Also, it reasons with relations between properties, such as implication and abstraction, rather than just equality. It is also more resilient to syntactic code differences. We describe the approach and report on a prototype implementation within the Ciao system.
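The matching idea described in the abstract can be illustrated with a small sketch. This is a toy in Python, not the Ciao prototype or its query language: the property names, the `IMPLIES` table, the `INFERRED` summaries and the functions `entails`, `matches` and `search` are all hypothetical and chosen only for illustration. It assumes that an analyzer has already inferred one abstract property per argument of each predicate, and that a query is a partial specification which a predicate satisfies when each inferred property implies the requested one.

```python
# Toy illustration only: every name below is hypothetical, not part of Ciao.

# A tiny abstract domain of property names, ordered by implication.
# IMPLIES[p] lists the properties that p directly entails.
IMPLIES = {
    "int": {"num"},
    "num": {"ground"},
    "list_of_num": {"list"},
    "list": set(),
    "ground": set(),
}


def entails(p, q):
    """Return True if property p implies property q (reflexive, transitive)."""
    if p == q:
        return True
    return any(entails(r, q) for r in IMPLIES.get(p, ()))


# Summaries as a static analyzer might produce them (assumed, hand-written):
# predicate name -> one abstract property per argument on success.
INFERRED = {
    "length/2": ("list", "int"),
    "sum_list/2": ("list_of_num", "num"),
    "reverse/2": ("list", "list"),
}


def matches(inferred, query):
    """Check a partial specification; None means 'no requirement'."""
    return len(inferred) == len(query) and all(
        q is None or entails(p, q) for p, q in zip(inferred, query)
    )


def search(query):
    """Return the names of all predicates whose summary satisfies the query."""
    return sorted(name for name, props in INFERRED.items() if matches(props, query))
```

In this sketch, `search(("list", "num"))` selects both `length/2` and `sum_list/2`, because `int` implies `num` and `list_of_num` implies `list`; an equality-based signature match would find neither. A query such as `search((None, "list"))` leaves the first argument unconstrained, which is what makes the specification partial.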

Footnotes

* This research has received funding from the EU FP7 agreement no. 318337, ENTRA, Spanish MINECO TIN2012-39391 StrongSoft and TIN2015-67522-C3-1-R TRACES projects, and the Madrid M141047003 N-GREENS program.


Supplementary materials

García-Contreras supplementary material: Online Appendix (PDF, 421 KB)
