Type inference for C : applications to the static analysis of incomplete programs.
No Thumbnail Available
Date
2020
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
Type inference is a feature that is common to a variety of programming languages. While, in the past, it has been prominently present in functional ones (e.g., ML and Haskell), today, many object-oriented/ multi-paradigm languages such as C# and C++ offer, to a certain extent, such a feature. Nevertheless, type inference still is an unexplored subject in the realm of C. In particular, it remains open whether it is possible to devise a technique that encompasses the idiosyncrasies of this language. The first difficulty encountered when tackling this problem is that parsing C requires, not only syntactic, but also semantic information. Yet, greater challenges emerge due to C’s intricate type system. In this work, we present a unification-based framework that lets us infer the missing struct, union, enum, and typedef declarations in a program. As an application of our technique, we investigate the reconstruction of partial programs. Incomplete source code naturally appears in software development: during design and while evolving, testing, and analyzing programs; therefore, understanding it is a valuable asset. With a reconstructed well-typed program, one can: (i) enable static analysis tools in scenarios where components are absent; (ii) improve precision of “zero setup” static analysis tools; (iii) apply stub generators, symbolic executors, and testing tools on code snippets; and (iv) provide engineers with an assortment of compilable benchmarks for performance and correctness validation. We evaluate our technique on code from a variety of C libraries, including GNU’s Coreutils and on snippets from popular projects such as CPython, FreeBSD, and Git.
Description
Keywords
Parsing, Constraints, Syntax, Partial programs
Citation
MELO, L. T. C. et al. Type inference for C: applications to the static analysis of incomplete programs. ACM Transactions on Programming Languages and Systems, v. 42, n. 3, nov. 2020. Disponível em: <https://dl.acm.org/doi/10.1145/3421472>. Acesso em: 25 ago. 2021.