"Automatic Control Understanding for Natural Programs"

{\AI 91-161, Artificial Intelligence Laboratory, Dept. \ of Computer Sciences, The University of Texas at Austin\/}, (Ph.D. \ thesis)(1991)

Program understanding involves recognizing abstract concepts like "read-process loop" in existing programs. Programmers spend much of their time understanding programs, so studying and automating the process has many benefits.

Programming plans are units of programming knowledge connecting abstract concepts and their implementations. Existing research assumes that plan instances can be recognized to recover the programmer's abstract concepts and intentions, but this approach has not been confirmed empirically.

We present a practical method for bottom-up control concept recognition in large, unstructured imperative programs. Control concepts are abstract notions about interactions between control flow, data flow, and computation, such as "do loop", "read-process loop" and "bounded linear search". They are recognized by comparing an abstract program representation against a library of standard implementation plans. The program representation is a hierarchical control flow/data flow graph decomposed into a tree of sub-models using propers( single entry.exit control flow sub-graphs). Plans are represented by similar graphs with added qualifications. recognition is based on simple matching between sub-models and plans. the method was implemented in the UNPROG program understander and tested with Cobol and Lisp source programs.

This method is robust, efficient and scalable. the program representation can be formed for all language constructs which permit static determination of control flow; and focus recognition on criterial program features. The number of sub-models and comparisons increases linearly with program size.

UNPROG has been applied to automatic Cobol restructuring. Knowledge associated with plans and concepts permits more specific and insightful transformation, code generation, and documentation than is possible with syntactic methods. Control understanding can similarly raise the level of other reverse engineering and reenginering tools for applications like analysis, documentation, and translation.

We also showed how our method and UNPROG can be used for empirical study of programs at the conceptual level. Results can be used to improve recognizer performance, acquire plans, catalog natural plans and concepts, test the hypothesis that programs are planful, and characterize program populations.


To order: Send $20.00 to

          AI Lab Publications
          Dept. of Computer Sciences
          The University of Texas at Austin
          78712-1188
          (512) 471-7316

Web Master
Original: 1-Dec-1994
Update: 2-Dec-1994