I have also prepared an invited paper and an invited talk for SAS 2001 on the analysis of multithreaded programs.
We have developed the following analyses and optimizations:
We used this algorithm to solve a range of analysis problems, including static race detection for parallel divide and conquer programs, automatic parallelization of sequential divide and conquer programs, static detection of array bounds violations, and elimination of array bounds checks. See our PLDI 2000 paper on this topic for more details.
Our JPDC 1998 paper presents an algorithm that performs both data and computation lock coarsening for object-based programs. Our POPL 1997 paper presents a set of general transformations that move lock constructs both within and between procedures to coarsen the lock granularity.
Our ICS 1999 paper presents experimental results from an implementation of this technique. The results show that, for our set of benchmark programs, adaptive replication can improve the overall performance by up to a factor of three over versions that use no replication, and can reduce the memory consumption by up to a factor of four over versions that fully replicate updated objects. Furthermore, adaptive replication never significantly harms the performance and never increases the memory usage without a corresponding increase in the performance.
We have recently developed an extended type system for Java; this system guarantees that any well-typed program is free of data races. It enables the programmer to specify the protection mechanism for each object as part of the object's type. The type can specify either the mutual exclusion lock that protects the object from unsynchronized concurrent accesses, or that threads can safely access the object without synchronization because either 1) the object is immutable, 2) the object is accessible to only the current thread and is not shared between threads, or 3) the current thread holds the unique reference to the object. The type checker uses the type declarations to verify that the program uses each object only in accordance with its declared protection mechanisms.
The result is that programmers can aggressively remove superfluous synchronization without introducing data races. See our OOPSLA 2001 paper on this topic for more details.
We have developed a new analysis approach, commutativity analysis , that exploits the structure of the object-oriented paradigm to view the computation as consisting of operations on objects. It then analyzes the computation at this granularity to determine if operations commute (i.e. generate the same result regardless of the order in which they execute). If all of the operations in a given computation commute, the compiler can automatically generate parallel code. We have used this technique to successfully parallelize a variety of complete application programs, with the automatically parallelized versions running between 14 and 19 times faster (on a 24 processor machine) than the original sequential versions. See our TOPLAS 1997 paper on this topic for more details.
In collaboration with Daniel Jackson, I am investigating ways to check that a program correctly implements high-level design properties expressed using object models. My contribution focuses on new, very precise, program analyses that leverage the design information to focus the analysis on properties of interest to the programmer. The National Science Foundation is currently supporting this research through an ITR grant.
Finally, I have a range of Java analysis and optimization projects underway. These include the exploration of techniques for implementing Real-Time Java, space optimizations for Java programs, techniques that leverage program analysis information to provide very fast atomic transactions, and a range of memory system optimizations such as region-based allocation, elimination of write barriers, and (leveraging program analysis information about object lifetimes) explicit alloc/free optimizations.