Compiling Standard ML for Efficient Execution on Modern Machines
Last modified: Tue Mar 14 19:35:36 2000 GMT.
Many language theoreticians have taken great efforts in designing higher-level programming languages that are more elegant and more expressive than conventional languages. However, few of these new languages have been implemented very efficiently. The result is that most software engineers still prefer to use conventional languages, even though the new higher-level languages offer a better and simpler programming model.
This dissertation concentrates on improving the performance of programs written in Standard ML (SML)---a statically typed functional language---on today's RISC machines. SML poses tough challenges to efficient implementations: very frequent function calls, polymorphic types, recursive data structures, higher-order functions, and first-class continuations. This dissertation presents the design and evaluation of several new compilation techniques that meet these challenges by taking advantage of some of the higher-level language features in SML.
Type-directed compilation exploits the use of compile-time type information to optimize data representations and function calling conventions. By inserting coercions at each type instantiation and abstraction site, data objects in SML can use the same unboxed representations as in C, even with the presence of polymorphic functions. Measurements show that a simple set of type-based optimizations improves the performance of the non-type-based compiler by about 19% on a DECstation 5000.
Space-efficient closure representations utilizes the compile-time control and data flow information to optimize closure representations. By extensive closure sharing and allocating as many closures in registers as possible, the new closure conversion algorithm achieves very good asymptotic space usage, and improves the performance of the old compiler by about 14% on a DECstation 5000, even without using a stack. Further empirical and analytic studies show that the execution cost of stack-allocated and heap-allocated activation records is similar, but heap allocation is simpler to implement and allows very efficient first-class continuations.
Unrolling lists takes advantage of the higher-level language abstraction in SML to support more efficient representations for lists. By representing each cons cell using multiple car fields and one cdr field, the unrolled list reduces the memory used for links and significantly shortens the length of control-dependence and data-dependence chains in operations on lists.
PublishedPh.D. Thesis. Dept. of Computer Science, Princeton University, November 1994.
TextGzipped Postscript [353k]
Copyright © 1996-2022 The FLINT Group
<flint at cs dot yale dot edu>
Yale University Department of Computer Science