A Debugger for Standard ML

1995 ◽  
Vol 5 (2) ◽  
pp. 155-200 ◽  
Author(s):  
Andrew Tolmach ◽  
Andrew W. Appel

AbstractWe have built a portable, instrumentation-based, replay debugger for the Standard ML of New Jersey compiler. Traditional ‘source-level’ debuggers for compiled languages actually operate at machine level, which makes them complex, difficult to port, and intolerant of compiler optimization. For secure languages like ML, however, debugging support can be provided without reference to the underlying machine, by adding instrumentation to program source code before compilation. Because instrumented code is (almost) ordinary source, it can be processed by the ordinary compiler. Our debugger is thus independent from the underlying hardware and runtime system, and from the optimization strategies used by the compiler. The debugger also provides reverse execution, both as a user feature and an internal mechanism. Reverse execution is implemented using a checkpoint and replay system; checkpoints are represented primarily by first-class continuations.

2010 ◽  
Vol 20 (5-6) ◽  
pp. 417-461 ◽  
Author(s):  
DANIEL SPOONHOWER ◽  
GUY E. BLELLOCH ◽  
ROBERT HARPER ◽  
PHILLIP B. GIBBONS

AbstractWe present a semantic space profiler for parallel functional programs. Building on previous work in sequential profiling, our tools help programmers to relate runtime resource use back to program source code. Unlike many profiling tools, our profiler is based on a cost semantics. This provides a means to reason about performance without requiring a detailed understanding of the compiler or runtime system. It also provides a specification for language implementers. This is critical in that it enables us to separate cleanly the performance of the application from that of the language implementation. Some aspects of the implementation can have significant effects on performance. Our cost semantics enables programmers to understand the impact of different scheduling policies while hiding many of the details of their implementations. We show applications where the choice of scheduling policy has asymptotic effects on space use. We explain these use patterns through a demonstration of our tools. We also validate our methodology by observing similar performance in our implementation of a parallel extension of Standard ML.


1994 ◽  
Author(s):  
Robert Harper ◽  
Frank Pfenning ◽  
Peter Lee ◽  
Eugene Rollins
Keyword(s):  

1996 ◽  
Vol 6 (1) ◽  
pp. 111-141 ◽  
Author(s):  
John Greiner

AbstractThe weak polymorphic type system of Standard ML of New Jersey (SML/NJ) (MacQueen, 1992) has only been presented as part of the implementation of the SML/NJ compiler, not as a formal type system. As a result, it is not well understood. And while numerous versions of the implementation have been shown unsound, the concept has not been proved sound or unsound. We present an explanation of weak polymorphism and show that a formalization of this is sound. We also relate this to the SML/NJ implementation of weak polymorphism through a series of type systems that incorporate elements of the SML/NJ type inference algorithm.


1998 ◽  
Vol 8 (6) ◽  
pp. 621-625 ◽  
Author(s):  
OLIVIER DANVY

A string-formatting function such as printf in C seemingly requires dependent types, because its control string determines the rest of its arguments. Examples:formula hereWe show how changing the representation of the control string makes it possible to program printf in ML (which does not allow dependent types). The result is well typed and perceptibly more efficient than the corresponding library functions in Standard ML of New Jersey and in Caml.


2014 ◽  
Vol 24 (6) ◽  
pp. 613-674 ◽  
Author(s):  
K. C. SIVARAMAKRISHNAN ◽  
LUKASZ ZIAREK ◽  
SURESH JAGANNATHAN

AbstractMultiMLton is an extension of the MLton compiler and runtime system that targets scalable, multicore architectures. It provides specific support for ACML, a derivative of Concurrent ML that allows for the construction of composable asynchronous events. To effectively manage asynchrony, we require the runtime to efficiently handle potentially large numbers of lightweight, short-lived threads, many of which are created specifically to deal with the implicit concurrency introduced by asynchronous events. Scalability demands also dictate that the runtime minimize global coordination. MultiMLton therefore implements a split-heap memory manager that allows mutators and collectors running on different cores to operate mostly independently. More significantly, MultiMLton exploits the premise that there is a surfeit of available concurrency in ACML programs to realize a new collector design that completely eliminates the need for read barriers, a source of significant overhead in other managed runtimes. These two symbiotic features - a thread design specifically tailored to support asynchronous communication, and a memory manager that exploits lightweight concurrency to greatly reduce barrier overheads - are MultiMLton's key novelties. In this article, we describe the rationale, design, and implementation of these features, and provide experimental results over a range of parallel benchmarks and different multicore architectures including an 864 core Azul Vega 3, and a 48 core non-coherent Intel SCC (Single-Cloud Computer), that justify our design decisions.


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Jing Ge Feng ◽  
Ye Ping He ◽  
Qiu Ming Tao

Automatic vectorization is an important technique for compilers to improve the parallelism of programs. With the widespread usage of SIMD (Single Instruction Multiple Data) extensions in modern processors, automatic vectorization has become a hot topic in the research of compiler techniques. Accurately evaluating the effectiveness of automatic vectorization in typical compilers is quite valuable for compiler optimization and design. This paper evaluates the effectiveness of automatic vectorization, analyzes the limitation of automatic vectorization and the main causes, and improves the automatic vectorization technology. This paper firstly classifies the programs by two main factors: program characteristics and transformation methods. Then, it evaluates the effectiveness of automatic vectorization in three well-known compilers (GCC, LLVM, and ICC, including their multiple versions in recent 5 years) through TSVC (Test Suite for Vectorizing Compilers) benchmark. Furthermore, this paper analyzes the limitation of automatic vectorization based on source code analysis, and introduces the differences between academic research and engineering practice in automatic vectorization and the main causes, Finally, it gives some suggestions as to how to improve automatic vectorization capability.


Sign in / Sign up

Export Citation Format

Share Document