Standard ML of New Jersey

This holiday weekend, when I wasn’t busy catching up on reading for my classes (how can I already be behind after only one week?) or finding more bugs in Rhythmbox’s Bonobo interface, I was giving myself a crash-course in Standard ML of New Jersey.

Standard ML of New Jersey (or SML/NJ) is an implementation of Standard ML, the language the projects in CS 502 (grad-level Compilers) will be written in, so I thought it made sense to, you know, learn the language before I need to start working on those projects. Knowing it will also help quite a bit later on in CS 456 (Programming Languages) when we switch over from C to it.

Anyway. SML/NJ is a nifty little (mostly) functional programming language. Even though I haven’t worked with functional languages before, I didn’t have much trouble picking up the basics. I managed to get through pretty much all of Elements of ML Programming by Jeffrey Ullman this weekend (though I merely flipped through the last section, on the specifics of ML syntax), which served as a pretty good introduction to the language.

Functional programming is fundamentally different from programming in imperative languages (like C) and object-oriented languages (like C++ and Java mostly are). Briefly, in imperative languages, you do things by executing statements for their side effects, such as changing values in memory. The object-oriented languages I’ve used work pretty much the same way, except they bundle the data and the code that changes it into objects.

Functional languages, on the other hand, get things done by evaluating expressions. Instead of looping and changing values in memory, you call functions recursively and create new values. Functions are all over the place and are more flexible than in imperative languages; functions can easily be passed to or returned from other functions, thus allowing you to construct new functions out of old ones at run time.
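To make that a bit more concrete, here is a minimal sketch of the style. The names (sumTo, twice, addTwo, addTwoThenDouble) are ones I made up for illustration; only the `o` composition operator comes from the language itself.

```sml
(* Add up 1..n by recursion instead of a loop with a mutable counter. *)
fun sumTo n = if n = 0 then 0 else n + sumTo (n - 1)

(* A function that takes a function and returns a new function. *)
fun twice f = fn x => f (f x)

(* Build new functions out of old ones at run time. *)
val addTwo = twice (fn n => n + 1)

(* "o" is function composition: (f o g) x = f (g x). *)
val addTwoThenDouble = (fn n => n * 2) o addTwo

val ten = sumTo 4                  (* 4 + 3 + 2 + 1 + 0 = 10 *)
val twelve = addTwoThenDouble 4    (* (4 + 2) * 2 = 12 *)
```

Nothing here is ever overwritten; each call just produces a new value from its arguments.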

One of the features of SML/NJ that I really like is its pattern matching. Its case expressions are sort of like switch statements in C, but instead of looking only for particular values, you can look for structural patterns and decompose data into individual components easily. Even better, function definitions use pattern matching too: a function can be written as a series of clauses, each one matching a different shape of argument. Besides letting you write recursive functions in a really natural way, it pretty much eliminates the need for the visitor pattern you might use in C++ or Java to emulate polymorphic binding on function arguments. If our CS 352 (undergrad-level Compilers) project had used SML/NJ instead of Java, we could’ve eliminated entire classes and a bunch of indirection using this feature alone.
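A small sketch of what I mean, using nothing but lists and tuples. The function names (describe, len, sumPairs) are hypothetical, just for illustration.

```sml
(* A case expression that matches on the shape of a list,
   not on particular values. *)
fun describe xs =
  case xs of
      [] => "empty"
    | [_] => "one element"
    | _ :: _ :: _ => "two or more elements"

(* A function written as clauses: one pattern per shape of argument. *)
fun len [] = 0
  | len (_ :: rest) = 1 + len rest

(* Patterns nest, so a list of pairs comes apart in a single step. *)
fun sumPairs [] = 0
  | sumPairs ((a, b) :: rest) = a + b + sumPairs rest

val d = describe [42]                  (* "one element" *)
val n = len ["a", "b", "c"]            (* 3 *)
val s = sumPairs [(1, 2), (3, 4)]      (* 1 + 2 + 3 + 4 = 10 *)
```

The compiler also checks that the patterns cover every case, so a forgotten clause shows up as a warning instead of a surprise at run time.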

The type system SML/NJ uses is also pretty neat. For one thing, most of the time the compiler can figure out what data type everything is without you needing to label everything yourself. There are also several nifty ways to construct new data types out of old ones; one of them is sort of a cross between enums and unions in C. With these special datatypes and pattern matching, you can do a lot of powerful stuff without a lot of code, and have it still be readable.
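Here is a rough sketch of how the two fit together, on the kind of expression tree a compilers project might use. The expr type and its constructors are invented for this example and not taken from any actual class project.

```sml
(* A datatype with several constructors: part enum, part tagged union. *)
datatype expr =
    Num of int
  | Add of expr * expr
  | Mul of expr * expr
  | Neg of expr

(* No type annotations anywhere: the compiler infers
   eval : expr -> int from the patterns and the arithmetic. *)
fun eval (Num n) = n
  | eval (Add (a, b)) = eval a + eval b
  | eval (Mul (a, b)) = eval a * eval b
  | eval (Neg e) = ~ (eval e)

(* (2 + 3) * -4; SML writes negative numbers with a tilde, so this is ~20. *)
val result = eval (Mul (Add (Num 2, Num 3), Neg (Num 4)))
```

In Java this would typically be a base class, one subclass per node type, and a visitor interface; here it is one datatype and one function.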

Of course, the true test of how much I like SML/NJ will come after I’ve spent a couple of months writing projects in it, instead of what are still largely just first impressions. Only time will tell.

3 Responses

  1. SML sounds like an evil bastard child of LISP and Prolog, with actual useful stuff thrown in. Huh.

  2. That could be. About all I know about LISP is that it has Lots of Irritating Silly Parentheses, which SML/NJ doesn’t have too big a problem with. The designers of SML also seem to have recognized that purely functional constructs aren’t always the way to go, and included support for mutable values (references and arrays) and iterative constructs (while loops).

  3. Modern versions of LISP sort-of have mutables (setq) and some pseudo-iterative constructs (which can be emulated via lambda anyway).

    LISP’s syntax is messy because it was originally intended to be the internal compiled representation for a higher-level language, but somewhere along the line the LISP developers decided that the internal representation was good enough. I’d imagine that if they’d actually done the higher-level language it would have looked somewhat like Tcl (which, incidentally, is my favorite functional/imperative/OO-hybrid language).
