Modula-3 (article reprinted from USENIX ;login: November/December 1992)

Have you seen Modula-3?

Consider:

        MODULE Main;    (* hello1.m3 *)
        IMPORT Stdio, Wr;
        BEGIN
            Wr.PutText(Stdio.stdout, "Hello, world.\n");
        END Main.

The above program is written in Modula-3, one of the latest languages to descend from Algol. As you can see, the source looks a lot like Modula-2 or Pascal (or, if you live on the bleeding edge, you might agree that it looks a lot like Oberon).

As is true of the "Hello, World" program when written in other languages, our example does not begin to explicate Modula-3's interesting features; however, as with C, if you started by looking at a program which used all or even some of the interesting features, you would probably giggle and turn the page.

Modula-3 has explicit boundaries between modules. Types, constants, variables and procedures declared in a module are only accessible to other modules if the declaring module explicitly "exports" them. Contrast this to C, where all top-level names in a source file are globally visible at link time unless they are declared "static". A module which depends on entities from some other module must explicitly "import" the symbols it needs.

Consider:

        MODULE Main;    (* hello2.m3 *)
        FROM Wr    IMPORT PutText;
        FROM Stdio IMPORT stdout;
        BEGIN
            PutText(stdout, "Hello, world.\n");
        END Main.

In the first example, we imported the symbol tables from the Wr and Stdio modules, and then used their PutText and stdout symbols by qualifying those symbols with the names of the modules that exported them. In this second example, we imported only the PutText and stdout symbols -- now we can refer to them without module qualifiers. (Note that the current recommended practice among experienced Modula-3 programmers is to always import symbol tables and use qualified references; you will rarely or never see local symbol aliases used in published Modula-3 source.)
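
Neither hello program shows the exporting side. As a minimal sketch (the Counter interface and its Next procedure are hypothetical names invented for illustration, not part of any library), an interface file lists what is exported, and a module of the same name supplies the implementation:

        INTERFACE Counter;      (* Counter.i3 -- hypothetical *)
        PROCEDURE Next(): INTEGER;
        END Counter.

        MODULE Counter;         (* Counter.m3 -- exports Counter by default *)

        VAR
            count := 0;         (* not in the interface, so hidden from clients *)

        PROCEDURE Next(): INTEGER =
        BEGIN
            INC(count);
            RETURN count;
        END Next;

        BEGIN
        END Counter.

        MODULE Main;            (* client.m3 -- hypothetical *)
        IMPORT Counter, Fmt, Stdio, Wr;
        BEGIN
            Wr.PutText(Stdio.stdout, Fmt.Int(Counter.Next()) & "\n");
        END Main.

The client can call Counter.Next but has no way to name count, because only declarations that appear in the interface are visible to importers.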

Modula-3 is a strongly typed language whose loopholes are difficult enough and inconvenient enough to use that one tends to localize them to one or only a few "modules" (which, for now, you can think of as "source files"). This leads toward that ideal state of affairs where most of the ugly and nonportable code in a software system is located in one place; modules which are not declared to the compiler as "able to contain ugly, nonportable code" are prohibited from containing such code. Examples of such code include most kinds of type casts, pointer arithmetic which could lead to alignment or byte-order gaffes, and use of dynamic memory in ways that could lead to memory leaks. No one has yet invented a compiler which rejects code that is portably ugly.

Modula-3 has "objects", which for the OO-impaired are basically "records" (think "structs") which you can "subclass" (make another object just like this one but which adds some fields at the end or overrides the default values of existing fields) and which can contain "methods" (fields which are procedure ("function") pointers, calls to which are automagically rewritten by the compiler to pass as the first parameter the object variable whose method is being referenced).
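
A small sketch of that description follows. The Shape and Rect types and their procedures are hypothetical names made up for this illustration; the point is only to show a subtype adding fields, overriding a method, and the compiler passing the object as the first parameter:

        MODULE Main;    (* shapes.m3 -- hypothetical *)
        IMPORT Stdio, Wr, Fmt;

        TYPE
            Shape = OBJECT
                name: TEXT := "shape";
            METHODS
                area(): INTEGER := ShapeArea;   (* default method body *)
            END;

            Rect = Shape OBJECT                 (* subtype adds two fields *)
                w, h: INTEGER := 1;
            OVERRIDES
                area := RectArea;               (* and replaces the method *)
            END;

        PROCEDURE ShapeArea(self: Shape): INTEGER =
        BEGIN
            RETURN 0;
        END ShapeArea;

        PROCEDURE RectArea(self: Rect): INTEGER =
        BEGIN
            RETURN self.w * self.h;
        END RectArea;

        VAR
            s: Shape := NEW(Rect, name := "rect", w := 3, h := 4);

        BEGIN
            (* s.area() is compiled as a call to RectArea with s as "self" *)
            Wr.PutText(Stdio.stdout, s.name & ": " & Fmt.Int(s.area()) & "\n");
        END Main.

Although s is declared as a Shape, the call dispatches to RectArea because the object was allocated as a Rect, so the program should print "rect: 12".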

Modula-3 has "garbage collection" which means that you almost never "free" dynamic memory when you are done with it; it is reclaimed by the runtime system at some time after the last reference to it disappears. In fact the DISPOSE procedure needed to deallocate dynamic memory is only available in modules which are declared to the compiler as UNSAFE (meaning that they are allowed to contain ugly, nonportable code).

Modula-3 has a defined "threads" interface, which means that coprocessing programs can look like what they do instead of being twisted around a very busy select(2) or poll(2) system call. The freely-available uniprocessor implementation of Modula-3's threads package uses select(2) and setjmp(3) and longjmp(3), but programmers don't have to think in terms of these details and the resulting improvement in readability -- and writability -- of coprocessing programs is quite dramatic. And of course, if you use threads in your code and someday move it to a multiprocessor with thread support, you'll get multiprocessing and concurrency for free.

In support of threads, Modula-3 has mutex semaphores and a built-in LOCK statement to lock and unlock these mutexes around critical sections of code, as well as a general interface that lets threads share access to "condition variables", which BSD kernel hackers will recognize as "like sleep() and wakeup(), only how do they do that in user mode?"
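
To make the condition-variable idea concrete, here is a rough sketch of a one-slot mailbox shared by two threads. The Mailbox type, its fields, and ProducerF are hypothetical names chosen for this illustration; Thread.Wait and Thread.Broadcast are the Thread interface's operations on a Thread.Condition:

        MODULE Main;    (* mailbox.m3 -- hypothetical *)
        IMPORT Stdio, Wr, Fmt, Thread;

        TYPE
            Mailbox = MUTEX OBJECT
                full := FALSE;
                value := 0;
                changed: Thread.Condition;
            END;

        VAR
            Box := NEW(Mailbox, changed := NEW(Thread.Condition));

        PROCEDURE ProducerF(self: Thread.Closure): REFANY RAISES {} =
        BEGIN
            FOR i := 1 TO 3 DO
                LOCK Box DO
                    (* Wait unlocks Box while asleep and relocks it on wakeup *)
                    WHILE Box.full DO Thread.Wait(Box, Box.changed) END;
                    Box.value := i;
                    Box.full := TRUE;
                END;
                Thread.Broadcast(Box.changed);
            END;
            RETURN NIL;
        END ProducerF;

        BEGIN
            EVAL Thread.Fork(NEW(Thread.Closure, apply := ProducerF));
            FOR i := 1 TO 3 DO
                LOCK Box DO
                    WHILE NOT Box.full DO Thread.Wait(Box, Box.changed) END;
                    Wr.PutText(Stdio.stdout, "got " & Fmt.Int(Box.value) & "\n");
                    Box.full := FALSE;
                END;
                Thread.Broadcast(Box.changed);
            END;
        END Main.

Each side rechecks its predicate in a WHILE loop after waking, which is the usual discipline with condition variables, since a wakeup only says that something changed, not that the slot is in the state the waiter wants.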

Consider the following example, which shows a short cycle burner that will prove that the thread scheduler is indeed preemptive:

        MODULE Main;

        IMPORT Stdio, Wr, Fmt, Thread, Text, Time;

        TYPE
            LockedIndex = MUTEX OBJECT
                index := 0;
            END;

        VAR
            Inner := NEW(LockedIndex);
            Outer := 0;

        PROCEDURE InnerF(self: Thread.Closure): REFANY RAISES {} =
        BEGIN
            LOOP
                LOCK Inner DO
                    INC(Inner.index);
                END;
            END;
        END InnerF;

        BEGIN
            EVAL Thread.Fork(NEW(Thread.Closure, apply := InnerF));
            LOOP
                Time.LongPause(1);
                LOCK Inner DO
                    Wr.PutText(Stdio.stdout,
                        "Inner=" & Fmt.Int(Inner.index) & "; " &
                        "Outer=" & Fmt.Int(Outer) & "\n");
                    Wr.Flush(Stdio.stdout);
                    INC(Outer);
                    Inner.index := 0;
                END;
            END;
        END Main.

This program forks off a thread which increments, with locking, a global index variable. We made this variable an "object" which subclasses the "MUTEX" object type, since this is the usual style for object types with a single mutex to lock all of their resources. We could as easily have made this a "record" type with an explicit mutex field; for that matter we could have made the index and the mutex separate global variables with no "record" or "object" type to aggregate them. Anyway, the main thread forks a thread that executes the InnerF function, which loops forever, incrementing the global index variable which is protected by a mutex. The main thread then loops, waiting one second and then printing and clearing the index that the other thread is furiously incrementing. On the author's workstation, this program prints:

        % ./a.out
        Inner=144204; Outer=0
        Inner=126392; Outer=1
        Inner=114215; Outer=2
        Inner=125996; Outer=3
        ^C

Exceptions

Last in our brief survey of Modula-3's features, we note "exceptions". An experienced C programmer knows that there are two kinds of code: the kind where you check the return codes of all system and library calls, and then there's the kind that people actually write. In the first kind (which has been characterized as "you're crossing the street to buy an ice cream cone, so after every step you stop and check yourself all over to see if you have just been hit by a crashlanding 747 jetliner"), you discover that return codes breed more return codes, since if your function discovers an unusual error it would like to propagate this unusualness to its caller, which must do the same, all the way up the call stack until some caller "handles" it (or more often in the code we've seen, casts it to "void"). Modula-3 has as part of every procedure declaration a list of "conditions" which that procedure is capable of "raising". Code which calls that procedure has the option of wrapping the call in a TRY...EXCEPT...END statement if it wants to have a chance to "handle" certain exceptions; in the absence of any caller who cares, the program exits with an error. This leaves the return value available for a full range of useful things, none of which are reserved as magic cookies. It also avoids most occasions where an error encoding must be mapped from "-1 was an error" to "but NULL is *my* return error code".
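
As a minimal sketch of the mechanics (the BadDivisor exception and the Quotient procedure are hypothetical, invented for this illustration), a procedure declares what it may raise, and a caller that cares wraps the call:

        MODULE Main;    (* except.m3 -- hypothetical *)
        IMPORT Stdio, Wr, Fmt;

        EXCEPTION
            BadDivisor(TEXT);   (* carries an explanation to the handler *)

        PROCEDURE Quotient(a, b: INTEGER): INTEGER RAISES {BadDivisor} =
        BEGIN
            IF b = 0 THEN
                RAISE BadDivisor("divisor is zero");
            END;
            RETURN a DIV b;     (* the return value is never an error code *)
        END Quotient;

        BEGIN
            TRY
                Wr.PutText(Stdio.stdout, Fmt.Int(Quotient(12, 4)) & "\n");
                Wr.PutText(Stdio.stdout, Fmt.Int(Quotient(1, 0)) & "\n");
            EXCEPT
            | BadDivisor(why) =>
                Wr.PutText(Stdio.stdout, "error: " & why & "\n");
            END;
        END Main.

The first call prints its result normally; the second raises before anything is printed, and control jumps straight to the handler with no return-code checks in between.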

A Larger Example

So, why would you care about Modula-3, given that the world seems to be switching to C++ and Objective-C, or pouring megaspecmarks into Common Lisp? Simply put, "it ain't over 'til the fat lady sings." C++ is an endlessly flexible language, much worse so than C; measured by the ability to write code which other programmers cannot understand or to write code which even the author cannot understand or to write code which does not do what it looks like it should be doing, C++ is the first language to eclipse C. (Sorry, Larry, C++ beat Perl by a couple of years.) One could argue (and some have) that C++'s design not only permits bad code, it encourages it. Common Lisp, on the other hand, is a beautiful, elegant language which will someday join Ada in the museum of beached whales.

Of the languages whose definitions are in the public domain and for which there are freely-available, portable implementations for most of the popular POSIX-like platforms, Modula-3 is the one in which it is hardest to write code which does not do what it looks like it should be doing, or which even the author cannot understand, or which other programmers cannot understand. You may not see this from the examples shown so far, but consider the program below, which sends an IP/UDP datagram (think "packet") requesting the network time from all inetd(8)'s on the local subnet, collects the replies, and prints the average or "network" time.

        UNSAFE MODULE Main;

        IMPORT Datagram, Netdb, Net, MyTime;            (* local hacks *)
        IMPORT Stdio, Wr, Fmt, Thread, Time, Text;      (* standard *)
        IMPORT Word, Ctypes;                            (* nonportable *)

        TYPE
            T = Ctypes.long;    (* 64-bit machines will be in trouble here *)

        VAR
            Done := FALSE;
            Rcvd := 0;
            Waits := 0;
            NetTime: T;

        CONST
            TimeDifferential = -2085978496;             (* inetd(8)'s offset *)

        PROCEDURE ProtocolF(self: Thread.Closure): REFANY RAISES {} =
        VAR
            port := Datagram.NewClient(NIL, mayBroadcast := TRUE);
            server := Netdb.NewRemotePort("255.255.255.255", "time", "udp");
            timeval := NEW(REF T);
        BEGIN
            port.send(NIL, 0, server);  (* 0-length datagram is a "request" *)
            LOOP
                EVAL port.recv(timeval, BYTESIZE(timeval^), server);
                timeval^ := Word.Minus(Net.ntohl(timeval^), TimeDifferential);
                IF Rcvd = 0 THEN
                    NetTime := timeval^;
                ELSE
                    NetTime := Word.Divide(Word.Plus(NetTime, timeval^), 2);
                END;
                INC(Rcvd);
                Done := FALSE;
            END;
        END ProtocolF;

        BEGIN
            EVAL Thread.Fork(NEW(Thread.Closure, apply := ProtocolF));
            REPEAT
                Done := TRUE;
                Time.LongPause(1);
                INC(Waits);
            UNTIL Done;
            IF Rcvd = 0 THEN
                Wr.PutText(Stdio.stdout, "No Replies.\n");
            ELSE
                NetTime := Word.Minus(NetTime, Waits);
                Wr.PutText(Stdio.stdout,
                    "Network time: " & MyTime.TimeToText(NetTime) &
                    " (" & Fmt.Int(Rcvd) & " replies)\n");
            END;
        END Main.

One important note about this code: the Datagram module is a quick and dirty hack that we cobbled together to test some assumptions about IP/UDP performance; the actual IP/UDP interface supported in Modula-3 is likely to be quite different. Likewise the Netdb, Net, and MyTime modules are all local hacks that you don't have and wouldn't want anyway. As is true of any language which is less than a decade old, Modula-3's standard libraries are still evolving.

The program makes slightly contrived use of the Thread interface; the goal is to keep collecting responses until none appear for one second. A C programmer would use alarm(2), or select(2) with a timeout. This program starts a thread which blocks in port.recv() (which, given the presence of the Thread interface, was designed without any explicit timeouts of its own); whenever a datagram comes in, this thread receives it and computes it into the running average. The main thread loops, waiting one second and then exiting the loop only when no datagrams have been received by the other thread during the last second. The code is sloppy in that it should protect its thread-shared variables with mutexes, but as a demonstration it is already as complicated as would be useful.

The program is also of the "ugly and nonportable" variety; a more robust implementation would hide all of the details of the Word ("unsigned") arithmetic in other modules so that this module could do its job as straightforwardly as possible. We chose this example because it shows Modula-3 code trying to deal with the UNIX system call interface. This, in other words, is as ugly as system-dependent Modula-3 source ever has to get. You might wonder why we NEW() the timeval variable and dereference it everywhere rather than creating a normal variable and passing it by ADR() in the one place we actually need its address. This has to do with the declaration of the Datagram object's recv method, which due to dampness behind its authors' ears, was rather more selective than it could have been.

To get an idea of what is really possible given threads and garbage collection, consider an IP/DNS name server which needs to concurrently process multiple incoming and outgoing "zone transfers" over IP/TCP, all the while receiving, forwarding, and generating DNS requests and replies over IP/UDP. The popular BIND name server forks a subprocess for each zone transfer -- a major performance penalty if you don't have a copy-on-write fork() and your nameserver core image is tens of megabytes in size. BIND also has a very busy select(2) at its core, along with a memory management scheme that can make grown programmers want to quit their jobs and go drive tow-trucks. Given garbage collection and threads, the hardest parts of this sort of program just obviously slide into place with almost zero insertion force. Any C programmer who likes to use the CPP to layer garbage collection on top of malloc(3) and threads on top of select(2) will probably not enjoy Modula-3 very much since all that stuff is done for you and your application's code is mostly goal- rather than mechanism-oriented.

Comparisons

The features highlighted by the last example are: (1) variables can be given types, or initial values, or both, and if both are specified then the initial value must be of the given type; (2) this is also true of formal procedure parameters; (3) actual procedure parameters may be given positionally or by name; (4) aggregates may be returned by functions, or assigned to local procedure ("auto") variables; (5) if you want to call a typed procedure ("function") and throw away the result, you have to explicitly EVAL it; (6) EVAL in Modula-3 does not do anything like what it does in Lisp or Perl; (7) expressions of type TEXT, which includes "quoted strings" and results of functions of type TEXT, can be catenated inline with the "&" operator; (8) the ever-present newline ("\n") works as you'd expect; (9) most statements are innately compound, which means that IF and WHILE need an END, but BEGIN is meaningless for them; (10) dynamic variables created with NEW can be forgotten about, with no explicit deallocation; (11) expressions can be contained in parentheses but need not be, and the expression used for IF and WHILE and REPEAT can be parenthesisless.

Features which are not highlighted in this example but which are interesting: (1) (*comments (*can be*) nested*); (2) NEW can fill in or override default values of fields in record ("struct") or object ("struct with magic") types; (3) record and object fields can _have_ default values, which, as with variables and formal procedure parameters, cause the type to be imputed if no type is specified.
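
A short sketch of these three features follows; the Point and PointRef types are hypothetical names used only for illustration:

        MODULE Main;    (* defaults.m3 -- hypothetical *)
        IMPORT Stdio, Wr, Fmt;

        TYPE
            Point = RECORD
                x, y := 0;      (* no type given, so INTEGER is imputed *)
            END;
            PointRef = REF Point;

        VAR
            p := NEW(PointRef, y := 7);     (* x keeps its default of 0 *)

        BEGIN
            (* an outer comment (* with a nested comment *) still open here *)
            Wr.PutText(Stdio.stdout,
                Fmt.Int(p.x) & "," & Fmt.Int(p.y) & "\n");
        END Main.

This should print "0,7". Note also that NEW is handed the reference type PointRef, not the record type or its size.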

Differences from C which you will probably find bizarre or irritating: (1) NEW takes as its argument a pointer type rather than the size or type of the thing being allocated; (2) there are no (pre,post)-(inc,dec)rement operators, so you have to use INC(x) for x++ and DEC(x) for x--, and neither INC() nor DEC() returns the new value; (3) arithmetic on unsigned integers is painful and awkward; (4) compilation time and object size are both very large compared to PCC or GCC if you use the current version of the freely-available DEC SRC compiler (which is the only one in existence at this time); (5) case is significant even for built-ins, so you _must_ type a fair amount of your program text in UPPER CASE; (6) printf() is not impossible but not straightforward, either.

Differences from C++ which will either make you tense or relieved: (1) there are no enforced constructors or destructors, and though there is a convention for constructors there is none for destructors; (2) multiple inheritance, long considered either a botch or a blight (depending on who you ask), isn't here at all.

The advantages (or disadvantages, depending on who you ask) of strong type checking have already been argued elsewhere. To the oft-quoted "strong type-checking is for weak minds" argument, we counter that "software systems are getting larger and more complex; programmers' minds are not."

History

Modula-3 was designed in the late 1980's by researchers at Digital's Systems Research Center ("DEC SRC") in Palo Alto, California, and at the Olivetti Research Center ("ORC") in Menlo Park, California. It descends most recently from Modula-2+, which came from Cedar and Mesa and Pascal; Modula-3 was lightly cross-pollinated with Oberon, as Niklaus Wirth (creator of Pascal, Modula-2, Oberon, and more recently Oberon-2) was on sabbatical at DEC SRC during part of the time that Modula-3 was being conceived. Legend has it that Wirth's main contribution to Modula-3 was to encourage its designers to leave things out; this "smallness" is apparent in that Modula-3 is smaller by far than Modula-2+, though it is still larger than Oberon or Oberon-2.

A portable implementation of Modula-3 was written at DEC SRC and has been made more-or-less freely available by Digital Equipment Corporation. This compiler generates C as its intermediate language, which accounts for its portability but also its moderate speed and largish object code size; on the bright side it is free, and runs on most of the common POSIX-like platforms including Ultrix (MIPS and VAX), SunOS (SPARC and 68000), RS6000, and HP-UX (PA-RISC and 68000). More ports are under way, as is development and standardization of the runtime library. There are restrictions on the use of this compiler for commercial products, but you should get those details by reading the release notes that come with the DEC SRC compiler.

Future

To many programmers, C and C++ seem like foregone conclusions. Modula-3 is the only serious challenger to C++ as the next massively popular system and application programming language. This author believes that there is a good chance that there will be a market for programmers and CASE tools in Modula-3 in the next year or two, since it is a practical yet elegant software design and implementation tool which encourages clean, bug-free code and is a true pleasure to work in no matter how large the project might be (or become).

If you enjoyed Modula-2 or Pascal but found them incomplete and limiting, it's a safe bet that you will find whatever you were missing...in Modula-3.

If C is driving you nuts but you get cold sweats whenever you think about C++, you may find a way out of your dilemma...in Modula-3.

Resources

Systems Programming with Modula-3, Greg E. Nelson et al, Prentice Hall

Modula-3, Samuel P. Harbison, Prentice Hall

comp.lang.modula3 -- (usenet newsgroup)

gatekeeper.dec.com:~ftp/DEC/Modula-3/* -- a freely available, portable Modula-3 compiler for several UNIX variants including Ultrix, SunOS, HP-UX.

(Editor's note in 2014: see also the Modula-3 web site.)

$Id: m3-art,v 1.4 1992/10/13 16:47:07 vixie Exp $