The POSIX API

The Basis library provides a useful collection of POSIX functions on Unix systems. These are grouped together under substructures in the Posix structure. The source for these functions can be found, unsurprisingly, in the boot/Posix directory of the compiler. Much of the SML implementation is just a wrapper around calls to the corresponding C functions. The C code is in the runtime under the c-libs/ directory.

Posix.Error

The syserror value is a representation of the POSIX errno error codes. This is the same type as OS.syserror which appears in the OS.SysErr exception.

It appears as an abstract type but internally it is represented as an integer with the same value as the errno code. The errorMsg function returns the same string used by the perror() C library function. The unique name returned by errorName is derived from the symbol for the POSIX error code. If you need the actual integer value for the error code then you can use toWord.
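
As a small illustration of these functions, the following sketch provokes an error by calling stat on a file that is assumed not to exist and then picks the syserror value apart. The file name and the describe and tryStat helpers are made up for the example.

```sml
structure E = Posix.Error

(* Describe a syserror value using the three views mentioned above:
   its symbolic name, its integer code and its message. *)
fun describe err = concat [
        E.errorName err, " (",
        SysWord.fmt StringCvt.DEC (E.toWord err), "): ",
        E.errorMsg err]

(* The second field of OS.SysErr carries the syserror, if there is one. *)
fun tryStat file =
    (Posix.FileSys.stat file; "ok")
        handle OS.SysErr (_, SOME err) => describe err
             | OS.SysErr (msg, NONE)   => msg

val _ = print (tryStat "/no/such/file" ^ "\n")
```

The exact name and message strings depend on the platform and the compiler, so none are shown here.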

Posix.FileSys

The Posix.FileSys structure provides functions for dealing with directories and files except for the actual I/O which is in Posix.IO. Where possible you should use the corresponding OS.FileSys functions which are intended to be a bit more portable.

At this level you are working with Unix file descriptors, represented by the type file_desc. There is also an OS.IO.iodesc type for the file descriptors used by the poll interface in OS.IO. A separate type is used to make the OS.IO interface more independent of the POSIX layer and therefore more portable. Underneath they are both Unix file descriptors.

Most of the functions should be straightforward to use. The flags may need some explaining. Flags are represented by lists of values of some flag type. The underlying bit patterns of the values in a list are or-ed together. Each substructure containing flags inherits a flags function from the POSIX_FLAGS signature to convert the list to the bit pattern. For example, write

structure FS = Posix.FileSys
...
    FS.chmod("myfile", FS.S.flags
                [FS.S.irusr, FS.S.irgrp, FS.S.iroth])
...

to set the permissions of the file "myfile" to 0444. The sticky mode bit with value 01000 is deliberately filtered out by the stat functions as it is not part of the POSIX standard so you will never be able to test or preserve it.
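
To check what bit pattern a list of flags produces, you can convert the result to a word and print it in octal. This is only a sketch; the name readOnly is made up for the example.

```sml
structure FS = Posix.FileSys

(* Or together the three read-permission bits. *)
val readOnly = FS.S.flags [FS.S.irusr, FS.S.irgrp, FS.S.iroth]

(* Show the combined bit pattern in octal; this prints 444. *)
val _ = print (SysWord.fmt StringCvt.OCT (FS.S.toWord readOnly) ^ "\n")
```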

To give the functions a workout here is a stat program that reports a file's status in detail. First I start with some useful declarations. The wordToDec function is needed since the SysWord.toString function always formats in hexadecimal. (See the Basis documentation on the WORD signature).

structure FS = Posix.FileSys

exception Error of string
fun toErr msg = TextIO.output(TextIO.stdErr, msg)

fun wordToDec w = SysWord.fmt StringCvt.DEC w
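
The difference between the two formatting functions can be seen on a small value. The names w, hex and dec are just for illustration.

```sml
val w : SysWord.word = 0w255

val hex = SysWord.toString w             (* "FF": always hexadecimal *)
val dec = SysWord.fmt StringCvt.DEC w    (* "255" *)
```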

Here is the main function. It is pretty much boilerplate by now. It only recognises a single command line argument which must be the file name. The various functions we are using raise the OS.SysErr exception so I've put in a catch-all for it.

fun main(arg0, argv) =
let
in
    case argv of
      [file] => (stat file; OS.Process.success)

    | _ => (toErr "Usage: statx filename\n"; OS.Process.failure)
end
handle
  OS.SysErr (msg, _) =>
    (
        toErr(concat["Error: ", msg, "\n"]);
        OS.Process.failure
    )

| Error msg => (toErr msg; OS.Process.failure)

| x => (toErr(concat["Uncaught exception: ", exnMessage x,"\n"]);
        OS.Process.failure)

Here is the function to report the stat data. I've put in a SysErr handler on the stat function so that I can report the file name. This is the most likely error to come from the program. Note that for better portability you should use the matching integer structure when printing integers e.g. Position.toString for Position.int types. It happens on the Intel x86 architecture that Position = Int but this may not be true on other architectures.

fun stat file =
let
    val st = (FS.stat file) 
                handle OS.SysErr (msg, _) =>
                    raise Error (concat[ file, ": ", msg, "\n"])

    val mode = FS.ST.mode st
    val uid  = FS.ST.uid st
    val gid  = FS.ST.gid st
in
    print(concat["  File: ", file, "\n"]);

    print(concat["  Size: ",
        Position.toString(FS.ST.size st), "\n"]);

    print(concat["  Type: ",
        filetypeToString st, "\n"]);

    print(concat["  Mode: ",
        SysWord.fmt StringCvt.OCT (FS.S.toWord mode),
        "/", modeToString mode, "\n"]);

    print(concat["   Uid: ",
        uidToInt uid, "/", uidToName uid, "\n"]);

    print(concat["   Gid: ",
        gidToInt gid, "/", gidToName gid, "\n"]);

    print(concat["Device: ",
        devToString(FS.ST.dev st), "\n"]);

    print(concat[" Inode: ",
        wordToDec(FS.inoToWord(FS.ST.ino st)), "\n"]);

    print(concat[" Links: ",
        Int.toString(FS.ST.nlink st), "\n"]);

    print(concat["Access: ", Date.toString(
        Date.fromTimeLocal(FS.ST.atime st)), "\n"]);

    print(concat["Modify: ", Date.toString(
        Date.fromTimeLocal(FS.ST.mtime st)), "\n"]);

    print(concat["Change: ", Date.toString(
        Date.fromTimeLocal(FS.ST.ctime st)), "\n"]);
    ()
end

The first of the helper functions called from stat is filetypeToString. It searches a list of predicate functions to find one that works on the stat buffer value. The matching name is returned. I've put the list of predicates within a local block so that it is private to filetypeToString without being inside it. This way the list isn't built every time that the function is called, which would be wasteful. This doesn't matter in this small program but it very well might in other programs.

local
    val type_preds = [
            (FS.ST.isDir,   "Directory"),
            (FS.ST.isChr,   "Char Device"),
            (FS.ST.isBlk,   "Block Device"),
            (FS.ST.isReg,   "Regular File"),
            (FS.ST.isFIFO,  "FIFO"),
            (FS.ST.isLink,  "Symbolic Link"),
            (FS.ST.isSock,  "Socket")
            ]
in
    fun filetypeToString st =
    let
        val pred = List.find (fn (pred, _) => pred st) type_preds
    in
        case pred of
          SOME (_, name) => name
        | NONE => "Unknown"
    end
end

The modeToString helper function iterates over a list of flag testing functions, one for each position in the final mode string. I've used currying to make each of the expressions in the list a function from a mode to the character for the mode in the string.

local
    fun test flag ch mode =
    (
        if FS.S.anySet(FS.S.flags [flag], mode)
        then
            ch
        else
            #"-"
    )

    fun test2 flag1 ch1 flag2 ch2 mode =
    (
        if FS.S.anySet(FS.S.flags [flag1], mode)
        then
            ch1
        else
            if FS.S.anySet(FS.S.flags [flag2], mode)
            then
                ch2
            else
                #"-"
    )

    val flags  = [
            test  FS.S.irusr #"r",
            test  FS.S.iwusr #"w",
            test2 FS.S.isuid #"s" FS.S.ixusr #"x",
            test  FS.S.irgrp #"r",
            test  FS.S.iwgrp #"w",
            test2 FS.S.isgid #"s" FS.S.ixgrp #"x",
            test  FS.S.iroth #"r",
            test  FS.S.iwoth #"w",
            test  FS.S.ixoth #"x"
            ]
in
    fun modeToString mode =
    let
        val chars = foldl
            (fn (func, rslt) => (func mode)::rslt)
            [] flags
    in
        implode(rev chars)
    end
end
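
The list-of-curried-functions idea can be tried out on its own with plain words instead of POSIX modes. In this sketch the bitTest function and the three-bit layout are invented for the example; it also shows that a map over the list gives the same string as the foldl followed by rev.

```sml
(* Partially applying bitTest to a bit and a character leaves a
   function from a mode word to a character. *)
fun bitTest bit ch (mode : word) =
    if Word.andb (bit, mode) <> 0w0 then ch else #"-"

val tests = [
        bitTest 0wx4 #"r",
        bitTest 0wx2 #"w",
        bitTest 0wx1 #"x"
        ]

(* Accumulate in reverse with foldl and then reverse the result. *)
fun viaFold mode =
    implode (rev (foldl (fn (func, rslt) => (func mode)::rslt) [] tests))

(* Apply each function in order. *)
fun viaMap mode = implode (map (fn func => func mode) tests)

val s1 = viaFold 0wx5    (* "r-x" *)
val s2 = viaMap  0wx5    (* "r-x" *)
```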

The next group of functions convert uids and gids to their string forms, both as a decimal number and as a name from the passwd/group files. These use functions from the Posix.ProcEnv and Posix.SysDB structures, described later. If there is any exception then I assume that the name is not known in the file.

local
    structure PROC = Posix.ProcEnv
    structure DB   = Posix.SysDB
in
    fun uidToInt uid = wordToDec(PROC.uidToWord uid)
    fun gidToInt gid = wordToDec(PROC.gidToWord gid)

    fun uidToName uid =
    (
        (DB.Passwd.name(DB.getpwuid uid))
            handle _ => "unknown"
    )

    fun gidToName gid =
    (
        (DB.Group.name(DB.getgrgid gid))
            handle _ => "unknown"
    )
end

Finally here is devToString. I need to do some bit operations to separate the bytes of the dev_t word. The current SML definition for a device value does not allow for the newer 64-bit device numbers. But on Linux on Intel x86 I get the major and minor numbers in the lower 16 bits of the word. This is not very portable.

fun devToString dev =
let
    val word = FS.devToWord dev
    val w1 = SysWord.andb(SysWord.>>(word, 0w8), 0wxff)
    val w2 = SysWord.andb(word, 0wxff)
in
    concat[wordToDec w1, ",", wordToDec w2]
end
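
The bit operations can be checked without a real device by feeding in a made-up word. The splitDev function below repeats the same arithmetic on a plain SysWord.word and assumes the same 8-bit major and 8-bit minor layout.

```sml
(* Split a word into the byte above the lowest byte and the lowest byte. *)
fun splitDev (w : SysWord.word) =
let
    val major = SysWord.andb (SysWord.>> (w, 0w8), 0wxff)
    val minor = SysWord.andb (w, 0wxff)
in
    concat [SysWord.fmt StringCvt.DEC major, ",",
            SysWord.fmt StringCvt.DEC minor]
end

val s = splitDev 0wx0803    (* "8,3" *)
```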

POSIX_FLAGS

This signature is an interface that is inherited into each distinct set of flags in other Posix structures. See for example Posix.FileSys.S for the mode flags. It provides common operations on flags which are represented as bit-strings internally. See the section called Posix.FileSys for an example of flag use.
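
As a rough sketch of the common operations, here are allSet and anySet applied to some mode flags. The value names are made up for the example.

```sml
structure FS = Posix.FileSys

(* A mode equivalent to 0640. *)
val mode  = FS.S.flags [FS.S.irusr, FS.S.iwusr, FS.S.irgrp]

val owner = FS.S.flags [FS.S.irusr, FS.S.iwusr]
val group = FS.S.flags [FS.S.irgrp, FS.S.iwgrp]
val other = FS.S.flags [FS.S.iroth, FS.S.iwoth]

val ownerAll = FS.S.allSet (owner, mode)   (* true: both bits are in mode *)
val groupAll = FS.S.allSet (group, mode)   (* false: iwgrp is missing *)
val groupAny = FS.S.anySet (group, mode)   (* true: irgrp is in mode *)
val otherAny = FS.S.anySet (other, mode)   (* false: neither bit is in mode *)
```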

Posix.IO

This structure provides the functions that deal with the content of files as a stream of bytes. It works with the file descriptors that were created with the Posix.FileSys functions. There is not a lot of need to use the read and write functions in this structure for general purpose binary file I/O as the BinIO structure in the section called The Portable I/O API should be all that you will need. You could use them in conjunction with other functions that deal with file descriptors such as the file locking functions.

A good demonstration of programming at this level can be found in the implementation of the execute function in the Unix structure. (See the boot/Unix directory of the compiler). It shows how to fork and exec a child process and build portable I/O streams from file descriptors. Central to building I/O streams are the mkReader and mkWriter functions that are declared in the OS_PRIM_IO signature. (See the section called The Portable I/O API). These build reader and writer objects for buffered I/O given a POSIX file descriptor. You can find implementations of them in the PosixBinPrimIO and PosixTextPrimIO structures. The result is this code from the Unix structure.

fun fdReader (name : string, fd : PIO.file_desc) =
      PosixTextPrimIO.mkReader {
          initBlkMode = true,
          name = name,
          fd = fd
        }

fun fdWriter (name, fd) =
      PosixTextPrimIO.mkWriter {
          appendMode = false,
          initBlkMode = true,
          name = name,
          chunkSize=4096,
          fd = fd
        }

fun openOutFD (name, fd) =
      TextIO.mkOutstream (
        TextIO.StreamIO.mkOutstream (
          fdWriter (name, fd), IO.BLOCK_BUF))

fun openInFD (name, fd) =
      TextIO.mkInstream (
        TextIO.StreamIO.mkInstream (
          fdReader (name, fd), NONE))

The name argument is only used for error reporting to distinguish the stream. The implementations in the PosixBinPrimIO and PosixTextPrimIO structures use the Posix.IO.setfl function to change the blocking mode as requested by the blocking and non-blocking versions of the I/O functions in a reader or writer. You need to supply the correct initial state for these modes. If you opened the file with, for example, Posix.FileSys.openf with O_APPEND or O_NONBLOCK using the flags in Posix.FileSys.O then you must pass in the appropriate values for initBlkMode and appendMode.

The openOutFD and openInFD functions assemble the stream layers as shown in Figure 3-2. The output stream is set to be fully buffered. Other possible buffered modes, from the IO structure, are NO_BUF for no buffering at all and LINE_BUF if you want to flush the buffer after each line of text. (LINE_BUF is the same as BLOCK_BUF for binary streams).

Once you have built a stream on a file descriptor you cannot easily retrieve the file descriptor to manipulate it while the stream is live. If you call TextIO.StreamIO.getReader for example intending to get the reader's ioDesc field then the stream will be terminated on the assumption that you will be taking over all I/O from then on. If you need access to the file descriptor then you should save it somewhere yourself. You might do this if you want to use the polling interface of the OS.IO structure. (The canInput function on streams doesn't poll, it just attempts to do a non-blocking read on the file descriptor).
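
If you have saved the file descriptor then polling it might look something like the following sketch. The canRead function is invented for the example; it converts the descriptor to an OS.IO.iodesc with Posix.FileSys.fdToIOD and asks whether input is available within a timeout.

```sml
(* True if input is available on fd within the given timeout. *)
fun canRead (fd, timeout) =
    case OS.IO.pollDesc (Posix.FileSys.fdToIOD fd) of
      NONE    => false              (* this descriptor cannot be polled *)
    | SOME pd =>
        List.exists OS.IO.isIn
            (OS.IO.poll ([OS.IO.pollIn pd], SOME timeout))

(* For example, check stdin without waiting. *)
fun stdinReady () = canRead (Posix.FileSys.wordToFD 0w0, Time.zeroTime)
```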

Here is the code for Unix.executeInEnv. It demonstrates file descriptor manipulation while forking and setting up some pipes.

structure P = Posix.Process
structure PIO = Posix.IO
structure SS = Substring

fun executeInEnv (cmd, argv, env) = let
      val p1 = PIO.pipe ()
      val p2 = PIO.pipe ()

      fun closep () = (
            PIO.close (#outfd p1); 
            PIO.close (#infd p1);
            PIO.close (#outfd p2); 
            PIO.close (#infd p2)
          )

      val base = SS.string(SS.taker
                    (fn c => c <> #"/") (SS.all cmd))

      fun startChild () =
      (
        case protect P.fork () of
          SOME pid =>  pid                      (* parent *)

        | NONE =>
        let
            val oldin = #infd p1
            val newin = Posix.FileSys.wordToFD 0w0

            val oldout = #outfd p2
            val newout = Posix.FileSys.wordToFD 0w1
        in
            PIO.close (#outfd p1);
            PIO.close (#infd p2);

            if (oldin = newin) then () else (
                PIO.dup2{old = oldin, new = newin};
                PIO.close oldin);

            if (oldout = newout) then () else (
                PIO.dup2{old = oldout, new = newout};
                PIO.close oldout);

            P.exece (cmd, base::argv, env)
        end
      )

      val _ = TextIO.flushOut TextIO.stdOut

      val pid = (startChild ())
                    handle ex => (closep(); raise ex)

      val ins = openInFD (base^"_exec_in", #infd p2)
      val outs = openOutFD (base^"_exec_out", #outfd p1)

  in
    (* close the child-side fds *)
    PIO.close (#outfd p2);
    PIO.close (#infd p1);

    (* set the fds close on exec *)
    PIO.setfd (#infd p2, PIO.FD.flags [PIO.FD.cloexec]);
    PIO.setfd (#outfd p1,PIO.FD.flags [PIO.FD.cloexec]);

    PROC {
      pid = pid,
      ins = ins,
      outs = outs
    }
  end

The startChild function forks (see the section called Posix.Process and Posix.Signal) and dups file descriptors in the usual way to get the pipes connected to stdin and stdout while being careful that they are not already connected that way. Remember to close the unused ends of the pipe in the parent and child or else you won't be able to get an end-of-file indication when the child exits.
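
The same pattern can be tried in miniature with a single pipe. This is only a sketch under some assumptions: the runEcho and readAll functions are made up, /bin/echo is assumed to exist, and stdout is assumed to be open already so that the pipe does not land on descriptor 1. The child connects the write end of the pipe to its stdout and execs echo. The parent closes its copy of the write end so that it sees end-of-file when the child exits.

```sml
structure P   = Posix.Process
structure PIO = Posix.IO

(* Read from fd until end-of-file, which readVec reports
   as an empty vector. *)
fun readAll fd =
let
    fun loop acc =
    let
        val v = PIO.readVec (fd, 4096)
    in
        if Word8Vector.length v = 0
        then concat (rev acc)
        else loop (Byte.bytesToString v :: acc)
    end
in
    loop []
end

fun runEcho text =
let
    val {infd, outfd} = PIO.pipe ()
in
    case P.fork () of
      NONE =>                                   (* child *)
        (
            PIO.close infd;
            PIO.dup2 {old = outfd, new = Posix.FileSys.wordToFD 0w1};
            PIO.close outfd;
            P.exec ("/bin/echo", ["echo", text])
                handle _ => P.exit 0w127        (* exec failed *)
        )

    | SOME pid =>                               (* parent *)
        let
            val _   = PIO.close outfd           (* or we never see EOF *)
            val out = readAll infd
        in
            PIO.close infd;
            ignore (P.waitpid (P.W_CHILD pid, []));
            out
        end
end
```

Calling runEcho "hello" should return the text followed by the newline that echo adds.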

Posix.ProcEnv

This structure provides access to information about a process such as its uid, gid, pid, running time or environment variables.

You can also get system information via the uname and sysconf functions. You form the string argument to sysconf by deleting the _SC_ prefix from the POSIX value name. For example, to get _SC_OPEN_MAX write Posix.ProcEnv.sysconf "OPEN_MAX". All of the _SC_ values defined in unistd.h on your system should be available this way.
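
A short sketch of both functions follows. The sysconf result is a SysWord.word, so it is formatted in decimal here, and uname returns a list of name and value pairs.

```sml
structure PE = Posix.ProcEnv

(* _SC_OPEN_MAX without its prefix. *)
val openMax = PE.sysconf "OPEN_MAX"

val _ = print (concat ["OPEN_MAX = ",
            SysWord.fmt StringCvt.DEC openMax, "\n"])

(* Print each (name, value) pair reported by uname. *)
val _ = app
            (fn (name, value) => print (concat [name, " = ", value, "\n"]))
            (PE.uname ())
```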

To use file descriptors with isatty you need the conversion function in Posix.FileSys. For example to determine if stdin is a tty:

fun isatty() = Posix.ProcEnv.isatty (Posix.FileSys.wordToFD 0w0)

Posix.Process and Posix.Signal

The Posix.Process structure provides functions to fork and exec processes, and to kill and wait for them. Equivalents of the C library's alarm(), pause() and sleep() functions are also included. You can find a demonstration of fork and exec in the section called Posix.IO.

The kill function uses the signal values defined in Posix.Signal. This defines a type signal with values for each of the POSIX signals. You can also convert these to the integer codes for your platform with the toWord function.
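
For example, sending SIGTERM to a single process and looking up the platform's code for that signal might be written like this. The terminate function and the structure abbreviations are made up for the sketch.

```sml
structure P   = Posix.Process
structure Sig = Posix.Signal

(* Ask one process to terminate. The pid is a Posix.Process.pid,
   such as the one returned by fork. *)
fun terminate pid = P.kill (P.K_PROC pid, Sig.term)

(* The integer code for SIGTERM on this platform, in decimal. *)
val termCode = SysWord.fmt StringCvt.DEC (Sig.toWord Sig.term)
```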

Unfortunately the POSIX API does not currently provide for setting signal handlers. For that you need to resort to the older signal API of the SMLofNJ structure in the section called Signals in Chapter 4. (If you are looking in the boot/Unix directory of the compiler, the unix-signals* files define the signals for this older API).

Posix.SysDB

This structure provides an API for reading the /etc/passwd and /etc/group files. The uidToName function in the statx program of the section called Posix.FileSys provides a little demonstration of the API.
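
As a further sketch, this looks up the passwd entry for the user running the program. The describeMe function is made up for the example and, as in the statx helpers, any exception is taken to mean that the entry is not known.

```sml
structure DB = Posix.SysDB

(* Look up the current uid and report some fields of its entry. *)
fun describeMe () =
let
    val pw = DB.getpwuid (Posix.ProcEnv.getuid ())
in
    concat [DB.Passwd.name pw,
            " home=",  DB.Passwd.home pw,
            " shell=", DB.Passwd.shell pw]
end
handle _ => "unknown"

val _ = print (describeMe () ^ "\n")
```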

Posix.TTY

This structure provides a termio-style API to terminals.

The following function from the ttyx program shows how to change the erase character on your terminal. (Updating a single field in a record is a pain in SML).

structure TTY = Posix.TTY

fun setErase ch =
let
    val fd   = Posix.FileSys.wordToFD 0w0
    val attr = TTY.getattr fd
    val new_attr = TTY.termios {
            iflag  = TTY.getiflag attr,
            oflag  = TTY.getoflag attr,
            cflag  = TTY.getcflag attr,
            lflag  = TTY.getlflag attr,
            cc     = TTY.V.update
                        (TTY.getcc attr, [(TTY.V.erase, ch)]),
            ispeed = TTY.getispeed attr,
            ospeed = TTY.getospeed attr
            }
in
    TTY.setattr(fd, TTY.TC.sanow, new_attr)
end

Note that at the time of writing this, the Basis library documentation for Posix.TTY doesn't match SML/NJ version 110.0.7. In version 110.0.7 there is no internal structure called Posix.TTY.CF. Its contents appear directly in Posix.TTY. Similarly these functions which should be in the Posix.TTY.TC structure appear directly in Posix.TTY: getattr, setattr, sendbreak, drain, flush, and flow.