The Standard ML Basis Library


The PRIM_IO signature

The PRIM_IO signature is an abstraction of the fundamental system call operations commonly available on file descriptors. Higher level IO facilities do not access the OS structure directly, but access the appropriate primitive IO reader and writer that accomplish the required system call.

Several operations in the PRIM_IO interface will raise exceptions that have been left intentionally unspecified. The actual exception raised will usually be operating-system dependent, but may vary. For example, a reader connected to an algorithm that generates prime numbers might raise all kinds of exceptions. In addition, one would expect readVec and readVecNB to raise Size if the resulting vector would exceed the maximum allowed vector size. Similarly, one would expect readArr, readArrNB, writeArr, writeArrNB, writeVec and writeVecNB to raise Subscript if array bounds are violated. Readers and writers should not, in general, raise the IO.Io exception. It is assumed that the higher levels will appropriately handle these exceptions.

A reader is required to raise IO.Io if any of its functions, except close or getPos, is invoked after a call to close. A writer is required to raise IO.Io if any of its functions, except close, is invoked after a call to close.
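
One way to meet this requirement is to wrap each operation in a check of a shared flag. The following is a minimal sketch, not part of PRIM_IO: guard, closed, and the demo readVec are hypothetical names, and the code assumes that the IO structure provides the Io exception with fields name, function and cause, together with the ClosedStream exception, as current Basis implementations do.

```sml
(* Hypothetical helper, not part of PRIM_IO: wraps an operation f so that it
   raises IO.Io once the shared `closed` flag has been set. *)
fun guard (name : string, fname : string, closed : bool ref)
          (f : 'a -> 'b) : 'a -> 'b =
  fn x =>
    if !closed
      then raise IO.Io {name = name, function = fname, cause = IO.ClosedStream}
      else f x

(* Demo: a reader operation that always reports end-of-stream. *)
val closed = ref false
val readVec = guard ("demo", "readVec", closed) (fn (n : int) => "")

(* close itself is not guarded, so it may be called repeatedly. *)
fun close () = closed := true
```

Because close is left unguarded, calling it a second time is harmless, which matches the exemption given to close above.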


Synopsis

signature PRIM_IO
structure BinPrimIO : PRIM_IO
structure TextPrimIO : PRIM_IO
structure WideTextPrimIO : PRIM_IO

Interface

type array
type vector
type elem
eqtype pos
val compare : (pos * pos) -> order
datatype reader
  = RD of {
      name : string,
      chunkSize : int,
      readVec : (int -> vector) option,
      readArr : ({buf : array, i : int, sz : int option} -> int) option,
      readVecNB : (int -> vector option) option,
      readArrNB : ({buf : array, i : int, sz : int option} -> int option) option,
      block : (unit -> unit) option,
      canInput : (unit -> bool) option,
      avail : unit -> int option,
      getPos : (unit -> pos) option,
      setPos : (pos -> unit) option,
      endPos : (unit -> pos) option,
      verifyPos : (unit -> pos) option,
      close : unit -> unit,
      ioDesc : OS.IO.iodesc option
    }
datatype writer
  = WR of {
      name : string,
      chunkSize : int,
      writeVec : ({buf : vector, i : int, sz : int option} -> int) option,
      writeArr : ({buf : array, i : int, sz : int option} -> int) option,
      writeVecNB : ({buf : vector, i : int, sz : int option} -> int option) option,
      writeArrNB : ({buf : array, i : int, sz : int option} -> int option) option,
      block : (unit -> unit) option,
      canOutput : (unit -> bool) option,
      getPos : (unit -> pos) option,
      setPos : (pos -> unit) option,
      endPos : (unit -> pos) option,
      verifyPos : (unit -> pos) option,
      close : unit -> unit,
      ioDesc : OS.IO.iodesc option
    }
val augmentReader : reader -> reader
val augmentWriter : writer -> writer

Description

type array
type vector
type elem
elem is an abstraction that represents the ``element'' of a file (or device, etc.). Typically, elem is a character or a byte. One typically reads or writes a sequence of elements in one system call: this sequence is the vector type. Sometimes it is useful to read or write the sequence from a mutable structure, which is the array type.

eqtype pos
This is an abstraction of a position in a file.

compare (pos, pos')
returns LESS, EQUAL, or GREATER when pos is less than, equal to, or greater than pos', respectively, in some underlying linear ordering on pos values.

datatype reader
A reader is a file (device, etc.) opened for reading, or some other kind of algorithm that produces elements, not necessarily connected to the outside world.

name
The name associated with this reader, used in error messages shown to the user.
chunkSize
The recommended (efficient) size of read operations on this reader. This is typically set to the block size of the operating system's buffers. chunkSize = 1 strongly recommends (but cannot guarantee) unbuffered reads. chunkSize <= 0 is illegal.
readVec n
when present, reads i elements for 1 <= i <= n, returning a vector v of length i; or (if end-of-stream is detected) returns an empty vector. Blocks (waits) if necessary until end-of-stream is detected or at least one element is available.
readArr{buf, i, sz}
when present, reads k elements (1 <= k <= sz) into the array buf starting at offset i, returning k; blocks (waits) if necessary until at least one element is available. If no elements remain before end-of-stream, returns 0.
readVecNB n
when present, reads i elements without blocking for 1 <= i <= n, creating a vector v, returning SOME(v); or if end-of-stream is detected without blocking, returns SOME(fromList[]); or if a read would block, returns NONE.
readArrNB {buf, i, sz}
when present, reads k elements without blocking, for 1 <= k <= sz into the array buf, starting at offset i, returning SOME k; if no elements remain before end-of-stream (determined without blocking), returns SOME(0); or (if a read would block) returns NONE.
block()
when present, blocks until at least one element is available for reading without blocking.
canInput()
when present, returns true if and only if the next read can proceed without blocking.
avail()
returns the number of bytes available on the ``device,'' or NONE if it cannot be determined. For files or strings, this is the file or string size minus the current position; for most other input sources, this is probably NONE. This can be used as a hint by inputAll. Note that this is a byte count, not an element count.
getPos()
returns the current position in the file. The getPos function must be non-decreasing (in the absence of setPos operations or other interference on the underlying object).
setPos(i)
moves to position i in the file. Raises an exception if unimplemented or invalid.
endPos()
returns the position at end-of-stream. Raises an exception if unimplemented or invalid.
verifyPos()
returns the true current position in the file. Similar to getPos, except that the latter may maintain its own notion of file position for efficiency, whereas verifyPos will typically perform a system call to obtain the underlying operating system's value of the file position.
close()
closes the reader and frees operating system resources. Further operations on the reader (besides close and getPos) raise IO.ClosedStream.
ioDesc
when present, ioDesc is the abstract operating system descriptor associated with this stream.
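
The behaviour of readVec, avail and getPos described above can be illustrated with a source that is not connected to the outside world. The following is a minimal sketch only: stringSource is a hypothetical function that returns a plain record of closures rather than an RD value, and it uses an integer offset in place of the abstract pos type.

```sml
(* Hypothetical constructor: builds reader-like operations over a string.
   The result is an ordinary record, not a PRIM_IO reader. *)
fun stringSource (s : string) =
  let
    val pos = ref 0                          (* elements consumed so far *)
    fun remaining () = String.size s - !pos
    (* Returns between 1 and n elements, or the empty vector once the
       end of the string has been reached. *)
    fun readVec n =
      let
        val k = Int.min (n, remaining ())
        val v = String.substring (s, !pos, k)
      in
        pos := !pos + k; v
      end
  in
    { readVec = readVec,
      avail = fn () => SOME (remaining ()),  (* size minus current position *)
      getPos = fn () => !pos }
  end
```

A string source never needs to wait, so readVec here returns immediately; a reader for a device would block until at least one element is available.
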
Implementation note:

Providing more of the optional functions increases functionality and/or efficiency of clients. If the reader can provide more than the minimum set in a way that is more efficient than the obvious synthesis, then by all means it should do so. Providing more than the minimum by just doing the obvious synthesis inside the primitive I/O layer is not recommended because then clients won't get the ``hint'' about which are the efficient (``recommended'') operations.

  1. Absence of all of readVec, readArr and block means that blocking input is not possible.
  2. Absence of all of readVecNB, readArrNB and canInput means that non-blocking input is not possible.
  3. Absence of readVecNB means that non-blocking input requires two system calls (using canInput and readVec).
  4. Absence of readArr or readArrNB means that input into an array requires extra copying. Note that the ``lazy functional stream'' model does not use arrays at all.
  5. Absence of endPos means that very large inputs (where vectors must be pre-allocated) cannot be done efficiently (in one system call, without copying).
  6. The client is likely to call getPos on every read operation. Thus, the reader should maintain its own count of (untranslated) elements to avoid repeated system calls. This should not be done on streams opened for atomic append, of course, where the information cannot be obtained except by a system call.
  7. Absence of setPos prevents random access.
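
The two-call synthesis mentioned in item 3 might look like the following. This is a sketch under stated assumptions: synthReadVecNB is a hypothetical name, and the operations are passed in as plain functions rather than extracted from an RD record.

```sml
(* Hypothetical helper: builds a non-blocking vector read from canInput
   and a blocking readVec. Each successful read costs two calls. *)
fun synthReadVecNB (canInput : unit -> bool, readVec : int -> 'v)
    : int -> 'v option =
  fn n =>
    if canInput ()
      then SOME (readVec n)   (* the read can proceed without blocking *)
      else NONE               (* the read would block *)
```

A reader that supplies its own readVecNB can usually do this in a single system call, which is why providing it directly is worthwhile.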


datatype writer
A writer is a file (device, etc.) opened for writing.

name
The name associated with this file or device, for use in error messages shown to the user.
chunkSize
The recommended (efficient) size of write operations on this writer. chunkSize <= 0 is illegal.
writeVec{buf, i, sz}
when present, writes k elements from buf starting at offset i, for 0 < k <= sz, to the output device, and returns k. If necessary, blocks (waits) until the device can accept at least one element.
writeArr{buf, i, sz}
when present, writes k elements from buf, starting at offset i, for 0 < k <= sz, to the output device, and returns k. If necessary, blocks (waits) until the device can accept at least one element.
writeVecNB{buf, i, sz}
when present, writes k elements from buf starting at offset i, for 0 < k <= sz, to the output device without blocking, and returns SOME(k); or (if the write would block) returns NONE.
writeArrNB{buf, i, sz}
when present, writes k elements from buf starting at offset i, for 0 < k <= sz, to the output device without blocking, and returns SOME(k); or (if the write would block) returns NONE.
block()
when present, blocks until the writer is guaranteed to be able to write without blocking.
canOutput()
when present, returns true if and only if the next write can proceed without blocking.
getPos()
returns the current position within the file. May raise an exception if unimplemented or invalid.
endPos()
returns the position at end-of-stream, without actually changing the current position.
setPos(i)
moves to position i in the file, so future writes occur at this position. May raise an exception if unimplemented or invalid.
verifyPos()
returns the true current position in the file. Similar to getPos, except that the latter may maintain its own notion of file position for efficiency, whereas verifyPos will typically perform a system call to obtain the underlying operating system's value of the file position.
close()
closes the writer and frees operating system resources. Further operations (other than close) raise IO.ClosedStream.
ioDesc
when present, ioDesc is the abstract operating system descriptor associated with this stream.

The write operations return the number of full elements that have been written. If the size of an element is greater than 1 byte, it is possible that an additional part of an element might be written. For example, if one tries to write 2 elements, each of size 3 bytes, the underlying system write operation may report that only 4 of the 6 bytes have been written. Thus, one full element has been written, plus part of the second.
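
Because a write may accept fewer elements than requested, a client that wants to emit a whole buffer has to loop. The following is a minimal sketch: writeAll is a hypothetical helper, the element vector is taken to be string, and sz = NONE is assumed to mean "through the end of buf".

```sml
(* Hypothetical helper: calls a writeVec-style function repeatedly until
   every element of buf has been accepted. *)
fun writeAll (writeVec : {buf : string, i : int, sz : int option} -> int)
             (buf : string) : unit =
  let
    val n = String.size buf
    (* i is the offset of the first element not yet written. writeVec
       returns k > 0, so the offset strictly increases. *)
    fun loop i =
      if i >= n then ()
      else loop (i + writeVec {buf = buf, i = i, sz = NONE})
  in
    loop 0
  end
```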

Implementation note:

One of writeVec, writeVecNB, writeArr or writeArrNB must be provided. Providing more of the optional functions increases functionality and/or efficiency of clients:

  1. Absence of all of writeVec, writeArr and block means that blocking output is not possible.
  2. Absence of all of writeVecNB, writeArrNB and canOutput means that non-blocking output is not possible.
  3. Absence of writeArr or writeArrNB means that extra copying will be required to write from an array.
  4. Absence of setPos prevents random access.

Unlike readers, which can expect their getPos functions to be called frequently, writers need not implement getPos in a super-efficient manner: a system call for each getPos is acceptable. Furthermore, getPos need not be supported for writers (it may raise an exception), whereas for readers it must be implemented (even if inaccurately).



augmentReader rd
produces a reader in which as many as possible of readVec, readArr, readVecNB, readArrNB are provided, by synthesizing these from the operations of rd.

For example, augmentReader can synthesize readVec from readVecNB and block, synthesize vector reads from array reads, synthesize array reads from vector reads, as needed.
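
The first of these syntheses could be written roughly as follows. This is an illustrative sketch, not the library's implementation: synthReadVec is a hypothetical name, and the two operations are passed as plain functions rather than taken from an RD record.

```sml
(* Hypothetical helper: builds a blocking readVec from readVecNB and block. *)
fun synthReadVec (readVecNB : int -> 'v option, block : unit -> unit)
    : int -> 'v =
  let
    fun readVec n =
      case readVecNB n of
        SOME v => v                      (* data or end-of-stream *)
      | NONE => (block (); readVec n)    (* would block: wait, then retry *)
  in
    readVec
  end
```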

augmentWriter wr
produces a writer in which as many as possible of writeVec, writeArr, writeVecNB, writeArrNB are provided, by synthesizing these from the operations of wr.
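
For instance, an array write can be synthesized from a vector write at the cost of one copy. The sketch below is an assumption about how such a synthesis might be done, not the library's code: synthWriteArr is a hypothetical name, and it assumes character elements and the CharArraySlice structure found in current Basis implementations.

```sml
(* Hypothetical helper: builds writeArr from writeVec by copying the
   requested part of the array into a fresh vector. *)
fun synthWriteArr
      (writeVec : {buf : CharVector.vector, i : int, sz : int option} -> int)
    : {buf : CharArray.array, i : int, sz : int option} -> int =
  fn {buf, i, sz} =>
    let
      (* This copy is the extra cost of not having a native writeArr. *)
      val v = CharArraySlice.vector (CharArraySlice.slice (buf, i, sz))
    in
      writeVec {buf = v, i = 0, sz = NONE}
    end
```

The returned count is relative to offset i of the original array, since the copy starts there.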


See Also

BIN_IO, IMPERATIVE_IO, STREAM_IO, TEXT_IO, POSIX_IO, OS.IO


Last Modified April 21, 1996
Comments to John Reppy.
Copyright © 1997 Bell Labs, Lucent Technologies