Networking in Idris

I wrote a very basic network library a little while back for Idris, but recently decided to scrap it and start again. In this post, I’ll describe how to use the new library, the guarantees that dependent types give us, and show some examples. I’ll also describe my plans for the coming months. Any comments or criticisms would be very much appreciated!

First off, it’s important that I say that this is very much a work in progress right now — there are a number of limitations that I hope to iron out soon  (for example, we can only operate on strings) but I hope this will be enough to get us started.

You will need:

Once you have installed all of these (in order to do this, run idris –install <packagename> from the directory for the packages, or just make for Idris), we can start having a look at how everything works:

The Low Level Stuff

The library is largely in two parts: the higher-level effects for verified TCP communication, and the lower-level socket interface.

The Network.Socket library is a low-level abstraction over the C socket library, and it contains all of the goodness you would expect: socket, bind, connect, send, recv and so on. This is to allow people to build things on top of the basics of the library (so, if you fancy doing some nice raw socket goodness, this should provide a nice starting point!).

There’s also a thin C library, idrisnet.c, which provides some helper functionality to bridge the gap imposed by the FFI.

I haven’t generated the documentation just yet, but the functionality should be fairly clear from the file, and accessible to anyone with knowledge of programming with C sockets.

 

The Higher-Level Stuff

Primer: The New Effects Library

If you haven’t already, I strongly recommend you read the details about the “old” (I hesitate to call it that!) effects library here. The paper details how to model side effects algebraically, and shows how, since effectual operations may be predicated on the state of an associated resource, the library may be used to enforce resource usage protocols.

One issue we found when using the library for a larger project was that it was quite difficult to encapsulate failure. Initial versions of IdrisWeb (a POC web framework) and IdrisNet encapsulated failures by parameterising a resource over the particular step of the computation (for example, Bound or Listening) and recording any failure at the value level. This wasn’t ideal, since many extra handler functions needed to be created, and it was rather difficult to propagate failures in the program.

It’s not really my place to describe this, since the new Effects library was written by Edwin, but the general gist is that we may now use the result of an effectual operation to calculate the output resource. A good example is the ‘Hello World’ of resource-aware DSLs: file handling. Taken from here (not written by me!), we firstly specify an ADT to abstractly model the effectual operations:

data FileIO : Effect where
     Open : String -> (m : Mode) -> 
            {() ==> if result then OpenFile m else ()} FileIO Bool
     Close : {OpenFile m ==> ()} FileIO ()
     ReadLine : {OpenFile Read} FileIO String
     WriteLine : String -> {OpenFile Write} FileIO ()
     EOF : {OpenFile Read} FileIO Bool

For those of you who are familiar with the existing Effects syntax, you will notice some differences. Firstly, the { … } notation is syntactic sugar for a resource transition. Let’s take Open as an example:

Open : String -> (m : Mode) -> {() ==> if result then OpenFile m else ()} FileIO Bool

This means that the Open operation takes a String, a Mode (for example, read or write) and returns a Bool. This Boolean return value is bound to the variable result in the curly-bracketed expression:

{() ==> if result then OpenFile m else ()}

This indicates that we start the operation with the resource (). If the operation is successful (so, if it returns True) then the resulting resource is (OpenFile m), where m is the mode we specified before. If not, then the resource remains ().

We also then (as in the original paper) supply handler functions, which handle the abstract effects in a given execution context. To use these operations, we must wrap them in a function like so:

open : Handler FileIO e =>
       String -> (m : Mode) ->
       { [FILE_IO ()] ==> [FILE_IO (if result then OpenFile m else ())] } EffM e Bool
open f m = Open f m

We once again use syntactic sugar to show how the resource changes as a result of the operation.

TCPClient

The TCPClient.idr file allows us to build TCP Clients (funny, that!), which are guaranteed to follow the correct resource-usage protocol and are therefore guaranteed not to make erroneous calls to the underlying library. We can also guarantee freedom from handle leaks, since the types will specify that all allocated resources must also be freed. Additionally, the state (in our case, the client socket) is contained within the resource, so the user doesn’t have to worry about maintaining and using the descriptor.

Let’s have a bit more of a look at the TCPClient effect itself. We want to keep track of whether a client is connected (and therefore may send and receive data along the TCP channel) and also if an error has occurred (meaning that the channel may not be used but must still be closed). To do this, we define two resources, ClientConnected and ErrorState.

data ClientConnected : Type where
 CC : Socket -> ClientConnected
data ErrorState : Type where
 ES : Socket -> ErrorState

We then define the effect.

-- TCP Client Effect
data TCPClient : Effect where
 Connect : SocketAddress -> Port -> {() ==> interpConnectRes result}
                                    TCPClient (SocketOperationRes Socket)
 Close : { ClientConnected ==> ()} TCPClient () 
 Finalise : { ErrorState ==> () } TCPClient () 
 WriteString : String -> { ClientConnected ==> interpOperationRes result}
                         TCPClient (SocketOperationRes ByteLength) 
 ReadString : ByteLength -> 
              { ClientConnected ==> interpOperationRes result }
              TCPClient (SocketOperationRes (String, ByteLength))

I think a diagram makes this more clear…

The state diagram for the TCPClient effect

The state diagram for the TCPClient effect

Essentially, we start in the uninitialised state (). When we try to connect using tcpConnect (the wrapper function around the Connect operation), we either successfully connect and therefore transition to ClientConnected, or the connection fails and we remain in the uninitialised state. By closing the connection using tcpClose, we transition back to the uninitialised state.

The more interesting transitions happen as a result of the possibly failing resources tcpSend and tcpRecv.  These will either succeed, and remain in the ClientConnected state, or fail, and transition into the ErrorState state.

To implement this transition, we can essentially distil the results of a TCP socket operation into 4 cases:

  • The operation succeeds, returning the number of bytes sent on a send operation, or the number of bytes received on a receive operation.
  • The operation fails, but returns the codes EAGAIN or EWOULDBLOCK. These codes indicate that the operation failed, but can be safely retried.
  • The operation fails with a fatal error. In this case, we may not use the socket link again, and should close it at the next opportunity.
  • The remote socket has closed the connection.

To represent this programmatically, we define the following data type:

data SocketOperationRes a = OperationSuccess a
                          | FatalError SocketError -- Most socket errors are fatal.
                          | RecoverableError SocketError -- EAGAIN / EWOULDBLOCK
                          | ConnectionClosed

The interesting part comes with linking this to the resource transition. To do this, we make use of the fact that Idris treats types as first-class terms, thereby allowing them to be computed, and implement a function which calculates the output resource, given the result of the operation:

interpOperationRes : SocketOperationRes a -> Type
interpOperationRes (OperationSuccess _) = ClientConnected
interpOperationRes (FatalError _) = ErrorState
interpOperationRes (RecoverableError _) = ClientConnected
interpOperationRes ConnectionClosed = ()

So, if the operation succeeds or a recoverable error occurs, we remain in the ClientConnected state. If we encounter a fatal error, we transition to the FatalError state, and finally if the connection is closed, we transition to the uninitialised state.

TCPServer

Well, we can connect to things, but it’d be good to have something to connect to. We do this in a similar way to TCPClient, defining resources representing each state of the server: uninitialised, bound, listening, and in an error state.

data ServerBound : Type where
 SB : Socket -> ServerBound
data ServerListening : Type where 
 SL : Socket -> ServerListening
data ServerError : Type where
 SE : Socket -> ServerError

And the corresponding effect and state diagram:

data TCPServer : Effect where
 Bind : SocketAddress -> Port -> { () ==> interpBindRes result }
                                 TCPServer (SocketOperationRes ()) 
 Listen : { ServerBound ==> interpListenRes result } 
          TCPServer (SocketOperationRes ()) 
 Accept : ClientProgram t ->
          { ServerListening ==> interpOperationRes result } 
          TCPServer (SocketOperationRes t)
 CloseBound : { ServerBound ==> () } TCPServer () 
 CloseListening : { ServerListening ==> () } TCPServer () 
 Finalise : { ServerError ==> () } TCPServer ()
TCPServer State Diagram

TCPServer State Diagram

 

Let us look in detail at accepting a client, since this is (in my opinion) quite cool. The Accept operation has the following signature:

 Accept : ClientProgram t ->
          { ServerListening ==> interpOperationRes result } 
          TCPServer (SocketOperationRes t)

Note the ClientProgram parameter: this is a type synonym for the following effectual operation:

ClientProgram : Type -> Type
ClientProgram t = {[TCPSERVERCLIENT (ClientConnected)] ==> 
                   [TCPSERVERCLIENT ()]} EffM IO t

What this means is that we present the Accept operation with an effectual program which assumes a connected client—in this case, the socket gained from the accept operation—and results in the socket being closed. This is then run if a client is successfully accepted, and we have a static guarantee that the client behaves properly and is cleaned up.

(I’ve omitted the details of TCPSERVERCLIENT here since this is already far too long for a blog post, but essentially it’s like TCPCLIENT but without the connection operation).

Using the Libraries

I’ve written two basic applications making use of the libraries: EchoClient and EchoServer. They work as you might expect: the EchoClient application sends a message to the EchoServer, and the EchoServer sends it back. We’ll discuss EchoClient here, although both are on the repository.

At the top level, we use the Effects DSL to run the effectful echoClient program:

main : IO ()
main = run [(), ()] (echoClient (IPv4Addr 127 0 0 1) 1234)

We then attempt to connect to the server, using the given details:

echoClient : SocketAddress -> Port -> 
             { [TCPCLIENT (), STDIO] ==> [TCPCLIENT (), STDIO]} EffM IO ()
echoClient sa port = do
  connect_res <- tcpConnect sa port
  case connect_res of
       OperationSuccess sock => do putStr "Connected!\n"
                                   getAndSend
       RecoverableError _ => echoClient sa port
       ConnectionClosed => do putStr "Unable to connect :(\n"
                              return ()
       FatalError _ => do putStr "Unable to connect :(  \n"
                          return ()

The important concept to recognise here is that due to the correspondence between the result of the tcpConnect operation, by case-splitting, we may ascertain the resource given by the result in each branch. This allows us to ensure that the operations within the branch correspond to the resource. To give a concrete example, we may invoke the getAndSend function (discussed below) which requires a ClientConnected resource in the OperationSuccess branch, but attempting to do the same in the FatalError branch will result in a compile-time error.

The getAndSend and recvAndPrint functions take a very similar form, making use of the send and receive functions to communicate with the server. They also make use of the composability of resources to also allow messages to be printed to the console. I’m going to omit them in the interest of brevity (although, this post is now knocking on 2000 words, so that’s a bit rich…) but you can see them in the repository.

 

The Future…

If you’ve got this far, firstly, thanks for your time and I hope you found something at least a little bit interesting somewhere along the way! This is still early work, and there’s still a lot to do. Most seriously, this only works with strings at the moment: ideally, we’d have some kind of stream of Bits8 characters. The programs written with the effectual TCP library are also pretty verbose at the moment, and ideally it would be nice to find some way of making them more concise whilst retaining the guarantees we have at the moment. Basic concurrency support should also be along soon.

More interestingly, I’m currently working on a DSL to specify packet structures and marshal lower-level, much like the one described in this paper. That should come along soon, and I’ll hopefully use it as part of bigger projects.

Thanks for reading, and any comments would be gratefully received!

Leave a Reply

Your email address will not be published.

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>