Understanding Unix sockets
What are one shot servers in ipc-channel? Why are they even called “servers”? Why “one shot”? And, perhaps most importantly, what was the rationale behind them and how are they supposed to be used?
These are some of the questions I've been asking recently.
Well, a bit of git archaeology¹ revealed² that one shot servers were originally just called servers and that they were designed to sit nicely on top of Unix sockets (which are used to implement ipc-channel on Unix variants).
If, like me, you've heard of Unix sockets but aren't entirely sure what they are or how they are used, this article is for you.
TCP/IP sockets
Before exploring Unix sockets, let's recap the possibly more familiar behaviour of TCP/IP sockets.
At the networking level, a TCP/IP socket consists of a port and an IP address. A server creates a socket and then accepts connection requests from clients. Each resultant connection consists of a pair of sockets³: (client IP, client port) and (server IP, server port).
At the application programming level, a server creates a socket, binds the socket to the server IP address and a well-known port, and then sets the socket into a listening state, so clients can connect to the socket. The API for a socket is a superset of that of a file and allows binding, listening, accepting as well as reading and writing data. A socket has a socket descriptor rather than a file descriptor, but both socket and file descriptors are identified by an integer allocated from the same table for a process. A socket being listened to by a server has a socket descriptor and each resultant connection is given a new socket with a distinct socket descriptor and integer index.
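To make this concrete, here is a minimal sketch in Rust using only the standard library. It assumes the loopback address and port 0 (so the operating system picks a free port) rather than a well-known port, and the function name `tcp_round_trip` is just something I made up for illustration. It shows that the socket returned by accepting a connection is a new socket, yet its local address and port are the same as those of the listening socket.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Runs one client/server exchange over loopback and returns
/// (listening address, server end's local address,
///  client address as seen by the server, client's own local address).
fn tcp_round_trip() -> std::io::Result<(SocketAddr, SocketAddr, SocketAddr, SocketAddr)> {
    // Create a socket, bind it and set it listening.
    // Port 0 (an assumption for this sketch) asks the OS for any free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let listen_addr = listener.local_addr()?;

    // The client connects to the listening address and sends four bytes.
    let client = thread::spawn(move || -> std::io::Result<SocketAddr> {
        let mut stream = TcpStream::connect(listen_addr)?;
        stream.write_all(b"ping")?;
        stream.local_addr()
    });

    // Accepting yields a new socket representing the server end of the connection.
    let (mut conn, peer_addr) = listener.accept()?;
    let mut buf = [0u8; 4];
    conn.read_exact(&mut buf)?;
    assert_eq!(&buf, b"ping");

    let conn_local = conn.local_addr()?;
    let client_addr = client.join().expect("client thread panicked")?;
    Ok((listen_addr, conn_local, peer_addr, client_addr))
}

fn main() -> std::io::Result<()> {
    let (listen, conn_local, peer, client) = tcp_round_trip()?;
    println!("listening socket: {listen}");
    println!("connection: server end {conn_local}, client end {peer}");
    // The connection reuses the listening IP address and port on the server side.
    assert_eq!(listen, conn_local);
    // The server sees the client's (IP, port) as the other half of the pair.
    assert_eq!(peer, client);
    Ok(())
}
```

The printed port numbers vary from run to run because the operating system chooses them.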
Unix sockets
Unix sockets are similar to TCP/IP sockets except that they are used for interprocess communication (IPC) within a single operating system instance.
At the networking level, a Unix socket is identified by a file system path. A server creates a socket, binds it to a file system path, sets it listening, and then accepts connection requests from clients. Each connection consists of a pair of sockets, although the server socket is distinct from the socket the server used to accept the connection.
At the application programming level, Unix sockets and TCP/IP sockets are variants of the same socket abstraction. The main difference is that a Unix socket is bound to a file system path rather than an IP address and port. So Unix sockets are, like TCP/IP sockets, accessed via a socket descriptor allocated from the set of file descriptors associated with a process.
Connections are bidirectional. Data sent to the client end of a connection can be received from the server end of the connection. Similarly, data sent to the server end of the connection can be received from the client end of the connection.
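A small sketch of this bidirectional behaviour, assuming Rust's standard library on a Unix platform. It uses `UnixStream::pair()`, which hands back two sockets that are already connected to each other, so no file system path is involved here. The two ends simply stand in for the client and server ends of a connection, and the name `exchange` is my own.

```rust
use std::io::{Read, Write};
use std::os::unix::net::UnixStream;

/// Sends one message in each direction over a connected pair of Unix sockets
/// and returns (what the server end received, what the client end received).
fn exchange() -> std::io::Result<(String, String)> {
    // Two already-connected sockets, standing in for the two ends of a connection.
    let (mut client_end, mut server_end) = UnixStream::pair()?;

    // Data written to the client end is read from the server end...
    client_end.write_all(b"request")?;
    let mut buf = [0u8; 7];
    server_end.read_exact(&mut buf)?;
    let at_server = String::from_utf8_lossy(&buf).into_owned();

    // ...and data written to the server end is read from the client end.
    server_end.write_all(b"response")?;
    let mut buf = [0u8; 8];
    client_end.read_exact(&mut buf)?;
    let at_client = String::from_utf8_lossy(&buf).into_owned();

    Ok((at_server, at_client))
}

fn main() -> std::io::Result<()> {
    let (at_server, at_client) = exchange()?;
    println!("server end received: {at_server}");
    println!("client end received: {at_client}");
    Ok(())
}
```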
Let's look at the server side and client side of Unix sockets in more detail.
Server side Unix socket
A process which will act as a server creates a Unix socket and then accepts connection requests from client processes:
- Create a socket. A socket descriptor is returned.
- Bind a file system path to the socket.
- Set the socket listening for connection requests from client processes.
- Accept connection requests. Once a connection request is received, a connection is created and the file descriptor of the server end of the connection is returned from accept().
- The server may then receive requests (by reading from the connection file descriptor) and send responses (by writing to the connection file descriptor).
- When the server is finished with the connection, it can close the file descriptor of the connection.
- When the server is finished with the listening socket, it can close that socket's descriptor. (It should also delete the file system path which was bound to the socket.)
If the server process terminates, the operating system automatically closes any open connection file descriptors and any open socket descriptors. However, the file system path which was bound to the socket is not deleted automatically.
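The server steps above can be sketched in Rust roughly as follows. This is an illustration under some assumptions rather than a definitive implementation: `UnixListener::bind` in the standard library covers creating, binding and listening in one call, the four-byte "ping" request and upper-cased reply are invented for the example, and the names `serve_once` and `run_server_demo` are hypothetical. The client runs in a thread of the same process purely to keep the example self-contained.

```rust
use std::io::{Read, Write};
use std::os::unix::io::AsRawFd;
use std::os::unix::net::{UnixListener, UnixStream};
use std::path::Path;
use std::thread;

/// Accepts one connection, reads a 4-byte request and replies with it upper-cased.
/// Returns true if the connection's descriptor differs from the listening socket's.
fn serve_once(listener: &UnixListener) -> std::io::Result<bool> {
    let (mut conn, _client_addr) = listener.accept()?;
    let distinct = listener.as_raw_fd() != conn.as_raw_fd();
    let mut request = [0u8; 4];
    conn.read_exact(&mut request)?;
    conn.write_all(&request.to_ascii_uppercase())?;
    Ok(distinct)
    // `conn` is dropped here, which closes the connection's descriptor.
}

/// Returns (reply seen by the client, descriptors distinct?, path left after close?).
fn run_server_demo(path: &Path) -> std::io::Result<(Vec<u8>, bool, bool)> {
    // Binding fails if the path already exists, so clear any stale one first.
    let _ = std::fs::remove_file(path);
    // Create the socket, bind it to the path and set it listening.
    let listener = UnixListener::bind(path)?;

    let client_path = path.to_path_buf();
    let client = thread::spawn(move || -> std::io::Result<Vec<u8>> {
        let mut stream = UnixStream::connect(&client_path)?;
        stream.write_all(b"ping")?;
        let mut reply = [0u8; 4];
        stream.read_exact(&mut reply)?;
        Ok(reply.to_vec())
    });

    let distinct = serve_once(&listener)?;
    let reply = client.join().expect("client thread panicked")?;

    // Closing the listening socket does not remove the path it was bound to...
    drop(listener);
    let path_left_behind = path.exists();
    // ...so the server deletes it explicitly.
    std::fs::remove_file(path)?;
    Ok((reply, distinct, path_left_behind))
}

fn main() -> std::io::Result<()> {
    // Hypothetical socket path in the temporary directory.
    let path = std::env::temp_dir().join(format!("unix-server-demo-{}.sock", std::process::id()));
    let (reply, distinct, left_behind) = run_server_demo(&path)?;
    println!("client received: {}", String::from_utf8_lossy(&reply));
    println!("distinct descriptors: {distinct}, path left after close: {left_behind}");
    Ok(())
}
```

Note that the path has to be short enough to fit in the socket address structure, which is why the sketch sticks to a brief file name.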
See the following sequence diagram for a summary:
Client side Unix socket
To connect to a server socket bound to a given file system path, a client:
- Creates a socket. A socket descriptor is returned.
- Connects its socket to the server, passing the file system path. In the C API, connect() returns only a success or failure status: the client's socket descriptor itself becomes the client end of the connection. (Some higher-level APIs combine these two steps and return the connected socket.)
- Sends requests to the server (by writing to the descriptor) and receives responses (by reading from the same descriptor).
- Closes the descriptor when it is finished with the connection. Unlike the server, the client has no separate listening socket to close.
If the client process terminates, the operating system automatically closes any open connection file descriptors and any open socket descriptors.
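Here is a sketch of the client side in Rust, again using only the standard library and a handful of assumptions. `UnixStream::connect` creates the socket and connects it in a single call. The stand-in server, which replies with the request reversed, exists only so the example can run on its own. The names `send_request` and `spawn_reversing_server`, and the convention of shutting down the write half to mark the end of a request, are my own choices for illustration.

```rust
use std::io::{Read, Write};
use std::net::Shutdown;
use std::os::unix::net::{UnixListener, UnixStream};
use std::path::Path;
use std::thread;

/// Client: connects to the server at `path`, sends a request and
/// returns the complete response.
fn send_request(path: &Path, request: &[u8]) -> std::io::Result<Vec<u8>> {
    // Create a socket and connect it to the path the server is bound to.
    let mut stream = UnixStream::connect(path)?;
    stream.write_all(request)?;
    // Mark the end of the request; the connection stays open for reading.
    stream.shutdown(Shutdown::Write)?;
    let mut response = Vec::new();
    stream.read_to_end(&mut response)?;
    Ok(response)
    // `stream` is dropped here, which closes the client's descriptor.
}

/// Stand-in server (an assumption of this sketch): accepts one connection
/// and sends the request back reversed.
fn spawn_reversing_server(path: &Path) -> std::io::Result<thread::JoinHandle<std::io::Result<()>>> {
    let _ = std::fs::remove_file(path);
    let listener = UnixListener::bind(path)?;
    Ok(thread::spawn(move || -> std::io::Result<()> {
        let (mut conn, _) = listener.accept()?;
        let mut request = Vec::new();
        conn.read_to_end(&mut request)?;
        request.reverse();
        conn.write_all(&request)
    }))
}

fn main() -> std::io::Result<()> {
    // Hypothetical socket path in the temporary directory.
    let path = std::env::temp_dir().join(format!("unix-client-demo-{}.sock", std::process::id()));
    let server = spawn_reversing_server(&path)?;
    let response = send_request(&path, b"hello")?;
    server.join().expect("server thread panicked")?;
    std::fs::remove_file(&path)?;
    println!("{}", String::from_utf8_lossy(&response)); // prints "olleh"
    Ok(())
}
```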
See the following sequence diagram for a summary:
References
The following references were useful in writing this post.
Text book
- “Internetworking with TCP/IP: Principles, protocols, and architectures” (volume 1, fourth edition) by Douglas E. Comer. Highly recommended as an authoritative source on TCP/IP networking.
Man pages
Unix Programmer's Manual: Supplementary Documents 1
Rust code
While writing this post, I wrote a little example of how to use Unix sockets in Rust.
Conclusion
Having understood Unix sockets a bit better, I'm going to circle back to ipc-channel and look for similarities to the programming model for one-shot servers. But let's save that for another post.
#Rust #IpcChannel #SoftwareArchaeology
Footnotes:
¹ git bisect is really handy for finding where something first appears in a codebase. You just have to remember to use git bisect bad for an occurrence of the thing and git bisect good for an earlier commit which didn't include the thing. That's because git bisect aims to find where a bug was introduced, and bugs are always bad. Alternatively, grab a couple of scripts I wrote to encapsulate this.
² For more detail, see Document one shot server rationale.
³ Until writing this post, I was under the impression that a new server port would be allocated for each new connection, but it turns out that the server's listen port (e.g. 80) is used. As a friend of mine says, in a New Jersey accent, “you live and loin”.
Feedback: email me.