Saturday, February 2, 2013

Fixing IRC....

From time to time I see stories around the Internet about IRC, along with various comments about how it should just "be fixed", as if that were something easy to do. Unfortunately, "fixing" IRC is quite hard. Why, you ask?

IRC (Internet Relay Chat) works by sending messages between clients connected to the edge of a network of connected servers. The network might consist of as few as one server or as many as a hundred. Each server passes messages on to other servers or clients depending on various fields in the IRC message. The goal is near-instantaneous transmission of text messages from one client to another. That the transmission isn't instantaneous is where the problems begin.

To think about an IRC network in a more abstract sense, it can be considered a large distributed database in which each IRC server holds its own copy. The database contains information about the other IRC servers connected to the network, all of the IRC clients connected to the network and which server each one is connected to, and lastly all of the channels (chat rooms) that each client is a member of. All of this information can be used to build a graph that maps the path between any two clients at a given point in time.
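As a rough illustration of that database view, here is a minimal Python sketch of the state one server might hold and of how the path between two clients can be derived from it. The class name, method names and sample server names are made up for the example and do not come from any real IRC server.

```python
from collections import deque


class NetworkState:
    """One server's copy of the network database (hypothetical layout)."""

    def __init__(self):
        self.links = {}     # server name -> set of directly linked server names
        self.clients = {}   # nickname -> name of the server the client is attached to
        self.channels = {}  # channel name -> set of member nicknames

    def add_link(self, a, b):
        # Server links are bidirectional.
        self.links.setdefault(a, set()).add(b)
        self.links.setdefault(b, set()).add(a)

    def add_client(self, nick, server):
        self.links.setdefault(server, set())
        self.clients[nick] = server

    def join(self, nick, channel):
        self.channels.setdefault(channel, set()).add(nick)

    def path(self, src_nick, dst_nick):
        """Breadth-first search for the server path between two clients."""
        start, goal = self.clients[src_nick], self.clients[dst_nick]
        previous = {start: None}
        queue = deque([start])
        while queue:
            server = queue.popleft()
            if server == goal:
                route = []
                while server is not None:
                    route.append(server)
                    server = previous[server]
                return route[::-1]
            for neighbour in sorted(self.links[server]):
                if neighbour not in previous:
                    previous[neighbour] = server
                    queue.append(neighbour)
        return None  # the two clients are on disconnected halves


state = NetworkState()
state.add_link("hub", "leaf1")
state.add_link("hub", "leaf2")
state.add_client("alice", "leaf1")
state.add_client("bob", "leaf2")
state.join("alice", "#chat")
route = state.path("alice", "bob")
```

With the sample data above, a message from alice to bob travels from leaf1 through hub to leaf2.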

When a new client connects to the IRC network, its presence is announced by a message being distributed throughout the IRC network by the server that it first connects to. Similarly, messages that represent the client joining a channel are broadcast throughout the network, and so is the message sent out when the client disconnects, which effectively removes the client from the database.
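The announcements described above can be sketched as events that are flooded over the server links and applied to every server's copy of the database. This is only an illustration: the event names CONNECT, JOIN and QUIT and the dictionary layout are assumptions made for the example, not the real server-to-server message format.

```python
def apply_event(db, event):
    """Apply one presence event to a server's local copy of the database."""
    kind, nick, arg = event
    if kind == "CONNECT":    # arg: the server the client attached to
        db["clients"][nick] = arg
    elif kind == "JOIN":     # arg: the channel name
        db["channels"].setdefault(arg, set()).add(nick)
    elif kind == "QUIT":     # arg is unused
        db["clients"].pop(nick, None)
        for members in db["channels"].values():
            members.discard(nick)


def broadcast(links, databases, origin, event):
    """Flood an event outwards from the origin server, applying it once per server."""
    seen = {origin}
    queue = [origin]
    while queue:
        server = queue.pop(0)
        apply_event(databases[server], event)
        for peer in sorted(links[server]):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)


links = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
databases = {name: {"clients": {}, "channels": {}} for name in links}

broadcast(links, databases, "a", ("CONNECT", "alice", "a"))
broadcast(links, databases, "a", ("JOIN", "alice", "#chat"))
after_join = {name: dict(db["clients"]) for name, db in databases.items()}

broadcast(links, databases, "a", ("QUIT", "alice", None))
```

After the first two broadcasts every server knows about alice; after the QUIT none of them do.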

When a new server connects to the IRC network, its name is announced to all of the other servers in the IRC network, similar to what happens with an IRC client. However, it must also receive a complete dump of the IRC database from the server it connected to. When two servers join together, each bringing its own IRC network, the fun begins.

If an IRC network becomes divided because a connection between IRC servers is severed, then clients are still allowed to connect, join, leave and so on, so that the two parts can continue functioning as chat networks. This means that it is possible for two clients, one connected to each half, to both choose (for example) the same chat name. When the networks join back together, each IRC server will see a name collision. This problem is partially overcome with the use of time stamps - synchronised time stamps - but this starts to highlight the problem.
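To make the collision concrete, here is a small Python sketch of merging the name tables of two halves using time stamps. The rule used here (the older name wins, and an exact tie removes both clients) is an assumption chosen for the example; real networks differ in the details, and the function name and table layout are hypothetical.

```python
def merge_nicks(side_a, side_b):
    """Merge two name tables of the form nick -> (timestamp, server).

    Returns the merged table and a list of (nick, server) pairs for the
    clients that lost a collision and have to be removed.
    """
    merged = dict(side_a)
    dropped = []
    for nick, (ts, server) in side_b.items():
        if nick not in merged:
            merged[nick] = (ts, server)       # no collision
            continue
        other_ts, other_server = merged[nick]
        if ts < other_ts:                     # newcomer is older: it wins
            dropped.append((nick, other_server))
            merged[nick] = (ts, server)
        elif ts > other_ts:                   # existing entry is older: it wins
            dropped.append((nick, server))
        else:                                 # identical time stamps: nobody wins
            dropped.append((nick, other_server))
            dropped.append((nick, server))
            del merged[nick]
    return merged, dropped


half_one = {"alice": (100, "s1"), "bob": (200, "s1"), "eve": (50, "s1")}
half_two = {"alice": (150, "s2"), "carol": (120, "s2"), "eve": (50, "s2")}
merged, dropped = merge_nicks(half_one, half_two)
```

This only gives the same answer on every server if their clocks agree, which is why the time stamps need to be synchronised.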

The problem for IRC is that it is a distributed database with clients attached to each database server that want to exchange messages with each other via the servers, and not just interact with the server to which they are directly attached. It is a distributed database without any concept of a master server, which makes the network of IRC servers functionally similar to a hive: the knowledge about the environment is shared, with each and every member knowing everything there is to know.

Whilst consideration has been given to using multicast, it requires support across backbone routers, and it brings another problem to the fore: reliability. It is essential that a message announcing that a client is joining a chat room reaches all of the participants before any messages sent to the chat room as part of the group communication. Similarly, a message announcing a client withdrawing from the network must be delivered after all of the other messages from that client. This rules out using simple protocols that are layered on UDP. To meet all of these usability concerns, TCP is the transport layer protocol that delivers all of the IRC data.
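The ordering requirement can be seen in a toy Python receiver. It assumes, purely for illustration, that a chat room message from someone who is not currently known to be a member is rejected; the event names and the function are hypothetical.

```python
def deliver(events):
    """Process (kind, nick, text) events in arrival order.

    Returns the messages shown to the room and the messages rejected
    because the sender was not a known member when they arrived.
    """
    members = set()
    shown, rejected = [], []
    for kind, nick, text in events:
        if kind == "JOIN":
            members.add(nick)
        elif kind == "QUIT":
            members.discard(nick)
        elif kind == "MSG":
            target = shown if nick in members else rejected
            target.append((nick, text))
    return shown, rejected


in_order = [("JOIN", "alice", ""), ("MSG", "alice", "hi"), ("QUIT", "alice", "")]
reordered = [("MSG", "alice", "hi"), ("JOIN", "alice", ""), ("QUIT", "alice", "")]

ordered_result = deliver(in_order)
reordered_result = deliver(reordered)
```

Delivered in sequence, the message is shown; when the first two datagrams swap places, as an unordered transport allows, the same message is lost.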

So there you have it: fixing IRC requires developing a new database transport protocol that is lightweight, reliable, provides sequenced delivery of data from one side of the Internet to the other, and allows databases to merge with potentially conflicting data.