What are WebSockets?

A brief introduction to WebSockets and their applications

Photo by Greg Rosenke on Unsplash

The term “sockets” in computer science has been around for quite some time, and it generally refers to a software interface that two processes (on the same machine or on different endpoints somewhere on the network) use to communicate with each other. Think of it as a door through which processes send and receive messages. Similarly, WebSockets were created to add real-time communication between servers and clients in web applications. Does that mean that real-time communication didn’t exist before the WebSocket protocol? We will dive into that next, then talk more about WebSockets and what they are used for.

Background

The web has long used the HTTP protocol to allow communication between a client and a server through a request-response model. The way this works is that a client sends a request over a TCP/IP connection to a server asking for a specific resource. The request always starts with a request line that specifies the HTTP method, the target resource, and the protocol version. It is followed by headers, a set of key-value pairs that, among other things, inform the server about what kinds of responses the client understands. Upon receiving the request, the server responds with the appropriate resource. That being said, the server can’t push new data to the client unless the client makes a new request. This was not a big deal in the early days of the web, when real-time communication was nothing but a far-fetched dream and websites were only required to serve simple assets. Nowadays, however, many web applications are expected to include advanced features such as real-time chat rooms, publish/subscribe models, or stock exchange boards.

HTTP 1.1 Request-Response Cycle

Initially, this was dealt with by adapting the tools that were available back then. One of these solutions was the XMLHttpRequest object, which allows a client to make requests to a server without refreshing the web page. Browsers widely adopted it, as it greatly improved user interaction with web applications. A while later, the term AJAX came to life to describe making asynchronous requests to servers from JavaScript in this way. In order to simulate real-time communication, developers came up with a way to send AJAX requests automatically at a pre-set time interval. For example, a client would make an AJAX request to the server every 2 seconds to fetch new data. This was known as short polling.

HTTP Short-Polling
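To make the cost of short polling concrete, here is a minimal sketch in Python that simulates it without any real network. The names `make_fake_server` and `short_poll` are made up for this illustration, and the fake server simply decides which request number a new message shows up on.

```python
import time
from typing import Callable, Dict, List, Tuple


def make_fake_server(arrivals: Dict[int, str]) -> Callable[[int], List[str]]:
    """Build a stand-in for the server (hypothetical, no real network involved).

    `arrivals` maps a request number (0-based) to a message that becomes
    available just before that request is handled.
    """
    log: List[str] = []
    request_count = 0

    def fetch(cursor: int) -> List[str]:
        nonlocal request_count
        if request_count in arrivals:
            log.append(arrivals[request_count])
        request_count += 1
        return log[cursor:]  # everything the client has not seen yet

    return fetch


def short_poll(
    fetch: Callable[[int], List[str]], interval_s: float, max_polls: int
) -> Tuple[List[str], int]:
    """Ask the server for news every `interval_s` seconds, `max_polls` times."""
    received: List[str] = []
    empty_polls = 0
    for _ in range(max_polls):
        new_items = fetch(len(received))  # one full request/response cycle
        if new_items:
            received.extend(new_items)
        else:
            empty_polls += 1  # a round trip that brought nothing back
        time.sleep(interval_s)
    return received, empty_polls


# The interval is 0 here so the sketch runs instantly; the example in the
# text uses 2 seconds.
fetch = make_fake_server({1: "price: 101", 4: "price: 102"})
messages, wasted = short_poll(fetch, interval_s=0.0, max_polls=6)
print(messages, wasted)
```

In this run the client makes six requests to receive two messages, so four round trips come back empty. That wasted traffic is the weakness of short polling.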

Another, more widely adopted solution was to make a request to the server and keep the connection open until the server responds with some data. Once that cycle was completed, the client would initiate another request, and so on. Developers called this technique long polling: the server holds each request open until the requested data becomes available.

HTTP Long-Polling
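Long polling can be sketched in the same spirit. In the Python example below, a `queue.Queue` stands in for the data the server is waiting on, and a blocking `get` with a timeout plays the role of the held request. The function names and the timings are assumptions made for the illustration, not part of any real server.

```python
import queue
import threading
from typing import List, Optional, Tuple


def hold_request(inbox: "queue.Queue[str]", timeout_s: float) -> Optional[str]:
    """Server side: keep the request open until data arrives or time runs out."""
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        return None  # timed out with nothing to send


def long_poll(
    inbox: "queue.Queue[str]", timeout_s: float, expected: int
) -> Tuple[List[str], int]:
    """Client side: re-issue a request as soon as the previous one completes."""
    received: List[str] = []
    requests_made = 0
    while len(received) < expected:
        requests_made += 1
        message = hold_request(inbox, timeout_s)
        if message is not None:
            received.append(message)
    return received, requests_made


# Two updates become available shortly after the client starts waiting.
inbox: "queue.Queue[str]" = queue.Queue()
threading.Timer(0.05, inbox.put, args=("first update",)).start()
threading.Timer(0.10, inbox.put, args=("second update",)).start()

updates, request_count = long_poll(inbox, timeout_s=2.0, expected=2)
print(updates, request_count)
```

Each message still costs one request, but no request comes back empty unless the timeout expires. The price is that the server keeps a request open the whole time the client is waiting.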

While those half-duplex, unidirectional techniques seemed to get the job done, they had a lot of limitations, especially when dealing with more complex scenarios. This was mainly due to the high overhead involved in constantly sending requests and tying up server resources. This led to the creation of “WebSockets”, a term coined in 2008 by its founding fathers Michael Carter and Ian Hickson.

Okay so…what are WebSockets?

WebSocket is an application-layer protocol that allows for full-duplex, bidirectional communication with minimal overhead. In layman’s terms, full-duplex means that both the client and the server can send messages at the same time. WebSockets use HTTP to establish a connection between a server and a client and then keep the underlying TCP/IP connection open to transmit data. Note that WebSockets are NOT an alternative to HTTP but rather an upgrade. Accordingly, the request a client sends to switch to the WebSocket protocol looks a lot like a regular HTTP request. However, there are some new key-value pairs added to both the request and the response headers. Below is an example of a typical handshake to upgrade to WebSockets.

Client handshake request

GET /chat HTTP/1.1
Host: sample.com:3000
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: somekey
Sec-WebSocket-Version: someversion

Server handshake response

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: somehash
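The “somekey” and “somehash” placeholders above are related: per RFC 6455, the server derives Sec-WebSocket-Accept by appending a fixed GUID to the client’s Sec-WebSocket-Key, hashing the result with SHA-1, and Base64-encoding the digest. Here is a minimal sketch of that step in Python. The function name `compute_accept` is made up for this illustration; the GUID and the sample key/accept pair come from the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from a Sec-WebSocket-Key value."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key taken from the example in RFC 6455.
print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The client performs the same computation and compares the result with the header it received. This proves that the server understood the WebSocket handshake rather than replying to it like an ordinary HTTP request.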

The completion of this handshake indicates that both the client and the server have agreed to keep the TCP/IP connection running until one of the sides decides to terminate it. Therefore, we now have a persistent bi-directional connection. As you may have guessed, this allows the server to push new data as soon as it becomes available without having to wait for the client to explicitly request it. This paves the way for web applications to add a lot of complex features that require real-time communication.

WebSocket Protocol

In what scenarios should WebSockets be used?

Attempting to use WebSockets for applications that require only occasional communication is simply overkill. The advantages won’t justify the cost of keeping a persistent connection open for every client, which can become a serious issue when trying to scale up to handle thousands of connections at the same time. However, there are many scenarios where WebSockets would be the perfect solution. If you are building an application like one of the following, then upgrading to WebSockets would be the way to go:

  1. A chat application that requires the client to have a fast reaction time.
  2. A stock-price application where a server is required to push new data to all the connected clients as soon as new data is available.
  3. A food delivery application with real-time location support, where the message exchange frequency between the client and the server is expected to be relatively high. Recall that WebSockets are highly efficient in the sense that they don’t have to keep establishing new connections. With polling, every exchange carries a full set of request/response headers, an overhead that is minimized when WebSockets are used instead.
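To get a rough feel for the header overhead mentioned in the last item, the back-of-the-envelope sketch below compares the two approaches in Python. The frame header sizes follow the framing rules in RFC 6455. The 500 bytes of HTTP headers per poll, the 1000 updates, and the 50-byte payload are assumptions picked for illustration, not measurements.

```python
def ws_frame_header_bytes(payload_len: int, masked: bool) -> int:
    """Header size of one unfragmented WebSocket frame, per RFC 6455 framing."""
    if payload_len <= 125:
        header = 2       # base header, length fits in 7 bits
    elif payload_len <= 0xFFFF:
        header = 2 + 2   # plus a 16-bit extended length
    else:
        header = 2 + 8   # plus a 64-bit extended length
    # Frames sent from client to server carry a 4-byte masking key.
    return header + (4 if masked else 0)


# Assumed figures, for illustration only.
ASSUMED_HTTP_HEADER_BYTES = 500  # request + response headers for one poll
UPDATES = 1000                   # location updates sent by the client
PAYLOAD_BYTES = 50               # size of each update

polling_overhead = UPDATES * ASSUMED_HTTP_HEADER_BYTES
websocket_overhead = UPDATES * ws_frame_header_bytes(PAYLOAD_BYTES, masked=True)

print(polling_overhead, websocket_overhead)  # 500000 6000
```

Under these assumptions, the per-message overhead drops from hundreds of bytes to six. The comparison leaves out the one-time handshake and everything below the application layer, so treat it as an order-of-magnitude estimate only.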

It is quite important to note that while WebSockets may sound simple to work with, a lot of complexity gets introduced the moment a server reaches its limit, and scaling up is required. Implementing the protocol by itself rarely gets the job done and many complexities need to be taken care of by the developer. Examples of such complexities are restoring lost connections, authentication, and synchronizing messages between clients that are connected to different servers. Luckily, many developers out there have thought about such challenges and have written several libraries that not only implement the protocol but also add some features that would facilitate the implementation of WebSockets for more sophisticated scenarios.

WebSocket Libraries

There is a variety of open-source libraries available for different programming languages. They are mainly divided into two categories: libraries that are a pure implementation of the WebSocket protocol, and others that act as a wrapper around it. The latter build on top of an implementation to also include many features that are typically required by real-time applications. Two of the most popular WebSocket libraries are ws and Socket.io.

ws

This library does all the heavy lifting to allow a server to support the WebSocket protocol. It was designed to be used with Node.js, which does not support WebSockets on the server side out of the box. Because ws is a pure implementation of the WebSocket protocol, a server that uses it can communicate with any client that uses the browser’s WebSocket API. The downside of using ws is that the developer is expected to take care of all the complexities that were mentioned earlier.

Socket.io

Socket.io is a very popular library that not only implements the WebSocket protocol but also provides additional features that are commonly required by real-time applications. One of the main features of this library is its automatic fallback mechanism: if for some reason the server and the client fail to communicate using WebSockets, it falls back to long polling. This feature was especially valuable in the early days of WebSockets, when not all browsers supported the new protocol yet.

Other features include connection restoration, support for namespacing, and binary messaging (blobs, buffers, etc.). Namespacing is a great feature for use cases where clients need to be grouped based on permissions or common interests.

Although this library comes with many features, it does have some drawbacks. If a server is set up to support WebSocket connections using Socket.io, then only Socket.io clients can connect to it, and vice versa.

WebSocket Solutions (SaaS)

While many open-source libraries allow developers to implement WebSockets, there are some concerns that those libraries do not address. In larger applications, managing resources when using WebSockets can become troublesome. What happens after a server has received data from a client? In many scenarios, further data handling is required to read/write data to the database, check the data integrity, and so forth. Other issues include balancing the load between multiple servers, message ordering, and managing connections between several clients and servers. At this point, many businesses decide to go for readily available solutions to handle all these complexities introduced by applications requiring real-time functionality. There are several solutions in the market that support WebSockets such as Pusher and Ably.

While those solutions have their differences, they have a lot in common. Such solutions usually support the publisher/subscriber model, in which all the subscribed clients receive updates from the publisher as soon as data becomes available. They also handle message delivery in case a client gets disconnected for some time, which is especially important for mobile phone users. Other features include historical message retrieval and message synchronization across multiple servers.
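The publisher/subscriber model and the idea of catching up on missed messages can be illustrated with a small in-memory sketch in Python. It only shows the pattern. The `Broker` class, its methods, and the channel name are made up for this example and do not reflect how Pusher, Ably, or any other product is implemented.

```python
from typing import Callable, Dict, List

Subscriber = Callable[[str], None]


class Broker:
    """Toy in-memory broker (hypothetical): fan-out plus per-channel history."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Subscriber]] = {}
        self._history: Dict[str, List[str]] = {}

    def subscribe(self, channel: str, callback: Subscriber) -> None:
        self._subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Store the message, push it to every subscriber, return how many got it."""
        self._history.setdefault(channel, []).append(message)
        subscribers = self._subscribers.get(channel, [])
        for callback in subscribers:
            callback(message)
        return len(subscribers)

    def history(self, channel: str, since: int = 0) -> List[str]:
        """Messages published on the channel, starting at index `since`."""
        return self._history.get(channel, [])[since:]


broker = Broker()
alice_inbox: List[str] = []
bob_inbox: List[str] = []

broker.subscribe("orders", alice_inbox.append)
broker.publish("orders", "order picked up")   # Bob is not connected yet
broker.subscribe("orders", bob_inbox.append)
broker.publish("orders", "order delivered")

# Bob can recover what he missed from the channel history.
bob_catch_up = broker.history("orders")
print(alice_inbox, bob_inbox, bob_catch_up)
```

A hosted service has to do the same job across many servers and unreliable connections, which is where ordering and synchronization become hard.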

Conclusion

To wrap up, the limitations of techniques that rely on the HTTP protocol and the XMLHttpRequest API to simulate real-time communication in web applications led to the creation of WebSockets. Unlike techniques that rely on the request/response model, such as long polling, WebSockets ensure that a) once a connection is established between a client and a server, it persists until one of the sides decides to terminate it, and b) a server can push data to a client without having to wait for a request to do so. This has many advantages over using the regular HTTP protocol: it offers full-duplex, bidirectional communication, has very low latency, and is highly efficient because the handshake has to be made only once.

