Zoom Architecture illustrated

What is Zoom?

Zoom provides video conferencing for collaboration, education, and other communications. It integrates video with other services, such as voice and messaging, in one easy-to-use, reliable, high-quality platform.

What is Zoom architecture?

The Zoom architecture consists of client-side apps and server-side infrastructure. The Zoom client is the software installed on users’ computers or devices to connect to the servers. The servers are the hardware and software that host meetings, route traffic, and provide the associated services. Zoom servers are located in data centers, the public cloud, and corporate networks.

[Figure: Zoom architecture]

Table of Contents

  1. Zoom Architecture
  2. Distributed cloud-native infrastructure
  3. Network technologies
  4. Security
  5. Back-of-the-envelope calculation

1. Zoom Architecture

The key components are the Zoom client, the data centers, on-premises servers, the public cloud, and the web infrastructure.

1. The Zoom client is the software on computers and devices that accesses the Zoom servers. It performs video content processing, encoding, and decoding. It also applies network Quality of Service (QoS) controls.

2. The data centers are where the Zoom servers (virtual machines) host the meetings. The servers are organized into Meeting Zones. Inside each zone, Multimedia Routers (MMR) route stream traffic and optimize it dynamically. Zone Controllers (ZC) manage and orchestrate all activity within a given Meeting Zone and report its status to the Global Cloud Controller. HTTP Tunnels (HT) serve as connection points for clients that cannot connect to Zoom through other network channels.

3. Some Zoom servers are located inside the corporate network (on-premises deployment). A firewall sits between the internal network and the public cloud. Internal users connect directly to the internal Zoom servers, which avoids public traffic and gives a higher degree of security and performance.

4. The public cloud provides web application services and notification services. The cloud controllers inside the public cloud sync meetings between the public cloud and the data centers.

5. The web infrastructure hosts the zoom.us website, the SDK for external developers who want to build systems around Zoom, and various other components.
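The division of labor inside a Meeting Zone can be sketched in a few lines of Python. This is only an illustration of the roles described above, not Zoom's implementation: the class names, the `capacity` field, the least-loaded placement rule, and the shape of the status report are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MultimediaRouter:
    """Hypothetical stand-in for an MMR that hosts meetings and routes their streams."""
    name: str
    capacity: int  # assumed maximum number of meetings on this router
    meetings: set = field(default_factory=set)

    @property
    def load(self) -> float:
        return len(self.meetings) / self.capacity


@dataclass
class ZoneController:
    """Hypothetical stand-in for a ZC that orchestrates the MMRs of one Meeting Zone."""
    zone: str
    routers: list

    def place_meeting(self, meeting_id: str) -> MultimediaRouter:
        # Assumed policy: use the least-loaded router that still has room.
        candidates = [r for r in self.routers if len(r.meetings) < r.capacity]
        if not candidates:
            raise RuntimeError(f"zone {self.zone} is full")
        router = min(candidates, key=lambda r: r.load)
        router.meetings.add(meeting_id)
        return router

    def status_report(self) -> dict:
        # The kind of summary a ZC might send to the Global Cloud Controller.
        return {
            "zone": self.zone,
            "routers": {r.name: len(r.meetings) for r in self.routers},
        }


zone = ZoneController("zone-a", [MultimediaRouter("mmr-1", 2), MultimediaRouter("mmr-2", 2)])
first = zone.place_meeting("m1")
second = zone.place_meeting("m2")
report = zone.status_report()
```

Keeping placement in the controller and stream handling in the router mirrors the split in the list above: the routers stay simple, and only the controller needs a view of the whole zone.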

2. Distributed cloud-native infrastructure

Zoom built its own distributed cloud-native infrastructure. Cloud-native means the system is architected to use cloud technology from the ground up. Its microservice design lets developers grow capacity seamlessly.

1. Data centers

Since 2011, Zoom has built geographically distributed data centers around the world. A network of private links connects the data centers to one another. Users connect to the data center closest to their location. Real-time video conferencing is served from these data centers or from servers inside corporate networks.
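As a rough illustration of "connect to the closest data center", the sketch below picks the nearest site by great-circle distance. The site names and coordinates are hypothetical, and using geographic distance is itself an assumption: a real client would more plausibly measure latency to each candidate.

```python
import math

# Hypothetical data center locations as (latitude, longitude) in degrees.
DATA_CENTERS = {
    "san-jose": (37.34, -121.89),
    "new-york": (40.71, -74.01),
    "amsterdam": (52.37, 4.90),
    "tokyo": (35.68, 139.69),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def closest_data_center(user_location):
    """Return the name of the data center nearest to the user."""
    return min(
        DATA_CENTERS,
        key=lambda name: haversine_km(user_location, DATA_CENTERS[name]),
    )
```

For example, `closest_data_center((48.86, 2.35))` (a user in Paris) selects `"amsterdam"` from this hypothetical list.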

[Figure: Zoom distributed system]

In 2015, Zoom partnered with Equinix, at which point it had 13 data centers around the world. Since then, Equinix has supported Zoom's key business segments, specifically the financial services and government sectors. As of April 2020, Zoom had 17 data centers.

2. Cloud

Zoom has also been using a public cloud (AWS) to host meeting metadata, web applications, and other services. During the pandemic, Zoom used AWS for real-time traffic as well. It also expanded to Oracle Cloud to support some educational users.

3. Network technologies

1. Multimedia routing

Zoom’s video-first client/server architecture separates video content processing from stream routing. Video processing happens on the client side, which encodes and decodes dynamically based on network performance and bandwidth. Zoom uses several network protocols (UDP, TCP, TLS, and HTTPS) rather than UDP alone. When a meeting has only two participants, the clients connect peer-to-peer. On the server side, the Multimedia Router determines the optimal paths to connect the participants. Because the clients share the computing work, the servers do less per stream, and each meeting can support more participants.
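The two decisions described here, which transport to use and whether to go peer-to-peer, can be sketched as follows. The preference order UDP, TCP, TLS, HTTPS is an assumption for the example (the text only says that all four are used), and both function names are hypothetical.

```python
# Assumed preference order; the article only states that all four protocols are used.
TRANSPORT_PREFERENCE = ["UDP", "TCP", "TLS", "HTTPS"]


def pick_transport(reachable):
    """Return the most preferred transport among those the client's network allows."""
    for protocol in TRANSPORT_PREFERENCE:
        if protocol in reachable:
            return protocol
    raise ConnectionError("no usable transport")


def routing_mode(participant_count):
    """Two participants talk directly; larger meetings go through a Multimedia Router."""
    if participant_count < 2:
        raise ValueError("a meeting needs at least two participants")
    return "peer-to-peer" if participant_count == 2 else "multimedia-router"
```

A client behind a restrictive firewall that only permits web traffic would end up with `pick_transport({"HTTPS"})`, which returns `"HTTPS"`, while an unrestricted client gets `"UDP"`.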

2. Multi-bitrate encoding

Video and audio are converted into digital data, compressed, and encoded at a given bitrate. The higher the bitrate, the less the compression and the better the audio or video quality. Zoom’s multi-bitrate encoding uses a single stream with multiple layers: the one stream carries every resolution and bitrate a receiver might need, so it can adapt to multiple resolutions by itself. This eliminates the need to encode and decode separate streams for each endpoint and improves quality and reliability across different network environments and devices.
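A minimal sketch of how a receiver might consume such a layered stream is shown below. It assumes each layer builds on the ones beneath it, so the cost of a resolution is the cumulative bitrate of all layers up to it. The layer labels and bitrates are invented for illustration and are not Zoom's actual figures.

```python
# Hypothetical layers: (label, additional kbps needed on top of the lower layers).
LAYERS = [("180p", 150), ("360p", 350), ("720p", 1000)]


def select_layers(available_kbps):
    """Return the layers that fit the available bandwidth and their total bitrate."""
    chosen = []
    total = 0
    for label, extra_kbps in LAYERS:
        if total + extra_kbps > available_kbps:
            break  # higher layers depend on this one, so stop here
        total += extra_kbps
        chosen.append(label)
    return chosen, total
```

With these numbers, `select_layers(600)` returns `(["180p", "360p"], 500)`: the sender encodes once, and each receiver simply takes as many layers as its link can carry.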

3. QoS at application layer

Quality of service (QoS) covers the measurements and technologies used to manage traffic and ensure the quality of critical applications. The measurements cover several related aspects of the network, such as packet loss, bit rate, throughput, transmission delay, availability, and jitter. Most existing QoS solutions are deployed at the network layer. Zoom instead applies QoS at the application layer, on the client side. It uses proprietary algorithms to optimize video and audio and to prioritize the factors that matter most for the particular type of device.
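Two of the measurements listed above are easy to compute on the client. The sketch below is not Zoom's proprietary algorithm. It shows a plain packet loss ratio and the interarrival jitter estimator defined for RTP in RFC 3550, with hypothetical function names and timestamps in milliseconds.

```python
def packet_loss_ratio(received_seq, expected_count):
    """Fraction of expected packets that never arrived (duplicates are ignored)."""
    return 1 - len(set(received_seq)) / expected_count


def interarrival_jitter(send_times, recv_times):
    """Running jitter estimate as in RFC 3550: J += (|D| - J) / 16."""
    jitter = 0.0
    for i in range(1, len(send_times)):
        # D is the change in transit time between consecutive packets.
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter
```

An application-layer QoS loop could feed values like these into its encoder settings, for instance lowering the bitrate when loss or jitter rises.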

4. Security

Several security issues came to light during the pandemic. Zoom moved quickly to fix them and to clear up the resulting confusion.

1. End-to-end encryption (sort of)

End-to-end encryption means encrypting messages on one device such that only the device to which they are sent can decrypt them. The alternative is encryption in transit, whereby messages are encrypted on the sender’s end, delivered to the server, decrypted there, re-encrypted, and then delivered to the recipient and decrypted on their end.
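The practical difference is what the server gets to see. The sketch below makes that visible with a deliberately simple toy cipher (XOR against a SHA-256-derived keystream). The cipher is an assumption chosen to keep the example dependency-free, it is NOT secure, and none of the function names correspond to a real Zoom API.

```python
import hashlib


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a keystream derived from the key. Toy only, NOT secure.

    Applying it twice with the same key returns the original data.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def send_in_transit(message, sender_server_key, server_recipient_key):
    """Encryption in transit: the server decrypts and re-encrypts each message."""
    on_wire = toy_cipher(sender_server_key, message)
    seen_by_server = toy_cipher(sender_server_key, on_wire)  # plaintext on the server
    forwarded = toy_cipher(server_recipient_key, seen_by_server)
    received = toy_cipher(server_recipient_key, forwarded)
    return seen_by_server, received


def send_end_to_end(message, shared_key):
    """End-to-end: the server only relays ciphertext it has no key for."""
    ciphertext = toy_cipher(shared_key, message)
    seen_by_server = ciphertext
    received = toy_cipher(shared_key, ciphertext)
    return seen_by_server, received
```

In both cases the recipient recovers the message. Only in the first case does `seen_by_server` equal the plaintext.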

[Figure: Zoom encryption]

Zoom previously offered TLS encryption for its clients, the same encryption used by secure HTTPS web connections. But the data was encrypted only between each meeting participant and Zoom’s servers, so technically it was not full “end-to-end encryption”. Oded Gal (CPO) clarified that Zoom’s “end-to-end encryption” applies to its own clients, such as PC, Mac, iOS, Android, and Zoom Rooms, but not to the web client (used, for example, by Chromebook users) or to third-party clients that use the SDK, telephones, SIP/H.323 devices, on-premises configurations, or Lync/Skype clients, which Zoom says cannot be end-to-end encrypted.

The encryption keys come from the “key management system” in Zoom’s cloud infrastructure. Companies can choose to move the key management system into their corporate network (when available).

2. Meeting control and identity management

In March 2020, news outlets reported intruders hijacking Zoom classes, a practice known as “zoombombing”. Since then, Zoom has added protective features to prevent it, including waiting rooms, passwords, muting controls, administrative controls, role-based access controls, and limits on screen sharing.
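How a password, a waiting room, and role-based control combine can be sketched as a small admission flow. The class, method, and role names here are hypothetical and only illustrate the idea. The one real library call is `hmac.compare_digest`, used for a constant-time password comparison.

```python
import hmac
from dataclasses import dataclass, field


@dataclass
class Meeting:
    """Hypothetical meeting with a password and an optional waiting room."""
    password: str
    waiting_room_enabled: bool = True
    waiting: list = field(default_factory=list)
    admitted: list = field(default_factory=list)

    def request_join(self, user: str, password: str) -> str:
        # A wrong password is rejected before the user reaches the waiting room.
        if not hmac.compare_digest(password.encode(), self.password.encode()):
            return "rejected"
        if self.waiting_room_enabled:
            self.waiting.append(user)
            return "waiting"
        self.admitted.append(user)
        return "admitted"

    def admit(self, role: str, user: str) -> None:
        # Role-based control: only hosts and co-hosts may let people in.
        if role not in {"host", "co-host"}:
            raise PermissionError("only a host or co-host can admit participants")
        self.waiting.remove(user)
        self.admitted.append(user)
```

An uninvited guest now needs both the password and an explicit admit from a host, which is the layering that makes zoombombing harder.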

5. Back-of-the-envelope calculation (2020)

Number of meetings per day: 300 million
Number of participants per day: 100 to 300 billion
Number of participants per meeting: 300 to 500 (Enterprise users); 100 to 200 (Basic or Pro users)
Number of meetings per server (VM): 100 to 200
Number of data centers: 17
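The figures above can be turned into a rough server estimate. The calculation below takes the meetings-per-day and meetings-per-server values from the table as given and adds one assumption that is not in the source: that 10% of the day's meetings run concurrently at peak. The even split across the 17 data centers is also a simplification, since it ignores the public-cloud capacity mentioned earlier.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


MEETINGS_PER_DAY = 300_000_000       # from the table
MEETINGS_PER_SERVER = (100, 200)     # low and high capacity per VM, from the table
DATA_CENTERS = 17                    # from the table
PEAK_CONCURRENCY_PERCENT = 10        # ASSUMPTION, not from the source

concurrent_meetings = MEETINGS_PER_DAY * PEAK_CONCURRENCY_PERCENT // 100

# Fewer servers are needed when each server holds more meetings.
servers_needed = (
    ceil_div(concurrent_meetings, MEETINGS_PER_SERVER[1]),
    ceil_div(concurrent_meetings, MEETINGS_PER_SERVER[0]),
)
servers_per_data_center = tuple(ceil_div(s, DATA_CENTERS) for s in servers_needed)

print(f"concurrent meetings at peak: {concurrent_meetings:,}")
print(f"servers needed: {servers_needed[0]:,} to {servers_needed[1]:,}")
print(f"per data center: {servers_per_data_center[0]:,} to {servers_per_data_center[1]:,}")
```

Under these assumptions the estimate comes to 150,000 to 300,000 VMs overall, or roughly 8,800 to 17,600 per data center. Changing `PEAK_CONCURRENCY_PERCENT` scales the result linearly.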
