Zoom provides video conferencing for collaboration, education and other communications. It integrates video with other services, such as voice and messaging, in one easy-to-use, reliable and high-quality platform.
Zoom’s architecture consists of a client-side app and a server-side infrastructure. The Zoom client is the software installed on users’ computers or devices to connect to the servers. The servers are the hardware and software that host meetings, route traffic and provide the associated services. Zoom servers are located in data centers, in the public cloud and in corporate networks.
Table of Contents
- Zoom Architecture
- Distributed cloud-native infrastructure
- Network technologies
- Back-of-the-envelope calculation
1. Zoom Architecture
The key components are the Zoom client, the data centers, on-premise servers, the public cloud and the web infrastructure.
1. The Zoom client is the software on computers and devices that accesses the Zoom servers. It performs video content processing, encoding and decoding. It also provides network Quality of Service (QoS).
2. The data centers are where the Zoom servers (virtual machines) host the meetings. The servers are organized into Meeting Zones. Inside each zone, multimedia routers (MMR) handle routing and dynamic optimization of stream traffic. Zone controllers (ZC) manage and orchestrate all activity that occurs within a given Meeting Zone, and they report their status to the Global Cloud Controller. HTTP tunnels (HT) serve as connection points for clients that cannot reach Zoom through other network channels.
3. Some Zoom servers are located inside the corporate network (on-premise deployment). A firewall sits between the internal network and the public cloud. Internal users connect to the internal Zoom servers directly, which avoids public traffic and gives a higher degree of security and performance.
4. The public cloud provides the web application services and notification services. The cloud controllers inside the public cloud sync meetings between the public cloud and the data centers.
5. The web infrastructure hosts the zoom.us website, the SDK for external developers who want to build systems around Zoom, and various other components.
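To make the roles inside a Meeting Zone more concrete, here is a minimal sketch of a zone controller that places meetings on multimedia routers and summarizes its status for the Global Cloud Controller. The class names, the capacity counted in meetings, and the "most free capacity" placement policy are all assumptions made for illustration; they are not Zoom's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MultimediaRouter:
    """One MMR; capacity and load are counted in hosted meetings (an assumption)."""
    name: str
    capacity: int
    meetings: int = 0


@dataclass
class ZoneController:
    """Manages the MMRs of one Meeting Zone and summarizes their status."""
    zone: str
    routers: list = field(default_factory=list)

    def place_meeting(self) -> MultimediaRouter:
        # Hypothetical policy: pick the router with the most free capacity.
        candidates = [r for r in self.routers if r.meetings < r.capacity]
        if not candidates:
            raise RuntimeError(f"zone {self.zone} is full")
        best = max(candidates, key=lambda r: r.capacity - r.meetings)
        best.meetings += 1
        return best

    def status_report(self) -> dict:
        # The kind of summary a zone might send to the Global Cloud Controller.
        return {
            "zone": self.zone,
            "meetings": sum(r.meetings for r in self.routers),
            "capacity": sum(r.capacity for r in self.routers),
        }
```

A global controller could compare such reports across zones to decide where new meetings should go, although the source does not describe how that decision is made.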
2. Distributed cloud-native infrastructure
Zoom built its own distributed cloud-native infrastructure. Cloud-native means the system is architected from the ground up to use cloud technology. Its microservices let developers grow capacity seamlessly.
1. Data centers
Since its beginning in 2011, Zoom has built geographically distributed data centers around the globe, connected to one another by a network of private links. Users connect to the data center closest to their location. Real-time video conferencing is served from these data centers or from inside corporate networks.
In 2015, Zoom partnered with Equinix and had 13 data centers around the world. Since then, Equinix has helped with key business segments, specifically the financial services and government sectors. As of April 2020, Zoom had 17 data centers.
Zoom has also been using a public cloud (AWS) to host meeting metadata, web applications and other services. During the pandemic, Zoom used AWS for real-time traffic as well. Zoom also expanded to Oracle Cloud to support some educational users.
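The idea of connecting each user to the closest data center can be sketched as a nearest-neighbor lookup. The data center names and coordinates below are hypothetical, and geographic distance is only a stand-in: a real system would more likely measure latency or load, which the source does not detail.

```python
import math

# Hypothetical data center coordinates (latitude, longitude), for illustration only.
DATA_CENTERS = {
    "san-jose": (37.34, -121.89),
    "new-york": (40.71, -74.01),
    "amsterdam": (52.37, 4.90),
    "tokyo": (35.68, 139.69),
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_data_center(user_location, centers=DATA_CENTERS):
    """Return the name of the data center geographically closest to the user."""
    return min(centers, key=lambda name: haversine_km(user_location, centers[name]))
```

For example, `nearest_data_center((51.51, -0.13))` picks the Amsterdam entry for a user in London.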
3. Network technologies
1. Multimedia routing
Zoom’s video-first client/server architecture separates video content processing from stream routing. Video processing happens on the client side, which dynamically encodes and decodes based on network performance and bandwidth. The client can use UDP, TCP, TLS and HTTPS (not UDP only). If there are only two participants, a peer-to-peer connection is used. On the server side, the multimedia router determines the optimal paths to connect the participants. Because the clients share the computing work, the servers do less of it and multimedia routing can support more participants in each meeting.
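The connection decisions described above can be sketched in a few lines. The preference order (UDP first, HTTPS as the last resort) is an assumption based on typical real-time media practice, and the function names are hypothetical; the source only states which protocols are used and that two-party meetings go peer-to-peer.

```python
from enum import Enum


class Transport(Enum):
    UDP = "udp"
    TCP = "tcp"
    TLS = "tls"
    HTTPS = "https"


# Assumed preference order: lowest-latency transport first, HTTPS as last resort.
PREFERENCE = [Transport.UDP, Transport.TCP, Transport.TLS, Transport.HTTPS]


def choose_transport(reachable):
    """Pick the first preferred transport that the client's network allows."""
    for transport in PREFERENCE:
        if transport in reachable:
            return transport
    raise ConnectionError("no supported transport is reachable")


def choose_topology(participant_count):
    """Two participants connect peer-to-peer; other meetings go through a router."""
    if participant_count < 1:
        raise ValueError("a meeting needs at least one participant")
    return "peer-to-peer" if participant_count == 2 else "multimedia-router"
```

A client behind a restrictive firewall that only allows TLS and HTTPS would fall through UDP and TCP and end up on TLS.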
2. Multi-bitrate encoding
Video and audio are converted into digital data, then compressed and encoded at a given bitrate. The higher the bitrate, the lighter the compression and the better the audio or video quality. Zoom’s multi-bitrate encoding uses a single stream with multiple layers: each stream carries every resolution and bitrate a receiver might need, so the stream by itself can adapt to multiple resolutions. This eliminates the need to encode and decode separate streams for each endpoint, and it provides better quality and reliability across various network environments and devices.
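As a rough illustration of the single-stream, multi-layer idea, the sketch below lets a router choose, per receiver, how many layers of one stream to forward. The layer labels and bitrates are hypothetical numbers picked for the example, not Zoom's actual encoding ladder.

```python
# Hypothetical layers of one scalable stream: (label, cumulative kbit/s needed).
LAYERS = [("180p", 150), ("360p", 500), ("720p", 1500), ("1080p", 3000)]


def pick_layer(available_kbps, layers=LAYERS):
    """Return the highest layer whose cumulative bitrate fits the receiver's bandwidth."""
    chosen = layers[0][0]  # always forward at least the base layer
    for label, needed_kbps in layers:
        if needed_kbps <= available_kbps:
            chosen = label
    return chosen


def forward_plan(receivers):
    """Map each receiver to the layer a router would forward, with no re-encoding."""
    return {name: pick_layer(kbps) for name, kbps in receivers.items()}
```

The sender encodes once, and each receiver gets the subset of layers that its bandwidth allows, so the server never has to transcode a stream per endpoint.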
3. QoS at application layer
Quality of service (QoS) covers the measurements and technologies used to manage traffic and ensure the quality of critical applications. The measurements cover several related aspects of the network, such as packet loss, bit rate, throughput, transmission delay, availability and jitter. Most existing QoS solutions are deployed at the network layer. Zoom instead applies QoS at the application layer, on the client side. It uses proprietary algorithms to optimize video and audio and to prioritize the factors that matter most for the particular type of device.
4. Security
Security issues emerged during the pandemic. Zoom acted quickly to fix them and to clear up the confusion.
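Zoom's QoS algorithms are proprietary, so the following is only a sketch of what application-layer QoS on the client could look like: measure packet loss and jitter, then adjust the sending bitrate. The thresholds, the back-off and probe factors, and the function names are all assumptions chosen for illustration.

```python
def packet_loss(sent, received):
    """Fraction of packets that never arrived."""
    return 0.0 if sent == 0 else (sent - received) / sent


def mean_jitter_ms(arrival_gaps_ms):
    """Mean absolute change between consecutive inter-arrival gaps, in ms."""
    if len(arrival_gaps_ms) < 2:
        return 0.0
    deltas = [abs(b - a) for a, b in zip(arrival_gaps_ms, arrival_gaps_ms[1:])]
    return sum(deltas) / len(deltas)


def next_bitrate_kbps(current, loss, jitter_ms, floor=150, ceiling=3000):
    """Hypothetical rule: back off on a bad network, probe upward on a good one."""
    if loss > 0.05 or jitter_ms > 30:
        target = current * 0.7
    elif loss < 0.01 and jitter_ms < 10:
        target = current * 1.1
    else:
        target = current
    return round(min(ceiling, max(floor, target)))
```

Because the decision runs in the client, it works on any network without requiring routers along the path to support QoS markings.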
1. End-to-end Encryption (sort of)
End-to-end encryption means encrypting a message on one device such that only the device it is sent to can decrypt it. The alternative is encryption in transit, whereby a message is encrypted on the sender’s end, delivered to the server, decrypted there, re-encrypted, and then delivered to the recipient and decrypted on their end.
Zoom previously offered TLS encryption for its clients, the same as secure HTTPS web connections. But the data was only encrypted between each meeting participant and Zoom’s servers, so technically it was not full “end-to-end encryption”. Oded Gal (CPO) clarified that their “end to end encryption” applies to clients such as PC, Mac, iOS, Android and Zoom Rooms, but not to the web client (used, for example, by Chromebook users), third-party clients that use the SDK, telephones, SIP/H.323 devices, on-premise configurations or Lync/Skype clients, as Zoom says these can’t be end-to-end encrypted.
The encryption key comes from the “key management system” in Zoom’s cloud infrastructure. Companies can choose to move their “key management system” into the corporate network (when available).
2. Meeting control and identity management
In March 2020, news outlets reported intruders hijacking Zoom classes, so-called “zoombombing”. Since then, Zoom has added protective features to prevent this, including waiting rooms, passwords, muting controls, administrative controls, role-based access controls and limits on screen sharing.
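How these controls combine can be shown with a small model of a meeting. This is a sketch under assumptions: the `Meeting` class, its field names and the two roles are hypothetical and only mirror the features named above (password, waiting room, role-based access, limited screen sharing), not Zoom's real API.

```python
from dataclasses import dataclass, field


@dataclass
class Meeting:
    passcode: str
    waiting_room: bool = True
    host_only_sharing: bool = True
    roles: dict = field(default_factory=dict)    # user -> "host" or "participant"
    waiting: list = field(default_factory=list)  # users held in the waiting room

    def join(self, user, passcode):
        """Reject a wrong passcode, then hold non-hosts in the waiting room if enabled."""
        if passcode != self.passcode:
            return "rejected"
        if self.waiting_room and self.roles.get(user) != "host":
            if user not in self.waiting:
                self.waiting.append(user)
            return "waiting"
        self.roles.setdefault(user, "participant")
        return "admitted"

    def admit(self, host, user):
        """Only a host may move a user from the waiting room into the meeting."""
        if self.roles.get(host) != "host" or user not in self.waiting:
            return False
        self.waiting.remove(user)
        self.roles[user] = "participant"
        return True

    def can_share_screen(self, user):
        """Hosts can always share; participants only if sharing is not host-only."""
        role = self.roles.get(user)
        return role == "host" or (role == "participant" and not self.host_only_sharing)
```

With the defaults, an uninvited guest needs both the passcode and a host's approval to enter, and still cannot share a screen once inside.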
5. Back-of-the-envelope calculation (2020)
| Metric | Estimate |
|---|---|
| Meetings per day | 300 million |
| Participants per day | 100 to 300 billion |
| Participants per meeting | Enterprise users: 300 to 500; Basic or Pro users: 100 to 200 |
| Meetings per server (VM) | 100 to 200 |
| Data centers | 17 |
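The table's figures can be combined into a rough server estimate. The sketch below takes the meetings per day, meetings per server and data center count from the table; the share of daily meetings that are live at the same time is not given in the source, so the 10% peak value is purely an assumption, as are the function names.

```python
MEETINGS_PER_DAY = 300_000_000      # from the table
MEETINGS_PER_SERVER = (100, 200)    # from the table: low and high density
DATA_CENTERS = 17                   # from the table
PEAK_CONCURRENCY_PERCENT = 10       # ASSUMPTION: share of daily meetings live at peak


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def servers_needed(meetings_per_server, peak_percent=PEAK_CONCURRENCY_PERCENT):
    """Servers (VMs) required to host the peak number of concurrent meetings."""
    concurrent = MEETINGS_PER_DAY * peak_percent // 100
    return ceil_div(concurrent, meetings_per_server)


def estimate():
    """Return (low, high) server counts overall and per data center."""
    low = servers_needed(MEETINGS_PER_SERVER[1])   # dense packing, fewer servers
    high = servers_needed(MEETINGS_PER_SERVER[0])  # sparse packing, more servers
    return {
        "servers": (low, high),
        "servers_per_data_center": (ceil_div(low, DATA_CENTERS),
                                    ceil_div(high, DATA_CENTERS)),
    }
```

Under these assumptions, 30 million concurrent meetings need between 150,000 and 300,000 servers. Spreading them evenly over the 17 data centers ignores the capacity hosted in public clouds, so the per-site figure is an upper bound at best.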