Capacity planning

The capacity of each Transcoding Conferencing Node in your Pexip Infinity deployment — in terms of the number of connections* that can be handled simultaneously — depends on a variety of factors including:

  • The server capacity and hardware configuration.
  • The type of call (Full HD, HD, SD, or audio-only), the codec, and whether a presentation stream is included.
  • The number of unique VMRs being used (and thus the number of backplanes being reserved).
  • The type of gateway call — whether the inbound and outbound legs are on the same transcoding node or not, and the types of client involved in the call.

* A connection can be a call or presentation from an endpoint to a Virtual Meeting Room or Virtual Auditorium, a backplane between Transcoding Conferencing Nodes, or a call into or out of the Infinity Gateway. In this context, a connection is analogous to a port. In some situations, a single conference participant such as a WebRTC or Skype for Business client requires two connections (one for the video call, and one for presentation content).

When a connection is proxied via a Proxying Edge Node, the proxying node also consumes connection resources in order to forward the media streams on to a Transcoding Conferencing Node. A transcoding node always consumes the same amount of connection resources regardless of whether it has a direct connection to an endpoint, or it is receiving the media streams via a proxying node.

In all cases, you must also have sufficient concurrent call licenses available.

The following sections explain each of these factors. For some comprehensive examples showing how these different factors can combine to influence capacity, see Resource allocation examples.

Server capacity

The capacity and configuration of the server on which the Conferencing Node is running determines the number of calls that can be handled. This is influenced by a number of factors, including processor generation, number of cores, processor speed, hypervisor and BIOS settings.

For more information, see Server design recommendations and Example Conferencing Node server configurations.

If you are deploying on a cloud service, see our recommended instance types and call capacity guidelines for Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform.

Call types and resource requirements

The type of call (Full HD, HD, SD, or audio-only) affects the amount of resource required by a Transcoding Conferencing Node to handle the call.

In general, when compared to a single HD (720p) call:

  • a Full HD (1080p) call uses twice the resource
  • an SD (standard definition) call uses half the resource
  • an audio-only call uses one twelfth of the resource.
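
As a rough illustration of these ratios, the following Python sketch converts a mix of calls into HD-equivalent resources. The dictionary keys and the function name are hypothetical and exist only for this example; the figures are the general ratios listed above, not exact measurements for any particular server.

```python
from fractions import Fraction

# Approximate cost of each call type in HD (720p) equivalents,
# taken from the general ratios listed above.
HD_EQUIVALENTS = {
    "full_hd": Fraction(2),      # twice an HD call
    "hd": Fraction(1),           # the baseline
    "sd": Fraction(1, 2),        # half an HD call
    "audio": Fraction(1, 12),    # one twelfth of an HD call
}

def total_hd_equivalents(call_counts):
    """Sum the HD-equivalent load for a mix of calls, e.g. {"hd": 4, "audio": 6}."""
    return sum(HD_EQUIVALENTS[kind] * count for kind, count in call_counts.items())

mix = {"full_hd": 2, "hd": 4, "sd": 6, "audio": 12}
print(total_hd_equivalents(mix))  # 2*2 + 4*1 + 6*(1/2) + 12*(1/12) = 12
```

Exact fractions are used so that the one-twelfth audio ratio does not accumulate rounding error when many calls are summed.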

A WebRTC call using the VP8 codec uses the same amount of resource as H.264. The VP9 codec uses around 25% more resource, so VP9 at 720p uses the equivalent of 1.25 HD resources, and VP9 at 1080p uses the equivalent of 2.5 HD resources. WebRTC clients also use an additional 0.5 HD resources when sending presentation content, and 1 additional HD resource when receiving full motion presentation. Note that within the same conference, some participants may use VP9 (if they are connected to a Conferencing Node using the AVX2 or later instruction set) while other participants may use VP8 (if they are connected to a Conferencing Node on older hardware).
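
The WebRTC figures above can be combined in a small calculation. This is a minimal sketch only: the function and parameter names are hypothetical, and the numbers are the approximate values quoted in the paragraph above.

```python
def webrtc_hd_equivalents(resolution, codec,
                          sends_presentation=False,
                          receives_full_motion=False):
    """Approximate HD-equivalent resources used by one WebRTC participant."""
    # Base cost by resolution: 720p is the HD baseline, 1080p costs twice as much.
    base = {"720p": 1.0, "1080p": 2.0}[resolution]
    # VP8 costs the same as H.264; VP9 uses around 25% more resource.
    codec_factor = {"h264": 1.0, "vp8": 1.0, "vp9": 1.25}[codec]
    total = base * codec_factor
    if sends_presentation:
        total += 0.5   # extra resource for sending presentation content
    if receives_full_motion:
        total += 1.0   # extra resource for receiving full motion presentation
    return total

print(webrtc_hd_equivalents("720p", "vp9"))    # 1.25
print(webrtc_hd_equivalents("1080p", "vp9"))   # 2.5
print(webrtc_hd_equivalents("720p", "vp8", sends_presentation=True))  # 1.5
```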

If you want to limit video calls to specific resolutions (and limit the transcoding node resources that are reserved for calls), you should use the Maximum call quality setting (see Setting and limiting call quality for more information).

On startup, each Conferencing Node runs an internal capacity check. This capacity is translated into an estimated maximum number of Full HD, HD, SD or audio-only calls, and can be viewed on the status page (Status > Conferencing Nodes) for each Conferencing Node. The status also shows the current media load on each Conferencing Node as a percentage of its total capacity.

Proxying Edge Node resource requirements

When a connection is proxied via a Proxying Edge Node, the proxying node also consumes connection resources in order to forward the media streams on to a Transcoding Conferencing Node.

A proxying node uses approximately the equivalent of 3 audio-only resources to proxy a video call (of any resolution), and 1 audio-only resource to proxy an audio call.
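
To get a feel for what this means for a proxying node, the sketch below converts a number of proxied calls into HD equivalents. It assumes that an audio-only resource on a proxying node has the same one-twelfth-of-HD weight described under call types, which is an assumption made for illustration; the function name is hypothetical.

```python
from fractions import Fraction

# Assumption: one audio-only resource is one twelfth of an HD resource,
# reusing the general call-type ratio from earlier in this topic.
AUDIO_ONLY = Fraction(1, 12)

def proxy_load_hd_equivalents(video_calls, audio_calls):
    """Approximate load on a proxying node, expressed in HD equivalents."""
    # Each proxied video call costs about 3 audio-only resources (any resolution);
    # each proxied audio call costs about 1 audio-only resource.
    audio_resources = 3 * video_calls + 1 * audio_calls
    return audio_resources * AUDIO_ONLY

print(proxy_load_hd_equivalents(video_calls=20, audio_calls=12))  # (60 + 12) / 12 = 6
```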

We recommend allocating 4 vCPU and 4 GB RAM (which must both be dedicated resource) to each Proxying Edge Node, with a maximum of 8 vCPU and 8 GB RAM for large or busy deployments.

Backplane reservation

Each conference instance on each Transcoding Conferencing Node reserves a backplane connection at a resource level corresponding to the conference's Maximum call quality setting, to allow the conference to become geographically distributed if required. The exceptions to this are:

  • Deployments with a single Conferencing Node. In such cases, no backplanes will ever be required, so capacity is not reserved.
  • Conferences that are audio-only (in other words, where the conference has its Conference capabilities set to Audio-only). In such cases, capacity equivalent to one audio connection is reserved for the backplane.
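
The reservation rule and its two exceptions can be summarized as a small decision function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and the mapping of Maximum call quality to HD equivalents reuses the general call-type ratios rather than exact values.

```python
from fractions import Fraction

# Assumption: a backplane reserved at a given quality costs the same as a call
# of that quality, using the general call-type ratios.
QUALITY_HD_EQUIVALENTS = {
    "full_hd": Fraction(2),
    "hd": Fraction(1),
    "sd": Fraction(1, 2),
}
AUDIO_CONNECTION = Fraction(1, 12)

def backplane_reservation(node_count, max_call_quality, audio_only_conference=False):
    """HD equivalents reserved for the backplane by one conference instance on one node."""
    if node_count <= 1:
        # Single-node deployment: no backplane is ever required.
        return Fraction(0)
    if audio_only_conference:
        # Audio-only conference: reserve the equivalent of one audio connection.
        return AUDIO_CONNECTION
    # Otherwise reserve at the conference's Maximum call quality.
    return QUALITY_HD_EQUIVALENTS[max_call_quality]

print(backplane_reservation(1, "hd"))                              # 0
print(backplane_reservation(3, "hd"))                              # 1
print(backplane_reservation(3, "hd", audio_only_conference=True))  # 1/12
```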

For some reservation examples, see Resource allocation examples.

Gateway calls

Gateway calls (person-to-person calls, or calls to an externally-hosted conference, such as a Microsoft Teams or Skype for Business meeting, or Google Meet) require sufficient capacity for both the inbound leg and the outbound leg. In general, this means that each gateway call consumes resources equivalent to two connections.

Non-distributed gateway calls (where the inbound and outbound legs are on the same transcoding node) involving only SIP or H.323 clients do not use any additional ports/connections. However, in other scenarios, additional ports may be used as described below (assuming that Maximum call quality is HD unless otherwise specified):

  • Distributed gateway calls (where the outbound leg is on a different transcoding node to the inbound leg) consume backplane ports — thus 1 HD video + 1 backplane for participant A plus 1 HD video + 1 backplane for participant B. This typically occurs when calling registered endpoints (where the outbound call to the registered endpoint will originate from the node the endpoint is registered to), or when using Call Routing Rules with an Outgoing location set to something other than Automatic.
  • For non-distributed gateway calls involving a Skype for Business / Lync client, a backplane is reserved in case the SfB/Lync user starts presenting (as the RDP/VbSS presentation stream could connect to any Conferencing Node due to DNS). Each presentation stream counts as 1 HD port. Thus if the incoming RDP/VbSS call lands on the same node as the video call then the resource usage is equivalent to 3 HD ports (2 video + 1 presentation). If the RDP/VbSS call lands on a different node to the video call then the resource usage for the call is equivalent to 5 HD ports (2 video + 2 backplane + 1 presentation).
  • For a gateway call to a Microsoft Teams meeting, the connection to Teams uses 1.5 HD of resource if Maximum call quality is SD or HD, otherwise it uses 1.5 Full HD resources. The resources required for the VTC leg of the connection depend upon the Maximum call quality setting. No additional resources are required for the connection from Pexip Infinity to Teams for presentations to or from the Teams meeting.
  • For a gateway call to a Skype for Business meeting, the connection to SfB uses 1 HD of resource for main video and will use another 1 HD of resource if either side starts presenting. The resources required for the VTC leg of the connection depend upon the Maximum call quality setting.
  • For a gateway call to Google Meet, the connection to Google Meet always uses 1 HD resource (it uses VP8) for main video. The resources required for the VTC leg of the connection depend upon the type of endpoint and the Maximum call quality setting. If the VTC endpoint starts to present content then an extra 1 HD resource is used for the connection from Pexip Infinity to Google Meet. However, no additional resources are required if presentation content is sent from Google Meet.
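
The port counts in some of the scenarios above can be checked with simple arithmetic. The sketch below assumes a Maximum call quality of HD for the first two functions, as in the list above; the function names are hypothetical, and converting 1.5 Full HD resources to 3 HD equivalents relies on the general ratio that a Full HD call uses twice the resource of an HD call.

```python
def sip_h323_gateway_hd_ports(distributed):
    """HD ports for a gateway call between SIP or H.323 clients."""
    video_legs = 2                        # inbound leg + outbound leg
    backplanes = 2 if distributed else 0  # one backplane per participant when distributed
    return video_legs + backplanes

def sfb_client_gateway_hd_ports(presentation_on_same_node):
    """HD ports for a non-distributed gateway call involving an SfB/Lync client
    once the SfB/Lync user is presenting via RDP/VbSS."""
    video_legs = 2
    presentation = 1
    # If the RDP/VbSS call lands on a different node, two backplanes are used.
    backplanes = 0 if presentation_on_same_node else 2
    return video_legs + backplanes + presentation

def teams_leg_hd_equivalents(max_call_quality):
    """HD equivalents for the connection from Pexip Infinity to a Teams meeting."""
    if max_call_quality in ("sd", "hd"):
        return 1.5
    return 1.5 * 2  # 1.5 Full HD resources, at 2 HD equivalents each

print(sip_h323_gateway_hd_ports(distributed=False))                  # 2
print(sip_h323_gateway_hd_ports(distributed=True))                   # 4
print(sfb_client_gateway_hd_ports(presentation_on_same_node=True))   # 3
print(sfb_client_gateway_hd_ports(presentation_on_same_node=False))  # 5
print(teams_leg_hd_equivalents("full_hd"))                           # 3.0
```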

Call protocols and presentation content

The various call protocols have different behavior and limitations for sending and receiving video and presentation content, in addition to what Pexip Infinity will request based on its configured maximum call quality and bandwidth settings. As the endpoints ultimately decide what to send to Pexip Infinity, the following information should be seen as a guide only.

WebRTC

WebRTC may send resolutions up to 1080p to Pexip Infinity, depending on the camera capabilities and available bandwidth. Presentation content is always sent in a separate channel to main video, at up to the same resolution as the main video channel.

When Infinity Connect WebRTC clients elect to receive presentations in full motion, they are received in a separate channel with the same bandwidth as main video. Standard presentations are received as JPEG content in a separate channel with no fixed bandwidth. The JPEG content will update every second or so, and the amount of bandwidth depends on the amount of change from the previous JPEG image.

Skype for Business / Lync

Skype for Business / Lync clients may send resolutions up to 1080p to Pexip Infinity, depending on the camera capabilities and available bandwidth. RDP and VbSS presentation content is sent in a separate channel to main video. RDP uses more bandwidth than VbSS and runs over TCP, which is poorly suited to real-time media. RDP content is sent at the same resolution as the originating screen.

Skype for Business / Lync clients ask to receive content at a resolution based on the size of the window in which the video is displayed. For example, at smaller window sizes, the client will request video from Pexip Infinity at CIF (352x288), but as the window is enlarged, this can increase up to 1080p.

H.323 and SIP

H.323 and SIP video endpoints may send resolutions up to 1080p to Pexip Infinity at whatever bandwidth and frame rate they decide. Presentation content shares the bandwidth of the main video channel, meaning that when a participant starts presenting, the resolution of the main video might be reduced. Most H.323 and SIP endpoints prioritize motion (higher frame rate) for main video and sharpness (higher resolution) for content.

RTMP

RTMP clients may send resolutions up to 720p to Pexip Infinity, depending on the camera capabilities and available bandwidth. RTMP does not support 1080p. RTMP clients cannot send presentation content, but they do receive content as JPEG images.

Licenses

The total number of concurrent calls that can be made in your deployment (regardless of whether those calls are HD, SD or audio-only) is limited by your license. For more information, see Pexip Infinity license installation and usage.

Scaling up capacity, media overflow and dynamic bursting

You can easily scale a deployment up by creating several Conferencing Nodes in the same location (i.e. the same datacenter). Capacity can even be added “on the fly” – Conferencing Nodes can be added in a couple of minutes if more capacity is needed. Alternatively, each location can be configured to overflow to another location if it reaches its capacity, including bursting to temporary resources on a cloud service.

For more information on media overflow and dynamic bursting, see: