Viewing alarms

When there are active alarms on your Pexip Infinity deployment, a flashing blue triangle appears at the top right of each page of the Administrator interface. To view details of the current alarms, click on this icon or go to the Alarms page (Status > Alarms).

  • Alarms remain in place for as long as the issue exists. After the issue has been resolved (for example, if a conference ends, therefore freeing up licenses) the associated alarm will automatically disappear from the Alarms page.
  • Multiple instances of the same type of alarm can be raised. For example, if two Conferencing Nodes are not correctly synchronized to an NTP server, you will see an alarm for each node.
  • You can select individual alarms and view the associated documentation (this guide) for suggested causes and resolutions.

The History & Logs > Alarm history page shows the details of all historic alarms, including the severity level and the times at which each alarm was raised and lowered.

An alarm is raised in each of the following situations:

Alarm ID Logged as Alarm = Level Cause Suggested resolutions
The Management Node does not have a TLS certificate 20

tls_certificate_missing_management

Critical The Management Node has no associated TLS certificate. Upload a TLS certificate and associate it with the Management Node.
A Conferencing Node does not have a TLS certificate 9

tls_certificate_missing

Critical A Conferencing Node has no associated TLS certificate.

Upload a TLS certificate and associate it with the Conferencing Node.

Alternatively, and if appropriate for your deployment, associate an existing certificate with your Conferencing Node. When doing this, the existing certificate should already contain a SAN (Subject Alternative Name) that matches your Conferencing Node's FQDN.

See Managing a node's TLS server certificate for more information.
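As a rough illustration of the SAN requirement above, the following Python sketch tests whether a node's FQDN is covered by a list of SAN DNS names. The function name `fqdn_in_sans` and the hostnames in the usage note are hypothetical, and the single-label wildcard rule is an assumption based on common TLS practice, not a description of how Pexip Infinity validates certificates.

```python
def fqdn_in_sans(fqdn: str, san_dns_names: list[str]) -> bool:
    """Return True if fqdn matches one of the given SAN DNS names.

    Matching is case-insensitive. A SAN such as "*.example.com" is treated
    as covering exactly one leftmost label (assumption, see lead-in).
    """
    host = fqdn.rstrip(".").lower()
    for san in san_dns_names:
        name = san.rstrip(".").lower()
        if name == host:
            return True
        if name.startswith("*."):
            suffix = name[1:]  # for example ".example.com"
            if host.endswith(suffix):
                first_label = host[: -len(suffix)]
                # The wildcard must stand for one non-empty label only.
                if first_label and "." not in first_label:
                    return True
    return False
```

For example, `fqdn_in_sans("conf01.example.com", ["*.example.com"])` returns `True`, whereas a certificate whose only SAN is `mgmt.example.com` would not cover that node.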

CPU instruction set not supported 10

cpu_not_supported

Critical

A Conferencing Node has gone into maintenance mode because it was deployed on a server with an unsupported processor instruction set (e.g. SSE4.1).

This could also be caused by setting the EVC mode on a VMware cluster to too low a level, such as Westmere (see Enhanced vMotion Compatibility (EVC) for more information).

Deploy the Conferencing Node on a server with AVX or later.
Eventsink Reached Maximum Backoff 36

eventsink_maximum_backoff

Critical

An event cannot be delivered to an event sink, and the system has reached its retry timeout limit.

See Troubleshooting event sink failures.

Eventsink Reached Maximum Concurrent POSTs 37

eventsink_maximum_posts

Critical More than the configured Maximum number of background POSTs events (default 1000) are queued for an event sink but have not been sent.

See Troubleshooting event sink failures.

Azure key vault certificate has expired 48

azure_key_vault_certificate_expired

Critical The Teams Connector certificate (in the Azure key vault) has expired. Update the Teams Connector certificate.
NTP not synchronized 11

ntp_not_synchronised

Error

A node has failed to synchronize with the configured NTP servers.

A virtual machine on VMware has been migrated.

Ensure that NTP is enabled on the Management Node, and that NTP servers are assigned to, and accessible from, each location. See Syncing with NTP servers for more information.

See https://kb.vmware.com/s/article/2108828 for VMware migration issues.

Configuration not synchronized 18

configuration_sync_failure

Error This alarm is raised if the Conferencing Node status “Last contacted” time has not been updated within the last 2 expected replication intervals (typically no contact within the last 3 minutes).

In typical deployments, configuration replication is performed approximately once per minute. However, in very large deployments (more than 60 Conferencing Nodes), configuration replication intervals are extended, and it may take longer for configuration changes to be applied to all Conferencing Nodes (the administrator log shows when each node has been updated).

If configuration synchronization fails, this may indicate network connectivity or routing issues between the Management Node and the Conferencing Node, which could be due to a malfunction or misconfiguration of devices such as routers or firewalls.

Ensure that all of the appropriate Pexip nodes are fully routable to each other in both directions. See General network requirements.
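The "Last contacted" rule for this alarm can be sketched as a simple time comparison. This is an illustrative model only: the function name `config_sync_overdue` is hypothetical, and the replication interval is supplied by the caller because the source gives it only approximately and notes that it is extended in very large deployments.

```python
from datetime import datetime, timedelta


def config_sync_overdue(
    last_contacted: datetime,
    now: datetime,
    replication_interval: timedelta,
    missed_intervals: int = 2,
) -> bool:
    """Return True if the node has not been contacted within the allowed window.

    The window is missed_intervals multiplied by the expected replication
    interval; both are inputs to this sketch, not values read from the product.
    """
    allowed_gap = replication_interval * missed_intervals
    return now - last_contacted > allowed_gap
```

Keeping the interval as a parameter lets the same check model both typical and very large deployments.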

MS Exchange Connection Failure 22

scheduling_connection_failure

Error The Management Node cannot connect to the Exchange server. Check that the details entered in the EWS URL (System > VMR scheduling for Exchange integrations) are correct and the Exchange server is online.
Automatic backup upload failed 25

autobackup_upload_failed

Error The Management Node cannot connect to the FTP server to upload a backup file. Check that the Upload URL (supported schemes are FTPS and FTP) and the Username and Password credentials of the FTP server are correct (Utilities > Automatic backups) and that the Management Node can reach the FTP server.
LDAP sync failed 28

ldap_sync_failure

Error An LDAP template synchronization process has failed. This alarm duplicates the information shown for the error listed at Status > LDAP sync.

See Troubleshooting LDAP server connections for help with resolving LDAP connection issues.

The alarm is lowered when you resync the template (although it will get re-raised if the issue has not been resolved).

Scheduled scaling: cannot allocate some or all of the requested Teams Connector instances 42

azure_teamsconnector_scheduledscaling_failure

Error

This is raised if Pexip Infinity requests more instances than Azure will allow (above the limit set by the instance count (slider) configuration in the Azure portal for the Virtual machine scale set in your Teams Connector resource group).

Note that Azure will still create as many instances as it can up to the maximum.

You should review your maximum instances setting (slider) in Azure and your scheduled scaling policies to ensure you do not request scaling up beyond your maximum limit.

See Scheduling scaling and managing Teams Connector capacity for more information.

Scheduled scaling: some or all of the requested Teams Connector instances are not operational 43

azure_teamsconnector_scheduledscaling_notenoughinstances_failure

Error

This is raised if the number of required instances (Minimum number of instances plus the policy's Number of instances to add) is not running at the policy's activation date/time.

This can occur if Azure failed to start the instances, running instances have failed, or there is some other problem with Azure's scaling/provisioning processes.

It can also occur if the Azure Event Hub connection string field is not configured correctly.

This alarm can also be raised temporarily if the Minimum number of instances is increased. It will last for a few minutes until the new instances are up and running. This is expected behavior and can occur with or without any scheduled scaling policies.

This requires investigation of the VM scale set in the Azure portal as to the cause of failure, and manual intervention to resolve the issue. Problem scenarios could include:

  • The instance may be running but not sending heartbeat events.
  • The instance failed to start.

In most cases the resolution is to restart the instance via the Azure portal.

Also check that the Azure Event Hub connection string field in Pexip Infinity is configured correctly (see Configuring your Teams Connector (addresses, Event Hub, minimum capacity)).
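The instance count that this alarm compares against can be worked through with a small calculation. The sketch below is illustrative: the function names are hypothetical, and the running instance count is assumed to come from whatever you observe in the Azure portal.

```python
def required_instances(minimum_instances: int, instances_to_add: int) -> int:
    """Instances expected at a policy's activation time: minimum plus the policy's addition."""
    return minimum_instances + instances_to_add


def scaling_shortfall(
    minimum_instances: int, instances_to_add: int, running_instances: int
) -> int:
    """Return how many required instances are not running (0 if the target is met)."""
    needed = required_instances(minimum_instances, instances_to_add)
    return max(0, needed - running_instances)
```

With a minimum of 2 instances and a policy that adds 3, the target is 5; if only 4 are running at activation time, the shortfall is 1, which corresponds to the condition described for this alarm.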

Teams scheduled scaling: Event hub for management events does not exist 44

azure_teamsconnector_scheduledscaling_endpoint_not_found

Error

This is raised if "Enable Azure Event Hub" is enabled but Pexip Infinity cannot connect to the Azure Event Hub queue for scheduled scaling.

The most likely reason for this is that you have not created the Teams Connector API app.

Disabling the "Enable Azure Event Hub" setting will lower the alarm.

However, to use scheduled scaling you must follow the instructions to create the Teams Connector API app and redeploy your Teams Connector.

System integrity is compromised 46

integrity_error

Error This means that one or more of the Pexip Infinity system files have been modified by an external party/event and thus the integrity of the system has been compromised. (These are internal files, not Management Node configuration data.)

You need to perform a full redeploy of your platform.

You should also notify your security administrator if the source of the event is unknown.

License limit reached 2

licenses_exhausted

Warning A Conferencing Node is unable to accept a call because there are not enough concurrent licenses available on the system at this time. For more information, see Pexip Infinity license installation and usage.
  • Wait until one or more of the existing conferences have finished and the licenses have been returned to the pool.
  • Contact your Pexip authorized support representative to purchase more licenses.

Note that when a license subsequently becomes available (e.g. because a participant leaves a conference, or because the administrator adds more licenses), the alarm is not cleared immediately; the alarm is cleared after the next participant successfully joins a conference.
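The clearing behavior described in the note above can be modeled as a small state machine. This is a toy sketch, not the product's implementation: the class name `LicensePoolModel` is hypothetical, and it assumes one license per participant for simplicity.

```python
class LicensePoolModel:
    """Toy model of the alarm lifecycle; one license per participant (assumption)."""

    def __init__(self, total_licenses: int) -> None:
        self.free_licenses = total_licenses
        self.alarm_raised = False

    def join(self) -> bool:
        """Try to admit a participant; return True if the call is accepted."""
        if self.free_licenses <= 0:
            # A call was refused for lack of licenses: the alarm is raised.
            self.alarm_raised = True
            return False
        self.free_licenses -= 1
        # The alarm is cleared only here, on a successful join.
        self.alarm_raised = False
        return True

    def leave(self) -> None:
        """Return a license to the pool; the alarm state is deliberately untouched."""
        self.free_licenses += 1
```

In this model, a participant leaving frees a license but leaves `alarm_raised` set; it is reset only when the next `join()` succeeds, mirroring the note above.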

Licenses expiring 3

licenses_expiring

Warning One or more of your licenses is due to expire within the next 60 days. Contact your Pexip authorized support representative to renew your licenses.
Call capacity limit reached 1

capacity_exhausted

Warning

A call has not been accepted because all Conferencing Nodes that are able to take the media for this call are at capacity. It could be either Proxying Edge Nodes or Transcoding Conferencing Nodes that are out of capacity.

Note: to understand how often this issue is occurring in your deployment, search the Administrator log for "out of proxying resource" or "out of transcoding resource".

This alarm clears either when an existing call is disconnected or the next time a new call is successfully placed.

  • Deploy more Conferencing Nodes in either the proxying or transcoding location as appropriate.
  • Move existing Conferencing Nodes onto more powerful servers.
  • Allocate more virtual CPUs for Conferencing Nodes on existing servers (if there are sufficient CPU cores). Note that the Conferencing Node will have to be rebooted for this to take effect.
  • Configure each location with a primary and secondary overflow location.
  • If a call is received in a location that contains Proxying Edge Nodes, that location must be configured with a Transcoding location that contains your Transcoding Conferencing Nodes.

Note that some types of call consume more resources than other calls. Thus, for example, if you are at full capacity and an audio-only call disconnects, there may still not be sufficient free resource to connect a new HD video call. For further information on capacity and how calls consume resources, see Hardware resource allocation rules.
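To gauge how often calls are being rejected for capacity reasons, the Administrator log search suggested above could be approximated offline as follows. This sketch assumes you have exported the log as plain text lines; the function name is hypothetical, and only the two search phrases come from this guide.

```python
PHRASES = ("out of proxying resource", "out of transcoding resource")


def count_capacity_rejections(log_lines) -> dict:
    """Count the lines that mention each capacity phrase (case-insensitive)."""
    counts = dict.fromkeys(PHRASES, 0)
    for line in log_lines:
        lowered = line.lower()
        for phrase in PHRASES:
            if phrase in lowered:
                counts[phrase] += 1
    return counts
```

A high count for one phrase and not the other indicates whether the proxying or the transcoding location needs additional capacity.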

Management Node limit reached 5

management_node_exhausted

Warning The Management Node does not have sufficient resources for the current deployment size (number of Conferencing Nodes).

Increase the amount of RAM and the number of virtual CPUs assigned to the Management Node.

See the recommended hardware requirements in Server design recommendations.

Trusted CA certificates expiring 6

trustedca_expiring

Warning One or more of your trusted CA certificates is due to expire within the next 30 days, or has already expired. Obtain and upload an updated certificate for the certificate authority.
TLS certificates expiring 7

tls_certificate_expiring

Warning One or more of your TLS certificates is due to expire within the next 30 days, or has already expired. Obtain and upload an updated TLS certificate. You may also need to delete the old certificate.
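The 30-day window used by the certificate expiry alarms can be illustrated with a simple date comparison. This is a sketch under stated assumptions: the function name `expiry_state` is hypothetical, and the certificate's expiry time is assumed to have been read already (for example from the certificate's "not after" field) as a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def expiry_state(not_after: datetime, now: datetime, warn_days: int = 30) -> str:
    """Classify a certificate as "expired", "expiring" or "ok".

    Both datetimes should be timezone-aware so that they compare correctly.
    """
    if not_after <= now:
        return "expired"
    if not_after - now <= timedelta(days=warn_days):
        return "expiring"
    return "ok"
```

Both "expired" and "expiring" correspond to the conditions under which these alarms are described as being raised.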
Incomplete TLS certificate chains 8

tls_certificate_chains

Warning A TLS certificate has an incomplete chain of trust to the root CA certificate. Obtain and upload the appropriate chain of intermediate CA certificates to the Management Node (the certificate provider normally provides the relevant bundle of intermediate CA certificates).
Syslog server inaccessible 4

syslog_inaccessible

Warning A syslog server has been configured to use TCP or TLS but either is not responding to contact requests, or the connection has dropped.
  • Check your network connectivity.
  • Check that the syslog server is running.
Connectivity lost between nodes 19

connectivity_lost

Warning

Communication to a Pexip Infinity node has been lost.

Check network connectivity and routing as for "Configuration not synchronized" above, or in the case of a software upgrade, wait for the upgrade process to complete.
Hardware instability detected 21

irregular_pulse

Warning Pexip Infinity has detected that the underlying VM infrastructure has paused the Pexip virtual machine. This is usually indicative of over-committed hardware, which we do not support. Pexip Infinity is a real time system and requires dedicated access to the underlying CPU and RAM resources of the hardware host.

Ensure that the Management Node and all Conferencing Nodes have dedicated access to their own RAM and CPU cores.

See the recommended hardware requirements in Server design recommendations.

CPU instruction set is deprecated 23

cpu_deprecated

Warning

The node is deployed on a server that is not using the AVX or later CPU instruction set (e.g. if it uses SSE4.2).

This alarm is raised when a Conferencing Node restarts and is automatically cleared after 48 hours.

Deploy the Conferencing Node on a server with AVX or later.
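Before deploying, you may want to confirm that a candidate host exposes AVX. The sketch below parses text in the format of a Linux `/proc/cpuinfo` file and looks for the `avx` flag. It is a generic Linux check, not a Pexip tool, and the function names are hypothetical.

```python
def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the CPU flags from the first "flags" line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(cpuinfo_text: str) -> bool:
    """True if the "avx" flag is present (an exact match, so "avx2" alone does not count)."""
    return "avx" in cpu_flags(cpuinfo_text)
```

On a Linux host this could be called as `has_avx(open("/proc/cpuinfo").read())`. Note that a restrictive EVC mode can mask the flag inside a guest VM even when the physical CPU supports it.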
Hardware IO (input/output) instability detected 24

io_high_latency

Warning Pexip Infinity has recently detected consistent read latency greater than 100ms or write latency greater than 400ms.
  • Avoid having multiple VMs using the same physical hard drive.
  • Check the hard drive for failures.
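The latency thresholds for this alarm can be expressed as a simple check over a window of measurements. This is a sketch only: the function name is hypothetical, and interpreting "consistent" as "every sample in the window exceeds the limit" is an assumption, since this guide does not define how the measurement is made.

```python
READ_LIMIT_MS = 100   # read latency threshold stated in this guide
WRITE_LIMIT_MS = 400  # write latency threshold stated in this guide


def io_latency_high(read_ms: list[float], write_ms: list[float]) -> bool:
    """Return True if all read samples or all write samples exceed their limit.

    An empty sample list never triggers the condition.
    """
    reads_high = bool(read_ms) and all(r > READ_LIMIT_MS for r in read_ms)
    writes_high = bool(write_ms) and all(w > WRITE_LIMIT_MS for w in write_ms)
    return reads_high or writes_high
```

Feeding this function latency samples from your hypervisor's storage metrics can help confirm whether shared or failing disks are the likely cause.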
VOIP scanner resistance has detected excessive incorrect aliases being dialed in a short period 26

possible_voip_scanner_ips_blocked

Warning Pexip Infinity's VOIP scanner resistance has detected excessive incorrect aliases being dialed in a short period, and has temporarily blocked access attempts from the suspected VOIP scanner IP addresses. See the administrator log for details of the calls.
PIN brute force resistance has detected excessive incorrect PIN entry attempts in a short period 27

service_access_quarantined

Warning Pexip Infinity's PIN brute force resistance has detected excessive incorrect PIN entry attempts in a short period, and has temporarily blocked access attempts to one or more conferencing services. See the administrator log for details of the calls.
Scheduled maintenance event (freeze) 39

scheduled_maintenance_event_freeze

Warning A scheduled maintenance event in Microsoft Azure has been detected.

No action is required.

Any Conferencing Node running on the affected VM is automatically placed into maintenance mode until the event completes.

Scheduled maintenance event (redeploy) 40

scheduled_maintenance_event_redeploy

Warning
Scheduled maintenance event (preemption) 41

scheduled_maintenance_event_preempt

Warning
Azure key vault certificate expiring 47

azure_key_vault_certificate_expiring

Warning The Teams Connector certificate (in the Azure key vault) is due to expire within the next 30 days. Update the Teams Connector certificate.
Google Meet Gateway Token expiring 49

gms_gateway_token_expiring

Warning The Google Meet gateway token is due to expire within the next 30 days. Update the gateway token.

Cloud bursting alarms

The following alarms may be raised in relation to issues with dynamic cloud bursting. See Dynamic bursting to a cloud service for more information about resolving these alarms.

Alarm ID Logged as Alarm = Level Cause Suggested resolutions
Not authorized to perform this operation 15 & 16

bursting_unauthorized_instance_failure

bursting_unauthorized_region_failure

Error Pexip Infinity is not authorized to view instance data or to start and stop instances in the cloud service.

For AWS, ensure that an appropriate policy document is configured in AWS and is attached to the user that is being used by the Pexip platform.

For Azure, check your Active Directory (AD) application and its associated role/permissions.

For GCP, check your service account and its associated role/permissions.

Authentication failure while trying to communicate with the cloud provider 17

bursting_authentication_failure

Error Pexip Infinity cannot sign in to the cloud service.

Check your cloud bursting settings in Platform > Global settings > Cloud bursting:

  • For AWS, check that the Access Key ID and Secret Access Key match the User Security Credentials for the user you added within Identity and Access Management in the AWS dashboard.
  • For Azure, check that your subscription, client and tenant IDs and secret key are correct for your Active Directory application.
  • For GCP, check that your configured GCP project ID, service account ID and private key are correct for your GCP service account.
Cloud bursting process encountered an unexpected error 12

bursting_error

Error

Pexip Infinity encountered an unexpected error while managing the cloud overflow nodes.

Check the status of your cloud bursting nodes within Pexip Infinity (Status > Cloud bursting) and of your instances within your cloud provider.

Also check administrator and support log messages that are tagged with a log module name of administrator.alarm to see additional error message information.

Cloud-bursting node found, but no corresponding Conferencing Node has been configured 13

bursting_missing_pexip_node

Warning

This occurs when Pexip Infinity detects a bursting instance with a tag matching your system's hostname but there is no corresponding Conferencing Node configured within Pexip Infinity.

This message can occur temporarily in a normal scenario when deploying a new Conferencing Node and you have set up the VM instance in your cloud provider but you have not yet deployed the Conferencing Node in Pexip Infinity. In this case, the issue will disappear as soon as the Conferencing Node is deployed.

A location contains cloud bursting nodes, but no other locations are using it for overflow 14

bursting_no_location_overflow

Warning A location contains some cloud overflow nodes, but no other locations are using it as an overflow location. Set the location containing the cloud overflow nodes as the Primary overflow location of the locations containing your "always on" Conferencing Nodes.

One-Touch Join alarms

The following alarms may be raised in relation to issues with One-Touch Join:

Alarm ID Logged as Alarm = Level Instance Cause Suggested resolutions
OTJ Google Gatherer Error 29

mjx_google_gatherer_failure

Error Google Connection Test Failure The connection test to Google Workspace has failed. This could be because your service account credentials are incorrect. Check your service account details, specifically the service account email and private key.
Google Room Connection Failure OTJ cannot connect to one of the rooms you have specified. This could be because the room is misconfigured within Google Workspace. Check the steps to set up a new room. Is the room resource email correct? Has it been shared with the service account?
OTJ Exchange Gatherer Error 30

mjx_exchange_gatherer_failure

Error Exchange Connection Test Error The connection test to Exchange has failed. This could be because your service account credentials are incorrect. Check your Exchange service account username and password.
Exchange OAuth Error OTJ cannot use OAuth to sign into Exchange. Check your OAuth credentials.
Exchange Room Connection Error OTJ cannot connect to the room specified in the alarm description. This could be because the room is misconfigured. Check the room has been correctly set up.
OTJ Endpoint Configurator Error 31

mjx_endpoint_configurator_failure

Error Endpoint Misconfigured Error The OTJ endpoint does not have a username and password configured, and there is no default username and password. Provide a username and password for the OTJ endpoint or for the associated OTJ profile.
Endpoint Request Error OTJ cannot connect to the endpoint. Check that the endpoint is configured correctly.
Endpoint Non-200 Status Code The endpoint returns a non-200 status code. Check the status code that is given in the logs. This is likely a configuration error with the endpoint.
OTJ Meeting Processor Failure 32

mjx_meeting_processor_failure

Error Meeting Processor Rendering Error, Template Error or Runtime Error The meeting processing rule could not extract a meeting alias. Check and edit the rule using the test tool.
OTJ Poly Endpoint Error 35

mjx_poly_failure

Error Poly Endpoint Not Polled A Poly endpoint that has Raise alarms enabled has not made contact with the OTJ calendaring service within the last 10 minutes.

Ensure that Enable support for Pexip Infinity Connect clients and Client API is enabled.

Ensure that the configuration for the endpoint on Pexip Infinity and on the endpoint itself is correct, in particular that the username and password configured on both match.

Ensure that the endpoint is showing as registered to the calendaring service.

Restart the endpoint.

OTJ Webex Failure 38

mjx_webex_failure

Error Webex Endpoint failed with <Webex error message>

One-Touch Join has received an error from Webex when attempting to send the request. Examples of these messages include:

  • Cloud Calendar is configured
  • Device has not registered as an XAPI provider
  • Webex Request Error
  • No response received from request
  • The server, while acting as a gateway or proxy, received an invalid response from the upstream server it accessed in attempting to fulfill the request

The resolution will depend on the issue, for example:

  • Disable the cloud calendar.
  • Check that the Device ID is correct.
  • Confirm that the correct ports are open, and that Pexip Infinity can reach Webex.
  • Confirm that the endpoint is switched on and connected to the internet, and that Webex can reach it.
Webex OAuth Error The system cannot get an access token for your Webex integration. Sign in with the Webex service account again.
Webex Configuration Error A Webex endpoint is configured but there is no Webex integration configured on the OTJ profile. Make sure you configure a Webex integration on the OTJ profile.
OTJ Graph Gatherer Error 45

mjx_graph_gatherer_failure

Error Graph Connection Test Error

A successful connection to the Graph API could not be made for any room. Possible causes:

  • A network connection issue between the Conferencing Node and Graph API (https://graph.microsoft.com).
  • The room email addresses (for the OTJ Endpoints used by this O365 Graph Integration) are incorrect.
  • The application permission (the Application set up in Azure for OAuth) has not been granted for the rooms.

  • Check network connectivity.
  • Check the email addresses used by the rooms.
  • Check the Azure App registration configuration.
Graph OAuth Error

There was a problem getting an OAuth access token that could be used to connect to the Graph API.

Possible causes:

  • There is a network connection issue between the Conferencing Node and the token endpoint URI (which is https://login.microsoftonline.com by default).
  • The OAuth Client ID, Client secret, or OAuth 2.0 token endpoint URL is not entered correctly in the O365 Graph Integration configuration on the Pexip Infinity Management Node.
  • The Azure application is not set up correctly (for example, the correct API permissions were not added, or admin consent was not provided).
Graph Room Connection Error There was a problem connecting to one or more rooms (but the OTJ service has successfully connected to other rooms).
  • Check that the room email address is correct.
  • Check the application permission is configured correctly to allow OTJ to read this room's calendar.