Citrix Media Offloading Architecture


Citrix media offloading works by logically splitting the application into two layers. The first is the application layer, which includes all UI-related logic as well as signaling, such as call controls, agent actions, and session management. The second is the WebRTC layer, which is responsible for real-time media communication.

The real-time media processing can be divided into three functional parts.

  • The first is capturing the agent’s audio stream.

  • The second is establishing and maintaining the connection to the remote peer.

  • The third is receiving and playing back the customer’s audio.

When media offloading is not enabled, all three responsibilities are handled by the browser’s built-in media and WebRTC engine (for example, in Chrome or Edge). When media offloading is enabled, they are offloaded to HdxRtcEngine.exe, a Citrix-managed process spawned by the Citrix Workspace app.

When an agent lands on Sprinklr with media offloading enabled, the Sprinklr UI first connects to a local WebSocket service, CtxHdxWebSocketService, which runs within the Citrix VDI itself. Access to this service is controlled through the following registry key, as outlined in the configuration steps:
Computer\HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Citrix\WebSocketService
This service is Sprinklr’s point of entry into the VDA: Sprinklr must open a secure WebSocket connection to it at 127.0.0.1:9002, where the service listens in the VDA.
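
The entry-point connection can be sketched in TypeScript as follows. Only the loopback address, the port (127.0.0.1:9002), and the use of a secure WebSocket come from the description above. The function names, the timeout value, and the injected socket factory are illustrative assumptions, and any URL path or handshake the real service expects is not shown.

```typescript
// Minimal subset of the browser WebSocket interface that this sketch relies on.
type SocketHandler = ((ev: any) => void) | null;
interface SocketLike {
  onopen: SocketHandler;
  onerror: SocketHandler;
  close(): void;
}

// Address and port are taken from the article; "wss" reflects the secure connection.
const HDX_WS_HOST = "127.0.0.1";
const HDX_WS_PORT = 9002;

function hdxWebSocketUrl(host: string = HDX_WS_HOST, port: number = HDX_WS_PORT): string {
  return `wss://${host}:${port}`;
}

// Hypothetical helper: resolves once the socket opens, rejects on error or timeout.
function connectToHdxService(
  createSocket: (url: string) => SocketLike,
  timeoutMs: number = 5000, // assumed value, not documented
): Promise<SocketLike> {
  return new Promise((resolve, reject) => {
    const socket = createSocket(hdxWebSocketUrl());
    const timer = setTimeout(() => {
      socket.close();
      reject(new Error("Timed out connecting to CtxHdxWebSocketService"));
    }, timeoutMs);
    socket.onopen = () => {
      clearTimeout(timer);
      resolve(socket);
    };
    socket.onerror = () => {
      clearTimeout(timer);
      reject(new Error("Could not reach CtxHdxWebSocketService"));
    };
  });
}
```

In a browser, the factory could be as simple as `(url) => new WebSocket(url)`. Injecting it keeps the sketch independent of any particular runtime. A rejected promise here would typically mean the service is not running or access has not been enabled through the registry key.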

Once this connection is established, Sprinklr requests, via the Citrix Workspace app, that the HdxRtcEngine.exe process be started on the agent’s local machine. When the process starts successfully, Sprinklr indicates readiness by displaying a successful Citrix media offloading status in the Agent Readiness modal. At this point, the environment is ready to use media offloading.
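
The startup order described above can be modeled as a small state machine. This is a sketch only: the state and event names are hypothetical and do not correspond to any Sprinklr or Citrix API. The article specifies only the order of the steps (WebSocket connection, engine start, readiness).

```typescript
// Hypothetical state and event names; only the ordering comes from the article.
type OffloadState = "idle" | "socketConnected" | "engineStarting" | "ready" | "failed";
type OffloadEvent = "socketOpened" | "engineStartRequested" | "engineStarted" | "error";

const transitions: Record<OffloadState, Partial<Record<OffloadEvent, OffloadState>>> = {
  idle: { socketOpened: "socketConnected", error: "failed" },
  socketConnected: { engineStartRequested: "engineStarting", error: "failed" },
  engineStarting: { engineStarted: "ready", error: "failed" },
  ready: { error: "failed" },
  failed: {},
};

function nextState(state: OffloadState, event: OffloadEvent): OffloadState {
  // Events that are not valid in the current state leave it unchanged.
  return transitions[state][event] ?? state;
}

// Media offloading should be reported as available only in the "ready" state.
function isReadyForOffloading(state: OffloadState): boolean {
  return state === "ready";
}
```

Modeling the sequence explicitly makes it harder to report readiness early, for example when the WebSocket is open but HdxRtcEngine.exe has not yet started.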

From that point onward, all media and WebRTC-related operations are forwarded as remote procedure calls to the agent’s local machine, where HdxRtcEngine.exe assumes responsibility for real-time media processing.
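
To illustrate what forwarding an operation as a remote procedure call means, here is a minimal TypeScript sketch. The JSON envelope, the class name, and the method names are assumptions made for illustration. The actual message format exchanged with HdxRtcEngine.exe is not described in this article.

```typescript
// Hypothetical envelope; the real Citrix message format is not documented here.
interface RpcRequest { id: number; method: string; args: unknown[]; }
interface RpcResponse { id: number; result?: unknown; error?: string; }
interface PendingCall { resolve: (value: unknown) => void; reject: (err: Error) => void; }

class RpcForwarder {
  private nextId = 1;
  private pending = new Map<number, PendingCall>();
  private send: (message: string) => void;

  constructor(send: (message: string) => void) {
    this.send = send;
  }

  // Serialize a call and hand it to the transport; the promise settles on reply.
  call(method: string, ...args: unknown[]): Promise<unknown> {
    const id = this.nextId++;
    const request: RpcRequest = { id, method, args };
    return new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject });
      this.send(JSON.stringify(request));
    });
  }

  // Feed each message received from the transport into this method.
  handleMessage(raw: string): void {
    const response = JSON.parse(raw) as RpcResponse;
    const entry = this.pending.get(response.id);
    if (!entry) return; // unknown or already-settled id
    this.pending.delete(response.id);
    if (response.error !== undefined) entry.reject(new Error(response.error));
    else entry.resolve(response.result);
  }
}
```

With this pattern, the application layer calls something like `forwarder.call("createOffer")` and awaits the result, while the work itself runs in the remote process. The application code is the same whether the media engine is local or offloaded.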
