Smart Comprehend in AI+ Studio
Smart Comprehend is a powerful AI capability that identifies relevant information from the knowledge base and suggests it to agents based on the context of a customer’s conversation. These recommendations are surfaced in the Smart Assist tab of both the Agent Console and the Care Console, streamlining access to helpful content and enabling quicker responses. This document outlines the end-to-end process of deploying Smart Comprehend, from preparing the indexing pipeline to activating real-time inference using AI pipelines in Sprinklr’s AI+ Studio.
Note: This capability is currently a Limited Availability (LA) feature.
Use Case
Smart Comprehend’s RAG-based ML pipeline components are now exposed in AI+ Studio. This enhancement makes the pipelines transparent and configurable within the platform.
Previously, Smart Comprehend was an entirely backend-driven feature. All ML pipeline operations, including configuration, fine-tuning, and enhancements, were handled exclusively by ML engineers. Any changes to the pipeline (e.g., adjusting thresholds, modifying article ranking logic, or experimenting with feature weights) required backend intervention.
Wherever this capability is enabled, implementation teams and Admin users can access and update pipeline configurations directly, in a self-service manner, streamlining customization and reducing the turnaround time for enhancements.
Steps to Navigate to Smart Comprehend Pipeline
Click the New Tab icon. Under Platform Modules, select AI+ Studio within Learn.
On the AI+ Studio homepage, click Deploy your Use-Cases.
On the Choose your use-case window, navigate to the Sprinklr Service tab and select Smart Comprehend within Agent Assist.
Currently, there are two flows:
Indexing: The Indexing Pipeline orchestrates the complete workflow for preparing and training data. It manages content ingestion, chunking, vectorization, and storage, ensuring the data is well-structured and optimized for downstream processes like retrieval and inference.
Inference: The Inference Pipeline defines how trained data is applied to generate real-time predictions and responses. It manages the flow of incoming queries through sequential stages, including query creation, content retrieval, answer generation, and post-processing, to ensure outputs are accurate, contextually relevant, and timely.
Steps to Create an Indexing Pipeline
From the left panel, go to Indexing, and the record manager for indexing pipelines will open.
Click + Deployment in the top right corner.
On the Basic Details window, fill in the required fields.
The fields on the Basic Details window are described below.
Name: Enter a unique name for your indexing pipeline.
Description: Specify a description for your pipeline.
Training Frequency: Specify how often the model should check for updates in Knowledge Base (KB) articles and retrain to incorporate the latest content.
Schedule Type: Define the schedule type from the dropdown. The available options are Minutes, Hourly, Daily, Daily (Exclude Weekend), Weekly, Monthly, and Custom.
Repeat: Define the repeat frequency based on the option you choose in the Schedule Type field.
Training Datasets: Specify the set of Knowledge Base (KB) articles to be used for training the model.
Share Deployments with: Define the level of permission for different users or user groups. Note that Global Admins and Workspace Admins have Editor access by default.
Once done, click Save in the bottom right corner.
The indexing canvas window for your deployment will open.
Once you have created the indexing pipeline, click Save and Deploy in the bottom right corner.
How Does the Indexing Pipeline Work?
Article Selection
The pipeline begins with the set of Knowledge Base (KB) articles configured in the Basic Details screen.
These articles serve as the source from which information will be processed and later retrieved.
Preprocessing (Update Properties Node)
The raw articles often contain unnecessary formatting such as HTML tags, redundant markup, or styling.
During preprocessing, these issues are removed, and only clean, meaningful text is extracted for further use.
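Conceptually, the cleanup resembles the short Python sketch below. This is only an illustration: the Update Properties node’s actual implementation is internal to the platform, and strip_html is a hypothetical helper.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects only the text content of an HTML article, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def strip_html(raw_article: str) -> str:
    """Hypothetical cleanup step: remove markup and collapse whitespace,
    keeping only the meaningful text for downstream use."""
    parser = TextExtractor()
    parser.feed(raw_article)
    return " ".join(" ".join(parser.parts).split())

print(strip_html("<h1>Reset Password</h1><p>Go to <b>Settings</b> &gt; Security.</p>"))
# -> Reset Password Go to Settings > Security.
```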
Chunking the Data
Instead of treating an article as a single large block of text, the content is split into smaller, context-rich segments (chunks).
This makes retrieval more efficient and accurate, since searching within smaller chunks increases contextual relevance.
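A minimal sliding-window chunker gives a feel for this step; the chunk size and overlap below are illustrative values, not the pipeline’s actual defaults.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping, roughly fixed-size chunks.

    Overlap keeps sentences that straddle a boundary available in both
    neighboring chunks, which helps preserve context during retrieval.
    Assumes overlap < chunk_size.
    """
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks
```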
Loop Node Processing
Each chunk created in the previous step is sent through a loop node for further processing.
Within this loop, the following happens:
A prompt node generates sample queries that could potentially be answered using the content of that chunk.
Embeddings are then created for these queries.
Embeddings are vector representations of words or text, capturing semantic meaning in a form that a machine can process.
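The sketch below illustrates the shape of this loop. generate_sample_queries and embed are hypothetical stand-ins for the prompt and embedding nodes (stubbed here so the example runs); they are not Sprinklr APIs.

```python
import hashlib
import math

def generate_sample_queries(chunk: str) -> list[str]:
    """Stand-in for the prompt node: a real deployment asks an LLM which
    customer questions this chunk could answer. Stubbed for the sketch."""
    return [f"Question answerable by: {chunk[:40]}...?"]

def embed(text: str, dim: int = 8) -> list[float]:
    """Toy embedding: a deterministic pseudo-vector derived from a hash.
    Real pipelines use a trained embedding model to capture semantics."""
    digest = hashlib.sha256(text.encode()).digest()
    vec = [b / 255 for b in digest[:dim]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def process_chunks(chunks: list[str]) -> list[dict]:
    """The loop node: for every chunk, generate sample queries and embed them."""
    records = []
    for chunk in chunks:
        for query in generate_sample_queries(chunk):
            records.append({"chunk": chunk, "query": query, "vector": embed(query)})
    return records
```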
Index Creation (Vector Index Field)
Once embeddings are generated, they are saved in a vector index.
The vector index acts as a structured store of embeddings, enabling fast similarity searches.
From this index, the pipeline can retrieve the most relevant article chunks when a real query is made.
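A toy in-memory version of such an index shows the idea; production vector stores use approximate-nearest-neighbor structures for speed at scale, so treat this purely as a sketch.

```python
import math

class VectorIndex:
    """Minimal in-memory vector index: stores (vector, payload) pairs and
    retrieves the closest entries by cosine similarity."""

    def __init__(self):
        self.entries: list[tuple[list[float], dict]] = []

    def add(self, vector: list[float], payload: dict) -> None:
        self.entries.append((vector, payload))

    def search(self, query_vec: list[float], top_k: int = 3) -> list[tuple[float, dict]]:
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb) if na and nb else 0.0
        scored = [(cosine(query_vec, v), p) for v, p in self.entries]
        return sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]
```

Paired with the loop-node sketch above, each record’s query vector would be added to the index with its source chunk as the payload.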
Steps to Create an Inference Pipeline
From the left panel, go to Inference, and the record manager for inference pipelines will open.
Click + Deployment in the top right corner.
Fill in the required fields on the Basic Details window.
The fields on the Basic Details window are described below.
Name: Enter a unique name for your inference pipeline.
Description: Specify a relevant description for the pipeline.
Priority: Specify the priority level for the Inference Pipeline. If multiple deployments conflict, this value determines which one takes precedence.
Deploy on all records: Enable this toggle to deploy the Inference Pipeline on all records. Disable it to specify the conditions for deployment.
Filters: Select the cases you want the Inference Pipeline to run on.
Share Deployments with: Select the Users and/or User Groups who can access the Inference Pipeline. Note that Global Admins and Workspace Admins have Editor access by default.
Once done, click Next in the bottom right corner to move to the Additional Details window. Fill in all the required details.
Note: The Additional Details section is only applicable for Voice channels.
The fields on the Additional Details window are described below.
Warm Up Period: Define the number of inbound messages during which predictions are made at the warm-up frequency.
Warm Up Frequency: Specify how often the model should be triggered for predictions during the warm-up period.
Frequency: After the warm-up period ends, set the frequency (in number of inbound messages) at which the model should continue to generate predictions.
Closing Period: Set the total number of inbound messages after which the model will stop generating predictions.
Context Carry Forward: Specify the number of inbound and outbound messages to be included as context for each prediction.
Note: The fields in the Additional Details window are preconfigured with default values optimized for performance. While you can modify them, it is recommended to retain the defaults unless you have a clear understanding of the impact of the changes.
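For intuition, the sketch below shows one plausible reading of how these settings interact to decide whether the model fires on a given inbound message. The function name, default values, and exact logic are illustrative assumptions, not Sprinklr’s actual defaults or implementation.

```python
def should_predict(inbound_count: int, warm_up_period: int = 10,
                   warm_up_frequency: int = 2, frequency: int = 5,
                   closing_period: int = 50) -> bool:
    """Decide whether to trigger a prediction on the Nth inbound message."""
    if inbound_count > closing_period:
        return False  # the model stops predicting after the closing period
    if inbound_count <= warm_up_period:
        # during warm-up, trigger every `warm_up_frequency` inbound messages
        return inbound_count % warm_up_frequency == 0
    # afterwards, trigger every `frequency` inbound messages
    return (inbound_count - warm_up_period) % frequency == 0
```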
Once done, click Save in the bottom right corner.
The inference canvas window for your deployment will open.
How Does the Inference Pipeline Work?
Once the indexing pipeline is created and deployed, the inference pipeline is responsible for using the indexed data to generate predictions for live cases.
Input Preparation (Update Properties Node)
The incoming conversation or case data is cleaned and standardized.
This preprocessing step ensures that only meaningful text is passed forward for prediction.
Query Generation (Prompt Node)
A prompt node is added, which takes the conversation as input.
The prompt is designed to convert the conversation into a structured query that can be used for retrieval.
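As an illustration, a prompt node of this kind might behave like the sketch below. The prompt wording and the call_llm stub are hypothetical, not the deployed prompt or a Sprinklr API.

```python
def call_llm(prompt: str) -> str:
    """Stub standing in for the prompt node's model call; a real deployment
    invokes an LLM here."""
    return "how to reset account password"  # canned output for the demo below

def build_retrieval_query(conversation: list[dict]) -> str:
    """Condense the running conversation into one standalone search query."""
    transcript = "\n".join(f"{t['role']}: {t['text']}" for t in conversation)
    return call_llm(
        "Rewrite the customer's latest need as a standalone search query:\n\n"
        + transcript
    )

print(build_retrieval_query([
    {"role": "customer", "text": "I can't get into my account"},
    {"role": "agent", "text": "Are you seeing an error?"},
    {"role": "customer", "text": "It says my password is wrong"},
]))
```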
Embedding Generation (Generate Embeddings Node)
Embeddings are generated for the query, creating a vector representation of the text.
Unlike in the indexing pipeline, the embeddings are not saved here; instead, they are used directly for retrieval.
Retrieval (Retrieve Embeddings Node)
The generated embeddings are compared against stored vectors in the vector index.
In the Vector Index field, all deployed indexes appear in a dropdown, allowing selection of the preferred one.
Two key parameters define retrieval:
Confidence Threshold: Specifies the minimum similarity score a result must reach to be returned as a prediction.
Max Predictions: Defines the maximum number of predictions to return at once.
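Applied to the toy VectorIndex from the indexing section, these two parameters might work as in the sketch below; the threshold and count shown are illustrative values, not platform defaults.

```python
def retrieve(index, query_vec: list[float],
             confidence_threshold: float = 0.75,
             max_predictions: int = 3) -> list[tuple[float, dict]]:
    """Apply both retrieval parameters: cap the number of results, then
    drop anything that scores below the similarity threshold."""
    hits = index.search(query_vec, top_k=max_predictions)
    return [(score, payload) for score, payload in hits
            if score >= confidence_threshold]
```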
Output Formatting (Update Properties Node)
The retrieved article chunks are trimmed, refined, and formatted.
This ensures that the final output is concise, relevant, and user-friendly for preview and case resolution.
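One plausible shape for this post-processing step, consuming the output of the retrieval sketch above, is shown below; the field names and preview length are hypothetical.

```python
def format_predictions(hits: list[tuple[float, dict]],
                       max_chars: int = 300) -> list[dict]:
    """Trim each retrieved chunk for preview and attach its source article."""
    formatted = []
    for score, payload in hits:
        text = " ".join(payload["chunk"].split())  # collapse whitespace
        if len(text) > max_chars:
            # cut at a word boundary so the preview reads cleanly
            text = text[:max_chars].rsplit(" ", 1)[0] + "..."
        formatted.append({
            "article": payload.get("article_title", "Unknown article"),
            "preview": text,
            "confidence": round(score, 2),
        })
    return formatted
```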
Once done, click Save and Deploy in the bottom right corner.