Understanding Generative AI Responses
Note: Before starting, ensure you have set up content sources, deployed the model in the dialogue tree, and tested the bot.
In this article, we deconstruct the Generative AI response to understand the final output, the sources fetched and referenced, and the metadata we receive from the LLM.
How does the LLM generate a response?
User query → Sprinklr rewords the query before performing the search → Sprinklr searches the knowledge content to retrieve “context” → Sprinklr passes the user’s query, the context, and a prompt containing instructions to the LLM → the LLM generates a response and indicates which sources were used.
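To make the flow concrete, here is a minimal Python sketch of the same retrieval-augmented loop. Every function in it is a hypothetical stand-in for the steps above, not a Sprinklr API:

```python
# Hypothetical sketch of the flow above; none of these functions are Sprinklr APIs.

def reword_query(query: str) -> str:
    # Stand-in for Sprinklr's query-rewording step.
    return query.strip().capitalize()

def search_knowledge(query: str) -> list[str]:
    # Stand-in for the search over the configured knowledge content;
    # the retrieved chunks form the "context".
    return ["chunk 1 about the topic", "chunk 2 about the topic"]

def call_llm(query: str, context: list[str]) -> dict:
    # Stand-in for the LLM call: query + context + an instruction prompt go in,
    # a response plus the sources actually used come out.
    return {"text": "Answer grounded in the retrieved chunks.", "sources_used": [0]}

reworded = reword_query("how do i reset my password?")
context = search_knowledge(reworded)
response = call_llm(reworded, context)
print(response["text"], "| cited chunk indices:", response["sources_used"])
```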
How to access the response?
Navigate to the “Conversations” tab and select the case containing the message whose API response you want to view.
In the debug log, navigate to the Smart FAQ node.
The “Raw Response” field contains the complete API response.
How to read this response?
Response:
{Root}.results[0].details.suggestions[0]
This field contains the “text”: the actual response generated by the LLM to the user’s question.
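For example, assuming you have copied the Raw Response from the debug log into a file named raw.json (a name chosen here purely for illustration), and assuming the suggestion object exposes the answer under a text key as described above, you could read it like this:

```python
import json

# Load the Raw Response copied from the debug log (file name is illustrative).
with open("raw.json") as f:
    root = json.load(f)

# {Root}.results[0].details.suggestions[0]
suggestion = root["results"][0]["details"]["suggestions"][0]
print(suggestion["text"])  # assumed key holding the answer to the user's question
```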
Conversation:
{Root}.results[0].details.suggestions[0].addtional.conv
This field contains the past conversation history, which is passed to the LLM as context to help answer the question.
Logs:
{Root}.results[0].details.suggestions[0].addtional.logs
This field contains the processing metadata.
Sources:
{Root}.results[0].details.suggestions[0].addtional.sources
This field contains the fetched sources that were sent to the LLM and on which it based its response. Note that the final response is typically derived from only a subset of these sources.
rawResponse:
{Root}.results[0].details.suggestions[0].addtional.rawResponse
This field contains the response generated by the LLM, along with citations.
These citations point to the exact sources (from the field above) from which the response was derived.
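To cross-reference the citations against the fetched sources, you can print them side by side. This sketch reuses the illustrative raw.json file from above; note that the key below is spelled “addtional” exactly as in the field paths, and the citation format itself is not specified in this article:

```python
import json

with open("raw.json") as f:
    root = json.load(f)

# Key spelled "addtional" exactly as in the field paths above.
extra = root["results"][0]["details"]["suggestions"][0]["addtional"]

# Number the fetched sources (assumed to be a list) so the citation
# markers in rawResponse can be matched back to them by eye.
for i, source in enumerate(extra["sources"]):
    print(f"[{i}] {source}")
print(extra["rawResponse"])
```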
Processing Breakdown:
{Root}.results[0].details.suggestions[0].addtional.processing_breakdown
This field contains a breakdown of the latency introduced by each component in the pipeline, along with the Time to First Token (TTFT) and the Time to Last Token (TTLT).
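Since the exact shape of processing_breakdown is not documented here, the sketch below simply assumes a flat mapping of component or metric name to latency and prints each entry:

```python
import json

with open("raw.json") as f:  # illustrative file name from the earlier examples
    root = json.load(f)

breakdown = root["results"][0]["details"]["suggestions"][0]["addtional"]["processing_breakdown"]

# Assumed shape: {"component_or_metric": latency_value, ...} including TTFT/TTLT.
for component, latency in breakdown.items():
    print(f"{component}: {latency}")
```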
Context:
{Root}.results[0].details.placeholders[1]
This field contains the context, i.e. the exact chunks fetched from the knowledge content. This is the data actually passed to the LLM to help it generate a response.
Reworded Question:
{Root}.results[0].details.placeholders[4]
This field contains the reworded question that was used to search the provided knowledge content.
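Putting the last two fields together, the snippet below pulls the retrieved context and the reworded question from the placeholders array, using the indices given above (which may differ in other deployments):

```python
import json

with open("raw.json") as f:  # illustrative file name from the earlier examples
    root = json.load(f)

placeholders = root["results"][0]["details"]["placeholders"]

# Indices taken from the paths above; they may vary between deployments.
context_chunks = placeholders[1]     # exact chunks fetched from the knowledge content
reworded_question = placeholders[4]  # the query actually used for the search

print("Search was run with:", reworded_question)
print("Context sent to the LLM:", context_chunks)
```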