Marcos Duarte
Marcos Duarte • 10mo ago

Langchain Output Parsers

hello! I'm experimenting a bit with the Langchain integration and I saw that it does not generate traces for output parsers. Do you have plans to implement that?
No description
19 Replies
Marc
Marc • 10mo ago
Hi Marcos, output parsers should work as expected. See this example trace: https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/d4f5cceb-70a9-4998-9fe3-8c710e7dc1a9?observation=cc983278-cb95-4510-bde4-3221878d4c1d
Example:
from langchain.llms import OpenAI
from langchain.output_parsers import PydanticOutputParser
from langchain.prompts import PromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field, validator

from langfuse.callback import CallbackHandler

handler = CallbackHandler()

model = OpenAI(model_name="text-davinci-003", temperature=0.0)


# Define your desired data structure.
class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

    # You can add custom validation logic easily with Pydantic.
    @validator("setup")
    def question_ends_with_question_mark(cls, field):
        if field[-1] != "?":
            raise ValueError("Badly formed question!")
        return field


# Set up a parser + inject instructions into the prompt template.
parser = PydanticOutputParser(pydantic_object=Joke)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

chain = prompt | model | parser
chain.invoke({"query": "Tell me a joke."}, config={"callbacks": [handler]})
Marcos Duarte
Marcos Duarte • 10mo ago
thx Marc! Does the callback handler work well with the LLMChain class? That's how I'm creating the chain:
class WForceConversationSummarizationTask:
    def get_chain(self):
        template = """Read a support chat between a customer and a support agent. \
Based on the information given in the chat, answer in a few words (max. 80) what the main subject of the conversation is, \
stay concise and be precise, explain what was defined and the result of the service without inferring information. At the end, give a list of 5 keywords which address the main topics, separated by commas. \
You must answer in Portuguese, with only a JSON containing two attributes: text and keywords. \
The support chat data is: {question} \
"""

        prompt = PromptTemplate(template=template, input_variables=["question"])

        return LLMChain(
            llm=AzureChatOpenAI(
                openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],
                azure_deployment=os.environ["AZURE_GPT35_DEPLOYMENT_NAME"],
                temperature=0.3,
            ),
            prompt=prompt,
            output_parser=WForceConversationSummarizationOutputParser(),
        )
And that's how I'm invoking it:
task = WForceConversationSummarizationTask()
response = task.get_chain().invoke(input={"question": conversation}, config={"callbacks": [CallbackHandler()]})
The tracing looks like this. I'm guessing I have to use LCEL?
Marcos Duarte
Marcos Duarte • 10mo ago
No description
Marcos Duarte
Marcos Duarte • 10mo ago
yep, that was it... solved! 🙂 thx @Marc! btw, is it possible to customize the top-level trace name and metadata using the callback handler?
Marc
Marc • 10mo ago
Nice, great that it works when using LCEL. Not sure about the core issue with the previous implementation; callbacks are not always well-implemented across Langchain, depending on how you combine the different abstractions. We're currently simplifying the Langchain callback handler and will include this as well. For the time being, you can create a trace using the Langfuse Python SDK to change the name/metadata and then get a Langchain handler for this trace.
Marc
Marc • 10mo ago
Langchain integration (Python) - Langfuse
Langchain users can integrate with Langfuse in seconds using the integration
Marcos Duarte
Marcos Duarte • 10mo ago
great! thx again @Marc
Marc
Marc • 10mo ago
sure, np
Marcos Duarte
Marcos Duarte • 10mo ago
this worked like a charm! 👌
Marc
Marc • 10mo ago
Perfect, thanks for confirming
tivalii
tivalii • 10mo ago
Hi, I have a quick question regarding this flow of creating a CallbackHandler from a trace. I ran into an issue where no "input" and "output" are produced on the trace. Could you please clarify whether this is the intended behaviour? Creating the CallbackHandler directly works fine, but I need to set the user_id, and a custom trace is the only way to do that.
No description
Marc
Marc • 10mo ago
Do you only want to add the user_id, or do you customize the trace in any other way? 👀 We could just add user_id as an additional constructor argument to CallbackHandler, wdyt @Max? Forgot to answer your question: this is the intended behavior, as this implementation is often used to track multiple chains + custom observations on the same trace. If each chain updated the trace's input/output, it would not make sense. Alternatively, you can manually set the trace input/output to the user's input and the chain's output. Not as elegant, but also not overly verbose.
tivalii
tivalii • 10mo ago
Hi @Marc, thank you for your answer. Having user_id as an extra argument for CallbackHandler would be great. In my case it would greatly simplify the code by removing the need to manually update input/output across multiple chain calls within a single session_id.
Marc
Marc • 10mo ago
Makes sense. Do you run a single chain on each user interaction?
tivalii
tivalii • 10mo ago
No, I'm running two chains one after another during a single user interaction. It's a kind of ConversationalRetrievalChain, but with custom logic, which requires splitting ConversationalRetrievalChain into two separate chains with independent invocations as well.