Langfuse

Langfuse is the open source LLM Engineering Platform. This is our Community Discord.

Pivot table of experiments results

Is there some way to build a pivot table of model performance across different datasets?
Solution:
We want to add pivot tables to the dashboard to break down data across Langfuse traces and datasets. Can you add this idea here: Langfuse.com/idea? For now, you could test different models via dataset runs, which will give you a breakdown of evaluation scores by run (i.e. model)...

Does Langfuse require being run in a Node environment?

I'm trying to run it inside a Convex runtime (as described here) and running into some issues. This runtime is similar to Vercel's edge runtime. Vercel's AI SDK runs in the edge runtime, so Langfuse dependencies like OpenTelemetry are the only reason I am currently using Node.

Filter LangChain tool calls in dashboard

Hi everyone! New to Langfuse and was wondering if there is any way to filter specific LangChain tool calls in the dashboard. I'm using a custom AgentExecutor chain connected to multiple tools, and while Langfuse is able to properly log all events, I am only able to filter on the top-most event in the dashboard. I noticed that agentic tool calls are being logged as span events, so I was wondering if there is any way to use that as a filter for the dashboard? My use-case is hopefully to display events i...

Get trace by metadata key

Hi, is there an endpoint to get traces using a metadata key? In the Python SDK I can see a method like fetch_traces, which takes multiple parameters to filter traces; however, there is no metadata field. Is it supported?
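
A minimal sketch of a possible workaround, assuming the Python SDK's fetch_traces method (which filters by name, user id, session id, tags, etc., but not metadata) and a hypothetical metadata key tenant_id with value "acme"; the metadata filter is applied client-side:

from langfuse import Langfuse

langfuse = Langfuse()

# fetch_traces supports filters such as name, user_id, session_id and tags,
# but no metadata parameter, so filter on metadata after fetching a page
traces = langfuse.fetch_traces(limit=100).data
matching = [t for t in traces if (t.metadata or {}).get("tenant_id") == "acme"]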

Langfuse batch outputs missing

I have an issue with "outputs" not populating for LangChain batch calls. They either don't appear for any of the items in the batch, or only for the last one. I have created an issue here: https://github.com/langfuse/langfuse/issues/2940 Appreciate any help on that!...

Login problem

Looks like there is some login problem; my account is empty now and I have no API access.
Solution:
Looks like I was in the EU region and then landed in the US region automatically.

Summary evaluators. f1 score

How do I calculate and display an F1 score in Langfuse?
Solution:
Hi @maximlupey, summary evaluators bound to dataset runs are on our roadmap, tracking this here: https://github.com/orgs/langfuse/discussions/2511 Feel free to add your thoughts or ideas...

Traces take too long to show up

I'm trying to use Langfuse to debug traces during development, and I find myself waiting minutes to see traces appear. Is this expected? When using LangSmith I can see traces show up almost instantly, which is the experience I'm looking for....

Is it possible to score a full dataset run?

I'm wondering if there is a best practice for evaluations that require multiple traces. The primary use case would be running a prompt against a full dataset and wanting to evaluate the total precision/recall/F1/etc. Right now I can score each dataset item, but I haven't figured out a great way to surface metrics to the UI that would encompass the full run. The alternative I've tested is encompassing the full run in a trace and scoring that, but it seems a bit hacky....
Solution:
Can you add your +1 to this idea post that tracks this feature? https://github.com/orgs/langfuse/discussions/2511 Currently only averages are nicely supported, but we plan to look into run-level scores...

Is it possible to have a trace span different AWS Lambdas?

Hi everyone! Quick question: I am using a trace to document every OpenAI call in a single Lambda and everything works as expected. However, this Lambda is growing rapidly, so I would like to separate the tasks between different Lambdas but still be able to see all the generations in a single trace. Is it possible to share the trace via 'props' between my Lambdas? If not, do you know a workaround to address this? Thanks!...
Solution:
Yes, all Langfuse integrations allow setting custom trace IDs that you can then pass between the different Lambdas. Which integration do you use?
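
For example, with the low-level Python SDK (a sketch; the trace/observation names and the event payload are made up), one Lambda creates the trace with a custom ID and passes it along, and the next Lambda attaches its generations to the same trace:

import uuid
from langfuse import Langfuse

langfuse = Langfuse()

# Lambda A: create the trace with a custom id and hand the id to the next Lambda
trace_id = str(uuid.uuid4())
langfuse.trace(id=trace_id, name="multi-lambda-pipeline")
# ... invoke Lambda B with trace_id in the event payload ...

# Lambda B: attach its observations to the same trace via trace_id
langfuse.generation(
    trace_id=trace_id,
    name="openai-call",
    input="prompt goes here",
    output="completion goes here",
)
langfuse.flush()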

Structured Outputs OpenAI

Langfuse doesn't seem to support the new "Structured Outputs" feature from OpenAI; it just throws lots of errors. Are there plans to support it? https://openai.com/index/introducing-structured-outputs-in-the-api/...
Solution:
Do you run the latest Langfuse Python SDK? As the beta APIs use chat completions under the hood, you'll get default traces without any changes as long as you update the Langfuse SDK. More on our support of the beta APIs here: https://langfuse.com/docs/integrations/openai/python/get-started#openai-beta-apis...
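
As a rough sketch of what the linked docs describe (assuming a current openai package and the Langfuse drop-in wrapper; the model and schema below are just examples), the beta parse helper is traced like a regular chat completion:

from pydantic import BaseModel
from langfuse.openai import OpenAI  # Langfuse drop-in replacement for the OpenAI client

class CalendarEvent(BaseModel):
    name: str
    date: str

client = OpenAI()

# Structured Outputs via the beta parse helper; traced as a normal generation
completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Alice and Bob meet on Friday."}],
    response_format=CalendarEvent,
)
print(completion.choices[0].message.parsed)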

Warning issue when disabled

This issue has been closed but not resolved. Is there no plan to fix it? It's really annoying! https://github.com/langfuse/langfuse/issues/2475...
Solution:
Thanks for the ping, I reopened the issue.

Issue with prompt search

I am trying to save a new version of an existing prompt from the playground. To do so I am using the "Search chat prompts..." box, but the search seems broken and does not work at all (even for the simplest query).
Solution:
Hi @sasha97 - thanks for the report. This is fixed in https://github.com/langfuse/langfuse/pull/2911 and will be released soon.

Extracting scores from traces ?

What is the best way of extracting traces with scores? When I do traces = langfuse.client.trace.list().data I get all the traces, but the scores are UUIDs. What is the best solution for this? Ideally I would like to have them in a CSV. The scores are from RAGAS metrics...
Solution:
You can fetch individual scores via the scores endpoint, or retrieve all of them via the list scores endpoint.
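
A small sketch of pulling scores from the public list endpoint and writing them to a CSV (host, keys and the output file name are placeholders; the public API uses basic auth with the public key as username and the secret key as password):

import pandas as pd
import requests

resp = requests.get(
    "https://cloud.langfuse.com/api/public/scores",
    auth=("pk-lf-...", "sk-lf-..."),  # public key / secret key
    params={"limit": 50},
)
scores = resp.json()["data"]  # each score includes its traceId, name and value

# join with traces from langfuse.client.trace.list() if needed, then export to CSV
pd.DataFrame(scores).to_csv("scores.csv", index=False)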

Is it possible to end a trace?

So I am scraping URLs of a domain and doing some analysis on top of them. For each URL of that domain, I have a generation. Once all URLs are done, I would want to close the trace so that the next domain gets a new trace: 1 trace per domain with 5 generations (1 for each URL of that domain). I don't think it is doable since there is no trace.end(), and I think I have not understood traces....
Solution:
I'd move the trace creation into the function that processes each domain and create a new trace for every domain.
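
For example, with the low-level Python SDK (a sketch; domains, urls_of and analyze are placeholders for your own code), creating a fresh trace inside the per-domain loop gives one trace per domain with one generation per URL:

from langfuse import Langfuse

langfuse = Langfuse()

for domain in domains:
    # new trace for each domain
    trace = langfuse.trace(name="analyze-domain", metadata={"domain": domain})
    for url in urls_of(domain):
        generation = trace.generation(name="analyze-url", input=url)
        result = analyze(url)
        generation.end(output=result)

langfuse.flush()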

User feedback

Hello, I have little experience, so I don't know if this is possible. I have been trying to pass the user feedback (i.e., thumbs up) from Open WebUI to Langfuse. I managed to modify the .py for the pipeline and pass the valves, but the tiny detail is that the user feedback can come a while after the pipeline sends the "info" to Langfuse. Is there a way to force updates to Langfuse if the user feedback changes? (crossposted in Open WebUI channel)...
Solution:
You can add scores to Langfuse before/after a trace is created. They'll then be associated at read time via the UI or GET APIs. The UI will refresh every time you switch back to a Langfuse tab; otherwise you'd need to reload the page.
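
A minimal sketch with the Python SDK, assuming you kept the trace ID the pipeline originally used; the score can be sent whenever the feedback arrives and is associated with the trace at read time:

from langfuse import Langfuse

langfuse = Langfuse()

# trace_id is the id the pipeline used when it first logged the request
langfuse.score(
    trace_id=trace_id,
    name="user-feedback",
    value=1,  # e.g. 1 for thumbs up, 0 for thumbs down
)
langfuse.flush()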

custom generation langchain runnable lambda

Hello, I'm using LangChain (and Fireworks). I'm using the _generate method from ChatFireworks (to be able to set the n parameter) as follows:
RunnableLambda(lambda _messages: model._generate(_messages, **kwargs))
...

LangChain+Ollama+Langfuse Not Recording Token Usage

Hello, I am using ChatOllama from LangChain to send LLM requests to a local Ollama server. I am also integrating Langfuse with LangChain to trace the requests. The generation requests are being successfully traced, including the input and output of the model. However, the token usage is always 0. I attached a screenshot of one trace showing zero token usage. I checked the output of LangChain's invoke method and the usage data is in the response. It's accessible via response.usage_metadata["input_tokens"] and response.usage_metadata["output_tokens"]. I also tried langfuse_context.update_current_observation(usage={"input": response.usage_metadata["input_tokens"], "unit": "TOKENS"}) but it still shows zero....
Solution:
Thanks for reporting this, can you open an issue on GitHub for this? https://langfuse.com/issues

null value traces

Hello, I have a problem with null trace values (input, output). I've tried trace.end, but it says there is no end method on trace. I've changed it to trace.update, but now I do not have trace names on the dashboard...
Solution:
Traces do not have an end method: https://langfuse.com/docs/sdk/python/low-level-sdk
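
A short sketch with the low-level Python SDK (names and payloads are examples): set the name when the trace is created, then set input/output via update() once they are known, since there is no end():

from langfuse import Langfuse

langfuse = Langfuse()

trace = langfuse.trace(name="my-trace", input={"query": "..."})
# ... do the work ...
trace.update(output={"answer": "..."})
langfuse.flush()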

Self-hosted models via API

Hello, could you please tell me how to evaluate models that are self-hosted and exposed via an API?
Solution:
You can use any model with Langfuse. You can either log the usage via the low-level SDKs or, e.g., via the Python decorator. Find an example here: https://langfuse.com/docs/sdk/python/decorators#log-any-llm-call
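
A rough sketch following the linked decorator docs; the endpoint URL, response shape and model name below are placeholders for your own self-hosted API:

import requests
from langfuse.decorators import observe, langfuse_context

@observe(as_type="generation")
def self_hosted_completion(prompt: str) -> str:
    # call your own inference endpoint (placeholder URL and response fields)
    resp = requests.post("https://my-model.internal/v1/generate", json={"prompt": prompt}).json()
    # attach model name, input/output and token usage to the current observation
    langfuse_context.update_current_observation(
        model="my-self-hosted-model",
        input=prompt,
        output=resp["text"],
        usage={"input": resp["prompt_tokens"], "output": resp["completion_tokens"]},
    )
    return resp["text"]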