James Coates, 7mo ago

Hey. Firstly, thank you for all the work on Langfuse. When using Langfuse with LangChain, is there a way to exclude some or all of the inputs & outputs from a trace? I am dealing with sensitive data, so I don't want to log it in production. Looking through the docs, I'm not sure whether this feature exists.
8 Replies
Agnieszka Mikołajczyk
I just asked the same question in the get-support channel. Seems like a much-needed feature.
dave, 7mo ago
We use LiteLLM to make API keys that don't log to Langfuse.
ladislas14, 7mo ago
Hi, I actually had the same issue yesterday and ended up forking the LangChain CallbackHandler and removing the input and output from the created observations. This is not perfect, and I think there is a better way to do it. I would be pleased to know if there is something planned in the roadmap about data anonymisation! For the moment, if you need it, I can share what I've done with the CallbackHandler.
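To make the idea concrete, here is a minimal sketch of stripping input and output from an observation before it is sent, while keeping usage metadata. It is not the forked CallbackHandler mentioned above and not Langfuse's or LangChain's API: the dict shape, the field names, and the `redact_observation` helper are all hypothetical and only illustrate the approach.

```python
from typing import Any

# Hypothetical field names; a real handler may name these differently.
REDACTED_FIELDS = ("input", "output")


def redact_observation(observation: dict[str, Any], placeholder: Any = None) -> dict[str, Any]:
    """Return a shallow copy with sensitive fields replaced by a placeholder."""
    cleaned = dict(observation)  # copy so the caller's dict is left untouched
    for field in REDACTED_FIELDS:
        if field in cleaned:
            cleaned[field] = placeholder
    return cleaned


# Example payload (made-up values, for illustration only).
observation = {
    "name": "chat-completion",
    "input": "sensitive prompt text",
    "output": "sensitive model reply",
    "usage": {"prompt_tokens": 120, "completion_tokens": 45},
}

safe = redact_observation(observation)
```

In this sketch the usage counts are assumed to be computed on the client side, so they survive redaction. Whether token counts and cost can still be tracked when the text is never sent depends on where they are computed.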
Marc, 7mo ago
Quick vote: do you want to prevent input and output from being sent to Langfuse at all (in which case no token counts or cost tracking can happen), or just disable storing them? At which level would you like to toggle this (project, single generation, Langfuse SDK instance)?
ladislas14, 7mo ago
That's a good question. Wouldn't it be possible to have more granular role control? I think it would be nice to be able to hide the inputs/outputs in Langfuse for a specific role (much like the current Viewer role). You mean the token counts/cost tracking will not be inferred, but they can still be ingested, no?
James Coates, 7mo ago
I think disabling storage is the best solution. Token counts & cost tracking are still very valuable, and if someone still has data protection/privacy concerns then it sounds like they should be self-hosting (I am self-hosting for my use case). In terms of what level to toggle at, I think for my use case it would make the most sense to toggle per generation, as not all generations will necessarily contain protected data. I imagine toggling per project would also be very valuable for a lot of people.
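As a rough sketch of how a per-project default combined with a per-generation override could be resolved, assuming nothing about how Langfuse would actually implement it (the `StoragePolicy` class and its names are hypothetical):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoragePolicy:
    """Hypothetical project-level default for storing inputs/outputs."""

    store_io_by_default: bool = True

    def should_store_io(self, generation_override: Optional[bool] = None) -> bool:
        # A per-generation setting, when given, wins over the project default.
        if generation_override is None:
            return self.store_io_by_default
        return generation_override


policy = StoragePolicy(store_io_by_default=True)
decisions = {
    # No override: falls back to the project default.
    "public_faq": policy.should_store_io(),
    # This generation handles protected data, so storage is switched off.
    "medical_summary": policy.should_store_io(generation_override=False),
}
```

Token counts and cost would be tracked either way in this sketch; only the stored text is affected by the decision.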
ladislas14, 7mo ago
I think the need might be different between self-hosted and cloud. In my case, where we are self-hosting Langfuse, our need is mostly to be able to hide the data from certain users, but still let them see the trace, the usage, etc.
Agnieszka Mikołajczyk
In my case, it would be best to be able to select whether we want to store inputs/outputs and keep only the token tracing and cost tracking part.