Hey. Firstly thank you for all the work on Langfuse.
Is there a way, when using Langfuse with LangChain, to exclude some or all of the inputs & outputs from a trace? I am dealing with sensitive data, so I don't want to be logging this in production. Looking through the docs, I'm not sure whether this feature exists.
8 Replies
Just asked the same question in the get-support channel. Seems like a very needed feature.
We use LiteLLM to make API keys that don't log to Langfuse.
Hi, I actually had the same issue yesterday and ended up forking the LangChain CallbackHandler and removing the input and output from the created observations.
This is not perfect and I think there is a better way to do it.
I would be pleased to know if there is something planned in the roadmap about data anonymisation!
For the moment, if you need it, I can share what I've done with the CallbackHandler.
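For anyone reading along, here is a rough sketch of the "remove the input and output from the created observations" idea. It does not use the Langfuse SDK or the real CallbackHandler. The dict shape and the field names (`input`, `output`, `usage`) are assumptions for illustration, and `redact_observation` is a hypothetical helper, not something Langfuse ships.

```python
from copy import deepcopy
from typing import Any, Dict, Iterable

# Hypothetical field names; the real observation payload may differ.
SENSITIVE_FIELDS = ("input", "output")


def redact_observation(
    observation: Dict[str, Any],
    fields: Iterable[str] = SENSITIVE_FIELDS,
) -> Dict[str, Any]:
    """Return a copy of the observation with the sensitive fields blanked out.

    Everything else (name, usage, metadata, ...) is left untouched, and the
    caller's original dict is not modified.
    """
    redacted = deepcopy(observation)
    for field in fields:
        if field in redacted:
            redacted[field] = None
    return redacted


# Made-up example payload, only to show the shape of the result.
example = {
    "name": "summarise",
    "input": "sensitive prompt text",
    "output": "sensitive completion text",
    "usage": {"prompt_tokens": 120, "completion_tokens": 30},
}
print(redact_observation(example))
```

In a forked handler, something like this would presumably run just before the observation is handed to the client, so that usage numbers still go through while the text does not.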
Quick vote: do you want input and output not to be sent to Langfuse at all (in which case no token counts/cost tracking would happen), or just to disable storing them?
On which level would you like to toggle this (project, single generation, Langfuse SDK instance)?
That's a good question.
Wouldn't it be possible to have more granular role control? I think it would be nice to be able to hide the inputs/outputs in Langfuse for a specific role (quite like the current Viewer role).
You mean the token counts/cost tracking will not be inferred, but they can still be ingested, no?
I think disabling storage is the best solution. Token counts & cost tracking are still very valuable, & if someone still has data protection/privacy concerns then that sounds like they should be self-hosting (I am self-hosting for my use case).
In terms of what level to toggle, I think for my use case it would make the most sense to toggle per generation as not all generations will necessarily contain protected data. I imagine toggling per project would also be very valuable for a lot of people as well.
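To make the two levels concrete, here is a minimal sketch of how a per-generation toggle could sit on top of a per-project default. Both names (`project_default`, `generation_override`) and the function itself are hypothetical and only illustrate the precedence, not an existing Langfuse setting.

```python
from typing import Optional


def should_store_io(
    project_default: bool,
    generation_override: Optional[bool] = None,
) -> bool:
    """Decide whether inputs/outputs are stored for one generation.

    An explicit per-generation choice wins; otherwise the project-wide
    default applies.
    """
    if generation_override is not None:
        return generation_override
    return project_default
```

With this shape, a project could store everything by default and individual generations that touch protected data could opt out, or the other way round.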
I think the need might be different between self-hosted and cloud.
In my case, where we are self-hosting Langfuse, our need is mostly to be able to hide the data from certain users, but let them see the trace, the usage, etc.
In my case, it would be best to be able to select whether we want to store inputs/outputs and keep only the token tracing and cost tracking part.