Extract data from Langfuse
I was just trying to pull data from my project for post-processing. I'm also looking for a way to back up my project. Any suggestions? Might there be a way to download a whole project to a file and upload/restore it in case the existing project gets compromised/corrupted?
What are your plans for post-processing? Do you already have something in place?
Well, for example, suppose i wanted to create a QA dataset from traces that had good feedback
It depends a bit on your setup: Are you self-hosting? If yes, you could just export the pg database.
If you use the hosted version, we have measures like point in time recovery in place.
yes self-hosting
In that case it depends on which features your PG provider has in place. If you want to do things manually, pg_dump (https://www.postgresql.org/docs/current/app-pgdump.html) might be the way to go
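For anyone landing here later, a manual backup along those lines could look roughly like the sketch below. It only wraps the standard pg_dump CLI; the database name, user, and container name are placeholders (assumptions, not values from this thread), and it assumes pg_dump can authenticate without a password prompt. The optional container argument is for the case where Postgres runs in Docker and pg_dump is called through `docker exec`.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_pg_dump_command(
    db_name: str, user: str, container: Optional[str] = None
) -> List[str]:
    """Build a pg_dump command that writes a custom-format archive to stdout."""
    cmd = ["pg_dump", "--username", user, "--format", "custom", db_name]
    if container:
        # Run pg_dump inside the (hypothetical) Postgres container.
        cmd = ["docker", "exec", container] + cmd
    return cmd


def backup_database(
    out_file: Path, db_name: str, user: str, container: Optional[str] = None
) -> None:
    """Run pg_dump and store its output in out_file on the host."""
    cmd = build_pg_dump_command(db_name, user, container)
    with out_file.open("wb") as fh:
        # check=True raises CalledProcessError if pg_dump exits non-zero.
        subprocess.run(cmd, stdout=fh, check=True)


if __name__ == "__main__":
    # Placeholder names; replace with your own database, user, and container.
    backup_database(
        Path("langfuse_backup.dump"), "langfuse", "postgres", container="my-postgres"
    )
```

A custom-format archive like this is meant to be restored with pg_restore into an empty database, so it is worth testing a restore once before relying on the backup.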
TablePlus | Modern, Native Tool for Database Management.
Modern, native client with intuitive GUI tools to create, access, query & edit multiple relational databases: MySQL, PostgreSQL, SQLite, Microsoft SQL Server, Amazon Redshift, MariaDB, CockroachDB, Vertica, Cassandra, and Redis.
yeah i'm just using a postgres docker image
on your local machine?
yes
Ahh ok in this case, i guess tableplus is an easy setup
ah, cool. i'll look into that. thanks
Thanks! Are you using our scores to find out which one was good? And also what are you doing with the dataset?
I'm setting up a model with RAG for a demo with ~20 users in the next few weeks. I have feedback set up on a JS webapp that sends scores to Langfuse. After the demo I'm hoping to examine the data, get a sense for what needs to be improved, and potentially make a QA dataset from it.
Thanks for all the hard work by the way. Having a tool like this is indispensable from my perspective. And being able to self-host is so critical, as data privacy is a major concern in my domain (healthcare).
Oh yes, healthcare is tough on that. You're welcome, we enjoy the ride 🙂
I see. So basically having a dataset based on scores would help you to test your application and see whether it improves, correct?
Yeah i think being able to filter by user and score will be important
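To make the "filter by user and score" idea concrete, here is a minimal sketch of the post-processing step once the traces have been exported. The record shape (`user_id`, `input`, `output`, and a list of `scores` with a `value`) is a hypothetical layout chosen for illustration, not the actual Langfuse schema, so the field names would need adapting to whatever the export really contains.

```python
from typing import Any, Dict, Iterable, List, Optional


def build_qa_dataset(
    traces: Iterable[Dict[str, Any]],
    min_score: float,
    user_id: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Keep traces whose best score is at least min_score, optionally for one user."""
    dataset = []
    for trace in traces:
        if user_id is not None and trace.get("user_id") != user_id:
            continue
        values = [score["value"] for score in trace.get("scores", [])]
        # Traces without any feedback are skipped rather than treated as good.
        if not values or max(values) < min_score:
            continue
        dataset.append({"question": trace["input"], "answer": trace["output"]})
    return dataset


# Made-up example records in the hypothetical shape described above.
example_traces = [
    {"user_id": "u1", "input": "Q1", "output": "A1", "scores": [{"value": 1}]},
    {"user_id": "u2", "input": "Q2", "output": "A2", "scores": [{"value": 0}]},
    {"user_id": "u1", "input": "Q3", "output": "A3", "scores": []},
]
good_pairs = build_qa_dataset(example_traces, min_score=1)
```

Using the best score per trace is just one possible choice; averaging the scores or requiring every score to pass would be equally reasonable depending on how feedback is collected.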