[WIP] logging schema #1
Conversation
Looks great! Some high-level comments:
- What are your thoughts on how to create runs in the Chain/LLM/Tool classes? One way to do this could be to add an optional `RunLogger` callback to each class that creates the relevant Run and logs it to data storage.
- Prompts and the generated output could potentially be pretty large, and are most probably going to be the main memory consumer for the DB. Wdyt about exploring NoSQL options for data storage? If retaining the relational structure is desirable, then another alternative (maybe for the future) is storing the inputs/outputs in object storage, and storing only the object storage URIs in the DB.
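As a minimal sketch of that first suggestion: the `RunLogger` class, its `log_run()` method, and the `Chain` shell below are all hypothetical, not part of this PR.

```python
from typing import Any, Dict, Optional


class RunLogger:
    """Hypothetical callback that creates the relevant Run and persists it."""

    def log_run(
        self,
        run_type: str,
        serialized: Dict[str, Any],
        inputs: Dict[str, Any],
        outputs: Dict[str, Any],
    ) -> None:
        print(run_type, serialized, inputs, outputs)  # stand-in for a DB write


class Chain:
    """Simplified chain shell showing the optional callback wiring."""

    def __init__(self, run_logger: Optional[RunLogger] = None) -> None:
        self.run_logger = run_logger  # optional: None means no logging

    def __call__(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        outputs = self._call(inputs)
        if self.run_logger is not None:
            self.run_logger.log_run(
                "chain", {"name": self.__class__.__name__}, inputs, outputs
            )
        return outputs

    def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        raise NotImplementedError
```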
langchain/logging/logger.py
Outdated
```python
success = Column(Boolean, default=True)
extra = Column(JSON, default={})
execution_order = Column(Integer, default=1)
serialized = Column(JSON)
```
Nice! Some thoughts & questions about the `RunMixin` schema:
- Would be great to add `duration` and `error_msg` fields to the schema.
- What's the scope of `execution_order`? Are runs expected to have globally unique `execution_order`s, or is it going to be unique for runs of one chain/tool/llm object?
- How do you plan to propagate `success` to parent runs?
- Yup! Was thinking of editing the schema to add `start_time` and `end_time`. Still thinking through the error logging stuff, but yes, we'll need an `error` field.
- The latter. Execution order only comes into play when there are nested runs (i.e. LLMChain -> LLM) and specifies the order within the nested run.
- The parent run will be successful iff all of the child runs succeed. By having each child run carry its own `success` field, it's easy to see exactly which parts of a long nested chain succeed and which ones fail. Hoping to build out the logger abstraction a bit more today to make this clearer.
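Putting this thread together, a sketch of what the amended `RunMixin` could look like; only the field names come from this discussion, the `DateTime`/`String` column types are assumptions:

```python
from sqlalchemy import JSON, Boolean, Column, DateTime, Integer, String


class RunMixin:
    start_time = Column(DateTime)           # duration can be derived: end_time - start_time
    end_time = Column(DateTime)
    error = Column(String, nullable=True)   # populated only for failed runs
    success = Column(Boolean, default=True)
    extra = Column(JSON, default={})
    # Unique only within one nested run (e.g. LLMChain -> LLM), not globally.
    execution_order = Column(Integer, default=1)
    serialized = Column(JSON)
```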
We are probably going to have a concept of a prompt template and a prompt store to hold those templates/versions. How do these logs interact with that?

Also, I think the notion of "projects" or something similar may be useful. E.g., as we have multiple people doing multiple runs, I imagine one common viewing pattern would be "give me all logs for this experiment I ran", so the ability to easily separate logs by projects/tags seems useful.

Additionally, there is probably extra metadata that people may want to log, e.g. the user id for the user input, or the session id (if there is a concept of sessions that comprise multiple chain runs, it could be useful to group those). We should probably figure out (1) which fields, if any, we want to make columns vs. just putting in some standard metadata field, and (2) how to pass and store that metadata field.

I see you put some logging in the search API; I'm assuming that is representative of the tool. The issue with that is that tools can currently just be a user-defined function. I know the two options we had talked about were (1) having a Tool interface, or (2) just putting the logging in the agent. Do you have any updated thoughts on this?
This is true, but it's missing some of the nuance: the agent is invoking an LLMChain (potentially multiple times) in order to get a specifically formatted output that it can parse into an "Action" and "Action Input", and it then receives an observation. There is sneakily a lot of logic in the agent abstraction (besides just a single LLM call), like multiple LLM calls, parsing, etc., and in many ways the fact that the agent is just an LLMChain deciding what to do is an implementation detail; in the future it could be multiple LLMChains, etc. I think we could probably benefit from trying to surface this structure a bit more in our logs. Again, maybe this is just naming, and we could call it an AgentActionLog instead of a tool log or something.
Some additional context (which I think I told @agola11 but maybe not @ShreyaR): currently there is a logging abstraction in LangChain, https://github.com/hwchase17/langchain/blob/master/langchain/logger.py, with a StdOut logger implementation (e.g. just printing). The reason this abstraction exists, and with the given interface, is that I was working with Sam Whitmore and she basically needed some logging functionality to use LangChain, so I added it in. Note that the logging abstraction for the LLM (inputs, prompt after formatting, outputs) is EXACTLY the schema she was using to log things (the agent format was made up, but was one that played nicely with how I wanted to print things). The way she used this was with a TestLogger she filled with logic to log to her DB, but the core pieces are:

- Set the logger globally, so you don't have to pass it to every chain.
- This was the method we used to pass context (id, user ids, session id, etc.) around: put it in a global variable for the scope of that run. This has the nice benefit of not having to remember to pipe this context through all the places in the chain (which can be very easy to forget to do, is a pain for developers, and complicates the interfaces a good bit).

We can obviously change this up, and should think about doing so, but I think some learnings were: (1) she really liked this logging format for LLM calls, e.g. having the LLM inputs / formatted prompt / output all in one place was super helpful (that being said, she was viewing the raw logs, so if we added some UI that combined all this into one, maybe that's resolved); (2) passing loggers/context through all the chains, and remembering to do so, was a medium-large pain, so setting them globally was helpful for that.
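As a rough illustration of those two core pieces (the names here are made up, not the actual langchain/logger.py API):

```python
from typing import Optional


class BaseLogger:
    """Interface the user fills in; methods are no-ops by default."""

    def log_llm_inputs(self, inputs: dict, prompt: str) -> None:
        """Called with the chain inputs and the prompt after formatting."""

    def log_llm_response(self, output: str) -> None:
        """Called with the raw LLM output."""


_global_logger: Optional[BaseLogger] = None  # one logger for the scope of a run


def set_logger(logger: BaseLogger) -> None:
    """Set the logger globally, so it doesn't have to be passed to every chain."""
    global _global_logger
    _global_logger = logger


def get_logger() -> BaseLogger:
    return _global_logger if _global_logger is not None else BaseLogger()


class DBLogger(BaseLogger):
    """User-defined logger; run context lives on the logger, not the chains."""

    def __init__(self, user_id: str, session_id: str) -> None:
        # Context (user id, session id, etc.) is set once per run in a global,
        # so it never has to be piped through the chain interfaces.
        self.user_id = user_id
        self.session_id = session_id

    def log_llm_inputs(self, inputs: dict, prompt: str) -> None:
        print(f"[{self.session_id}] inputs={inputs} prompt={prompt!r}")


# Set once at the start of a run; every chain picks it up via get_logger().
set_logger(DBLogger(user_id="u123", session_id="s456"))
```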
See my response to Shreya's comment: the prompt template is part of the LLM, which would be part of the serialized objects, I assume. This and the prompt can be stored in blob storage, and the handle can be included in the DB. All of the things you mentioned, i.e. projects, sessions, users, etc., can be stored as part of the data model and have relationships to the Run entities in the DB. I'm leaning towards just putting the logging in the agent.
Gotcha, yeah, we can do some renaming to better highlight this abstraction. Can also include a field in
is there any value in passing the id of the llm/chain/agent to the log_*_end functions and validating the id matches the one in the stack?
```diff
@@ -45,7 +46,10 @@ def generate(
     ) -> LLMResult:
         """Run the LLM on the given prompt and input."""
         if langchain.llm_cache is None:
-            return self._generate(prompts, stop=stop)
+            get_logger().log_llm_run_start({"name": self.__class__.__name__}, prompts)
+            output = self._generate(prompts, stop=stop)
```
we also call `_generate` down below, in the case where the cache exists but some prompts aren't in it; will want to put the logging there as well
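A self-contained sketch (not the PR's actual code) of one way to cover both call sites: log once around the whole `generate()` body. The stub logger and dict cache below stand in for the real ones.

```python
from typing import Any, Dict, List, Optional


class StubLogger:
    def log_llm_run_start(self, serialized: Dict[str, Any], prompts: List[str]) -> None:
        print("start:", serialized, prompts)

    def log_llm_run_end(self, response: Any) -> None:
        print("end:", response)


def get_logger() -> StubLogger:  # stand-in for langchain.logging.get_logger()
    return StubLogger()


llm_cache: Optional[Dict[str, str]] = {}  # stand-in for langchain.llm_cache


class LLM:
    def _generate(self, prompts: List[str], stop: Optional[List[str]] = None) -> List[str]:
        return [p.upper() for p in prompts]  # fake completions

    def generate(self, prompts: List[str], stop: Optional[List[str]] = None) -> List[str]:
        # One start/end pair around the whole method covers both _generate
        # call sites, including the partial-cache path below.
        get_logger().log_llm_run_start({"name": self.__class__.__name__}, prompts)
        if llm_cache is None:
            output = self._generate(prompts, stop=stop)
        else:
            # Cache exists but some prompts aren't in it: generate only those.
            missing = [p for p in prompts if p not in llm_cache]
            output = self._generate(missing, stop=stop)
        get_logger().log_llm_run_end(output)
        return output
```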
langchain/logging/sqlite.py
Outdated
```python
    def _end_log_run(self) -> None:
        """Call at the end of a run."""
        self._execution_order += 1
```
so execution order is based on the ending? wouldn't it be more intuitive to have it based on the start?
I think either one works
Actually, execution order in this convention is helpful for pretty printing in the correct order, so I'd like to keep it this way
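To make the convention concrete, a toy illustration (not the PR's tracer): incrementing at the end means nested children, which finish first, get lower numbers than the runs containing them.

```python
from typing import List, Tuple


class Tracer:
    def __init__(self) -> None:
        self._execution_order = 1
        self.finished: List[Tuple[str, int]] = []

    def _end_log_run(self, name: str) -> None:
        """Call at the end of a run: record its order, then increment."""
        self.finished.append((name, self._execution_order))
        self._execution_order += 1


tracer = Tracer()
# An LLMChain run starts and invokes a nested LLM run; the LLM ends first.
tracer._end_log_run("LLM")       # gets execution_order == 1
tracer._end_log_run("LLMChain")  # gets execution_order == 2
print(tracer.finished)           # [('LLM', 1), ('LLMChain', 2)]
```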
okay, some comments on usage pattern:
1. (nit) Seems like some stuff is slightly off on the return? `outputs` has an `outputs` key, but `inputs` doesn't?
2. `get_chain_runs` returns all chain runs regardless of whether they are at the same level or not. E.g. I run it once and that function returns 2 chain runs (because one is a sub-run of another). I think my query patterns will be: (a) show me all chain runs at the same level, or (b) show me all chain runs (and just all invocations in general) in this particular chain run. See the query sketch after this list.
3. More of a structure nit, but `get_chain_runs` probably doesn't belong on the logger (it's not the job of the logger to fetch results).
4. Would be good to figure out what the base logger abstraction should be and get that into langchain, so we can just iterate on the logging class but use the up-to-date langchain codebase (it's already a bit out of date).

Comment #2 is probably the only big blocker in the short term.
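A hypothetical query for pattern (a) above: the `ChainRun` model and `parent_run_id` column are assumptions for illustration, not part of this PR's schema.

```python
from typing import List

from sqlalchemy import Column, Integer
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class ChainRun(Base):
    __tablename__ = "chain_runs"
    id = Column(Integer, primary_key=True)
    parent_run_id = Column(Integer, nullable=True)  # NULL => top-level run


def get_top_level_chain_runs(session: Session) -> List[ChainRun]:
    # Pattern (a): only runs with no parent, i.e. all at the same (top) level.
    return session.query(ChainRun).filter(ChainRun.parent_run_id.is_(None)).all()
```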
```python
    @abstractmethod
    def log_llm_run_start(
        self, serialized: Dict[str, Any], prompts: List[str], **extra: str
```
is Dict[str, Any] the best form?
what goes in extra?
Not sure what the best type is for serialized objects. I was assuming they get serialized as JSON, in which case `Dict[str, Any]` could work.
`extra` is for any extra fields that you might want to store. Left it in there for added flexibility, but it's not used rn.
"""Log the start of an LLM run.""" | ||
|
||
@abstractmethod | ||
def log_llm_run_end(self, response: Dict[str, Any], error=None) -> None: |
practically, is error ever used currently?
not sure response is the best type; should it just be LLMResult?
error is not used rn, but in the future, it would be good to log the error of a particular run if it didn't succeed.
Agree on LLMResult
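A sketch of the agreed-upon signature, with `response` typed as `LLMResult` and `error` made explicit; `LLMResult` is stubbed here since the thread doesn't show its import path.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class LLMResult:  # stand-in for the real LangChain class
    generations: List[List[str]] = field(default_factory=list)


class BaseLogger(ABC):
    @abstractmethod
    def log_llm_run_start(
        self, serialized: Dict[str, Any], prompts: List[str], **extra: str
    ) -> None:
        """Log the start of an LLM run."""

    @abstractmethod
    def log_llm_run_end(
        self, response: LLMResult, error: Optional[Exception] = None
    ) -> None:
        """Log the end of an LLM run, recording the error if it failed."""
```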
```python
        self,
        serialized: Dict[str, Any],
        action: str,
        inputs: Dict[str, Any],
```
right now, tool inputs/outputs are assumed to be a single string rather than multiple things. if we want to make multiple here, we should change in code as well
Makes sense, I'll change to single string
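A minimal sketch of the single-string version; the `log_tool_run_end` method and any parameter names beyond those in the diff are illustrative.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional


class ToolLoggerMixin(ABC):
    @abstractmethod
    def log_tool_run_start(
        self, serialized: Dict[str, Any], action: str, tool_input: str, **extra: str
    ) -> None:
        """Log the start of a tool run; tool_input is a single string."""

    @abstractmethod
    def log_tool_run_end(self, output: str, error: Optional[Exception] = None) -> None:
        """Log the end of a tool run; output is a single string."""
```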
See temporary script tests/unit_tests/logging/agent_log.py as a demonstration of how persisted trace logging could work: `langchain.logging.get_logger()` returns a shared logger instance.

Output of tracing as JSON: https://gist.github.com/agola11/ab7d5276973ccc6d6e0bc68a30bb119a

Next steps:
- Wrap this logic in an easy-to-use logger interface
- Create example script to demonstrate usage
- Add log retrieval methods