
[WIP] logging schema #1

Closed
wants to merge 24 commits into from

Conversation

@agola11 (Owner) commented Dec 20, 2022

See the temporary script tests/unit_tests/logging/agent_log.py as a demonstration of how persisted trace logging could work:

langchain.logging.get_logger() returns a shared logger instance

Output of tracing as JSON:

https://gist.github.com/agola11/ab7d5276973ccc6d6e0bc68a30bb119a
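As a rough illustration of the shared-logger pattern, the sketch below uses assumed names modeled on the methods discussed later in this PR; it is not the exact code in this change:

from typing import Any, Dict, List, Optional

class TraceLogger:
    """Collects run records in memory; a persisted implementation would write to a DB."""

    def __init__(self) -> None:
        self.runs: List[Dict[str, Any]] = []

    def log_llm_run_start(self, serialized: Dict[str, Any], prompts: List[str]) -> None:
        self.runs.append({"type": "llm", "serialized": serialized, "prompts": prompts})

    def log_llm_run_end(self, response: Dict[str, Any], error: Optional[str] = None) -> None:
        # Assumes a matching start was logged for the current run.
        self.runs[-1].update({"response": response, "error": error})

_shared_logger: Optional[TraceLogger] = None

def get_logger() -> TraceLogger:
    """Return the shared logger instance, creating it on first use."""
    global _shared_logger
    if _shared_logger is None:
        _shared_logger = TraceLogger()
    return _shared_logger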

Next steps:

  • Wrap this logic in an easy-to-use logger interface
  • Create example script to demonstrate usage
  • Add log retrieval methods
  • Clean up PR
  • Add tests
  • Think about how to support parallel logging runs
  • Work on any other suggestions
  • Discuss strategy for logging

@ShreyaR left a comment

Looks great! Some high-level comments:

  1. What are your thoughts on how to hook run creation into the Chain/LLM/Tool classes? One way to do this could be to add an optional RunLogger callback for each class that creates the relevant Run and logs it to data storage.
  2. Prompts and the generated output could potentially be pretty large, and are most probably going to be the main storage consumers for the DB. Wdyt about exploring NoSQL options for data storage? If retaining the relational structure is desirable, then another alternative (maybe for the future) is storing the inputs/outputs in object storage, and storing only the object storage URIs in the DB.
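For concreteness, the optional RunLogger callback idea in point 1 could look roughly like this; the names and signatures here are hypothetical, not part of this PR:

from abc import ABC, abstractmethod
from typing import Any, Dict, Optional

class RunLogger(ABC):
    """Hypothetical callback that turns start/end events into persisted Run records."""

    @abstractmethod
    def on_run_start(self, serialized: Dict[str, Any], inputs: Dict[str, Any]) -> None:
        ...

    @abstractmethod
    def on_run_end(self, outputs: Dict[str, Any], error: Optional[str] = None) -> None:
        ...

class Chain:
    """Sketch of a chain that accepts an optional per-instance RunLogger."""

    def __init__(self, run_logger: Optional[RunLogger] = None) -> None:
        self.run_logger = run_logger

    def __call__(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        if self.run_logger is not None:
            self.run_logger.on_run_start({"name": self.__class__.__name__}, inputs)
        outputs = self._call(inputs)  # subclasses implement the actual chain logic
        if self.run_logger is not None:
            self.run_logger.on_run_end(outputs)
        return outputs

    def _call(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        return inputs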

success = Column(Boolean, default=True)
extra = Column(JSON, default={})
execution_order = Column(Integer, default=1)
serialized = Column(JSON)

Nice! Some thoughts & questions about the RunMixin schema:

  1. It would be great to add duration and error_msg fields to the schema.
  2. What's the scope of execution_order? Are runs expected to have globally unique execution_orders, or is it going to be unique for runs of one chain/tool/llm object?
  3. How do you plan to propagate success to parent runs?

Owner Author

  1. Yup! I was thinking of editing the schema to add start_time and end_time. Still thinking through the error logging, but yes, we'll need an error field.
  2. The latter. Execution order only comes into play when there are nested runs (e.g. LLMChain -> LLM) and specifies the order within the nested run.
  3. The parent run will be successful iff all of the child runs succeed. By having each child run have its own success field, it's easy to see exactly which parts of a long nested chain succeed and which ones fail. Hoping to build out the logger abstraction a bit more today to make this clearer.
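As a concrete illustration, the RunMixin columns with those additions might look roughly like this (assumed field names and types; the final schema may differ):

from sqlalchemy import Boolean, Column, DateTime, Integer, JSON, String

class RunMixin:
    """Sketch of the run schema with the suggested timing and error fields added."""

    start_time = Column(DateTime)           # replaces a single duration field
    end_time = Column(DateTime)
    error = Column(String, nullable=True)   # populated when the run does not succeed
    success = Column(Boolean, default=True)
    extra = Column(JSON, default={})
    execution_order = Column(Integer, default=1)
    serialized = Column(JSON)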

@hwchase17 (Collaborator)

We are probably going to have a concept of a prompt template and a prompt store to hold those templates/versions - how do these logs interact with that?

Also, I think the notion of "projects" or something similar may be useful - e.g. as we have multiple people doing multiple runs, I imagine one common viewing pattern would be "give me all logs for this experiment I ran", so the ability to easily separate logs by projects/tags seems useful.

Additionally, there is probably additional metadata that people may want to log - e.g. the user id for the user input, or the session id (if there is a concept of sessions that have multiple chain runs, it could be useful to group those). We should probably figure out (1) which, if any, we want to make columns vs. just putting them in some standard metadata field, and (2) how to pass and store that metadata field.

I see you put some logging in the search API; I'm assuming that is representative of the tools in general. The issue with that is that tools can currently be just a user-defined function. The two options we had talked about were (1) having a Tool interface, or (2) just putting the logging in the agent. Do you have any updated thoughts on this?

@hwchase17 (Collaborator)

Note that an agent is just a chain that repeatedly invokes an LLMChain and Tools

This is true, but it's missing some of the nuance: the agent is invoking an LLMChain (potentially multiple times) in order to get a specifically formatted output that it can parse into an "Action" and "Action Input", and it then receives an observation. There is sneakily a lot of logic in the agent abstraction (besides just a single LLM call) - multiple LLM calls, parsing, etc. - and in many ways the fact that the agent uses an LLMChain to decide what to do is an implementation detail; in the future it could be multiple LLMChains, etc.

I think we could probably benefit from trying to surface this structure a bit more in our logs. Again, maybe this is just naming, and calling it an AgentActionLog instead of a tool log or something.

@hwchase17 (Collaborator)

Some additional context (which I think I told @agola11 but maybe not @ShreyaR):

Currently there is a logging abstraction in LangChain, https://github.com/hwchase17/langchain/blob/master/langchain/logger.py, with a StdOut logger implementation (e.g. just printing).

The reason this abstraction exists, with the given interface, is that I was working with Sam Whitmore and she basically needed some logging functionality to use LangChain, so I added it in. Note that the logging abstraction for the LLM (inputs, prompt after formatting, outputs) is EXACTLY the schema she was using to log things (the agent format was made up, but was one that played nicely with how I wanted to print things).

And then the way she used it was:

import contextlib

# Global context shared with the logger for the duration of a single run.
CONTEXT = {}

@contextlib.contextmanager
def set_context(**kwargs):
    global CONTEXT
    CONTEXT = kwargs
    yield
    CONTEXT = {}

from langchain.logger import BaseLogger

class TestLogger(BaseLogger):
    def log_llm_inputs(self, inputs: dict, prompt: str, **kwargs):
        # Read whatever context (id, user id, session id, ...) was set for this run.
        global CONTEXT
        print(CONTEXT["id"])

import langchain
langchain.logger = TestLogger()

with set_context(id="foo"):
    react.run(question)

She filled TestLogger with logic to log to her DB, but the core pieces are:

import langchain
langchain.logger = TestLogger()

This sets the logger globally, so you don't have to pass it to every chain.

with set_context(id="foo"):
    react.run(question)

This was the method we used to pass context (id, user ids, session id, etc.) around: put it in a global variable for the scope of that run. This has the nice benefit of not having to remember to pipe this context through all the places in the chain (which can be very easy to forget to do, is a pain for developers, and complicates the interfaces a good bit).

We can obviously change this up, and should think about doing so, but I think some learnings were: (1) she really liked this logging format for LLM calls, e.g. having the LLM inputs / formatted prompt / output all in one place was super helpful (that being said, she was viewing the raw logs, so if we added some UI that combined all this into one view, maybe that's resolved); (2) passing loggers/context through all the chains, and remembering to do so, was a medium-large pain, so setting them globally was helpful for that.

@agola11 (Owner, Author) commented Dec 20, 2022

Looks great! Some high-level comments:

  1. What are your thoughts on how to hook run creation into the Chain/LLM/Tool classes? One way to do this could be to add an optional RunLogger callback for each class that creates the relevant Run and logs it to data storage.
  2. Prompts and the generated output could potentially be pretty large, and are most probably going to be the main storage consumers for the DB. Wdyt about exploring NoSQL options for data storage? If retaining the relational structure is desirable, then another alternative (maybe for the future) is storing the inputs/outputs in object storage, and storing only the object storage URIs in the DB.

@ShreyaR:

  1. Going to toy around with this today. I think the easiest approach would be to have two calls to the logger: one at the beginning and one at the end of each __call__ function for LLM and Chain, and then within Agent._call for tools.
  2. Yup, definitely. The plan is to store the raw prompts in object or blob storage somewhere, then include the handles in the DB. NoSQL could work too, but I think relational is a bit better because it 1) makes it easy to fetch individual runs without fetching the entire nested composition and processing it (i.e. more granular data access), 2) enforces a standard schema/fields, and 3) has better support for transactions/rollbacks and safeguards in general.

@agola11 (Owner, Author) commented Dec 20, 2022

We are probably going to have a concept of a prompt template and a prompt store to hold those templates/versions - how do these logs interact with that?

Also, I think the notion of "projects" or something similar may be useful - e.g. as we have multiple people doing multiple runs, I imagine one common viewing pattern would be "give me all logs for this experiment I ran", so the ability to easily separate logs by projects/tags seems useful.

Additionally, there is probably additional metadata that people may want to log - e.g. the user id for the user input, or the session id (if there is a concept of sessions that have multiple chain runs, it could be useful to group those). We should probably figure out (1) which, if any, we want to make columns vs. just putting them in some standard metadata field, and (2) how to pass and store that metadata field.

I see you put some logging in the search API; I'm assuming that is representative of the tools in general. The issue with that is that tools can currently be just a user-defined function. The two options we had talked about were (1) having a Tool interface, or (2) just putting the logging in the agent. Do you have any updated thoughts on this?

See my response to Shreya's comment: the prompt template is part of the LLM, which I assume would be part of the serialized objects. This and the prompt can be stored in blob storage, and the handle can be included in the DB.

All of the things you mentioned (projects, sessions, users, etc.) can be stored as part of the data model and have relationships to the Run entities in the DB.
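For illustration only, a grouping entity such as a session or project could relate to chain runs along these lines (hypothetical table and field names, not this PR's schema):

from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Session(Base):
    """Hypothetical grouping entity; a project or user table would look similar."""

    __tablename__ = "sessions"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    chain_runs = relationship("ChainRun", back_populates="session")

class ChainRun(Base):
    __tablename__ = "chain_runs"
    id = Column(Integer, primary_key=True)
    session_id = Column(Integer, ForeignKey("sessions.id"))
    session = relationship("Session", back_populates="chain_runs")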

I'm leaning towards just putting the logging in Agent._call. Will play around with it today.

@agola11 (Owner, Author) commented Dec 20, 2022

This is true, but it's missing some of the nuance: the agent is invoking an LLMChain (potentially multiple times) in order to get a specifically formatted output that it can parse into an "Action" and "Action Input", and it then receives an observation. There is sneakily a lot of logic in the agent abstraction (besides just a single LLM call) - multiple LLM calls, parsing, etc. - and in many ways the fact that the agent uses an LLMChain to decide what to do is an implementation detail; in the future it could be multiple LLMChains, etc.

I think we could probably benefit from trying to surface this structure a bit more in our logs. Again, maybe this is just naming, and calling it an AgentActionLog instead of a tool log or something.

Gotcha, yeah, we can do some renaming to better highlight this abstraction. Can also include a field in ChainRun to indicate whether the chain is running as an agent or not. Let's talk about this more when we sync.

@hwchase17 (Collaborator) left a comment

Is there any value in passing the id of the LLM/chain/agent to the log_*_end functions and validating that the id matches the one in the stack?

@@ -45,7 +46,10 @@ def generate(
) -> LLMResult:
"""Run the LLM on the given prompt and input."""
if langchain.llm_cache is None:
return self._generate(prompts, stop=stop)
get_logger().log_llm_run_start({"name": self.__class__.__name__}, prompts)
output = self._generate(prompts, stop=stop)
Collaborator

We also call generate down below when the cache exists but some things aren't in the cache; we'll want to put the logging there as well.

def _end_log_run(self) -> None:
"""Call at the end of a run."""

self._execution_order += 1
Collaborator

So execution order is based on the ending? Wouldn't it be more intuitive to have it based on the start?

Owner Author

I think either one works

Owner Author

Actually, execution order in this convention is helpful for pretty printing in the correct order, so I'd like to keep it this way

@agola11 (Owner, Author) commented Dec 21, 2022

Is there any value in passing the id of the LLM/chain/agent to the log_*_end functions and validating that the id matches the one in the stack?

The id actually isn't generated until the session gets committed; plus, the id would be another parameter that the user has to keep track of and pass in to the log_*_end call.

@hwchase17 (Collaborator) left a comment

Okay, some comments on the usage pattern:

1 (nit): seems like some stuff is slightly off on the return? outputs has an outputs key, but inputs doesn't?

Screen Shot 2022-12-21 at 5 21 40 PM

  1. get_chain_runs returns all chain runs regardless of whether they are at the same level or not. E.g. I run it once and that function returns 2 chain runs (because one is a sub-run of another). I think my query patterns will be: (a) show me all chain runs at the same level, or (b) show me all chain runs (and just all invocations in general in this particular chain run).

  2. More of a structure nit, but get_chain_runs probably doesn't belong on the logger (it's not the job of the logger to fetch results).

  3. Would be good to figure out what the base logger abstraction should be and get that into langchain, so we can just iterate on the logging class but use the up-to-date langchain codebase (this is already a bit out of date).

@hwchase17 (Collaborator)

Comment #2 is probably the only big blocker in the short term.


@abstractmethod
def log_llm_run_start(
self, serialized: Dict[str, Any], prompts: List[str], **extra: str
Collaborator

Is Dict[str, Any] the best form?

What goes in extra?

@agola11 (Owner, Author) commented Dec 23, 2022

Not sure what the best type is for serialized objects. I was assuming they get serialized as JSON, in which case Dict[str, Any] could work.

Extra is for any extra fields that you might want to store. I left it in there for added flexibility, but it's not used right now.

"""Log the start of an LLM run."""

@abstractmethod
def log_llm_run_end(self, response: Dict[str, Any], error=None) -> None:
Collaborator

Practically, is error ever used currently?

Collaborator

Not sure response is the best type; should it just be LLMResult?

Owner Author

error is not used right now, but in the future it would be good to log the error of a particular run if it didn't succeed.

Agree on LLMResult
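For illustration, the revised method might look something like this (a sketch; the LLMResult import path is an assumption):

from abc import ABC, abstractmethod
from typing import Optional

from langchain.schema import LLMResult  # assumed location of LLMResult

class BaseLogger(ABC):
    @abstractmethod
    def log_llm_run_end(self, response: LLMResult, error: Optional[str] = None) -> None:
        """Log the end of an LLM run, recording the error if it did not succeed."""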

self,
serialized: Dict[str, Any],
action: str,
inputs: Dict[str, Any],
Collaborator

Right now, tool inputs/outputs are assumed to be a single string rather than multiple things. If we want to allow multiple here, we should change that in the code as well.

Owner Author

Makes sense, I'll change it to a single string.

@agola11 agola11 closed this Dec 23, 2022
@agola11 agola11 reopened this Dec 23, 2022
@agola11 agola11 closed this Dec 23, 2022