Commit d3ec629

jackgerritss and sonichi authored

Consolidate separate caching docs (microsoft#2054)

* Consolidate separate caching docs

* add pointer for moved

---------

Co-authored-by: Chi Wang <[email protected]>
1 parent 96dad2a commit d3ec629

2 files changed (+21, -68 lines)

website/docs/Use-Cases/enhanced_inference.md (+1 -68)

@@ -167,74 +167,7 @@ Note: if using a custom model client (see [here](/blog/2024/01/26/Custom-Models)
 
 ## Caching
 
-API call results are cached locally and reused when the same request is issued.
-This is useful when repeating or continuing experiments for reproducibility and cost saving.
-
-Starting version 0.2.8, a configurable context manager allows you to easily configure
-the cache, using either DiskCache or Redis.
-All `OpenAIWrapper` created inside the context manager can use the same cache
-through the constructor.
-
-```python
-from autogen import Cache
-
-with Cache.redis(redis_url="redis://localhost:6379/0") as cache:
-    client = OpenAIWrapper(..., cache=cache)
-    client.create(...)
-
-with Cache.disk() as cache:
-    client = OpenAIWrapper(..., cache=cache)
-    client.create(...)
-```
-
-You can also set a cache directly in the `create()` method.
-
-```python
-client = OpenAIWrapper(...)
-with Cache.disk() as cache:
-    client.create(..., cache=cache)
-```
-
-You can vary the `cache_seed` parameter to get different LLM output while
-still using cache.
-
-```python
-# Setting the cache_seed to 1 will use a different cache from the default one
-# and you will see different output.
-with Cache.disk(cache_seed=1) as cache:
-    client.create(..., cache=cache)
-```
-
-By default DiskCache uses `.cache` for storage. To change the cache directory,
-set `cache_path_root`:
-
-```python
-with Cache.disk(cache_path_root="/tmp/autogen_cache") as cache:
-    client.create(..., cache=cache)
-```
-
-### Turnning off cache
-
-For backward compatibility, DiskCache is always enabled by default
-with `cache_seed` set to 41. To fully disable it, set `cache_seed` to None.
-
-```python
-# Turn off cache in constructor,
-client = OpenAIWrapper(..., cache_seed=None)
-# or directly in create().
-client.create(..., cache_seed=None)
-```
-
-### Difference between `cache_seed` and openai's `seed` parameter
-
-openai v1.1 introduces a new param `seed`.
-The differences between autogen's `cache_seed` and openai's `seed`:
-- autogen uses local disk cache to guarantee the exactly same output is produced
-for the same input and when cache is hit, no openai api call will be made.
-- openai's `seed` is a best-effort deterministic sampling with no guarantee
-of determinism. When using openai's `seed` with `cache_seed` set to None,
-even for the same input, an openai api call will be made and there is
-no guarantee for getting exactly the same output.
+Moved to [here](/docs/topics/llm-caching).
 
 ## Error handling
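The `cache_seed`-vs-`seed` bullets removed above (and re-added in the file below) describe the contrast in prose only. A minimal sketch of the two knobs, assuming `OPENAI_API_KEY` is set in the environment and that extra keyword arguments such as OpenAI's `seed` are forwarded through `create()` to the underlying API (an assumption, not something this diff shows):

```python
from autogen import OpenAIWrapper

# Hypothetical minimal config; the API key is read from OPENAI_API_KEY.
config_list = [{"model": "gpt-4"}]

# cache_seed keys AutoGen's local request cache: repeating this call with the
# same input is answered from the cache, and no OpenAI API call is made.
cached = OpenAIWrapper(config_list=config_list, cache_seed=42)
cached.create(messages=[{"role": "user", "content": "What is 2+2?"}])

# With cache_seed=None every call reaches the API; OpenAI's seed (assumed to
# be passed through) gives best-effort deterministic sampling, with no
# guarantee of identical output across calls.
uncached = OpenAIWrapper(config_list=config_list, cache_seed=None)
uncached.create(messages=[{"role": "user", "content": "What is 2+2?"}], seed=7)
```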

website/docs/topics/llm-caching.md (+20)

@@ -1,5 +1,7 @@
 # LLM Caching
 
+AutoGen supports caching API requests so that they can be reused when the same request is issued. This is useful when repeating or continuing experiments for reproducibility and cost saving.
+
 Since version [`0.2.8`](https://github.com/microsoft/autogen/releases/tag/v0.2.8), a configurable context manager allows you to easily
 configure LLM cache, using either [`DiskCache`](/docs/reference/cache/disk_cache#diskcache) or [`RedisCache`](/docs/reference/cache/redis_cache#rediscache). All agents inside the
 context manager will use the same cache.
@@ -16,6 +18,16 @@ with Cache.disk() as cache:
     user.initiate_chat(assistant, message=coding_task, cache=cache)
 ```
 
+The cache can also be passed directly to the model client's `create()` call.
+
+```python
+client = OpenAIWrapper(...)
+with Cache.disk() as cache:
+    client.create(..., cache=cache)
+```
+
+## Controlling the seed
+
 You can vary the `cache_seed` parameter to get different LLM output while
 still using cache.
 
@@ -26,6 +38,8 @@ with Cache.disk(cache_seed=1) as cache:
     user.initiate_chat(assistant, message=coding_task, cache=cache)
 ```
 
+## Cache path
+
 By default [`DiskCache`](/docs/reference/cache/disk_cache#diskcache) uses `.cache` for storage. To change the cache directory,
 set `cache_path_root`:
 
@@ -34,6 +48,8 @@ with Cache.disk(cache_path_root="/tmp/autogen_cache") as cache:
     user.initiate_chat(assistant, message=coding_task, cache=cache)
 ```
 
+## Disabling cache
+
 For backward compatibility, [`DiskCache`](/docs/reference/cache/disk_cache#diskcache) is on by default with `cache_seed` set to 41.
 To disable caching completely, set `cache_seed` to `None` in the `llm_config` of the agent.
 
@@ -47,3 +63,7 @@ assistant = AssistantAgent(
     },
 )
 ```
+
+## Difference between `cache_seed` and OpenAI's `seed` parameter
+
+OpenAI v1.1 introduced a new parameter, `seed`. The difference between AutoGen's `cache_seed` and OpenAI's `seed` is that AutoGen uses an explicit request cache: it guarantees that exactly the same output is produced for the same input, and when the cache is hit, no OpenAI API call is made. OpenAI's `seed` is best-effort deterministic sampling with no guarantee of determinism. When using OpenAI's `seed` with `cache_seed` set to `None`, an OpenAI API call is made even for the same input, and there is no guarantee of getting exactly the same output.
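The `@@ -16,6 +18,16 @@` hunk above shows only the last line of the page's context-manager example. A minimal end-to-end sketch of the documented pattern, assuming a `config_list` as above and a hypothetical `coding_task` string (the `assistant`, `user`, and `coding_task` names mirror the diff's context lines):

```python
from autogen import AssistantAgent, UserProxyAgent, Cache

# Hypothetical minimal config; the API key is read from OPENAI_API_KEY.
config_list = [{"model": "gpt-4"}]

assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})
user = UserProxyAgent("user", human_input_mode="NEVER", code_execution_config=False)
coding_task = "Write a Python function that reverses a string."

# Every agent inside the context manager shares the same cache; rerunning this
# block replays cached responses instead of calling the API again.
with Cache.disk(cache_seed=1) as cache:
    user.initiate_chat(assistant, message=coding_task, cache=cache)

# Cache.redis(redis_url="redis://localhost:6379/0") can be swapped in for a
# shared Redis-backed cache, per the docs above.
```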
