Add LightRAG integration #82
Merged
Changes from 6 commits
7 commits, all by gitbuda:
82bcd2e Add the integration with LightRAG
c5ad615 Add a why item
25f0b68 Add dummy parsing
fc2e203 Add WIP (making lightrag work with gpt-oss-20b)
3e5cadf Add the right API to wrap the LightRAG
f073d25 Add the library project structure
e171eda Remove the version from the __init__.py file under src
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| lightrag_storage.out/ | ||
| lightrag.log | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| # Why? | ||
| | ||
| * High-quality, slow and high-cost entity extraction. | ||
| * Connect extracted entities with existing domain-specific entities. | ||
| * Cost for your scale (e.g., $ per your document page). | ||
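
The last README item mentions cost per document page. A minimal sketch of how such a figure could be estimated is shown below. The function name, its parameters, and the numbers in the usage example are all hypothetical placeholders, not real model prices and not part of this PR.

```python
def cost_per_page(
    tokens_in_per_page: int,
    tokens_out_per_page: int,
    usd_per_1m_input_tokens: float,
    usd_per_1m_output_tokens: float,
) -> float:
    """Estimate the LLM extraction cost in USD for one document page.

    All four inputs are assumptions the caller must supply from their own
    measurements and their provider's current price list.
    """
    input_cost = tokens_in_per_page / 1_000_000 * usd_per_1m_input_tokens
    output_cost = tokens_out_per_page / 1_000_000 * usd_per_1m_output_tokens
    return input_cost + output_cost


if __name__ == "__main__":
    # Placeholder values only; substitute measured token counts and real prices.
    estimate = cost_per_page(
        tokens_in_per_page=2000,
        tokens_out_per_page=500,
        usd_per_1m_input_tokens=1.0,
        usd_per_1m_output_tokens=2.0,
    )
    print(f"Estimated cost per page: ${estimate:.4f}")
```

Multiplying the result by the number of pages in a corpus gives a rough budget for a full extraction run.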
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,103 @@ | ||
| import os | ||
| import time | ||
| import traceback | ||
| | ||
| import asyncio | ||
| from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed | ||
| import shutil | ||
| | ||
| from lightrag_memgraph import MemgraphLightRAGWrapper | ||
| from memgraph_toolbox.api.memgraph import Memgraph | ||
| | ||
| | ||
| SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__)) | ||
| WORKING_DIR = "./lightrag_storage.out" | ||
| if os.path.exists(WORKING_DIR): | ||
| shutil.rmtree(WORKING_DIR) | ||
| if not os.path.exists(WORKING_DIR): | ||
| os.mkdir(WORKING_DIR) | ||
| memgraph = Memgraph() | ||
| memgraph.query("MATCH (n) DETACH DELETE n;") | ||
| | ||
| DUMMY_TEXTS = [ | ||
| """In the heart of the bustling city, a small bookstore stood as a | ||
| sanctuary for dreamers and thinkers alike. Its shelves, lined with | ||
| stories from every corner of the world, beckoned to those seeking | ||
| adventure, solace, or simply a quiet moment away from the noise. | ||
| The owner, an elderly gentleman with a gentle smile, greeted each | ||
| visitor as if they were an old friend. On rainy afternoons, the | ||
| soft patter of drops against the windows created a symphony that | ||
| mingled with the rustle of pages. Children gathered in the reading | ||
| nook, their imaginations ignited by tales of dragons and distant | ||
| lands. College students found refuge among the stacks, their minds | ||
| wandering as they prepared for exams. The bookstore was more than a | ||
| place to buy books; it was a haven where stories came alive, | ||
| friendships blossomed, and the magic of words wove its spell on all | ||
| who entered.""", | ||
| """Beneath the golden canopy of autumn leaves, a | ||
| quiet park unfolded its charm to those who wandered its winding | ||
| paths. Joggers traced familiar routes, their breath visible in the | ||
| crisp morning air, while elderly couples strolled hand in hand, | ||
| reminiscing about days gone by. Children’s laughter echoed from the | ||
| playground, where swings soared and slides became mountains to | ||
| conquer. A painter sat on a weathered bench, capturing the fiery | ||
| hues of the season on her canvas, her brush dancing with | ||
| inspiration. Nearby, a group of friends gathered for a picnic, | ||
| sharing stories and homemade treats as squirrels darted hopefully | ||
| around their feet. The gentle breeze carried the scent of earth and | ||
| fallen leaves, inviting all to pause and savor the moment. In this | ||
| tranquil oasis, time seemed to slow, offering a gentle reminder of | ||
| nature’s beauty and the simple joys that color everyday life.""", | ||
| """On the edge of a sleepy coastal village, a lighthouse stood | ||
| sentinel against the relentless waves. Its beacon, steadfast and | ||
| bright, guided fishermen safely home through fog and storm. The | ||
| keeper, a solitary figure with weathered hands, tended the light | ||
| with unwavering dedication, his days marked by the rhythm of tides | ||
| and the cries of gulls. Each evening, as the sun dipped below the | ||
| horizon, the village gathered on the shore to watch the sky ignite | ||
| in shades of orange and violet. Children chased the surf, their | ||
| laughter mingling with the roar of the sea. Local artisans | ||
| displayed their crafts at the market, their wares shaped by the | ||
| stories and traditions of generations. The lighthouse, a symbol of | ||
| hope and resilience, reminded all who saw it that even in the | ||
| darkest nights, a guiding light could be found, illuminating the | ||
| path home.""", | ||
| ] | ||
| | ||
| | ||
| async def main(): | ||
| lightrag_wrapper = MemgraphLightRAGWrapper(disable_embeddings=True) | ||
| try: | ||
| await lightrag_wrapper.initialize( | ||
| working_dir=WORKING_DIR, | ||
| llm_model_func=gpt_4o_mini_complete, | ||
| embedding_func=openai_embed, | ||
| max_parallel_insert=8, | ||
| ) | ||
| | ||
| total_time = 0.0 | ||
| start_time = time.perf_counter() | ||
| await lightrag_wrapper.ainsert( | ||
| input=[text for text in DUMMY_TEXTS], | ||
| file_paths=[str(idx) for idx in range(len(DUMMY_TEXTS))], | ||
| ) | ||
| end_time = time.perf_counter() | ||
| total_time += end_time - start_time | ||
| if len(DUMMY_TEXTS) > 0: | ||
| print(f"Average time per text: {total_time/len(DUMMY_TEXTS):.4f} seconds.") | ||
| | ||
| rag = lightrag_wrapper.get_lightrag() | ||
| print(await rag.get_graph_labels()) | ||
| kg_data = await rag.get_knowledge_graph(node_label="City", max_depth=3) | ||
| print("KNOWLEDGE GRAPH DATA:") | ||
| print(kg_data) | ||
| | ||
| except Exception as e: | ||
| print(f"An error occurred: {e}") | ||
| print(traceback.format_exc()) | ||
| finally: | ||
| await lightrag_wrapper.afinalize() | ||
| | ||
| | ||
| if __name__ == "__main__": | ||
| asyncio.run(main()) | ||
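
The script above times one batched `ainsert` call and divides by the number of texts. The same measurement can be sketched without LightRAG or Memgraph running. Here `fake_ainsert` is a hypothetical stand-in that only sleeps, and `average_insert_time` is an illustrative helper, not part of the PR.

```python
import asyncio
import time


async def fake_ainsert(texts: list[str]) -> None:
    """Hypothetical stand-in for lightrag_wrapper.ainsert(); sleeps instead of calling an LLM."""
    await asyncio.sleep(0.01 * len(texts))


async def average_insert_time(texts: list[str]) -> float:
    """Return wall-clock seconds per text for one batched insert call."""
    if not texts:
        # Avoid dividing by zero, as the script does with its len() check.
        return 0.0
    start = time.perf_counter()
    await fake_ainsert(texts)
    elapsed = time.perf_counter() - start
    return elapsed / len(texts)


if __name__ == "__main__":
    avg = asyncio.run(average_insert_time(["a", "b", "c"]))
    print(f"Average time per text: {avg:.4f} seconds.")
```

Because the whole batch is submitted in one call (with `max_parallel_insert=8` in the script), the printed value is an amortized figure for the batch, not the latency of inserting a single text.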
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,56 @@ | ||
| [build-system] | ||
| requires = ["hatchling"] | ||
| build-backend = "hatchling.build" | ||
| | ||
| [project] | ||
| name = "lightrag-memgraph" | ||
| version = "0.1.0" | ||
| description = "LightRAG integration with Memgraph" | ||
| readme = "README.md" | ||
| requires-python = ">=3.10" | ||
| authors = [ | ||
| {name = "Memgraph Tech Team", email = "[email protected]"}, | ||
| ] | ||
| classifiers = [ | ||
| "Development Status :: 4 - Beta", | ||
| "Intended Audience :: Developers", | ||
| "License :: OSI Approved :: Apache Software License", | ||
| "Programming Language :: Python :: 3", | ||
| "Programming Language :: Python :: 3.10", | ||
| "Programming Language :: Python :: 3.11", | ||
| "Programming Language :: Python :: 3.12", | ||
| ] | ||
| dependencies = [ | ||
| "lightrag-hku[api]==1.4.8.2", | ||
| "memgraph-toolbox", | ||
| "numpy>=1.21.0", | ||
| ] | ||
| | ||
| [project.optional-dependencies] | ||
| dev = [ | ||
| "pytest>=7.0.0", | ||
| "pytest-asyncio>=0.21.0", | ||
| "black>=22.0.0", | ||
| "isort>=5.0.0", | ||
| "mypy>=1.0.0", | ||
| ] | ||
| | ||
| [tool.uv.sources] | ||
| memgraph-toolbox = { workspace = true } | ||
| | ||
| [tool.hatch.build.targets.wheel] | ||
| packages = ["src/lightrag_memgraph"] | ||
| | ||
| [tool.black] | ||
| line-length = 88 | ||
| target-version = ['py310'] | ||
| | ||
| [tool.isort] | ||
| profile = "black" | ||
| line_length = 88 | ||
| | ||
| [tool.mypy] | ||
| python_version = "3.10" | ||
| warn_return_any = true | ||
| warn_unused_configs = true | ||
| disallow_untyped_defs = true | ||
10 changes: 10 additions & 0 deletions
integrations/lightrag-memgraph/src/lightrag_memgraph/__init__.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| """ | ||
| LightRAG integration with Memgraph. | ||
| | ||
| This package provides a wrapper around LightRAG that uses Memgraph as the graph storage backend. | ||
| """ | ||
| | ||
| from .core import MemgraphLightRAGWrapper | ||
| | ||
| __version__ = "0.1.0" | ||
| __all__ = ["MemgraphLightRAGWrapper"] | ||
68 changes: 68 additions & 0 deletions
integrations/lightrag-memgraph/src/lightrag_memgraph/core.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| import os | ||
| from typing import Optional | ||
| | ||
| from lightrag import LightRAG | ||
| from lightrag.kg.shared_storage import initialize_pipeline_status | ||
| from lightrag.utils import setup_logger | ||
| import numpy as np | ||
| | ||
| | ||
| SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__)) | ||
| MEMGRAPH_URI = os.getenv("MEMGRAPH_URI", "bolt://localhost:7687") | ||
| os.environ["MEMGRAPH_URI"] = MEMGRAPH_URI | ||
| | ||
| | ||
| class DummyEmbed: | ||
| def __init__(self, dim: int = 1): | ||
| self.embedding_dim = dim | ||
| | ||
| async def __call__(self, texts: list[str]) -> np.ndarray: | ||
| return np.ones((len(texts), self.embedding_dim), dtype=float) | ||
| | ||
| | ||
| class MemgraphLightRAGWrapper: | ||
| def __init__( | ||
| self, | ||
| log_level: str = "INFO", | ||
| disable_embeddings: bool = False, | ||
| ): | ||
| self.log_level = log_level | ||
| self.disable_embeddings = disable_embeddings | ||
| self.rag: Optional[LightRAG] = None | ||
| | ||
| # https://github.com/HKUDS/LightRAG/blob/main/lightrag/lightrag.py | ||
| # https://github.com/HKUDS/LightRAG/blob/main/lightrag/llm | ||
| async def initialize(self, **lightrag_kwargs) -> None: | ||
| setup_logger("lightrag", level=self.log_level) | ||
| if self.disable_embeddings: | ||
| lightrag_kwargs["embedding_func"] = DummyEmbed(dim=1) | ||
| lightrag_kwargs["vector_storage"] = "NanoVectorDBStorage" | ||
| if "working_dir" in lightrag_kwargs: | ||
| working_dir = lightrag_kwargs["working_dir"] | ||
| if not os.path.exists(working_dir): | ||
| os.mkdir(working_dir) | ||
| self.rag = LightRAG(graph_storage="MemgraphStorage", **lightrag_kwargs) | ||
| await self.rag.initialize_storages() | ||
| await initialize_pipeline_status() | ||
| | ||
| def get_lightrag(self) -> LightRAG: | ||
| if self.rag is None: | ||
| raise RuntimeError("LightRAG not initialized. Call initialize() first.") | ||
| return self.rag | ||
| | ||
| # https://github.com/HKUDS/LightRAG/blob/main/lightrag/lightrag.py | ||
| async def ainsert(self, **kwargs) -> None: | ||
| """ | ||
| Example call: await lightrag_wrapper.ainsert(input=text, file_paths=[id]) | ||
| | ||
| If you want to inject info under each entity about the source input, | ||
| pass file_paths as a list of strings (ids don't work, not written under each entity). | ||
| """ | ||
| if self.rag is None: | ||
| raise RuntimeError("LightRAG not initialized. Call initialize() first.") | ||
| await self.rag.ainsert(**kwargs) | ||
| | ||
| async def afinalize(self) -> None: | ||
| if self.rag is None: | ||
| raise RuntimeError("LightRAG not initialized. Call initialize() first.") | ||
| await self.rag.finalize_storages() | ||
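
In `initialize()`, setting `disable_embeddings=True` replaces whatever `embedding_func` the caller passed, which is why the example script's `embedding_func=openai_embed` has no effect. The sketch below isolates that override step using only the standard library. It assumes both assignments sit under the `disable_embeddings` branch, which the flattened diff does not show. `StubEmbed` and `resolve_kwargs` are hypothetical names for illustration, and `StubEmbed` returns plain lists where `DummyEmbed` returns a NumPy array.

```python
import asyncio
from typing import Any


class StubEmbed:
    """Hypothetical stand-in for DummyEmbed; returns constant vectors as plain lists."""

    def __init__(self, dim: int = 1) -> None:
        self.embedding_dim = dim

    async def __call__(self, texts: list[str]) -> list[list[float]]:
        # One all-ones vector of length embedding_dim per input text.
        return [[1.0] * self.embedding_dim for _ in texts]


def resolve_kwargs(disable_embeddings: bool, **lightrag_kwargs: Any) -> dict[str, Any]:
    """Apply the override step: the stub replaces any caller-supplied embedding_func."""
    resolved = dict(lightrag_kwargs)  # copy so the caller's mapping is not mutated
    if disable_embeddings:
        # Assumption: both keys are set only when embeddings are disabled.
        resolved["embedding_func"] = StubEmbed(dim=1)
        resolved["vector_storage"] = "NanoVectorDBStorage"
    return resolved


if __name__ == "__main__":
    resolved = resolve_kwargs(True, embedding_func="real_embed", working_dir="./out")
    print(type(resolved["embedding_func"]).__name__, resolved["vector_storage"])
    print(asyncio.run(StubEmbed(dim=2)(["a", "b"])))
```

A constant one-dimensional embedding keeps the vector-store code path working while skipping embedding API calls, at the price of making any similarity search over those vectors meaningless.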