community: add include_labels option to ConfluenceLoader #28259

Open: wants to merge 3 commits into base: master

@nakamasato nakamasato commented Nov 21, 2024

Description:

Enable ConfluenceLoader to include page labels via a new include_labels option (False by default for backward compatibility). When enabled, the labels are added to the Document metadata, e.g. {"labels": ["l1", "l2"]}.

Notes

The Confluence API returns labels when metadata.labels is included in the expand query parameter.

All of the following functions support expand in the same way:

  • confluence.get_page_by_id
  • confluence.get_all_pages_by_label
  • confluence.get_all_pages_from_space
  • cql (internally using /api/content/search)

Issue:

No issue related to this PR.

Dependencies:

No changes.

Twitter handle:

@gymnstcs

  • Add tests and docs: If you're adding a new integration, please include

    1. a test for the integration, preferably unit tests that do not rely on network access,
    2. an example notebook showing its use. It lives in docs/docs/integrations directory.
  • Lint and test: Run make format, make lint and make test from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/

@nakamasato nakamasato changed the title from "feat: add include_labels option to ConfluenceLoader" to "community: add include_labels option to ConfluenceLoader" Nov 21, 2024
@nakamasato nakamasato force-pushed the add-include-labels-option-to-confluence-loaders branch from 08a12d8 to f3dffce November 21, 2024 13:17
metadata = {
    "title": page["title"],
    "id": page["id"],
    "source": self.base_url.strip("/") + page["_links"]["webui"],
    **({"labels": labels} if include_labels else {}),

Set the labels key only when include_labels is true.
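
The conditional unpacking in the hunk above can be tried in isolation. The following is a minimal sketch, not the PR's code: `build_metadata`, the sample page payload, and the example base URL are hypothetical stand-ins introduced only for illustration.

```python
from typing import Any, Dict, List


def build_metadata(
    page: Dict[str, Any],
    base_url: str,
    labels: List[str],
    include_labels: bool,
) -> Dict[str, Any]:
    """Hypothetical helper mirroring the metadata dict shown in the diff."""
    return {
        "title": page["title"],
        "id": page["id"],
        "source": base_url.strip("/") + page["_links"]["webui"],
        # Unpack {"labels": ...} only when requested; otherwise unpack an
        # empty dict so the key is absent rather than set to an empty list.
        **({"labels": labels} if include_labels else {}),
    }


# Made-up page payload, for illustration only.
sample_page = {
    "title": "Runbook",
    "id": "123",
    "_links": {"webui": "/spaces/OPS/pages/123"},
}
base = "https://example.atlassian.net/wiki/"

without_labels = build_metadata(sample_page, base, ["l1", "l2"], False)
with_labels = build_metadata(sample_page, base, ["l1", "l2"], True)
```

Leaving the key out entirely when the option is off keeps the metadata identical to what existing callers already receive, which is presumably the backward-compatibility point made in the description.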

Comment on lines +596 to +601
labels = [
    label["name"]
    for label in page.get("metadata", {})
    .get("labels", {})
    .get("results", [])
]

Each label looks something like this: {'prefix': 'global', 'name': 'database', 'id': '111111111'}
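
As a rough illustration of how the chained `.get` calls behave, here is a small sketch run against made-up payloads. `extract_label_names` is a hypothetical wrapper around the comprehension from the diff, and the nested page shape is assumed from the label example above rather than taken from a real API response.

```python
from typing import Any, Dict, List


def extract_label_names(page: Dict[str, Any]) -> List[str]:
    """Hypothetical wrapper around the list comprehension in the diff."""
    return [
        label["name"]
        for label in page.get("metadata", {})
        .get("labels", {})
        .get("results", [])
    ]


# Assumed shape of a page fetched with metadata.labels expanded.
page_with_labels = {
    "metadata": {
        "labels": {
            "results": [
                {"prefix": "global", "name": "database", "id": "111111111"},
            ]
        }
    }
}

# A page fetched without the expansion has no "metadata" key at all.
page_without_expand = {"title": "No metadata requested"}

names = extract_label_names(page_with_labels)
no_names = extract_label_names(page_without_expand)
```

Because every level falls back to an empty container, a page without expanded labels yields an empty list instead of raising a `KeyError`.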

@nakamasato nakamasato marked this pull request as ready for review November 21, 2024 13:25
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. community Related to langchain-community Ɑ: doc loader Related to document loader module (not documentation) labels Nov 21, 2024
Comment on lines +339 to +345
expand = ",".join(
    [
        content_format.value,
        "version",
        *(["metadata.labels"] if include_labels else []),
    ]
)

@nakamasato nakamasato Nov 21, 2024

expand is a comma-separated query parameter. It was originally hardcoded as f"{content_format.value},version".

I moved it into a variable so that more options can be added in the future if necessary, since the expand parameter supports many more values.
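
To make the before/after behavior concrete, here is a minimal sketch of the same construction as a standalone function. `build_expand` is a hypothetical name, and the string `"body.storage"` is only an assumed example of what `content_format.value` might hold.

```python
def build_expand(content_format_value: str, include_labels: bool) -> str:
    """Hypothetical standalone version of the expand construction in the diff."""
    return ",".join(
        [
            content_format_value,
            "version",
            # Append the labels expansion only when the option is enabled.
            *(["metadata.labels"] if include_labels else []),
        ]
    )


# "body.storage" is an assumed example value, not taken from the PR.
default_expand = build_expand("body.storage", include_labels=False)
labels_expand = build_expand("body.storage", include_labels=True)
```

With the option off, the result matches the previously hardcoded string, so requests made by existing callers are unchanged.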
