feat: Add new Atlassian Confluence Component for document loading and vector database integration #2718

Merged

Conversation

danielgines
Contributor

Adds new Atlassian Confluence Component

  • Implements ConfluenceComponent to load documents from the Atlassian Confluence platform.
  • Adds necessary inputs, including URL, username, API key, space key, and more.
  • Supports configuration of max_pages for pagination control.
  • Allows documents to be loaded into a vector database for queries.

This new module facilitates integration with the Atlassian Confluence platform.
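
As a rough illustration of the inputs listed above, here is a minimal sketch of a settings container with basic validation. The class name `ConfluenceSettings`, the default for `max_pages`, and the validation rules are assumptions made for this example; they are not taken from the PR's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfluenceSettings:
    """Hypothetical container for the inputs named in the PR description."""

    url: str
    username: str
    api_key: str
    space_key: str
    max_pages: int = 1000  # assumed default, not taken from the PR

    def __post_init__(self) -> None:
        # Reject obviously unusable values before any request is made.
        if not self.url.startswith(("http://", "https://")):
            raise ValueError("url must start with http:// or https://")
        if not self.space_key:
            raise ValueError("space_key must not be empty")
        if self.max_pages < 1:
            raise ValueError("max_pages must be at least 1")


# Placeholder values only; none of these identify a real site or account.
settings = ConfluenceSettings(
    url="https://example.atlassian.net/wiki",
    username="user@example.com",
    api_key="not-a-real-key",
    space_key="DOCS",
    max_pages=50,
)
```

Validating up front keeps a misconfigured space key or page limit from surfacing later as a confusing loader error.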

gsteinLTU and others added 6 commits July 8, 2024 12:21
- Implements ConfluenceComponent to load documents from the Confluence platform.
- Adds necessary inputs, including URL, username, API key, space_key, and more.
- Supports configuration of max_pages for pagination control.
- Implements lazy loading in the load_documents method for incremental document processing.
- Allows immediate processing of documents as they are loaded.

This new module facilitates integration with the Confluence platform and enables efficient handling of large volumes of data.
- Implements ConfluenceComponent to load documents from the Confluence platform.
- Adds necessary inputs, including URL, username, API key, space key, and more.
- Supports configuration of max_pages for pagination control.

This new module facilitates integration with the Confluence platform.
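
The lazy loading and `max_pages` behavior mentioned in the commit messages above could look roughly like the sketch below. It uses only the standard library; `fetch_pages` is a hypothetical stand-in for a paginated Confluence request, and `lazy_load` is not the PR's `load_documents` method.

```python
from itertools import islice
from typing import Iterable, Iterator, List

# Records which pages were actually requested, to make the laziness visible.
fetched_log: List[str] = []


def fetch_pages(total: int) -> Iterator[str]:
    """Hypothetical stand-in for a paginated fetch; yields one page at a time."""
    for n in range(total):
        fetched_log.append(f"page-{n}")
        yield f"page-{n}"


def lazy_load(pages: Iterable[str], max_pages: int) -> Iterator[str]:
    """Yield at most max_pages items without reading the rest of the source."""
    yield from islice(pages, max_pages)


loader = lazy_load(fetch_pages(1000), max_pages=3)
first = next(loader)   # only the first page has been requested at this point
rest = list(loader)    # consumes the remaining pages up to the limit
```

Because the generator pulls pages on demand, a caller can start processing the first document before the rest are fetched, and pages beyond the limit are never requested.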
@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and enhancement (New feature or request) labels Jul 16, 2024
Contributor

Pull Request Validation Report

This comment is automatically generated by Conventional PR

Whitelist Report

Whitelist Active Result
Pull request is a draft and should be ignored
Pull request is made by a whitelisted user and should be ignored
Pull request is submitted by a bot and should be ignored
Pull request is submitted by administrators and should be ignored

Result

Pull request does not satisfy any enabled whitelist criteria. Pull request will be validated.

Validation Report

Validation Active Result
All commits in this pull request have valid messages
Pull request does not introduce too many changes
Pull request has mentioned issues
Pull request has valid branch name
Pull request should have a non-empty body
Pull request has a valid title

Result

Pull request is invalid.

Reason

  • Pull request title does not follow the desired pattern

Last Modified at 16 Jul 24 00:40 UTC

This pull request is automatically being deployed by Amplify Hosting (learn more).

Access this pull request here: https://pr-2718.dmtpw4p5recq1.amplifyapp.com

@danielgines danielgines changed the title from "Adds new Atlassian Confluence Component" to "feat: Add new Atlassian Confluence Component for document loading and vector database integration" Jul 16, 2024
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Jul 16, 2024
Contributor

@ogabrielluiz ogabrielluiz left a comment

Hey @danielgines

This looks awesome! Thank you.

LGTM.

By the way, the component has a to_data method that you could use instead of the docs_to_data function.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jul 16, 2024
@danielgines
Contributor Author

@ogabrielluiz Thank you for the feedback and the suggestion!

Is the Data module's from_document method the one you were referring to? It would be something like this:

def load_documents(self) -> List[Data]:
    confluence = self.build_confluence()
    documents = confluence.load()
    data = [Data.from_document(doc) for doc in documents]  # Using the from_document method of Data
    self.status = data
    return data
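
To make the conversion step above concrete without depending on Langflow or LangChain, here is a self-contained sketch. The `Document` and `Data` classes below are simplified stand-ins written for this example; the real classes may store and expose fields differently, and `to_data_list` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Stand-in for a loader document (not the real class)."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Data:
    """Stand-in for the Data record (not Langflow's implementation)."""

    data: Dict[str, Any]

    @classmethod
    def from_document(cls, document: Document) -> "Data":
        # Build a new dict so the source document's metadata is not mutated.
        return cls(data={**document.metadata, "text": document.page_content})


def to_data_list(documents: List[Document]) -> List[Data]:
    """Convert each document with the classmethod, as in the snippet above."""
    return [Data.from_document(doc) for doc in documents]


docs = [Document("Release notes", {"title": "v1"}), Document("Roadmap")]
records = to_data_list(docs)
```

Putting the conversion on the record type keeps the component free of a separate helper function, which appears to be the point of the review suggestion.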

danielgines and others added 6 commits July 16, 2024 15:18
- Changed load_documents method to convert documents using Data.from_document instead of docs_to_data for better integration with Data module.
- Updated trace_type to "tool" because the LangSmith API only supports one of the following types: ["tool", "chain", "llm", "retriever", "embedding", "prompt", "parser"].
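
One way to guard against an unsupported trace type is sketched below. The allowed values are copied from the commit message above; the helper `normalize_trace_type` and its fallback behavior are assumptions for illustration and are not part of the LangSmith or Langflow APIs.

```python
# Values as listed in the commit message; treat this list as a snapshot.
ALLOWED_TRACE_TYPES = (
    "tool",
    "chain",
    "llm",
    "retriever",
    "embedding",
    "prompt",
    "parser",
)


def normalize_trace_type(trace_type: str, fallback: str = "tool") -> str:
    """Return trace_type if it is in the allowed list, else the fallback."""
    if fallback not in ALLOWED_TRACE_TYPES:
        raise ValueError(f"fallback {fallback!r} is not an allowed trace type")
    return trace_type if trace_type in ALLOWED_TRACE_TYPES else fallback


resolved = normalize_trace_type("confluence")  # unknown type falls back
```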
@ogabrielluiz ogabrielluiz enabled auto-merge (squash) July 17, 2024 17:29
@ogabrielluiz ogabrielluiz merged commit 114cdb9 into langflow-ai:main Jul 17, 2024
46 checks passed
nicoloboschi pushed a commit to datastax/ragstack-ai-langflow that referenced this pull request Jul 30, 2024
… vector database integration (langflow-ai#2718)

* feat: Add Gemma 2 to Groq model list (langflow-ai#2586)

Add gemma2 to groq_constants.py

* Adds new ConfluenceComponent module with lazy loading support

- Implements ConfluenceComponent to load documents from the Confluence platform.
- Adds necessary inputs, including URL, username, API key, space_key, and more.
- Supports configuration of max_pages for pagination control.
- Implements lazy loading in the load_documents method for incremental document processing.
- Allows immediate processing of documents as they are loaded.

This new module facilitates integration with the Confluence platform and enables efficient handling of large volumes of data.

* Adds new ConfluenceComponent module

- Implements ConfluenceComponent to load documents from the Confluence platform.
- Adds necessary inputs, including URL, username, API key, space key, and more.
- Supports configuration of max_pages for pagination control.

This new module facilitates integration with the Confluence platform.

* Updated load_documents method to use Data.from_document

- Changed load_documents method to convert documents using Data.from_document instead of docs_to_data for better integration with Data module.
- Updated trace_type to "tool" because the LangSmith API only supports one of the following types: ["tool", "chain", "llm", "retriever", "embedding", "prompt", "parser"].

* [autofix.ci] apply automated fixes

---------

Co-authored-by: Gordon Stein <[email protected]>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
(cherry picked from commit 114cdb9)
Labels
enhancement (New feature or request), lgtm (This PR has been approved by a maintainer), size:L (This PR changes 100-499 lines, ignoring generated files.)
3 participants