[MEDI] start producing NuGet packages #7016
Merged
+162
−15
3 commits
39 changes: 39 additions & 0 deletions
src/Libraries/Microsoft.Extensions.DataIngestion.Abstractions/README.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| # Microsoft.Extensions.DataIngestion.Abstractions | ||
|
|
||
| .NET developers need to efficiently process, chunk, and retrieve information from diverse document formats while preserving semantic meaning and structural context. The `Microsoft.Extensions.DataIngestion` libraries provide a unified approach for representing document ingestion components. | ||
|
|
||
| ## The packages | ||
|
|
||
| The [Microsoft.Extensions.DataIngestion.Abstractions](https://www.nuget.org/packages/Microsoft.Extensions.DataIngestion.Abstractions) package provides the core exchange types, including [`IngestionDocument`](https://learn.microsoft.com/dotnet/api/microsoft.extensions.dataingestion.ingestiondocument), [`IngestionChunker<T>`](https://learn.microsoft.com/dotnet/api/microsoft.extensions.dataingestion.ingestionchunker-1), [`IngestionChunkProcessor<T>`](https://learn.microsoft.com/dotnet/api/microsoft.extensions.dataingestion.ingestionchunkprocessor-1), and [`IngestionChunkWriter<T>`](https://learn.microsoft.com/dotnet/api/microsoft.extensions.dataingestion.ingestionchunkwriter-1). Any .NET library that provides document processing capabilities can implement these abstractions to enable seamless integration with consuming code. | ||
|
|
||
| The [Microsoft.Extensions.DataIngestion](https://www.nuget.org/packages/Microsoft.Extensions.DataIngestion) package has an implicit dependency on the `Microsoft.Extensions.DataIngestion.Abstractions` package. This package enables you to easily integrate components such as enrichment processors, vector storage writers, and telemetry into your applications using familiar dependency injection and pipeline patterns. For example, it provides processors for sentiment analysis, keyword extraction, and summarization that can be chained together in ingestion pipelines. | ||
|
|
||
| ## Which package to reference | ||
|
|
||
| Libraries that provide implementations of the abstractions typically reference only `Microsoft.Extensions.DataIngestion.Abstractions`. | ||
|
|
||
| To also have access to higher-level utilities for working with document ingestion components, reference the `Microsoft.Extensions.DataIngestion` package instead (which itself references `Microsoft.Extensions.DataIngestion.Abstractions`). Most consuming applications and services should reference the `Microsoft.Extensions.DataIngestion` package along with one or more libraries that provide concrete implementations of the abstractions, such as `Microsoft.Extensions.DataIngestion.MarkItDown` or `Microsoft.Extensions.DataIngestion.Markdig`. | ||
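
As a rough illustration of the first case above, a library that only implements the abstractions might carry a project file along these lines. This is a sketch, not a prescribed layout: the target framework is an assumption chosen for the example, and `[CURRENTVERSION]` is the same placeholder used elsewhere in this README.

```xml
<!-- Hypothetical library project that implements the ingestion abstractions. -->
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <!-- Assumed target framework; use whichever framework your library supports. -->
    <TargetFramework>net8.0</TargetFramework>
  </PropertyGroup>

  <ItemGroup>
    <!-- Only the abstractions package is referenced, not Microsoft.Extensions.DataIngestion. -->
    <PackageReference Include="Microsoft.Extensions.DataIngestion.Abstractions" Version="[CURRENTVERSION]" />
  </ItemGroup>

</Project>
```

Keeping the reference to the abstractions package alone means consumers of such a library are not forced to take the higher-level utilities.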
|
|
||
| ## Install the package | ||
|
|
||
| From the command line: | ||
|
|
||
| ```console | ||
| dotnet add package Microsoft.Extensions.DataIngestion.Abstractions --prerelease | ||
| ``` | ||
|
|
||
| Or directly in the C# project file: | ||
|
|
||
| ```xml | ||
| <ItemGroup> | ||
| <PackageReference Include="Microsoft.Extensions.DataIngestion.Abstractions" Version="[CURRENTVERSION]" /> | ||
| </ItemGroup> | ||
| ``` | ||
|
|
||
| ## Documentation | ||
|
|
||
| Refer to the [Microsoft.Extensions.DataIngestion libraries documentation](https://learn.microsoft.com/dotnet/dataingestion/microsoft-extensions-dataingestion) for more information and API usage examples. | ||
|
|
||
| ## Feedback & Contributing | ||
|
|
||
| We welcome feedback and contributions in [our GitHub repo](https://github.com/dotnet/extensions). |
36 changes: 36 additions & 0 deletions
src/Libraries/Microsoft.Extensions.DataIngestion.MarkItDown/README.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| # Microsoft.Extensions.DataIngestion.MarkItDown | ||
|
|
||
| Provides an implementation of the `IngestionDocumentReader` class for the [MarkItDown](https://github.com/microsoft/markitdown/) utility. | ||
|
|
||
| ## Install the package | ||
|
|
||
| From the command line: | ||
|
|
||
| ```console | ||
| dotnet add package Microsoft.Extensions.DataIngestion.MarkItDown --prerelease | ||
| ``` | ||
|
|
||
| Or directly in the C# project file: | ||
|
|
||
| ```xml | ||
| <ItemGroup> | ||
| <PackageReference Include="Microsoft.Extensions.DataIngestion.MarkItDown" Version="[CURRENTVERSION]" /> | ||
| </ItemGroup> | ||
| ``` | ||
|
|
||
| ## Usage Examples | ||
|
|
||
| ### Creating a MarkItDownReader for Data Ingestion | ||
|
|
||
| ```csharp | ||
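| using System.IO; | ||
| using Microsoft.Extensions.DataIngestion; | ||
|
|
||
| // Replace the path below with the location of your MarkItDown executable. | ||
| IngestionDocumentReader reader = | ||
| new MarkItDownReader(new FileInfo(@"pathToMarkItDown.exe"), extractImages: true); | ||
|
|
||
| // CreateChunker() and CreateWriter() are not defined in this snippet; supply your own chunker and writer. | ||
| using IngestionPipeline<string> pipeline = new(reader, CreateChunker(), CreateWriter()); | ||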
| ``` | ||
|
|
||
| ## Feedback & Contributing | ||
|
|
||
| We welcome feedback and contributions in [our GitHub repo](https://github.com/dotnet/extensions). |
35 changes: 35 additions & 0 deletions
src/Libraries/Microsoft.Extensions.DataIngestion.Markdig/README.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| # Microsoft.Extensions.DataIngestion.Markdig | ||
|
|
||
| Provides an implementation of the `IngestionDocumentReader` class for Markdown files using the [Markdig](https://github.com/xoofx/markdig) library. | ||
|
|
||
| ## Install the package | ||
|
|
||
| From the command line: | ||
|
|
||
| ```console | ||
| dotnet add package Microsoft.Extensions.DataIngestion.Markdig --prerelease | ||
| ``` | ||
|
|
||
| Or directly in the C# project file: | ||
|
|
||
| ```xml | ||
| <ItemGroup> | ||
| <PackageReference Include="Microsoft.Extensions.DataIngestion.Markdig" Version="[CURRENTVERSION]" /> | ||
| </ItemGroup> | ||
| ``` | ||
|
|
||
| ## Usage Examples | ||
|
|
||
| ### Creating a MarkdownReader for Data Ingestion | ||
|
|
||
| ```csharp | ||
| using Microsoft.Extensions.DataIngestion; | ||
|
|
||
| IngestionDocumentReader reader = new MarkdownReader(); | ||
|
|
||
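| // CreateChunker() and CreateWriter() are not defined in this snippet; supply your own chunker and writer. | ||
| using IngestionPipeline<string> pipeline = new(reader, CreateChunker(), CreateWriter()); | ||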
| ``` | ||
|
|
||
| ## Feedback & Contributing | ||
|
|
||
| We welcome feedback and contributions in [our GitHub repo](https://github.com/dotnet/extensions). |
34 changes: 34 additions & 0 deletions
src/Libraries/Microsoft.Extensions.DataIngestion/README.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| # Microsoft.Extensions.DataIngestion | ||
|
|
||
| .NET developers need to efficiently process, chunk, and retrieve information from diverse document formats while preserving semantic meaning and structural context. The `Microsoft.Extensions.DataIngestion` libraries provide a unified approach for representing document ingestion components. | ||
|
|
||
| ## The packages | ||
|
|
||
| The [Microsoft.Extensions.DataIngestion.Abstractions](https://www.nuget.org/packages/Microsoft.Extensions.DataIngestion.Abstractions) package provides the core exchange types, including [`IngestionDocument`](https://learn.microsoft.com/dotnet/api/microsoft.extensions.dataingestion.ingestiondocument), [`IngestionChunker<T>`](https://learn.microsoft.com/dotnet/api/microsoft.extensions.dataingestion.ingestionchunker-1), [`IngestionChunkProcessor<T>`](https://learn.microsoft.com/dotnet/api/microsoft.extensions.dataingestion.ingestionchunkprocessor-1), and [`IngestionChunkWriter<T>`](https://learn.microsoft.com/dotnet/api/microsoft.extensions.dataingestion.ingestionchunkwriter-1). Any .NET library that provides document processing capabilities can implement these abstractions to enable seamless integration with consuming code. | ||
|
|
||
| The [Microsoft.Extensions.DataIngestion](https://www.nuget.org/packages/Microsoft.Extensions.DataIngestion) package has an implicit dependency on the `Microsoft.Extensions.DataIngestion.Abstractions` package. This package enables you to easily integrate components such as enrichment processors, vector storage writers, and telemetry into your applications using familiar dependency injection and pipeline patterns. For example, it provides the [`SentimentEnricher`](https://learn.microsoft.com/dotnet/api/microsoft.extensions.dataingestion.sentimentenricher), [`KeywordEnricher`](https://learn.microsoft.com/dotnet/api/microsoft.extensions.dataingestion.keywordenricher), and [`SummaryEnricher`](https://learn.microsoft.com/dotnet/api/microsoft.extensions.dataingestion.summaryenricher) processors that can be chained together in ingestion pipelines. | ||
|
|
||
| ## Which package to reference | ||
|
|
||
| Libraries that provide implementations of the abstractions typically reference only `Microsoft.Extensions.DataIngestion.Abstractions`. | ||
|
|
||
| To also have access to higher-level utilities for working with document ingestion components, reference the `Microsoft.Extensions.DataIngestion` package instead (which itself references `Microsoft.Extensions.DataIngestion.Abstractions`). Most consuming applications and services should reference the `Microsoft.Extensions.DataIngestion` package along with one or more libraries that provide concrete implementations of the abstractions, such as `Microsoft.Extensions.DataIngestion.MarkItDown` or `Microsoft.Extensions.DataIngestion.Markdig`. | ||
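
For a consuming application, the recommendation above might translate into a pair of package references such as the following. This is a minimal sketch: pairing the main package with the Markdig reader is just one possible combination taken from the packages named above, and `[CURRENTVERSION]` remains a placeholder.

```xml
<ItemGroup>
  <!-- Higher-level utilities; this package itself references the Abstractions package. -->
  <PackageReference Include="Microsoft.Extensions.DataIngestion" Version="[CURRENTVERSION]" />
  <!-- One concrete reader implementation; MarkItDown could be referenced instead or in addition. -->
  <PackageReference Include="Microsoft.Extensions.DataIngestion.Markdig" Version="[CURRENTVERSION]" />
</ItemGroup>
```

Because `Microsoft.Extensions.DataIngestion` already references `Microsoft.Extensions.DataIngestion.Abstractions`, the application does not need a separate reference to the abstractions package.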
|
|
||
| ## Install the package | ||
|
|
||
| From the command line: | ||
|
|
||
| ```console | ||
| dotnet add package Microsoft.Extensions.DataIngestion --prerelease | ||
| ``` | ||
| Or directly in the C# project file: | ||
|
|
||
| ```xml | ||
| <ItemGroup> | ||
| <PackageReference Include="Microsoft.Extensions.DataIngestion" Version="[CURRENTVERSION]" /> | ||
| </ItemGroup> | ||
| ``` | ||
|
|
||
| ## Feedback & Contributing | ||
|
|
||
| We welcome feedback and contributions in [our GitHub repo](https://github.com/dotnet/extensions). | ||