SemanticKernel Memory pipeline for Agent Frameworks

This repository contains an advanced Memory Pipeline designed to enhance the context management and information retrieval capabilities of Semantic Kernel agents. The pipeline integrates various memory storage and retrieval strategies, enabling agents to maintain, update, and retrieve memory effectively across interactions.

Features

  • Advanced Text Chunking: Supports both simple size-based chunking and intelligent semantic chunking based on document structure
  • Semantic Chunking: Creates meaningful chunks by detecting document headings and structure (Markdown, underlined, numbered headings)
  • Dependency Injection: Full integration with Microsoft.Extensions.DependencyInjection for easy configuration and testing
  • Fluent Configuration API: Intuitive configuration syntax
  • Modular memory components supporting vector stores, databases, and custom memory handlers
  • Efficient embedding and semantic search integration for context-aware retrieval
  • Easy integration with Semantic Kernel SDK and extensible architecture for custom memory logic

Installation

Install Packages

Install the packages you need:

# Core memory pipeline functionality
dotnet add package SemanticKernel.Agents.Memory.Core

# Abstractions referenced by the pipeline (also shown in the .csproj example below)
dotnet add package SemanticKernel.Agents.Memory.Abstractions

Alternatively, you can add the packages directly to your .csproj file:

<PackageReference Include="SemanticKernel.Agents.Memory.Core" />
<PackageReference Include="SemanticKernel.Agents.Memory.Abstractions" />

Getting Started

Demo setup

The demo code in samples/SemanticKernel.Agents.Memory.Samples/PipelineDemo.cs registers the required services and composes the pipeline using the fluent API. The example below shows how to configure the pipeline:

services.AddAzureOpenAITextEmbeddingGeneration(
    deploymentName: "NAME_OF_YOUR_DEPLOYMENT", // Name of deployment, e.g. "text-embedding-ada-002".
    endpoint: "YOUR_AZURE_ENDPOINT",           // Name of Azure OpenAI service endpoint, e.g. https://myaiservice.openai.azure.com.
    apiKey: "YOUR_API_KEY",
    modelId: "MODEL_ID",          // Optional name of the underlying model if the deployment name doesn't match the model name, e.g. text-embedding-ada-002.
    serviceId: "YOUR_SERVICE_ID", // Optional; for targeting specific services within Semantic Kernel.
    dimensions: 1536              // Optional number of dimensions to generate embeddings with.
);

services.AddAzureOpenAIChatCompletion(
    deploymentName: "NAME_OF_YOUR_DEPLOYMENT",
    apiKey: "YOUR_API_KEY",
    endpoint: "YOUR_AZURE_ENDPOINT",
    modelId: "gpt-4", // Optional name of the underlying model if the deployment name doesn't match the model name
    serviceId: "YOUR_SERVICE_ID" // Optional; for targeting specific services within Semantic Kernel
);

var memoryStore = new InMemoryVectorStore(); // or your vector store implementation

// Configure the memory ingestion pipeline using the fluent API
services.ConfigureMemoryIngestion(options =>
{
    options
        // Use MarkitDown extraction service running locally
        .WithMarkitDownTextExtraction("http://localhost:5000")
        // Semantic (structure-aware) chunking with lambda configuration
        .WithSemanticChunking(() => new SemanticChunkingOptions
        {
            MaxChunkSize = 500,         // Max characters per chunk
            MinChunkSize = 100,         // Minimum characters per chunk for structure-aware splitting
            TitleLevelThreshold = 3,    // Consider headings up to this level as titles
            IncludeTitleContext = true, // Include heading/title text in chunk context
            TextOverlap = 50            // Overlapping characters between adjacent chunks
        })
        // Handler that generates embeddings (uses configured Azure OpenAI or mock generator)
        .WithDefaultEmbeddingsGeneration()
        // Save records using an in-memory vector store instance (samples use this for demos)
        .WithSaveRecords(memoryStore);
});

services.AddMemorySearchClient(memoryStore, new SearchClientOptions
{
    MaxMatchesCount = 10,        // Max search results to retrieve
    AnswerTokens = 300,          // Max tokens for AI-generated answers
    Temperature = 0.7,           // LLM creativity (0.0 = deterministic, 1.0 = creative)
    MinRelevance = 0.6           // Minimum relevance score for results
});
...
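// The service provider used below is assumed to come from a standard
// Microsoft.Extensions.DependencyInjection setup, built once all services are registered
// (this step is not shown in the sample excerpt above):
var serviceProvider = services.BuildServiceProvider();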

// Get the orchestrator from the service provider
var orchestrator = serviceProvider.GetRequiredService<ImportOrchestrator>();

// Get the search client from the service provider
ISearchClient searchClient = serviceProvider.GetRequiredService<ISearchClient>();

// Create a file upload request using the fluent builder API
_ = await orchestrator.ProcessUploadAsync(index: "default",
    orchestrator.NewDocumentUpload()
        .WithFile("path/to/document.pdf")
        .WithTag("document-type", "technical")
        .WithTag("priority", "high")
        .WithContext("source", "user-upload")
        .Build());

var query = "your search question"; // Placeholder query for this example

var searchResult = await searchClient.SearchAsync(
    index: "default",
    query: query,
    minRelevance: 0.7,
    limit: 5
);

Console.WriteLine($"Found {searchResult.Results.Count} results for: {query}");

foreach (var result in searchResult.Results)
{
    Console.WriteLine($"• {result.Source} (Score: {result.RelevanceScore:F3})");
    Console.WriteLine($"  Content: {result.Content.Substring(0, Math.Min(150, result.ContentLength))}...");
}

Running the Sample

To see the memory pipeline in action, run the sample application:

cd samples/SemanticKernel.Agents.Memory.Samples
dotnet run

The sample application demonstrates several demos (see PipelineDemo.cs):

  • Basic pipeline demo using simple, size-based chunking (RunAsync)
  • Semantic chunking demo that uses document structure (RunSemanticChunkingAsync)
  • Custom handler / services demo showing how to register additional services (RunCustomHandlerAsync)
  • Semantic chunking configuration demo with fine-grained options (RunSemanticChunkingConfigDemo)

Running the MarkitDown extraction service

The samples call a small helper service (MarkitDown) to extract and preprocess documents. You can run it either directly with Python or via Docker. The service listens on port 5000 by default and the samples use the URL http://localhost:5000.

Run with Python (recommended for development):

python3 -m venv .venv
source .venv/bin/activate
pip install -r services/markitdown-service/requirements.txt
python services/markitdown-service/app.py

Run with Docker:

docker build -t markitdown-service services/markitdown-service
docker run --rm -p 5000:5000 markitdown-service
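
To check that the service is reachable before running the samples, a plain HTTP request against the base URL is enough (this only verifies the port is listening; the samples themselves only need the base URL http://localhost:5000):

curl -i http://localhost:5000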

License

This project is licensed under the MIT License - see the LICENSE.txt file for details.
