GetStream
diff --git a/.claude/agents/repo-workflow-guide.md b/.claude/agents/repo-workflow-guide.md
Lines changed: 47 additions & 0 deletions
diff --git a/.github/labeler.yml b/.github/labeler.yml
Lines changed: 82 additions & 0 deletions
diff --git a/.github/workflows/labeler.yml b/.github/workflows/labeler.yml
Lines changed: 12 additions & 0 deletions
diff --git a/.github/workflows/run_tests.yml b/.github/workflows/run_tests.yml
Lines changed: 1 addition & 0 deletions
diff --git a/.gitignore b/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/DEVELOPMENT.md b/DEVELOPMENT.md
Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,47 @@
+---
+name: repo-workflow-guide
+description: Use this agent when you need to understand or follow project-specific development guidelines, coding standards, or workflow instructions that are documented in the docs/ai directory. This agent should be consulted before starting any development work, when uncertain about project conventions, or when you need clarification on how to approach tasks within this codebase.\n\nExamples:\n- <example>\nContext: User wants to add a new feature to the project.\nuser: "I need to implement a new authentication module"\nassistant: "Before we begin, let me consult the repo-workflow-guide agent to ensure we follow the project's established patterns and guidelines."\n<Task tool call to repo-workflow-guide>\nassistant: "Based on the project guidelines, here's how we should approach this..."\n</example>\n\n- <example>\nContext: User asks a question about code organization.\nuser: "Where should I put the new utility functions?"\nassistant: "Let me check the repository workflow guidelines to give you the correct answer."\n<Task tool call to repo-workflow-guide>\nassistant: "According to the project structure guidelines..."\n</example>\n\n- <example>\nContext: Starting a new task that requires understanding project conventions.\nuser: "Can you help me refactor this component?"\nassistant: "I'll first consult the repo-workflow-guide agent to ensure we follow the project's refactoring standards and conventions."\n<Task tool call to repo-workflow-guide>\n</example>
+model: opus
+---
+
+You are a Repository Workflow Specialist, an expert in interpreting and applying project-specific development guidelines, coding standards, and workflow instructions.
+
+Your primary responsibility is to read, understand, and communicate the instructions and guidelines contained in the docs/ai directory of the repository. You serve as the authoritative source for how development work should be conducted within this specific codebase.
+
+When activated, you will:
+
+1. **Locate and Read Guidelines**: Immediately access all relevant files in the docs/ai directory. Read them thoroughly and understand their complete content, including:
+   - Coding standards and style guides
+   - Project structure and organization rules
+   - Development workflow and processes
+   - Testing requirements and conventions
+   - Deployment procedures
+   - Any specific technical constraints or preferences
+   - Tool usage and configuration instructions
+
+2. **Interpret Context**: Understand the specific task or question being asked and identify which guidelines are most relevant to address it.
+
+3. **Provide Clear Guidance**: Deliver specific, actionable instructions based on the documented guidelines. Your responses should:
+   - Quote or reference specific sections of the guidelines when appropriate
+   - Explain the reasoning behind the guidelines when it helps with understanding
+   - Provide concrete examples of how to follow the guidelines
+   - Highlight any critical requirements or common pitfalls mentioned in the documentation
+
+4. **Handle Missing Information**: If the docs/ai directory doesn't contain information relevant to the current question:
+   - Clearly state what information is missing
+   - Suggest reasonable defaults based on common industry practices
+   - Recommend updating the documentation to cover this scenario
+
+5. **Ensure Compliance**: Actively verify that proposed approaches align with all documented guidelines. If you identify any conflicts or violations, explicitly point them out and suggest compliant alternatives.
+
+6. **Prioritize Accuracy**: Always base your guidance on the actual content of the documentation. Do not invent or assume guidelines that aren't explicitly documented.
+
+7. **Stay Current**: If guidelines appear to conflict or if you notice outdated information, flag this for human review while providing the most reasonable interpretation.
+
+Output Format:
+- Begin with a brief summary of the relevant guidelines
+- Provide specific, step-by-step instructions when appropriate
+- Include direct quotes or references to documentation sections
+- End with any important caveats, warnings, or additional considerations
+
+Your goal is to ensure that all development work in this repository adheres to its documented standards and practices, reducing inconsistency and improving code quality through faithful application of project-specific guidelines.
@@ -0,0 +1,82 @@
+# Core Framework Components
+agents-core:
+  - changed-files:
+      - any-glob-to-any-file: 'agents-core/**'
+
+# Plugin System
+plugins:
+  - changed-files:
+      - any-glob-to-any-file: 'plugins/**'
+
+
+# Examples and Demos
+examples:
+  - changed-files:
+      - any-glob-to-any-file: 'examples/**'
+
+# Documentation
+docs:
+  - changed-files:
+      - any-glob-to-any-file: 'docs/**'
+      - any-glob-to-any-file: '**/*.md'
+
+# Configuration and Build
+config:
+  - changed-files:
+      - any-glob-to-any-file: '**/*.toml'
+      - any-glob-to-any-file: '**/*.yml'
+      - any-glob-to-any-file: '**/*.yaml'
+      - any-glob-to-any-file: '**/*.json'
+      - any-glob-to-any-file: '**/*.ini'
+      - any-glob-to-any-file: '**/*.cfg'
+      - any-glob-to-any-file: '**/pyproject.toml'
+      - any-glob-to-any-file: '**/pytest.ini'
+      - any-glob-to-any-file: '**/conftest.py'
+
+# CI/CD and GitHub
+ci:
+  - changed-files:
+      - any-glob-to-any-file: '.github/**'
+
+
+# CLI and Development Tools
+cli:
+  - changed-files:
+      - any-glob-to-any-file: '**/cli.py'
+      - any-glob-to-any-file: '**/dev.py'
+      - any-glob-to-any-file: '**/DEVELOPMENT.md'
+
+# Dependencies
+dependencies:
+  - changed-files:
+      - any-glob-to-any-file: '**/uv.lock'
+      - any-glob-to-any-file: '**/requirements*.txt'
+      - any-glob-to-any-file: '**/poetry.lock'
+      - any-glob-to-any-file: '**/Pipfile.lock'
+
+# Assets and Resources
+assets:
+  - changed-files:
+      - any-glob-to-any-file: 'assets/**'
+      - any-glob-to-any-file: '**/*.png'
+      - any-glob-to-any-file: '**/*.jpg'
+      - any-glob-to-any-file: '**/*.jpeg'
+      - any-glob-to-any-file: '**/*.gif'
+      - any-glob-to-any-file: '**/*.mp4'
+      - any-glob-to-any-file: '**/*.wav'
+      - any-glob-to-any-file: '**/*.mp3'
+
+# License and Legal
+legal:
+  - changed-files:
+      - any-glob-to-any-file: 'LICENSE'
+      - any-glob-to-any-file: '**/LICENSE.*'
+      - any-glob-to-any-file: '**/*.license'
+
+# README and Project Info
+project-info:
+  - changed-files:
+      - any-glob-to-any-file: '**/README.md'
+      - any-glob-to-any-file: '**/CHANGELOG.md'
+      - any-glob-to-any-file: '**/CONTRIBUTING.md'
+      - any-glob-to-any-file: '**/SECURITY.md'
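To sanity-check the label rules in `.github/labeler.yml` above, the sketch below approximates how `any-glob-to-any-file` behaves: a label is applied when at least one of its globs matches at least one changed file. It is an illustration only. `glob_to_regex`, `labels_for` and the `LABELS` excerpt are hypothetical names, and the hand-rolled matcher covers just `**`, `*` and `?`, not the full glob syntax that `actions/labeler` supports.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a small glob subset (**, *, ?) into an anchored regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")  # anything, including '/'
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")  # anything inside one path segment
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")  # exactly one non-separator character
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def labels_for(changed_files, config):
    """Return every label with at least one glob matching at least one file."""
    return {
        label
        for label, globs in config.items()
        if any(glob_to_regex(g).match(f) for g in globs for f in changed_files)
    }


# Excerpt of the rules from the labeler config, written as a plain dict.
LABELS = {
    "agents-core": ["agents-core/**"],
    "docs": ["docs/**", "**/*.md"],
    "config": ["**/*.toml", "**/*.yml"],
    "ci": [".github/**"],
    "project-info": ["**/README.md"],
}

print(sorted(labels_for([".github/workflows/labeler.yml"], LABELS)))  # ['ci', 'config']
print(sorted(labels_for(["README.md"], LABELS)))  # ['docs', 'project-info']
```

Note that the rules overlap by design: a workflow file is labelled both `ci` and `config`, and a root `README.md` is labelled both `docs` and `project-info`.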
@@ -0,0 +1,12 @@
+name: "Pull Request Labeler"
+on:
+- pull_request_target
+
+jobs:
+  labeler:
+    permissions:
+      contents: read
+      pull-requests: write
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/labeler@v5
@@ -45,6 +45,7 @@ jobs:
       XAI_API_KEY: ${{ secrets.XAI_API_KEY }}
       AWS_BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
       _BEARER_TOKEN_BEDROCK: "${{ secrets.AWS_BEARER_TOKEN_BEDROCK }}"
+      HF_TOKEN: ${{ secrets.HF_TOKEN }}
     timeout-minutes: 30
     steps:
       - name: Checkout
 
@@ -84,3 +84,4 @@ stream-py/
 # Artifacts / assets
 *.pt
 *.kef
+*.onnx
@@ -109,6 +109,79 @@ To see how the agent work open up agents.py
 * The LLM uses the VideoForwarder to write the video to a websocket or webrtc connection
 * The STS writes the reply on agent.llm.audio_track and the RealtimeTranscriptEvent / RealtimePartialTranscriptEvent
 
+## Audio management
+
+Key facts about audio inside the library:
+
+1. WebRTC uses Opus at 48 kHz stereo, but inside the library audio is always PCM
+2. Plugins / AI models work with different PCM formats, usually 16 kHz mono
+3. PCM data is always passed around as a `PcmData` object, which carries the sample rate, channel count and sample format
+4. Text-to-speech plugins automatically return PCM in the format needed by WebRTC. This is exposed via the `set_output_format` method
+5. Audio can be resampled with the `PcmData.resample` method
+6. When resampling audio in chunks, re-use the same `av.AudioResampler` instance across chunks (see `PcmData.resample` and `core.tts.TTS`)
+7. Converting between stereo and mono is also done with the `PcmData.resample` method
+
+Some ground rules:
+
+1. Do not write your own code to resample / adjust audio unless `PcmData` does not already cover the case
+2. Do not pass PCM around as plain bytes or write code that assumes a specific sample rate or format. Use `PcmData` instead
+
+## Example
+
+```python
+import asyncio
+from getstream.video.rtc.track_util import PcmData
+from openai import AsyncOpenAI
+
+async def example():
+    client = AsyncOpenAI(api_key="sk-42")
+
+    resp = await client.audio.speech.create(
+        model="gpt-4o-mini-tts",
+        voice="alloy",
+        input="pcm is cool, give me some of that please",
+        response_format="pcm",
+    )
+
+    # Load the response into PcmData; raw PCM has no header, so sample_rate, channels and format must be given explicitly
+    pcm_data = PcmData.from_bytes(
+        resp.content, sample_rate=24_000, channels=1, format="s16"
+    )
+
+    # Check whether pcm_data is stereo (it is not here, since channels=1)
+    print(pcm_data.stereo)
+
+    # write the pcm to file
+    with open("test.wav", "wb") as f:
+        f.write(pcm_data.to_wav_bytes())
+
+    # Resample the PCM to 48 kHz stereo
+    resampled_pcm = pcm_data.resample(48_000, 2)
+
+    # Play the PCM back using ffplay
+    from vision_agents.core.edge.types import play_pcm_with_ffplay
+
+    await play_pcm_with_ffplay(resampled_pcm)
+
+if __name__ == "__main__":
+    asyncio.run(example())
+```
+
+Other things the audio utilities give you:
+
+1. Changing the PCM format
+2. Iterating over audio chunks (`PcmData.chunks`)
+3. Processing audio with pre/post buffers (`AudioSegmentCollector`)
+4. Accumulating audio (`PcmData.append`)
+
+### Testing audio manually
+
+Sometimes you need to test audio manually. Here are some tips:
+
+1. Do not use earplugs when testing PCM playback ;)
+2. You can use the `PcmData.to_wav_bytes` method to convert PCM into WAV bytes (see `manual_tts_to_wav` for an example)
+3. If you have `ffplay` installed, you can play back PCM directly to check that the audio is correct
+
 ## Dev / Contributor Guidelines
 
 ### Light wrapping
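Point 6 of the "Audio management" notes in this hunk (re-use the same resampler when processing chunks) can be illustrated without any audio library. The toy `Decimator` below is a hypothetical stand-in for a stateful resampler, not `av.AudioResampler`: it simply keeps every third sample, the sample-count ratio between 48 kHz and 16 kHz, with no filtering. The point is only that per-stream state must survive across chunk boundaries.

```python
class Decimator:
    """Toy stateful resampler: keeps every `factor`-th sample across calls."""

    def __init__(self, factor: int) -> None:
        self.factor = factor
        self._phase = 0  # position inside the current group of `factor` samples

    def process(self, chunk: list[int]) -> list[int]:
        out = []
        for sample in chunk:
            if self._phase == 0:
                out.append(sample)
            self._phase = (self._phase + 1) % self.factor
        return out


signal = list(range(20))
# Chunk size 4 is deliberately not a multiple of the decimation factor 3.
chunks = [signal[i:i + 4] for i in range(0, len(signal), 4)]

# Reference: process the whole signal in one call.
whole = Decimator(3).process(signal)

# Correct: one resampler instance shared by all chunks.
shared_resampler = Decimator(3)
shared = [s for chunk in chunks for s in shared_resampler.process(chunk)]

# Incorrect: a fresh resampler per chunk forgets where the last chunk ended.
fresh = [s for chunk in chunks for s in Decimator(3).process(chunk)]

print(whole)   # [0, 3, 6, 9, 12, 15, 18]
print(shared)  # [0, 3, 6, 9, 12, 15, 18]
print(fresh)   # [0, 3, 4, 7, 8, 11, 12, 15, 16, 19]
```

With a shared instance the chunked output is identical to the one-shot output. With a fresh instance per chunk the output has the wrong length and uneven sample spacing, which with a real resampler would show up as audible glitches at chunk boundaries.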
@@ -246,3 +319,26 @@ You can now see the metrics at `http://localhost:9464/metrics` (make sure that y
 
 - Track.recv errors fail silently. The API contract is to return a frame: never return None; instead, wait until the next frame is available
 - When calling frame.to_ndarray, specify the format explicitly, e.g. frame.to_ndarray(format="rgb24"). Typically you want rgb24 when sending frames to YOLO and similar models
+
+
+## Onboarding Plan for new contributors
+
+**Audio Formats**
+
+You'll notice that audio comes in many formats (PCM, WAV, MP3), sample rates (16 kHz, 48 kHz)
+and sample encodings (i16 or f32). Note that WebRTC uses 48 kHz by default.
+
+A good first intro to audio formats can be found here:
+
+**Using Cursor**
+
+You can ask Cursor something like "read @ai-plugin and build me a plugin called fish".
+See the docs folder for other AI instruction files.
+
+**Learning Roadmap**
+
+1. Quick refresher on audio formats
+2. Build a TTS integration
+3. Build a STT integration
+4. Build an LLM integration
+5. Write a pytest test with a fixture
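As a companion to the "Audio Formats" note above, this sketch shows why a raw PCM buffer is meaningless without its sample rate, channel count and sample encoding: the same bytes describe different durations under different assumptions. `pcm_duration_seconds` and `BYTES_PER_SAMPLE` are hypothetical helpers written for this illustration, not part of the library, and the format names follow the `"s16"` spelling used in the `PcmData.from_bytes` example.

```python
# Bytes used by one sample of one channel, per sample encoding.
BYTES_PER_SAMPLE = {"s16": 2, "f32": 4}


def pcm_duration_seconds(num_bytes: int, sample_rate: int, channels: int, fmt: str) -> float:
    """Duration of an interleaved raw PCM buffer under the given assumptions."""
    frame_size = BYTES_PER_SAMPLE[fmt] * channels  # bytes per sample frame
    if num_bytes % frame_size:
        raise ValueError("buffer length is not a whole number of sample frames")
    return num_bytes / (frame_size * sample_rate)


buffer_size = 96_000  # bytes

# The same buffer, interpreted three different ways:
print(pcm_duration_seconds(buffer_size, 24_000, 1, "s16"))  # 2.0
print(pcm_duration_seconds(buffer_size, 48_000, 2, "s16"))  # 0.5
print(pcm_duration_seconds(buffer_size, 16_000, 1, "f32"))  # 1.5
```

Interpreting 24 kHz mono audio as 48 kHz stereo plays it back four times too fast, which is the kind of bug that carrying the metadata alongside the bytes is meant to prevent.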