This starter app template for LiveKit Agents provides a simple voice interface using the LiveKit Swift SDK. It supports voice, transcriptions, live video input, and virtual avatars.
This template is compatible with iOS, iPadOS, macOS, and visionOS and is free for you to use or modify as you see fit.
First, you'll need a LiveKit agent to speak with. Try our starter agent for Python, Node.js, or create your own from scratch.
Second, you need a token server. The easiest way to set this up is with the Sandbox for LiveKit Cloud and the LiveKit CLI.
Start by creating a new Sandbox Token Server for your LiveKit Cloud project. Then run the following command to clone this template and connect it to LiveKit Cloud. This creates a new Xcode project in the current directory.
lk app create --template agent-starter-swift --sandbox <token_server_sandbox_id>
Then, build and run the app from Xcode by opening VoiceAgent.xcodeproj. You may need to adjust your app signing settings to run the app on your device.
Note
To set up without the LiveKit CLI, clone the repository and then either create a VoiceAgent/.env.xcconfig with a LIVEKIT_SANDBOX_ID (if using a Sandbox Token Server), or modify VoiceAgent/VoiceAgentApp.swift to replace the SandboxTokenSource with a custom token source implementation.
This starter app supports several features of the agents framework. You can enable or disable each one in code as you adapt the template to your own use case.
This app supports text, video, and/or voice input according to the needs of your agent. To update the features enabled in the app, edit VoiceAgent/VoiceAgentApp.swift and modify the .environment() modifiers to enable or disable features.
By default, all features (voice, video, and text input) are enabled. To disable a feature, change its value from true to false. For example, the following keeps voice and text input enabled and disables video input:
.environment(\.voiceEnabled, true) // Enable voice input
.environment(\.videoEnabled, false) // Disable video input
.environment(\.textEnabled, true) // Enable text input
Available input types:
.voice: Allows the user to speak to the agent using their microphone. Requires microphone permissions.
.text: Allows the user to type to the agent. See the docs for more details.
.video: Allows the user to share their camera or screen to the agent. This requires a supported model like the Gemini Live API. See the docs for more details.
If you have trouble with screen sharing, refer to the docs for more setup instructions.
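The .environment() modifiers above rely on SwiftUI's custom environment values. As a rough, standalone sketch of that pattern (not the template's actual source), the following declares a hypothetical captionsEnabled flag and a hypothetical InputBadges view that reads it. Both names are assumptions made for illustration and do not exist in the template.

```swift
import SwiftUI

// Hypothetical feature flag, declared the way SwiftUI custom
// environment values are usually declared. Not part of the template.
private struct CaptionsEnabledKey: EnvironmentKey {
    static let defaultValue = true // feature is on unless overridden
}

extension EnvironmentValues {
    // Exposes the flag as a key path (\.captionsEnabled) so it can be
    // set with .environment(_:_:) and read with @Environment.
    var captionsEnabled: Bool {
        get { self[CaptionsEnabledKey.self] }
        set { self[CaptionsEnabledKey.self] = newValue }
    }
}

// Hypothetical view that adapts its content to the flag.
struct InputBadges: View {
    @Environment(\.captionsEnabled) private var captionsEnabled

    var body: some View {
        Text(captionsEnabled ? "Captions on" : "Captions off")
    }
}
```

A parent view could then apply .environment(\.captionsEnabled, false) to InputBadges, in the same way the template toggles its built-in flags.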
The app is built on top of two main observable components from the LiveKit Swift SDK:
Session object to connect to the LiveKit infrastructure, interact with the Agent and its local state, and send/receive text messages.
LocalMedia object to manage the local media tracks (audio, video, screen sharing) and their lifecycle.
This app enables preConnectAudio by default to capture and buffer audio before the room connection completes. This allows the connection to appear "instant" from the user's perspective and makes your app more responsive. To disable this feature, set preConnectAudio to false in SessionOptions when creating the Session.
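To make the idea of buffering audio before the connection completes more concrete, here is a conceptual sketch in plain Swift. It is not how the LiveKit Swift SDK implements preConnectAudio; the PreConnectBuffer type, its maxChunks limit, and its method names are all hypothetical and exist only to illustrate the hold-then-flush behavior.

```swift
import Foundation

// Conceptual illustration only; all names here are hypothetical.
struct PreConnectBuffer {
    private var pending: [Data] = []
    private var isConnected = false
    let maxChunks: Int

    init(maxChunks: Int = 500) {
        self.maxChunks = maxChunks
    }

    // Called for every captured audio chunk. Returns the chunks that
    // can be sent right away: nothing while still connecting, or the
    // chunk itself once the connection is up.
    mutating func capture(_ chunk: Data) -> [Data] {
        if isConnected { return [chunk] }
        pending.append(chunk)
        // Drop the oldest chunk so memory use stays bounded.
        if pending.count > maxChunks { pending.removeFirst() }
        return []
    }

    // Called once when the connection completes. Returns everything
    // captured so far, in order, and clears the buffer.
    mutating func connectionEstablished() -> [Data] {
        isConnected = true
        let flushed = pending
        pending.removeAll()
        return flushed
    }
}
```

In this sketch, speech captured while connecting is delivered as soon as the connection is ready, which is what lets the user start talking immediately.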
If your agent publishes a virtual avatar, this app will automatically render the avatar's camera feed in AgentView when available.
In production, you'll need to develop a solution to generate tokens for your users that integrates with your authentication system. You should replace your SandboxTokenSource with an EndpointTokenSource or your own TokenSourceFixed or TokenSourceConfigurable implementation. Additionally, you can use the .cached() extension to cache valid tokens and avoid unnecessary token requests.
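As a minimal sketch of what the client side of a custom token flow might involve, the following decodes a response from your own token endpoint and reuses it until shortly before it expires. The ConnectionDetails shape (including its ISO 8601 expiresAt field), the decodeConnectionDetails helper, and the TokenCache type are assumptions made for illustration. They do not implement the SDK's token source protocols and are not the SDK's .cached() extension.

```swift
import Foundation

// Hypothetical response shape returned by your own token endpoint.
struct ConnectionDetails: Codable, Equatable {
    let serverUrl: String
    let participantToken: String
    let expiresAt: Date // assumed to arrive as an ISO 8601 string
}

// Decodes the endpoint's JSON body, assuming the shape above.
func decodeConnectionDetails(from data: Data) throws -> ConnectionDetails {
    let decoder = JSONDecoder()
    decoder.dateDecodingStrategy = .iso8601
    return try decoder.decode(ConnectionDetails.self, from: data)
}

// Keeps the most recent token and returns it only while it is valid.
struct TokenCache {
    private var cached: ConnectionDetails?
    let leeway: TimeInterval // refresh this many seconds before expiry

    init(leeway: TimeInterval = 30) {
        self.leeway = leeway
    }

    mutating func store(_ details: ConnectionDetails) {
        cached = details
    }

    // Returns the cached details, or nil when nothing is cached or
    // the token expires within the leeway window.
    func validDetails(now: Date = Date()) -> ConnectionDetails? {
        guard let cached,
              cached.expiresAt.timeIntervalSince(now) > leeway else {
            return nil
        }
        return cached
    }
}
```

A caller would check validDetails() first and request a fresh token from the authenticated endpoint only when it returns nil, storing the decoded result afterwards.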
To use this template with video (or screen sharing) input, you need to run the app on a physical device. Testing on the Simulator will still support voice and text modes, as well as virtual avatars.
LiveKitWebRTC.xcframework, which is part of the LiveKit Swift SDK, does not contain dSYMs. Submitting the app to the App Store will result in the following warning:
The archive did not include a dSYM for the LiveKitWebRTC.framework with the UUIDs [...]. Ensure that the archive's dSYM folder includes a DWARF file for LiveKitWebRTC.framework with the expected UUIDs.
This warning does not prevent the app from being submitted to the App Store or from passing the review process.
This template is open source and we welcome contributions! Please open a PR or issue through GitHub, and don't forget to join us in the LiveKit Community Slack!
