-
Couldn't load subscription status.
- Fork 244
Comments on Agent SDK prototype #237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 32 commits
356dfb7
effb228
ec87981
ed4a803
a926e16
809e322
7760aec
5e35853
3978348
b2355fc
2bf6419
abe2026
4c1c024
8fcd2ed
9425833
4443c04
994b209
1658dc9
76a3962
fb01009
b4ba41f
b192a46
6b9dfc5
58a5ad3
d38ddf6
91f4a08
728258b
0d312e3
22282c2
4d22785
b2602b8
6f0eeb5
d24be87
8f6301b
00674f4
34c3645
0d6288e
6fae521
82a5fe2
e5b1fc0
f4eeace
ad4c236
730dbed
80038a0
c3993be
fa6cac4
e47c9af
6f12617
1d716d7
a7ff04f
9d7c5b3
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,189 @@ | ||
| import type TypedEventEmitter from 'typed-emitter'; | ||
| import { EventEmitter } from "events"; | ||
| import { ConnectionState, ParticipantEvent, ParticipantKind, RemoteParticipant, Room, RoomEvent, Track } from 'livekit-client'; | ||
| import { getParticipantTrackRefs, participantTrackEvents, TrackReference } from '@/agent-sdk/external-deps/components-js'; | ||
| import { ParticipantEventCallbacks } from '@/agent-sdk/external-deps/client-sdk-js'; | ||
|
|
||
| const stateAttribute = 'lk.agent.state'; | ||
|
|
||
| export type AgentState = | ||
| | 'disconnected' | ||
| | 'connecting' | ||
| | 'initializing' | ||
| | 'listening' | ||
| | 'thinking' | ||
| | 'speaking'; | ||
|
|
||
| export enum AgentEvent { | ||
| VideoTrackChanged = 'videoTrackChanged', | ||
| AudioTrackChanged = 'videoTrackChanged', | ||
| AgentAttributesChanged = 'agentAttributesChanged', | ||
| AgentStateChanged = 'agentStateChanged', | ||
| } | ||
|
|
||
| export type AgentCallbacks = { | ||
| [AgentEvent.VideoTrackChanged]: (newTrack: TrackReference | null) => void; | ||
| [AgentEvent.AudioTrackChanged]: (newTrack: TrackReference | null) => void; | ||
| [AgentEvent.AgentAttributesChanged]: (newAttributes: Record<string, string>) => void; | ||
| [AgentEvent.AgentStateChanged]: (newState: AgentState) => void; | ||
| }; | ||
|
|
||
| /** | ||
| * Agent encapculates all agent state, normalizing some quirks around how LiveKit Agents work. | ||
| */ | ||
| export default class Agent extends (EventEmitter as new () => TypedEventEmitter<AgentCallbacks>) { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Any strong (or JS-specific) arguments why it couldn't be flattened into one "observable" object? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
FWIW that was the intention, I want to avoid It sounds like you are proposing this become more of a general "state container" that would have a larger responsibility than it currently does? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, 2 main points here:
Personally I don't think |
||
| private room: Room; | ||
| state: AgentState = 'disconnected'; | ||
|
|
||
| private agentParticipant: RemoteParticipant | null = null; | ||
| private workerParticipant: RemoteParticipant | null = null; // ref: https://docs.livekit.io/agents/integrations/avatar/#avatar-workers | ||
| audioTrack: TrackReference | null = null; | ||
| videoTrack: TrackReference | null = null; | ||
|
|
||
| attributes: Record<string, string> = {}; | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the defined attributes are primarily intended for internal usage. If anything, I think we should strip all internal There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point, probably orthogonal to this PR 👍 |
||
|
|
||
| constructor(room: Room) { | ||
| super(); | ||
| this.room = room; | ||
|
|
||
| this.room.on(RoomEvent.ParticipantConnected, this.handleParticipantConnected); | ||
| this.room.on(RoomEvent.ParticipantDisconnected, this.handleParticipantDisconnected); | ||
| this.room.on(RoomEvent.ConnectionStateChanged, this.handleConnectionStateChanged); | ||
| this.updateAgentState(); | ||
| } | ||
|
|
||
| teardown() { | ||
| this.room.off(RoomEvent.ParticipantConnected, this.handleParticipantConnected); | ||
| this.room.off(RoomEvent.ParticipantDisconnected, this.handleParticipantDisconnected); | ||
| this.room.off(RoomEvent.ConnectionStateChanged, this.handleConnectionStateChanged); | ||
| } | ||
|
|
||
| private handleParticipantConnected = () => { | ||
| this.updateParticipants(); | ||
| } | ||
| private handleParticipantDisconnected = () => { | ||
| this.updateParticipants(); | ||
| } | ||
|
|
||
| private handleConnectionStateChanged = () => { | ||
| this.updateAgentState(); | ||
| } | ||
|
|
||
| private updateParticipants() { | ||
| const newAgentParticipant = this.roomRemoteParticipants.find( | ||
| (p) => p.kind === ParticipantKind.AGENT && !('lk.publish_on_behalf' in p.attributes), | ||
| ) ?? null; | ||
| const newWorkerParticipant = newAgentParticipant ? ( | ||
| this.roomRemoteParticipants.find( | ||
| (p) => | ||
| p.kind === ParticipantKind.AGENT && p.attributes['lk.publish_on_behalf'] === newAgentParticipant.identity, | ||
| ) ?? null | ||
| ) : null; | ||
|
|
||
| const oldAgentParticipant = this.agentParticipant; | ||
| const oldWorkerParticipant = this.workerParticipant; | ||
| this.agentParticipant = newAgentParticipant; | ||
| this.workerParticipant = newWorkerParticipant; | ||
|
|
||
| // 1. Listen for attribute changes | ||
| if (oldAgentParticipant !== this.agentParticipant) { | ||
| oldAgentParticipant?.off(ParticipantEvent.AttributesChanged, this.handleAttributesChanged); | ||
|
|
||
| if (this.agentParticipant) { | ||
| this.agentParticipant.on(ParticipantEvent.AttributesChanged, this.handleAttributesChanged); | ||
| this.handleAttributesChanged(this.agentParticipant.attributes); | ||
| } | ||
| } | ||
|
|
||
| // 2. Listen for track updates | ||
| for (const event of participantTrackEvents) { | ||
| if (oldAgentParticipant !== this.agentParticipant) { | ||
| oldAgentParticipant?.off(event as keyof ParticipantEventCallbacks, this.handleUpdateTracks); | ||
| if (this.agentParticipant) { | ||
| this.agentParticipant.on(event as keyof ParticipantEventCallbacks, this.handleUpdateTracks); | ||
| this.handleUpdateTracks(); | ||
| } | ||
| } | ||
| if (oldWorkerParticipant !== this.workerParticipant) { | ||
| oldWorkerParticipant?.off(event as keyof ParticipantEventCallbacks, this.handleUpdateTracks); | ||
| if (this.workerParticipant) { | ||
| this.workerParticipant.on(event as keyof ParticipantEventCallbacks, this.handleUpdateTracks); | ||
| this.handleUpdateTracks(); | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| private handleUpdateTracks = () => { | ||
| const newVideoTrack = ( | ||
| this.agentTracks.find((t) => t.source === Track.Source.Camera) ?? | ||
| this.workerTracks.find((t) => t.source === Track.Source.Camera) ?? null | ||
| ); | ||
| if (this.videoTrack !== newVideoTrack) { | ||
| this.videoTrack = newVideoTrack; | ||
| this.emit(AgentEvent.VideoTrackChanged, newVideoTrack); | ||
| } | ||
|
|
||
| const newAudioTrack = ( | ||
| this.agentTracks.find((t) => t.source === Track.Source.Microphone) ?? | ||
| this.workerTracks.find((t) => t.source === Track.Source.Microphone) ?? null | ||
| ); | ||
| if (this.audioTrack !== newAudioTrack) { | ||
1egoman marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| this.audioTrack = newAudioTrack; | ||
| this.emit(AgentEvent.AudioTrackChanged, newAudioTrack); | ||
| } | ||
| }; | ||
|
|
||
| private handleAttributesChanged = (attributes: Record<string, string>) => { | ||
| this.attributes = attributes; | ||
| this.emit(AgentEvent.AgentAttributesChanged, attributes); | ||
| this.updateAgentState(); | ||
| }; | ||
|
|
||
| private updateAgentState() { | ||
| let newAgentState: AgentState | null = null; | ||
| const connectionState = this.room.state; | ||
|
|
||
| if (connectionState === ConnectionState.Disconnected) { | ||
| newAgentState = 'disconnected'; | ||
| } else if ( | ||
| connectionState === ConnectionState.Connecting || | ||
| !this.agentParticipant || | ||
| !this.attributes[stateAttribute] | ||
| ) { | ||
| newAgentState = 'connecting'; | ||
| } else { | ||
| newAgentState = this.attributes[stateAttribute] as AgentState; | ||
| } | ||
| console.log('!! STATE:', newAgentState, this.agentParticipant?.attributes); | ||
|
|
||
| if (this.state !== newAgentState) { | ||
| this.state = newAgentState; | ||
| this.emit(AgentEvent.AgentStateChanged, newAgentState); | ||
| } | ||
| } | ||
|
|
||
| private get roomRemoteParticipants() { | ||
| return Array.from(this.room.remoteParticipants.values()); | ||
| } | ||
|
|
||
| private get agentTracks() { | ||
| if (!this.agentParticipant) { | ||
| return []; | ||
| } | ||
| return getParticipantTrackRefs( | ||
| this.agentParticipant, | ||
| { sources: [Track.Source.Microphone, Track.Source.Camera] } | ||
| ); | ||
| } | ||
|
|
||
| private get workerTracks() { | ||
| if (!this.workerParticipant) { | ||
| return []; | ||
| } | ||
| return getParticipantTrackRefs( | ||
| this.workerParticipant, | ||
| { sources: [Track.Source.Microphone, Track.Source.Camera] } | ||
| ); | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still thinking about two different states.
One being the agent's conversational state and the other being the sessions (connection) state.
It's not ideal either as users potentially have to consider two different states, but it would make the whole preconnectbuffer thing easier to deal with as it would allow for an agent to be
listeningwhile the session is stillconnecting.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe adding these two states, and leaning more into properties like what I did with
isAvailablecould be an option - these properties could check specific combinations of states to determine whether common situations are happening or not?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented here - this ended up being pretty elegant with the getters like
isAvailable/isConnected/ etc, I think encouraging users to interact with these state values on that level is the way to go.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the rule of thumb should be: does it make sense in the (custom)
componentsor not - if you need to compose the state there, it should be moved inside.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anecdotally, I will say in the work I have done to try to integrate these abstractions into
agent-starter-react, I've definitely needed the more abstract / composed values (ie,isAvailablecontrolling whether ui elements related to the agent should be visible, etc). So I think it probably makes sense.