Kafka Source Connector using Spotify as the data source.
- Clone and build the project
> git clone https://github.com/msschroe3/kafka-connect-spotify
> cd kafka-connect-spotify
> ./gradlew clean shadowJar
- Start the docker environment by running
> docker-compose up -d
- Duplicate spotify-source.template.json and rename the new file spotify-source.json
  - git will ignore spotify-source.json so that you don't have to worry about committing Spotify credentials
- Update spotify-source.json with Spotify credentials
  - See the Generating Spotify Credentials section for more details
- POST the configuration to the connect worker to start the connector
> curl -X POST -H "Content-Type: application/json" --data @spotify-source.json localhost:8083/connectors
- Debugging is enabled by default. See the IntelliJ Remote Debugger Setup section for instructions on attaching to the remote process and the Debugging section for general debugging details.
  - If DEBUG_SUSPEND_FLAG is set to y in the docker-compose file, the connect service's startup will pause until a remote debugger is connected.
- Confluent Control Center (running at localhost:9021)
- Kafka Topics UI (running at localhost:8000)
If you don't need these, feel free to comment them out of the docker-compose file to save memory.
See spotify-source.json for the full configuration set.
Spotify Config Values:
- spotify.oauth.clientId: The client ID provided when setting up a Spotify Integration.
- spotify.oauth.clientSecret: The client secret provided when setting up a Spotify Integration.
- spotify.oauth.accessToken: Access token for calling the Spotify API.
- spotify.kafka.topic: Name of the topic that your messages will be sent to.
There are two types of credentials that can be provided to the connector so that it can communicate with the Spotify API.
Access token: This is a simple token that will expire after 30 minutes. This approach is best for a simple test.
If using this approach, remove the client ID and secret configuration values.
Client ID and secret: These are credentials that can be used to create tokens on the fly. This approach is best for those who plan to start the Spotify connector and let it run indefinitely.
1. Go to the Developer Dashboard
2. Log in with your Spotify account
3. Create an App
4. Copy the Client ID and Secret into the spotify-source.json config file
If using this approach, remove the access token configuration value.
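The "keep one credential type, remove the other" rule above can be sketched as a small helper. This is only an illustration, not part of the connector: `apply_credentials` is a hypothetical name, and the sketch assumes spotify-source.json has the usual Kafka Connect shape of a `name` plus a `config` map containing the keys listed under Spotify Config Values.

```python
import copy

ACCESS_TOKEN_KEY = "spotify.oauth.accessToken"
CLIENT_ID_KEY = "spotify.oauth.clientId"
CLIENT_SECRET_KEY = "spotify.oauth.clientSecret"


def apply_credentials(connector, access_token=None, client_id=None, client_secret=None):
    """Return a copy of the connector JSON that holds exactly one credential type.

    Hypothetical helper; assumes the JSON looks like {"name": ..., "config": {...}}.
    """
    use_token = access_token is not None
    use_client = client_id is not None or client_secret is not None
    if use_token == use_client:
        raise ValueError("provide either an access token or a client ID and secret")
    if use_client and (client_id is None or client_secret is None):
        raise ValueError("client ID and client secret must be provided together")

    result = copy.deepcopy(connector)  # leave the loaded template untouched
    config = result.setdefault("config", {})
    # Drop every credential key first so the unused type is removed.
    for key in (ACCESS_TOKEN_KEY, CLIENT_ID_KEY, CLIENT_SECRET_KEY):
        config.pop(key, None)
    if use_token:
        config[ACCESS_TOKEN_KEY] = access_token
    else:
        config[CLIENT_ID_KEY] = client_id
        config[CLIENT_SECRET_KEY] = client_secret
    return result
```

One way to use it would be to load spotify-source.template.json with `json.load`, pass the result through `apply_credentials`, and write the output to spotify-source.json with `json.dump`.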
To see if messages are flowing, exec into the broker container and use the built-in kafka-console-consumer CLI.
> docker exec -it ${broker-container-id} bash
> kafka-console-consumer --bootstrap-server localhost:9092 --topic spotify_play_history --from-beginning
# get number of messages written to topic
> kafka-consumer-offset-checker --topic spotify_play_history --zookeeper zookeeper:2181
You can also view the messages in the Kafka Topics UI on port 8000.
The REST endpoints available via the connect service can help manage the setup, updating and teardown of connectors and their tasks. Replace spotify-play-history with the 'name' value from your config JSON file.
curl -X POST -H "Content-Type: application/json" --data @spotify-source.json localhost:8083/connectors
curl localhost:8083/connectors
curl localhost:8083/connectors/spotify-play-history/tasks | jq
curl localhost:8083/connectors/spotify-play-history/status | jq
curl localhost:8083/connectors/spotify-play-history | jq
curl -X PUT localhost:8083/connectors/spotify-play-history/pause
curl -X PUT localhost:8083/connectors/spotify-play-history/resume
curl -X DELETE localhost:8083/connectors/spotify-play-history
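For scripting these calls instead of typing them, the sketch below builds the same requests with Python's standard library and picks failed tasks out of a status response. `connector_request` and `failed_tasks` are hypothetical helper names. The sketch assumes the connect worker is reachable at localhost:8083 as in the curl commands above, and that the status body has the standard Kafka Connect layout with a `tasks` list of `id`/`state` entries.

```python
import urllib.request

BASE_URL = "http://localhost:8083"  # connect worker address used in the curl commands

# action name -> (HTTP method, path suffix), mirroring the curl commands above
ROUTES = {
    "info": ("GET", ""),
    "status": ("GET", "/status"),
    "tasks": ("GET", "/tasks"),
    "pause": ("PUT", "/pause"),
    "resume": ("PUT", "/resume"),
    "delete": ("DELETE", ""),
}


def connector_request(name, action="info", base_url=BASE_URL):
    """Build (without sending) the request for one connector action."""
    method, suffix = ROUTES[action]
    url = f"{base_url}/connectors/{name}{suffix}"
    return urllib.request.Request(url, method=method)


def failed_tasks(status):
    """Return the IDs of tasks reported as FAILED in a parsed status response."""
    return [task["id"] for task in status.get("tasks", []) if task.get("state") == "FAILED"]
```

A request is sent by passing it to `urllib.request.urlopen`, for example `urllib.request.urlopen(connector_request("spotify-play-history", "status"))`, and the response body can then be parsed with `json.load` before being handed to `failed_tasks`.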
The connect service's docker-compose setup has a few properties that enable remote debugging of the connector.
# this snippet only contains properties related to debugging
connect:
  ports:
    # expose the default remote debug port
    - "5005:5005"
  environment:
    # enable remote debugging
    KAFKA_DEBUG: y
    # suspend startup until debugger is attached (optional)
    DEBUG_SUSPEND_FLAG: y
The DEBUG_SUSPEND_FLAG can be helpful if errors are thrown during startup and you want to step through them.
Without it, the connector and its task(s) might be up and running before you have a chance to step into the code.
- Navigate to Run > Edit Configurations
- Select 'Add New Configuration (⌘N)'
- Name the configuration, then set Host: localhost and Port: 5005
- Save the configuration and start the debugger
To start executing KSQL on the CLI server, execute
docker-compose exec ksql-cli ksql http://ksql-server:8088
To view all records in a topic, set auto.offset.reset to earliest so that reading starts from the beginning.
ksql> SET 'auto.offset.reset' = 'earliest';
ksql> print 'spotify_play_history';
If a song comes through more than x times, stuff it in a "favorites playlist"
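The counting behind that idea can be sketched in a few lines of Python. This is not a KSQL query and not something the connector does: `favorite_tracks` is a hypothetical helper, and because the record schema is not shown here, it assumes the track identifiers have already been extracted from the consumed records into a plain list.

```python
from collections import Counter


def favorite_tracks(track_ids, threshold):
    """Return, sorted, the track IDs that appear more than `threshold` times.

    `track_ids` is assumed to be one ID per play-history record.
    """
    counts = Counter(track_ids)
    return sorted(track for track, plays in counts.items() if plays > threshold)
```

The returned IDs would be the candidates for the "favorites playlist". Adding them to an actual playlist would need a separate call to the Spotify API, which is outside this sketch.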