Metarank: real time personalization as a service

Docs | Website | Community Slack | Blog | Demo

What is Metarank?

Metarank is an open-source ranking service. It helps you build personalized semantic/neural search and recommendations.

If you just want to get started, try:

the quickstart tutorial on implementing Learning-to-Rank on top of your search engine.
the semantic search guide to building an LLM-based neural search.
the collaborative filtering recommendations guide to creating a "you may also like" widget, as seen on many e-commerce stores.

Why Metarank?

With Metarank, you can make your existing search and recommendations smarter:

Integrate customer signals like clicks and purchases into the ranking - and optimize for maximal CTR!
Track visitor profiles and make search results adapt to user actions with real-time personalization.
Use LLMs in bi- and cross-encoder mode to make your search understand the true meaning of search queries.

Metarank is fast:

optimized for reranking latency, it can handle even large result sets within 10-20ms. See benchmarks.
as a stateless cloud-native service (with state managed by Redis), it can scale horizontally and process thousands of RPS. See Kubernetes deployment guide for details.

Save your development time:

Metarank can compute dozens of typical ranking signals out of the box: CTR, referer, User-Agent, time, etc. You don't need to write custom ad-hoc code for the most common ranking factors. See the full list of supported ranking signals in our docs.
Metarank integrates with many stream processing systems to ingest visitor signals. See data sources for details.

What can you build with Metarank?

Metarank helps you build advanced ranking systems for search and recommendations:

Semantic search: use state-of-the-art LLMs to make your Elasticsearch/OpenSearch understand the meaning of your queries
Recommendations: traditional collaborative-filtering and new-age semantic content recommendations.
Learning-to-Rank: optimize your existing search

Content

Meetups and conference talks:

Building an open-source online Learn-to-rank engine, Haystack EU 23, slides
Overcoming position and presentation biases in search and recommender systems, Data Natives Meetup Berlin, slides
Learning-to-rank: Deep, fast, precise - choose any two, DataTalks meetup, slides

Main features

Semantic neural search: [TODO]
Recommendations: trending and similar-items (MF ALS).
Personalization: secondary reranking (LambdaMART)
AutoML: automatic feature generation and model re-training
A/B testing: multiple model serving

Demo

You can play with the Metarank demo at demo.metarank.ai.

The demo and the data it uses are open-source; you can grab a copy of the training events and config file from the GitHub repo.

Metarank in One Minute

Let us show how you can start personalizing content with LambdaMART-based reranking in just under a minute:

Prepare the data: we will get the dataset and config file used by the demo.metarank.ai demo.
Start Metarank in standalone mode: it will import the data, train the ML model and start the API.
Send a couple of requests to the API.

Step 1: Prepare data

We will use the ranklens dataset, which also powers our Demo, so just download the data file:

curl -O -L https://github.com/metarank/metarank/raw/master/src/test/resources/ranklens/events/events.jsonl.gz
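Before moving on, it can help to confirm that the download is a valid gzip archive with one JSON event per line. The helpers below are a minimal sketch; `peek_events` and `count_events` are hypothetical names introduced here for illustration and are not part of Metarank.

```shell
# Print the first N events (default 3) from a gzip-compressed JSONL file.
peek_events() {
  gzip -dc "$1" | head -n "${2:-3}"
}

# Count the events in the file, assuming one JSON object per line.
count_events() {
  gzip -dc "$1" | wc -l | tr -d ' '
}

# Usage:
#   peek_events events.jsonl.gz 2
#   count_events events.jsonl.gz
```

If `gzip` reports an error here, the download was most likely truncated or saved as an HTML error page.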

Step 2: Prepare configuration file

We will again use the configuration file from our Demo. It uses an in-memory store, so no other dependencies are needed.

curl -O -L https://raw.githubusercontent.com/metarank/metarank/master/src/test/resources/ranklens/config.yml

Step 3: Start Metarank!

In the final step, we will use Metarank's standalone mode, which combines training and running the API in one command:

docker run -i -t -p 8080:8080 -v "$(pwd)":/opt/metarank metarank/metarank:latest standalone --config /opt/metarank/config.yml --data /opt/metarank/events.jsonl.gz

You will see some useful output while Metarank is starting and grinding through the data. Once this is done, you can send requests to localhost:8080 to get personalized results.
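If you script this step, you may want to block until the API answers instead of watching the logs. The following is a minimal sketch of a polling loop; `wait_for_api` is a hypothetical helper, and the `/health` path in the usage line is an assumption, so substitute any endpoint your deployment is known to expose.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# Arguments: URL, max attempts (default 30), delay in seconds (default 2).
wait_for_api() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: treat HTTP errors as failures; -s: silent; -o /dev/null: drop the body
    if curl -fs -o /dev/null "$url"; then
      echo "API is up after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "API did not respond after $attempts attempts" >&2
  return 1
}

# Usage (the /health path is an assumption, not confirmed by this README):
#   wait_for_api http://localhost:8080/health 60 2
```

Training on the demo dataset takes a while, so a generous attempt count is usually a safer choice than a short one.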

Here we will interact with several movies by clicking on one of them and observing the results.

First, let's see the initial output provided by Metarank before we interact with it:

# get initial ranking for some items
curl http://localhost:8080/rank/xgboost \
    -d '{
    "event": "ranking",
    "id": "id1",
    "items": [
        {"id":"72998"}, {"id":"67197"}, {"id":"77561"},
        {"id":"68358"}, {"id":"79132"}, {"id":"103228"}, 
        {"id":"72378"}, {"id":"85131"}, {"id":"94864"}, 
        {"id":"68791"}, {"id":"93363"}, {"id":"112623"}
    ],
    "user": "alice",
    "session": "alice1",
    "timestamp": 1661431886711
}'

# {"items":[{"item":"72998","score":0.9602446652021992},{"item":"79132","score":0.7819134441404151},{"item":"68358","score":0.33377910321385645},{"item":"112623","score":0.32591281190727805},{"item":"103228","score":0.31640256043322723},{"item":"77561","score":0.3040782705414116},{"item":"94864","score":0.17659007036183608},{"item":"72378","score":0.06164568676567339},{"item":"93363","score":0.058120639770243385},{"item":"68791","score":0.026919880032451306},{"item":"85131","score":-0.35794106000271037},{"item":"67197","score":-0.48735167237049154}]}
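The ranking request above has a fixed shape: an event type, a request id, a list of item ids, a user, a session and a timestamp. As a sketch of how you might generate it from a list of candidates, the hypothetical `rank_payload` helper below assembles the same JSON with `printf`. It assumes ids contain no characters that need JSON escaping.

```shell
# Build a ranking request body.
# Arguments: request id, user, session, timestamp (ms), then one or more item ids.
rank_payload() {
  id="$1"; user="$2"; session="$3"; ts="$4"
  shift 4
  items=""
  for item in "$@"; do
    # Append {"id":"..."}; add a comma only when the list is non-empty
    items="${items:+$items,}{\"id\":\"$item\"}"
  done
  printf '{"event":"ranking","id":"%s","items":[%s],"user":"%s","session":"%s","timestamp":%s}\n' \
    "$id" "$items" "$user" "$session" "$ts"
}

# Usage: pipe the body to curl, which reads it from stdin via -d @-
#   rank_payload id1 alice alice1 1661431886711 72998 67197 77561 \
#     | curl http://localhost:8080/rank/xgboost -d @-
```

For real traffic, a proper JSON library is the safer choice, since hand-built strings break as soon as an id contains a quote or backslash.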

# tell Metarank which items were presented to the user and in which order from the previous request
# optionally, we can include the score calculated by Metarank or your internal retrieval system
curl http://localhost:8080/feedback \
 -d '{
  "event": "ranking",
  "fields": [],
  "id": "test-ranking",
  "items": [
    {"id":"72998","score":0.9602446652021992},{"id":"79132","score":0.7819134441404151},{"id":"68358","score":0.33377910321385645},
    {"id":"112623","score":0.32591281190727805},{"id":"103228","score":0.31640256043322723},{"id":"77561","score":0.3040782705414116},
    {"id":"94864","score":0.17659007036183608},{"id":"72378","score":0.06164568676567339},{"id":"93363","score":0.058120639770243385},
    {"id":"68791","score":0.026919880032451306},{"id":"85131","score":-0.35794106000271037},{"id":"67197","score":-0.48735167237049154}
  ],
  "user": "test",
  "session": "test",
  "timestamp": 1661431888711
}'

Now, let's interact with item 93363:

# click on the item with id 93363
curl http://localhost:8080/feedback \
 -d '{
  "event": "interaction",
  "type": "click",
  "fields": [],
  "id": "test-interaction",
  "ranking": "test-ranking",
  "item": "93363",
  "user": "test",
  "session": "test",
  "timestamp": 1661431890711
}'

Metarank will now personalize the items, so their order in the response will be different:

# personalize the same list of items
# they will be returned in a different order by Metarank
curl http://localhost:8080/rank/xgboost \
 -d '{
  "event": "ranking",
  "fields": [],
  "id": "test-personalized",
  "items": [
    {"id":"72998"}, {"id":"67197"}, {"id":"77561"},
    {"id":"68358"}, {"id":"79132"}, {"id":"103228"}, 
    {"id":"72378"}, {"id":"85131"}, {"id":"94864"}, 
    {"id":"68791"}, {"id":"93363"}, {"id":"112623"}
  ],
  "user": "test",
  "session": "test",
  "timestamp": 1661431892711
}'

# {"items":[{"item":"93363","score":2.2013986484185124},{"item":"72998","score":1.1542776301073876},{"item":"68358","score":0.9828904282341605},{"item":"112623","score":0.9521647429731446},{"item":"79132","score":0.9258841742518286},{"item":"77561","score":0.8990921381835769},{"item":"103228","score":0.8990921381835769},{"item":"94864","score":0.7131600718467729},{"item":"68791","score":0.624462038351694},{"item":"72378","score":0.5269765094008626},{"item":"85131","score":0.29198666089255343},{"item":"67197","score":0.16412780810560743}]}
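To compare the two rankings at a glance, you can reduce each response to its ordered list of item ids. The sketch below uses a hypothetical `item_order` filter built on `grep` and `cut`; it pattern-matches the `"item":"..."` pairs seen in the responses above and is not a real JSON parser.

```shell
# Read a ranking response on stdin and print the item ids, one per line,
# in the order Metarank returned them.
item_order() {
  # grep -o emits each "item":"<id>" match on its own line;
  # splitting on double quotes puts the id in the fourth field
  grep -o '"item":"[^"]*"' | cut -d'"' -f4
}

# Usage:
#   curl -s http://localhost:8080/rank/xgboost -d @request.json | item_order
```

Applied to the personalized response above, the first id printed is 93363, the item that was clicked.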

Useful Links

What's next?

Check out the more in-depth Quickstart and the full Reference.

If you have any questions, don't hesitate to join our Slack!

License

This project is released under the Apache 2.0 license, as specified in the License file.
