GitHub - duncantmiller/langchainrb: Build LLM-backed Ruby applications

💎🔗 Langchain.rb

⚡ Building applications with LLMs through composability ⚡

For deep Rails integration see: langchainrb_rails gem.

Available for paid consulting engagements! Email me.

Langchain.rb is a library that provides an abstraction layer on top of many emerging AI, ML and other DS tools. The goal is to abstract away complexity and difficult concepts so that building AI/ML-supercharged applications is approachable for traditional software engineers.

Installation

Install the gem and add to the application's Gemfile by executing:

bundle add langchainrb

If bundler is not being used to manage dependencies, install the gem by executing:

gem install langchainrb

Usage

require "langchain"

Supported vector search databases and features:

Database	Querying	Storage	Schema Management	Backups	Rails Integration
Chroma	✅	✅	✅	WIP	✅
Hnswlib	✅	✅	✅	WIP	WIP
Milvus	✅	✅	✅	WIP	✅
Pinecone	✅	✅	✅	WIP	✅
Pgvector	✅	✅	✅	WIP	✅
Qdrant	✅	✅	✅	WIP	✅
Weaviate	✅	✅	✅	WIP	✅

Using Vector Search Databases 🔍

Choose the LLM provider you'll be using (OpenAI or Cohere) and retrieve the API key.

Add gem "weaviate-ruby", "~> 0.8.3" to your Gemfile.

Pick the vector search database you'll be using and instantiate the client:

client = Langchain::Vectorsearch::Weaviate.new(
    url: ENV["WEAVIATE_URL"],
    api_key: ENV["WEAVIATE_API_KEY"],
    index_name: "",
    llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
)

# You can instantiate any other supported vector search database:
client = Langchain::Vectorsearch::Chroma.new(...) # `gem "chroma-db", "~> 0.6.0"`
client = Langchain::Vectorsearch::Hnswlib.new(...) # `gem "hnswlib", "~> 0.8.1"`
client = Langchain::Vectorsearch::Milvus.new(...) # `gem "milvus", "~> 0.9.2"`
client = Langchain::Vectorsearch::Pinecone.new(...) # `gem "pinecone", "~> 0.1.6"`
client = Langchain::Vectorsearch::Pgvector.new(...) # `gem "pgvector", "~> 0.2"`
client = Langchain::Vectorsearch::Qdrant.new(...) # `gem "qdrant-ruby", "~> 0.9.3"`

# Creating the default schema
client.create_default_schema

# Store plain texts in your vector search database
client.add_texts(
    texts: [
        "Begin by preheating your oven to 375°F (190°C). Prepare four boneless, skinless chicken breasts by cutting a pocket into the side of each breast, being careful not to cut all the way through. Season the chicken with salt and pepper to taste. In a large skillet, melt 2 tablespoons of unsalted butter over medium heat. Add 1 small diced onion and 2 minced garlic cloves, and cook until softened, about 3-4 minutes. Add 8 ounces of fresh spinach and cook until wilted, about 3 minutes. Remove the skillet from heat and let the mixture cool slightly.",
        "In a bowl, combine the spinach mixture with 4 ounces of softened cream cheese, 1/4 cup of grated Parmesan cheese, 1/4 cup of shredded mozzarella cheese, and 1/4 teaspoon of red pepper flakes. Mix until well combined. Stuff each chicken breast pocket with an equal amount of the spinach mixture. Seal the pocket with a toothpick if necessary. In the same skillet, heat 1 tablespoon of olive oil over medium-high heat. Add the stuffed chicken breasts and sear on each side for 3-4 minutes, or until golden brown."
    ]
)

# Store the contents of your files in your vector search database
my_pdf = Langchain.root.join("path/to/my.pdf")
my_text = Langchain.root.join("path/to/my.txt")
my_docx = Langchain.root.join("path/to/my.docx")

client.add_data(paths: [my_pdf, my_text, my_docx])

# Retrieve similar documents based on the query string passed in
client.similarity_search(
    query:,
    k:       # number of results to be retrieved
)

# Retrieve similar documents based on the query string passed in via the [HyDE technique](https://arxiv.org/abs/2212.10496)
client.similarity_search_with_hyde()

# Retrieve similar documents based on the embedding passed in
client.similarity_search_by_vector(
    embedding:,
    k:       # number of results to be retrieved
)

# Q&A-style querying based on the question passed in
client.ask(
    question:
)
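To make the idea behind similarity_search_by_vector more concrete, here is a minimal in-memory sketch of ranking stored texts by cosine similarity to a query embedding. It is only an illustration: the helper names (cosine_similarity, top_k) and the tiny three-dimensional embeddings are assumptions made up for this example, and the supported databases use their own indexes and distance metrics rather than this brute-force scan.

```ruby
# Toy in-memory vector search: cosine similarity plus top-k selection.
# All names and data here are hypothetical and not part of Langchain.rb.

# Cosine similarity between two equal-length numeric arrays.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Returns the k stored texts whose embeddings are closest to the query,
# most similar first.
def top_k(store, query_embedding, k:)
  store
    .max_by(k) { |entry| cosine_similarity(entry[:embedding], query_embedding) }
    .map { |entry| entry[:text] }
end

store = [
  { text: "Preheat the oven",  embedding: [1.0, 0.0, 0.0] },
  { text: "Stuff the chicken", embedding: [0.0, 1.0, 0.0] },
  { text: "Sear until golden", embedding: [0.7, 0.7, 0.0] }
]

results = top_k(store, [0.9, 0.1, 0.0], k: 2)
puts results
```

In a real setup the embeddings would come from the configured LLM and the ranking would be delegated to the vector search database.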

Integrating Vector Search into ActiveRecord models

class Product < ActiveRecord::Base
  vectorsearch provider: Langchain::Vectorsearch::Qdrant.new(
                 api_key: ENV["QDRANT_API_KEY"],
                 url: ENV["QDRANT_URL"],
                 index_name: "Products",
                 llm: Langchain::LLM::GooglePalm.new(api_key: ENV["GOOGLE_PALM_API_KEY"])
               )

  after_save :upsert_to_vectorsearch
end

Exposed ActiveRecord methods

# Retrieve similar products based on the query string passed in
Product.similarity_search(
    query:,
    k:       # number of results to be retrieved
)

# Q&A-style querying based on the question passed in
Product.ask(
    question:
)

Additional info here.

Using Standalone LLMs 🗣️

Add gem "ruby-openai", "~> 4.0.0" to your Gemfile.

OpenAI

openai = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])

You can pass additional parameters to the constructor; they will be passed on to the OpenAI client:

openai = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"], llm_options: {uri_base: "http://localhost:1234"})

openai.embed(text: "foo bar")

openai.complete(prompt: "What is the meaning of life?")

OpenAI function calls support

Conversation support

chat = Langchain::Conversation.new(llm: openai)

chat.set_context("You are the climate bot")
chat.set_functions(functions)

For a vector search client (e.g. Qdrant), set the functions on its LLM:

client.llm.functions = functions

Azure

Add gem "ruby-openai", "~> 5.2.0" to your Gemfile.

azure = Langchain::LLM::Azure.new(
  api_key: ENV["AZURE_API_KEY"],
  llm_options: {
    api_type: :azure,
    api_version: "2023-03-15-preview"
  },
  embedding_deployment_url: ENV.fetch("AZURE_EMBEDDING_URI"),
  chat_deployment_url: ENV.fetch("AZURE_CHAT_URI")
)

where AZURE_EMBEDDING_URI is e.g. https://custom-domain.openai.azure.com/openai/deployments/ada-2 and AZURE_CHAT_URI is e.g. https://custom-domain.openai.azure.com/openai/deployments/gpt-35-turbo

You can pass additional parameters to the constructor; they will be passed on to the Azure client:

azure = Langchain::LLM::Azure.new(
  api_key: ENV["AZURE_API_KEY"],
  llm_options: {
    api_type: :azure,
    api_version: "2023-03-15-preview",
    request_timeout: 240 # Optional
  },
  embedding_deployment_url: ENV.fetch("AZURE_EMBEDDING_URI"),
  chat_deployment_url: ENV.fetch("AZURE_CHAT_URI")
)

azure.embed(text: "foo bar")

azure.complete(prompt: "What is the meaning of life?")

Cohere

Add gem "cohere-ruby", "~> 0.9.6" to your Gemfile.

cohere = Langchain::LLM::Cohere.new(api_key: ENV["COHERE_API_KEY"])

cohere.embed(text: "foo bar")

cohere.complete(prompt: "What is the meaning of life?")

HuggingFace

Add gem "hugging-face", "~> 0.3.2" to your Gemfile.

hugging_face = Langchain::LLM::HuggingFace.new(api_key: ENV["HUGGING_FACE_API_KEY"])

Replicate

Add gem "replicate-ruby", "~> 0.2.2" to your Gemfile.

replicate = Langchain::LLM::Replicate.new(api_key: ENV["REPLICATE_API_KEY"])

Google PaLM (Pathways Language Model)

Add gem "google_palm_api", "~> 0.1.3" to your Gemfile.

google_palm = Langchain::LLM::GooglePalm.new(api_key: ENV["GOOGLE_PALM_API_KEY"])

AI21

Add gem "ai21", "~> 0.2.1" to your Gemfile.

ai21 = Langchain::LLM::AI21.new(api_key: ENV["AI21_API_KEY"])

Anthropic

Add gem "anthropic", "~> 0.1.0" to your Gemfile.

anthropic = Langchain::LLM::Anthropic.new(api_key: ENV["ANTHROPIC_API_KEY"])

anthropic.complete(prompt: "What is the meaning of life?")

Ollama

ollama = Langchain::LLM::Ollama.new(url: ENV["OLLAMA_URL"])

ollama.complete(prompt: "What is the meaning of life?")

ollama.embed(text: "Hello world!")

Using Prompts 📋

Prompt Templates

Create a prompt with one input variable:

prompt = Langchain::Prompt::PromptTemplate.new(template: "Tell me a {adjective} joke.", input_variables: ["adjective"])
prompt.format(adjective: "funny") # "Tell me a funny joke."

Create a prompt with multiple input variables:

prompt = Langchain::Prompt::PromptTemplate.new(template: "Tell me a {adjective} joke about {content}.", input_variables: ["adjective", "content"])
prompt.format(adjective: "funny", content: "chickens") # "Tell me a funny joke about chickens."

Creating a PromptTemplate using just a prompt and no input_variables:

prompt = Langchain::Prompt::PromptTemplate.from_template("Tell me a funny joke about chickens.")
prompt.input_variables # []
prompt.format # "Tell me a funny joke about chickens."
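To show what the {placeholder} substitution amounts to, here is a small stand-alone sketch. TinyTemplate is a hypothetical class written for this example and is not the gem's PromptTemplate; it only assumes that placeholders are word characters wrapped in curly braces, as in the templates above.

```ruby
# Minimal stand-in for a prompt template (hypothetical, not the gem's class).
class TinyTemplate
  PLACEHOLDER = /\{(\w+)\}/

  attr_reader :template

  def initialize(template)
    @template = template
  end

  # Names of all {placeholders}, in order of first appearance.
  def input_variables
    template.scan(PLACEHOLDER).flatten.uniq
  end

  # Substitutes every placeholder; raises if a value is missing.
  def format(**values)
    missing = input_variables - values.keys.map(&:to_s)
    raise ArgumentError, "missing variables: #{missing.join(", ")}" unless missing.empty?

    template.gsub(PLACEHOLDER) { values.fetch(Regexp.last_match(1).to_sym).to_s }
  end
end

tiny = TinyTemplate.new("Tell me a {adjective} joke about {content}.")
variables = tiny.input_variables
formatted = tiny.format(adjective: "funny", content: "chickens")
puts formatted
```

Deriving the variable list from the template itself is one possible design; it avoids the template and the declared variables drifting apart.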

Save prompt template to JSON file:

prompt.save(file_path: "spec/fixtures/prompt/prompt_template.json")

Loading a new prompt template using a JSON file:

prompt = Langchain::Prompt.load_from_path(file_path: "spec/fixtures/prompt/prompt_template.json")
prompt.input_variables # ["adjective", "content"]

Few Shot Prompt Templates

Create a prompt with a few shot examples:

prompt = Langchain::Prompt::FewShotPromptTemplate.new(
  prefix: "Write antonyms for the following words.",
  suffix: "Input: {adjective}\nOutput:",
  example_prompt: Langchain::Prompt::PromptTemplate.new(
    input_variables: ["input", "output"],
    template: "Input: {input}\nOutput: {output}"
  ),
  examples: [
    { "input": "happy", "output": "sad" },
    { "input": "tall", "output": "short" }
  ],
   input_variables: ["adjective"]
)

prompt.format(adjective: "good")

# Write antonyms for the following words.
#
# Input: happy
# Output: sad
#
# Input: tall
# Output: short
#
# Input: good
# Output:
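The layout of the formatted few-shot prompt above (prefix, rendered examples, then suffix, separated by blank lines) can be reproduced with a few lines of plain Ruby. This is a sketch under the assumption that the parts are joined with a blank line; fill and few_shot_prompt are hypothetical helpers for illustration, not the gem's implementation.

```ruby
# Hypothetical sketch of few-shot prompt assembly (not the gem's code).

# Replaces each {name} in the template with the matching value.
def fill(template, values)
  template.gsub(/\{(\w+)\}/) { values.fetch(Regexp.last_match(1).to_sym).to_s }
end

# Renders every example, then joins prefix, examples and suffix with blank lines.
def few_shot_prompt(prefix:, examples:, example_template:, suffix:, **inputs)
  rendered_examples = examples.map { |example| fill(example_template, example) }
  [prefix, *rendered_examples, fill(suffix, inputs)].join("\n\n")
end

prompt_text = few_shot_prompt(
  prefix: "Write antonyms for the following words.",
  examples: [
    { input: "happy", output: "sad" },
    { input: "tall", output: "short" }
  ],
  example_template: "Input: {input}\nOutput: {output}",
  suffix: "Input: {adjective}\nOutput:",
  adjective: "good"
)
puts prompt_text
```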

Save prompt template to JSON file:

prompt.save(file_path: "spec/fixtures/prompt/few_shot_prompt_template.json")

Loading a new prompt template using a JSON file:

prompt = Langchain::Prompt.load_from_path(file_path: "spec/fixtures/prompt/few_shot_prompt_template.json")
prompt.prefix # "Write antonyms for the following words."

Loading a new prompt template using a YAML file:

prompt = Langchain::Prompt.load_from_path(file_path: "spec/fixtures/prompt/prompt_template.yaml")
prompt.input_variables #=> ["adjective", "content"]

Using Output Parsers

Parse LLM text responses into structured output, such as JSON.

Structured Output Parser

You can use the StructuredOutputParser to generate a prompt that instructs the LLM to provide a JSON response adhering to a specific JSON schema:

json_schema = {
  type: "object",
  properties: {
    name: {
      type: "string",
      description: "Persons name"
    },
    age: {
      type: "number",
      description: "Persons age"
    },
    interests: {
      type: "array",
      items: {
        type: "object",
        properties: {
          interest: {
            type: "string",
            description: "A topic of interest"
          },
          levelOfInterest: {
            type: "number",
            description: "A value between 0 and 100 of how interested the person is in this interest"
          }
        },
        required: ["interest", "levelOfInterest"],
        additionalProperties: false
      },
      minItems: 1,
      maxItems: 3,
      description: "A list of the person's interests"
    }
  },
  required: ["name", "age", "interests"],
  additionalProperties: false
}
parser = Langchain::OutputParsers::StructuredOutputParser.from_json_schema(json_schema)
prompt = Langchain::Prompt::PromptTemplate.new(template: "Generate details of a fictional character.\n{format_instructions}\nCharacter description: {description}", input_variables: ["description", "format_instructions"])
prompt_text = prompt.format(description: "Korean chemistry student", format_instructions: parser.get_format_instructions)
# Generate details of a fictional character.
# You must format your output as a JSON value that adheres to a given "JSON Schema" instance.
# ...

Then parse the LLM response:

llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
llm_response = llm.chat(prompt: prompt_text)
parser.parse(llm_response)
# {
#   "name" => "Kim Ji-hyun",
#   "age" => 22,
#   "interests" => [
#     {
#       "interest" => "Organic Chemistry",
#       "levelOfInterest" => 85
#     },
#     ...
#   ]
# }
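For intuition, the parsing step can be approximated with Ruby's standard JSON library: find the JSON object in the model's reply, parse it, and check that the required keys are present. The helper below (parse_structured_reply) and the sample reply are hypothetical and only check top-level keys; the StructuredOutputParser validates against the full JSON schema.

```ruby
require "json"

# Hypothetical helper: pull the first JSON object out of an LLM reply and
# check that the required top-level keys are present.
def parse_structured_reply(text, required:)
  json_text = text[/\{.*\}/m]
  raise ArgumentError, "no JSON object found" if json_text.nil?

  data = JSON.parse(json_text)
  missing = required - data.keys
  raise ArgumentError, "missing keys: #{missing.join(", ")}" unless missing.empty?

  data
end

# Made-up reply text standing in for a real model response.
reply = <<~REPLY
  Here is the character you asked for:
  {"name": "Kim Ji-hyun", "age": 22, "interests": [{"interest": "Organic Chemistry", "levelOfInterest": 85}]}
REPLY

character = parse_structured_reply(reply, required: %w[name age interests])
puts character["name"]
```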

If the parser fails to parse the LLM response, you can use the OutputFixingParser. It sends an error message, prior output, and the original prompt text to the LLM, asking for a "fixed" response:

begin
  parser.parse(llm_response)
rescue Langchain::OutputParsers::OutputParserException => e
  fix_parser = Langchain::OutputParsers::OutputFixingParser.from_llm(
    llm: llm,
    parser: parser
  )
  fix_parser.parse(llm_response)
end

Alternatively, if you don't need to handle the OutputParserException, you can simplify the code:

# we already have the `StructuredOutputParser`:
# parser = Langchain::OutputParsers::StructuredOutputParser.from_json_schema(json_schema)
fix_parser = Langchain::OutputParsers::OutputFixingParser.from_llm(
  llm: llm,
  parser: parser
)
fix_parser.parse(llm_response)
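The fix-and-retry idea can be sketched without the gem: try to parse, and on failure send the error and the prior output back to the model and parse its corrected reply. Everything below is an assumption for illustration: parse_with_fix is a hypothetical helper, the prompt wording is invented, and the "LLM" is a stub lambda that always returns valid JSON.

```ruby
require "json"

# Hypothetical retry loop: parse the reply and, on failure, ask the model
# (any callable that takes a prompt string) for a corrected version.
def parse_with_fix(reply, llm:, max_attempts: 2)
  attempts = 0
  begin
    JSON.parse(reply)
  rescue JSON::ParserError => e
    attempts += 1
    raise if attempts > max_attempts

    # Feed the error message and the prior output back to the model.
    reply = llm.call("Fix this JSON.\nError: #{e.message}\nOutput: #{reply}")
    retry
  end
end

# Stub standing in for a real LLM call.
stub_llm = ->(_prompt) { '{"name": "Kim Ji-hyun", "age": 22}' }

# The input is missing its closing brace, so the first parse fails.
fixed = parse_with_fix('{"name": "Kim Ji-hyun", "age": 22', llm: stub_llm)
puts fixed
```

Capping the number of attempts keeps a model that never produces valid output from causing an endless loop.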

See here for a concrete example

Using Agents 🤖

Agents are semi-autonomous bots that can respond to user questions and use the Tools available to them to provide informed replies. They break problems down into a series of steps and define Actions (and Action Inputs) along the way; these are executed and the results are fed back to the Agent as additional information. Once an Agent decides that it has the Final Answer, it responds with it.
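The loop described above can be sketched in plain Ruby. This is a simplified illustration rather than the gem's agent: run_agent, the text markers it looks for, the multiplication-only calculator, and the scripted lambda that stands in for the LLM are all assumptions made for this example.

```ruby
# Hypothetical ReAct-style loop with a scripted "LLM" and one tool.
TOOLS = {
  # Toy calculator that only understands "a * b * ...".
  "calculator" => ->(input) { input.split("*").map { |n| Float(n.strip) }.reduce(:*).to_s }
}.freeze

# Asks the model for the next step, runs the requested tool, and feeds the
# observation back until the model produces a final answer.
def run_agent(llm, question, tools:, max_steps: 5)
  transcript = "Question: #{question}\n"
  max_steps.times do
    reply = llm.call(transcript)
    transcript += "#{reply}\n"

    final = reply[/Final Answer: (.*)/, 1]
    return final if final

    action = reply[/Action: (\w+)/, 1]
    action_input = reply[/Action Input: (.*)/, 1]
    observation = tools.fetch(action).call(action_input)
    transcript += "Observation: #{observation}\n"
  end
  raise "no final answer after #{max_steps} steps"
end

# Scripted replies stand in for a real model: request the calculator first,
# then answer once an observation is present in the transcript.
scripted_llm = lambda do |transcript|
  observation = transcript[/Observation: (.*)/, 1]
  if observation
    "Thought: I have the result.\nFinal Answer: #{observation}"
  else
    "Thought: I need to multiply.\nAction: calculator\nAction Input: 6 * 7"
  end
end

answer = run_agent(scripted_llm, "What is 6 times 7?", tools: TOOLS)
puts answer
```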

ReAct Agent

Add gem "ruby-openai", gem "eqn", and gem "google_search_results" to your Gemfile

search_tool = Langchain::Tool::GoogleSearch.new(api_key: ENV["SERPAPI_API_KEY"])
calculator = Langchain::Tool::Calculator.new

openai = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])

agent = Langchain::Agent::ReActAgent.new(
  llm: openai,
  tools: [search_tool, calculator]
)

agent.run(question: "How many full soccer fields would be needed to cover the distance between NYC and DC in a straight line?")
#=> "Approximately 2,945 soccer fields would be needed to cover the distance between NYC and DC in a straight line."

SQL-Query Agent

Add gem "sequel" to your Gemfile

database = Langchain::Tool::Database.new(connection_string: "postgres://user:password@localhost:5432/db_name")

agent = Langchain::Agent::SQLQueryAgent.new(llm: Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"]), db: database)

agent.run(question: "How many users have a name with length greater than 5 in the users table?")
#=> "14 users have a name with length greater than 5 in the users table."

Demo

Available Tools 🛠️

Name	Description	ENV Requirements	Gem Requirements
"calculator"	Useful for getting the result of a math expression		`gem "eqn", "~> 1.6.5"`
"database"	Useful for querying a SQL database		`gem "sequel", "~> 5.68.0"`
"ruby_code_interpreter"	Interprets Ruby expressions		`gem "safe_ruby", "~> 1.0.4"`
"google_search"	A wrapper around Google Search	`ENV["SERPAPI_API_KEY"]` (https://serpapi.com/manage-api-key)	`gem "google_search_results", "~> 2.0.0"`
"weather"	Calls Open Weather API to retrieve the current weather	`ENV["OPEN_WEATHER_API_KEY"]` (https://home.openweathermap.org/api_keys)	`gem "open-weather-ruby-client", "~> 0.3.0"`
"wikipedia"	Calls Wikipedia API to retrieve the summary		`gem "wikipedia-client", "~> 1.17.0"`

Loaders 🚚

Need to read data from various sources? Load it up.

Usage

Just call Langchain::Loader.load with the path to the file or a URL you want to load.

Langchain::Loader.load('/path/to/file.pdf')

or

Langchain::Loader.load('https://www.example.com/file.pdf')

Supported Formats

Format	Processor	Gem Requirements
docx	Langchain::Processors::Docx	`gem "docx", "~> 0.8.0"`
html	Langchain::Processors::HTML	`gem "nokogiri", "~> 1.13"`
pdf	Langchain::Processors::PDF	`gem "pdf-reader", "~> 1.4"`
text	Langchain::Processors::Text
JSON	Langchain::Processors::JSON
JSONL	Langchain::Processors::JSONL
csv	Langchain::Processors::CSV
xlsx	Langchain::Processors::Xlsx	`gem "roo", "~> 2.10.0"`
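Conceptually, a loader maps a file's format to the processor that knows how to parse it. The sketch below shows that dispatch by file extension for three of the formats that need no extra gems. PARSERS and parse_by_extension are hypothetical names for this example, and nothing here is claimed about how Langchain::Loader selects a processor internally.

```ruby
require "json"

# Hypothetical extension-to-parser table (not the gem's processors).
PARSERS = {
  ".txt" => ->(raw) { raw },
  ".json" => ->(raw) { JSON.parse(raw) },
  # JSONL: one JSON document per non-empty line.
  ".jsonl" => lambda do |raw|
    raw.each_line.reject { |line| line.strip.empty? }.map { |line| JSON.parse(line) }
  end
}.freeze

# Picks a parser from the file extension and applies it to the raw content.
def parse_by_extension(filename, raw)
  parser = PARSERS.fetch(File.extname(filename).downcase) do |ext|
    raise ArgumentError, "unsupported format: #{ext}"
  end
  parser.call(raw)
end

records = parse_by_extension("events.jsonl", "{\"id\": 1}\n{\"id\": 2}\n")
puts records.inspect
```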

Examples

Additional examples available: /examples

Evaluations (Evals)

The Evaluations module is a collection of tools that can be used to evaluate and track the performance of the output produced by LLMs and by your RAG (Retrieval Augmented Generation) pipelines.

RAGAS

Ragas helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. The implementation is based on this paper and the original Python repo. Ragas tracks the following 3 metrics and assigns each a score between 0.0 and 1.0:

Faithfulness - the answer is grounded in the given context.
Context Relevance - the retrieved context is focused, containing little to no irrelevant information.
Answer Relevance - the generated answer addresses the actual question that was provided.

# We recommend using Langchain::LLM::OpenAI as your llm for Ragas
ragas = Langchain::Evals::Ragas::Main.new(llm: llm)

# The answer that the LLM generated
# The question (or the original prompt) that was asked
# The context that was retrieved (usually from a vectorsearch database)
ragas.score(answer: "", question: "", context: "")
# =>
# {
#   ragas_score: 0.6601257446503674,
#   answer_relevance_score: 0.9573145866787608,
#   context_relevance_score: 0.6666666666666666,
#   faithfulness_score: 0.5
# }
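The sample ragas_score above is numerically consistent with the harmonic mean of the three component scores. The sketch below checks that arithmetic. It is an observation about the sample numbers only; harmonic_mean is a hypothetical helper and this is not a statement about how the gem computes the score internally.

```ruby
# Hypothetical helper: harmonic mean of a list of positive numbers.
def harmonic_mean(values)
  values.size / values.sum { |value| 1.0 / value }
end

# Component scores copied from the sample output above.
scores = {
  answer_relevance_score: 0.9573145866787608,
  context_relevance_score: 0.6666666666666666,
  faithfulness_score: 0.5
}

combined = harmonic_mean(scores.values)
puts combined.round(4)
```

A harmonic mean is pulled toward the lowest component, so a weak score on one metric cannot be hidden by strong scores on the others.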

Logging

Langchain.rb uses standard logging mechanisms and defaults to the :warn level. Most messages are logged at the :info level, but we will add debug or warn statements as needed. To show all log messages:

Langchain.logger.level = :info
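The effect of the level setting can be seen with Ruby's standard Logger on its own. This sketch assumes nothing about Langchain.rb beyond the level names mentioned above; the logger and buffer here are local to the example.

```ruby
require "logger"
require "stringio"

# Capture log output in memory so it can be inspected.
buffer = StringIO.new
logger = Logger.new(buffer)

# At :warn level, info messages are dropped.
logger.level = :warn
logger.info("hidden at warn level")
logger.warn("shown at warn level")

# Lowering the level to :info lets info messages through.
logger.level = :info
logger.info("shown at info level")

log_output = buffer.string
puts log_output
```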

Development

1. git clone https://github.com/andreibondarev/langchainrb.git
2. cp .env.example .env, then fill out the environment variables in .env
3. bundle exec rake to ensure that the tests pass and to run standardrb
4. bin/console to load the gem in a REPL session. Feel free to add your own instances of LLMs, Tools, Agents, etc. and experiment with them.
5. Optionally, install lefthook git hooks for pre-commit to auto lint: gem install lefthook && lefthook install -f

Discord

Join us in the Langchain.rb Discord server.

Core Contributors

Contributors

Star History

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/andreibondarev/langchainrb.

License

The gem is available as open source under the terms of the MIT License.
