This repository was archived by the owner on Dec 2, 2024. It is now read-only.

Create README_marconi.md #513

Merged
merged 4 commits into from
Jun 23, 2022

Conversation

joseph-fajen
Contributor

This is an initial draft for the Marconi README file. Currently this is a placeholder and a place to begin organizing the documentation.

Marconi

The Cardano blockchain indexer for dApp developers

A lightweight solution for indexing and querying the Cardano blockchain

Built by IOG in Haskell

  • dApp developers can index whatever is important to them.
  • Marconi is an indexing solution for developers who need to store on-chain data in a local database.

Introduction

What is the vision for what Marconi will be? What role is it intended to serve in the broader community? What is the full narrative that goes with it? Its true purpose?

This software component will be a Haskell library used by dApp providers. The component will be designed to allow dApp providers to index in the desired structure the information read from the Cardano blockchain which will be used for the dApp.

Documentation

When it exists, we can link to our readthedocs user documentation.

  • User Guide
  • Reference Guide
  • Tutorial
  • Example Code

Intended Use Cases

Description of its primary intended use cases.

  • Sync with the Cardano blockchain (private/public testnet or mainnet) by reading all blocks from the genesis block to the current tip
  • Index the syncing information based on the user’s predefined indexing
  • Query the indexed information based on the user’s predefined queries

NOTES

  • Multiple indexers
  • UTXO by address
  • We have an indexer that queries local DB
  • Return my UTXO by address (my Cardano address)

Use Case 1

Use Case 2

Use Case 3

NOTES

A user can specify which ones they want to use for their specific applications.
Cardano DB Sync is IOG's current indexing solution, but it uses a lot of memory and takes days to a week to sync. Marconi will be a scalable solution. We want to index whatever is important for the dApp developer. Like Scrolls, we can be selective about what is indexed. Scrolls can store into multiple databases. We might do that. It is currently focused on local DBs like SQLite.

Example Queries

Can we provide example queries?

Differentiators

Compare to existing tools.

  • How does this tool compare to:
    • Cardano DB Sync
    • Scrolls
    • Oura
    • Cardano Chain Index?
    • Plutus Chain Index?
  • What makes this tool different from others?
  • Mention of any known trade-offs that went into the design.
  • Scrolls uses Rust (developed by 3rd party).
  • Positives, negatives. The known trade-offs.

Developed by IOG

Built in Haskell

Functional Description

How does it work?

Architecture

  • Provide a compelling diagram

Architecture Decision Records

  • Records of decisions that were made by the team.
  • Discuss alternatives.
  • Why did we choose one design over another?
    • Konstantinos has some for Plutus Apps.
    • Radu has some for Marconi.

System Requirements

How to Install and Configure

Installation Procedures

Configuration Procedures

How to Contribute

How do you contribute to it? What tools, methods, processes are required?

  • Contributing documentation.
  • Most of the info in Plutus-Apps and Plutus should also be included in Marconi.

Making Builds

How do you build it?

Build Procedures

Storage Considerations and Accessing Data

Troubleshooting

FAQ

Further Reading

Pre-submit checklist:

  • Branch
    • Tests are provided (if possible)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
    • Formatting, PNG optimization, etc. are updated
  • PR
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested

This is an initial draft for the Marconi README file that will get moved to a new repo soon.
@joseph-fajen
Contributor Author

This is an initial placeholder draft of the README to kick things off. Please share any thoughts and ideas and we'll build on this.

@michaelpj
Contributor

Did you mean to put the content in the actual README file instead of the PR description?

@michaelpj
Contributor

That would make it easier to review!

@michaelpj
Contributor

  • PR checklist should go in a PR template like in plutus-apps
  • Contributing documentation should go in CONTRIBUTING and be linked from the README
  • Are we going to have a RTD site for documentation? Probably no point talking about it until there is one!

@joseph-fajen
Contributor Author

We are planning to have a RTD site for Marconi documentation. I think Lorenzo will be creating the repo very soon with that in mind. I'm excited for the opportunity to start a new RTD from the start of the project.

@andreabedini
Contributor

Can I add my two cents and say that I don't like the check box "self-reviewed the diff". It doesn't (IMHO) correspond to anything real and therefore it doesn't add any value.

Also, might I suggest changing "Tests are provided (if possible)" to "Tests are provided or justification is presented"? Of course a justification doesn't need to be anything formal, but at least 1) it captures a decision and 2) it starts a conversation.

@michaelpj
Contributor

> Can I add my two cents and say that I don't like the check box "self-reviewed the diff". It doesn't (IMHO) correspond to anything real and therefore it doesn't add any value.

The idea was for this to be a nudge to say: did you actually look at the diff before hitting submit? I frequently catch things by doing this and notice things in other people's PRs that would have been caught by doing this (classic example: accidentally committing extra files).

That said... I'm not sure how useful the checklist is. It's only useful if you actually decide to follow it and tick the boxes, which I usually do but I'm not sure that many other people do.

@andreabedini
Contributor

> The idea was for this to be a nudge to say: did you actually look at the diff before hitting submit?

IMHO the answer is often yes but this doesn't prevent mistakes :-)

This draft incorporates new input from Andrew. Also inviting @andreabedini and @raduom to please review this draft.
raduom
raduom previously approved these changes Jun 16, 2022
Contributor

@raduom raduom left a comment

It's a nice start (I especially love the questions).

It misses the notifications, streaming, indexer distribution part of marconi, but that is something that we did not yet discuss. I suspect it can be added later.

Good job.

@raduom raduom dismissed their stale review June 16, 2022 04:37

I reviewed the wrong thing. :)

@raduom raduom self-requested a review June 16, 2022 04:37
Contributor

@raduom raduom left a comment

I raised some points, but I don't think they are blockers for merging.


If your use cases require multiple synchronized instances for load balancing or for supporting different indexers, Marconi is designed with alternative transport layers on top of the core API that support network streaming and RPC calls.

## What Differentiates Marconi
Contributor

This is not really an issue about this README document, but as far as indexers go, they are more of a specification + reference implementation in Haskell. I don't think it is wrong to think (at least wrt indexers) that they are primarily a specification.
It's not that difficult to write an indexer in Rust, for example, and use the Haskell FFI to connect it to our property tests to verify they work according to specification (or have a generic REST-based testing interface).
I am not sure this is worth mentioning.

Contributor Author

I'm not sure I fully understand your comment, but it sounds like a relatively minor or subtle point that you feel isn't necessarily vital. I'll leave the text unchanged for the time being. I'd like to chat with you about it if it's important to nail this point down.

This draft incorporates comments from @raduom and @andreabedini
Comment on lines +88 to +90
* The query type. Because Marconi uses an Abstract Data Type (ADT), you need to define the types of queries that the indexer responds to.
* Query for certain slot numbers by determining the point in the in-memory history where you want to run the query.
* Weigh considerations for in-memory and on-disk data. The query function produces a result by merging the in-memory and on-disk data. If you are concerned only with on-disk data, then you can query the database directly.
Contributor

Suggested change
* The query type. Because Marconi uses an Abstract Data Type (ADT), you need to define the types of queries that the indexer responds to.
* Query for certain slot numbers by determining the point in the in-memory history where you want to run the query.
* Weigh considerations for in-memory and on-disk data. The query function produces a result by merging the in-memory and on-disk data. If you are concerned only with on-disk data, then you can query the database directly.
Because we need to support querying both on disk and in-memory data in a unified way, you can customise both the query filter (by specifying an ADT for the `q` type parameter) and the result (by specifying an ADT for the `r` parameter).
For example, if you want to query the UTXOs at address A, you will define a query type that includes a field for the address A.

ADT here stands for Algebraic Data Type; these are the 'usual' Haskell data types. In Java and other languages, ADT means Abstract Data Type. So saying that Marconi uses ADTs is an odd formulation, since everything written in Haskell uses them. I think describing how Marconi uses them makes everything clearer.
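
To make the suggestion above concrete, here is a minimal sketch of a query ADT and a result ADT for the "UTXOs at address A" example. Every name in it (`Address`, `Utxo`, `UtxoQuery`, `UtxoResult`, `queryUtxos`) is hypothetical and chosen for illustration; none of them is taken from Marconi, and plain lists stand in for both the on-disk and the in-memory data.

```haskell
-- Hypothetical types for illustration; none of these names come from Marconi.
newtype Address = Address String
  deriving (Eq, Show)

data Utxo = Utxo
  { utxoAddress :: Address
  , utxoValue   :: Integer
  } deriving (Eq, Show)

-- The query filter: an ADT that would be used for the `q` type parameter.
-- It carries the address we are interested in.
newtype UtxoQuery = UtxosAt Address

-- The result: an ADT that would be used for the `r` type parameter.
newtype UtxoResult = UtxoResult [Utxo]
  deriving (Eq, Show)

-- Answer a query in a unified way by looking at the on-disk rows first
-- and then at the in-memory events that have not been flushed yet.
queryUtxos :: [Utxo] -> [Utxo] -> UtxoQuery -> UtxoResult
queryUtxos onDisk inMemory (UtxosAt addr) =
  UtxoResult (filter atAddr (onDisk ++ inMemory))
  where
    atAddr u = utxoAddress u == addr
```

A README reader would then see that adding a new kind of query means adding a constructor (or a new type) for `q` and a matching shape for `r`, rather than "using ADTs" in the abstract.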


The streaming component is customized with a filter function that translates blocks into user-defined types that are wrapped in the streaming event type.
2. Customize how a function performs queries by determining the following aspects of the function:
Contributor

Suggested change
2. Customize how a function performs queries by determining the following aspects of the function:
2. Define how an indexer performs queries by providing a query function.

The indexer performs the queries and is parameterised by a function that defines: the query type, the result type and the implementation.
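
A rough sketch of what "parameterised by a function" could look like, under the assumption that the indexer is a plain record holding buffered events and a user-supplied query function. The record `Indexer`, its fields, and the helpers `insertEvent`, `runQuery`, and `slotCounter` are all hypothetical names, not Marconi's API.

```haskell
-- Hypothetical sketch; the record and field names are not Marconi's API.
-- An indexer over events `e`, parameterised by the query type `q`
-- and the result type `r`.
data Indexer e q r = Indexer
  { buffered  :: [e]            -- events currently held in memory
  , queryWith :: q -> [e] -> r  -- user-supplied query implementation
  }

-- Record a new event in the in-memory buffer.
insertEvent :: e -> Indexer e q r -> Indexer e q r
insertEvent ev ix = ix { buffered = ev : buffered ix }

-- Run a query by delegating to the user-supplied function.
runQuery :: Indexer e q r -> q -> r
runQuery ix q = queryWith ix q (buffered ix)

-- Example instantiation: events are slot numbers, the query is a lower
-- bound, and the result counts the buffered slots at or above that bound.
slotCounter :: Indexer Int Int Int
slotCounter = Indexer
  { buffered  = []
  , queryWith = \lowerBound evs -> length (filter (>= lowerBound) evs)
  }
```

The point of the sketch is that the query type, the result type, and the implementation are all chosen by whoever constructs the record, which matches the wording proposed in the suggestion.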


Assumption: Marconi has a typeclass for the necessary functions.
1. *This statement is a placeholder:* A function that is given events and current state outputs notifications.
Contributor

Suggested change
1. *This statement is a placeholder:* A function that is given events and current state outputs notifications.
1. *This statement is a placeholder:* A function that is given events and current state and outputs notifications.
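
Since the statement is still a placeholder, the following is only a guess at the shape such a function might take: a pure function from the current state and a batch of events to a list of notifications. The types `Event`, `State`, and `Notification` and the function `notify` are invented for this illustration.

```haskell
-- Hypothetical types for illustration only; not taken from Marconi.
data Event = Deposit String Integer  -- address and amount
  deriving (Eq, Show)

newtype Notification = LargeDeposit String
  deriving (Eq, Show)

-- The "current state" here is just a threshold above which deposits
-- are worth reporting.
newtype State = State { threshold :: Integer }

-- Given the current state and a batch of events, output notifications.
notify :: State -> [Event] -> [Notification]
notify st evs =
  [ LargeDeposit addr | Deposit addr amount <- evs, amount >= threshold st ]
```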


The indexing component is customized with a corresponding function that takes the user-defined type and translates them into database types.
3. Customize a function that stores buffered data. While there is no connection to any storage mechanism, there is a nice API is available, and you are encouraged to change it to suit your needs. Changing it is very simple.
Contributor

Suggested change
3. Customize a function that stores buffered data. While there is no connection to any storage mechanism, there is a nice API is available, and you are encouraged to change it to suit your needs. Changing it is very simple.
3. Define a function that stores buffered data. We currently use the SQLite database backend for storage, but you can write your own storage function for your own storage type and events.

So these functions are as raw as can be. You can customise the indexer using them, but I think it's more accurate to say that you can write your indexer using them. They are very similar to abstract methods in Java, which you need to define for your class/object to make sense.
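
As an illustration of the "abstract method" analogy, here is a minimal sketch in which the storage step is a function the user supplies. It assumes a record-of-functions design; the names `Store`, `storeBuffered`, `inMemoryStore`, and `flush` are hypothetical, and an `IORef` stands in for the SQLite backend so that the example needs nothing beyond `base`.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical sketch: a storage backend is just a function that persists
-- a batch of buffered events. Marconi's real signature may differ.
newtype Store e = Store { storeBuffered :: [e] -> IO () }

-- A stand-in backend that appends to an IORef instead of writing to SQLite.
inMemoryStore :: IORef [e] -> Store e
inMemoryStore ref = Store (\evs -> modifyIORef' ref (++ evs))

-- Flush the buffer through whichever backend was supplied and return
-- the new, empty buffer.
flush :: Store e -> [e] -> IO [e]
flush backend buf = storeBuffered backend buf >> pure []
```

Swapping `inMemoryStore` for a function that writes rows to a database would leave `flush` unchanged, which is the sense in which you write your indexer with these functions rather than customise an existing one.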

@joseph-fajen joseph-fajen merged commit 5c46471 into main Jun 23, 2022
@joseph-fajen joseph-fajen deleted the joseph-fajen-patch-1 branch June 23, 2022 15:15