Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Bubble Sort example #158

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

phklive
Copy link

@phklive phklive commented Dec 8, 2023

Proposing an implementation of the Bubble Sort algorithm in Miden Assembly. Sorting an array that is provided through private inputs using the advice_stack.

As examples will be mostly read by newcomers, I have tried to create thorough documentation of the code to aid them in understanding the implementation and principles of Miden Assembly, enabling easier onboarding.

Proposed Improvements:

  • Robust Input Handling: making the program more resilient to variations in inputs.
  • Efficiency Optimisation: The current implementation has a relatively high clock (clk) count.

I welcome any feedback or suggestions

Copy link

sonarqubecloud bot commented Dec 8, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

Copy link
Contributor

@hackaugusto hackaugusto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I left lots of inline comments. Please go over them, I'm sharing optimization techniques, coding guidelines, code organization tips, that we expect to have in production code.

For the first masm example, I think this is great :)

examples/bubble_sort.masm Outdated Show resolved Hide resolved
Comment on lines 87 to 91
# Load and increment `g` counter to keep track of added elements
mem_load.3
add.1
dup
mem_store.3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: It is good practice to always include the cycle count. The best way to find the exact cycle count of an instruction is to look at the assembler:

Each instruction is translated to a series of operations. Each operation takes exactly one cycle. Using the above references we have:

  • mem_load.3: 2 cycles Push(3) MLoad
  • add.1: 1 cycle Incr
  • dup: 1 cycle Dup0
  • mem_store.3: 2 cycles MStore Drop

Once you know the cycle count, it can be documented inline:

Suggested change
# Load and increment `g` counter to keep track of added elements
mem_load.3
add.1
dup
mem_store.3
# Load and increment `g` counter to keep track of added elements (6 cycles)
mem_load.3
add.1
dup
mem_store.3

And that is useful to compute the cycle count for the procedure.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be impossible to give the cycle count if there is a loop?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not impossible, but it is probably harder, use asymptotic approximations, if it is too hard than not having it should be fine.

examples/bubble_sort.masm Outdated Show resolved Hide resolved
examples/bubble_sort.masm Outdated Show resolved Hide resolved
Comment on lines 93 to 105
# Load `adv` array length and check if `g` == `adv`;
# signifying that all elements from array have been added to `operand_stack`
mem_load.4
eq
if.true
# Stop looping and reset `g` to 0
push.0
push.0
mem_store.3
else
# Continue looping
push.1
end
Copy link
Contributor

@hackaugusto hackaugusto Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a good idea to optimize loops, since these are executed often.

Here we can do a few things:

  • Instead of starting at 0 and counting up to n. Start at -n and count up to zero.
    • The comparison is against the constant 0, instead of two variables. So this removes the load for length.
    • Going from -n to 0 instead of 0 to n also performs better because it uses +1 ref instead of -1 ref. This performs better because +1 has a special operation Incr that takes a single cycle, whereas -1 is two operations Push(-imm) Add which takes 2 cycles (and the immediate value takes another 64bits to encode)
  • Move cleanup out of the loop. In this case the push.0 mem_store.3. This eliminates the if-else from the loop. Here this can be done by using eq not

push.1
while.true
# Push 1 element from the array into the `operand_stack`
adv_push.1
Copy link
Contributor

@hackaugusto hackaugusto Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: For production code almost always the data should always be validated, here we only know an array of length n was sorted, but we know nothing about the contents. To validate the contents the input should be a hash of the array instead of its length (this would not leak information because we use cryptographic secure hashes).

To implement this, the stack input would be a commitment instead of a counter. The length would go in the advice_stack (or the advice_map which has native support for length). The loop would use adv_pipe instead of adv_push. The adv_pipe instruction does a few things

  1. reads the data from the advice stack
  2. hashes the data read
  3. saves it to memory

Here is an example of that: https://github.com/0xPolygonMiden/examples/blob/main/examples/matrix_multiplication.masm#L108-L146

examples/bubble_sort.masm Outdated Show resolved Hide resolved
examples/bubble_sort.masm Outdated Show resolved Hide resolved
Comment on lines +165 to +193
# incrementCounters() -> void
#
# Increments the `i` and `j` counter variables by 1
proc.incrementCounters
# load `i` counter, increment it by 1 and store it in mem[1]
mem_load.1
add.1
mem_store.1

# load `j` counter, increment it by 1 and store it in mem[2]
mem_load.2
add.1
mem_store.2
end

# decrementCounters() -> void
#
# Decrements the `i` and `j` counter variables by 1
proc.decrementCounters
# load `i` counter, decrement it by 1 and store it in mem[1]
mem_load.1
sub.1
mem_store.1

# load `j` counter, decrement by 1 and store it in mem[2]
mem_load.2
sub.1
mem_store.2
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: For production code, the j would have to be removed. Since it is the same as i + 1. All the instructions manipulating it are extra cycle cost that can be removed.

Comment on lines +351 to +364
# Initialise the VM
exec.initialise

# Loop over the array and sort repeatedly
while.true
# If looping has been done over the whole array reset counters to start over from the beginning
exec.resetCounters

# Sort elements two-by-two repeatedly
exec.sort

# Check if the array is sorted if so then stop looping
exec.sorted
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: for production code we also want to have a nice interface. This would be better wrapper around a single sort procedure that people can use. Things like initialise, resetCounters, sorted should be treated as implementation details.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the code would be both more efficient and more readable if it used the stack more often. Here is one suggestion:

#! Sort data
#!
#! Input:
#!   operand_stack: [addr, C, ...]
#!   advice_stack: [n, d0, d1, .., dn, ..]
#! Output: [...]
#!
#! Where:
#!   addr: address were the data will be loaded and sorted in-place
#!   C: a commitment to the data to be sorted
#!   n: the number of elements to be sorted
#!   d0..dn: the data to be sorted
#!
#! Cycles: X
proc.sort
  # save a copy of addr
  dup movdn.6
  # => [C, B, A, addr, C, addr ..]

  # initialize the hasher state
  # TODO: handle padding / input array with odd number of elements
  padw padw padw
  # => [C, B, A, addr, C, addr, ..]

  # load length
  adv_push.1
  # => [len, C, B, A, addr, C, addr, ..]

  # load data 
  # optimization: negate the counter, we are going to loop from `[-n,0)`
  neg
  dup neq.0
  while.true
    movdn.13 # save len (TODO: movup?)
    adv_pipe
    movup.13 add.2 # update len
    dup neq.0
  end

  # clear counter and array end 
  drop movup.13 drop

  # TODO: verify commitment

  # sort
  # 1. copy the addr
  # 2. have a boolean to check if any swaps happened
  # 3. stop when the boolean is false
end

@Dominik1999
Copy link
Contributor

Should we merge this? @phklive

@phklive
Copy link
Author

phklive commented Jan 2, 2024

Should we merge this? @phklive

Depends on what we want in the examples repo.

If we want to have different styles and programs then we could merge now. If we want to show idiomatic and optimised examples that user could take as boilerplate then I could work to find time this week to apply the modifications requested and then merge.

What do you think?

@Dominik1999
Copy link
Contributor

If we want to have different styles and programs then we could merge now. If we want to show idiomatic and optimised examples that user could take as boilerplate then I could work to find time this week to apply the modifications requested and then merge.

Yes, this solution.

@phklive
Copy link
Author

phklive commented Jan 4, 2024

If we want to have different styles and programs then we could merge now. If we want to show idiomatic and optimised examples that user could take as boilerplate then I could work to find time this week to apply the modifications requested and then merge.

Yes, this solution.

I didn't get it sorry. The two solutions were:

  1. If we want to have different styles and programs then we could merge now.
  2. If we want to show idiomatic and optimised examples that user could take as boilerplate then I could work to find time this week to apply the modifications requested and then merge.

What is the one you would like to follow?

@Dominik1999
Copy link
Contributor

If we want to have different styles and programs then we could merge now. If we want to show idiomatic and optimised examples that user could take as boilerplate then I could work to find time this week to apply the modifications requested and then merge.

Yes, this solution.

I didn't get it sorry. The two solutions were:

  1. If we want to have different styles and programs then we could merge now.
  2. If we want to show idiomatic and optimised examples that user could take as boilerplate then I could work to find time this week to apply the modifications requested and then merge.

What is the one you would like to follow?

Ah sorry for being unclear.

Can you work in the suggestions from Augusto? And then we can merge. That way, we have optimized code and maybe we can even use Bubble Sort in the miden-lib eventually.

@phklive
Copy link
Author

phklive commented Jan 4, 2024

If we want to have different styles and programs then we could merge now. If we want to show idiomatic and optimised examples that user could take as boilerplate then I could work to find time this week to apply the modifications requested and then merge.

Yes, this solution.

I didn't get it sorry. The two solutions were:

  1. If we want to have different styles and programs then we could merge now.
  2. If we want to show idiomatic and optimised examples that user could take as boilerplate then I could work to find time this week to apply the modifications requested and then merge.

What is the one you would like to follow?

Ah sorry for being unclear.

Can you work in the suggestions from Augusto? And then we can merge. That way, we have optimized code and maybe we can even use Bubble Sort in the miden-lib eventually.

Sure, starting now.

@Dominik1999
Copy link
Contributor

@phklive should we try to close this soon?

@phklive
Copy link
Author

phklive commented Jan 15, 2024

@phklive should we try to close this soon?

On it now.

Copy link

Quality Gate Passed Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@Dominik1999
Copy link
Contributor

Can we merge this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants