Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What kind of performance is to be expected? #22

Open
zachdaniel opened this issue Mar 6, 2017 · 7 comments
Open

What kind of performance is to be expected? #22

zachdaniel opened this issue Mar 6, 2017 · 7 comments

Comments

@zachdaniel
Copy link

Right now we have a 2.5mb spreadsheet being generated, and the call to write_to_memory/2 takes about 25 seconds. Is this more than normal, or should we be investigating a cause?

Regardless it would be a very high priority for us to make this faster.

@zachdaniel
Copy link
Author

I checked, and its not the call to :zip.create/3 causing the slowness its the call to Elixlsx.Writer.create_files/2

@xou
Copy link
Owner

xou commented Mar 13, 2017

Thanks for doing the benchmarking! I have not written elixlsx with performance in mind, but I agree that 25secs for ~2.5MB seems a bit slow.

Could you share any specifics on what makes the spreadsheet so large? Is it a lot of sheets, large sheets (and in which dimension)?

Could you share a code snippet that reproduces this behaviour?

Thanks!

@ryanhart2
Copy link
Contributor

Elixlsx currently relies on a lot of string interpolation and string concatenation. I believe that replacing this with iolists will increase performance significantly. I updated the make_sheet/2 to use iolists and it ran 16% faster (as measured with Benchee), even though all the other functions that feed into make_sheet/2 were unchanged. The speed should increase with each function that's changed. The update was simply to output a list of the various strings, instead of concatenating them.

This is a good blog post I found on the topic:
https://www.bignerdranch.com/blog/elixir-and-io-lists-part-1-building-output-efficiently/

@andreapavoni
Copy link

@ryanhart2 any chances to get a PR with this improvement?

@roehst
Copy link
Contributor

roehst commented Mar 12, 2024

I am having an issue with a spreadsheet that is only 8MB total, but consumes a whooping 2GB RAM while writing it.

If the IO List is a solution, I would be more than happy to finance a freelancer to do so.

@zachdaniel @ryanhart2

@roehst
Copy link
Contributor

roehst commented Mar 12, 2024

I am investigating the cost (in # of reductions) for very simple spreadsheets with N rows.

The number of reductions/row seems to go up with the number of rows - which looks like quadratic to me. This makes large spreadsheets take much longer.

Is this relatable to the use of IO lists?

My benchmark is naive, so I welcome feedback.

defmodule Example do
  alias Elixlsx.Workbook
  alias Elixlsx.Sheet

  def generate_data(n) do
    for _ <- 1..n do
      %{
        user_id: 200,
        shares: 200,
        strike_price: 200
      }
    end
  end

  def run(n) do
    task =
      Task.async(fn ->
        data = generate_data(n)

        reductions_1 = :erlang.process_info(self)[:reductions]

        workbook =
          %Workbook{}
          |> append_data_sheet(data)

        _ = Elixlsx.write_to_memory(workbook, "foo.xlsx")

        reductions = :erlang.process_info(self)[:reductions] -reductions_1

        reductions
      end)

    reductions = Task.await(task)

    # reds / n
    IO.puts("Reductions per row: #{reductions / n}")
  end

  def append_data_sheet(workbook, data) do
    sheet = %Sheet{name: "Data"}

    sheet =
      Enum.reduce(Enum.with_index(data), sheet, fn {row, index}, sheet ->
        sheet =
          sheet
          |> Sheet.set_at(index + 1, 1, row.user_id)
          |> Sheet.set_at(index + 1, 2, row.shares)
          |> Sheet.set_at(index + 1, 3, row.strike_price)

        Enum.reduce(1..10, sheet, fn j, sheet ->
          Sheet.set_at(sheet, index + 1, 3 + j, 5)
        end)
      end)

    Workbook.append_sheet(workbook, sheet)
  end
end

@user20230119
Copy link

Did anyone compared Elixlsx with Exceed? https://github.com/synchronal/exceed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants