Tables.jl interface and iterators #174

Closed
trulsf opened this issue Aug 3, 2022 · 9 comments

Comments

@trulsf
Contributor

trulsf commented Aug 3, 2022

I have implemented the Tables.jl interface for some solution structures from JuMP. This seems to work well for most applications (e.g. CSV, databases etc.), but we run into problems with PrettyTables.jl.

The problem seems to be that PrettyTables assumes the iterator used for row access takes the row number as its state. As far as I understand the Tables.jl interface, there is no requirement that the iterator be implemented with the row number as the state. In my case the iterator state is, for example, a CartesianIndex into an array.

The code in question is in tables.jl (lines 79 and 100):

it, ~ = iterate(rtable.table, i)

Is this a correct observation? Any possibilities of having PrettyTables working with more general iterators?
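
To make the observation concrete, here is a minimal sketch of a row iterator whose state is a CartesianIndex. GridRows and _step are hypothetical names made up for illustration; this is not the JuMP-related code from the report. Ordinary iteration works, while passing a row number as the state does not.

```julia
# Hypothetical row source: the iteration state is a CartesianIndex, not a row number.
struct GridRows
    data::Matrix{Float64}
end

Base.length(g::GridRows) = length(g.data)

# First call: start at the first Cartesian index of the matrix.
Base.iterate(g::GridRows) = _step(g, iterate(CartesianIndices(g.data)))

# Later calls: the state is the CartesianIndex of the previous element.
Base.iterate(g::GridRows, state::CartesianIndex{2}) =
    _step(g, iterate(CartesianIndices(g.data), state))

# Turn the (index, state) pair from CartesianIndices into a (row, state) pair.
function _step(g::GridRows, next)
    next === nothing && return nothing
    idx, state = next
    row = (i = idx[1], j = idx[2], value = g.data[idx])
    return row, state
end

g = GridRows([1.0 2.0; 3.0 4.0])

# Works: `collect` threads the opaque state through `iterate` for us.
rows = collect(g)

# Fails: 2 is not a valid state for this iterator, so no method matches.
# iterate(g, 2)   # throws a MethodError
```

Any caller that only uses the value returned by the previous iterate call as the next state is unaffected by how the state is represented.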

@ronisbr
Owner

ronisbr commented Aug 3, 2022

Hi @trulsf !

I may be wrong, but check what the Tables.jl documentation says about row access:

Tables.rows(table)    Return an Tables.AbstractRow-compatible iterator from your table

Hence, in tables.jl, rtable.table must be an iterator compatible with Tables.AbstractRow. Thus, I think it, ~ = iterate(rtable.table, i) must return the ith row. After that, the following code should be able to obtain the jth element in this row:

Tables.getcolumn(row, j)

If you replace the line:

element = it[column_name]

by

element = Tables.getcolumn(it, j)

Does it work?

@trulsf
Contributor Author

trulsf commented Aug 4, 2022

As far as I understand it there is no requirement on an iterator compatible with Tables.AbstractRow to have iterate(iter, state) using the row number as state. In the general documentation on iteration interfaces it says that

The state object will be passed back to the iterate function on the next iteration and is generally considered an implementation detail private to the iterable object.
https://docs.julialang.org/en/v1/manual/interfaces/

@ronisbr
Owner

ronisbr commented Aug 4, 2022

So, how can we take the element in the ith row and jth column to print the table?

@trulsf
Contributor Author

trulsf commented Aug 4, 2022

I am no expert on these matters, so it may be worthwhile to check with the Tables.jl people what the proper way of getting a specific row is. In principle, I do not think that Tables.jl allows for indexing. I guess one possibility is to use collect on the iterator and store the rows in a vector. That may be troublesome for large tables where you are not printing all rows, but using take on the iterator and on its reverse could be an option.
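
A rough sketch of the two options mentioned here, using only Base functions. The helper names (materialize_rows, first_rows, last_rows) are hypothetical and are not part of PrettyTables.jl or Tables.jl. Reversing is an extra assumption: it only works for iterators that support Iterators.reverse, which many custom iterators will not.

```julia
# Option 1: materialize every row once, then index into the vector freely.
materialize_rows(rows) = collect(rows)

# Option 2: walk only the first `n` rows, leaving the rest of a large table untouched.
first_rows(rows, n::Integer) = collect(Iterators.take(rows, n))

# The last `n` rows: take from the reversed iterator, then restore the original order.
# Assumes `rows` supports `Iterators.reverse`.
last_rows(rows, n::Integer) =
    reverse!(collect(Iterators.take(Iterators.reverse(rows), n)))

# Example input: a lazy generator of NamedTuple rows.
rows = ((id = k, sq = k^2) for k in 1:5)

head = first_rows(rows, 2)
tail = last_rows(rows, 2)
```

The first option costs memory proportional to the whole table. The second keeps only the rows that would actually be printed.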

@ronisbr ronisbr closed this as completed in 16c194d Aug 6, 2022
@ronisbr
Owner

ronisbr commented Aug 6, 2022

Hi @trulsf !

I think I discovered the problem. Your data assumes that the initial state is 0 instead of 1. Hence, PrettyTables.jl is trying to obtain an element that does not exist. I agree that, in this case, I cannot assume the initial state is 1. I modified this part to obtain the i-th row by iterating i times.
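
For readers following along, obtaining the i-th row by iterating i times could look roughly like the sketch below. This is not the code from the commit; nth_row and ZeroBased are hypothetical names, and the ZeroBased type only mimics an iterator whose initial state is 0.

```julia
# Hypothetical helper: return the i-th element of any iterator without assuming
# anything about its state. It makes up to `i` calls to `iterate`.
function nth_row(rows, i::Integer)
    i >= 1 || throw(ArgumentError("row index must be positive, got $i"))
    next = iterate(rows)
    for _ in 2:i
        next === nothing && break
        # Feed back exactly the state the iterator handed us.
        next = iterate(rows, next[2])
    end
    next === nothing && throw(BoundsError(rows, i))
    return first(next)
end

# Hypothetical iterator whose state starts at 0 rather than 1.
struct ZeroBased
    values::Vector{String}
end

Base.iterate(z::ZeroBased, state::Int = 0) =
    state < length(z.values) ? (z.values[state + 1], state + 1) : nothing

z = ZeroBased(["a", "b", "c"])

second = nth_row(z, 2)            # the second row, regardless of the state convention
shifted = first(iterate(z, 2))    # using the row number as state yields the third row
```

With this kind of iterator, iterate(z, 3) returns nothing even though a third row exists, which matches the symptom of trying to obtain an element that does not exist.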

Can you please test against master?

@hellemo

hellemo commented Aug 9, 2022

Works for me on master, thanks @ronisbr

@ronisbr
Owner

ronisbr commented Aug 9, 2022

Perfect! I should release a new version after I finish the PR to use HTML backend in DataFrames.jl.

@bkamins

bkamins commented Sep 7, 2023

As far as I understand the Tables.jl interface there is no requirement that the iterator must be implemented with the row number as the state.

The state can be anything, but a Tables.jl table should support the Tables.subset function, where, as one of the cases, you can pass a row number (i.e. a 1-based index) to get a given row.

help?> Tables.subset
  Tables.subset(x, inds; viewhint=nothing)

  Return one or more rows from table x according to the position(s) specified by inds:

    •  If inds is a single non-boolean integer return a row object.

    •  If inds is a vector of non-boolean integers, a vector of booleans, or a :, return a subset of the original table according to the indices. In this case, the returned type is not necessarily the
       same as the original table type.

  If other types of inds are passed than specified above the behavior is undefined.

  The viewhint argument tries to influence whether the returned object is a view of the original table or an independent copy:

    •  If viewhint=nothing (the default) then the implementation for a specific table type is free to decide whether to return a copy or a view.

    •  If viewhint=true then a view is returned and if viewhint=false a copy is returned. This applies both to returning a row or a table.

  Any specialized implementation of subset must support the viewhint=nothing argument. Support for viewhint=true or viewhint=false is optional (i.e. implementations may ignore the keyword argument and return
  a view or a copy regardless of viewhint value).

@ronisbr
Owner

ronisbr commented Sep 7, 2023

Ah! I see, so the object should implement Tables.subset and, if it does not, we fall back to the slow algorithm. Makes sense!
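
A small sketch of the single-row case from the help text above, assuming a Tables.jl version that provides Tables.subset. The table here is a plain NamedTuple of vectors, which Tables.jl treats as a column table; the column names are made up for illustration.

```julia
using Tables  # external package; assumed to be recent enough to provide Tables.subset

# A NamedTuple of vectors is a valid column table in Tables.jl.
table = (name = ["x", "y", "z"], value = [1.0, 2.5, 4.0])

# A single integer index asks for one row object (1-based position).
row = Tables.subset(table, 2)

# Read the row's fields through the generic Tables.jl accessor.
name = Tables.getcolumn(row, :name)
value = Tables.getcolumn(row, :value)
```

A table type for which Tables.subset is not available would still need the slower lookup that walks the row iterator.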
