Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sheet: Override header span and skip rows when using properties file #266

Merged
merged 3 commits into from
Nov 9, 2024

Conversation

richiejp
Copy link
Contributor

@richiejp richiejp commented Nov 7, 2024

The properties file was overriding opts.rows_to_ignore and
opts.header_span values that need to be set to zero when using an
index. This recreates the parser with zsv_new after the opts have
been rewritten when an index is active.

The properties file was overriding opts.rows_to_ignore and
opts.header_span values that need to be set to zero when using an
index. This recreates the parser with zsv_new after the opts have
been rewritten when an index is active.
@liquidaty
Copy link
Owner

liquidaty commented Nov 8, 2024

There may in the future be properties beyond rows_to_ignore and header_span, which these changes would ignore. Could these changes instead use zsv_new_with_properties(), but before calling, set opts->rows_to_ignore and opts->header_span to zero, and set the corresponding chars (R and d) in opts_used, so that those are not loaded from the props.json file? That would retain all other properties but disregard rows_to_ignore and header_span.

Alternatively, the index could be invalidated whenever props.json was modified more recently than when the index was created. The index should be invalidated whenever the data file was modified more recently than the index anyway, so this additional invalidation criteria should be a trivial code change that should then make zsv_new_with_properties() work without any change to zsv_opts or opts_used. Since changes to the data file and/or props.json file are expected to be infrequent, the performance impact is not a concern.

@richiejp
Copy link
Contributor Author

richiejp commented Nov 8, 2024

Hi, sorry I'm not sure I understand. The first paragraph could be interpreted as meaning there may be properties that are not represented in the zsv_opts struct. The second suggests that the properties could change between building the index and reading the file?

If we are talking about the properties or data file changing, then presently the user can restart the sheet command if this happens. Once the index is saved to a file then this issue becomes a lot more pressing. To my knowledge creating an index file is the next item on the TODO list for me, so I'd prefer to handle invalidation during that task.

Previously the indexer was using the file offsets from the original
file when the filter was active. Instead it should use offsets in the
temporary filtered file.

The writer library doesn't record or expose the total number of bytes
written and flushing to the stream and retrieving the offset would be
expensive and invasive. So this splits filtering and indexing into two
operations.
add CI tests for macos-14, which should pass
@liquidaty liquidaty merged commit a6eeeee into liquidaty:main Nov 9, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants