@prazanna
Contributor

No description provided.

@prazanna prazanna self-assigned this Mar 21, 2017
@prazanna prazanna requested a review from vinothchandar March 21, 2017 19:16
@prazanna
Contributor Author

cc @siddharthagunda - FYI

@prazanna
Contributor Author

This fixes #108


/**
* Clean up any stale/old files/data lying around (either on file storage or index storage) that is past
* the typical query timeout. Default is 12 hours.
Member

Given that the cleaner itself has these knobs, why are we reintroducing a default clean time window here?

Member

Same comment on index tagLocation and updateLocation...

Would come in really handy as we design the next generation of indexing.

Contributor Author

> given cleaner itself has these knobs

True. I just added more description to the API since we are making it public. I will add that based on the cleaning policy.

> Same comment on index tagLocation and updateLocation...

I suppose you mean adding metrics? That can be a separate diff.

* Clean up any stale/old files/data lying around (either on file storage or index storage) that is past
* the typical query timeout. Default is 12 hours.
*/
public void clean() throws HoodieIOException {
Member

Can we add a metric around how long cleaning takes? It might be really useful to keep an eye on going forward.

Contributor Author

We have metrics today for cleaning. We should be able to get this already. We are also saving this in the .clean file in the timeline already.
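
For reference, a minimal sketch of what timing a cleaning run could look like. `CleanTimer`, `timeClean`, and the duration sink are hypothetical names for illustration only, not part of the Hoodie API, and the existing cleaning metrics may well be wired differently.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongConsumer;

/** Hypothetical helper (not part of Hoodie): times a cleaning run and reports its duration. */
final class CleanTimer {

    private CleanTimer() {
    }

    /**
     * Runs the supplied clean action and passes its wall-clock duration in milliseconds
     * to the sink. The duration is reported even if the action throws.
     *
     * @param cleanAction    the cleaning work, e.g. a call to a client's clean() method (assumed)
     * @param durationMsSink receives the elapsed milliseconds, e.g. a metrics gauge updater (assumed)
     * @return the elapsed milliseconds when the action completes normally
     */
    static long timeClean(Runnable cleanAction, LongConsumer durationMsSink) {
        long startNanos = System.nanoTime();
        long elapsedMs = 0L;
        try {
            cleanAction.run();
        } finally {
            // Measure and report in finally so failed runs are still visible in the metric.
            elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
            durationMsSink.accept(elapsedMs);
        }
        return elapsedMs;
    }
}
```

A caller would wrap the clean call, e.g. `CleanTimer.timeClean(() -> client.clean(), ms -> gauge.set(ms))`, where `client` and `gauge` are placeholders for whatever client and metrics objects are in use.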

@vinothchandar
Member

Ship it

@prazanna prazanna merged commit f1b7afa into apache:master Mar 22, 2017
vinishjail97 pushed a commit to vinishjail97/hudi that referenced this pull request Dec 15, 2023
…ers (apache#109)

* fail job if duplicate data files detected during reconcileAgainstMarkers

* adding missing apache License
vinishjail97 added a commit to vinishjail97/hudi that referenced this pull request Dec 15, 2023
…ers (apache#109) (apache#169)

* fail job if duplicate data files detected during reconcileAgainstMarkers

* adding missing apache License

Co-authored-by: harshal <[email protected]>
vinishjail97 pushed a commit to vinishjail97/hudi that referenced this pull request Feb 16, 2024
…ers (apache#109)

* fail job if duplicate data files detected during reconcileAgainstMarkers

* adding missing apache License