-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: dag import --stats (#8237) #8237
Merged
Merged
Changes from 6 commits
Commits
Show all changes
10 commits
Select commit
Hold shift + click to select a range
6f0b5c2
feat: report block count on `dag import`
rvagg daab857
fix: clean-up dag import message format
rvagg 875c3f0
Only print stats when --stats flag is passed
gammazero 52c970f
fix sharness test
gammazero c14d1bc
Add PayloadBytesCount to stats
gammazero 56b87df
Attempt to stabilize flaky tests
gammazero 2636d56
Rename PayloadBytesCount to BlockBytesCount
gammazero 160b9a6
Correctly calculate size or imported dag
gammazero ed9b606
Use RawSize of original block for import bytes calc
rvagg 02fe735
test: dag import without --stats
lidel File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Payload
can be confusing here: is this raw blocks (CAR-metadata) or actual data in each block (CAR-metadata-dagmetadata)?Perhaps renaming it to
BlockBytesCount
and setting this to sum ofnd.Stat().BlockSize()
is a way to remove confusion while keeping this useful no matter what codecs are used inside of blocks?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
PayloadBytesCount
toBlockBytesCount
It happens that
nd.Size()
andnd.Stat().BlockSize
return the same value, so I think it is better to usend.Size()
, given the comment here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm.. if they are the same, it makes sense, but something feels off when I compare Size imported from CAR with value reported by
ipfs dag stat
:125832269 bytes is ~125 MB which is way more than 26MB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lidel What I did previously (using
nd.Size()
) worked for the car files insharness/t0054-dag-car-import-export-data/
but does not work for your example. Usingnd.Stat().DataSize + nd.Stat().LinksSize
works for your example, but not for the test cars/dags. The test cars/dags have stats with all zeros, for almost all blocks.It appears that
nd.Size()
returnsnd.Stat().CumulativeSize
if a block has stat values. Othersize,nd.Size()
is set tolen(nd.RawData())
. This makes bothnd.Size()
andnd.Stats()
completely unreliable across different dags.Apparently, the way to get a reliable size is to always use
len(nd.RawData())
. So, that is what the latest change does.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DataSize
andLinksSize
sounds like a dag-pb thing, which probably won't be represented in the fixture files.What stat are we actually after here, @ribasushi what's your expectation of what this sizing is going to report? I would think that it's the size of the output, which includes CAR header, CID lengths and even varint section size prefixes. But I could find that by measuring the size of the output myself, so the utility doesn't seem great. But what is the utility of reporting just the block sizes? What is the useful for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect the
ipfs dag stat
number. It is useful in terms of "this is the amount of IPLD-data these blocks hold"There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @ribasushi,
len(nd.RawData())
is probably the one then, but in this case maybe just uselen(block.RawData())
then you get to use the block as it comes out of the CAR rather than whatever happens to it through aDecode
cycle (probably the same, but seems safer to use the one closer to the original)