Skip to content
This repository has been archived by the owner on Jun 10, 2020. It is now read-only.

harvested wac files keep wav file as original, closes #35 #49

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

cofiem
Copy link
Contributor

@cofiem cofiem commented Oct 29, 2016

@atruskie please review

Copy link
Member

@atruskie atruskie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution, we just need to consider a few edge cases before this is merged.

target = File.join(File.dirname(source), File.basename(source, '.wac') + '.wav')

# only try to create the wav file if it doesn't already exist
@audio.modify(source, target) unless File.size?(target)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cofiem - why this unless condition? Couldn't this lead to non-idempotent behaviour on repeated runs?

I guess you're assuming an identical .wav file may have already been uploaded and you don't want to run wac2wav if that is the case... in which case, that check should be explicit.

Adding to that, the output name of the .wav file should be unique so duplicate runs don't mistake converted wav files as original wav files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: This concept of converting a file for storage won't be a once off - see #50 and #36.

file = wav_file
rescue StandardError => e
@logger.error(@class_name) {
"Could not convert .wac file #{file} using utc offset '#{utc_offset}': #{format_error(e)}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming the utc_offset part of the exception has some relevance?

"Source .wac file #{wac_file} was not renamed."
}
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this block itself is fine, but the implication is disturbing: if we harvest files in WAac then we'll use triplicate storage until harvester_to_do is cleaned up. Given #50 & #36 this pattern will become more frequent as well.

I think we should keep source file in harvest_to_do, and delete the converted file out of harvester_to_do after it is copied to original_audio. Thoughts?

@@ -183,10 +183,10 @@
password = 'password'
auth_token = 'auth token this is'

file_hash = 'SHA256::c6d561c91664a92a1598bda3b79734d5ee29266ad0411ffaea1188f29d5b6439'
file_hash = 'SHA256::b06458c96ac2de75d77b83e1b3acb3e64b84a4bfe671c794153855bf7b5e96fd'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the changes below as well: what information is stored in the database?

It appears the hash, original file name, and relative path all refer to the wav file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely do not change the meaning of existing fields. Do this properly. Otherwise it's a quick way to shoot yourself in the foot.

First of all, why do you care what format the file used to be?

If you must store info about pre-conversion, what info do you want to store?
Add new fields to audio_recording to store this information. Or, maybe, store it in the existing machine readable notes field.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it's less changing what the fields mean and more clarifying what they mean in a new scenario. Does original file name refer to the source file or the converted file? That needs to be defined.

Why do we care? The only reason we have both of those fields (name and path (in notes)) is so we can map back to the original files in other storage mediums (E.g. Availae or a portable hdd). Storing the name of the converted file breaks that.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants