Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a simpler cache dedicated to just decode JPEGs #1550

Merged
merged 13 commits into from
Mar 13, 2023

Conversation

jleibs
Copy link
Member

@jleibs jleibs commented Mar 9, 2023

This cache stores a Tensor entity built from the decoded data.

This is now used in the few places where we have queried Tensors, but the TensorImageCache no longer needs to worry about JPEG-encoded data.

Ideally in the future this would just become something like a derived component from a new CompressedTensor. Then anything querying for Tensors would find this automatically with decoding and caching happening at the store/query layer instead. Some views (like the selection view) could then optionally handle CompressedTensor for a few UI elements like "download original image."

Related to:

Checklist

@jleibs jleibs changed the title Jleibs/image decode cache Introduce a simpler cache dedicated to just decode JPEGs Mar 9, 2023
@jleibs jleibs added the 🚀 performance Optimization, memory use, etc label Mar 9, 2023
@jleibs jleibs marked this pull request as ready for review March 9, 2023 23:41
@@ -535,7 +551,7 @@ impl<'a> TryFrom<&'a Tensor> for ::ndarray::ArrayViewD<'a, half::f16> {

#[cfg(feature = "image")]
#[derive(thiserror::Error, Debug)]
pub enum ImageError {
pub enum TensorImageError {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Differentiating this from the image::ImageError had me scratching my head for a bit.

@Wumpf Wumpf requested review from Wumpf and teh-cmc and removed request for teh-cmc March 10, 2023 08:05
Copy link
Member

@Wumpf Wumpf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking very good! But I'm very concerned about the tensor clone operation at the end of try_decode_tensor_if_necessary - I looked at Tensor and TensorData and they don't contain any Arc/Cow etc.. We need to come up with something more sophisticated that gets a handle to the tensor

Comment on lines 410 to 422
match &self.data {
TensorData::U8(buf) | TensorData::JPEG(buf) => buf.len(),
TensorData::U16(buf) => buf.len(),
TensorData::U32(buf) => buf.len(),
TensorData::U64(buf) => buf.len(),
TensorData::I8(buf) => buf.len(),
TensorData::I16(buf) => buf.len(),
TensorData::I32(buf) => buf.len(),
TensorData::I64(buf) => buf.len(),
TensorData::F32(buf) => buf.len(),
TensorData::F64(buf) => buf.len(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks so sad :D

pub enum TensorDecodeError {
// TODO(jleibs): It would be nice to just transparently wrap
// `image::ImageError` and `tensor::TensorImageError` but neither implements
// `Clone`, which we need if we ant to cache the Result.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// `Clone`, which we need if we ant to cache the Result.
// `Clone`, which we need if we want to cache the Result.

isn't the later our own and we could implement Clone?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes... except it wraps ImageError :-D

};

let memory_used = match &tensor {
Ok(tensor) => tensor.size_in_bytes() as u64,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd expect usize all the way for cpu-sided memory sizes. But admittedly it's a mess. Have this casting issue again and again in re_renderer....

crates/re_viewer/src/misc/caches/tensor_decode_cache.rs Outdated Show resolved Hide resolved
crates/re_viewer/src/misc/caches/tensor_decode_cache.rs Outdated Show resolved Hide resolved
crates/re_viewer/src/ui/data_ui/image.rs Show resolved Hide resolved
@jleibs jleibs force-pushed the jleibs/image_decode_cache branch from 8e59ca2 to 60ba461 Compare March 10, 2023 20:54
@jleibs jleibs requested a review from Wumpf March 10, 2023 20:57
@jleibs
Copy link
Member Author

jleibs commented Mar 10, 2023

@Wumpf ok I pulled in the shim for to make these arrow Buffer objects so they can be cheaply cloned:
screen

/// Can be removed when: [arrow2-convert#103](https://github.com/DataEngineeringLabs/arrow2-convert/pull/103) lands
#[derive(Clone, Debug, PartialEq, ArrowField, ArrowSerialize)]
#[arrow_field(transparent)]
pub struct BinaryBuffer(pub Buffer<u8>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't ByteBuffer a more fititng name name? 😬

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arrow calls it a BinaryArray so this is the Buffer that a binary array deserializes into.

TensorData::I32(buf) => buf.len(),
TensorData::I64(buf) => buf.len(),
TensorData::F32(buf) => buf.len(),
TensorData::F64(buf) => buf.len(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, it is quite surprising to me that Buffer<f64>::len() is the number of bytes, and not the number of elements, but it seems right: https://docs.rs/arrow2/latest/arrow2/buffer/struct.Buffer.html#method.len

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed an issue for this: jorgecarleitao/arrow2#1430

let max_image_cache_use = 1_000_000_000;
self.image.new_frame(max_image_cache_use);
let max_decode_cache_use = 1_000_000_000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the decode cache is RAM-only, we could make it quite a bit bigger. Maybe 8 GB?

Suggested change
let max_decode_cache_use = 1_000_000_000;
let max_decode_cache_use = 8_000_000_000;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but Web!! If we go there, this needs to have a different limit on Web

Copy link
Member

@emilk emilk Mar 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch - yeah, 8GB is quite a high limit for a 4iGB system :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8 still seems high in general. Splitting the difference and going with 4 normally and keeping it as 1 for wasm.

crates/re_viewer/src/misc/caches/tensor_decode_cache.rs Outdated Show resolved Hide resolved

lookup.tensor.clone()
}
_ => Ok(maybe_encoded_tensor),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should have TensorData be split into just two categories: Compressed and Raw so we can do an exhaustive match here. Or at least TensorData::Compressed(CompressedTensor) + enum CompressedTensor { Jpeg(…) } so it is easier to add png etc.

…but we can do that in another PR though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had the exact same thought as I was working through this. I think I'd like to split the tensor into 3 components. One for the meta info, one for native buffers, and one for compressed buffers. Then use the compressed data -> native as a very minimal prototype for a cached "derived component".

TensorData::I32(buf) => buf.len(),
TensorData::I64(buf) => buf.len(),
TensorData::F32(buf) => buf.len(),
TensorData::F64(buf) => buf.len(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed an issue for this: jorgecarleitao/arrow2#1430

@Wumpf
Copy link
Member

Wumpf commented Mar 11, 2023

@Wumpf ok I pulled in the shim for to make these arrow Buffer objects so they can be cheaply cloned: screen

Nice! But - completely unrelated to anything here - oh my god RenderContext::before_submit is going crazy. I'll need to look into this at some point

@jleibs jleibs merged commit 79af754 into main Mar 13, 2023
@jleibs jleibs deleted the jleibs/image_decode_cache branch March 13, 2023 14:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🚀 performance Optimization, memory use, etc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants