
Serialization/deserialization of optimized tract models #1313

Open
cospectrum opened this issue Jan 24, 2024 · 6 comments

Comments

@cospectrum

Hi, I intend to use tract for inference with AWS Lambda. I've observed that initializing and optimizing an ONNX model (from &[u8]) can be 2-3 times slower than actually executing the model. Perhaps it would be a good idea to introduce a method for storing your graph IR as &[u8]?
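For reference, a minimal sketch of the workflow described above, assuming tract's usual ONNX loading API (`model_for_read`, `into_optimized`, `into_runnable`); the function name is a placeholder:

```rust
// Sketch: load an ONNX model from bytes, then declutter/optimize it.
// The into_optimized() step is the expensive part measured above.
use tract_onnx::prelude::*;

fn load_model(bytes: &[u8]) -> TractResult<TypedRunnableModel<TypedModel>> {
    let mut reader = std::io::Cursor::new(bytes);
    tract_onnx::onnx()
        .model_for_read(&mut reader)? // parse the ONNX protobuf
        .into_optimized()?            // declutter + optimize the graph
        .into_runnable()              // build an execution plan
}
```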

@kali
Collaborator

kali commented Jan 24, 2024

Hey, thanks for your interest.

You should give the NNEF serialization a try. It is significantly faster to load and optimise than an ONNX model.
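A hedged sketch of that conversion, assuming tract-nnef's `write_to_tar` API; the file names are placeholders, and this is a one-off step whose output archive can then be shipped instead of the ONNX file:

```rust
// Sketch: convert an ONNX model to a decluttered NNEF archive (one-off step).
use tract_nnef::prelude::*;

fn convert_to_nnef() -> TractResult<()> {
    let model = tract_onnx::onnx()
        .model_for_path("model.onnx")? // placeholder path
        .into_typed()?
        .into_decluttered()?;          // the expensive step, paid once here
    let nnef = tract_nnef::nnef().with_tract_core();
    nnef.write_to_tar(&model, std::fs::File::create("model.nnef.tar")?)?;
    Ok(())
}
```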

@cospectrum
Author

Is there a way to load a model, optimize it with tract, and then save it back?

@kali
Collaborator

kali commented Jan 24, 2024

The NNEF serialization is a step in this direction, as it saves the "decluttered" model, and decluttering accounts for the most expensive part of the loading/declutter/optimize workflow (more than the actual optimisation).

There is no way to dump and reload a tract fully optimized model at this stage.
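The startup side of that workflow might look like the sketch below (assuming tract-nnef's loading API and a placeholder archive name); only the comparatively cheap optimisation step remains to be paid at load time:

```rust
// Sketch: load the decluttered NNEF archive at startup; decluttering
// has already been done, so only the optimisation step runs here.
use tract_nnef::prelude::*;

fn load_nnef() -> TractResult<TypedRunnableModel<TypedModel>> {
    let nnef = tract_nnef::nnef().with_tract_core();
    nnef.model_for_path("model.nnef.tar")? // placeholder path
        .into_optimized()?
        .into_runnable()
}
```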

@cospectrum
Author

If there is no such thing yet, it would be good to start by at least making the necessary internals of your IR public, so that I can build my own utility without a fork. Is the IR public?

@cospectrum
Author

Well, I see that TypedModel (created with into_optimized) is an alias of Graph, whose fields are completely public. Perhaps I have everything I need!
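For instance, since Graph's fields are public, walking an optimized model might look like this sketch (assuming the public `nodes` field and the `Op::name` accessor; the path is a placeholder):

```rust
// Sketch: the public `nodes` field of an optimized TypedModel (a Graph alias)
// can be walked directly, e.g. to list the operators that survived optimization.
use tract_onnx::prelude::*;

fn dump_ops() -> TractResult<()> {
    let model = tract_onnx::onnx()
        .model_for_path("model.onnx")? // placeholder path
        .into_optimized()?;
    for node in &model.nodes {
        println!("#{} {}", node.id, node.op.name());
    }
    Ok(())
}
```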

@kali
Collaborator

kali commented Jan 24, 2024

Yeah, the "IR" is just tract-core's TypedModel with some optimized operators. Most operators retain their "decluttered" form, because there is not much to gain from optimizing them, but the most important ones (MatMul & co) are heavily modified.

There is no stability commitment on operators (decluttered or optimised). Additionally, optimized operators are not portable from one architecture to another.


2 participants