
fix(perf): Remove FontSources.masks as they were consuming large amounts of memory even when no font sources were set #2519

Merged
CommanderStorm merged 17 commits into maplibre:main from Auspicus:fix/2518/fonts-memory-usage on Jan 24, 2026

Conversation

@Auspicus (Contributor) commented Jan 22, 2026

fixes #2518

This drops the RSS value after the bitsets are computed. Unfortunately the bitset computation still spikes the memory usage quite high.

UPDATE: Swapping out BitSet for BTreeSet results in much lower peak memory usage after startup. I don't think BitSet is doing what we want here.

UPDATE: Swapped BTreeSet back to BitSet for the font codepoints, giving lower peak memory usage for a font that provides every available codepoint.

@Auspicus changed the title from "Add Arc<T> around FontSources.masks to allow for copy-free cloning across workers" to "fix(perf): Add Arc<T> around FontSources.masks to allow for copy-free cloning across workers" on Jan 22, 2026
@Auspicus changed the title to "fix(perf): Add Arc<T> around FontSources.masks to allow for duplicate-free cloning across workers" on Jan 22, 2026
@Auspicus changed the title to "fix(perf): Add Arc<T> around FontSources.masks to allow for duplicate-free cloning across workers, replace BitSet with BTreeSet" on Jan 22, 2026
@Auspicus marked this pull request as draft on January 22, 2026 13:45
}

-    let mut needed = self.masks[(start as usize) / CP_RANGE_SIZE].clone();
+    let needed: &BTreeSet<usize> = &self.masks[(start as usize) / CP_RANGE_SIZE];
Member

@nyurik Why did you use masks in the first place here instead of something like

Suggested change
-let needed: &BTreeSet<usize> = &self.masks[(start as usize) / CP_RANGE_SIZE];
+// Create a BitSet for the requested range (start..=end)
+let mut needed = BitSet::with_capacity(end as usize);
+for cp in start..=end {
+    needed.insert(cp as usize);
+}

In the testing I have done, this works fairly well, but there must be a reason why this is done this way that I am not getting.

Member

tbh, it has been a long time since i did it, so no clue :)

@Auspicus (Contributor, Author)

I can imagine BitSet would work well for a small range like 0-255, but if you support the whole Unicode range, you need a lot of memory per bitset.
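That asymmetry can be made concrete with a bit of std-only arithmetic (an illustrative sketch assuming a u32-word-backed bitset like the bit-set crate uses, not the crate's exact layout):

```rust
/// Backing bytes for a dense bitset covering codepoints 0..=max_cp,
/// stored as u32 words. Holes still cost zero-filled words.
fn bitset_backing_bytes(max_cp: u32) -> usize {
    ((max_cp as usize + 1).div_ceil(32)) * 4
}

fn main() {
    // A small range like 0..=255 is practically free:
    assert_eq!(bitset_backing_bytes(0xFF), 32);
    // Covering the whole Unicode range costs ~136 KiB per bitset:
    assert_eq!(bitset_backing_bytes(0x10FFFF), 139_264);
}
```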

     face_index: isize,
     /// Unicode codepoints this font supports.
-    codepoints: BitSet,
+    codepoints: BTreeSet<usize>,
@CommanderStorm (Member) commented Jan 22, 2026

Unsure if there is a better way.
Let's brainstorm a bit.

So we need to know which of the 256 bits is set -> BitSet256.

While we currently store this as one very big BitSet, that is likely not necessary, since this range is likely fairly compressible.
No font designer would leave weird gaps in their font intentionally, I think.

  • Vec<BitSet256> works and weighs in at ~1.1 million bits ≈ 139 KB (MAX_UNICODE_CP bits)
  • BTreeSet<u32> works and is 32 B × number of font items -> likely lower than the above.
  • HashMap<u16, BitSet256> ((start / 256) -> next 256 codepoints) should work and is a compromise between the two?
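A quick back-of-envelope comparison of the first two options (std-only sketch; the 32 B/entry figure is the rough BTreeSet estimate from the comment above, not a measured layout):

```rust
/// Bytes for a dense bitmask over all of Unicode, backed by u32 words.
fn dense_mask_bytes() -> usize {
    (0x110000 / 32) * 4 // 1,114,112 bits -> 139,264 bytes (~136 KiB)
}

/// Rough bytes for a BTreeSet<u32> with `n` entries at ~32 B each.
fn btree_bytes(n: usize) -> usize {
    n * 32
}

fn main() {
    assert_eq!(dense_mask_bytes(), 139_264);
    // Break-even: the BTreeSet stays smaller while the font has
    // fewer than ~4352 codepoints; beyond that, the bitmask wins.
    assert_eq!(dense_mask_bytes() / 32, 4352);
    assert!(btree_bytes(1_000) < dense_mask_bytes());
    assert!(btree_bytes(10_000) > dense_mask_bytes());
}
```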

@CommanderStorm (Member) commented Jan 22, 2026

Noticed that only u32 is needed for the index; that is another halving of the memory usage.

This also stacks with the improvement above.

@Auspicus (Contributor, Author) commented Jan 22, 2026

Yeah, there was a mix of usize and u32 around, so we could definitely standardise on u32 and avoid a bunch of casting.

@Auspicus (Contributor, Author) commented Jan 22, 2026

The problem with using a BitSet256 is that you need to range it from 0-255, but the codepoints are 0-0x10FFFF, so you either need to translate the requested codepoints back to 0-255 before masking (losing the performance of a bitset) or have one GIANT BitSet for each range. If you check the size of each BitSet, you'll see some are as big as 0x10FFFF (depending on the largest inserted value).

-    let mut bs = BitSet::with_capacity(CP_RANGE_SIZE);
+    let mut set = BTreeSet::new();
     for v in 0..=MAX_UNICODE_CP {
-        bs.insert(v);
@Auspicus (Contributor, Author) commented Jan 22, 2026

I think the core issue is that although we allocate a BitSet::with_capacity(256), the value of v here might be 0x10FFFF, causing a MASSIVE reallocation to fit values 0-0x10FFFF in the BitSet. From what I understand, the backing storage must be contiguous without holes, so each BitSet could be as large as (0x10FFFF / 32) u32 words (4 bytes per u32), and one of these is inserted for each of the up-to-4352 ranges.
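The grow-to-fit-the-largest-value behaviour described here can be sketched with a toy bitset (std-only; the real bit-set crate differs in detail but grows the same way — capacity is driven by the largest inserted value, not by how many values are present):

```rust
/// Toy bitset that, like BitSet, grows its backing storage to fit the
/// largest inserted value; holes are stored as zero-filled words.
struct ToyBitSet {
    words: Vec<u64>,
}

impl ToyBitSet {
    fn with_capacity(bits: usize) -> Self {
        Self { words: Vec::with_capacity(bits.div_ceil(64)) }
    }

    fn insert(&mut self, v: usize) {
        let word = v / 64;
        if word >= self.words.len() {
            self.words.resize(word + 1, 0); // reallocation up to the new max
        }
        self.words[word] |= 1u64 << (v % 64);
    }

    fn backing_bytes(&self) -> usize {
        self.words.len() * 8
    }
}

fn main() {
    let mut bs = ToyBitSet::with_capacity(256);
    bs.insert(10); // fits in the first word
    assert_eq!(bs.backing_bytes(), 8);
    bs.insert(0x10FFFF); // one high codepoint forces ~136 KiB of backing storage
    assert_eq!(bs.backing_bytes(), 139_264);
}
```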

@nyurik (Member) commented Jan 22, 2026

My understanding is that each font file would use ~136KB of RAM (136 * 1024 * 8 = 1_114_112 = 0x11_0000 codepoint presence bits), which is not that much memory. If we follow the optimization I proposed below, most fonts won't even get that high, and it will keep the complexity to a minimum.

@Auspicus (Contributor, Author) commented Jan 22, 2026

The issue is less with each font source, since that is only a single BitSet of 34,816 u32 words (139KB). The issue is with the pre-computed masks. Each range (of which there are 4352) carries a masking BitSet of up to 0x11_0000 bits (34,816 u32 words of 4 bytes each in the backing BitVec). Assuming (inaccurately) that every range is this size, we get ~600MB (34816 * 4 * 4352 bytes). More accurately, we get masking BitSets with sizes growing from 8 to 34,816 u32 words, so more likely around 300MB total. Prior to adjusting the max codepoint size, we had smaller BitSets (8-2048 u32 words) and fewer of them (256, from 0xFFFF / 256), resulting in at most ~2MB (2048 * 4 * 256 bytes) for masks, and likely less since not all BitSets reach size 2048.
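The bounds quoted in this comment can be checked directly (plain arithmetic mirroring the comment's own figures, not measured allocations):

```rust
fn main() {
    let words_per_mask = 34_816usize;        // 0x110000 bits / 32 bits per u32
    let bytes_per_mask = words_per_mask * 4; // bytes in one full-width mask
    let ranges = 4352;                       // ceil(0x110000 / 256)

    assert_eq!(bytes_per_mask, 139_264);     // ~136 KiB per mask

    // Worst case: every one of the 4352 ranges carries a full-width mask.
    let worst_case = bytes_per_mask * ranges;
    assert_eq!(worst_case, 606_076_928);     // ~578 MiB, i.e. the "~600MB" above

    // Pre-#2438 (max codepoint 0xFFFF): 2048 u32 words per mask, 256 ranges.
    let old_worst_case = 2048 * 4 * 256;
    assert_eq!(old_worst_case, 2_097_152);   // the old ~2MB bound
}
```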

@Auspicus (Contributor, Author)

See the attached screenshots of the debugger running the main branch with the default FontSources:

[Screenshots: Screenshot_20260123_083203, Screenshot_20260123_083257]

@nyurik (Member) commented Jan 22, 2026

one thing we could try without any other changes:

 fn get_available_codepoints(face: &mut Face) -> Option<GetGlyphInfo> {
-    let mut codepoints = BitSet::with_capacity(MAX_UNICODE_CP);
+    let mut codepoints = BitSet::new();  // dynamic memory growth
 ...
     for cp in 0..=MAX_UNICODE_CP {
     ...
+      if count >= face.num_glyphs() { break; } // we know we won't find any more glyphs in this font

@nyurik (Member) commented Jan 22, 2026

P.S. Oh, silly me -- we no longer need to even do that -- I added character iteration to the freetype lib 2 years ago -- PistonDevelopers/freetype-rs#257

so should just be able to call for ch in face.chars() { ... }

@Auspicus (Contributor, Author)

So maybe no masks? I was wondering how expensive it would be to iterate vs. difference the set in the first place.

@nyurik (Member) commented Jan 22, 2026

freetype-rs doesn't support iteration from an arbitrary location (PRs welcome, I guess), so I think it might still be useful to have a bitmask for perf reasons, especially since it is much simpler now. Since you ran all the memory usage tests, could you try this implementation?

fn get_available_codepoints(face: &mut Face) -> Option<GetGlyphInfo> {
    let mut codepoints = BitSet::new();
    let mut spans = Vec::new();
    let mut first: Option<usize> = None;
    let mut last = 0;

    for (cp, _) in face.chars() {
        codepoints.insert(cp);
        if let Some(start) = first {
            if cp != last + 1 {
                spans.push((start, last));
                first = Some(cp);
            }
        } else {
            first = Some(cp);
        }
        last = cp;
    }

    if let Some(first) = first {
        spans.push((first, last));
        let start = spans[0].0;
        let end = spans[spans.len() - 1].1;
        Some((codepoints, usize::try_from(face.num_glyphs()).unwrap(), spans, start, end))
    } else {
        None
    }
}
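The span-building bookkeeping in the sketch above can be isolated and unit-tested on a plain sorted slice (a std-only reconstruction of the same loop logic, not the code that was merged):

```rust
/// Collapse a sorted, deduplicated list of codepoints into contiguous
/// (start, end) spans — the same bookkeeping the loop above performs.
fn to_spans(codepoints: &[usize]) -> Vec<(usize, usize)> {
    let mut spans = Vec::new();
    let mut first: Option<usize> = None;
    let mut last = 0;

    for &cp in codepoints {
        if let Some(start) = first {
            if cp != last + 1 {
                spans.push((start, last)); // gap found: close the current span
                first = Some(cp);
            }
        } else {
            first = Some(cp);
        }
        last = cp;
    }

    if let Some(start) = first {
        spans.push((start, last)); // flush the trailing span
    }
    spans
}

fn main() {
    assert_eq!(to_spans(&[]), Vec::<(usize, usize)>::new());
    assert_eq!(to_spans(&[65, 66, 67, 97, 98]), vec![(65, 67), (97, 98)]);
    // A codepoint at the very end of the range is still flushed —
    // the case the original implementation reportedly dropped.
    assert_eq!(to_spans(&[0x10FFFF]), vec![(0x10FFFF, 0x10FFFF)]);
}
```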

@nyurik (Member) commented Jan 22, 2026

(didn't test, but had to update the above a few times)

@nyurik (Member) commented Jan 22, 2026

Interestingly enough, my original implementation seems to have had a bug: if a font file had a codepoint at the MAX_UNICODE_CP-1 position (the last allowed), the last span would not be included in the result.

@Auspicus marked this pull request as ready for review on January 22, 2026 21:52
@nyurik (Member) commented Jan 22, 2026

(note to self - github editor is NOT that great for ... editing code)

@Auspicus have you had a chance to memory profile it?

 /// Maximum Unicode codepoint range ID.
 ///
-/// Each range is 256 codepoints long, so the highest range ID is 0xFFFF / 256 = 255.
+/// Each range is 256 codepoints long, so the highest range ID is 0x10FFFF / 256 = 255.
@Auspicus (Contributor, Author)

Should fix this number, which is now 0x10FFFF / 256 = 4352 (rounded up).
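As a sanity check on the corrected figure (plain arithmetic, not code from the PR): there are 0x110000 codepoints in total, which at 256 per range is exactly 4352 ranges, so the highest range ID is 4351.

```rust
fn main() {
    const MAX_UNICODE_CP: usize = 0x10FFFF;
    const CP_RANGE_SIZE: usize = 256;

    // Number of 256-codepoint ranges needed to cover all of Unicode.
    assert_eq!((MAX_UNICODE_CP + 1).div_ceil(CP_RANGE_SIZE), 4352);
    // The highest range ID itself (IDs run 0..=4351).
    assert_eq!(MAX_UNICODE_CP / CP_RANGE_SIZE, 4351);
}
```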


@Auspicus (Contributor, Author)

I am just starting my workday now, but I'll review again after work today. I will also post the dhat file in the viewer, as it really highlights the BitSet allocations.

autofix-ci bot and others added 2 commits January 23, 2026 07:49
…an BTreeSet at full capacity), lower lambda timeout and memory usage back to explicit default
 /// Returns `None` if the font contains no usable glyphs.
 fn get_available_codepoints(face: &mut Face) -> Option<GetGlyphInfo> {
-    let mut codepoints = BitSet::with_capacity(MAX_UNICODE_CP);
+    let mut codepoints = BitSet::new();
@Auspicus (Contributor, Author) commented Jan 23, 2026

Without an explicit capacity there will be some resizing, but it allows this BitSet to be as small as 32 bytes (256 bits) when the font only provides codepoints within the range 0-255. If the font provides even a single codepoint at the end of the range, this will need to be resized to 139KB. So we're trading a small amount of startup time for a lower memory footprint per font.

@Auspicus changed the title to "fix(perf): Remove FontSources.masks" on Jan 23, 2026
@Auspicus changed the title to "fix(perf): Remove FontSources.masks as they were consuming large amounts of memory even when no font sources were set" on Jan 23, 2026
@Auspicus (Contributor, Author) commented Jan 23, 2026

Before:

git checkout main && \
cargo b --release --no-default-features --features fonts && \
valgrind --tool=dhat ./target/release/martin -C 0 -W 16 --font ./tests/fixtures/fonts

Total:     5,770,215,231 bytes in 90,450 blocks
At t-gmax: 4,857,098,381 bytes in 71,914 blocks
At t-end:  4,857,098,381 bytes in 71,914 blocks
Reads:     5,241,483,644 bytes
Writes:    5,475,459,168 bytes

After:

git checkout fix/2518/fonts-memory-usage && \
cargo b --release --no-default-features --features fonts && \
valgrind --tool=dhat ./target/release/martin -C 0 -W 16 --font ./tests/fixtures/fonts

Total:     8,947,164 bytes in 16,304 blocks
At t-gmax: 1,674,288 bytes in 4,653 blocks
At t-end:  96,933 bytes in 191 blocks
Reads:     4,826,227 bytes
Writes:    8,527,250 bytes

@Auspicus (Contributor, Author)

I'm happy with where this is at now; let me know if you think we need to change anything @CommanderStorm / @nyurik.

@CommanderStorm (Member) left a comment

Thank you 👍

@CommanderStorm merged commit b6b61aa into maplibre:main on Jan 24, 2026
40 checks passed
@nyurik (Member) commented Jan 24, 2026

Oops, we should have added to the subject / changelog that this also fixes incorrect font behavior: the font glyph count will now be reported correctly.

@CommanderStorm mentioned this pull request on Jan 21, 2026
CommanderStorm added a commit that referenced this pull request Jan 28, 2026
## 🤖 New release

* `martin-tile-utils`: 0.6.8 -> 0.6.9 (✓ API compatible changes)
* `mbtiles`: 0.15.0 -> 0.15.1 (✓ API compatible changes)
* `martin-core`: 0.2.5 -> 0.2.6 (✓ API compatible changes)
* `martin`: 1.2.0 -> 1.3.0

<details><summary><i><b>Changelog</b></i></summary><p>


## `mbtiles`

<blockquote>

##
[0.15.1](mbtiles-v0.15.0...mbtiles-v0.15.1)
- 2026-01-27

### Added

- add MLT decoding support
([#2512](#2512))
- migrate our log library to tracing
([#2494](#2494))

### Other

- unignore `diff_and_patch_bsdiff` test with unique SQLite database
names ([#2480](#2480))
- *(mbtiles)* remove the prefix-ism around how files are named for
binary diff copy and simplify their naming
([#2478](#2478))
- *(mbtiles)* add assertion messages what we are checking to the copy
tests ([#2477](#2477))
</blockquote>

## `martin-core`

<blockquote>

##
[0.2.6](martin-core-v0.2.5...martin-core-v0.2.6)
- 2026-01-27

### Added

- migrate our log library to tracing
([#2494](#2494))
- *(martin-core)* Allow glyph ranges more than 0xFFFF
([#2438](#2438))

### Fixed

- *(perf)* Remove FontSources.masks as they were consuming large amounts
of memory even when no font sources were set
([#2519](#2519))
- improve error message if no SVG sprite files are present
([#2516](#2516))

### Other

- move our imports to tracing
([#2500](#2500))
- *(deps)* shear our dependencies
([#2497](#2497))
</blockquote>

## `martin`

<blockquote>

##
[1.3.0](martin-v1.2.0...martin-v1.3.0)
- 2026-01-27

### Added

- *(srv)* Add `route_prefix` configuration for native subpath support
without the need of a reverse proxy override
([#2523](#2523))
- add MLT decoding support
([#2512](#2512))
- migrate our log library to tracing
([#2494](#2494))
- improve martin-cp progress output time estimate
([#2491](#2491))
- *(pg)* include ID column info for tables
([#2485](#2485))
- *(pg)* support PostgreSQL materialized views
([#2279](#2279))
- *(martin-core)* Allow glyph ranges more than 0xFFFF
([#2438](#2438))

### Fixed

- *(ui)* clipboard copy for http://0.0.0.0:3000 and unify
implementations ([#2487](#2487))
- the `Copy` icon displaying nicely, next to the text and with enough
padding to all items
([#2483](#2483))
- update copy text to include icon for better visibility
([#2482](#2482))
- *(perf)* Remove FontSources.masks as they were consuming large amounts
of memory even when no font sources were set
([#2519](#2519))
- improve error message if no SVG sprite files are present
([#2516](#2516))

### Other

- move our request logging to tracing
([#2508](#2508))
- move our imports to tracing
([#2500](#2500))
- *(deps)* shear our dependencies
([#2497](#2497))
- *(ui)* adjust margin for copy icon in URL component
([#2489](#2489))
- unignore `diff_and_patch_bsdiff` test with unique SQLite database
names ([#2480](#2480))
- *(mbtiles)* remove the prefix-ism around how files are named for
binary diff copy and simplify their naming
([#2478](#2478))
- *(mbtiles)* add assertion messages what we are checking to the copy
tests ([#2477](#2477))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@Auspicus deleted the fix/2518/fonts-memory-usage branch on March 18, 2026 20:51

Development

Successfully merging this pull request may close these issues.

Lower memory usage of fonts feature
