ci: Add support for release version bumps other than nightly #23618

Merged: kjarosh merged 1 commit into ruffle-rs:master from kjarosh:ci-bump on May 4, 2026
Conversation

@kjarosh (Member) commented May 3, 2026

Description

This patch adds a parameter to the bump command of the release script, so that you can select how the version should be bumped. It currently supports "major", "minor", "patch", and "nightly".

Additionally, it refuses to bump the version when the workspace is not clean.

It shouldn't change anything in the current release process; it only adds features required for stable releases.
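
For readers unfamiliar with this kind of script, the behaviour described above could look roughly like the following sketch. It is not the release script from this PR: the function names, the date-based nightly suffix, and the use of `git status --porcelain` to detect a dirty workspace are all assumptions made for illustration.

```python
import re
import subprocess
from datetime import date
from typing import Optional


def bump_version(version: str, mode: str, today: Optional[date] = None) -> str:
    """Return the next version string for the given bump mode (hypothetical helper)."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())

    if mode == "major":
        return f"{major + 1}.0.0"
    if mode == "minor":
        return f"{major}.{minor + 1}.0"
    if mode == "patch":
        return f"{major}.{minor}.{patch + 1}"
    if mode == "nightly":
        # Assumption: a nightly keeps the base version and stamps the date
        # as a pre-release suffix. The real suffix format may differ.
        day = today or date.today()
        return f"{major}.{minor}.{patch}-nightly.{day.year}.{day.month}.{day.day}"
    raise ValueError(f"unknown bump mode: {mode!r}")


def is_clean(porcelain_output: str) -> bool:
    """`git status --porcelain` prints nothing when the workspace is clean."""
    return porcelain_output.strip() == ""


def workspace_is_clean() -> bool:
    """Ask git whether there are uncommitted or untracked changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        check=True,
        capture_output=True,
        text=True,
    )
    return is_clean(result.stdout)
```

A caller would check `workspace_is_clean()` first and abort with an error before calling `bump_version`, so that a bump never gets mixed with unrelated local changes.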

Testing

Tested locally by running the release script. CI changes are untested.

Checklist

  • I, a human, have self-reviewed this PR and fully understand the changes within.
  • I have made or updated tests where possible.
  • All of my commits are properly scoped, compile successfully, and pass all tests.
  • This PR does not make sense to split up into smaller PRs.
  • An LLM was involved in the authoring of this code.

@kjarosh added the A-build (Area: Build scripts & CI) and T-feature (Type: New Feature (that Flash doesn't have)) labels May 3, 2026
@kjarosh merged commit 655b217 into ruffle-rs:master May 4, 2026
26 checks passed
@kjarosh deleted the ci-bump branch May 4, 2026 08:22
Hancock33 added a commit to Hancock33/batocera.piboy that referenced this pull request May 9, 2026
------------------------------------------------------------------------------------------
dolphin-emu.mk b0eb643c614ddeda6400dc4033d58934a20ba5eb # Version: Commits on May 05, 2026
------------------------------------------------------------------------------------------
Merge pull request #14642 from SuperSamus/cpp-move-fixup-nocubeb

Fixup #14565 (compilation with `-DENABLE_CUBEB=OFF`),

-----------------------------------------------------------------------------------
eden.mk 4f4c298a39fee558f2a593157192afe7f821014c # Version: Commits on May 05, 2026
-----------------------------------------------------------------------------------
[hle, service] fix errors related to race conditions triggering under SMG1 and SMG2 (#3927)

-----------------------------------------------------------------------------------------------
lindbergh-loader.mk 0af606d845b70339c335785c0eba68b47b78df3c # Version: Commits on May 05, 2026
-----------------------------------------------------------------------------------------------
Update Patreon link in README.md,

--------------------------------------------------------------------------------------
openmsx.mk 22ec19b72a717446a18364fecda8e8132e0e0880 # Version: Commits on May 05, 2026
--------------------------------------------------------------------------------------
Update Node.js 20 actions to Node.js 24 versions.,

-----------------------------------------------------------------------------------
play.mk c9eccec03d1ee6840a3b818153df7fea7a6c142c # Version: Commits on Apr 16, 2026
-----------------------------------------------------------------------------------
FrameDebugger: Set initial file picker directory.,

-------------------------------------------------------------------------------------
ppsspp.mk 462b57bc1a21417b097acd06711935bdc9334c43 # Version: Commits on May 05, 2026
-------------------------------------------------------------------------------------
Merge pull request #21642 from hrydgard/dinput-code-cleanup

UWP keyboard fix, DInput code cleanup,

------------------------------------------------------------------------------------
rpcs3.mk d93d9b2c5aa859d1cf2f1381cefd204fb022163a # Version: Commits on May 05, 2026
------------------------------------------------------------------------------------
game_list: Fix ISO cache bypass in is_from_yml branch for multi-game ISOs (#18683)

Fixes regression from #18546 and #18679.

## Problem

The is_from_yml ISO branch constructed iso_archive unconditionally,

bypassing the cache check inside add_game, making the cache write-only

for yml-sourced ISOs.

## Fix

Added a lightweight index cache entry (iso_path + "//index") storing the

subdir list + mtime. On hit, skips archive construction entirely. On

miss, walks as before and writes the index,

-----------------------------------------------------
ryujinx.mk 1.3.287 # Version: Commits on May 05, 2026
-----------------------------------------------------
1.3.287

--------------------------------------------------------------------------------------
shadps4.mk 4d3827c34949d034cc47e86c943b7fd9318c48ae # Version: Commits on May 05, 2026
--------------------------------------------------------------------------------------
Avoid out-of-bounds array access when checking custom color for TV Remote (#4356),

---------------------------------------------------------------------------------------
touchhle.mk f886c577758f596b2a77ed599a9e1a3597540cb7 # Version: Commits on May 04, 2026
---------------------------------------------------------------------------------------
Remove edits to SDLActivity.java

It seems that debug builds work fine without it? I'm not sure why it was

breaking before...

Change-Id: Ibaf1cdaf55a91bdb12c02d5d5ac423ba1d112194,

-------------------------------------------------
vice.mk r46091 # Version: Commits on May 04, 2026
-------------------------------------------------
null

-------------------------------------------------------------------------------------------
xenia-canary.mk 80f2b535e9736a9772de528952877e912c328aea # Version: Commits on Feb 15, 2026
-------------------------------------------------------------------------------------------
[Kernel] Added KeSaveFloatingPointState and KeRestoreFloatingPointState from nukernel,

-----------------------------------------------------------------------------------------
xenia-edge.mk ba5fd0f4149a99e8665e989d53bbd2c6b9b7bc91 # Version: Commits on May 05, 2026
-----------------------------------------------------------------------------------------
[GPU/macOS] Tighten vblank and present pacing with mach_wait_until,

-----------------------------------------------------------------------------------
ymir.mk 374c8be5c37eb3853a9f0fc2b1eb5c263c725fe2 # Version: Commits on May 05, 2026
-----------------------------------------------------------------------------------
chore: Update Patreon supporters list,

---------------------------------------------------------------
ruffle.mk nightly-2026-05-05 # Version: Commits on May 05, 2026
---------------------------------------------------------------
## What's Changed

* ci: Add support for release version bumps other than nightly by @kjarosh in ruffle-rs/ruffle#23618

* chore: Bump esbuild version in package-lock.json by @torokati44 in ruffle-rs/ruffle#23616

* chore: Bump rollup package version in package-lock.json by @torokati44 in ruffle-rs/ruffle#23615

* chore: Bump webpack-cli to 7 in web/ by @torokati44 in ruffle-rs/ruffle#23613

* render: Improve hairline strokes and scaling strokes on WebGL and WGPU by @darktohka in ruffle-rs/ruffle#23011

## New Contributors

* @darktohka made their first contribution in ruffle-rs/ruffle#23011

**Full Changelog**: ruffle-rs/ruffle@nightly-2026-05-04...nightly-2026-05-05,

-----------------------------------------------------------------------------------------
catacombgl.mk a18035bf899d6f3093b487725b3c6e3867365231 # Version: Commits on May 05, 2026
-----------------------------------------------------------------------------------------
Adapt Catacomb 3-D menu instructions for game controller,

------------------------------------------------------------------------------------
cdogs.mk 3483ad394587f205f467a0d819b435395145b879 # Version: Commits on May 05, 2026
------------------------------------------------------------------------------------
Fix vehicle head drawing,

------------------------------------------------------------------------------------------
devilutionx.mk 3eb2b44e5a572c7ae1aaf8eaaa3856d188110d88 # Version: Commits on May 01, 2026
------------------------------------------------------------------------------------------
Ensure that buffered player info gets processed,

------------------------------------------------------------------------------------------
fallout2-ce.mk e42d8021c1fddc51ede3216f89cc9cdc75e07dc5 # Version: Commits on May 05, 2026
------------------------------------------------------------------------------------------
WIP Mapper implementation (#438)

* Add mapper CMakeTarget, tool for mapping function names to originals, load/save toolbar & update_art implemented

* edit_mapper function + stubs

* Rename exe to mapper-ce

* load_lbm_to_buf

* Add comments for read/write functions in db.h

* load_dialog, save_dialog, save_as, info_dialog and some other functions

* Fix LBM loading

* Fix mouse input not working on initial empty map, changed error in partyMemberRecoverLoadInstance to print to log, matching vanilla

* mapper.cc: basic hi-res support, NULL->nullptr

* load_lbm_to_buf rewrite, print_toolbar_name background fix

* Stubs for enter/exit playmode, art slot indexes fix, map_scr_toggle_hexes

* Fix memory corruption on screen_width > 640, fix various UI offset bugs

* mapper.cc: UI code style, toggle button fixes, rotation keys, edit button placeholders, PAGEUP key fix

* Elevation display fix, object type switching

* Spatial script placement and display, basic object selection

* Fixed dragging objects, block object showing, add all missing cases in edit_mapper with stubs, move all keys codes to constants

* chore: auto-format with clang-format

* Fix non-win builds

* Add stub calls from edit_mapper, fix objects being incorrectly deleted when unselected, fix tile number display

* Fix compile on Linux

* Attempt to fix iOS signing error

* Placing of objects and tiles, F12 to erase map, bug fixes

* Fixed block object toggling logic and add missing switch cases to edit_mapper

* Object editing added, 'p' to scroll palette fixed

* Add new files to CMakeLists

* Attempt to fix some colors + alignment in critter edit window

* chore: auto-format with clang-format

* Linux build fix attempt

* Critter inventory editing

* Vanilla grid-based inventory item picker

* Review fixes

* More review fixes and const correctness

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>,

----------------------------------------------------------------------------------------
retroarch.mk 14a5cc00a050b3d253d42ae0afa284c4a6fb129f # Version: Commits on May 05, 2026
----------------------------------------------------------------------------------------
Fix Dolphin autostate load hang by sleeping a bit first,

----------------------------------------------------------
bgfx.mk v1.143.9248-539 # Version: Commits on May 05, 2026
----------------------------------------------------------
Fix cmake syntax error when compiling shaders in Debug mode,

------------------------------------------------------------
libdof.mk c02135e90ce1acd13a5ab21a4878b1d1820bbe49 NULL-NULL
------------------------------------------------------------
Moved Permanently,

---------------------------------------------------------------------------------------
vpinball.mk 034f9408539c8bc39866305fdb9cd57721961816 # Version: Commits on May 04, 2026
---------------------------------------------------------------------------------------
BGFX: use camera relative rendering to support low precision platform (Meta Quest),

----------------------------------------------------
glslang.mk 16.3.0 # Version: Commits on May 01, 2026
----------------------------------------------------
Deprecation Notice:

* Deprecate the HLSL front-end. See issue #4210 for details.

Changes in this release:

* Support GL_NV_explicit_typecast

* Raise the maximum limit for specialization constant IDs

* Add explicit 8-bit and 16-bit type support for bitfieldReverse

* Implement system include directives for the standalone wrapper

* Check for invalid usage of gl_WorkGroupSize components

* HLSL: Provide string error context only if token is a string

* Fix layoutDescriptorStride bitfield truncation for large stride values

* GL_EXT_long_vector with 2-4 components no longer require LongVector capability

* Fix alignment of guard blocks

* Fix ShaderDebugInfo having invalid line numbers when generating SPIRV 1.0

* Replace ostringstream with string concat during #include preprocessing

* Check for bad parameters on long vector type

* HLSL: Check for bad integer argument on Load*, Store*, Interlocked*

* HLSL: handle type error for ternary operator

* HLSL: Ensure scope is popped even when method body fails to parse

* Avoid unnecessary copies in SpirvIntrinsics.cpp

* Unconditionally emit debug source for include files when using non-semantic debug info

* Support bfloat16 and float8 tensors

* Add small type capabilities for GLSL.STD.450

* Add initial support for NonSemantic.Shader.DebugInfo 101

* Fix access chains for GL_ARM_tensors with raw descriptor heap accesses

* Support GL_KHR_compute_shader_derivatives

* Require a quad or linear layout qualifier to be specified for GL_KHR_compute_shader_derivatives

* Support SPV_KHR_constant_data and SPV_KHR_abort

----------------------------------------------------------------------------------------
doomretro.mk 827c09d875a53f4a6ad6464d30448c51496ab6b9 # Version: Commits on May 05, 2026
----------------------------------------------------------------------------------------
Update releasenotes.md,

--------------------------------------------------------------------------------------
yquake2.mk f8939a0561ac992837ab006c144fd972d9cd1628 # Version: Commits on May 04, 2026
--------------------------------------------------------------------------------------
game: scale ammo on fire

Scale explosion effect is unsupported by protocol.,

------------------------------------------------------------------------------------------
xash3d-fwgs.mk e6a44b70e08c379fc6dc059ae7cfeca799fb7c58 # Version: Commits on May 04, 2026
------------------------------------------------------------------------------------------
engine: client: always load client.dll last to crash on nullptr in mods that fetch cvar pointers early, add comment for anyone who would modify this file,

--------------------------------------------------------------------------------------------------
libretro-beetle-psx.mk 882e55b8cb3a1b4c3b91d71a2c156a9b33f279b8 # Version: Commits on May 05, 2026
--------------------------------------------------------------------------------------------------
mednafen: drop clamp.h; fold + optimize audio saturation; fix Vulkan static-after-extern shadow

Two changes that travel together because they touch the same audit

pass.

(1) clamp.h dropped, callers folded inline

==========================================

clamp.h was a 29-line file with one 4-line static inline

function (`clamp(int32_t *val, ssize_t min, ssize_t max)`) that

saturated a value in place. 12 call sites across spu.c (7),

cdc.cpp (4), and gte.c (1). All but one saturated to the

signed 16-bit audio range [-32768, 32767]; the gte.c outlier

saturates to [-32768 + lm * 32768, 32767] where lm is a bit

from the GTE opcode. Folded inline at every call site, where

each fold also gets a comment explaining what kind of

saturation is happening (audio output sample, ADPCM IIR-filter

intermediate, GTE projected coordinate, etc.).
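
As an illustration of the two saturation ranges described above, here is a minimal sketch in Python (the commit itself is C/C++). The function names are hypothetical; only the bounds come from the text.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def saturate_audio(value: int) -> int:
    """Saturate to the signed 16-bit audio range [-32768, 32767]."""
    if value < INT16_MIN:
        return INT16_MIN
    if value > INT16_MAX:
        return INT16_MAX
    return value


def saturate_gte(value: int, lm: int) -> int:
    """Saturate to [-32768 + lm * 32768, 32767], where lm is 0 or 1.

    With lm == 1 the lower bound becomes 0, so negative values clamp to 0.
    """
    lower = INT16_MIN + lm * 32768
    if value < lower:
        return lower
    if value > INT16_MAX:
        return INT16_MAX
    return value
```

Written out like this, each call site is just a pair of comparisons, which is what makes folding the helper inline cheap.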

While auditing the call sites for the fold, three real

optimisation opportunities surfaced:

 (a) cdc.cpp ApplyVolume short-circuit on Muted:

     Historical body computed L/R volume-matrix mix

     unconditionally, ran two clamps, then conditionally zeroed

     both channels if Muted was set. Muted is the resting state

     any time CD audio isn't actively playing - probably the

     majority of frames in many games. Reordered to test Muted

     first and bail with samples[]=0 in that case; mix and

     clamp only run when the result is going to be used. Saves

     4 multiplies + 4 shifts + 2 adds + 4 saturating compares

     per sample on the muted path. Same final samples[] in both

     paths so behaviour is identical.

 (b) cdc.cpp GetCDAudio resampler eliminates out_tmp[2] stack

     scratch:

     The fractional-rate path used an int32 out_tmp[2] stack

     accumulator, accumulated each channel's 25-tap windowed-

     sinc convolution into it, clamped, then copied to

     samples[i]. Folded into a per-channel local int32 acc that

     accumulates and writes straight to samples[i] - same ops,

     one fewer stack temp.

 (c) spu.c per-sample mix loop eliminates output[2] stack

     scratch and tightens the IntermediateBuffer overflow

     guard:

     The mix loop computed per-LR `output[lr]` from accum[lr]

     and the global volume sweep, clamped, and on the next line

     wrote `(output[lr] * 3 + 2) >> 2` to IntermediateBuffer.

     output[] only existed to carry one int32 per channel

     between those two lines. Fused: the post-volume-sweep

     value is computed inside the IntermediateBuffer write

     expression directly, saving 8 bytes of stack and one

     round-trip per sample. As a side effect the

     IntermediateBufferPos overflow guard now covers the

     volume-sweep step too - previously only the buffer write

     was guarded and the sweep + clamp ran every sample even

     when the buffer was full (debugger edge case).

     SPU_Sweep_ReadVolume is pure (returns sweep->Current), so

     skipping it on the buffer-full path is behaviour-

     preserving.
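
The reordering in (a) can be sketched as follows, in Python rather than the original C++. The 2x2 matrix layout, the `>> 7` scaling, and the function names are assumptions chosen to match the "4 multiplies + 4 shifts + 2 adds" count; they are not taken from cdc.cpp.

```python
def saturate(value: int) -> int:
    """Saturate to the signed 16-bit range."""
    return max(-32768, min(32767, value))


def apply_volume_reference(left, right, matrix, muted):
    """Historical shape: always mix and clamp, then zero if muted."""
    out_l = saturate(((left * matrix[0][0]) >> 7) + ((right * matrix[1][0]) >> 7))
    out_r = saturate(((left * matrix[0][1]) >> 7) + ((right * matrix[1][1]) >> 7))
    if muted:
        out_l = out_r = 0
    return out_l, out_r


def apply_volume(left, right, matrix, muted):
    """Reordered shape: test the mute flag first and skip the mix entirely."""
    if muted:
        return 0, 0
    out_l = saturate(((left * matrix[0][0]) >> 7) + ((right * matrix[1][0]) >> 7))
    out_r = saturate(((left * matrix[0][1]) >> 7) + ((right * matrix[1][1]) >> 7))
    return out_l, out_r
```

Both functions return the same pair for every input, which is the sense in which the change is behaviour-preserving; the second one simply does no arithmetic on the muted path.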

The two reverb resampler helpers (Reverb4422 / Reverb2244)

collapse from `clamp(&out, ...); return out;` to a pair of

inline ifs followed by `return out;`. Each is a simple

collapse, no semantic change.

The voice-decode clamp inside the SPU's ADPCM nibble loop is a

straight inline-the-clamp; no opportunity for a structural

optimisation there because the saturated value feeds into both

tb[i] and the M1/M2 history (PS1 silicon clamps at int16 for

its IIR filter state), so the temporary is genuinely needed.

Per-TU text-section sizes at -O2 (size /tmp/X.o):

                  before   after    delta

  spu.o            34846    34910    +64

  gte.o            20055    20055      0

  cdc.o            29443    29379    -64

                                    ----

                                       0   net

Same total binary size; the optimisations balance the slight

structural growth from the IntermediateBuffer-guard rework.

(2) rsx_lib_vulkan.cpp: rename file-static crop_overscan to

    avoid extern-vs-static shadow

======================================================

fc4d742 ("core: prune dead globals; consolidate cross-TU

extern decls") replaced rsx_lib_vulkan.cpp's local-extern

redecls of cross-TU globals with a `#include

"beetle_psx_globals.h"`. The header includes

`extern int crop_overscan;`. Unfortunately the file had a

`static int crop_overscan;` declaration at file scope from

long before fc4d742 - a long-standing shadow of the global

that nothing else in the TU referenced.

g++ (correctly) refuses the resulting static-after-extern:

  rsx/rsx_lib_vulkan.cpp:55:12: error: 'crop_overscan' was

    declared 'extern' and later 'static' [-fpermissive]

   55 | static int crop_overscan;

      |            ^~~~~~~~~~~~~

Renamed the file-static to `vulkan_crop_overscan` plus its 8

internal use sites; the BEETLE_OPT(crop_overscan) macro key on

line 360 stays as-is (it's the env-var name, not the variable

name). Behaviour preserved bit-perfect: the file still reads

the BEETLE_OPT(crop_overscan) env var into its own private

copy and uses that locally, exactly as before. The cross-TU

global crop_overscan from beetle_psx_globals.h is left for

other TUs (libretro.cpp, gpu.cpp, input.cpp, rsx_intf.cpp,

rsx_lib_gl.cpp) which have always read it directly. The two

parallel-but-separate values track identically because

libretro.cpp's check_variables() reads the same env var into

the global at the same time rsx_lib_vulkan reads it into the

static.

Verification

============

  - All 9 sampled CXX TUs (gpu.cpp, frontio.cpp, cdc.cpp,

    cpu.cpp, guncon.cpp, justifier.cpp, gamepad.cpp,

    general.cpp, mempatcher.cpp) compile clean at -O2.

  - All 10 sampled C TUs (dma.c, gte.c, timer.c, spu.c, sio.c,

    irq.c, mdec.c, error.c, mednafen-endian.c, Deinterlacer.c)

    compile clean at -O2.

  - rsx_lib_vulkan.cpp structural check passes - no

    static-vs-extern conflicts, no undeclared-symbol errors

    (the file still needs Vulkan SDK headers not on this

    sandbox to compile fully, but those errors are unrelated

    and identical before/after this change).

  - Direct grep confirms zero remaining `clamp(` calls outside

    GLSL shader code (`clamp(uint(coords.x), 0, 0xff)` in

    rsx/shaders_gl/command_fragment.glsl.h is GLSL's built-in,

    not C).,

---------------------------------------------------------------------------------------------
libretro-fbneo.mk f7574b86e0eeece0e8c633b77dd9833840155dd9 # Version: Commits on May 05, 2026
---------------------------------------------------------------------------------------------
(libretro) update files,

--------------------------------------------------------------------------------------------------
libretro-gearcoleco.mk c4ae7b25b35ab1060fa84cc5464dd899b43651d2 # Version: Commits on May 04, 2026
--------------------------------------------------------------------------------------------------
Update publish to mcp registry workflow,

-------------------------------------------------------------------------------------------------
libretro-geargrafx.mk c4b8b8eab4427ebfe4a5f08af8b349ff3b4a21bc # Version: Commits on May 04, 2026
-------------------------------------------------------------------------------------------------
Update publish to mcp registry workflow,

--------------------------------------------------------------------------------------------------
libretro-gearsystem.mk 4dedd026c1c861158e1f17b8616bdf11d7cd9ad2 # Version: Commits on May 04, 2026
--------------------------------------------------------------------------------------------------
Update publish to mcp registry workflow,

---------------------------------------------------------------------------------------------
libretro-noods.mk 626628ca270e41528c20ebbedb69408eca326834 # Version: Commits on May 05, 2026
---------------------------------------------------------------------------------------------
Libretro: fix saves on non unix platforms,

----------------------------------------------------------------------------------------------
libretro-ppsspp.mk 462b57bc1a21417b097acd06711935bdc9334c43 # Version: Commits on May 05, 2026
----------------------------------------------------------------------------------------------
Merge pull request #21642 from hrydgard/dinput-code-cleanup

UWP keyboard fix, DInput code cleanup,

-------------------------------------------------------------------------------------------
libretro-ps2.mk 0f2c9a7c615357e6d82a4520e502f94ff27ca77b # Version: Commits on May 05, 2026
-------------------------------------------------------------------------------------------
Buildfixes: restore __forceinline on non-mingw toolchains

The d2d1ebc / fdb0eec / c9d5ee4 series stubbed __fi / __ri /

__releaseinline (and removed __forceinline from a few SPU2 hot-path

functions) to make the libretro Makefile build link under mingw.  That

was correct for the failing target, but it was applied universally and

silently disabled cross-TU inlining on every working toolchain too -

MSVC, Linux gcc, macOS clang.  The hot paths that lost their always-

inline (SPU2 Mix / TimeUpdate / spu2M_Write / UpdateSpdifMode and

everything reached through __fi / __ri elsewhere in the codebase) are

all on the audio mix and EE/IOP-recompiler-adjacent paths where the

inlining is the point of the decoration.

The actual breakage is mingw-only.  mingw-w64's _mingw.h defines

__forceinline as `extern __inline__ __attribute__((__always_inline__,

__gnu_inline__))`, which under GNU inline rules means "inline at every

callsite AND DO NOT emit an out-of-line copy".  In a non-LTO build

that turns every cross-TU caller of a __forceinline-decorated free

function (dmaSIF1, vtlb_GetPhyPtr, x86Emitter::xPUSH, the four SPU2

ones above, ...) into an undefined reference.  cmake builds avoid this

because PCSX2_LTO=ON merges all TUs at link time; the libretro

Makefile builds do not LTO.

MSVC's __forceinline always emits an out-of-line copy, and Linux/macOS

gcc/clang's __attribute__((always_inline, unused)) also emits one.

On those toolchains the historical decoration is correct.

So we keep the historical __forceinline definition and the historical

__fi / __ri / __releaseinline = __forceinline mapping for everyone,

and special-case __MINGW32__ to bind __fi / __ri / __releaseinline to

empty.  __forceinline itself stays untouched on mingw - the system

headers (winbase.h, processthreadsapi.h, synchapi.h, _mingw.h)

declare strnlen_s / _InterlockedIncrement / NtCurrentTeb / etc as

__forceinline and rely on gnu_inline semantics for ODR.

Verified by preprocessing common/Pcsx2Defs.h on both compilers:

  Linux gcc -DNDEBUG: __fi -> __attribute__((always_inline, unused))

  mingw-w64 gcc      : __fi -> empty, __forceinline left alone

Verified by running nm against fresh .o files compiled with both

compilers in NDEBUG mode:

  Linux:  spu2M_Write / TimeUpdate / UpdateSpdifMode / Mix all emit

          out-of-line T symbols (cross-TU linkable).

  mingw:  same four symbols emit T (cross-TU linkable, link will

          succeed for the libretro Makefile build).

Also restored the __forceinline that was dropped from SPU2 Mixer.cpp's

Mix() and from spu2sys.cpp's three __forceinline functions, but spelt

as __fi instead of __forceinline directly so the mingw-stub path

applies cleanly.

Net effect on the Windows MSVC, Linux, macOS, and cmake builds: code

emission goes back to whatever it was before d2d1ebc (perf restored).

Net effect on the libretro Makefile mingw build: identical to ab74e3d

(still links, still runs as far as it currently does).,

---------------------------------------------------------------------------------------------------
libretro-snes9x-next.mk d9cba8a41b3407ebb929816a7033e0407fd7b2d0 # Version: Commits on May 05, 2026
---------------------------------------------------------------------------------------------------
tile.c: hoist invariant RealScreenColors assignment out of backdrop renderers

The 28 DrawBackdrop16* renderers each began with

    GFX.RealScreenColors = IPPU.ScreenColors;

    GFX.ScreenColors = GFX.ClipColors ? BlackColourMap : GFX.RealScreenColors;

The first line is invariant across the whole backdrop pass: backdrop

has no per-tile palette slice (unlike SELECT_PALETTE for regular tiles)

and no Direct Colour Mode override (unlike Mode 7 entry points), so it

always sets RealScreenColors to IPPU.ScreenColors. Lift that line out

of every renderer body into the DrawBackdrop() and DRAW_BACKDROP_NO_MATH()

macros in ppu.c, set once before the per-clip-region loop.

The second line stays inside each renderer (BlackColourMap is private

to tile.c) and is genuinely per-clip-region (ClipColors changes each

iteration of the macro's loop).

Saves N-1 redundant assignments per backdrop pass where N is the

number of clip regions; perf-negligible. Net -19 lines.

src/ppu.c  +9

src/tile.c -28,

----------------------------------------------------------------------------------------------
libretro-stella.mk 93a070e927573584bb3059028a5514ec22f2b0ce # Version: Commits on May 05, 2026
----------------------------------------------------------------------------------------------
More ostringstream cleanups.,

---------------------------------------------------------------------------------------------
libretro-vba-m.mk 26fe5b40ca10931bf5e4bfde671a85625247e1a4 # Version: Commits on May 05, 2026
---------------------------------------------------------------------------------------------
ci: disable SDL3 PPA on Ubuntu runners for now

Disable getting the SDL3 backport from a PPA on the Ubuntu CI runners

for now due to issues with launchpad.

Signed-off-by: Rafael Kitover <rkitover@gmail.com>,

-------------------------------------------------------------------------------------------
glsl-shaders.mk 42fa8a98ab19bdaffb53280746a30819eb21f807 # Version: Commits on May 05, 2026
-------------------------------------------------------------------------------------------
crt-geom-mini; optimize to be closer to crt-geom, tiny-ntsc add saturation parameter (#562)

* Update crt-geom-mini.glsl

* Update tiny_ntsc.glsl

* Update crt-geom-mini.glslp,

--------------------------------------------------------------------------------------------
slang-shaders.mk 2ba50bfaeae630741216a9b60b5147485657316f # Version: Commits on May 05, 2026
--------------------------------------------------------------------------------------------
vectorscale: pack-positions pre-pass + geometric crossing intersection (#909)

* vectorscale: pack-positions pre-pass + inline crossing intersection

Adds a per-CP pre-pass (pack-positions) that denormalizes render

geometry into a single PackedPositions texture and folds the crossing

curve-curve intersection into the same pass. The rasterizer reads its

full per-CP geometry from PackedPositions and skips ghost extension,

neighbor-index decoding, and t_branch solving in its hot loop.

New shader: pack-positions.slang

For each CP slot, packs into 3 horizontally-adjacent texels:

  col 0 = (pp.x, pp.y, prev_ci_or_-1, _)

  col 1 = (cp.x, cp.y, t_branch, validity 0=skip 1=normal 2=line)

  col 2 = (np.x, np.y, next_ci_or_-1, _)

(pp, cp, np) is the ghost-extended (pp = 2·prev - cp etc.) Bezier

control triple. t_branch is computed per CP type:

- IS_CROSSING: 2D Newton iteration on F(t,s) = B_a(t) - B_b(s) = 0,

  starting from (0.5, 0.5). The optimizer keeps crossings near the

  grid corner so the initial guess is within ~0.1 of the answer;

  4 iterations drive the residual below f32 epsilon. Reads neighbor

  positions from both this slot's chain (N-S or E-W) and the partner

  slot's chain.

  This replaces the legacy ghost-aware inverse-correction that moved

  each crossing CP so the rendered curve passed through the grid

  corner at t=0.5. The CP now stays at its optimizer-final position

  and the rasterizer's wedge AA anchors at the geometric intersection

  B_a(t) = B_b(s).

- 2-CP chain (degenerate stem with both ends as endpoint markers):

  t_branch = 0.5; render geometry pre-built as a straight line so the

  rasterizer dispatches to its closed-form line solver via is_line.

- One-sided clamped Bezier (prev or next is endpoint): closed-form

  cubic project of the interior B-spline midpoint onto the clamped

  span — finds the t at which the rendered clamped curve reaches the

  same physical "before/after sc" boundary an interior B-spline would

  at t=0.5.

- Else: t_branch = 0.5.
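
The IS_CROSSING case above is a standard two-variable Newton iteration. The sketch below shows the idea in Python on two plain quadratic Béziers; it is not the shader code, the curve type and helper names are assumptions, and the ghost-extended control triples are not reproduced.

```python
def bezier(curve, u):
    """Evaluate a quadratic Bezier given as three (x, y) control points."""
    (x0, y0), (x1, y1), (x2, y2) = curve
    w0, w1, w2 = (1 - u) ** 2, 2 * (1 - u) * u, u * u
    return (w0 * x0 + w1 * x1 + w2 * x2, w0 * y0 + w1 * y1 + w2 * y2)


def bezier_derivative(curve, u):
    """First derivative of the quadratic Bezier with respect to u."""
    (x0, y0), (x1, y1), (x2, y2) = curve
    return (
        2 * (1 - u) * (x1 - x0) + 2 * u * (x2 - x1),
        2 * (1 - u) * (y1 - y0) + 2 * u * (y2 - y1),
    )


def intersect(curve_a, curve_b, iterations=4):
    """Solve F(t, s) = B_a(t) - B_b(s) = 0 by Newton, starting at (0.5, 0.5)."""
    t, s = 0.5, 0.5
    for _ in range(iterations):
        ax, ay = bezier(curve_a, t)
        bx, by = bezier(curve_b, s)
        fx, fy = ax - bx, ay - by
        dax, day = bezier_derivative(curve_a, t)
        dbx, dby = bezier_derivative(curve_b, s)
        # Jacobian of F with respect to (t, s): [[a, b], [c, d]].
        a, b, c, d = dax, -dbx, day, -dby
        det = a * d - b * c
        if abs(det) < 1e-12:
            break  # curves are (nearly) parallel here; give up on this step
        # Solve J * (dt, ds) = -F with Cramer's rule.
        dt = ((-fx) * d - b * (-fy)) / det
        ds = (a * (-fy) - c * (-fx)) / det
        t, s = t + dt, s + ds
    return t, s
```

Because the starting guess is already close to the crossing, a small fixed iteration count is enough, which suits a shader where a data-dependent loop bound would be awkward.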

Modified: update-tjunction.slang

Drop the IS_CROSSING ghost-aware inverse-correction branch; crossings

pass through unchanged. Drops the now-unused Opt2 sampler binding,

read_orig_pos helper, and Opt2Size UBO field.

Modified: cell-rasterizer.slang

Replace read_pos + read_neighbors + ghost extension + 2-CP-chain

construction + t_branch cubic-solver in test_one_cp with a single

read_packed_cp(ci) call returning a PackedCp struct. Per-active-probe

fetch count: ~6 → 4 (1 flag + 3 packed reads). resolve_hit's

neighbor-direction lookups for color resolution are unchanged.

Modified: vectorscale.slangp

11 passes (was 10). pack-positions inserted between the final

update-tjunction iteration (FinalPositions) and cell-rasterizer.

PackedPositions framebuffer is 3.0 × source-relative wide.

* vectorscale: cubic solver — FMA on discriminants, Newton polish, faster trig

Three numerical improvements to closest_on_span:

1. FMA on discriminants. b²−4ac is the textbook catastrophic-cancellation

   case when b² ≈ 4ac (near-double-root); fma(b, b, -4·c·a) computes the

   sum with a single rounding instead of two, recovering ~1 extra bit and

   preventing disc from rounding to the wrong sign at the branch boundary.

   Same trick for the cubic disc q²/4 + p³/27 at the disc≈0 (near-triple-

   root) boundary between Cardano and trig branches.

2. Newton polish on every analytical root. Cardano + acos/cos/pow(_, 1/3)

   come back at ~5 ULP; one Newton step on D'(t) drives the root to

   ~1 ULP. polish_root_c skips when D''(t) is small or |step| ≥ 0.5 to

   avoid divergence at near-double/triple-root cases.

3. Faster trig branch. Replaces pow(sqrt(-p³/27), 1/3) (3 multiplies +

   sqrt + pow(_, 1/6)) with the equal 2·sqrt(-p/3). Reduces work and

   avoids precision loss of pow(_, 1/6).
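
Items 2 and 3 can be illustrated with a short Python sketch. The guard thresholds other than the 0.5 step limit, the function names, and the test polynomial are assumptions; the shader's actual `polish_root_c` is not reproduced here.

```python
import math


def polish_root(t, f, df, min_slope=1e-6, max_step=0.5):
    """Apply one guarded Newton step to an approximate root t of f.

    In the commit's setting f would be D'(t) and df would be D''(t).
    The step is skipped when the slope is tiny or the step is large,
    which is where Newton tends to diverge near multiple roots.
    """
    slope = df(t)
    if abs(slope) < min_slope:
        return t
    step = f(t) / slope
    if abs(step) >= max_step:
        return t
    return t - step


def amplitude_via_pow(p):
    """Cube root of sqrt(-p^3 / 27), valid for p < 0."""
    return math.sqrt(-(p ** 3) / 27.0) ** (1.0 / 3.0)


def amplitude_via_sqrt(p):
    """Algebraically the same quantity: sqrt(-p / 3), valid for p < 0."""
    return math.sqrt(-p / 3.0)
```

The two amplitude forms agree because sqrt(-p³/27) is (-p/3) raised to the power 3/2, so its cube root is sqrt(-p/3); the standard trigonometric formula then scales this by 2.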

* vectorscale: split cell rasterizer into single-AA + multi-AA passes

Replaces the monolithic cell-rasterizer.slang with two passes that share

the same algorithm but separate the AA work for occupancy on register-

constrained GPUs.

1. cell-rasterizer-single-aa.slang — tracks one best hit + the second-

   best hit's distance² (no full 2nd hit data). Resolves color, applies

   single-curve AA on the resolved hit. Writes RGB = AA color and

   A = sentinel (1.0 if 2nd hit is within aa_threshold so multi-curve

   AA could fire, 0.0 otherwise). Hit struct is slim (5 scalars: d2, t,

   cp_idx, prev_ci, next_ci) — geometry refetched via read_packed_cp at

   consumer sites (texture cache hits ~100% since test_one_cp just read

   the same texels).

2. cell-rasterizer-multi-aa.slang — reads SingleAA. If A < 0.5, passes

   RGB through unchanged (most pixels). Otherwise redoes find_hits

   (top-3) and runs wedge AA + dual-curve AA gates as in the original

   monolithic rasterizer, falling back to SingleAA's RGB if neither

   fires (single-curve AA already applied). pos/neg colors are scoped

   to each AA branch via out-params on resolve_hit instead of struct

   fields, keeping them out of the cross-branch live set.

Two presets:

- vectorscale.slangp: chains single-aa → multi-aa. Output is equivalent

  to the original monolithic rasterizer; most pixels take the cheap

  early-exit path on pass 2.

- vectorscale-single-aa.slangp: single-aa pass alone. Faster on register-

  constrained GPUs but jaggy at junctions and dual-curve crossings.

The sentinel is purely an inter-pass signal — the standalone single-aa

preset writes it to viewport alpha where display ignores it.

Measured on Apple Silicon: monolithic was 254 VGPRs (1/8 occupancy with

240 bytes spill); single-aa pass alone is ~120 VGPRs (clears the 128

threshold for ~30% occupancy, ~3x faster end-to-end). The chained

two-pass setup matches monolithic output with the early-exit speedup.,