
common : fix state save in common_prompt_batch_decode #23468

Open
danbev wants to merge 3 commits into ggml-org:master from danbev:save-session-llama-completion


@danbev
Member

@danbev danbev commented May 21, 2026

Overview

This commit addresses a bug in common_prompt_batch_decode that affects the session state store/restore in completion.cpp and save-load-state.cpp.

Additional information

The motivation is that the code currently saves only n-1 tokens in both session_tokens and the KV cache. When the session is later loaded and the prompt matches, the last saved token (token n-1 of the prompt) is replayed into the next position, so a token that is already in the cache is decoded again at the wrong position.

The fix is to store all n tokens in session_tokens, while the memory state only reflects n-1 processed tokens as the saving happens before the last token is decoded in common_prompt_batch_decode.
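
The off-by-one described above can be modelled with a small toy sketch. It is only an illustration, not the code in this PR: `toy_memory`, `toy_session`, `save_before_last`, `restore_and_replay` and `matches_prompt` are hypothetical names, the "memory" is just a list of (position, token) pairs rather than a real KV cache or recurrent state, and the prompt is assumed to have at least two tokens.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using token = int;

// Toy stand-in for the model memory: one (position, token) entry per decoded token.
using toy_memory = std::vector<std::pair<std::size_t, token>>;

// Hypothetical container for what gets written to the session file.
struct toy_session {
    std::vector<token> tokens; // plays the role of session_tokens
    toy_memory         memory; // plays the role of the saved memory state
};

// Save just before the last prompt token is decoded, so memory holds n-1 entries.
// store_all_tokens == false models the old behavior (n-1 tokens saved),
// store_all_tokens == true models the fix (all n tokens saved).
// Assumes prompt.size() >= 2.
toy_session save_before_last(const std::vector<token> & prompt, bool store_all_tokens) {
    toy_session s;
    for (std::size_t i = 0; i + 1 < prompt.size(); ++i) {
        s.memory.emplace_back(i, prompt[i]);
    }
    const std::size_t n_saved = store_all_tokens ? prompt.size() : prompt.size() - 1;
    s.tokens.assign(prompt.begin(), prompt.begin() + n_saved);
    return s;
}

// On restore, replay the last saved token at the next free position.
toy_memory restore_and_replay(const toy_session & s) {
    toy_memory mem = s.memory;
    mem.emplace_back(mem.size(), s.tokens.back());
    return mem;
}

// True if the memory holds exactly prompt[i] at position i for every i.
bool matches_prompt(const toy_memory & mem, const std::vector<token> & prompt) {
    if (mem.size() != prompt.size()) {
        return false;
    }
    for (std::size_t i = 0; i < prompt.size(); ++i) {
        if (mem[i].first != i || mem[i].second != prompt[i]) {
            return false;
        }
    }
    return true;
}
```

In this model, saving n-1 tokens makes the replay put the second-to-last prompt token at the last position, while saving all n tokens makes the replay reproduce the prompt exactly.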

I ran both completion.cpp and save-load-state.cpp with a transformer, a recurrent, and a hybrid model.

Requirements

Resolves: #23400

@fairydreaming
Collaborator

I tested it on my toy Llama-3.1 example from the bug report and on a longer prompt with DeepSeek V3.2, and it looks good: the model output during the last prompt token replay now matches the model output when the memory state was written.

I see one more problem, though: state is not saved at all if I have a long prompt and don't pass the --prompt-cache-all option. I'm going to investigate it now; I think it may also be related to the same PR, #18862.

@fairydreaming
Collaborator

fairydreaming commented May 21, 2026

@danbev I printed some debug messages during processing of a long prompt:

(base) phm@epyc:~/projects/llama.cpp-deepseek-v32-minimal/build-cuda$ ./bin/llama-completion -m ../models/DeepSeek-V3.2-Q8_0.gguf -f ~/projects/prompts/prompt-deepseek-3.2-long.txt --no-warmup --temp 0.01 --prompt-cache prompt-deepseek-3.2-long.cache -fa 1 -ngl 99 -cmoe -fit off --no-op-offload
0.00.199.964 I llama_completion: llama backend init
0.00.199.972 I llama_completion: load the model and apply lora adapter, if any
0.00.470.820 W llama_model_loader: tensor overrides to CPU are used with mmap enabled - consider using --no-mmap for better performance
0.00.478.461 W model has unused tensor blk.61.attn_norm.weight (size = 28672 bytes) -- ignoring
0.00.478.466 W model has unused tensor blk.61.attn_q_a_norm.weight (size = 6144 bytes) -- ignoring
0.00.478.468 W model has unused tensor blk.61.attn_kv_a_norm.weight (size = 2048 bytes) -- ignoring
0.00.478.470 W model has unused tensor blk.61.attn_q_a.weight (size = 11698176 bytes) -- ignoring
0.00.478.472 W model has unused tensor blk.61.attn_q_b.weight (size = 40108032 bytes) -- ignoring
0.00.478.474 W model has unused tensor blk.61.attn_kv_a_mqa.weight (size = 4386816 bytes) -- ignoring
0.00.478.476 W model has unused tensor blk.61.attn_k_b.weight (size = 8912896 bytes) -- ignoring
0.00.478.478 W model has unused tensor blk.61.attn_v_b.weight (size = 8912896 bytes) -- ignoring
0.00.478.480 W model has unused tensor blk.61.attn_output.weight (size = 124780544 bytes) -- ignoring
0.00.478.483 W model has unused tensor blk.61.ffn_norm.weight (size = 28672 bytes) -- ignoring
0.00.478.485 W model has unused tensor blk.61.indexer.k_norm.weight (size = 512 bytes) -- ignoring
0.00.478.487 W model has unused tensor blk.61.indexer.k_norm.bias (size = 512 bytes) -- ignoring
0.00.478.489 W model has unused tensor blk.61.indexer.proj.weight (size = 1835008 bytes) -- ignoring
0.00.478.491 W model has unused tensor blk.61.indexer.attn_k.weight (size = 974848 bytes) -- ignoring
0.00.478.493 W model has unused tensor blk.61.indexer.attn_q_b.weight (size = 13369344 bytes) -- ignoring
0.00.478.496 W model has unused tensor blk.61.ffn_gate_inp.weight (size = 7340032 bytes) -- ignoring
0.00.478.505 W model has unused tensor blk.61.ffn_gate_exps.weight (size = 3992977408 bytes) -- ignoring
0.00.478.507 W model has unused tensor blk.61.ffn_down_exps.weight (size = 3992977408 bytes) -- ignoring
0.00.478.510 W model has unused tensor blk.61.ffn_up_exps.weight (size = 3992977408 bytes) -- ignoring
0.00.478.512 W model has unused tensor blk.61.ffn_gate_shexp.weight (size = 15597568 bytes) -- ignoring
0.00.478.514 W model has unused tensor blk.61.ffn_down_shexp.weight (size = 15597568 bytes) -- ignoring
0.00.478.516 W model has unused tensor blk.61.ffn_up_shexp.weight (size = 15597568 bytes) -- ignoring
0.00.478.518 W model has unused tensor blk.61.nextn.eh_proj.weight (size = 109182976 bytes) -- ignoring
0.00.478.521 W model has unused tensor blk.61.nextn.enorm.weight (size = 28672 bytes) -- ignoring
0.00.478.524 W model has unused tensor blk.61.nextn.hnorm.weight (size = 28672 bytes) -- ignoring
0.00.478.527 W model has unused tensor blk.61.nextn.embed_tokens.weight (size = 984596480 bytes) -- ignoring
0.00.478.529 W model has unused tensor blk.61.nextn.shared_head_head.weight (size = 984596480 bytes) -- ignoring
0.00.478.532 W model has unused tensor blk.61.nextn.shared_head_norm.weight (size = 28672 bytes) -- ignoring
0.42.549.372 W llama_context: setting new yarn_attn_factor = 1.0000 (mscale == 1.0, mscale_all_dim = 1.0)
0.42.880.180 I llama_completion: llama threadpool init, n_threads = 32
0.42.880.202 I 
0.42.880.305 I system_info: n_threads = 32 (n_threads_batch = 32) / 64 | CUDA : ARCHS = 1200 | USE_GRAPHS = 1 | PEER_MAX_BATCH_SIZE = 128 | BLACKWELL_NATIVE_FP4 = 1 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | AVX512_BF16 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 | 
0.42.880.305 I 
0.42.880.306 I llama_completion: attempting to load saved session from 'prompt-deepseek-3.2-long.cache'
0.42.880.409 I llama_completion: session file does not exist, will create.
session_do_save 1 = !path_session.empty() 1 && n_match < embd_inp.size() 1 && !params.prompt_cache_ro 1
0.42.892.098 I sampler seed: 4054808884
0.42.892.107 I sampler params: 
	repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
	dry_multiplier = 0.000, dry_base = 1.750, dry_allowed_length = 2, dry_penalty_last_n = -1
	top_k = 40, top_p = 0.950, min_p = 0.050, xtc_probability = 0.000, xtc_threshold = 0.100, typical_p = 1.000, top_n_sigma = -1.000, temp = 0.010
	mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000, adaptive_target = -1.000, adaptive_decay = 0.900
0.42.892.113 I sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> top-p -> min-p -> ?xtc -> temp-ext -> dist 
0.42.892.114 I generate: n_ctx = 163840, n_batch = 2048, n_predict = -1, n_keep = 1
0.42.892.114 I 
Summarize the text below. Be concise.
---
What the white whale was to Ahab, has been hinted; what, at times, he was to me, as yet remains unsaid.

Aside from those more obvious considerations touching Moby Dick, which could not but occasionally awaken in any man’s soul some alarm, there was another thought, or rather vague, namis_last_batch 0 = (n_consumed 2048 >= embd_inp.size() 4713
save_now 0 = session_do_save 1 && is_last_batch 0
eless horror concerning him, which at times by its intensity completely overpowered all the rest; and yet so mystical and well nigh ineffable was it, that I almost despair of putting it in a comprehensible form. It was the whiteness of the whale that above all things appalled me. But how can I hope to explain myself here; and yet, in some dim, random way, explain myself I must, else all these chapters might be naught.

Though in many natural objects, whiteness refiningly enhances beauty, as if imparting some special virtue of its own, as in marbles, japonicas, and pearls; and though various nations have in some way recognised a certain royal preeminence in this hue; even the barbaric, grand old kings of Pegu placing the title “Lord of the White Elephants” above all their other magniloquent ascriptions of dominion; and the modern kings of Siam unfurling the same snow-white quadruped in the royal standard; and the Hanoverian flag bearing the one figure of a snow-white charger; and the great Austrian Empire, Cæsarian, heir to overlording Rome, having for the imperial colour the same imperial hue; and though this pre-eminence in it applies to the human race itself, giving the white man ideal mastership over every dusky tribe; and though, besides, all this, whiteness has been even made significant of gladness, for among the Romans a white stone marked a joyful day; and though in other mortal sympathies and symbolizings, this same hue is made the emblem of many touching, noble things—the innocence of brides, the benignity of age; though among the Red Men of America the giving of the white belt of wampum was the deepest pledge of honor; though in many climes, whiteness typifies the majesty of Justice in the ermine of the Judge, and contributes to the daily state of kings and queens drawn by milk-white steeds; though even in the higher mysteries of the most august religions it has been made the symbol of the divine spotlessness and power; by the Persian fire worshippers, the white forked flame being held the holiest on the altar; and in the Greek mythologies, Great Jove himself being made incarnate in a snow-white bull; and though to the noble Iroquois, the midwinter sacrifice of the sacred White Dog was by far the holiest festival of their theology, that spotless, faithful creature being held the purest envoy they could send to the Great Spirit with the annual tidings of their own 
fidelity; and though directly from the Latin word for white, all Christian priests derive the name of one part of their sacred vesture, the alb or tunic, worn beneath the cassock; and though among the holy pomps of the Romish faith, white is specially employed in the celebration of the Passion of our Lord; though in the Vision of St. John, white robes are given to the redeemed, and the four-and-twenty elders stand clothed in white before the great white throne, and the Holy One that sitteth there white like wool; yet for all these accumulated associations, with whatever is sweet, and honorable, and sublime, there yet lurks an elusive something in the innermost idea of this hue, which strikes more of panic to the soul than that redness which affrights in blood.

This elusive quality it is, which causes the thought of whiteness, when divorced from more kindly associations, and coupled with any object terrible in itself, to heighten that terror to the furthest bounds. Witness the white bear of the poles, and the white shark of the tropics; what but their smooth, flaky whiteness makes them the transcendent horrors they are? That ghastly whiteness it is which imparts such an abhorrent mildness, even more loathsome than terrific, to the dumb gloating of their aspect. So that not the fierce-fanged tiger in his heraldic coat can so stagger courage as the white-shrouded bear or shark.*

*With reference to the Polar bear, it may possibly be urged by him who would fain go still deeper into this matter, that it is not the whiteness, separately regarded, which heightens the intolerable hideousness of that brute; for, analysed, that heightened hideousness, it might be said, only rises from the circumstance, that the irresponsible ferociousness of the creature stands invested in the fleece of celestial innocence and love; and hence, by bringing together two such opposite emotions in our minds, the Polar bear frightens us with so unnatural a contrast. But even assuming all this to be true; yet, were it not for the whiteness, you would not have that intensified terror.

As for the white shark, the white gliding ghostliness of repose in that creature, when beheld in his ordinary moods, strangely tallies with the same quality in the Polar quadruped. This peculiarity is most vividly hit by the French in the name they bestow upon that fish. The Romish mass for the dead begins with “Requiem eternam” (eternal rest), whence Requiem denominating the mass itself, and any other funeral music. Now, in allusion to the white, silent stillness of death in this shark, and the mild deadliness of his habits, the French call him Requin.

Bethink thee of the albatross, whence come those clouds of spiritual wonderment and pale dread, in which that white phantom sails in all imaginations? Not Coleridge first threw that spell; but God’s great, unflattering laureate, Nature.*

*I remember the first albatross I ever saw. It was during a prolonged gale, in waters hard upon the Antarctic seas. From my forenoon watch below, I ascended to the overclouded deck; and there, dashed upon the main hatches, I saw a regal, feathery thing of unspotted whiteness, and with a hooked, Roman bill sublime. At intervals, it arched forth its vast archangel wings, as if to embrace some holy ark. Wondrous flutterings and throbbings shook it. Though bodily unharmed, it uttered cries, as some king’s ghost in supernatural distress. Through its inexpressible, strange eyes, methought I peeped to secrets which took hold of God. As Abraham before the angels, I bowed myself; the white thing was so white, its wings so wide, and in those for ever exiled waters, I had lost the miserable warping memories of traditions and of towns. Long I gazed at that prodigy of plumage. I cannot tell, can only hint, the things that darted through me then. But at last I awoke; and turning, asked a sailor what bird was this. A goney, he replied. Goney! never had heard that name before; is it conceivable that this glorious thing is utterly unknown to men ashore! never! But some time after, I learned that goney was some seaman’s name for albatross. So that by no possibility could Coleridge’s wild Rhyme have had aught to do with those mystical impressions which were mine, when I saw that bird upon our deck. For neither had I then read the Rhyme, nor knew the bird to be an albatross. Yet, in saying this, I do but indirectly burnish a little brighter the noble merit of the poem and the poet.

I assert, then, that in the wondrous bodily whiteness of the bird chiefly lurks the secret of the spell; a truth the more evinced in this, that by a solecism of terms there are birds called grey albatrosses; and these I have frequently seen, but never with such emotions as when I beheld the Antarctic fowl.

But how had the mystic thing been caught? Whisper it not, and I will tell; with a treacherous hook and line, as the fowl floated on the sea. At last the Captain made a postman of it; tying a lettered, leathern tally round its neck, with the ship’s time and place; and then letting it escape. But I doubt not, that leathern tally, meant for man, was taken off in Heaven, when the white fowl flew to join the wing-folding, the invoking, and adoring cherubim!

Most famous in our Western annals and Indian traditions is that of the White Steed of the Prairies; a magnificent milk-white charger, large-eyed, small-headed, bluff-chested, and with the dignity of a thousand monarchs in his lofty, overscorning carriage. He was the elected Xerxes of vast herds of wild horses, whose pastures in those days were only fenced by the Rocky Mountains and the Alleghanies. At their flaming head he westward trooped it like that chosen star which every evening leads on the hosts of light. The flashing cascade of his mane, the curving comet of his tail, invested him with housings more resplendent than gold and silver-beaters could have furnished him. A most imperial and archangelical apparition of that unfallen, western world, which to the eyes of the old trappers and hunters revived the glories of those primeval times when Adam walked majestic as a god, bluff-browed and fearless as this mighty steed. Whether marching amid his aides and marshals in the van of countless cohorts that endlessly streamed it over the plains, like an Ohio; or whether with his circumambient subjects browsing all around at the horizon, the White Steed gallopingly reviewed them with warm nostrils reddening through his cool milkiness; in whatever aspect he presented himself, always to the bravest Indians he was the object of trembling reverence and awe. Nor can it be questioned from what stands on legendary record of this noble horse, that it was his spiritual whiteness chiefly, which so clothed him with divineness; and that this divineness had that in it which, though commanding worship, at the same time enforced a certain nameless terror.

But there are other instances where this whiteness loses all that accessory and strange glory which invests it in the White Steed and Albatross.

What is it that in the Albino man so peculiarly repels and often shocks the eye, as that sometimes he is loathed by his own kith and kin! It is that whiteness which invests him, a thing expressed by the name he bears. The Albino is as well made as otheris_last_batch 0 = (n_consumed 4096 >= embd_inp.size() 4713
save_now 0 = session_do_save 0 && is_last_batch 0
 men—has no substantive deformity—and yet this mere aspect of all-pervading whiteness makes him more strangely hideous than the ugliest abortion. Why should this be so?

Nor, in quite other aspects, does Nature in her least palpable but not the less malicious agencies, fail to enlist among her forces this crowning attribute of the terrible. From its snowy aspect, the gauntleted ghost of the Southern Seas has been denominated the White Squall. Nor, in some historic instances, has the art of human malice omitted so potent an auxiliary. How wildly it heightens the effect of that passage in Froissart, when, masked in the snowy symbol of their faction, the desperate White Hoods of Ghent murder their bailiff in the market-place!

Nor, in some things, does the common, hereditary experience of all mankind fail to bear witness to the supernaturalism of this hue. It cannot well be doubted, that the one visible quality in the aspect of the dead which most appals the gazer, is the marble pallor lingering there; as if indeed that pallor were as much like the badge of consternation in the other world, as of mortal trepidation here. And from that pallor of the dead, we borrow the expressive hue of the shroud in which we wrap them. Nor even in our superstitions do we fail to throw the same snowy mantle round our phantoms; all ghosts rising in a milk-white fog—Yea, while these terrors seize us, let us add, that even the king of terrors, when personified by the evangelist, rides on his pallid horse.

Therefore, in his other moods, symbolize whatever grand or gracious thing he will by whiteness, no man can deny that in its profoundest idealized significance it calls up a peculiar apparition to the soul.

But though without dissent this point be fixed, how is mortal man to account for it? To analyse it, would seem impossible. Can we, then, by the citation of some of those instances wherein this thing of whiteness—though for the time either wholly or in great part stripped of all direct associations calculated to impart to it aught fearful, but nevertheless, is found to exert over us the same sorcery, however modified;—can we thus hope to light upon some chance clue to conduct us to the hidden cause we seek?

Let us try. But in a matter like this, subtlety appeals to subtlety, and without imagination no man can follow another into these halls. And though, doubtless, some at least of the imaginative impressions about to be presented may have been shared by most men, yet few perhaps were entirely conscious of them at the time, and therefore may not be able to recall them now.

Why to the man of untutored ideality, who happens to be but loosely acquainted with the peculiar character of the day, does the bare mention of Whitsuntide marshal in the fancy such long, dreary, speechless processions of slow-pacing pilgrims, down-cast and hooded with new-fallen snow? Or, to the unread, unsophisticated Protestant of the Middle American States, why does the passing mention of a White Friar or a White Nun, evoke such an eyeless statue in the soul?

Or what is there apart from the traditions of dungeoned warriors and kings (which will not wholly account for it) that makes the White Tower of London tell so much more strongly on the imagination of an untravelled American, than those other storied structures, its neighbors—the Byward Tower, or even the Bloody? And those sublimer towers, the White Mountains of New Hampshire, whence, in peculiar moods, comes that gigantic ghostliness over the soul at the bare mention of that name, while the thought of Virginia’s Blue Ridge is full of a soft, dewy, distant dreaminess? Or why, irrespective of all latitudes and longitudes, does the name of the White Sea exert such a spectralness over the fancy, while that of the Yellow Sea lulls us with mortal thoughts of long lacquered mild afternoons on the waves, followed by the gaudiest and yet sleepiest of sunsets? Or, to choose a wholly unsubstantial instance, purely addressed to the fancy, why, in reading the old fairy tales of Central Europe, does “the tall pale man” of the Hartz forests, whose changeless pallor unrustlingly glides through the green of the groves—why is this phantom more terrible than all the whooping imps of the Blocksburg?

Nor is it, altogether, the remembrance of her cathedral-toppling earthquakes; nor the stampedoes of her frantic seas; nor the tearlessness of arid skies that never rain; nor the sight of her wide field of leaning spires, wrenched cope-stones, and crosses all adroop (like canted yards of anchored fleets); and her suburban avenues of house-walls lying over upon each other, as a tossed pack of cards;—it is not these things alone which make tearless Lima, the strangest, saddest city thou can’st see. For Lima has taken the white veil; and there is a higher horror in this whiteness of her woe. Old as Pizarro, this whiteness keeps her ruins for ever new; admits not the cheerful greenness of complete decay; spreads over her broken ramparts the rigid pallor of an apoplexy that fixes its own distortions.

I know that, to the common apprehension, this phenomenon of whiteness is not confessed to be the prime agent in exaggerating the terror of objects otherwise terrible; nor to the unimaginative mind is there aught of terror in those appearances whose awfulness to another mind almost solely consists in this one phenomenon, especially when exhibited under any form at all approaching to muteness or universality. What I mean by these two statements may perhaps be respectively elucidated by the following examples.

First: The mariner, when drawing nigh the coasts of foreign lands, if by night he hear the roar of breakers, starts to vigilance, and feels just enough of trepidation to sharpen all his faculties; but under precisely similar circumstances, let him be called from his hammock to view his ship sailing through a midnight sea of milky whiteness—as if from encircling headlands shoals of combed white bears were swimming round him, then he feels a silent, superstitious dread; the shrouded phantom of the whitened waters is horrible to him as a real ghost; in vain the lead assures him he is still off soundings; heart and helm they both go down; he never rests till blue water is under him again. Yet where is the mariner who will tell thee, “Sir, it was not so much the fear of striking hidden rocks, as the fear of that hideous whiteness that so stirred me?”

Second: To the native Indian of Peru, the continual sight of the snow-howdahed Andes conveys naught of dread, except, perhaps, in the mere fancying of the eternal frosted desolateness reigning at such vast altitudes, and the natural conceit of what a fearfulness it would be to lose oneself in such inhuman solitudes. Much the same is it with the backwoodsman of the West, who with comparative indifference views an unbounded prairie sheeted with driven snow, no shadow of tree or twig to break the fixed trance of whiteness. Not so the sailor, beholding the scenery of the Antarctic seas; where at times, by some infernal trick of legerdemain in the powers of frost and air, he, shivering and half shipwrecked, instead of rainbows speaking hope and solace to his misery, views what seems a boundless churchyard grinning upon him with its lean ice monuments and splintered crosses.

But thou sayest, methinks that white-lead chapter about whiteness is but a white flag hung out from a craven soul; thou surrenderest to a hypo, Ishmael.

Tell me, why this strong young colt, foaled in some peaceful valley of Vermont, far removed from all beasts of prey—why is it that upon the sunniest day, if you but shake a fresh buffalo robe behind him, so that he cannot even see it, but only smells its wild animal muskiness—why will he start, snort, and with bursting eyes paw the ground in phrensies of affright? There is no remembrance in him of any gorings of wild creatures in his green northern home, so that the strange muskiness he smells cannot recall to him anything associated with the experience of former perils; for what knows he, this New England colt, of the black bisons of distant Oregon?

No: but here thou beholdest even in a dumb brute, the instinct of the knowledge of the demonism in the world. Though thousands of miles from Oregon, still when he smells that savage musk, the rending, goring bison herds are as present as to the deserted wild foal of the prairies, which this instant they may be trampling into dust.

Thus, then, the muffled rollings of a milky sea; the bleak rustlings of the festooned frosts of mountains; the desolate shiftings of the windrowed snows of prairies; all these, to Ishmael, are as the shaking of that buffalo robe to the frightened colt!

Though neither knows where lie the nameless things of which the mystic sign gives forth such hints; yet with me, as with the colt, somewhere those things must exist. Though in many of its aspects this visible world seems formed in love, the invisible spheres were formed in fright.

But not yet have we solved the incantation of this whiteness, and learned why it appeals with such power to the soul; and more strange and far more portentous—why, as we have seen, it is at once the most meaning symbol of spiritual things, nay, the very veil of the Christian’s Deity; and yet should be as it is, the intensifying agent in things the most appalling to mankind.

Is it that by its indefiniteness it shadows forth the heartless voids and immensities of the universe, and thus stabs us from behind with the thought of annihilation, when beholding the white depths of the milky way? Or is it, that as in essence whiteness is not so much a colour as the visible absence of colour; and at the same time the concrete of all colours; is it for these reasons that there is such a dumb blankness, full of meaning, in a wide landscape of snows—a colourless, all-colour of atheism from which we shrink? And when we consider that other theory of the natural philosophers, that all other earthly hues—every stately or lovely emblazoning—the sweet tinges of sunset skies and woods; yea, and the gilded velvets of butterflies, and the butterfly cheeks of young girls; all these are but subtile deceits, not actually inherent in substances, but only laid on from without; so that all deified Nature absolutely paints like the harlot, whose allurements cover nothing but the charnel-house within; and when we proceed further, and consider that the mystical cosmetic which produces every one of her hues, the great principle of light, for ever remains white or colorless in itself, and if operating without medium upon matter, would touch all objects, even tulips and roses, with its own blank tinge—pondering all this, the palsied universe lies before us a leper; and like wilful travellers in Lapland, who refuse to wear coloured and colouring glasses upon their eyes, so the wretched infidel gazes himself blind at the monumental white shroud that wraps all the prospect around him. And of all these things the Albino whale was the symbol. Wonder ye then at the fiery hunt?
---
<think>is_last_batch 1 = (n_consumed 4713 >= embd_inp.size() 4713
save_now 0 = session_do_save 0 && is_last_batch 1

It looks like session_do_save gets switched to false during prompt processing, so when the last batch is processed save_now is false and no state is saved.
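
A minimal sketch of the behavior seen in the debug output above, assuming the flag is dropped after the first batch (the log only shows that it is 1 at the first batch and 0 afterwards; where and why it is cleared in the real code is not shown here). `save_triggered` and `clear_flag_early` are hypothetical names, and `n_batch` is assumed to be greater than zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Returns true if a save would be triggered while consuming n_prompt tokens
// in batches of at most n_batch tokens.
// clear_flag_early == true models the observed behavior: session_do_save is
// true for the first batch and false for every later batch.
// Assumes n_batch > 0.
bool save_triggered(std::size_t n_prompt, std::size_t n_batch, bool clear_flag_early) {
    bool session_do_save = true;
    std::size_t n_consumed = 0;

    while (n_consumed < n_prompt) {
        n_consumed = std::min(n_consumed + n_batch, n_prompt);

        const bool is_last_batch = n_consumed >= n_prompt;
        const bool save_now      = session_do_save && is_last_batch;
        if (save_now) {
            return true;
        }

        if (clear_flag_early) {
            session_do_save = false; // assumption: flag is lost after the first batch
        }
    }
    return false;
}
```

With the numbers from the log (4713 prompt tokens, batches of 2048), the flag is already false by the time `is_last_batch` becomes true, so nothing is saved. A prompt that fits in a single batch is still saved in this model, which would be consistent with the problem only showing up for long prompts.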

danbev and others added 2 commits May 21, 2026 14:09
This commit addresses a bug in common_prompt_batch_decode that affects
the session state store/restore in completion.cpp and
save-load-state.cpp.

The motivation for this is that currently the code is saving n-1 tokens
in both the session_tokens and in the KV cache. Then when loading the
session tokens, and if the prompt matches, it would replay the last
saved token (n-1) into the next position, effectively replaying the
same token in the wrong position.

The fix is to store all n tokens in session_tokens, while the memory
state only reflects n-1 processed tokens as the saving happens before
the last token is decoded in common_prompt_batch_decode.

I ran both completion.cpp and save-load-state.cpp with a transformer, a
recurrent, and a hybrid model.

Resolves: ggml-org#23400

Co-authored-by: fairydreaming <166155368+fairydreaming@users.noreply.github.com>
@danbev danbev force-pushed the save-session-llama-completion branch from b698524 to 411c926 Compare May 21, 2026 13:45
@danbev
Member Author

danbev commented May 21, 2026

@fairydreaming Thanks for reporting and testing this!
I've pushed an update which I still need to clean up, but I'm wondering if you could run the same test as you did above for the long prompt?

@fairydreaming
Collaborator

> @fairydreaming Thanks for reporting and testing this! I've pushed an update which I need to clean up, but wondering if you could run the same test as you did above for the long prompt?

I confirm that it works now. First, in the writer I see:

2.43.037.337 I saved session before last token to prompt-deepseek-3.2-long.cache, n_tokens = 4713

Then, in the reader:

0.42.744.462 I llama_completion: attempting to load saved session from 'prompt-deepseek-3.2-long.cache'
0.42.815.857 I llama_completion: loaded a session with prompt size of 4713 tokens
0.42.827.766 I llama_completion: session file has exact match for prompt!
0.43.111.307 I llama_completion: replayed last token from session
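
The "exact match" reported by the reader can be thought of as a prefix comparison between the saved tokens and the new prompt. The sketch below is an assumption about the general idea, not the code in completion.cpp: `count_matching_prefix` and `is_exact_match` are hypothetical helpers, and "exact match" is assumed to mean that the saved tokens cover the whole prompt.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using token = int;

// Number of leading tokens shared by the saved session and the new prompt.
std::size_t count_matching_prefix(const std::vector<token> & session_tokens,
                                  const std::vector<token> & prompt) {
    std::size_t n_match = 0;
    while (n_match < session_tokens.size() &&
           n_match < prompt.size() &&
           session_tokens[n_match] == prompt[n_match]) {
        ++n_match;
    }
    return n_match;
}

// Assumed definition: every prompt token is matched by the saved tokens.
bool is_exact_match(const std::vector<token> & session_tokens,
                    const std::vector<token> & prompt) {
    return !prompt.empty() &&
           count_matching_prefix(session_tokens, prompt) == prompt.size();
}
```

In the run above, the writer saved 4713 tokens and the reader loaded a 4713-token session for the same prompt, which is the case where a check like this would report a full match.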

@github-actions github-actions Bot added the testing (Everything test related) and examples labels May 21, 2026
@danbev
Member Author

danbev commented May 21, 2026

@fairydreaming Thanks!

@danbev danbev marked this pull request as ready for review May 22, 2026 08:23
@danbev danbev requested review from a team and ggerganov as code owners May 22, 2026 08:23

Labels

examples, testing (Everything test related)

Development

Successfully merging this pull request may close these issues.

Misc. bug: Last token in saved session state is replayed with wrong position by llama-completion

3 participants