Memory Management + Buffer Cache #4

zhjwpku · 2024-11-07T14:32:52Z

Workshop Title

Andy Pavlo, CMU Database Group, Memory Management + Buffer Cache

Date: November 2024

Resources

Video: https://www.youtube.com/watch?v=aoewwZwVmv4

zhjwpku · 2024-11-07T14:42:05Z

Quoted from Robert Haas:

some notes on terminology if you're trying to parse the Andy Pavlo talk:

Andy uses the term "page" to refer to one fixed-size chunk of a disk file. We more often call these "blocks". We do also use the word "page" to mean basically the same thing, but typically when we say "page" we're thinking about their contents and their internal structure, rather than the fact that they're stored in files. (See the typedefs for "Page" and "BlockNumber" in the source code.)
Andy uses the term "frame" to refer to a chunk of memory that can hold one page worth of data from disk. In PostgreSQL, we typically call these "buffers". (See the typedef for "Buffer" in the source code.)
What Andy calls a "latch" roughly corresponds to an LWLock in PostgreSQL.
What Andy calls a "page table," PostgreSQL calls the "buffer mapping table." (See buf_table.c.)
What Andy calls a "page directory" doesn't really exist in PostgreSQL. He describes a page directory as telling you where a certain page ID can be found on disk, but we set up the data directory in such a way that the operating system can essentially answer this question for us. (See GetRelationPath() in relpath.c.)

there were two things in the talk that i thought might be mistakes:

~23m Andy talks about transaction-safety, but AFAICS this has little connection to transactions, at least in PostgreSQL's architecture. i think the issue is WAL-safety: we can't choose a design that would allow the OS to flush dirty pages before flushing the WAL that covers those changes. else, it wouldn't be a write-ahead log
~44m Andy seems to say that PostgreSQL uses LRU-K and/or that we reload buffer access information from disk after a cold start. The former is not true; we use GCLOCK, not LRU-K. The part about reloading buffer access information is only true if you're using pg_prewarm's autoprewarm feature, but I don't think that was the result of the 2002 mailing list post Andy mentions. I may be misunderstanding what Andy's referring to here, but if I'm not, I think he's picked up some incorrect information.

zhjwpku · 2024-11-09T03:07:54Z

Two papers about GCLOCK (Generalized Clock):

zhjwpku · 2024-11-09T03:57:45Z

Frédéric: Andy talks about finding a trade-off between evicting a dirty page that is rather cold versus evicting a clean page that is rather hot. But if I understood correctly, postgres' implementation of GCLOCK doesn't even consider whether a page is dirty or not?

Ants: We have a background writer that tries to make sure that if a page is going to be evicted then it has been written out by the time someone needs it.

Robert Haas: the relevant logic here is in StrategyGetBuffer(). it doesn't care whether the buffer is dirty when deciding what to evict:

    }

    /* Nothing on the freelist, so run the "clock sweep" algorithm */
    trycounter = NBuffers;
    for (;;)
    {
        buf = GetBufferDescriptor(ClockSweepTick());

        /*
         * If the buffer is pinned or has a nonzero usage_count, we cannot use
         * it; decrement the usage_count (unless pinned) and keep scanning.
         */
        local_buf_state = LockBufHdr(buf);

        if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0)
        {
            if (BUF_STATE_GET_USAGECOUNT(local_buf_state) != 0)
            {
                local_buf_state -= BUF_USAGECOUNT_ONE;

                trycounter = NBuffers;
            }
            else
            {
                /* Found a usable buffer */

Junwang Zhao: I thought the usage count can be used for that? If a dirty page is cold, its usage count is low, and a hot clean page has a high usage count, so the dirty page will be selected as the victim.

zhjwpku · 2024-11-09T04:42:13Z

x4m: Any block may be in shared buffers, or buffers of some BufferAccessStrategy. When we are locking a buffer in shared buffers, how do we ensure it's not accessed somewhere by some strategy?

TomasV: It still has to be in the global hash table, mapping blocks to shared buffers.
Each block can be in only one buffer at a time, there can't be multiple copies. The access strategy is used to generate "candidates" if the block is not already present in shared buffers. It's not a "hard restriction" on which buffers may be used by the scan.

zhjwpku · 2024-11-09T09:03:42Z

Resize shared_buffers without a server restart:

Signed-off-by: Junwang Zhao <[email protected]>

zhjwpku added the "ongoing" label (This is the workshop for current month) · Nov 7, 2024

zhjwpku added the "archived" label (The workshop has finished, but you can still post your questions) and removed the "ongoing" label · Dec 7, 2024

zhjwpku added a commit that referenced this issue Dec 14, 2024

archive workshop #4

13f569f

Signed-off-by: Junwang Zhao <[email protected]>
