
Cool, good thinking, thanks heaps. I'll implement that in the morning. :)


Implemented in https://github.com/justinclift/docker-pgautoupgrade/commit/0....

I think I got everything right, and it passes the test harness, but please give it a look over if you have a few minutes. :)


This looks like it will use twice the disk.


Nah. The concept of verifying that the mkdirs worked is sound and also easy to do. They just need doing individually in the script, as doing both in the same spot would muck up the wildcard a bit later on.

Using better variable names ($OLD, $NEW) is a good idea too. It should cut down any potential typo risk as well. :)


If you prefer to work with the dirs under pgdata (well, it makes sense...) you can just build the list of files (maybe even write it to a file, maybe even write a batch file that would move them) and use it for moving the data from pgdata to old. This saves an unnecessary move.

Addendum: are you sure about "${NEW}"/* in this?

  444   mv -v "${NEW}"/* "${PGDATA}"


Yeah, this syntax looks a bit unwieldy:

    "${NEW}"/*

But it's specifically there to do wildcard expansion of the quoted string, and the shell interpreter is happy with it. I'm open to suggestions for improvements though. :)
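A minimal sketch of why that syntax works: quoting "${NEW}" protects any spaces in the path itself, while the unquoted /* still undergoes glob expansion, and each expanded result is a whole word. All paths below are hypothetical demo paths, not the script's actual values.

```shell
# Demo: quoted variable + unquoted glob, with a space in the path.
NEW="/tmp/glob demo/new"
PGDATA="/tmp/glob demo/pgdata"

rm -rf "/tmp/glob demo"            # start clean for the demo
mkdir -p "${NEW}" "${PGDATA}"
touch "${NEW}/a" "${NEW}/b"        # stand-in data files

# "${NEW}"/* expands to "/tmp/glob demo/new/a" and "/tmp/glob demo/new/b",
# each treated as a single word despite the space in the path.
mv -v "${NEW}"/* "${PGDATA}"
```

An unquoted $NEW/* would instead split on the space and fail, which is why the quoted-prefix form is the safe one.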

---

This is confusing to me:

    ... you can just make the list of files (maybe even write it to the file, maybe
    even write a batch file which would move them) and use it for moving the data
    from pgdata to old. This saves an unnecessary move.

I'm not understanding what you mean here.

I understand the "make a list of files" bit, but I'm not grokking why doing that is an improvement, and I'm not seeing where there's an unnecessary move that could be eliminated?

The pg_upgrade process is pretty much:

    1. Initialise a fresh data directory using the new PostgreSQL version
    2. Run pg_upgrade, pointing at both the old and new data directories
    3. Start the database using the new data directory

For the "automatic upgrade container" purposes, we need to do everything under a single mount point so the "--link" option to pg_upgrade is effective and uses hard links.

Thus the "move things into an 'old' directory" first, then the "move the converted 'new' data into place" bit afterwards.
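The mount-point constraint comes down to a filesystem fact: hard links cannot cross filesystems, so pg_upgrade --link only avoids copying when the old and new data directories share one. A quick demo with plain ln (hypothetical paths; uses GNU stat, as on Linux):

```shell
# Two directories on the same filesystem, standing in for old/new clusters.
DEMO="/tmp/hardlink-demo"
rm -rf "${DEMO}"
mkdir -p "${DEMO}/old" "${DEMO}/new"
echo "data" > "${DEMO}/old/relfile"

# Same filesystem: the hard link succeeds and no data is copied.
ln "${DEMO}/old/relfile" "${DEMO}/new/relfile"

# Both names now point at the same inode.
stat -c '%h' "${DEMO}/new/relfile"   # link count: 2
```

If old/ and new/ were on different mounts, the ln call would fail with "Invalid cross-device link", and pg_upgrade would have to fall back to copying.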


> I'm not understanding what you're meaning here.

If this were PowerShell then I would just get the list of files in $PGDATA, create the folders, and then move the files in the list, i.e.

    $files = gci $PGDATA
    try {
        New-Item "$PGDATA/old" -ItemType Directory -ErrorAction Stop
        New-Item "$PGDATA/new" -ItemType Directory -ErrorAction Stop
    }
    catch {
        throw "Failed to create the necessary dirs"
    }
    try {
        $files | Move-Item -Destination "$PGDATA/old" -ErrorAction Stop
    }
    catch {
        # throw "Failed to move pg_data, your databases are now borked, good luck"
        gci "$PGDATA/old" | Move-Item -Destination $PGDATA
    }

It's way more streamlined: if creating the dirs fails (especially `new`), it fails before any data is moved. And you don't need `set +e` in this part.

I tried to replicate this in Linux and... it's a mess.

`find` includes the directory itself in the list, `ls -1` does the right thing but bash stores its output as a single string, and redirecting it to a file gets that file included in the list... I even tried xargs, but quickly abandoned the idea. Though if you can create the redirected output file somewhere other than $PGDATA (`/tmp` perhaps?) then the `ls -1` trick would work.
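For what it's worth, a bash array sidesteps most of that mess: snapshot the glob into an array *before* creating the directories, so the list never contains old/ and new/, and no temp file is needed. A sketch under hypothetical paths (bash-specific, since POSIX sh lacks arrays):

```shell
# Stand-in data directory with a couple of dummy cluster files.
PGDATA="/tmp/pgdata-demo"
rm -rf "${PGDATA}"
mkdir -p "${PGDATA}"
touch "${PGDATA}/base" "${PGDATA}/pg_wal"

# Snapshot the file list first -- the array holds whole pathnames,
# so spaces survive, unlike captured ls output.
files=("${PGDATA}"/*)

# Create both dirs before moving anything, mirroring the PowerShell flow.
mkdir "${PGDATA}/old" "${PGDATA}/new" \
    || { echo "Failed to create the necessary dirs" >&2; exit 1; }

# Move only the snapshotted entries; old/ and new/ aren't in the list.
mv "${files[@]}" "${PGDATA}/old"
```

This gives the same ordering guarantee as the PowerShell version: dir creation is verified before any data moves.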


> "Failed to move pg_data, your databases are now borked, good luck"

:D

---

Oh. Now I understand the purpose of grabbing a file list first. That would allow for creating both old + new dirs first, prior to any move attempt.

I'll think it over. I'm kind of on the fence about it at first though. Let's see what I reckon after sleeping on it.

Thanks for following up btw. :)



