Postgres (9.6) hangs on start after storage upgrade

The question:

I have been running Postgres on a Linode with the data directory mapped to an external volume. Yesterday Linode prompted me to upgrade my storage to NVMe so I did. Unfortunately, following that, Postgres is unable to start.

When I attempt to start the process it just hangs with no output. It is also impossible to stop at that point, which suggests it is in an “uninterruptible sleep” state.

I’ve started Postgres with debug enabled and it doesn’t output anything useful (as best I can tell):

2022-03-11 01:39:36 EST [1752-1] DEBUG:  postgres: PostmasterMain: initial environment dump:
2022-03-11 01:39:36 EST [1752-2] DEBUG:  -----------------------------------------
2022-03-11 01:39:36 EST [1752-3] DEBUG:     TERM=xterm-256color
2022-03-11 01:39:36 EST [1752-4] DEBUG:     LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
2022-03-11 01:39:36 EST [1752-5] DEBUG:     PATH=/home/user/bin:/home/user/.nvm/versions/node/v14.17.0/bin:/home/user/.cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
2022-03-11 01:39:36 EST [1752-6] DEBUG:     LANG=en_ZA.UTF-8
2022-03-11 01:39:36 EST [1752-7] DEBUG:     HOME=/home/user
2022-03-11 01:39:36 EST [1752-8] DEBUG:     MAIL=/var/mail/postgres
2022-03-11 01:39:36 EST [1752-9] DEBUG:     LOGNAME=postgres
2022-03-11 01:39:36 EST [1752-10] DEBUG:    USER=postgres
2022-03-11 01:39:36 EST [1752-11] DEBUG:    USERNAME=postgres
2022-03-11 01:39:36 EST [1752-12] DEBUG:    SHELL=/bin/bash
2022-03-11 01:39:36 EST [1752-13] DEBUG:    SUDO_COMMAND=/usr/lib/postgresql/9.6/bin/postgres -d 3 -D /mnt/project-backup/postgres/project/data -c config_file=/etc/postgresql/9.6/project_db/postgresql.conf
2022-03-11 01:39:36 EST [1752-14] DEBUG:    SUDO_USER=user
2022-03-11 01:39:36 EST [1752-15] DEBUG:    SUDO_UID=1000
2022-03-11 01:39:36 EST [1752-16] DEBUG:    SUDO_GID=1000
2022-03-11 01:39:36 EST [1752-17] DEBUG:    PGLOCALEDIR=/usr/share/locale
2022-03-11 01:39:36 EST [1752-18] DEBUG:    PGSYSCONFDIR=/etc/postgresql-common
2022-03-11 01:39:36 EST [1752-19] DEBUG:    LC_COLLATE=en_ZA.UTF-8
2022-03-11 01:39:36 EST [1752-20] DEBUG:    LC_CTYPE=en_ZA.UTF-8
2022-03-11 01:39:36 EST [1752-21] DEBUG:    LC_MESSAGES=en_ZA.UTF-8
2022-03-11 01:39:36 EST [1752-22] DEBUG:    LC_MONETARY=C
2022-03-11 01:39:36 EST [1752-23] DEBUG:    LC_NUMERIC=C
2022-03-11 01:39:36 EST [1752-24] DEBUG:    LC_TIME=C
2022-03-11 01:39:36 EST [1752-25] DEBUG:  -----------------------------------------

When I look in the process’s file descriptor folder I also don’t see anything obviously weird:

lrwx------ 1 postgres postgres 64 Mar 11 01:39 0 -> /dev/pts/0
lrwx------ 1 postgres postgres 64 Mar 11 01:39 1 -> /dev/pts/0
lrwx------ 1 postgres postgres 64 Mar 11 01:39 2 -> /dev/pts/0
lr-x------ 1 postgres postgres 64 Mar 11 01:39 3 -> /dev/urandom
lrwx------ 1 postgres postgres 64 Mar 11 01:39 4 -> /mnt/project-backup/postgres/project/data/postmaster.pid

The postmaster.pid file looks like this:

1752
/mnt/project-backup/postgres/project/data
1646980776
5437

Any idea what could be happening here and how I can fix it? If I can’t recover from the current situation, is there at least some other way to recover the data?

The Solutions:


Method 1

The root cause of this issue turned out to be an incompatibility between the OS and the attached volume. The entire directory / filesystem was broken in a way that caused IO to hang, so Postgres hanging on start was just a symptom of the larger issue.
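
One way to check whether a hang like this is at the IO level rather than inside Postgres is to look for processes stuck in uninterruptible sleep (state "D" in /proc/<pid>/stat on Linux). The following is only a minimal sketch of that check, not something taken from the original answer; the function names parse_state and uninterruptible_pids are made up for illustration, and it assumes a Linux system with /proc mounted.

```python
import os


def parse_state(stat_line):
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the LAST ')' before tokenizing.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def uninterruptible_pids(proc_root="/proc"):
    """List PIDs whose state is 'D' (uninterruptible sleep). Linux only."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                state = parse_state(f.read())
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # the process exited or is not readable; ignore it
        if state == "D":
            pids.append(int(entry))
    return sorted(pids)


if __name__ == "__main__":
    for pid in uninterruptible_pids():
        print(pid)
```

If the postmaster's PID (1752 in the question) stays in this list, and simple reads or writes elsewhere on the same mount also never return, the problem most likely sits with the volume or filesystem rather than with the Postgres data itself.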


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
