TLDR: export GTIFF_DIRECT_IO=YES
Only works with uncompressed GeoTIFFs,
however you can upload the GeoTIFF to PostGIS when it is uncompressed and
re-compress the file later.
The raster2pgsql
command is extremely slow with default options when it is
used to upload NODATA
enabled GeoTIFF files to PostGIS when using out-db
rasters. The slowness is caused by the NODATA
value check that is enabled by
default. The NODATA
check avoids adding tiles to the PostGIS where all pixel
values are NODATA
, therefore it is very useful to keep on, however the
simplest fix is just to turn it off with the -k
argument. Obviously, this
results to many useless tiles in the PostGIS table that are filled with only
NODATA
values.
A better workaround is to use the GDAL GTIFF_DIRECT_IO
or
GTIFF_VIRTUAL_MEM_IO
options. The options are documented
here, however there is no much
information about them, except that they only work with uncompressed TIFF
files.
This repository includes a test GeoTIFF file that can be used to measure the
performance improvement when setting the environment variable. The results are
computed on a Linux system. The version information for the raster2pgsql
is
RELEASE: 3.3.2 GDAL_VERSION=36 (POSTGIS_REVISION)
(GDAL version 3.6.3
).
To run the following tests, first uncompress the compressed tiff file from this repository:
gdal_translate compressed-cropped-with-nodata-small.tif cropped-with-nodata-small.tif -co COMPRESS=NONE -co BIGTIFF=YES
# Baseline
perf stat -r 10 bash -c 'raster2pgsql -a -t 100x100 -R -Y cropped-with-nodata-small.tif radarimages > /dev/null 2>&1'
Performance counter stats for 'bash -c raster2pgsql -a -t 100x100 -R -Y cropped-with-nodata-small.tif radarimages > /dev/null 2>&1' (10 runs):
12,086.80 msec task-clock:u # 9.950 CPUs utilized ( +- 9.62% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
1,076,170 page-faults:u # 158.469 K/sec ( +- 9.57% )
12,245,285,654 cycles:u # 1.803 GHz ( +- 9.63% )
9,220,877,548 instructions:u # 1.32 insn per cycle ( +- 9.57% )
2,125,160,960 branches:u # 312.936 M/sec ( +- 9.57% )
106,203,865 branch-misses:u # 9.09% of all branches ( +- 9.68% )
1.2147 +- 0.0255 seconds time elapsed ( +- 2.10% )
# When setting the env variable
perf stat -r 10 bash -c 'GTIFF_DIRECT_IO=YES raster2pgsql -a -t 100x100 -R -Y cropped-with-nodata-small.tif radarimages > /dev/null 2>&1'
Performance counter stats for 'bash -c GTIFF_DIRECT_IO=YES raster2pgsql -a -t 100x100 -R -Y cropped-with-nodata-small.tif radarimages > /dev/null 2>&1' (10 runs):
3,630.24 msec task-clock:u # 9.724 CPUs utilized ( +- 9.58% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
35,063 page-faults:u # 17.860 K/sec ( +- 9.58% )
5,746,675,315 cycles:u # 2.927 GHz ( +- 9.65% )
5,766,619,492 instructions:u # 1.86 insn per cycle ( +- 9.57% )
1,348,166,920 branches:u # 686.724 M/sec ( +- 9.57% )
58,917,479 branch-misses:u # 7.95% of all branches ( +- 9.71% )
0.3733 +- 0.0132 seconds time elapsed ( +- 3.54% )
From the "time elapsed" we calculate 1.2147 / 0.3733
-> > 300%
speed
increase. With larger file size the speed increase is even higher. For a 265M
GeoTIFF file the benchmark takes 26.480 +- 0.278 seconds
for baseline, and
2.709 +- 0.129 seconds
when GTIFF_DIRECT_IO=YES
is set.
The export GTIFF_VIRTUAL_MEM_IO=YES
has almost as good performance increase,
however in my benchmark GTIFF_DIRECT_IO
was slightly better.
PostGIS does not care that the GeoTIFF file compression is changed after the file metadata is already uploaded to the database. Therefore, the workaround is:
- Rename the original file e.g.
mv compressed-cropped-with-nodata-small.tif compressed-cropped-with-nodata-small.tif.orig
- Uncompress the compressed file
gdal_translate compressed-cropped-with-nodata-small.tif.orig compressed-cropped-with-nodata-small.tif -co COMPRESS=NONE -co BIGTIFF=YES
- Upload to PostGIS using the direct io option
GTIFF_DIRECT_IO=YES raster2pgsql -a -t 100x100 -R -Y compressed-cropped-with-nodata-small.tif radarimages | psql -q
- Replace the uncompressed file with the compressed file
mv compressed-cropped-with-nodata-small.tif.orig compressed-cropped-with-nodata-small.tif
To confirm that the result is exactly the same.
diff -s \
<(bash -c 'raster2pgsql -a -t 100x100 -R -Y cropped-with-nodata-small.tif radarimages') \
<(bash -c 'GTIFF_DIRECT_IO=YES raster2pgsql -a -t 100x100 -R -Y cropped-with-nodata-small.tif radarimages')
Files /proc/self/fd/11 and /proc/self/fd/12 are identical
To confirm that the NODATA
check actually does something we can disable it
with -k
for the baseline and check that the results differ.
diff -s \
<(bash -c 'raster2pgsql -k -a -t 100x100 -R -Y cropped-with-nodata-small.tif radarimages') \
<(bash -c 'GTIFF_DIRECT_IO=YES raster2pgsql -a -t 100x100 -R -Y cropped-with-nodata-small.tif radarimages')
Files /proc/self/fd/11 and /proc/self/fd/12 differ
With a quick benchmark, you get similar performance increase when setting
GTIFF_DIRECT_IO
when using in-db
rasters too.
perf stat -r 10 bash -c 'raster2pgsql -a -t 100x100 -Y cropped-with-nodata-small.tif radarimages > /dev/null 2>&1' [13:47:19]
Performance counter stats for 'bash -c raster2pgsql -a -t 100x100 -Y cropped-with-nodata-small.tif radarimages > /dev/null 2>&1' (10 runs):
6,237.50 msec task-clock:u # 9.932 CPUs utilized ( +- 9.44% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
113,180 page-faults:u # 32.974 K/sec ( +- 9.60% )
9,466,631,683 cycles:u # 2.758 GHz ( +- 9.50% )
10,319,739,898 instructions:u # 2.00 insn per cycle ( +- 9.57% )
1,994,234,552 branches:u # 581.006 M/sec ( +- 9.57% )
70,480,996 branch-misses:u # 6.43% of all branches ( +- 9.35% )
0.6280 +- 0.0105 seconds time elapsed ( +- 1.67% )
perf stat -r 10 bash -c 'GTIFF_DIRECT_IO=YES raster2pgsql -a -t 100x100 -Y cropped-with-nodata-small.tif radarimages > /dev/null 2>&1' [13:47:34]
Performance counter stats for 'bash -c GTIFF_DIRECT_IO=YES raster2pgsql -a -t 100x100 -Y cropped-with-nodata-small.tif radarimages > /dev/null 2>&1' (10 runs):
3,141.00 msec task-clock:u # 9.966 CPUs utilized ( +- 9.40% )
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
58,267 page-faults:u # 33.450 K/sec ( +- 9.58% )
5,249,417,091 cycles:u # 3.014 GHz ( +- 9.28% )
7,128,043,419 instructions:u # 2.41 insn per cycle ( +- 9.57% )
1,289,803,414 branches:u # 740.452 M/sec ( +- 9.57% )
41,614,856 branch-misses:u # 5.87% of all branches ( +- 8.65% )
0.31516 +- 0.00755 seconds time elapsed ( +- 2.40% )
Before finding the workaround with the GTIFF_DIRECT_IO=YES
option, I tried to
see if it would be possible to improve the performance of raster2pgsql
. This
repository includes flamegraph.svg
file for those that are interested. The
rt_band_check_is_nodata
calls the rt_band_get_pixel
function in a loop for
every pixel in the tile tanking the performance.