Hacker News new | past | comments | ask | show | jobs | submit login

To the authors:

You should try set track_io_timing = on before running the query. That will show you the actual amount of time spent on i/o. Additionally, if you are using RAID, and haven't tuned it yet, you should try increasing the effective_io_concurrency parameter.




This is fantastic. Thanks.

RE: effective_io_concurrency --- any idea how this should be tuned for SSD or PCIe flash? The manual only gives an example of how to tune for RAID (by number of drives).


My understanding is that the effective_io_concurrency represents the number of data drives that partcipate in the i/o system excluding the parity drives.

From my experience, for each SSD drive you can set it between 5-8 (if there are in raid0).

BTW in 9.6 the effective_io_concurrency [0] can be different for each table space (especially useful if you have multiple drives of a varying quality [ssd, spinning] in one server).

[0] https://www.postgresql.org/docs/9.6/static/runtime-config-re...


It's somewhat related to the number of data drives, but as you point out it's a bit more complicated in practice - it's not just the number of drives. For example due to parallelism built into SSDs (each SSD is actually a stack of smaller drives). For rotational devices it's complicated by TCQ/NCQ, optimizations that require a queue of commands for each device to actually work.

On modern SSD, values 5-8 per device are definitely on the low side. See for example this thread from 2014, on (then released S3500 SSD from Intel):

https://www.postgresql.org/message-id/CAHyXU0yiVvfQAnR9cyH%3...

That shows the performance increases up to effective_io_concurrency=256 (of course, in reality there may be multiple queries running on the system, sharing the I/O paths).


Very interesting.

I did some testing a while ago and didn't see much gain past 30 on 4 SSD disks. Should do another round of testing. Thank you for sharing.


It's quite difficult (or impossible) to get a single "right" value, as it depends both on the workload and hardware (say, whether you have SSDs attached directly or through a RAID controller).


There's an easy way to find out - measure.

I'm curious as to what the performance looks like when it's properly tuned.


Extremely dependent on hardware (and queries, as it only affects Bitmap Index Scans).




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: