Can a database confirm that a sequential read is really sequential on an SSD?

The question:

On an SSD, overwriting part of a data file means the SSD first has to erase the old data and then write the new data. This can change the physical layout of the data file on the SSD.

MySQL has some policies to optimize random I/O, e.g., read-ahead (prefetching). The read-ahead policy pre-reads the remaining x pages in an “extent” if the prior y pages were read sequentially. From MySQL's perspective, calling pread with consecutive offsets makes the reads sequential. But I doubt that this holds for an SSD. In other words, can pread guarantee only that the reads are logically sequential, not that they are physically sequential?
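To make the policy described above concrete, here is a minimal sketch of such a trigger. It is a toy model of the description in this question, not MySQL's actual implementation: the class name, the method name, and the default threshold are all hypothetical, and only the 64-page extent size comes from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

EXTENT_PAGES = 64  # pages per extent, as stated in the question


@dataclass
class ReadAheadTracker:
    """Toy read-ahead trigger. All names are hypothetical, not MySQL internals."""

    threshold: int = 8  # the "y" from the text; this default is arbitrary
    last_page: Optional[int] = None
    run_length: int = 0

    def on_read(self, page_no: int) -> List[int]:
        """Record a page read and return the page numbers to prefetch."""
        same_extent = (
            self.last_page is not None
            and page_no // EXTENT_PAGES == self.last_page // EXTENT_PAGES
        )
        if same_extent and page_no == self.last_page + 1:
            # The read continues a logically sequential run.
            self.run_length += 1
        else:
            # A jump or a new extent starts a fresh run.
            self.run_length = 1
        self.last_page = page_no

        if self.run_length == self.threshold:
            # Prefetch the remaining pages (the "x") of the current extent.
            extent_end = (page_no // EXTENT_PAGES + 1) * EXTENT_PAGES
            return list(range(page_no + 1, extent_end))
        return []
```

Note that the tracker only ever sees page numbers, i.e. logical positions in the file. Nothing in it knows where those pages live on the device, which is exactly the concern raised here.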

This points to another important question: must an extent in MySQL (which usually contains 64 pages) occupy consecutive space on the SSD?

So if I am right, why do databases still make a substantial effort to convert random I/O into sequential I/O?

The Solutions:


Method 1

Nowadays, there are at least two levels of abstraction, often more, between the data file the DBMS works with and the physical storage device blocks. What appears as a contiguous file to the reader is not necessarily contiguous in the filesystem, and even if it is, that is not necessarily a contiguous span of blocks on the underlying device(s). So, no, the database can’t possibly know how sequential its sequential read actually is.
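As a rough illustration of what the DBMS side of this looks like, the sketch below reads pages at consecutive file offsets with pread. The helper names are made up for this example, and the 16 KiB page size is an assumption (it matches InnoDB's default). The only thing the caller controls is the logical offset into the file; the filesystem and the device decide where those bytes physically live.

```python
import os
import tempfile

PAGE_SIZE = 16 * 1024  # assumed page size (InnoDB's default is 16 KiB)


def make_demo_file(page_count):
    """Create a temporary file of `page_count` pages; page i is filled with byte i."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(page_count):
            f.write(bytes([i % 256]) * PAGE_SIZE)
    return path


def read_pages_sequentially(path, first_page, count):
    """Read `count` pages at consecutive logical offsets using pread."""
    fd = os.open(path, os.O_RDONLY)
    try:
        pages = []
        for i in range(count):
            # Consecutive file offsets: sequential as far as the caller can tell.
            offset = (first_page + i) * PAGE_SIZE
            pages.append(os.pread(fd, PAGE_SIZE, offset))
        return pages
    finally:
        os.close(fd)
```

For example, `read_pages_sequentially(make_demo_file(4), 1, 2)` returns pages 1 and 2 of the demo file. Nothing in the return value or in the call itself reveals whether the two reads touched adjacent blocks on the device.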

As to why, firstly, there’s history. When storage architectures were simpler, DBMSes used to operate much closer to the hardware, often working with raw devices, where they had exact control over physical placement of data file pages/blocks and could guarantee the sequential nature of I/O.

Secondly, there’s no harm in attempting sequential reads. If the underlying physical blocks happen to be in fact sequential, we get some benefit out of it; if not, well, at least we tried.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
