Idea to speed up writes
| Maybe this is already in place, but I have not seen it.
In an RDBMS environment, performance is choked by bursts of tiny, random
writes.
The access time on these could be radically reduced if the drive
electronics dedicated "cache blocks" to each cylinder. The idea is
that a write request could be met by dumping the data to the "cache
blocks" on the cylinder that the head is already sitting on. A simple
index block at the head of a string of cache blocks could keep track of
what items are cached in that string of blocks.
Putting multiple cache-block strings, evenly spaced around each cylinder,
would reduce rotational latency as well. If no blocks are available on
the current cylinder, move to the closest cylinder with free cache
space (or the actual location of the write, whichever is faster).
When the drive is idle, it can move the data from the cache blocks to
the actual location and flag the cache blocks as free. So long as the
drive keeps the index blocks in memory, it can honestly report that a
write is complete, even though the data may reside in cache blocks
instead of the correct location.
I have done some initial math, and single-millisecond write times seem
attainable.
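The post does not show its math, so the following is only a rough sketch of
the rotational part of that estimate, under stated assumptions. The spindle
speeds and the number of cache-block strings per cylinder are illustrative
values, not figures from the post, and the function name is made up. It
assumes the head is already on a cylinder with free cache blocks (no seek)
and ignores transfer time, command overhead, and the cost of updating the
index block.

```python
def avg_rotational_wait_ms(rpm: float, strings_per_cylinder: int) -> float:
    """Average wait until the next cache-block string reaches the head.

    Assumes the strings are evenly spaced and the head position is
    uniformly random when the write arrives.
    """
    rotation_ms = 60_000.0 / rpm
    # With k evenly spaced strings, the gap between strings is 1/k of a
    # rotation, and on average the head is half a gap from the next one.
    return rotation_ms / (2 * strings_per_cylinder)


if __name__ == "__main__":
    # Assumed spindle speeds and string counts, for illustration only.
    for rpm in (7_200, 10_000, 15_000):
        for k in (1, 2, 4, 8):
            wait = avg_rotational_wait_ms(rpm, k)
            print(f"{rpm:>6} rpm, {k} strings/cylinder: {wait:.2f} ms")
```

Under these assumptions, a 15,000 rpm drive with four strings per cylinder
waits 0.5 ms on average, so sub-millisecond rotational delay looks plausible
whenever no seek is needed.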
A read request for cached data would return the cached version of the
data.
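The scheme described above can be sketched as a toy model. This is not drive
firmware: the class `CachingDisk`, its fields, and the use of seek distance
as a stand-in for "whichever is faster" are all assumptions made for
illustration. Rotational position and the on-disk index blocks are not
modeled; the in-memory `index` dict plays the role of the index blocks.

```python
from dataclasses import dataclass, field


@dataclass
class CachingDisk:
    cylinders: int
    slots_per_cylinder: int                     # cache blocks per cylinder (assumed)
    head: int = 0                               # cylinder the head is sitting on
    home: dict = field(default_factory=dict)    # lba -> data at its real location
    index: dict = field(default_factory=dict)   # lba -> (cache cylinder, data)
    used: dict = field(default_factory=dict)    # cylinder -> occupied cache blocks

    def _nearest_free_cylinder(self):
        free = [c for c in range(self.cylinders)
                if self.used.get(c, 0) < self.slots_per_cylinder]
        return min(free, key=lambda c: abs(c - self.head)) if free else None

    def write(self, lba, data, home_cylinder):
        old = self.index.pop(lba, None)
        if old is not None:
            self.used[old[0]] -= 1              # superseded cached copy frees its block
        target = self._nearest_free_cylinder()
        if target is None or abs(home_cylinder - self.head) <= abs(target - self.head):
            # The real location is at least as close: write in place.
            self.head = home_cylinder
            self.home[lba] = data
            return "in-place"
        self.head = target
        self.used[target] = self.used.get(target, 0) + 1
        self.index[lba] = (target, data)
        return "cached"

    def read(self, lba):
        # A read of cached data returns the cached version.
        if lba in self.index:
            return self.index[lba][1]
        return self.home.get(lba)

    def flush_when_idle(self):
        # Move cached data to its real location and free the cache blocks.
        # Head movement during the flush is not modeled.
        for lba, (cyl, data) in list(self.index.items()):
            self.home[lba] = data
            self.used[cyl] -= 1
            del self.index[lba]
```

For example, with `disk = CachingDisk(cylinders=100, slots_per_cylinder=2)`
and the head left on cylinder 0, `disk.write(500, b"a", home_cylinder=90)`
lands in a cache block on the current cylinder, `disk.read(500)` returns the
cached bytes, and `disk.flush_when_idle()` later moves them to their real
location.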
From what I can (naively) tell, this would require no hardware changes
to current drives. The downsides would be a possible need for more
disk RAM, slightly less total capacity (owing to some of the surface
being used as cache), and potentially slow startup, since the disk
would need to load every index block into core before it could
function correctly.
Thoughts? Am I out in left field here?
Thanks in advance,
Marty
| |
| Bill Todd 2006-02-07, 5:54 pm |
| Marty wrote:
> Maybe this is already in place, but I have not seen it.
IIRC it's a variant of one of IBM's strategies for speeding up log
writes (reserve some blocks on each cylinder and write the most recent
log information to the closest free block, then return the log to full
logical contiguity lazily later).
>
> In a RDBMS environment, performance is choked by bursts of tiny, random
> writes.
Only if it's poorly designed. Small random updates can often be written
lazily; synchronous small writes should usually be dumped into the
transaction log for later lazy updates elsewhere (and that log gets
forced sufficiently frequently that other small writes can often
piggy-back on those accesses). That, plus 'grouped commits' where a
single log write commits multiple parallel transactions, takes most of
the sting out of the small-write RDBMS problem. If you don't have some
non-volatile RAM at the disk level to do the job, you can use the IBM
log-specific solution I mentioned above to remove any residual problem.
Where the volume is such that even later lazy updates saturate the
disk, log-structuring the underlying storage may be considered, if
read accesses won't be impacted unacceptably.
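As an illustration of the grouped-commit idea, here is a minimal
single-threaded sketch. The class `GroupCommitLog`, its record format, and
its method names are hypothetical and not taken from any particular RDBMS.
In a real system each transaction would block until the forced write that
covers its record completes; here `force()` simply returns the IDs it made
durable.

```python
import os


class GroupCommitLog:
    """Toy transaction log that commits many transactions per forced write."""

    def __init__(self, path):
        self.f = open(path, "ab")
        self.pending = []        # commit records waiting for the next forced write
        self.forced_writes = 0   # number of synchronous log writes issued

    def commit(self, txn_id, payload: bytes):
        # Queue the commit record; nothing is durable yet.
        self.pending.append((txn_id, payload))

    def force(self):
        # One sequential write plus one fsync covers every queued record.
        if not self.pending:
            return []
        batch = b"".join(f"{t}:".encode() + p + b"\n" for t, p in self.pending)
        self.f.write(batch)
        self.f.flush()
        os.fsync(self.f.fileno())
        self.forced_writes += 1
        done = [t for t, _ in self.pending]
        self.pending.clear()
        return done
```

Three transactions that call `commit()` before the next `force()` share a
single synchronous disk write, instead of paying for three small ones.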
- bill
| |
|
| Thanks for the response. It seems that this strategy is already in
use.
Thanks again,
M