Content-addressable storage
| rgoubet@yahoo.fr 2006-09-07, 7:16 am |
| Hi,
I'm trying to really understand Content-addressable storage. What I'm
still unclear about is the actual reason behind the system. How is such
a system better than location-addressable storage, with files stored
along with a checksum to control integrity?
I'm sure it is superior, I'm just trying to really understand how!
Thanks!
R.
| |
| Bill Todd 2006-09-07, 7:16 am |
| rgoubet@yahoo.fr wrote:
> Hi,
>
> I'm trying to really understand Content-addressable storage. What I'm
> still unclear about is the actual reason behind the system. How is such
> a system better than location-addressable storage, with files stored
> along with a checksum to control integrity?
>
> I'm sure it is superior, I'm just trying to really understand how!
Don't be so sure: perhaps it's just novel.
One *potential* strength is that when references are maintained using
the object's signature, then (with a sufficiently unforgeable hash and
assuming the integrity of the look-up mechanism) any change to the
object by definition invalidates the reference rather than just quietly
changing the object. Of course, if the object is accessed through some
human-meaningful name, then all one need do is change the mapping of the
name to the unforgeable reference to accomplish such a substitution (and
as you point out if conventional storage mechanisms are complemented
with a suitable checksum then as long as that mapping can't be
compromised it achieves a similar level of security).
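A minimal Python sketch of the point above, not taken from any real CAS
product: `ContentStore`, `put`, `get` and `names` are made-up names, and
SHA-256 is assumed as the "sufficiently unforgeable" hash. It shows that
altering stored bytes breaks the hash-based reference, while remapping a
human-meaningful name to another object goes unnoticed.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (hypothetical, for illustration)."""

    def __init__(self):
        self._blobs = {}  # hex digest -> bytes
        self.names = {}   # human-meaningful name -> hex digest

    def put(self, data: bytes) -> str:
        # The address of an object is the hash of its content.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self._blobs[key]
        # Re-hash on read: bytes altered in place no longer match their address.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("stored bytes do not match their address")
        return data


store = ContentStore()
ref = store.put(b"contract v1")
store.names["contract"] = ref

# Tampering with the stored bytes is detected through the reference...
store._blobs[ref] = b"contract v1 (altered)"
try:
    store.get(ref)
    tamper_detected = False
except ValueError:
    tamper_detected = True

# ...but pointing the name at a different, valid object is not.
store.names["contract"] = store.put(b"forged contract")
substituted = store.get(store.names["contract"])
```

The same weakness applies to a conventional store with checksums: in both
cases the name-to-reference mapping is the part that has to be protected.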
One intrinsic advantage to CAS is the elimination of duplicate data,
though: by definition, if one object is identical to another, it's
stored as the same image rather than as a duplicate. Still, one could
achieve this with a conventional approach as well by maintaining a
separate signature database and checking for matches there.
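The conventional alternative just described might look roughly like this
in Python. It is only a sketch under assumptions: `DedupStore`, `slots`
and `by_digest` are hypothetical names, a list index stands in for a
storage location, and SHA-256 stands in for the signature.

```python
import hashlib


class DedupStore:
    """Location-addressed store with a side index of signatures (hypothetical)."""

    def __init__(self):
        self.slots = []      # location (list index) -> bytes
        self.by_digest = {}  # hex digest -> location

    def write(self, data: bytes) -> int:
        digest = hashlib.sha256(data).hexdigest()
        loc = self.by_digest.get(digest)
        # Compare the bytes as well, so that two different objects that
        # happen to share a digest are never treated as duplicates.
        if loc is not None and self.slots[loc] == data:
            return loc  # identical content already stored; reuse its location
        self.slots.append(data)
        loc = len(self.slots) - 1
        self.by_digest[digest] = loc  # simplification: latest object wins the index entry
        return loc


dedup = DedupStore()
first = dedup.write(b"quarterly report")
second = dedup.write(b"quarterly report")  # duplicate, nothing new is stored
third = dedup.write(b"meeting notes")
```

Here objects are still placed by location, so they can be laid out however
the store likes; the signature index is consulted only to spot duplicates.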
Why do that, when you can use the signature as the lookup mechanism?
Well, when you want to be able to cluster objects physically for
group-access performance: if you store based on signature you can't
cluster based on something else, though the larger objects get, the less
important clustering them becomes.
Still, when such clustering is not important, CAS may make sense -
especially for read-only data (dealing with updatable data is more
problematic, since any change to an object changes its signature, so
the entire object must be relocated accordingly, unless piecemeal CAS
mechanisms are used). The bottom line is that CAS may be somewhat
superior in specific situations, but arguably not sufficiently so to
get very excited about it.
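One way to read "piecemeal CAS" is to address fixed-size chunks rather
than whole objects, so that an update only stores the chunks that
changed. The Python sketch below illustrates that reading; it is an
assumption, not a description of any particular product, and
`put_chunked`, `read_chunked` and `CHUNK_SIZE` are hypothetical names.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, to keep the example readable


def put_chunked(chunks: dict, data: bytes) -> list:
    """Store data as fixed-size chunks keyed by SHA-256; return the list of chunk keys."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        piece = data[i:i + CHUNK_SIZE]
        key = hashlib.sha256(piece).hexdigest()
        chunks.setdefault(key, piece)  # chunks already present are not stored again
        recipe.append(key)
    return recipe


def read_chunked(chunks: dict, recipe: list) -> bytes:
    """Reassemble an object from its ordered list of chunk keys."""
    return b"".join(chunks[key] for key in recipe)


chunks = {}
old_recipe = put_chunked(chunks, b"AAAABBBBCCCC")
stored_before = len(chunks)

# Changing only the middle of the object adds a single new chunk.
new_recipe = put_chunked(chunks, b"AAAAXXXXCCCC")
new_chunks_written = len(chunks) - stored_before
```

The old version stays readable through `old_recipe`, since unchanged
chunks are shared between the two versions rather than copied.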
- bill
| |
| rgoubet@yahoo.fr 2006-09-08, 7:14 am |
| Great, thanks, that's exactly the kind of explanation I needed!
R.