| Trustin Lee 2005-06-28, 5:45 pm |
| Hi,
2005/6/28, Niclas Hedhman <niclas-fxYu5tZJV0NAfugRpC6u6w@public.gmane.org>:
>
> On Tuesday 28 June 2005 08:23, Trustin Lee wrote:
>
> We
>
> Hmmmm... What tests have you actually run?
> You can't do without the FQ classnames of the classes involved. They are
> written in 'clear text' once for each class, then referenced with an index
> (int IIRC). Whether or not you need the field names, is your call, but it
> sounds like a decent system to not depend on knowing the exact ordering.
> The codebase URLs are the third item written out, which of course can be
> very large.
>
> import java.io.*;
>
> public class Test
> {
>     static public void main( String[] args )
>         throws Exception
>     {
>         FileOutputStream fos = new FileOutputStream( "abc.ser" );
>         ObjectOutputStream oos = new ObjectOutputStream( fos );
>         Abc abc = new Abc();
>         oos.writeObject( abc );
>         oos.close();
>     }
>
>     private static class Abc implements Serializable
>     {
>         String abc = "1";
>         String def = "2";
>     }
> }
>
> Typical case??? Well, it results in 75 bytes.
Yes, 75 bytes for only two single-character strings is huge.
> What if the name of class changes?
>
> I assume this is a rhetorical question, since I am sure you know the
> answer. I am interested to know how you are going to handle that in your
> own serialization framework.
We don't write the type name at all, because we already know which type will
come next in the stream.
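A minimal sketch of that idea, for comparison with the quoted test program. The class name KnownTypeSizeSketch and the fixed field order are assumptions made up for illustration; this is not code from the project. It writes the same two strings once with standard object serialization and once with a fixed layout in which the reader is assumed to know it will get an Abc:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class; the name and layout are illustrative assumptions.
final class KnownTypeSizeSketch {

    // Same shape as the Abc class in the quoted test program.
    static final class Abc implements Serializable {
        private static final long serialVersionUID = 1L;
        String abc = "1";
        String def = "2";
    }

    // Standard object serialization: stream header, class descriptor
    // (class name, field names and types), then the field values.
    static int serializedSize(Abc value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.size();
    }

    // Fixed layout: no type name and no field names. Only the two values
    // are written, in an order the reader must know in advance.
    static int compactSize(Abc value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(value.abc); // 2-byte length prefix + encoded chars
            out.writeUTF(value.def);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        Abc abc = new Abc();
        System.out.println("object serialization:  " + serializedSize(abc) + " bytes");
        System.out.println("fixed-layout encoding: " + compactSize(abc) + " bytes");
    }
}
```

The price of the smaller encoding is the one raised in the quoted question: the stream is no longer self-describing, so renaming or reordering fields needs an explicit versioning scheme of our own.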
> And if we implement readObject and
>
> Because you don't need to worry about complex classes, and diving into the
> hierarchies of instances, which you would for both "rolling your own" as
> well as Externalizable.
Right. I thought Attributes, Attribute, and Name were simple enough that we
could forget about complex object graphs. But attribute values must be able to
contain any Java object, so I'm thinking about allowing arbitrary Java objects
only there.
> Moreover, it
>
> Serialization writes the field names to the stream, so that it can restore
> the fields even if they were re-ordered in the class. I think you have
> observed that when you use writeObject(), the field names are still written
> to the stream. I don't know the answer to that, since the deserialization
> cannot possibly know what to do with it.
You're right.
> are
>
> If they are flat, i.e. basically strings or collections of strings, then I
> agree that serialization is not necessarily any added value. But are you
> not allowed to store any arbitrary Object in attributes?
Attribute values can actually be any Java object, so I'm going to use object
serialization only for that case. The most frequently used types, such as
String and byte[], will have to be handled specially to get maximum
performance.
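One way this could look is a one-byte type tag in front of each value, with object serialization as the fallback. This is only a sketch under assumptions: the class name AttributeValueCodecSketch, the tag values, and the length-prefixed layout are all invented here for illustration and are not an existing format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.StreamCorruptedException;

// Hypothetical codec; tags and layout are illustrative assumptions.
final class AttributeValueCodecSketch {
    private static final int TAG_STRING = 0;
    private static final int TAG_BYTES = 1;
    private static final int TAG_OBJECT = 2;

    static byte[] encode(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        if (value instanceof String) {
            out.writeByte(TAG_STRING);
            // writeUTF is limited to 65535 encoded bytes; a real format
            // would need a different length prefix for longer strings.
            out.writeUTF((String) value);
        } else if (value instanceof byte[]) {
            byte[] raw = (byte[]) value;
            out.writeByte(TAG_BYTES);
            out.writeInt(raw.length);
            out.write(raw);
        } else {
            // Fallback for arbitrary values; they must be Serializable.
            out.writeByte(TAG_OBJECT);
            ObjectOutputStream objects = new ObjectOutputStream(out);
            objects.writeObject(value);
            objects.flush();
        }
        out.flush();
        return bytes.toByteArray();
    }

    static Object decode(byte[] data) throws IOException, ClassNotFoundException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int tag = in.readUnsignedByte();
        switch (tag) {
            case TAG_STRING:
                return in.readUTF();
            case TAG_BYTES:
                byte[] raw = new byte[in.readInt()];
                in.readFully(raw);
                return raw;
            case TAG_OBJECT:
                return new ObjectInputStream(in).readObject();
            default:
                throw new StreamCorruptedException("unknown tag " + tag);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("String \"1\":  " + encode("1").length + " bytes");
        System.out.println("byte[3]:     " + encode(new byte[] {1, 2, 3}).length + " bytes");
        System.out.println("Integer 42:  " + encode(Integer.valueOf(42)).length + " bytes");
        System.out.println("round trip:  " + decode(encode("hello")));
    }
}
```

With this layout the common cases pay only the tag and a length prefix, and only the rare arbitrary object pays the full cost of a serialization stream header and class descriptor.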
LDAP entries are usually stored in B+Tree implementations, so we have to
create a new ObjectInputStream or ObjectOutputStream each time we read or
write an object. That is a major performance penalty: it usually costs
additional memory allocation and copying, and it causes the class descriptors
to be written again for every entry. (In a regular long-lived stream this is
not a problem, because a descriptor is written once and then referenced, but
it becomes a problem in an environment like this.) Plus, the size of an entry
affects the performance of the backing storage when massive operations are
performed. Making the serialized data smaller gives a performance gain because
each database page can hold more items.
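The descriptor overhead can be measured with a small experiment. This is a sketch only: the class name DescriptorOverheadSketch, the SampleEntry type, and its cn/sn fields are hypothetical stand-ins, not our real entry classes. It serializes the same objects once through a single shared ObjectOutputStream and once through a fresh stream per object, which is roughly what per-record storage in a B+Tree forces on us:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical measurement harness; names are illustrative assumptions.
final class DescriptorOverheadSketch {

    static final class SampleEntry implements Serializable {
        private static final long serialVersionUID = 1L;
        final String cn;
        final String sn;

        SampleEntry(String cn, String sn) {
            this.cn = cn;
            this.sn = sn;
        }
    }

    // One long-lived stream: the class descriptor is written for the first
    // object and later objects refer back to it with a short handle.
    static int sharedStreamSize(SampleEntry[] entries) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (SampleEntry entry : entries) {
                out.writeObject(entry);
            }
        }
        return bytes.size();
    }

    // One stream per record: every record repeats the stream header
    // and the full class descriptor.
    static int perRecordStreamSize(SampleEntry[] entries) throws IOException {
        int total = 0;
        for (SampleEntry entry : entries) {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(entry);
            }
            total += bytes.size();
        }
        return total;
    }

    static SampleEntry[] sampleEntries(int count) {
        SampleEntry[] entries = new SampleEntry[count];
        for (int i = 0; i < count; i++) {
            entries[i] = new SampleEntry("user" + i, "surname" + i);
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        SampleEntry[] entries = sampleEntries(1000);
        System.out.println("shared stream:     " + sharedStreamSize(entries) + " bytes");
        System.out.println("stream per record: " + perRecordStreamSize(entries) + " bytes");
    }
}
```

The gap between the two totals is the per-record cost that a fixed, descriptor-free layout would avoid.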
If performance were not a problem, we could just go with object serialization,
but currently our performance is not really good, and the cause is the large
amount of extra I/O from object serialization.
Trustin
--
what we call human nature is actually human habit
--
http://gleamynode.net/
|