Re: Sesame Native Store OPS indexing (was Re: [update] piggybank performance profiling)

From: Vineet Sinha <vineet_at_csail.mit.edu>
Date: Wed, 14 Sep 2005 00:17:35 -0400

>> Adding statements seems to have added a 25% cost.
>
>
> Hey, that's less than I expected! If it doesn't add more than 25%
> overhead then I think we should include the change in future Sesame
> releases.

What is the timeframe for these future releases? If it is soon enough,
we might be able to hold our own releases until then.

> Can you give an indication of the amount of data that you
> tested this with?

287,766 statements


>> Beyond adding the second comparator, I also renamed previous
>> btree/file/filename variables to include 'spo' before them and made a
>> copy for 'ops'. triples.dat is now triples-1.dat and triples-2.dat.
>
>
> May I suggest to call these files "triples-spo.dat" and
> "triples-ops.dat"?

Done.
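
For anyone reading this in the archive, here is a minimal sketch of what
the second comparator amounts to. It is not the actual Sesame code: the
class name TripleOrderSketch, the int[] {subject, predicate, object} ID
layout, and the sorted() helper are all assumptions made for
illustration. The only point is that the spo and ops indexes hold the
same triples under two different field orders.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch, not the Sesame native store implementation.
final class TripleOrderSketch {
    // Assumed layout: a triple is {subjectId, predicateId, objectId}.
    static final Comparator<int[]> SPO = (a, b) -> compare(a, b, 0, 1, 2);
    static final Comparator<int[]> OPS = (a, b) -> compare(a, b, 2, 1, 0);

    // Compare two triples field by field, in the given field order.
    private static int compare(int[] a, int[] b, int... order) {
        for (int field : order) {
            int c = Integer.compare(a[field], b[field]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    // Return a sorted copy; the input array is left untouched.
    static int[][] sorted(int[][] triples, Comparator<int[]> cmp) {
        int[][] copy = triples.clone();
        Arrays.sort(copy, cmp);
        return copy;
    }
}
```

Sorting the same triples with SPO groups them by subject, while OPS
groups them by object, which is what makes lookups by object cheap.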


>> The other issue is, as Jeen mentioned, the large file size for
>> transmission. The best solution could be to not require it for
>> transmission and build the second index automatically (in fact this
>> should also increase the add performance, relying on the spo index
>> until the ops index is ready).
>
>
> I'm not sure if I understand this correctly. The file size of the
> indexes is problematic because you are sending these files over the
> internet or something?

I thought that was the original reason that it was not implemented.
Don't worry about it.


>> There were no unit tests, but tests on Relo should have been good.

BTW, I just realized that I had removed support for the case where both
subject and object are missing. I have just checked that support in.
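
To make that case concrete, here is a rough sketch of how a lookup might
pick between the two indexes. The names (IndexChoiceSketch, ANY,
choose) and the fallback to a scan of the spo index are assumptions for
illustration, not what is actually checked in.

```java
// Hypothetical sketch, not the Sesame native store implementation.
final class IndexChoiceSketch {
    enum Index { SPO, OPS }

    // Assumed wildcard marker for an unbound position in the pattern.
    static final int ANY = -1;

    // Pick the index whose leading field is bound in the pattern.
    static Index choose(int subj, int pred, int obj) {
        if (subj != ANY) {
            return Index.SPO; // subject bound: spo can seek directly
        }
        if (obj != ANY) {
            return Index.OPS; // only object bound: ops can seek directly
        }
        // Both subject and object missing: neither index helps, so this
        // sketch simply scans spo and filters on the predicate.
        return Index.SPO;
    }
}
```

The last branch is the case mentioned above: with neither subject nor
object given, the lookup still has to return something, so it falls
back to scanning one index in full.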

Stefano, you might want to update pb's jar files.

Vineet
Received on Wed Sep 14 2005 - 04:13:04 EDT
