Re: [update] piggybank performance profiling

From: Vineet Sinha <vineet_at_csail.mit.edu>
Date: Fri, 02 Sep 2005 20:00:33 -0400

> So, I kept instrumenting and found out that we spend pretty much half of
> our time (if not more) by performing queries such as
>
> ?x predicate object
>
> each of those queries takes between 1300 and 1600ms on my machine and with
> the (small) number of statements I have in my triple store, compared to
> basically 0ms time of queries such as
>
> subject predicate ?x
>
> which seems to indicate that Sesame does a good job at indexing by
> subject but a terrible job at indexing by object.

I did a performance tune-up of Relo in the last week and 'fixed' the
above as well. Some more minor details (obvious in hindsight) that
might be helpful: the overhead when using the native store was
roughly 200x higher for both getStatements and hasStatements when the
subject was not provided.
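
To make the asymmetry concrete, here is a minimal sketch in Python of a toy triple store that keeps a single index keyed by subject. This is an assumption for illustration only, not Sesame's actual storage layout; the class name, the rows_examined counter, and the sample data are all hypothetical.

```python
from collections import defaultdict


class SubjectIndexedStore:
    """Toy triple store with one index keyed by subject (illustrative assumption)."""

    def __init__(self):
        self._by_subject = defaultdict(list)
        self.rows_examined = 0  # rows touched by the most recent query

    def add(self, s, p, o):
        self._by_subject[s].append((p, o))

    def get_statements(self, s=None, p=None, o=None):
        """Return matching triples; None acts as a wildcard."""
        self.rows_examined = 0
        # With a subject we jump straight to its rows; without one we
        # have no choice but to walk every subject in the store.
        subjects = [s] if s is not None else list(self._by_subject)
        out = []
        for subj in subjects:
            for pred, obj in self._by_subject.get(subj, ()):
                self.rows_examined += 1
                if (p is None or pred == p) and (o is None or obj == o):
                    out.append((subj, pred, obj))
        return out


store = SubjectIndexedStore()
for i in range(1000):
    store.add(f"class{i}", "inPackage", f"pkg{i % 10}")

# "subject predicate ?x": one index lookup.
forward = store.get_statements(s="class42", p="inPackage")
forward_cost = store.rows_examined

# "?x predicate object": full scan of the store.
reverse = store.get_statements(p="inPackage", o="pkg2")
reverse_cost = store.rows_examined
```

In this toy model the forward query examines a single row while the reverse query examines all 1000, which is the same shape of gap as the one described above, though the real ratio depends on how the store is actually indexed.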

I was able to get my performance up by:
- modifying my schema so that user-facing actions mostly result in
  queries that have the subject provided (mostly by adding a
  reverseCached predicate).
- limiting the number of reverse queries, and moving most of the
  remaining ones so that they return results to the interface
  asynchronously.
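
The two changes above can be sketched roughly as follows, in Python rather than against the Sesame API. Everything here is an assumption for illustration: the toy Store class, the "reverseCached:" naming scheme, and the sample predicates are hypothetical, and the real schema change in Relo may look quite different.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

REVERSE_PREFIX = "reverseCached:"  # hypothetical naming scheme


class Store:
    """Toy store indexed by subject only (illustrative assumption)."""

    def __init__(self):
        self._by_subject = defaultdict(list)

    def add(self, s, p, o):
        self._by_subject[s].append((p, o))

    def add_with_reverse(self, s, p, o):
        # Store the statement plus a mirrored one, so that
        # "?x p o" can later be asked as "o reverseCached:p ?x".
        self.add(s, p, o)
        self.add(o, REVERSE_PREFIX + p, s)

    def objects(self, s, p):
        """Fast path: the subject is known, so one index lookup suffices."""
        return [obj for pred, obj in self._by_subject.get(s, ()) if pred == p]

    def subjects(self, p, o):
        """Reverse query answered through the cached mirror statements."""
        return self.objects(o, REVERSE_PREFIX + p)

    def subjects_by_scan(self, p, o):
        """Slow path: no mirror statement exists, so scan every subject."""
        return [s for s, rows in self._by_subject.items()
                for pred, obj in rows if pred == p and obj == o]


store = Store()
store.add_with_reverse("ClassA", "calls", "ClassC")
store.add_with_reverse("ClassB", "calls", "ClassC")
store.add("ClassD", "inherits", "ClassC")  # no cached reverse for this one

# User-facing query: served from the subject index.
callers = store.subjects("calls", "ClassC")

# Remaining reverse query: run off the calling thread so the interface
# could stay responsive, then pick up the result when it is ready.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(store.subjects_by_scan, "inherits", "ClassC")
    subclasses = future.result()
```

The trade-off in this sketch is extra statements (one mirror per cached predicate) in exchange for turning the common reverse lookups into subject lookups; the uncached ones still pay for a scan, just not on the thread the user is waiting on.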

What did not work was loading part of the data into an in-memory store. I am
guessing this is because the in-memory store is also indexed by subject,
though I did not spend as much time on this approach since in my case
the above two changes showed very good results without more tweaking needed.


Vineet
Received on Fri Sep 02 2005 - 23:57:33 EDT
