[update] piggybank performance profiling from Stefano Mazzocchi on 2005-09-02 (stdin)

From: Stefano Mazzocchi <stefanom_at_mit.edu>
Date: Fri, 02 Sep 2005 19:30:21 -0400

My quest for the solidification of piggybank continues. I still feel like
I'm in a jungle, but now at least I have a compass: both the frontend
(the part inside Firefox) and the backend (the Java part) are
instrumented with tracing/profiling loggers, and this helps a lot in
finding out what is consuming our cycles and what is going on.
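
To give an idea of the kind of instrumentation meant here, this is a
minimal sketch in Python (not the actual piggybank loggers, which live in
the Firefox and Java code). The names `traced`, `timings` and `report`
are made up for illustration.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Wall-clock durations per label, in milliseconds (hypothetical structure).
timings = defaultdict(list)

@contextmanager
def traced(label):
    """Record how long the wrapped block takes under the given label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label].append((time.perf_counter() - start) * 1000.0)

def report():
    """Return (label, call count, total ms) rows, most expensive first."""
    rows = [(label, len(ms), sum(ms)) for label, ms in timings.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Wrapping each template load and each store query in `with traced(...)`
and sorting by total time is enough to see which kind of call dominates.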

First of all, I realized that the Velocity template cache was turned off.
This means that every time we loaded a template (and we use several!), no
matter how frequently it was used, we had to parse it again. Since we
have templates that generate as little as a few lines and are reused
hundreds of times throughout the various pages, you understand this was a
lot of wasted CPU for no reason.
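
The effect can be sketched with a toy parse-once cache. This is Python
rather than Velocity, `string.Template` only stands in for a real template
parser, and `TemplateCache`, `load_source` and `render_page` are
hypothetical names, so take it as an illustration of the idea only.

```python
from string import Template

class TemplateCache:
    """Hands out parsed templates, re-parsing on every call when disabled."""

    def __init__(self, load_source, enabled=True):
        self.load_source = load_source  # hypothetical: name -> template text
        self.enabled = enabled
        self.parse_count = 0            # how many times we "parsed"
        self._parsed = {}

    def get(self, name):
        if self.enabled and name in self._parsed:
            return self._parsed[name]
        self.parse_count += 1
        template = Template(self.load_source(name))  # stand-in for parsing
        if self.enabled:
            self._parsed[name] = template
        return template

def render_page(cache, item_count):
    """Render one small template once per item, as a page of items would."""
    return [cache.get("item").substitute(n=i) for i in range(item_count)]
```

With the cache disabled, a page of 300 items parses the same few-line
template 300 times; with it enabled, `parse_count` stays at 1 and the
output is identical.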

I turned the cache on, and this makes the first page turn up almost
instantaneously (some 200ms total and 100ms reaction time) after it has
been loaded once. The first time takes a while, but that's understandable
since it has to load itself from the triple store. Note that the first
load of that page could be done transparently in the background so that
we don't show this delay to the user.

The browsing of the piggybank items is still incredibly slow, though.
I was expecting a substantial performance improvement, but instead it
seems there is something a lot bigger dragging us down.

So I kept instrumenting and found out that we spend pretty much half of
our time (if not more) performing queries such as

?x predicate object

Each of those queries takes between 1300 and 1600ms on my machine, even
with the (small) number of statements I have in my triple store, compared
to basically 0ms for queries such as

subject predicate ?x

which seems to indicate that Sesame does a good job at indexing by
subject but a terrible job at indexing by object.
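
A toy model of why the two query shapes can differ so much, assuming a
store that only indexes by subject. This is a Python sketch, not how
Sesame actually stores or indexes statements; `TinyTripleStore` and its
methods are hypothetical.

```python
from collections import defaultdict

class TinyTripleStore:
    """In-memory triples with a subject index and an optional object index."""

    def __init__(self, index_objects=False):
        self.triples = []
        self.by_sp = defaultdict(list)  # (subject, predicate) -> objects
        # (predicate, object) -> subjects, only built when requested
        self.by_po = defaultdict(list) if index_objects else None
        self.scanned = 0                # statements touched by full scans

    def add(self, s, p, o):
        self.triples.append((s, p, o))
        self.by_sp[(s, p)].append(o)
        if self.by_po is not None:
            self.by_po[(p, o)].append(s)

    def objects(self, s, p):
        """Answer 'subject predicate ?x' with a single index lookup."""
        return list(self.by_sp.get((s, p), []))

    def subjects(self, p, o):
        """Answer '?x predicate object', scanning if there is no object index."""
        if self.by_po is not None:
            return list(self.by_po.get((p, o), []))
        matches = []
        for s, p2, o2 in self.triples:
            self.scanned += 1
            if p2 == p and o2 == o:
                matches.append(s)
        return matches
```

Without the second index, every '?x predicate object' query walks all
statements, so its cost grows with the size of the store, while 'subject
predicate ?x' stays a constant-time lookup. The price of the extra index
is more memory and slightly slower inserts.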

Also, unlike the first page, which seems to be caching results
effectively, the internal pages generate a little avalanche of queries to
the triple store. A good number of these are 'give all subjects for a
given object' queries, and they account for the perceived slowness.

Now, the question is: is there a way to make Sesame index by object too?

-- 
Stefano Mazzocchi
Research Scientist                 Digital Libraries Research Group
Massachusetts Institute of Technology            location: E25-131C
77 Massachusetts Ave                   telephone: +1 (617) 253-1096
Cambridge, MA  02139-4307              email: stefanom at mit . edu
-------------------------------------------------------------------

Received on Fri Sep 02 2005 - 23:26:14 EDT
