Re: some piggyback bugs...

From: Stefano Mazzocchi <>
Date: Mon, 12 Dec 2005 10:13:19 -0500

Gunnar Grimnes wrote:
>>> Another problem is that opening new pages is slower than before, and CPU
>>> usage goes to 100% for a bit, but I assume this is Piggy Bank parsing
>>> pages, so I'll live with it.
>> We actually don't parse a new page that much--we just get all the <link>
>> tags in the <head>. That shouldn't take too long. This CPU consumption
>> is probably caused by something else.
> I've found a good test case for this: browsing Google Groups and going
> back and forth between individual threads and the search results gives me
> 10-15 seconds of 100% CPU. The GNOME system monitor tells me that it is
> java_vm, running as a child of Firefox, that is the problem...
> Any other idea what it could be?

[back from vacation]

Yes. During last week (while I was on vacation) it hit me: maybe I was
looking in the wrong place. We are experiencing memory leaks in Semantic
Bank which force us to restart it every now and then. We have now
automated a nightly restart as a quick fix, but I suspect that the same
problem might hit Piggy Bank, since they share most of the code.

Now, why the 100% CPU hog? Well, Piggy Bank invokes the Java subsystem
every time you browse a page. I noticed that if you *never* invoke
Piggy Bank (never click on the pig, never put any new information in,
and never click on the data coin), you can use Firefox for a very long
time without experiencing that problem.

But once you have invoked Piggy Bank (so your JVM is now using 40 MB
instead of, say, 10 MB at startup), there is a lot less room. My
experiments show that a JVM started without parameters has a default
heap size limit of around 64 MB. That means there are only about 24 MB
of heap left to use once things are up and running.
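For the curious, the headroom can be checked from inside the JVM itself. A minimal sketch (the class name and the MB arithmetic are mine, not anything in Piggy Bank):

```java
public class HeapInfo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        long maxMb   = rt.maxMemory() / mb;                      // heap size limit
        long usedMb  = (rt.totalMemory() - rt.freeMemory()) / mb; // currently live
        System.out.println("heap limit:  " + maxMb + " MB");
        System.out.println("heap in use: " + usedMb + " MB");
        System.out.println("headroom:    " + (maxMb - usedMb) + " MB");
    }
}
```

Run it with no -Xmx flag to see the default limit on your platform.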

[Don't ask me why the JVM uses that much memory, but it does! And yes,
it's a pain.]

If we do have a memory leak, the usable heap keeps shrinking, and the
work the garbage collector has to do to hunt down little fragments of
memory to reclaim takes longer and longer.
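To illustrate the effect (a standalone toy, not Piggy Bank code), here is what a leak looks like to the collector: a list that is never cleared pins more memory each round, so every collection scans a growing live set while reclaiming less and less:

```java
import java.util.ArrayList;
import java.util.List;

public class LeakDemo {
    // static list that is never cleared -- the "leak"
    private static final List<byte[]> leaked = new ArrayList<byte[]>();

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        for (int round = 0; round < 5; round++) {
            // leak ~2 MB per round; the collector can never reclaim this
            for (int i = 0; i < 20; i++) {
                leaked.add(new byte[100 * 1024]);
            }
            long start = System.nanoTime();
            System.gc(); // request a collection and time it
            long gcMs = (System.nanoTime() - start) / 1000000L;
            long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
            System.out.println("round " + round + ": ~" + usedMb
                    + " MB live, gc took " + gcMs + " ms");
        }
    }
}
```

As the live set approaches the heap limit, collections fire more often and each one has more to scan, which is exactly the 100% CPU pattern described above.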

So fixing the memory leak in Semantic Bank might also reduce (or even
solve?) the random CPU usage we experience.

An even better solution would be to rewrite some internals to
completely decouple the browsing experience from the invocation of the
Java subsystem. I think I might try to do that as well, since it also
makes it easier to split Piggy Bank in two: a local server that runs
independently in its own process (thus reducing the fragility of the
browser process), and a much smaller browser extension that contains
no Java code at all.
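As a rough sketch of what the standalone half could look like, here is a bare-bones local HTTP service the extension could talk to. Everything here is hypothetical (the class name, the port, the /status idea); it is not an existing Piggy Bank interface:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical: a standalone local service living in its own JVM process,
// so a crash or leak here never takes the browser down with it.
public class LocalBankServer {
    public static void main(String[] args) throws IOException {
        ServerSocket server = new ServerSocket(1978); // port is illustrative
        System.out.println("listening on http://localhost:1978/");
        while (true) {
            Socket client = server.accept();
            try {
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(client.getInputStream()));
                in.readLine(); // request line, e.g. "GET /status HTTP/1.0"
                String body = "piggy-bank service: ok";
                OutputStream out = client.getOutputStream();
                out.write(("HTTP/1.0 200 OK\r\n"
                        + "Content-Type: text/plain\r\n"
                        + "Content-Length: " + body.length() + "\r\n\r\n"
                        + body).getBytes("US-ASCII"));
                out.flush();
            } finally {
                client.close();
            }
        }
    }
}
```

The extension side would then be pure JavaScript/XUL making HTTP requests, with no JVM embedded in the browser process at all.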

The issue, which we have already discussed among ourselves, would then
be to make the installation and delivery of such a local service as
easy as, if not easier than, installing Piggy Bank already is.

What do you people think?

Stefano Mazzocchi
Research Scientist                 Digital Libraries Research Group
Massachusetts Institute of Technology            location: E25-131C
77 Massachusetts Ave                   telephone: +1 (617) 253-1096
Cambridge, MA  02139-4307              email: stefanom at mit . edu
Received on Mon Dec 12 2005 - 15:07:34 EST

This archive was generated by hypermail 2.3.0 : Thu Aug 09 2012 - 16:39:18 EDT