Re: [RT] Moving Piggy Bank forward... from Stefano Mazzocchi on 2005-07-23 (stdin)

From: Stefano Mazzocchi <stefanom_at_mit.edu>
Date: Sat, 23 Jul 2005 21:47:31 -0400

Brad Clements wrote:

>> I tried PB, loved the idea of java based FireFox plugin, have considered
>> using the same technique in unrelated application.
>>
>> But .. disabled PB because
>>
>> a) got stuck a few times, made my browser slow
>>
>> b) I'm not yet a big RDF user, so couldn't really capitalize on it yet.
>>
>> But .. I still like the idea, and was going to set up my own semantic
>> bank too. I was very interested in the haystack project.. etc
>>
>> Anyway, here are my thoughts as a programmer, not a user.
>>
>> On 21 Jul 2005 at 15:44, Stefano Mazzocchi wrote:
>>
>>
>
>>>>the rest is
>>>>spent even making I/O worse because lots of bytes are transferred thru
>>>>the localhost socket.
>
>>
>>
>> Is this a Mac issue? I can't see how localhost loopback socket overhead
>> would be noticed at all on any platform. It must be related to polling,
>> looping, select or some other foible.

I really don't know what it is, but the JVM on the Mac *blows* I/O wise,
both on sockets and on files. There is no real reason for this (also
because they have access to the same exact code that Sun's JVM is based
on) but there might be something else going on.

>>>>1) decoupling PB from the browser
>
>>
>>
>> I think this would make PB quite a bit more useful as a system-wide
>> service that could be used by other applications.
>>
>> Like google search runs, so PB/Semantic Bank could be setup to run
>> locally, and stay swapped out most of the time when it's not in-use.

Exactly.

>> Another option is using a small tcp-listener application that then
>> launches the PB backend when it gets a connection, then loops through
>> to PB on the backend... It can then tell PB to quit if it's idle for a
>> while. I don't see loopback sockets as being a performance issue
>> considering everything else that's happening.

Oh, I definitely agree here.
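
The launch-on-connect, quit-when-idle idea can be sketched roughly as
follows. This is only an illustration in Python, not anything PB ships:
the class name OnDemandBackend, the command to launch and the idle
threshold are all made up, and the part that actually accepts
connections and forwards bytes to the backend is left out.

```python
import subprocess
import time


class OnDemandBackend:
    """Start a backend process on first use; stop it after an idle period.

    Hypothetical helper: 'command' stands in for whatever would launch
    the real backend.
    """

    def __init__(self, command, idle_seconds, clock=time.monotonic):
        self.command = command
        self.idle_seconds = idle_seconds
        self.clock = clock          # injectable so the logic can be tested
        self.process = None
        self.last_used = None

    def touch(self):
        """Call when a client connects: make sure the backend is running."""
        if self.process is None or self.process.poll() is not None:
            self.process = subprocess.Popen(self.command)
        self.last_used = self.clock()

    def reap_if_idle(self):
        """Call periodically: stop the backend if it has been idle too long.

        Returns True only when a running backend was actually stopped.
        """
        if self.process is None or self.process.poll() is not None:
            return False
        if self.clock() - self.last_used < self.idle_seconds:
            return False
        self.process.terminate()
        self.process.wait()
        self.process = None
        return True
```

A listener loop would call touch() for every accepted connection and
reap_if_idle() on a timer.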

>> Also, this would eliminate JNI, and the browser plug-in component could
>> use regular http connections to talk to PB.

The use of JNI is very very minimal, almost all activity happens thru
regular HTTP connections to localhost already, so that part wouldn't
change much. But what's interesting is that longwell2 running standalone
"feels" faster than PB running inside my browser, even if the access
method is the same. I don't know if this is a perception, though, as I
can't find a way to measure the real numbers.
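
One possible way to get numbers, sketched in Python under the assumption
that both setups answer the same HTTP request on localhost: time the
same request many times against each and compare medians. The helper
names (time_calls, summarize, fetch) are invented for this sketch, and
it only measures what an external client sees, not rendering time
inside the browser.

```python
import statistics
import time
import urllib.request


def time_calls(action, repeats=20):
    """Run action() repeatedly and return wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        action()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Median is less sensitive to a single slow outlier than the mean."""
    return {"median": statistics.median(durations), "max": max(durations)}


def fetch(url):
    """Return a zero-argument action that downloads url completely."""
    def action():
        with urllib.request.urlopen(url) as response:
            response.read()
    return action
```

Calling summarize(time_calls(fetch(url))) once per setup, with url
pointing at the same page in each, would give two comparable summaries.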

>>>> - we can think of other plugins that use "piggy-bank" as a local
>>>>service (for example thunderbird plugins that use PB to know which RSS
>>>>feed to subscribe, or that inject email in piggy-bank for facetted
>>>>browsing your email, or your pictures/calendar you name it)
>
>>
>>
>> yes!
>>
>>
>>
>
>>>>CONs:
>>>>
>>>> - installation becomes more painful (and more work for us since we have
>>>>to write different installers for different operating systems) some
>>>>people don't like to have services running (even if things like Google
>>>>Desktop work exactly like this).
>
>>
>> Could you just rely on java web-start or something?

In theory, yes, in practice it's tricky.

WebStart was invented for desktop applications, something that you
deploy and keep up-to-date easily. Works *great* for things like Welkin,
that you start and stop at need and hit the server back every time they
start so that we can tell users a new version is ready (and we can
monitor their usage ;-)

If you want a service that starts when the OS starts up, WebStart is of
very little use, unless you write your own, OS-specific, way to hook
into the startup services.

Don't get me wrong, it's totally doable, and maybe there are installers
out there (zero-g? vise?) that are cross-platform yet powerful enough to
do what we need, but it's not a few hours hack and I would like to get
general agreement of usefulness before moving down this path.

>>>>2) find a compression scheme for the URLs.
>>>>
>>>>PROs:
>>>>
>>>> - makes the page a lot smaller (reduces I/O overhead)
>>>>
>>>> - reduces the url-encoding overhead
>
>>
>> I am unsure what this refers to. Do you mean compressing urls used in
>> RDF?

Each URL is, basically, a URL-encoded query. But since we have no
'context' we have been sending the entire URIs along (no prefixing!).
Just assigning, for example, numbers to prefixes would reduce the amount
of bytes transferred (and URL-encoded!) significantly.

"compression" is maybe misleading as a term, I mean 'reducing the size'
of what gets sent in and out, but without reducing the ability to, for
example, bookmark.
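
To make the prefix-numbering idea concrete, here is a minimal sketch in
Python. It is not PB's actual scheme: the function names, the
"number:localname" format and the rule for where a prefix ends (the
last '#' or '/') are assumptions chosen for illustration.

```python
from urllib.parse import quote


def split_uri(uri):
    """Split a URI into (prefix, local name) at the last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/")) + 1
    return uri[:cut], uri[cut:]


def build_prefix_table(uris):
    """Assign a small integer to each distinct prefix, in order seen."""
    table = {}
    for uri in uris:
        prefix, _ = split_uri(uri)
        if prefix not in table:
            table[prefix] = len(table)
    return table


def shorten(uri, table):
    """Replace the prefix with its number, e.g. '0:type'."""
    prefix, local = split_uri(uri)
    return f"{table[prefix]}:{local}"


def expand(short, table):
    """Inverse of shorten(): look the number up and restore the prefix."""
    index, local = short.split(":", 1)
    reverse = {number: prefix for prefix, number in table.items()}
    return reverse[int(index)] + local


def encoded_size(text):
    """Number of characters once the text is URL-encoded for a query."""
    return len(quote(text, safe=""))
```

For bookmarks to keep working, the number-to-prefix table would have to
stay stable between sessions, so it would need to be stored somewhere
rather than rebuilt on each start.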

>>>> - data is saved on disk *only* after regular shutdown. In case of
>>>>system collapse there is data loss. (Jeen, is there a workaround for
>>>>this problem? like saving the new RDF right away before returning)
>
>>
>>
>> That would be very bad. I've dealt with applications that, for example,
>> don't save their preferences until the application "exits normally".
>>
>> It's confounding when the app dies, or Windows Shutdown occurs (which
>> apparently isn't a normal exit for this particular application).
>>
>> I think you really need to write the stuff to disk when you get it.

Definitely agreed.
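
As a sketch of what "write it when you get it" could look like, here is
a small Python example that appends each record to a file and forces it
to disk before returning. It does not use the RDF store's API; the
function name append_durably and the one-record-per-line format are
assumptions for illustration only.

```python
import os


def append_durably(path, record):
    """Append one record and force it to disk before returning."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(record.rstrip("\n") + "\n")
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to push its cache to the device
```

The trade-off is one sync per write, which costs I/O time, in exchange
for losing at most the record being written when the system goes down.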

-- 
Stefano Mazzocchi
Research Scientist                 Digital Libraries Research Group
Massachusetts Institute of Technology            location: E25-131C
77 Massachusetts Ave                   telephone: +1 (617) 253-1096
Cambridge, MA  02139-4307              email: stefanom at mit . edu
-------------------------------------------------------------------

Received on Sun Jul 24 2005 - 01:44:23 EDT
