[Wsf-general] Messaging project concept

Discussion:

James Clark

2007-03-20 13:17:03 UTC

I've written up the concept for a new project/product focusing on
messaging functionality:

http://www.wso2.org/wiki/display/wsf/Telegon+Project

This is based on some discussions Paul and I had in London.

Comments are, as always, welcome.

James

Sanjiva Weerawarana

2007-03-26 07:03:29 UTC

Hi James,

First of all sorry for the delay in replying. I read the doc when u first
sent it but didn't get a chance to think thru it and reply.

Very interesting. I agree with most of it, but not the proposed
implementation plan .. as u probably expected :).

First of all, this would be a straightforward set of Python scripts that
use wsclient underneath. Obviously that won't support adding transports
with Python etc. but it'll solve the business problem: provide a tool to
move data from one machine to another. The intermediary would be a
straightforward configuration of the ESB. Are you actually suggesting that
we re-write SOAP/WS-Sec*/RM etc. all in Python?? I must be mis-reading it.

It seems to me it's key to figure out who the audience is. If it's for
"users" to move files around, then we shouldn't expose things like
transport selection, endpoint selection etc. to them (as much as
possible). This feels like a solution that we'd be doing over our stuff
(like the identity solution): WSO2 Secure Messaging Solution. (BTW this, I
believe, is what we're doing with Zend in France.)

If the target is for developers to use to exchange data between machines
reliably then the UI would be different but not by much. Again I still see
it as a simple Python script that uses wsclient underneath. The server side
would be another Python script that uses wsf/c with a thttpd type embedded
transport.

I won't answer the questions yet because it depends a lot on the positioning.

Sanjiva.

--
Sanjiva Weerawarana, Ph.D.
Founder, Chairman & CEO; WSO2, Inc.; http://www.wso2.com/
email: ***@wso2.com; cell: +94 77 787 6880; fax: +1 509 691 2000

"Oxygenating the Web Service Platform."

Paul Fremantle

2007-03-26 08:21:11 UTC

Our current Sandesha implementation doesn't support this kind of
solution - yet.

Firstly, we don't have a separate process that can be run simply to
deliver and manage the messaging interaction. So - for example - you run
wsclient to deliver a message. Suppose the endpoint isn't up. wsclient
doesn't run as a daemon. And if wsclient kept running, I don't think you
could start a new wsclient and expect everything to run sweetly.

However, I think if you had a local engine running as an intermediary,
the wsclient could deliver to that - get the acks, and then the engine
could keep running to deliver the messages.

Secondly, I don't think we've really thought about the long-running case
properly with Sandesha. For example, we don't have a mode where we will
back off to a once a day attempt if the server is down for more than 12
hours. Similarly we don't have any console or way of seeing what the
status of messages is. We don't have a simple logger that only logs the
delivery status of each message.

Basically, our messaging agent is designed to be run as a handler inside
a SOAP engine and not as a standalone messaging engine. However, that
isn't to say that we can't morph it to be that. I actually think there
is a lot of mileage in the multi-process design that James has proposed.
I think however, that we could still implement the multi-process design
while re-using much of the web services code we already have.

However, I think that the concept - of selling a lightweight, Unix
friendly, messaging engine has a lot of potential. In fact, it could fit
the Presto case much better than what we are selling today.

Paul

--
Paul Fremantle
VP/Technology and Partnerships, WSO2
OASIS WS-RX TC Co-chair

http://bloglines.com/blog/paulfremantle
***@wso2.com
(646) 290 8050

"Oxygenating the Web Service Platform", www.wso2.com

Sanjiva Weerawarana

2007-03-27 05:12:03 UTC

Post by Paul Fremantle
Our current Sandesha implementation doesn't support this kind of
solution - yet.

Have you made these suggestions on sandesha-dev? If not why not??

Post by Paul Fremantle
Firstly, we don't have a separate process that can be run simply to
deliver and manage the messaging interaction. So - for example - you run
wsclient to deliver a message. Suppose the endpoint isn't up. wsclient
doesn't run as a daemon. And if wsclient kept running, I don't think you
could start a new wsclient and expect everything to run sweetly.

Of course not Paul. But that's a *trivial* thing to do if that's what we
want to do .. we can run Sandesha as an endpoint manager by having a
transport that drops messages to a queue (database, filesystem, whatever)
and have another process pick them up and deliver them reliably.

Why have we not done it? No one asked for it yet.

Post by Paul Fremantle
However, I think if you had a local engine running as an intermediary,
the wsclient could deliver to that - get the acks, and then the engine
could keep running to deliver the messages.

Piece of cake.

Post by Paul Fremantle
Secondly, I don't think we've really thought about the long-running case
properly with Sandesha. For example, we don't have a mode where we will
back off to a once a day attempt if the server is down for more than 12

Come on; is that a serious problem for us to implement an exponential
backoff on the sending logic?? It's trivial.

Just because we haven't done a specific thing is not a reason to dismiss
the whole thing. Let's not throw the baby out with the bathwater.

Post by Paul Fremantle
hours. Similarly we don't have any console or way of seeing what the
status of messages is. We don't have a simple logger that only logs the
delivery status of each message.

Again, piece of cake. We have an API for checking status of Sandesha- all
we'd need is a command line tool to drive that API. What's hard about it??

Post by Paul Fremantle
Basically, our messaging agent is designed to be run as a handler inside
a SOAP engine and not as a standalone messaging engine. However, that
isn't to say that we can't morph it to be that. I actually think there
is a lot of mileage in the multi-process design that James has proposed.
I think however, that we could still implement the multi-process design
while re-using much of the web services code we already have.

+1. All it takes is a few custom transports to decouple the pieces into
multiple processes.

Post by Paul Fremantle
However, I think that the concept - of selling a lightweight, Unix
friendly, messaging engine has a lot of potential.

+1.

Post by Paul Fremantle
In fact, it could fit
the Presto case much better than what we are selling today.

How does that reconcile with doing that work with our current partner?
Let's take this to a non-public forum as we can't discuss customer stuff here.

Sanjiva.

Paul Fremantle

2007-03-27 07:27:58 UTC

My responses inline

Post by Sanjiva Weerawarana

Post by Paul Fremantle
Our current Sandesha implementation doesn't support this kind of
solution - yet.

Have you made these suggestions on sandesha-dev? If not why not??

Because this hasn't been the target for Sandesha. I'm not saying that
this approach shouldn't be a target for Sandesha. On the other hand you
could see this as an extension that goes beyond the aims of Sandesha.

Post by Sanjiva Weerawarana

Post by Paul Fremantle
Firstly, we don't have a separate process that can be run simply to
deliver and manage the messaging interaction. So - for example - you
run wsclient to deliver a message. Suppose the endpoint isn't up.
wsclient doesn't run as a daemon. And if wsclient kept running, I
don't think you could start a new wsclient and expect everything to
run sweetly.

Of course not Paul. But that's a *trivial* thing to do if that's what we
want to do .. we can run Sandesha as an endpoint manager by having a
transport that drops messages to a queue (database, filesystem,
whatever) and have another process pick them up and deliver them reliably.

I never said it wasn't trivial.

Post by Sanjiva Weerawarana
Why have we not done it? No one asked for it yet.

Er. Yes. That was the whole point of my note. I'm not clear why you are
arguing with me.

Post by Sanjiva Weerawarana

Piece of cake.

+1. I agree with you we can do this simply and we should.

Post by Sanjiva Weerawarana

Post by Paul Fremantle
Secondly, I don't think we've really thought about the long-running
case properly with Sandesha. For example, we don't have a mode where
we will back off to a once a day attempt if the server is down for
more than 12

Come on; is that a serious problem for us to implement an exponential
backoff on the sending logic?? It's trivial.

We have exponential backoff. I think we need something a little more
configurable to support real-life scenarios. Again I didn't say it
wasn't trivial to implement.
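
A minimal sketch of what a "more configurable" retry schedule could look like: exponential backoff up to a cap, then dropping to one attempt a day once the endpoint has been down for more than 12 hours. The function name, parameters and default values are assumptions for illustration and are not taken from Sandesha.

```python
import datetime as dt


def next_retry_delay(attempt, down_for,
                     base=dt.timedelta(seconds=2),
                     cap=dt.timedelta(hours=1),
                     slow_after=dt.timedelta(hours=12),
                     slow_interval=dt.timedelta(days=1)):
    """Return how long to wait before the next delivery attempt.

    attempt  -- number of failed attempts so far (0 for the first retry)
    down_for -- how long the endpoint has been unreachable (timedelta)

    All names and defaults here are hypothetical.
    """
    # Long outage: stop hammering the endpoint and retry once per interval.
    if down_for > slow_after:
        return slow_interval
    # Otherwise double the delay on every failure, up to the cap.
    # The exponent is clamped so the multiplication cannot overflow.
    delay = base * (2 ** min(attempt, 32))
    return min(delay, cap)
```

Keeping the schedule in one pure function means the thresholds can come from a configuration file, and the policy can be checked without a network.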

Post by Sanjiva Weerawarana
Just because we haven't done a specific thing is not a reason to dismiss
the whole thing. Let's not throw the baby out with the bathwater.

You are reading something into my note that I didn't write. I never
suggested dismissing Sandesha or our code.

Where on earth did you get the idea that I'm suggesting throwing out
*any code* at all? My note was simply highlighting areas we need to
develop from our current codebase to meet the proposed use-case.

Post by Sanjiva Weerawarana

Post by Paul Fremantle
Our current Sandesha implementation doesn't support this kind of
solution - yet.

The word "yet" implies that it can be made to do that.

Post by Sanjiva Weerawarana

Post by Paul Fremantle
However, that isn't to say that we can't morph it to be that.

Post by Sanjiva Weerawarana

Post by Paul Fremantle
hours. Similarly we don't have any console or way of seeing what the
status of messages is. We don't have a simple logger that only logs
the delivery status of each message.

Again, piece of cake. We have an API for checking status of Sandesha-
all we'd need is a command line tool to drive that API. What's hard
about it??

Again I didn't say it was hard, I merely said we hadn't done this so far.
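
As a rough illustration of the "simple logger that only logs the delivery status of each message", here is a sketch using Python's standard logging module. It does not use the Sandesha status API mentioned above; the logger name, the message ids and the status strings are all hypothetical.

```python
import logging


def make_status_logger(stream):
    """Return a logger that writes one line per delivery-status change."""
    logger = logging.getLogger("delivery-status")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep status lines out of other logs
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.handlers = [handler]
    return logger


def log_status(logger, message_id, status):
    """Record a single status change, e.g. QUEUED, RETRYING, DELIVERED."""
    logger.info("%s %s", message_id, status)


def latest_status(lines):
    """Map each message id to its most recent status.

    This is the kind of summary a console or command-line tool could print.
    """
    latest = {}
    for line in lines:
        if not line.strip():
            continue
        # The timestamp contains a space, so split from the right.
        _, message_id, status = line.rsplit(None, 2)
        latest[message_id] = status
    return latest
```

A command-line status tool would then only need to read the log file and print the result of latest_status.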

Post by Sanjiva Weerawarana

Post by Paul Fremantle
Basically, our messaging agent is designed to be run as a handler
inside a SOAP engine and not as a standalone messaging engine.
However, that isn't to say that we can't morph it to be that. I
actually think there is a lot of mileage in the multi-process design
that James has proposed. I think however, that we could still
implement the multi-process design while re-using much of the web
services code we already have.

+1. All it takes is a few custom transports to decouple the pieces into
multiple processes.

Post by Paul Fremantle
However, I think that the concept - of selling a lightweight, Unix
friendly, messaging engine has a lot of potential.

+1.

Post by Paul Fremantle
In fact, it could fit
the Presto case much better than what we are selling today.

How does that reconcile with doing that work with our current partner?

I think we should implement the first prototype of this in PHP thereby
making it available with a web UI to them.

Paul

Sanjiva Weerawarana

2007-03-27 07:52:13 UTC

Sorry Paul I was over-reacting in my earlier response :).

Post by Paul Fremantle

Post by Sanjiva Weerawarana

Post by Paul Fremantle
Our current Sandesha implementation doesn't support this kind of
solution - yet.

The word "yet" implies that it can be made to do that.

Post by Sanjiva Weerawarana

Post by Paul Fremantle
However, that isn't to say that we can't morph it to be that.

Yes I agree .. sorry!

OK I think we're mostly in agreement.

Post by Paul Fremantle

Post by Sanjiva Weerawarana

Post by Paul Fremantle
In fact, it could fit
the Presto case much better than what we are selling today.

How does that reconcile with doing that work with our current partner?

I think we should implement the first prototype of this in PHP thereby
making it available with a web UI to them.

OK sounds good but can it be done in the timing constraints we have? Maybe
we should offer the current solution (as that should do what they want)
and work on this in parallel.

Sanjiva.

Chamikara Jayalath

2007-03-28 02:33:00 UTC

Hi All,

(For a possible Telegon/Java implementation :-) )
Yes, these features are trivial to implement. If needed I'll start
working on them right after the 1.2 release.

Chamikara

James Clark

2007-03-26 09:57:58 UTC

Post by Sanjiva Weerawarana
It seems to me it's key to figure out who the audience is. If it's for
"users" to move files around, then we shouldn't expose things like
transport selection, endpoint selection etc. to them (as much as
possible). This feels like a solution that we'd be doing over our stuff
(like the identity solution): WSO2 Secure Messaging Solution. (BTW this, I
believe, is what we're doing with Zend in France.)
If the target is for developers to use to exchange data between machines
reliably then the UI would be different but not by much. Again I still see
it as a simple Python script that uses wsclient underneath. The server side
would be another Python script that uses wsf/c with a thttpd type embedded
transport.

The main product would be a command-line program. Like most
properly-designed Unix command-line tools, it would be usable both by
end-users from the command-line and by developers from scripts or C
programs. Think of something like the "cvs" program or the client part
of the "sendmail" program.

You could do a variety of GUIs (GNOME, Web), but in the Linux world
these are frosting. As an alternative interface for programs, you could
also do a very thin C library that would talk some simple protocol to a
Unix socket. But again, this is frosting. A solid, command-line
interface is the main thing, and plenty to start with.

I see the key point about the positioning is that it's something
designed to appeal to the mainstream Linux, open source hacker world,
and in that world, I don't see the user/developer dichotomy that you do.

Post by Sanjiva Weerawarana
First of all, this would be a straightforward set of Python scripts that
use wsclient underneath. Obviously that won't support adding transports
with Python etc. but it'll solve the business problem: provide a tool to
move data from one machine to another. The intermediary would be a
straightforward configuration of the ESB. Are you actually suggesting that
we re-write SOAP/WS-Sec*/RM etc. all in Python?? I must be mis-reading it.

Sanjiva Weerawarana

2007-03-27 04:41:07 UTC

Post by James Clark
The main product would be a command-line program. Like most
properly-designed Unix command-line tools, it would be usable both by
end-users from the command-line and by developers from scripts or C
programs. Think of something like the "cvs" program or the client part
of the "sendmail" program.

OK understood.

Post by James Clark
You could do a variety of GUIs (GNOME, Web), but in the Linux world
these are frosting. As an alternative interface for programs, you could
also do a very thin C library that would talk some simple protocol to a
Unix socket. But again, this is frosting. A solid, command-line
interface is the main thing, and plenty to start with.

+1.

Post by James Clark
I see the key point about the positioning is that it's something
designed to appeal to the mainstream Linux, open source hacker world,
and in that world, I don't see the user/developer dichotomy that you do.

I'm fine with that positioning. However, the key point is that it's a
*utility tool* that we're offering. A command that does something useful.
Do you agree?

Yes I'm setting up for my argument below ;-).

James Clark

2007-03-27 07:24:15 UTC

Post by Sanjiva Weerawarana
However, the key point is that it's a
*utility tool* that we're offering. A command that does something useful.
Do you agree?

Yes.

Post by Sanjiva Weerawarana

Post by James Clark
Your suggested architecture is not anything like what I think is a
reasonable architecture for this application. The right architecture in
my view would be similar to an MTA (qmail or postfix). Here's a good

I'm familiar with MTAs.

But are you familiar with postfix and qmail? Different MTAs have very
different architectures. The architectural difference I'm talking about
is the difference between sendmail on the one hand and postfix or qmail
on the other.

Post by Sanjiva Weerawarana
I'm unclear why you think the current library
cannot be used to implement such an architecture though.

The Axis2 core is based on passing a message context between handlers
within a *single process*. Please explain to me how you would implement
a qmail/postfix-like architecture using this; I can't see how to do
that.

Post by Sanjiva Weerawarana

Post by James Clark
The main goal on the receiver side needs to be security. Unless we can
demonstrate a good security architecture, it won't become popular. (For
a start it won't get included in major distributions, which do a
security audit before they include new applications.) The key to a
high-security architecture for something like Telegon is
compartmentalizing it into independent processes, which are small and
simple and each have limited security permissions. This is not
something you can retrofit to Axis2/C. Indeed I don't think it's
possible to design a general purpose WS-* architecture that would give
the same level of security as an architecture optimized just to handle a
specific Telegon-like service.

OK so this is a general argument against anything written in C then?

No. That's not the point I'm trying to make. I'm saying that

a) a modern MTA-like architecture (like postfix or qmail) is the right
kind of architecture for Telegon (because of its superior security
qualities), but

b) it wouldn't be a suitable architecture for a general purpose WS-*
stack (message throughput too small, not a good match for HTTP)

However, it is of course true that using a high-level language rather
than C is good from a security point of view in that it eliminates a
whole class of potential security problems.

Post by Sanjiva Weerawarana

Post by James Clark
On the sender side, security is less of an issue. However, I don't
think a Python script on top of wsclient will do it. The problem is our
memory-based single process architecture, particularly as regards RM.
The command line client should add the message to a queue, so that it
ends up in a file somewhere, and then exit. All the work should be done
by a daemon that picks up messages from the queue and sends them on.
This should happen whether RM is being used or not. The RM
implementation would I believe be radically different, because it would
share the message queue infrastructure with the rest of the server. Even
here there are security issues because there should be a single daemon
to handle messages from all users.

This is a trivial thing you describe- a transport that drops the message
to a queue. No RM or anything involved .. "sending" amounts to saving the
message in the send queue and returning.

My point was that the RM implementation needs to share the message queue
infrastructure with the rest of the server. I think Paul explained some
of the issues here.

The queue as far as a modern MTA is concerned does not play the role of
the transport: it's the heart of the MTA that connects all the parts of
the MTA together.
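
To make the enqueue-then-exit idea concrete, here is a minimal sketch of a file-based send queue: the client writes the message to a spool directory and returns, and a separate daemon pass picks up whatever is queued. The directory layout (tmp/ and new/), the function names and the send callback are assumptions for illustration, not part of the Telegon proposal.

```python
import os
import uuid


def enqueue(spool_dir, payload):
    """Write payload (bytes) into the spool and return the queued file path.

    The message is written under tmp/ and then renamed into new/, so a
    daemon scanning new/ never sees a partially written file.
    """
    tmp_dir = os.path.join(spool_dir, "tmp")
    new_dir = os.path.join(spool_dir, "new")
    os.makedirs(tmp_dir, exist_ok=True)
    os.makedirs(new_dir, exist_ok=True)
    name = uuid.uuid4().hex  # unique file name per message
    tmp_path = os.path.join(tmp_dir, name)
    with open(tmp_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk first
    final_path = os.path.join(new_dir, name)
    os.rename(tmp_path, final_path)  # atomic on POSIX within one filesystem
    return final_path


def deliver_pending(spool_dir, send):
    """One daemon pass: try to send each queued message, delete on success.

    send is a hypothetical callback taking the payload and returning True
    when delivery succeeded. Failed messages stay queued for a later pass.
    """
    new_dir = os.path.join(spool_dir, "new")
    if not os.path.isdir(new_dir):
        return []
    delivered = []
    for name in sorted(os.listdir(new_dir)):
        path = os.path.join(new_dir, name)
        with open(path, "rb") as f:
            payload = f.read()
        if send(payload):
            os.remove(path)
            delivered.append(name)
    return delivered
```

Because the queue is just files on disk, the command-line client can exit immediately after enqueue returns, and a reliable-messaging layer could work from the same files instead of keeping its own in-memory state.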

Post by Sanjiva Weerawarana
People trust email. How many people do you think ever tried to understand
the mess that sendmail was/is? I'm not a Linux developer any more but when
I was I certainly didn't introspect the source of all the cool stuff I
used to use. Maybe that was not the norm!

The sendmail example is a good one. If you came out with a mailer with
sendmail-like architecture today, you would not find many takers.
Sendmail achieved popularity because for a long time it was the only
game in town. But it's had a long history of security problems. Newer
MTAs that have managed to achieve popularity (notably qmail and postfix)
and take some market share from sendmail have all done so in large
part on the strength of a proper security architecture. This isn't just
theory: postfix and qmail have had very few security problems compared
to sendmail.

Post by Sanjiva Weerawarana

Post by James Clark
Another key part of winning people over is overcoming the perception
that WS-* is an indivisible blob, that they have to take or leave as a
whole. In particular, it's important that people realize that they can
take advantage of the relatively small and simple parts related to
particular message instances (SOAP core, MTOM, WS-Security), without
buying into all the complexity and controversy surrounding the parts
that relate to types/contracts and message payloads (XSD, WSDL,
WS-Policy). It's hard to do that if your implementation is based on a
full WS-* stack.

Not at all! That's *EXACTLY* the design of Axis2 .. bring these in as
needed.
Many of the WS-* opponents are really arguing about the fact that there
are 100+ WS-<something> specs out there and it's not clear what really
matters vs. what's fringe. The core you identified (SOAP core (not
adjuncts) and MTOM) is what Axis2 does and Rampart does WS-Sec. I don't see
how Axis2 (C or Java) can be lumped into the "WS-* blob" category.

The problem is the Axis2 core. This very much has a WSDL world view, in
particular the two parallel Message/Operation/Service hierarchies, which
is at the heart of the architecture. This is going way beyond the part
of the WS-* stack that I think we need for Telegon, which corresponds
roughly to the functionality of Axiom plus some of Rampart (without the
dependencies of Rampart on Axis2 core). The other big problem with the
Axis2 core is that it has a fundamentally single-process Java-esque
architecture.

Post by Sanjiva Weerawarana
However, whether people will use the Telegon product or not is a
function of its usefulness and its stability.

I think you're missing a crucial factor here: security, or, more
precisely, perceived security. This is important for any network
application, but it's particularly important for Telegon:

- I envisage Telegon being used between organizations, which implies
that messages will go across the firewall.

- The reason for using Telegon rather than email is presumably that the
message is high-value.

- The sophisticated end-to-end security offered by WS-Security is one of
the main potential selling points of Telegon relative to competing
technologies (AMQP, REST, email). AMQP also has to balance the needs of
security against a requirement for high volume/throughput.

What drives how a product's security is perceived? People look at the
code, they look at the architecture and they look at the documentation.
That's not to say every user does it, but a conscientious system admin
will try to determine the net consensus about the quality of the
security of a new security-sensitive product, and that will ultimately
be driven by people who have had a detailed look at the implementation.
Important distributions (at least RedHat) also do a security audit on
any program that is a candidate to be included in the distribution.

Post by Sanjiva Weerawarana
If we can achieve that with
Axis2/C *and* if that helps get this done faster/cheaper why wouldn't we?

I think the fastest/cheapest way to get something that is
stable/reliable/secure up and running is to do a pure Python
implementation, without spending any time creating Python bindings to C
libraries. (There are already Python bindings to XML parsers, databases,
openssl: I don't think we could manage without those.) Spending time
creating Python bindings to Axis2/C at this stage is not in my view a
good idea: creating good bindings is a lot of work (just think how much
effort has been spent on the PHP stuff); these bindings would then have
to be kept up to date with the development of Axis2/C, which would be an
ongoing additional burden on Axis2/C development. Also introducing C
into the picture complicates a lot of things: debugging especially
becomes a nightmare; build and distribution are also made more
complicated.

More importantly, I believe it would be an order of magnitude easier for
somebody to convince themselves of the security of a self-contained
10,000 line Python program than of the security of a program that has
5,000 lines of Python calling out to a 100,000 line C library. I also
believe that we can have a better security architecture, with a strong
multi-process emphasis, if we are not constrained by the architecture of
Axis2, which was designed for the JVM environment, which is totally
different from a security perspective.

Such a pure Python implementation would almost certainly have
unimpressive performance, but my guess is that the performance would be
good enough for people to experiment with. However, at this point I
think we shouldn't worry too much about performance, except to ensure
that the performance critical parts are well isolated in modules that
can be replaced by C implementations. It is very likely that we would
need to evolve the design and implementation of Telegon substantially
over a period of time based on our experience and feedback from others.
Having all the code in Python would make this much easier.
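
One common Python idiom for keeping performance critical parts "well isolated in modules that can be replaced by C implementations" is to try importing a C extension and fall back to a pure Python function with the same name. The sketch below assumes a hypothetical extension module _telegon_speedups and a hypothetical escape_text function; neither exists in any real package.

```python
try:
    # Hypothetical C extension; used only if it happens to be installed.
    from _telegon_speedups import escape_text
except ImportError:
    def escape_text(text):
        """Pure-Python fallback: escape XML character data."""
        # Replace "&" first so the entities added below are not re-escaped.
        return (text.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;"))
```

Callers import escape_text from this one module and never need to know which implementation they got, so a C version can be added later without touching the rest of the code.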

When (and if) we get to the point that we feel confident that our design
is right and we have something for which there's a real demand out
there, we can work on replacing the performance critical modules by C
implementations. That might involve Axiom or it might be based on the
XML stuff I was working on or it might be something written specifically
for this.

James

Sanjiva Weerawarana

2007-03-27 08:27:38 UTC

Post by James Clark

Post by Sanjiva Weerawarana
I'm familiar with MTAs.

I wasn't familiar with qmail itself but just read thru that paper you
pointed to. Very nice for sure.

Post by James Clark

Post by Sanjiva Weerawarana
I'm unclear why you think the current library
cannot be used to implement such an architecture though.

That's essentially how Sandesha supports shutdown and restart right now:
by writing its state out to a database. So if you wanted to say do the
security validation in a separate process, then we'd write a transport
that picks up the message from the incoming message queue (instead of a
socket), runs the security code only and then another transport that
writes it out to a separate queue representing security cleared messages.

The point is, if the processing steps/stages can be separated, then it can
be done in Axis2. In fact the same modules would work .. the difference is
how you'd configure Axis2: you'd have only the specific module with custom
transports to get the message in and dump it out after the module
completes. Even things like message cleanup, which is currently done with
a thread, can be done by a separate process.

Does that make sense?
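
A rough sketch of the stage described above, assuming each queue is just
a directory of message files on one filesystem. The directory roles and
the check callback are illustrative assumptions, not Axis2 or Sandesha
APIs.

```python
import os
from pathlib import Path
from typing import Callable


def run_stage(incoming: Path, cleared: Path, rejected: Path,
              check: Callable[[bytes], bool]) -> int:
    """Run one pass of a single processing stage.

    Each message file in `incoming` is checked, then moved to `cleared`
    or `rejected`. Returns the number of messages handled.
    """
    handled = 0
    # sorted() takes a snapshot, so moving files while looping is safe.
    for path in sorted(incoming.iterdir()):
        if not path.is_file():
            continue
        ok = check(path.read_bytes())
        target = (cleared if ok else rejected) / path.name
        # Atomic when source and target are on the same filesystem.
        os.replace(path, target)
        handled += 1
    return handled
```

A security stage would pass its validation routine as check; the next
stage would then use the cleared directory as its own incoming queue.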

Post by James Clark
No. That's not the point I'm trying to make. I'm saying that
a) a modern MTA-like architecture (like postfix or qmail) is the right
kind of architecture for Telegon (because of its superior security
qualities), but

I'm +1 with a multi-process arch being the way to do this.

Post by James Clark
b) it wouldn't be a suitable architecture for a general purpose WS-*
stack (message throughput too small, not a good match for HTTP)

OK.

Post by James Clark
However, it is of course true that using a high-level language rather
than C is good from a security point of view in that it eliminates a
whole class of potential security problems.

That's of course true. The reason we wrote in C however is to enable
direct binding to multiple languages and because we didn't think (and I
don't think you're suggesting this either) that re-writing the entire
stack in Python, Perl, PHP etc. makes sense.

Post by James Clark
My point was that the RM implementation needs to share the message queue
infrastructure with the rest of the server. I think Paul explained some
of the issues here.

I don't see any issues .. let's see whether you agree after reading my
explanation above.

Post by James Clark
The queue as far as a modern MTA is concerned does not play the role of
the transport: it's the heart of the MTA that connects all the parts of
the MTA together.

Agreed- it effectively plays the role that the handler chain plays now
except with more reliability: store and forward between stages of processing.

Post by James Clark
The problem is the Axis2 core. This very much has a WSDL world view, in
particular the two parallel Message/Operation/Service hierarchies, which
is at the heart of the architecture. This is going way beyond the part
of WS-* stack that I think we need for Telegon, which corresponds
roughly to the functionality of Axiom plus some of Rampart (without the
dependencies of Rampart on Axis2 core).

I'm not sure how much of the context stuff and MEP stuff you can skip: the
RM and secure conversation combination creates some complex state handling
requirements.

You of course probably don't need the XSDs but Axis2 doesn't either
(Synapse for example mediates messages without having the XSDs around).
I'm pretty sure you'll need the operation concept (that is essentially the
representation of a MEP). Service may be optional but even that I'm not
sure because to do secure conversation you'll need to remember some
context that spans interactions.

Post by James Clark
The other big problem with the
Axis2 core is that it has a fundamentally single-process Java-esque
architecture.

This is what I covered earlier- I think this can be dealt with.

Post by James Clark
I think you're missing a crucial factor here: security, or, more
precisely, perceived security. This is important for any network
- I envisage Telegon being used between organizations, which implies
that messages will go across the firewall.
- The reason for using Telegon rather than email is presumably that the
message is high-value.
- The sophisticated end-to-end security offered by WS-Security is one of
the main potential selling points of Telegon relative to competing
technologies (AMQP, REST, email). AMQP also has to balance the needs of
security against a requirement for high volume/throughput.

These are exactly the value propositions of WS-* as well: end-to-end
reliable, secure and transactional delivery of messages. What you're
talking about is delivering those benefits to users as a simple to use
application.

Post by James Clark
What drives how a product's security is perceived? People look at the
code, they look at the architecture and they look at the documentation.
That's not to say every user does it, but a conscientious system admin
will try to determine the net consensus about the quality of the
security of a new security-sensitive product, and that will ultimately
be driven by people who have had a detailed look at the implementation.
Important distributions (at least RedHat) also do a security audit on
any program that is a candidate to be included in the distribution.

OK this point I can't argue with :).

Post by James Clark
I think the fastest/cheapest way to get something that is
stable/reliable/secure up and running is to do a pure Python
implementation, without spending any time creating Python bindings to C
libraries. (There are already Python bindings to XML parsers, databases,
openssl: I don't think we could manage without those.) Spending time
creating Python bindings to Axis2/C at this stage is not in my view a
good idea: creating good bindings is a lot of work (just think how much
effort has been spent on the PHP stuff); these bindings would then have
to be kept up to date with the development of Axis2/C, which would be an
ongoing additional burden on Axis2/C development. Also introducing C
into the picture complicates a lot of things: debugging especially
becomes a nightmare; build and distribution are also made more
complicated.

OK this is again a general point- whether the right way to bring WS-*
features to a scripting language is the way we're doing it (C impl being
bound to each language manually) or with a custom impl for each language.
I'm not convinced that doing the custom approach is scalable but I don't
disagree that for a fixed set of messages it would be faster to write a
custom impl (in python or whatever) rather than write a bunch of C code.

Post by James Clark
More importantly, I believe it would be an order of magnitude easier for
somebody to convince themselves of the security of a self-contained
10,000 line Python program than of the security of a program that has
5,000 lines of Python calling out to a 100,000 line C library. I also
believe that we can have a better security architecture, with a strong
multi-process emphasis, if we are not constrained by the architecture of
Axis2, which was designed for the JVM environment, which is totally
different from a security perspective.

Axis2 is a library. It can be deployed/packaged/used in multiple ways. The
default packaging we do is a single process packaging .. that doesn't mean
that's the only way to do it however.

Let me give an example. Steve Loughran (a guy from HP) gave up on Axis2
because he doesn't like the aar packaging model. See:
http://www.1060.org/blogxter/entry?publicid=B7BD2FFDD2C4C2647BE11D2B1DCE6376

But he's *DEAD* wrong. Axis2 does not *mandate* the aar deployment model;
that's just the default way of doing things. If someone wants to change
that and deploy classes from the normal class loader they can.

Similarly, just because in WSAS we package Axis2, Sandesha, Rampart etc.
all into one process doesn't mean that's the only way to do it. In fact we
already have an example of a multiprocess architecture: drop Synapse/ESB
in front of Axis2/WSAS and let the front guy terminate RM etc. and forward
the messages. So it's entirely doable because of how flexible Axis2's
architecture and deployment model are.

Synapse is actually a good example of how you can use Axis2 as a library
and not be bothered by its deployment model and so on.

Post by James Clark
Such a pure Python implementation would almost certainly have
unimpressive performance, but my guess is that the performance would be
good enough for people to experiment with. However, at this point I
think we shouldn't worry too much about performance, except to ensure
that the performance critical parts are well isolated in modules that
can be replaced by C implementations. It is very likely that we would
need to evolve the design and implementation of Telegon substantially
over a period of time based on our experience and feedback from others.
Having all the code in Python would make this much easier.

No argument there .. C isn't by any means the easiest language for many
things.

Post by James Clark
When (and if) we get to the point that we feel confident that our design
is right and we have something for which there's a real demand out
there, we can work on replacing the performance critical modules by C
implementations. That might involve Axiom or it might be based on the
XML stuff I was working on or it might be something written specifically
for this.

I don't want to bet you to a race but I'd be surprised if you could write
this whole thing in Python faster than I can package up Axis2 (Java, in my
case) bits into multiple processes ;-).

No no that's not a challenge!

Sanjiva.

--
Sanjiva Weerawarana, Ph.D.
Founder, Chairman & CEO; WSO2, Inc.; http://www.wso2.com/
email: ***@wso2.com; cell: +94 77 787 6880; fax: +1 509 691 2000

"Oxygenating the Web Service Platform."

James Clark

2007-03-27 10:21:40 UTC

Post by Sanjiva Weerawarana
I don't want to bet you to a race but I'd be surprised if you could write
this whole thing in Python faster than I can package up Axis2 (Java, in my
case) bits into multiple processes ;-).

You've convinced me that you could package up Axis2/Java in a
multi-process architecture, and I'm sure that you could do it faster than I
could do the whole thing in Python. However, I suspect that JVM startup
time and process size would make this not a viable option.

An option that might work is to use gcj to compile it, being careful to
include just the bits you need. This would also eliminate a runtime
requirement for JRE. That actually might be quite a workable approach.
(Has anybody tried compiling Axis2 with gcj?)

But for Axis2/C I think the balance is very different: first, the very
nature of C makes the kind of repackaging we're talking about much, much
harder; second, the Java stack is much more mature than the C stack;
third, Java protects you from low-level memory problems just as Python
does, whereas C doesn't.

In addition, I believe I would eventually be able to get a better
quality implementation by going with the approach of using all Python at
first with some C added as needed. There are lots of clever things in
the qmail/postfix approach: one of them is the way that they achieve the
reliability and transactional semantics of a database without the
overhead of actually using a database. This is important both for
performance and for avoiding a dependency on a heavy-duty database. (A
dependency on SQLite would be fine, but it has poor performance for
concurrent access by multiple processes.) The cleverness is in having
an extremely carefully crafted scheme of using the filesystem (in terms
of system calls, directories, choice of filenames). The scheme also
ensures that they don't have to repeatedly do a lot of IO on the message
body; mostly processes just need to look at the message header.
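
A small sketch in the spirit of that idea, not a reproduction of the
actual qmail or postfix queue layout. It assumes a queue directory with
tmp, body and new subdirectories on one filesystem; all names here are
made up for illustration. The body is written once, and the header only
becomes visible to other processes through an atomic rename.

```python
import os
import uuid


def _write_synced(path: str, data: bytes) -> None:
    """Write data and force it to disk before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def enqueue(queue_dir: str, header: bytes, body: bytes) -> str:
    """Add a message to the queue and return its id.

    The rename into new/ is the commit point: a crash before it leaves
    only garbage in tmp/ and body/, never a half-visible message.
    (A stricter version would also fsync the directories.)
    """
    for sub in ("tmp", "body", "new"):
        os.makedirs(os.path.join(queue_dir, sub), exist_ok=True)
    msg_id = uuid.uuid4().hex
    _write_synced(os.path.join(queue_dir, "body", msg_id), body)
    tmp_path = os.path.join(queue_dir, "tmp", msg_id)
    _write_synced(tmp_path, header)
    os.rename(tmp_path, os.path.join(queue_dir, "new", msg_id))
    return msg_id


def pending(queue_dir: str) -> list:
    """List ids of committed messages without touching any body."""
    return sorted(os.listdir(os.path.join(queue_dir, "new")))


def read_header(queue_dir: str, msg_id: str) -> bytes:
    """Read only the (small) header of a committed message."""
    with open(os.path.join(queue_dir, "new", msg_id), "rb") as f:
        return f.read()
```

Keeping the header in its own small file is one way to let most
processes avoid repeated IO on a large message body.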

I'm fairly confident that with sufficient thought a similar kind of
scheme could be devised for Telegon. I suspect several aspects might be
very specialized to the needs of Telegon: for example, it will be very
common in Telegon for the bulk of the Body to be a single chunk of
base64 data serialized using MTOM, so we may well want to have some
special handling for that.

This is a very different approach from what Sandesha is doing at the
moment: it relies on the database to do the heavy lifting; the handling
of each message is wrapped in a transaction, and the entire
MessageContext state is serialized to the database. This sort of
approach makes for easy implementation, but if you have each handler
doing this kind of thing, you're going to be stuck with really, really
bad performance (and it will not be fixable by rewriting bits in C).

Another issue is state. As you mentioned, we need to be able to handle
complex state: I don't need any convincing that the combination of RM
and WS-Security and WS-SecureConversation has complex state. State that
is scoped to a message can be handled simply by including this in the
on-disk queue format. However, as you pointed out, there will be state
that has a broader scope than a single message. This represents a
challenge for a multi-process architecture. You can't just pass around
a pointer to it as you can do in a single-process architecture. Also
you don't want to be forced to keep this shared state on disk and
serialize/deserialize it all the time. Now you could handle this by
having Axis2 use some general purpose clustering/shared state system
(like you are doing with WSAS clustering), but I think this is far too
heavyweight for Telegon.

Postfix deals with this issue by having a number of separate processes
that are responsible for maintaining this shared state and whose
lifetime is matched to the lifetime of the state they maintain; a
process handling an individual message can then communicate with the
process maintaining the shared state to retrieve and update the state as
necessary. That's the kind of approach I think we need in Telegon, and
it's more than a trivial detail: each process boundary is potentially
a trust boundary, with the processes on each side of the boundary having
different security properties; there needs to be a defined protocol for
them to communicate and neither process can blindly trust what it's sent
by the other process.
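
To make the "defined protocol, no blind trust" point concrete, here is a
hedged sketch of the request handler a state-keeping process might run.
The JSON message format, the get/put operations and the size limit are
all assumptions made for illustration; the transport between processes
(pipe, Unix socket) is left out, and this is not how Postfix itself
encodes its requests.

```python
import json

MAX_REQUEST_BYTES = 4096          # assumed limit for this sketch
ALLOWED_OPS = ("get", "put")      # the whole protocol: two operations


def handle_request(state: dict, raw: bytes) -> bytes:
    """Validate one untrusted request and apply it to the shared state."""

    def reply(ok: bool, **fields) -> bytes:
        return json.dumps({"ok": ok, **fields}).encode("utf-8")

    if len(raw) > MAX_REQUEST_BYTES:
        return reply(False, error="request too large")
    try:
        req = json.loads(raw.decode("utf-8"))
    except (ValueError, RecursionError):
        # Covers invalid UTF-8, invalid JSON and absurdly deep nesting.
        return reply(False, error="malformed request")
    if not isinstance(req, dict) or req.get("op") not in ALLOWED_OPS:
        return reply(False, error="unknown operation")
    key = req.get("key")
    if not isinstance(key, str) or not key:
        return reply(False, error="bad key")
    if req["op"] == "get":
        return reply(True, value=state.get(key))
    value = req.get("value")
    if not isinstance(value, str):
        return reply(False, error="bad value")
    state[key] = value
    return reply(True)
```

The per-message process would validate the reply in the same way, since
the distrust is meant to hold in both directions.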

James

Sanjiva Weerawarana

2007-03-27 08:31:51 UTC

EVERYONE: If you can spare a bit of time please participate in this
discussion. I (in particular) am making wild claims about how Axis2 can be
bundled and packaged and it'll be good to get someone more in touch with
reality to set me straight ;-).

Also, I know some of you guys have lots of experience with MTAs etc. - so
please comment on overall approach/design etc. in relation to that as the
core of this is trying to be a mail-like but better-than-mail system for
moving files from A to B.

Thanks,

Sanjiva.

--
Sanjiva Weerawarana, Ph.D.
Founder, Chairman & CEO; WSO2, Inc.; http://www.wso2.com/
email: ***@wso2.com; cell: +94 77 787 6880; fax: +1 509 691 2000

"Oxygenating the Web Service Platform."