BizTalk Server Orchestration - Convoy message limits


Author Convoy message limits
Hugo Rodger-Brown

2004-11-19, 7:46 am

Has anybody got any experience of working with very large convoy scenarios?
I'm working on a process that requires messages to be delivered via a batch
file containing 50,000 records. Each record is small (comma separated, < 10
fields), and the xml messages that are used to generate the records are not
large either.

Is a convoy / aggregation pattern suitable for such large batches, and if
not what are the potential pitfalls?

(The added issue is that of performing a send pipeline transformation to
convert from xml to flat-file on such a large document.)

I'd love to hear from anyone with similar experience - I've only worked with
convoys of < 100 messages before.

Hugo
http://hugo.rodger-brown.com


Stephen W. Thomas

2004-11-19, 5:48 pm

Hello.

On my current project we looked at debatching the inbound flat files into
single messages (using a loop inside an Orchestration so we could better
handle errors and map as single messages) and processing them in a convoy in
another Orchestration.

I ran tests with, let me remember, 5,000 messages maybe on a multiple server
BizTalk configuration. It really overloaded the system because the messages
were being debatched much faster than they were being processed by the
convoy.

I have a sample on my blog
(http://www.geekswithblogs.net/sthom...8/23/10066.aspx) that
uses a win form to submit any number of messages to a convoy. It can be
easily modified to run thousands of messages if you wanted. We even looked
at using parallel branches
(http://www.geekswithblogs.net/sthom...0/05/12216.aspx) to
process more convoy messages at once, but this actually seems to slow things
down even more.

Anyway, we found that convoys performed much worse than handling the flat
message as a whole file - but we have not run any files larger than 25 MB (500
records) through the system. I think this was because of all the overhead
of tracking/creating the single messages.

We ended up scrapping the convoy approach and did the work on the large flat
file message inside the initial debatching orchestration by looping over the
nodes in the Xml document. The design looked great on paper, but the
implementation just did not perform well. Using the single Orchestration
approach, we were still able to process each message as a single message and
in order. We are in the testing phase now and this seems to have been the
right call.

It sounds like you are working on a similar project to what I am working on.
If you have additional questions or thoughts feel free to contact me
directly. I am always curious about how other people are using convoys.

Stephen W. Thomas
http://www.geekswithblogs.net/sthomas

Stephen W. Thomas

2004-11-19, 5:48 pm

Sorry, I think I misread your post. I thought you wanted to debatch using
a convoy.

We did batching as well. We did not use convoys for this process but rather
used a database to store the real time messages until batching time. At that
point, we ran an Orchestration that extracted the data, mapped it, and sent
it. We were not able to write messages out to a waiting flat file because we
needed to build a header and trailer at the end for the batch process.

Are you receiving messages in real time? If so, your Orchestration would
probably be dehydrating / rehydrating all the time. As the day went on, this
would get slower and slower as the overall memory footprint of the
Orchestration increased.

Stephen W. Thomas
http://www.geekswithblogs.net/sthomas

Lars W. Andersen

2004-11-21, 5:47 pm

Stephen and Hugo,

I have just finished implementing a number of convoy orchestrations to give
ordered delivery to an interface where we receive messages both in batch and
realtime.

The dehydration and rehydration of messages with the orchestration is
recognized as a bug, and I (and a couple of other customers) am currently
testing a fix for this.

The increased memory footprint of the orchestration that Stephen is talking
about MIGHT be due to another orchestration bug: subscriptions for messages
with Delivery Notifications are not being deleted, which causes the state to
grow and makes rehydration/dehydration very "expensive". We are currently
testing a fix for that too. I am not sure that this is the cause of higher
memory usage - I haven't had that as a problem - but it sure ate all of the
CPU on our high end BizTalk servers :-(

Contact PSS for a hotfix for the issues if you experience them.

But even with these fixes our convoys seem to be quite slow. With the
MQSeries adapter I am only able to process one message per second when it is
a convoy orch. What is your performance like?

regards
Lars

Hugo Rodger-Brown

2004-11-22, 7:46 am

We're looking at a sustained load of about 2 msgs / second, over a week long
period. Messages are processed by an orchestration, which consumes a web
service that does some processing on the messages, then must be batched into
flat-files containing 50,000 records at a time. There will be up to 200
orchestration instances (convoys) running in parallel (this is a pilot - the
full implementation will be ~4,000).

It sounds, from your experience, as if convoys are not going to be up to the
job, which is a great shame; I've tested the xml->flat-file map in a custom
pipeline, and its performance seems more than adequate.

I'm going to spend today testing the convoy design, and will report back to
the group.

Hugo


Stephen W. Thomas

2004-11-22, 5:48 pm

As far as performance goes, I tested mostly with the File Adapter. I was
getting about 4 messages per second if I remember correctly on a regular
sequential convoy on a desktop. This took in a single message, mapped it,
and sent it out again. I tried to create parallel branches to process more
messages faster, but that actually slowed things down.

I was not able to get a convoy to work with the MSMQt adapter, but I think
that was due to some other issues (long story). I think one of our other
developers did get a convoy to work using MSMQt, because I remember him
talking about the poor performance as well. It was just like you said: about
one message per second. I have not looked at the general performance of
MSMQt, but it uses convoys under the covers, so maybe this will impact it
if it is used in another convoy?

Hugo, how do you intend to store your new inbound messages that arrive
inside your Orchestration? Do you intend to just append to an existing
message or variable inside the Orchestration? How long does it take for the
Web Service to return results?

Stephen W. Thomas
http://www.geekswithblogs.net/sthomas

Hugo Rodger-Brown

2004-11-22, 5:48 pm

OK - I've just run 10,000 messages through a convoy, and it killed our
development machine. Using the FILE receive adapter we were getting about 10
msgs/sec received, but then the process running the orchestration (which is
doing the subscription matching?) choked the processor at 100% for about an
hour (after which I returned from lunch and killed it ;-))

I also got some very inconsistent results - putting 1,000 files through a
convoy that looked for 100 messages produced 7 output files (with the 100
messages) rather than the expected 10, which suggests that 200+ msgs got
lost (no sign of them in HAT?)

Those two facts combined have made it impossible for us to go into
production with this (in addition to the subscription matching issue I
mentioned in a previous post.)

We're now going to split the orchestration into two. The first calls the web
service, and then stores the results in a SQL table. We then have a SQL
receive location that calls a sproc that returns the count of messages
waiting, and the timestamp of the last time we retrieved messages.
Our second orchestration uses this receive to manage the batching. When the
count > 50,000, or the timestamp expires (replicating the Loop / Listen
conditions in our original convoy orchestration), we select the messages and
mark them as selected, then push them out through a send pipeline that does
the flatfile conversion.

Initial testing seems more promising than the convoy, though there are still
some hurdles to overcome. (flatfile conversion takes < 10secs for 50,000
records.)
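
The release rule described above (send when the count of waiting messages passes 50,000, or when the timestamp expires) could be sketched roughly as follows. This is only an illustration in Python, not the BizTalk or stored procedure implementation: the function name `should_release_batch`, the `max_wait` parameter and its one-hour default are assumptions, since the thread does not say how long the timeout was, and skipping empty batches is also an assumption.

```python
from datetime import datetime, timedelta

BATCH_THRESHOLD = 50_000  # record count quoted in the thread


def should_release_batch(waiting_count, last_retrieved, now,
                         threshold=BATCH_THRESHOLD,
                         max_wait=timedelta(hours=1)):  # assumed timeout
    """Return True when the waiting messages should be selected and sent."""
    if waiting_count == 0:
        # Assumption: an expired timer with nothing waiting sends no file.
        return False
    if waiting_count > threshold:
        return True
    # Otherwise release only once the timer since the last retrieval expires.
    return now - last_retrieved >= max_wait
```

The two inputs correspond to the values the sproc is described as returning: the count of messages waiting and the timestamp of the last retrieval.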

Hugo
(Stephen - yes - I was appending to an internal variable - so a lot of DOM /
xpath activity.)

Stephen W. Thomas

2004-11-22, 5:48 pm

I think it is the appending that is killing the performance.
Your new approach sounds like a good one. Like I said before, we never
thought of using a convoy for Batching. Best of luck!

Just another thought, although I do not think it buys you anything over your
proposed design...
What if you used a single Orchestration that used a convoy but more as a
Process Controller? It could do the web service call, keep the overall
count, and write the results of each web service call to SQL. Then, when the
Orchestration hit the 50,000th message it could perform the extraction and
send it to the pipeline.

Stephen W. Thomas
http://www.geekswithblogs.net/sthomas

Hugo Rodger-Brown

2004-11-23, 2:47 am

Stephen - your suggestion was my first approach to the redesign - but using
a convoy (without the appending) still adds the rehydrate / dehydrate
overhead, and as I can't measure accurately whether it's this or the
appending that's causing the problem it seems easier to get rid of both?

I may try this approach as well, and see how it measures up.

As ever I will report back - I'll put a full report on my blog at some
point - http://hugo.rodger-brown.com

Hugo Rodger-Brown

2004-11-23, 7:51 am

I've just come across another "feature" related to my earlier post about
convoy subscriptions that I thought I'd share, whilst testing the scenario
Stephen talks about.

I've set up the convoy to accept messages, call the web service, but not
append them to any internal XML.

The web service isn't up yet, so in order to test load, I put in a 2 second
delay to simulate the request-response. I then set up a batch file to
deliver messages to this orchestration at a rate of 2/sec. When pushing 1000
messages through, using a batch size of 100, I had expected to get 10 output
files. Interestingly, I only got 3, and no suspended messages showing in
HAT. Somehow 70% of my messages have disappeared down a black hole!!

I reduced the batch size to 10, did a bit more investigation, and the cause
seems to be the difference between the receive rate and the processing rate.

From the time the first message is received, to the time that the 10th
message has been processed, 20 seconds have elapsed (2 sec / msg for the
simulated web service call.) During these 20 seconds, 40 messages will have
been received into the messagebox (2 msgs / sec). The crucial fact seems to be
the lifetime of the convoy message subscription, which runs from the time the
correlation set is fixed (on the initial receive) to the time the 10th message
is processed and the orchestration dies; in this case the full 20 seconds.
This means that 30 messages are being picked up by the subscription, but then
never actually processed, as the orchestration dies after the tenth message,
and no further orchestration instances are created.

These 30 messages disappear down a black hole, never to be seen or heard of
again (no sign of them in HAT). This will affect any similar uniform
sequential convoy in which messages are received faster than they are
processed, and effectively kills any further use of convoys in our
implementation!!!
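
The arithmetic above can be checked with a small toy model. This is a Python sketch of the explanation given in this post, not of BizTalk's actual subscription engine; the function name `simulate_convoy` and its parameters are made up for illustration. It assumes that every message arriving while an instance is alive is routed to that instance, and that anything beyond the first `batch_size` messages is dropped when the instance ends.

```python
def simulate_convoy(total_msgs, batch_size, arrival_rate, proc_seconds):
    """Toy model of a uniform sequential convoy losing surplus messages.

    arrival_rate is in messages per second; message i arrives at
    i / arrival_rate seconds. proc_seconds is the time spent per message.
    """
    outputs = processed = lost = 0
    next_msg = 0  # index of the first message not yet routed to an instance
    while total_msgs - next_msg >= batch_size:
        # Process the batch one message at a time; a message cannot be
        # processed before it has arrived.
        clock = next_msg / arrival_rate
        for k in range(batch_size):
            arrival = (next_msg + k) / arrival_rate
            clock = max(clock, arrival) + proc_seconds
        # Everything that arrived before the instance ended was routed to it.
        captured = 0
        while (next_msg + captured < total_msgs
               and (next_msg + captured) / arrival_rate < clock):
            captured += 1
        outputs += 1
        processed += batch_size
        lost += captured - batch_size  # routed to the instance, never processed
        next_msg += captured
    pending = total_msgs - next_msg  # too few left to fill another batch
    return {"outputs": outputs, "processed": processed,
            "lost": lost, "pending": pending}
```

With the figures from this post, `simulate_convoy(40, 10, 2, 2)` gives one output file and 30 lost messages, and `simulate_convoy(1000, 100, 2, 2)` gives 3 output files with 700 messages lost, which lines up with the 3 files and 70% loss reported above. That agreement supports the explanation but does not prove it, as the model ignores everything else the messagebox does.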

I'll try and put together a more consistent set of notes on this whole
issue.

Hugo

Kenzo

2004-11-25, 5:49 pm

Is this issue occurring because all 1000 messages have the same correlation
id? If so, it sounds like you need to give each batch of messages its own
correlation id. How about using an orchestration that runs prior to your
convoy and applies a correlation id unique to each batch of 100 messages?
It could just keep a counter and every time it hits 100 it creates a unique
id. Therefore your convoy can run under the assumption that it would always
receive 100 related messages using the same id and no more.
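
The counter idea above could look something like the following. This is a minimal Python sketch of the logic only, not orchestration code; the class name `BatchIdAssigner` and the use of a UUID as the correlation id are assumptions made for illustration.

```python
import uuid


class BatchIdAssigner:
    """Hands out one correlation id per group of `batch_size` messages."""

    def __init__(self, batch_size=100):
        self.batch_size = batch_size
        self._count = 0          # messages seen so far
        self._current_id = None  # id shared by the batch being filled

    def next_id(self):
        # Create a fresh id for the first message and after every full batch.
        if self._count % self.batch_size == 0:
            self._current_id = str(uuid.uuid4())
        self._count += 1
        return self._current_id
```

Each inbound message would be stamped with `next_id()` before it reaches the convoy, so a convoy instance correlating on that id would see exactly `batch_size` messages and no more.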

Hugo Rodger-Brown

2004-11-26, 2:46 am

They are all correlated using the same correlation set, yes, but that is
what I want.

I think I know what the issue is - I've put a longer description of the
issue here -
http://hugorodgerbrown.blogspot.com...processing.html

Copyright 2003 - 2008 webservertalk.com