Buffering TCP/IP data

ShaunO

I have a TCP/IP socket providing data to my app.
My app works on messages (binary, not textual) that end with a predefined
footer (e.g. 0x01 followed by 0x02).

How should I go about buffering this and retrieving the complete messages?

Current approach:
I have a byte array of e.g. 1MB to hold data until full messages are assembled.
When the socket is readable, I read all data into the buffer, starting at the
position after the last byte of data that was put into it (tracked in an
integer variable, LastPos).
A separate thread scans the array when new data is added. If it finds
the footer 0x01,0x02, it takes all data before the footer, copies it out
into a new array, and publishes it through the event argument of a
MessageRecieved event.
IMPORTANTLY, it then takes all data after the 0x02 byte and moves it to
the front of the buffer.
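
For reference, the approach above could be sketched roughly as follows in C#. This is only an illustration under assumptions: the class name LinearMessageBuffer and its members are made up, the footer is stripped from the returned messages, and appending and scanning happen on one thread (a separate scanning thread would need locking around the buffer).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a flat byte array that is compacted after each scan.
public sealed class LinearMessageBuffer
{
    private readonly byte[] _buffer;
    private int _lastPos; // index just past the last byte written

    public LinearMessageBuffer(int capacity) { _buffer = new byte[capacity]; }

    // Appends received bytes, then returns every complete message found
    // (without the 0x01, 0x02 footer).
    public List<byte[]> Append(byte[] data, int count)
    {
        if (count > _buffer.Length - _lastPos)
            throw new InvalidOperationException("Buffer full");
        Array.Copy(data, 0, _buffer, _lastPos, count);
        _lastPos += count;

        var messages = new List<byte[]>();
        int start = 0; // start of the message currently being scanned
        for (int i = 0; i + 1 < _lastPos; i++)
        {
            if (_buffer[i] == 0x01 && _buffer[i + 1] == 0x02)
            {
                var message = new byte[i - start];
                Array.Copy(_buffer, start, message, 0, message.Length);
                messages.Add(message);
                start = i + 2;
                i++; // skip the 0x02 as well
            }
        }

        // Move the unconsumed tail to the front of the buffer.
        // Array.Copy is documented to handle overlapping ranges safely.
        int remaining = _lastPos - start;
        if (start > 0)
            Array.Copy(_buffer, start, _buffer, 0, remaining);
        _lastPos = remaining;
        return messages;
    }
}
```

The final Array.Copy is the "move to the front" step; its cost grows with the number of bytes left over after the last footer.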

Is this the best approach?
Should I be using a Queue of type byte?
Thoughts much appreciated.

Thanks.
S.
 
Jon Skeet [C# MVP]

> I have a TCP/IP socket providing data to my app.
> My app works on messages (binary, not textual) that end with a predefined
> footer (e.g. 0x01 followed by 0x02).
>
> How should I go about buffering this and retrieving the complete messages?
>
> Current approach:
> I have a byte array of e.g. 1MB to hold data until full messages are assembled.
> When the socket is readable, I read all data into the buffer, starting at the
> position after the last byte of data that was put into it (tracked in an
> integer variable, LastPos).
> A separate thread scans the array when new data is added. If it finds
> the footer 0x01,0x02, it takes all data before the footer, copies it out
> into a new array, and publishes it through the event argument of a
> MessageRecieved event.
> IMPORTANTLY, it then takes all data after the 0x02 byte and moves it to
> the front of the buffer.
>
> Is this the best approach?
> Should I be using a Queue of type byte?
> Thoughts much appreciated.

It sounds like you really want to create a circular buffer - although
then you *do* need to know the maximum size you'll ever run into, or
cope with the complexity of recreating the buffer as and when you need
to.

Your current solution sounds okay though. A Queue<byte> would work,
but extracting a whole range out as an array may be painful (I haven't
checked the API). It'll still have to do copying, anyway.

Jon
 
Marc Gravell

'Twas my fault for mentioning Queue<T> (lacking the full context);
although reasonable for many producer/consumer scenarios, in this case
it's probably not the right fit ;-p

Marc
 
S

ShaunO

The messages could be any size but are never likely to go above 128 bytes.
It would be very helpful if you could outline the benefits of a circular
buffer over the byte array that I outlined, so that I can evaluate whether I
need to re-implement.
Thanks,
Shaun
 
Jon Skeet [C# MVP]

> The messages could be any size but are never likely to go above 128 bytes.
> It would be very helpful if you could outline the benefits of a circular
> buffer over the byte array that I outlined, so that I can evaluate whether I
> need to re-implement.

You'd implement the circular buffer with an underlying byte array - it
means that when you've copied the data out for the message, you can
just update the logical "next message start" to the end of the
previous one, with no extra copying involved - i.e. you don't copy the
data you've already received from the next message.
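
As a rough illustration of that idea (not a definitive implementation), here is a small C# sketch. The class name RingMessageBuffer and its members are hypothetical, the capacity is assumed to be fixed, and the footer is assumed to be 0x01, 0x02 and is stripped from the returned message.

```csharp
using System;

// Hypothetical sketch: a fixed-capacity circular buffer over a byte array.
public sealed class RingMessageBuffer
{
    private readonly byte[] _data;
    private int _start; // logical "next message start"
    private int _count; // number of bytes currently stored

    public RingMessageBuffer(int capacity) { _data = new byte[capacity]; }

    // Byte-at-a-time for brevity; every index is taken modulo the capacity.
    public void Write(byte[] source, int count)
    {
        if (count > _data.Length - _count)
            throw new InvalidOperationException("Buffer full");
        for (int i = 0; i < count; i++)
            _data[(_start + _count + i) % _data.Length] = source[i];
        _count += count;
    }

    // Returns the next complete message, or null if no footer has arrived yet.
    public byte[] TryReadMessage()
    {
        for (int i = 0; i + 1 < _count; i++)
        {
            if (At(i) == 0x01 && At(i + 1) == 0x02)
            {
                var message = new byte[i];
                for (int j = 0; j < i; j++)
                    message[j] = At(j);
                // Consume message + footer by advancing the start index only;
                // bytes belonging to the next message are not moved.
                _start = (_start + i + 2) % _data.Length;
                _count -= i + 2;
                return message;
            }
        }
        return null;
    }

    private byte At(int offset)
    {
        return _data[(_start + offset) % _data.Length];
    }
}
```

The only copy left is the one that hands the finished message to the caller; whatever has already arrived for the following message stays in place.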

Jon
 
Adam Benson

Just an idea - If you know that your message always ends in 0x01,0x02 why
not build each message on the fly?

Allocate a new block.
When you get something in from the socket, copy it byte by byte into the
block, checking for the footer.
When you find your footer :
1) Fire an event with the now-complete message
2) Allocate a new block and start copying into that.

I guess copying byte by byte when you get the TCP data in seems tedious -
but you have to check the data anyway so you know when your message is
finished.
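
A minimal C# sketch of the steps above, under some assumptions: the names MessageAssembler and OnData are hypothetical, a List<byte> stands in for the "block", and the footer is stripped before the event fires (keeping it would work just as well).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: builds each message as the bytes arrive.
public sealed class MessageAssembler
{
    private List<byte> _block = new List<byte>();

    public event Action<byte[]> MessageReceived;

    // Call with each chunk read from the socket.
    public void OnData(byte[] received, int count)
    {
        for (int i = 0; i < count; i++)
        {
            _block.Add(received[i]);
            int n = _block.Count;
            if (n >= 2 && _block[n - 2] == 0x01 && _block[n - 1] == 0x02)
            {
                _block.RemoveRange(n - 2, 2);   // drop the footer
                byte[] message = _block.ToArray();
                _block = new List<byte>();      // start a fresh block
                var handler = MessageReceived;
                if (handler != null)
                    handler(message);           // fire with the complete message
            }
        }
    }
}
```

Because the check looks at the last two bytes of the block rather than of the chunk, a footer split across two socket reads is still detected.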

Personally, I wouldn't go for a separate thread scanning the same buffer my
TCP socket was writing into.
If you use an event, I don't think you need to worry about thread
synchronisation at all, since the message in the event is in a block your TCP
thread has finished with.

Just don't take too long in the event handler - now here's somewhere where
you might use a synchronized queue. So your handler just pushes the block on
the queue and then returns to your TCP code to finish handling the TCP
event.

Another worker thread can be looking at your queue and pulling messages off
and handling them while your TCP code isn't doing anything.
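
One way this hand-off might look in C#, assuming .NET 4 or later so that BlockingCollection<T> is available (on older frameworks a Queue guarded by a lock and Monitor would play the same role). The class name MessagePump is hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical sketch: the event handler enqueues, a worker thread processes.
public sealed class MessagePump : IDisposable
{
    private readonly BlockingCollection<byte[]> _queue =
        new BlockingCollection<byte[]>();
    private readonly Thread _worker;

    public MessagePump(Action<byte[]> process)
    {
        _worker = new Thread(() =>
        {
            // Blocks while the queue is empty; the loop ends once
            // CompleteAdding has been called and the queue has drained.
            foreach (byte[] message in _queue.GetConsumingEnumerable())
                process(message);
        });
        _worker.IsBackground = true;
        _worker.Start();
    }

    // Call this from the message event handler: it only enqueues,
    // so the TCP code gets control back quickly.
    public void Enqueue(byte[] message)
    {
        _queue.Add(message);
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Join();
        _queue.Dispose();
    }
}
```

Each byte[] handed to Enqueue is a block the TCP side no longer touches, so only the queue itself needs to be thread-safe.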

HTH,

Adam.
=========
 
ShaunO

(final question)
How does this handle the wrapping around? Is there native support for doing
this somehow, or would I need to check the space left in the buffer and break
the data into two parts, putting one part at the end of the buffer and the
remainder at the start?
Thanks,
Shaun
 
Jon Skeet [C# MVP]

> (final question)
> How does this handle the wrapping around? Is there native support for doing
> this somehow, or would I need to check the space left in the buffer and break
> the data into two parts, putting one part at the end of the buffer and the
> remainder at the start?

The latter - that's exactly why it would be a case of writing a
circular buffer instead of using an existing one :)

(There may well be existing third party implementations around already
- but your case is somewhat specialised as you'll want good support
for dealing with whole arrays of bytes rather than one byte at a
time.)
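
The two-part write described in the question might be sketched like this in C#. The helper name RingWrite.Write and its parameters are hypothetical, and the caller is assumed to have already checked that the ring has room for count bytes.

```csharp
using System;

public static class RingWrite
{
    // Copies count bytes from source into ring, starting at writePos and
    // wrapping round to index 0 if needed. Returns the new write position.
    // Assumption: the caller has verified that count bytes of free space exist.
    public static int Write(byte[] ring, int writePos, byte[] source, int count)
    {
        // First part: as much as fits between writePos and the end of the array.
        int firstPart = Math.Min(count, ring.Length - writePos);
        Array.Copy(source, 0, ring, writePos, firstPart);

        // Second part: whatever is left goes at the start of the array.
        int secondPart = count - firstPart;
        if (secondPart > 0)
            Array.Copy(source, firstPart, ring, 0, secondPart);

        return (writePos + count) % ring.Length;
    }
}
```

Reading a message back out that straddles the end of the array needs the same split in the other direction.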

Jon
 
ShaunO

Thanks!

Peter Duniho said:
There is no built-in support for a circular buffer, AFAIK. You'd have to
implement the wrap-around logic yourself. Not _too_ hard, but can be a
little tricky.

That said, if you really need high-performance, it's probably the way to
go. Shifting all your data is potentially expensive, especially for a 1MB
buffer (alternatively, if your individual messages are relatively small,
are you sure you really need such a large buffer?).

Of course, if you don't need high-performance, you may want to consider
using much simpler techniques. For example, use smaller buffers for the
individual network i/o operations (anywhere from 1K up to 8K or so
depending on your usage), and then write the results into a MemoryStream
that represents each individual message. When you reach the terminator,
handle the current message by converting to bytes
(MemoryStream.ToArray()), passing to whatever code will actually process
it, and then start a new MemoryStream with the remaining bytes in your i/o
buffer.

That approach is likely to be much simpler than a circular buffer or
similar technique. If the performance is sufficient for your needs, IMHO
it's a much better way to go.
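
A rough C# sketch of the MemoryStream approach, under assumptions: StreamMessageReader and Feed are hypothetical names, the footer is 0x01, 0x02 and is stripped from each message, and the caller passes in each chunk it reads into its small i/o buffer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch: one MemoryStream per message under construction.
public sealed class StreamMessageReader
{
    private MemoryStream _current = new MemoryStream();
    private bool _sawFirstFooterByte; // true if the last byte written was 0x01

    // Feeds one chunk from the i/o buffer and returns the messages it completed.
    public List<byte[]> Feed(byte[] ioBuffer, int count)
    {
        var done = new List<byte[]>();
        for (int i = 0; i < count; i++)
        {
            byte b = ioBuffer[i];
            if (_sawFirstFooterByte && b == 0x02)
            {
                // Terminator reached: drop the 0x01 already written,
                // hand over the message, and start a new stream.
                _current.SetLength(_current.Length - 1);
                done.Add(_current.ToArray());
                _current = new MemoryStream();
                _sawFirstFooterByte = false;
                continue;
            }
            _current.WriteByte(b);
            _sawFirstFooterByte = (b == 0x01);
        }
        return done;
    }
}
```

The remaining bytes in the chunk simply flow into the new stream on the following loop iterations, so no explicit carry-over step is needed.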

Pete
 
