Treat a POP3 mailbox like a queue


Craig Buchanan

I would like to monitor a POP3 mailbox with multiple clients. However, I want
to ensure that each message is processed by only one client. In essence, I
would like to treat a POP3 mailbox like a queue.

From what I've read thus far, atomic message access (if this is the right term)
isn't a native feature of the POP3 protocol. Am I mistaken?

My approach thus far is to have one thread connect to the mailbox periodically,
look for new messages, add each message's ID to a queue (MSMQ or SQS), then mark
the message as read. A pool of threads would monitor the queue, get the next
message ID, 'process' the message, then delete it. I would probably split
these activities into two Windows services.
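
A minimal sketch of the single-poller, worker-pool pattern described above, in Python. It assumes an in-process `queue.Queue` standing in for MSMQ or SQS; `list_new_ids` and `process` are hypothetical callables supplied by the caller, not part of any mail library.

```python
import queue
import threading


def poll_mailbox(list_new_ids, work_queue):
    """Single poller: enqueue each new message ID once."""
    for msg_id in list_new_ids():
        work_queue.put(msg_id)


def worker(work_queue, process, results, lock):
    """Pool worker: take the next ID, process it, record the outcome."""
    while True:
        msg_id = work_queue.get()
        if msg_id is None:              # sentinel: no more work for this worker
            return
        outcome = process(msg_id)
        with lock:
            results.append(outcome)


def run(list_new_ids, process, pool_size=3):
    """Start the pool, run one polling pass, then shut the pool down."""
    work_queue = queue.Queue()
    results, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(work_queue, process, results, lock))
        for _ in range(pool_size)
    ]
    for t in threads:
        t.start()
    poll_mailbox(list_new_ids, work_queue)
    for _ in threads:
        work_queue.put(None)            # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

Because only the poller talks to the mailbox, each ID enters the queue once, and `Queue.get()` hands each item to exactly one worker.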

Does this seem like a reasonable pattern? Is there another, similar pattern
that I should consider?

Thanks for your time,

Craig Buchanan
 

Jeff Johnson

Craig said:
I would like to monitor a POP3 mailbox with multiple clients.

Let's get this out of the way at the start: does the mail server absolutely
not provide IMAP access?
 

Michael B. Trausch

Craig said:
I would like to monitor a POP3 mailbox with multiple clients.
However, I want to ensure that each message is processed by only one
client. In essence, I would like to treat a POP3 mailbox like a
queue.

From what I've read thus far, atomic message access (if this is the
right term) isn't a native feature of the POP3 protocol. Am I
mistaken?

RFC 1939 doesn't provide any assurances that message access be atomic
in the sense I think you mean. For example, if you connect to a POP3
server and get a message list with message IDs 1 through 10, and delete
messages 2, 3, and 4, those messages still technically exist until you
issue the "QUIT" command on the server. The POP3 server then updates
the mailbox and removes the messages. This is because while in the
TRANSACTION state, the "RSET" command may be issued which effectively
undeletes the messages and the server only enters the UPDATE state when
QUIT is issued (in the TRANSACTION state).
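
The DELE / RSET / QUIT behaviour described above could be handled along these lines. This is a sketch, not a definitive implementation: `conn` is assumed to behave like a logged-in `poplib.POP3` object (its `retr`, `dele`, `rset`, and `quit` methods), and `process` is a hypothetical callback.

```python
def delete_after_processing(conn, message_numbers, process):
    """Process messages, mark them deleted, and commit the deletes with QUIT.

    If processing fails, RSET unmarks everything marked in this session,
    so the following QUIT removes nothing from the maildrop.
    """
    try:
        for number in message_numbers:
            _resp, lines, _octets = conn.retr(number)
            process(b"\r\n".join(lines))
            conn.dele(number)           # only a mark; nothing is removed yet
    except Exception:
        conn.rset()                     # undo all deletion marks
        raise
    finally:
        conn.quit()                     # from TRANSACTION state, enters UPDATE
```

With the real library, `conn` would come from something like `poplib.POP3_SSL(host)` followed by `user()` and `pass_()`. Note that if the connection itself has dropped, `rset()` and `quit()` will fail too, and the server will not have removed anything.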

Craig said:
My approach thus far is to have one thread connect to the mailbox
periodically, look for new messages, add each message's ID to a queue
(MSMQ or SQS), then mark the message as read. A pool of threads
would monitor the queue, get the next message ID, 'process' the
message, then delete it. I would probably segment these activities
into two Windows services.

Does this seem like a reasonable pattern? Is there another, similar
pattern that I should consider?

This seems reasonable, assuming that only a single thread accesses the
mailbox and that you're using unique identifiers as opposed to simple
message numbers. Keep in mind, though, that unique IDs are not strictly
required to be unique; the standard permits identical IDs for exactly
identical messages (e.g., a mail server *could* use a GUID as the unique
identifier, or it could use an MD5 hash of the message, so long as
either is transformed into the format required by the RFC).
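
As a rough illustration of working with unique IDs rather than message numbers, the sketch below parses the listing lines that `poplib.POP3.uidl()` returns (bytes of the form `b"1 abc123"`) and picks out the messages not yet queued. The function names and the `seen_uids` collection are assumptions made for this example. Skipping a second message that carries an already-seen UID is a design choice here, made because the RFC allows identical messages to share an ID.

```python
def parse_uidl(lines):
    """Turn UIDL listing lines such as b'1 abc123' into {number: uid}."""
    mapping = {}
    for line in lines:
        number, uid = line.decode("ascii").split(" ", 1)
        mapping[int(number)] = uid
    return mapping


def unseen_numbers(lines, seen_uids):
    """Return message numbers whose UID has not been queued before.

    A UID that appears twice in the same listing is returned only once,
    so an exact duplicate message is skipped rather than queued again.
    """
    fresh = []
    claimed = set(seen_uids)
    for number, uid in sorted(parse_uidl(lines).items()):
        if uid not in claimed:
            claimed.add(uid)
            fresh.append(number)
    return fresh
```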

A better solution, of course, if you must accept email messages to
formulate the queue would be to have your software implement a
mini-SMTP server and accept the messages directly, if this is at all
feasible. This way, you simply take the messages and work with them
directly. When you receive a message, then, you can just put it in the
queue directly. Saving state would be up to you, but you could simply
serialize the queue and save it to disk after every update to the
queue. Or you could use some sort of on-disk spool to represent the
queue's saved state, or whatever. Point being that you can at least
bring the control of the queue (and handling of it) entirely within
your application. You can also handle duplicate queue requests
directly instead of trying to detect them on the server, if duplicates
matter at all in your application.
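
One possible shape for the on-disk spool mentioned above, sketched in Python. It leaves out the SMTP receiving side entirely and only shows how saved queue state can also give each message to exactly one worker: a worker claims a file by renaming it, and the loser of a race gets `FileNotFoundError`. The directory layout (`tmp`, `new`, `claimed-<worker>`) and function names are assumptions for this example.

```python
import os
import time
import uuid


def spool_put(spool_dir, raw_message):
    """Persist one message; the final rename makes it visible all at once."""
    tmp_dir = os.path.join(spool_dir, "tmp")
    new_dir = os.path.join(spool_dir, "new")
    os.makedirs(tmp_dir, exist_ok=True)
    os.makedirs(new_dir, exist_ok=True)
    # Timestamp prefix gives roughly first-in, first-out ordering by name.
    name = "%020d-%s" % (time.time_ns(), uuid.uuid4().hex)
    tmp_path = os.path.join(tmp_dir, name)
    with open(tmp_path, "wb") as fh:
        fh.write(raw_message)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp_path, os.path.join(new_dir, name))
    return name


def spool_claim(spool_dir, worker_id):
    """Claim the next message for one worker, or return None if empty."""
    new_dir = os.path.join(spool_dir, "new")
    claimed_dir = os.path.join(spool_dir, "claimed-" + worker_id)
    os.makedirs(new_dir, exist_ok=True)
    os.makedirs(claimed_dir, exist_ok=True)
    for name in sorted(os.listdir(new_dir)):
        target = os.path.join(claimed_dir, name)
        try:
            os.replace(os.path.join(new_dir, name), target)
        except FileNotFoundError:
            continue                    # another worker claimed it first
        with open(target, "rb") as fh:
            return name, fh.read()
    return None
```

A worker would delete its claimed file once processing succeeds; files left in a `claimed-` directory after a crash show which messages need a second look.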

--- Mike
 

Craig Buchanan

Jeff said:
Let's get this out of the way at the start: does the mail server absolutely
not provide IMAP access?

At this point, the server is just POP3. If IMAP offers more flexibility, I
would consider migrating.

One approach occurred to me during lunch. I could use multiple threads in
combination with the auto-increment message ID (which is different from the
message's unique ID). I would create a method that uses SyncLock to guard a
counter that starts at 1 and is incremented by one on each call. Each thread
gets the ID, then attempts to retrieve that message from the server. I would
need to catch errors that result from messages that are marked for deletion
and from IDs that exceed the last message on the server. I would also need a
mechanism to reset the counter to 1 when a POP3 session is closed.

Thoughts?
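
The synchronized counter described above might look like this in Python, with `threading.Lock` playing the role of SyncLock. The class and method names are made up for this sketch. One assumption worth stating: RFC 1939 has the server take an exclusive lock on the maildrop for the duration of a session, and a single POP3 connection carries one command at a time, so the threads would still need to serialize their use of one shared session.

```python
import threading


class MessageNumberCounter:
    """Hands out message numbers 1, 2, 3, ... to competing threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next = 1

    def next_number(self):
        """Return the next unclaimed message number."""
        with self._lock:
            number = self._next
            self._next += 1
            return number

    def reset(self):
        """Start again at 1, e.g. after the POP3 session is closed."""
        with self._lock:
            self._next = 1
```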
 

Craig Buchanan

Mike-

Thanks for the reply.

So, there isn't another way to enter the UPDATE state w/o quitting? This would
be quite useful.

If I try to get a message using its ID (the auto-increment value), and it has
been marked for deletion, what happens? I'm assuming that it returns -ERR plus
a message. At what point do auto-increment values get reset?

As I recall, NNTP has a method to return the first and last message ids (without
getting all of the messages). I don't see anything like that in 1939. Is this
correct?

Thanks.

Craig
 

Michael B. Trausch

Craig said:
Thanks for the reply.

So, there isn't another way to enter the UPDATE state w/o quitting?
This would be quite useful.

If I try to get a message using its ID (the auto-increment value),
and it has been marked for deletion, what happens? I'm assuming that
it returns -ERR plus a message. At what point do auto-increment
values get reset?

Yes, RFC 1939, page 8, states that DELE takes a message number, "which
may NOT refer to a message marked as deleted". On page 9, it is found
that:

The POP3 server marks the message as deleted. Any future
reference to the message-number associated with the message
in a POP3 command generates an error. The POP3 server does
not actually delete the message until the POP3 session
enters the UPDATE state.

On page 10, regarding the UPDATE state:

When the client issues the QUIT command from the TRANSACTION state,
the POP3 session enters the UPDATE state. (Note that if the client
issues the QUIT command from the AUTHORIZATION state, the POP3
session terminates but does NOT enter the UPDATE state.)

If a session terminates for some reason other than a client-issued
QUIT command, the POP3 session does NOT enter the UPDATE state and
MUST not remove any messages from the maildrop.

This means that if your TCP connection to the POP3 server is severed,
you will have to remember the state on your own, or reprocess the
messages on the server. Reprocessing would be a problem if the
operations you're queuing are not idempotent.
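
One way to "remember the state on your own" is a small local table of UIDs that have already been handled. The sketch below uses SQLite from the Python standard library; the table name and function names are assumptions for this example. The primary key makes the claim atomic: only the first insert of a given UID reports a changed row.

```python
import sqlite3


def open_store(path):
    """Open (or create) the table that records handled UIDs."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS processed (uid TEXT PRIMARY KEY)")
    db.commit()
    return db


def claim(db, uid):
    """Return True only the first time a UID is recorded."""
    cur = db.execute(
        "INSERT OR IGNORE INTO processed (uid) VALUES (?)", (uid,)
    )
    db.commit()
    return cur.rowcount == 1            # 0 means the UID was already there
```

Claiming before processing gives at-most-once handling: a crash between the claim and the work loses that message unless something else retries it. Recording the UID after processing instead gives at-least-once handling, which is only safe when the operation is idempotent.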

Craig said:
As I recall, NNTP has a method to return the first and last message
ids (without getting all of the messages). I don't see anything like
that in 1939. Is this correct?

I don't quite know what you mean here. Message numbers are assigned by
the server each time you connect to it. For example, if you connect to a
POP3 server and it has 10 messages, it will number them 1 through 10.
Delete 2, 3, and 4, and QUIT, and reconnect immediately and you have 1
through 7. The numbers will be reused.

If you want identification that is unique and persists across sessions,
you're looking for the IDs which are provided by the UIDL command. Do
note that the UIDL command is optional per RFC 1939.
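
The renumbering example above can be modelled in a few lines of Python. This is a toy model of the behaviour, not client code: the maildrop is just a list of made-up UID strings, and all function names are assumptions for the sketch.

```python
def session_view(maildrop):
    """Number the maildrop the way a fresh POP3 session would: 1..n."""
    return {number: uid for number, uid in enumerate(maildrop, start=1)}


def apply_update(maildrop, view, deleted_numbers):
    """Model QUIT from the TRANSACTION state: drop the marked messages."""
    doomed = {view[n] for n in deleted_numbers}
    return [uid for uid in maildrop if uid not in doomed]


def number_for_uid(view, uid):
    """Find the current-session number for a persistent UID, or None."""
    for number, candidate in view.items():
        if candidate == uid:
            return number
    return None
```

In this model, with ten messages, the message that was number 5 in the first session becomes number 2 after messages 2, 3, and 4 are deleted, while its UID stays the same. A worker that is handed a UID therefore has to look up the current number in each new session.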

I'd recommend reading RFC 1939 in its entirety. RFCs are somewhat
terse, but usually reading them a second time a few hours later helps
to "fill in the blanks", particularly with the larger RFCs.

I would also repeat the recommendation from my previous
message: run a mini-SMTP server within your code base and accept
messages that way, so that you're not relying on POP3 to help you out.
I think that using POP3 (or anything external to you that isn't
guaranteed to keep state the way you expect) will not work out very
well in the long run...

--- Mike
 

Jeff Johnson

Craig said:
At this point, the server is just POP3. If IMAP offers more flexibility,
I would consider migrating.

IMAP offers considerably more flexibility, but of course it's more
complicated to work with. I just happen to know that there's a "seen" flag
for messages under IMAP, so this might do what you want. Plus I believe
every message gets a unique ID.
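
A hedged sketch of how the "seen" flag and per-message UIDs might be used from Python. `conn` is assumed to behave like an `imaplib.IMAP4` object after `login()` and `select()`, and `handle` is a hypothetical callback. The flag alone is not an atomic claim: two clients can both find the same message unseen before either sets the flag, so a single poller (or some other lock) is still assumed.

```python
def fetch_unseen(conn, handle):
    """Fetch each unseen message by UID, then set its Seen flag."""
    typ, data = conn.uid("SEARCH", None, "UNSEEN")
    if typ != "OK":
        raise RuntimeError("SEARCH failed")
    handled = []
    for raw_uid in data[0].split():
        uid = raw_uid.decode("ascii")
        # BODY.PEEK[] reads the message without setting the flag implicitly.
        typ, parts = conn.uid("FETCH", uid, "(BODY.PEEK[])")
        if typ != "OK":
            continue
        handle(uid, parts[0][1])        # parts[0] is (envelope info, raw bytes)
        conn.uid("STORE", uid, "+FLAGS", "(\\Seen)")
        handled.append(uid)
    return handled
```

Setting the flag only after `handle` returns means a crash leaves the message unseen and it will be picked up again on the next pass.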
 
