Towards Modern, Scalable Email Facilities
Edwin H. Kremer
Department of Computer Science
Utrecht University
<edwin@cs.uu.nl>
February 23, 1998
January 12, 1999 (update)
1. Introduction
The use of electronic mail (email) at the Department of Computer Science at
Utrecht University started early in 1985. Quite a few things have changed in
the past 12+ years: in the late eighties it was rare to find a user with a
mailbox larger than 200KB; now it is common to find mailboxes close to 25MB.
Statistics on the Internet backbone show that the email volume on the
Internet is still increasing rapidly. This growth is causing us trouble...
Chapter 2 describes our previous, traditional email environment, explains
where the (daily) trouble occurs and where limits have been reached.
Chapter 3 describes a modern, scalable, client/server based email environment;
this is the environment we are migrating to.
Chapter 4 summarizes the changes for the users.
2. Traditional Email Environment
In our previous email environment, all user mailboxes reside on a central
server infix.cs.uu.nl, which makes the mailboxes available to all
desktop UNIX workstations via the Network File System (NFS). To send
mail, all desktop UNIX workstations have their own copy of sendmail
on their local disk. UNIX users run programs like elm, pine, Zmail,
mailx and Netscape to send and receive email.
The non-UNIX systems (PCs and Apple Macintoshes) depend heavily on network
services on the central server infix.cs.uu.nl, namely POP and SMTP, to
receive and send email respectively (using Eudora or Netscape).
Sendmail, the Mail Transfer Agent (MTA), is the `core' program that takes
care of sending the majority of email across the Internet. It is one of the
largest, most security-critical and most difficult to maintain pieces of
software in use on the Internet.
Figure 1 below shows this email environment in more detail.
There are quite a few areas where this configuration is growing beyond its
limits: performance, reliability and maintenance costs. The problems arising
in these areas are:
- Mailboxes have grown to huge sizes.
Because all workstations access the mailbox via NFS, reading mail
and updating a mailbox cause a lot of network traffic. When the
mailbox is very large (mailboxes with 500 messages and a size of 15MB are
quite common) this also takes a fair amount of time. As the load on
the servers rises, processing time per mailbox increases rapidly.
While the mailbox is being read, it is also locked to prevent concurrent
updates when a new message arrives. In addition, NFS is notorious for its
unreliable locking behavior. On our busy network, the combination
of huge mailboxes and NFS locking simply takes too long and causes
loss of email several times each day!
- NFS will hang a system.
Because NFS is a very persistent protocol, the desktop workstation
will suffer serious performance degradation or come to a grinding halt
if the central mail server infix.cs.uu.nl is down. NFS is also
not fully portable to heterogeneous environments and does little to
nothing to address the needs of disconnected or nomadic users.
- Sendmail exists on every desktop system.
As said before, sendmail maintenance is very time-consuming and also
a challenge, because it is very complex software. Owing to that complexity,
sendmail tops the list of programs on the Internet that have had the most
security problems in their lifetime.
An update of sendmail on all our systems takes about a full week of work,
because it has to be compiled and installed for every architecture.
Every desktop system has its own queue for outgoing email: this queue
can fill up or come to a halt without anyone noticing. This will
stop all users' email on that system from being delivered.
The local presence of sendmail is only required by ancient mail clients
like elm, Zmail and mailx. There's no other
reason why it should be there. Replacing these with modern mail clients
will eliminate this problem.
- Spam and UCE. Unsolicited Commercial Email (UCE) and `spam'
are exploding. Abuse of our email facilities by spammers is increasing and
is taking away a huge amount of resources. Preventing this is tricky
and is best done at a central server.
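The locking problem described in the first item above can be illustrated with a small sketch. It assumes the common dot-lock convention (a `<mailbox>.lock` file next to the mailbox); the function names, the retry count and the delay are hypothetical and are not taken from our actual delivery software. The exclusive-create call is meant for a local disk only: that operation is itself one of the things that has historically been unreliable over NFS.

```python
import os
import tempfile
import time


def acquire_dotlock(mailbox_path, retries=3, delay=0.01):
    """Try to create <mailbox>.lock; return its path, or None on timeout."""
    lock_path = mailbox_path + ".lock"
    for _ in range(retries):
        try:
            # O_EXCL makes the creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            time.sleep(delay)  # another reader or writer holds the lock
        else:
            os.close(fd)
            return lock_path
    return None


def release_dotlock(lock_path):
    """Remove the lock file so that others can access the mailbox."""
    os.remove(lock_path)


def deliver(mailbox_path, message_text):
    """Append a message under the lock; return False if the lock timed out."""
    lock_path = acquire_dotlock(mailbox_path)
    if lock_path is None:
        return False  # a real delivery agent would have to queue or bounce
    try:
        with open(mailbox_path, "a") as box:
            box.write(message_text)
    finally:
        release_dotlock(lock_path)
    return True


if __name__ == "__main__":
    mailbox_path = os.path.join(tempfile.mkdtemp(), "user")
    reader_lock = acquire_dotlock(mailbox_path)  # a client reading a big mailbox
    print(deliver(mailbox_path, "From a@example.org\n\nhello\n\n"))
    release_dotlock(reader_lock)
    print(deliver(mailbox_path, "From a@example.org\n\nhello\n\n"))
```

The longer a reader holds the lock (large mailbox, slow network), the more likely it is that a delivery attempt runs out of retries, which is the failure mode described above.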
Our goal is to eliminate the problems listed above and to provide a fast
email environment that is dial-tone reliable and will cause a
major decrease in Total Cost of Ownership (TCO).
3. Modern, Scalable Email Environment
The new email environment will offer substantial gains over the
previous environment, like:
- Modern Client/Server Architecture. All email messages are retained
on the mail server. Mail User Agents (clients) retrieve messages from and
submit messages to the store via reliable Internet protocols. Many operations that
have traditionally been the responsibility of mail clients, such as searching
and folder management, are now performed by the server, thereby making far
more efficient use of network resources and specially tuned CPU performance.
- Built on Internet Open Standards. All components natively use
Internet electronic messaging standards, including the Internet Message Text
Definition (RFC822), Multipurpose Internet Mail Extensions (MIME), Extended
Simple Mail Transfer Protocol (ESMTP) and Internet Message Access Protocol
version 4, revision 1 (IMAPv4rev1).
- Supports Popular Mail Clients. POP (Post Office Protocol) and
IMAP (Internet Message Access Protocol) are supported by numerous clients
on UNIX, Microsoft Windows95 and Apple Macintosh systems.
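To make the role of these standards concrete, here is a minimal sketch of a client that builds an RFC822/MIME message and hands it to a central server over (E)SMTP, using only the Python standard library. The function names, the addresses and the server host are placeholders chosen for illustration; they are not part of our configuration.

```python
import smtplib
from email import message_from_string
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Return an RFC822 message with MIME headers for a plain-text body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)  # also adds MIME-Version and Content-Type headers
    return msg


def parse_message(text):
    """Parse the wire format back into a message object."""
    return message_from_string(text)


def submit(msg, host, port=25):
    """Hand the message to the central mail server (host is a placeholder)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


if __name__ == "__main__":
    msg = build_message("alice@example.org", "bob@example.org",
                        "Test", "Hello Bob\n")
    print(msg.as_string())  # the text any standards-based server accepts
```

Because the client only needs to speak SMTP to the server, no mail transfer agent has to be installed on the desktop itself.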
Figure 2 above shows the modern email environment in more detail. If you
compare Figure 1 and Figure 2, you will notice the following differences:
- No Sendmail on the Desktop. Sendmail has been eliminated from
all desktop systems, thereby reducing the TCO (Total Cost of Ownership).
- No NFS Mailbox Sharing. All mailboxes, both the normal `incoming'
mailbox (/var/mail/user) and the users' `saved'
mailboxes ($HOME/Mail/...) reside on the server. All file
I/O operations are performed by the server. Initially, only the message
headers are transferred across the network.
- Reduced Number of Email Tools on the Desktop. This is causing
most of the pain for the users, because some of them will have to learn to
use a different mail program. The reason for this is that development of
tools like elm stopped a couple of years ago. That software
simply can't work in the modern environment. On the UNIX platform,
pine is the preferred mail client program. Users migrating
from elm will master pine in only a couple of hours. An added
advantage is that pine can be used for news reading/posting too!
- Spam and UCE. Postfix (or sendmail) on the server side will be
equipped with special restriction rules that will stop spammers from
abusing our site as a mail relay and will stop UCE from well-known sites.
- Introducing IMAP. On the server side the IMAP server
provides access to mailboxes through the Internet Message Access Protocol
version 4, revision 1 (IMAPv4rev1).
Whereas the POP protocol downloads full messages from a single mailbox
per user to the client system, the IMAPv4 protocol is considerably more
flexible: it can download portions of messages, synchronize after
disconnected operation, update the mailbox and maintain a hierarchy
of message folders on the host for the user. POP will typically be used by
people who carry their own notebook computer and want to answer email while
on the bus/train. IMAPv4 will typically be used by people who prefer to have
their email at a single, well-defined, reliable (backups!) location - they
will access their mail via the network, no matter if they are at the
office, at home or somewhere abroad.
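The "headers first" behavior mentioned above can be sketched with Python's standard imaplib module. This is an illustration under assumptions, not a description of any particular mail client: the function names, the chosen header fields and the host, user and password arguments are hypothetical. The fetch item BODY.PEEK[HEADER.FIELDS (...)] asks the server for the named headers only and leaves the message bodies on the server.

```python
import email
import imaplib

# Ask for three header fields only; PEEK avoids marking messages as read.
HEADER_QUERY = "(BODY.PEEK[HEADER.FIELDS (FROM SUBJECT DATE)])"


def connect(host, user, password):
    """Open an IMAP session (host and credentials are placeholders)."""
    conn = imaplib.IMAP4(host)
    conn.login(user, password)
    return conn


def fetch_summaries(conn, folder="INBOX"):
    """Return (number, from, subject) for each message, without bodies."""
    conn.select(folder, readonly=True)
    status, data = conn.search(None, "ALL")
    summaries = []
    for num in data[0].split():
        status, parts = conn.fetch(num, HEADER_QUERY)
        for part in parts:
            # Header data arrives as (envelope, bytes) tuples.
            if isinstance(part, tuple):
                headers = email.message_from_bytes(part[1])
                summaries.append(
                    (num.decode(), headers["From"], headers["Subject"]))
    return summaries
```

A client would call `fetch_summaries(connect(host, user, password))` to build its message index and fetch a full message only when the user opens it. A POP client, by contrast, would normally retrieve each message in full.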
4. Brief Summary of Changes for Users
Users on the UNIX platform might have to learn how to use a different
mail client program: choices here are pine, dtMail and
Netscape 4.x. Personally, I prefer pine with Sun Solaris'
dtMail being the runner-up.
Users on Microsoft Windows95/NT and Apple Macintosh systems should consider
moving from POP to IMAP: if your system is a desktop system, use of IMAP
is preferred. If your system is a notebook computer that you take with you
all the time, you might prefer POP instead (so that you always have all your
mail stored locally on the notebook).
The e-mail services naming scheme, configuration hints and e-mail tips may be
found in our E-Mail FAQ.
$Id: email.html,v 1.1 1999/01/12 08:53:51 edwin Exp edwin $