Subject Header Tagging Considered Harmful

An Earnest Plea to Mailing List Administrators

by James Ralston


An email message requires some amount of processing when it is redistributed to a mailing list. At the very least, the envelope must be rewritten to redirect bounces to the list administrator. While the message is being processed, the list administrator might take advantage of the opportunity to munge some of the message headers.

Modern managed mailing list software gives the list administrator the ability to munge a wide range of headers (and usually the body of the message as well). Some forms of munging are helpful, such as the X-BeenThere header for loop-detection. Others are questionable. Most are ill-advised or dangerous.

Some list administrators want to munge the Subject header by tagging it, so that a submission to the mailing list that looks like this:

From: Joe User <user@example.com>
To: foo-devel@example.com
Subject: schedule for next release?

Is distributed to the mailing list looking like this:

From: Joe User <user@example.com>
To: foo-devel@example.com
Subject: [foo-devel] schedule for next release?

In the vast majority of cases, this particular type of munging—Subject header tagging—is ill-advised, and should not be performed. If you disagree, or aren't certain, hopefully I can convince you.

The Principle of Minimal Munging

Email handling is complicated. RFC2821 and RFC2822 are 130 combined pages of dense, dry detail, and there are dozens of other RFCs that apply to email handling. Even changes that might seem innocuous can have unintended and hard-to-predict consequences.

The Principle of Minimal Munging states that you should not make any changes to a mail message unless you know precisely what you want to do, why you want to do it, and what it will affect. Unless you can articulate a clear reason for munging and understand the full consequences of the action, you should not do it.

Phrased differently, the Principle of Minimal Munging states that the default should be not to munge unless munging is explicitly argued and justified, rather than munging by default and requiring a reason not to munge. (This is analogous to system security philosophy, where "deny unless explicitly permitted" is preferred over "permit unless explicitly denied.")

Tagging Is No Longer Needed For Filtering

Historically, many people used mail clients which had limited filtering capabilities, and most mailing lists were relatively low-volume. Under such conditions, Subject header tagging could provide benefits to mailing list subscribers.

But these benefits are lost as the volume of mail increases, and today's Internet isn't a low-volume place: the explosion of popularity of the Internet has translated into overflowing Inboxes for most people. Virtually all mail clients in use today have sophisticated filtering capabilities, simply because they have to, as most people now receive more mail than is feasible to sort by visual scanning.

Moreover, most mailing lists today are managed automatically using managed mailing list software packages (for example, GNU Mailman). A typical message (as processed by managed mailing list software) provides a plethora of filtering possibilities for mail clients with sophisticated filtering capabilities: RFC2369 headers, loop-detection headers, Received headers, or even To or CC headers.
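To make the filtering point concrete, here is a minimal sketch in Python of a filter that files list mail without ever looking at the Subject header. It assumes the list software adds an X-BeenThere header and an RFC2369 List-Post header, as the sample message does; the function names and the folder name are hypothetical, and a real mail client would express the same rule in its own filter language.

```python
import re
from email import message_from_string
from email.utils import getaddresses


def list_addresses(raw_message):
    """Collect addresses from headers that list software typically adds."""
    msg = message_from_string(raw_message)
    found = set()
    # X-BeenThere: loop-detection header, one list address per occurrence.
    for _, addr in getaddresses(msg.get_all("X-BeenThere", [])):
        if addr:
            found.add(addr.lower())
    # List-Post: RFC 2369 header, usually of the form <mailto:list@host>.
    for value in msg.get_all("List-Post", []):
        for addr in re.findall(r"<mailto:([^>?\s]+)", value, re.IGNORECASE):
            found.add(addr.lower())
    return found


def folder_for(raw_message, rules, default="INBOX"):
    """Pick a folder from the list headers alone; the Subject is never read."""
    for addr in sorted(list_addresses(raw_message)):
        if addr in rules:
            return rules[addr]
    return default


# Sample message: untagged Subject, but list headers are present (assumed).
SAMPLE = (
    "From: Joe User <user@example.com>\n"
    "To: foo-devel@example.com\n"
    "Subject: schedule for next release?\n"
    "X-BeenThere: foo-devel@example.com\n"
    "List-Post: <mailto:foo-devel@example.com>\n"
    "\n"
    "When is the next release?\n"
)

RULES = {"foo-devel@example.com": "lists/foo-devel"}  # hypothetical folder name
print(folder_for(SAMPLE, RULES))
```

The untagged sample message is still filed correctly, which is the point: the information needed for sorting is already in the headers.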

For most people and most mailing lists (high-volume lists in particular), the filtering question is no longer how to distinguish messages to a particular list from non-list messages; the filtering question is now how to determine which of the potentially hundreds of messages to the list are worth the subscriber's time to read. Not only does Subject header tagging not help list subscribers to discriminate among the various messages to the list, it makes the task of choosing messages to read more difficult, because it obscures a portion of a critical header (the Subject header) that subscribers use to determine which messages they wish to read.

Many list administrators who favor Subject header tagging miss this point: they might want to read every single message sent to their lists, but that doesn't mean that everyone else wants to. (In fact, it's been my experience that once list traffic increases beyond a typical announcement list, a significant number of subscribers start wanting to skip threads they simply aren't interested in.)

For a typical announcement list, which has as an explicit goal to keep the list traffic low enough that everyone will read every message, Subject header tagging might not be unreasonable—if list subscribers ask for it. But far too many list administrators just turn on Subject header tagging without even thinking about it. That's not only ignorant, it's arrogant as well: it's assuming that every single subscriber wants to read every single message to your list, and thus doesn't care whether tagging makes it more difficult to discriminate based on the Subject header.

Tagging Wastes Precious Space

Most mail clients display mailbox summaries something like this:

[Figure: Bugtraq mailbox summary]

The Subject header is important, because it contains information that most list subscribers use to determine which messages (out of all of the messages sent to the list) they wish to read. The amount of space available to display the Subject header summaries is limited, and thus precious. Tagging wastes that precious space, making it more difficult to identify and follow interesting threads. (Although it was added by the sender, not by the mailing list, notice how much space the tag in message 328 consumes.)

It may be tempting to retort that my mail client is broken or otherwise to blame for not showing me more of the Subject headers in the mailbox summary, and that if I were to simply fix my mail client, then tagging wouldn't be an issue.

But such a retort would miss the point: the amount of space available to display Subject header summaries is finite, whether it's 30 characters or 70 characters, and tagging the Subject header in any fashion takes up space that would be better used to display content that matters.

Such a retort would also be fairly ignorant of the vast number of different ways in which people can read and send email. The person using a wireless Internet-enabled PDA to read email may literally not have any spare screen space. The person reading select messages forwarded to his mobile phone via text messaging often has a hard limit on overall content length; tagging means that more of the body of the message will be truncated. The blind person may not be able to configure his text-to-speech synthesizer software to avoid trying to pronounce the tag, and thus will have to listen to it again... and again... and again.

Moreover, even if the Subject header summaries (tags and all) fit on my display without being truncated, it's more visual work to ignore the tags while scanning the Subject header summaries than it would be if the tags simply weren't there at all.

The bottom line is that the Subject header space is valuable, and should be used as efficiently as possible. Subject header tagging is a frivolous waste of space.

Coddling the Lazy, Penalizing the Conscientious

Incredibly, even though tagging is no longer needed for filtering, it isn't uncommon to find people who favor tagging because they're too lazy to bother to learn how to use the built-in filtering capabilities of their own mail clients. They'd rather hunt and peck through an Inbox with hundreds of different messages from various people and various mailing lists all jumbled together than take the time to configure their mail client to perform this busywork for them.

You might think I'm being overly harsh. I'm not. Some people not only tolerate this position, but apologize for it. Here are some verbatim quotes:

This attitude reminds me of an old joke:

There was once a man from the city who was visiting a small farm, and during this visit he saw a farmer feeding pigs in a most extraordinary manner. The farmer would lift a pig up to a nearby apple tree, and the pig would eat the apples off the tree directly. The farmer would move the pig from one apple to another until the pig was satisfied, then he would start again with another pig.

The city man watched this activity for some time with great astonishment. Finally, he could not resist saying to the farmer, "This is the most inefficient method of feeding pigs that I can imagine. Just think of the time that would be saved if you simply shook the apples off the tree and let the pigs eat them from the ground!"

The farmer looked puzzled and replied, "What's time to a pig?"

Ordinarily, I wouldn't care one iota what other people choose to waste their time on. I don't care if your VCR clock blinks 12:00. I don't care if you prefer to print out your email to read it. (And yes, I know people who do.)

Computers are extremely well-suited to performing easily automated, highly repetitive tasks. Filtering email is an easily automated, highly repetitive task. If you can't see that there's an obvious conclusion here, that's not my problem.

But when your desire to waste your own time ends up wasting my time, I take exception. Subject header tagging forces the conscientious people—the ones who've taken the time to configure their mail client's filtering capabilities—to stare at a subfolder (or a subset of messages) where every single damn message has the same useless tag on it, just so that the lazy can continue to be lazy.

In a funny sort of way, I almost respect spammers more than I respect advocates of Subject header tagging. Spammers' unsolicited messages make it more difficult to identify legitimate mail that you want to read. Tagging does the exact same thing. But spammers waste my time because they're trying to make money; that is, however flawed their enterprise is, they're at least trying to do something. In contrast, Subject header taggers waste my time for no other reason than their own laziness.

If mailing list administrators must choose between penalizing the lazy or penalizing the conscientious, they should default to penalizing the lazy.

Tagging Discourages Discussion

The entire point of a mailing list is to facilitate participation and discussion. By making it more difficult for some people to identify and follow the discussion, tagging interferes with that goal.

There are obvious correlations to other commonly-discouraged practices.

For example, even though most people now use mail clients which are capable of displaying messages with text/html content, most lists discourage their subscribers from sending HTML email to the list. This is because HTML email causes problems in certain situations, and thus has the effect of discouraging participation and discussion.

Another commonly-discouraged practice is top-posting. Top-posting is often discouraged because many people find it non-intuitive to read, and thus it has the effect of discouraging participation and discussion.

Tagging is no different. Even though most people aren't affected by tagging, a non-trivial number of people are, and as a result, tagging discourages participation and discussion. If only for that reason, list administrators should strongly oppose tagging.

People who administer mailing lists for open source software projects should be particularly sensitive to discouraging participation, because open source software projects rely on participation. To permit Subject header tagging to discourage participation is to ultimately shoot your own software project in the foot.

(There is also a certain irony in that frequently the people who are most religiously opposed to HTML email and top-posting are the ones who are most in favor of tagging, and will defend the practice of tagging using the exact same logic (e.g., "it's not my fault if your stupid mail client can't deal with it") that they utterly reject when it is used to defend HTML email or top-posting!)

Tagging Is Often Broken

Simply put, it's difficult to tag correctly, and not all mailing lists do a good job of tagging.

Some mailing lists will add the tag again if it isn't present as a prefix. This can lead to "Re: [tag] Re: [tag] Re: [tag]" cascades.

Not all mail clients use Re: as a reply prefix; this can confuse some mailing lists into adding the tag multiple times.

Even if the mailing list performs a case-insensitive substring match over the entire Subject line before adding the tag, it can still get added multiple times. Mailing list members can mangle the Subject header in interesting ways. Subscribers may also try to pre-tag their own message submissions, either because they're trying to be helpful, or because they want to put the tag at the end of the message instead of the front. Regardless, it's not uncommon for them to get it wrong, which just increases the useless text in the Subject header.
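The failure modes above are easy to reproduce. The following Python sketch models two hypothetical tagging policies (neither is taken from any particular mailing list package): one that checks only for the tag as a prefix, and one that also tolerates a leading "Re:". The "AW:" prefix is just one example of a reply prefix other than "Re:" that some clients use.

```python
import re

TAG = "[foo-devel]"


def naive_tag(subject):
    """Add TAG unless the Subject already starts with it (prefix-only check)."""
    if subject.startswith(TAG):
        return subject
    return f"{TAG} {subject}"


def re_aware_tag(subject):
    """Tolerate leading "Re:" prefixes before the tag, but no other prefix."""
    pattern = r"^(re:\s*)*" + re.escape(TAG)
    if re.match(pattern, subject, re.IGNORECASE):
        return subject
    return f"{TAG} {subject}"


def reply(subject):
    """Model a client that prepends "Re: " unless the Subject starts with it."""
    if subject.lower().startswith("re:"):
        return subject
    return f"Re: {subject}"


# Two rounds of reply-then-redistribute under the prefix-only policy.
subject = naive_tag("schedule for next release?")
for _ in range(2):
    subject = naive_tag(reply(subject))
print(subject)

# A reply prefix the tagger does not know about defeats the smarter check.
print(re_aware_tag("AW: [foo-devel] schedule for next release?"))
```

Each extra check closes one hole and leaves others open, which is why I say that tagging correctly is difficult.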

Some people like to retort that if the mailing list software doesn't do a good job of tagging, then it is broken and needs to be fixed. I don't disagree, but they've missed the point: one of the primary reasons for the Principle of Minimal Munging is that the world is full of broken software which is best perturbed as little as possible.

Tagging Doesn't Scale To Multiple Lists

Tagging incorrectly assumes that people only send mail to one mailing list at a time. Because of this incorrect assumption, if a message is posted to multiple mailing lists that tag, any resulting thread is likely to grow a tag for each mailing list.

For example, someone sends a message to the mailing lists foo, bar, and baz. A person reading on foo sees:

Subject: [foo] schedule for next release?

He replies. Now someone reading on bar sees:

Subject: [bar] Re: [foo] schedule for next release?

Now someone reading on baz replies. Now we have:

Subject: [baz] Re: [bar] Re: [foo] schedule for next release?

Some mailing lists try to avoid this condition by not adding their own tag if they detect a pre-existing tag from any other mailing list. But this avoidance technique renders the mailing list's own tag useless, as subscribers can't rely on the absence of the tag as an indication that the message didn't come from the list.
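Both behaviors can be sketched in a few lines of Python. The deliver function and its skip_if_tagged switch are hypothetical stand-ins for a list's tagging policy, and the "any bracketed text counts as a tag" test is a deliberate simplification.

```python
import re

ANY_TAG = re.compile(r"\[[^\]]+\]")  # simplification: any bracketed text


def deliver(list_name, subject, skip_if_tagged=False):
    """Return the Subject as redistributed by list_name (assumed policy)."""
    tag = f"[{list_name}]"
    if subject.startswith(tag):
        return subject
    if skip_if_tagged and ANY_TAG.search(subject):
        return subject  # defer to whatever tag is already there
    return f"{tag} {subject}"


def reply(subject):
    """Model a client that prepends "Re: " unless the Subject starts with it."""
    if subject.lower().startswith("re:"):
        return subject
    return f"Re: {subject}"


# The thread from the example: one tag accumulates per list.
seen_on_foo = deliver("foo", "schedule for next release?")
seen_on_bar = deliver("bar", reply(seen_on_foo))
seen_on_baz = deliver("baz", reply(seen_on_bar))
print(seen_on_baz)

# The avoidance technique: bar stays quiet because foo's tag is present.
quiet_on_bar = deliver("bar", reply(seen_on_foo), skip_if_tagged=True)
print(quiet_on_bar)
```

Under the avoidance policy, the message delivered by bar carries only the foo tag, so a subscriber who sorts on "[bar]" in the Subject header will never match it.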

Counterarguments Are Unconvincing

Some people have argued that tagging helps people track what mailing list a forwarded message came from. But who's to say that the person forwarding the message is going to remove the To header (which will also show where the message came from) and not the Subject header? This is a bogus argument.

Another counterargument is that tagging is effective at tracking off-list replies to on-list topics. This argument might have some merit. However, there's no guarantee that the person who's replying won't manually change the Subject header, and by doing so remove the tag. A more effective way to track off-list replies to on-list topics is to use detail addressing (e.g., user+detail@example.com). In fact, filtering is one of the main reasons why detail addressing was created in the first place.

Of course, not everyone can make use of detail addresses, as not all ISPs and/or MTAs support them. But for all of the problems and inconveniences that tagging causes, is this one tiny and unreliable benefit (for a subset of people) really worth it? The answer is no.
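For those who can use detail addresses, here is a rough Python sketch of how filtering on them might look. It assumes the subscriber posts to the list as user+foo-devel@example.com, so that off-list replies come back to that address; the function name and its defaults are hypothetical, and for simplicity it reads the To and Cc headers rather than the envelope recipient that a delivery agent would normally see.

```python
from email import message_from_string
from email.utils import getaddresses


def detail_parts(raw_message, mailbox="user", domain="example.com"):
    """Return the +detail parts of recipient addresses for mailbox@domain."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    details = []
    for _, addr in getaddresses(headers):
        local, _, host = addr.rpartition("@")
        if host.lower() != domain:
            continue
        base, plus, detail = local.partition("+")
        if base.lower() == mailbox and plus:
            details.append(detail)
    return details


# An off-list reply whose sender rewrote the Subject (and so lost any tag).
OFF_LIST_REPLY = (
    "From: Someone Else <someone@example.org>\n"
    "To: Joe User <user+foo-devel@example.com>\n"
    "Subject: quick question\n"
    "\n"
    "About your post the other day...\n"
)

print(detail_parts(OFF_LIST_REPLY))
```

The reply is still associated with foo-devel even though its Subject no longer says so, which a Subject tag cannot guarantee.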

Another counterargument goes something like this:

If the tags bother you, configure your mail client to remove them, or upgrade to a mail client that can.

This is a bogus counterargument, as it violates the Principle of Minimal Munging; it's asking someone to munge again in order to try to undo the first munging. Additionally, pattern-based text substitution of the Subject line is a fairly esoteric feature; very few mail clients support it (although I've been told that Gnus does).

Here's the proper response to the "just remove them" counterargument:

If you want the tags, configure your mail client to add them, or upgrade to a mail client that can.
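Adding a tag on the reader's side is not much work. The Python sketch below derives a tag from the List-Post header at display time and leaves the message itself untouched; it assumes the list supplies a List-Post header of the usual mailto form, and the function name is hypothetical.

```python
import re
from email import message_from_string


def display_subject(raw_message):
    """Return the Subject to show this reader, tagged from List-Post if any."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    # Take the local part of the posting address as the tag text.
    match = re.search(
        r"<mailto:([^@>\s]+)@", msg.get("List-Post", ""), re.IGNORECASE
    )
    if match is None:
        return subject  # not list mail, nothing to add
    tag = f"[{match.group(1)}]"
    return subject if tag in subject else f"{tag} {subject}"


LIST_MESSAGE = (
    "From: Joe User <user@example.com>\n"
    "To: foo-devel@example.com\n"
    "Subject: schedule for next release?\n"
    "List-Post: <mailto:foo-devel@example.com>\n"
    "\n"
    "When is the next release?\n"
)

print(display_subject(LIST_MESSAGE))
```

Only the person who wants the tag pays for it; everyone else sees the Subject header as the sender wrote it.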

I've seen reports that some blind people prefer Subject header tagging, but I'm not certain to what extent, or why. I would appreciate feedback on this issue; see the Credits And Feedback section, below.

Summary

Some list administrators may not care much one way or the other about Subject header tagging, but simply haven't bothered to turn off the feature on their lists. Other list administrators actively want to have their lists perform Subject header tagging, because they believe it provides a benefit for their subscribers. But it doesn't:

In the vast majority of cases, Subject header tagging is ill-advised, and should not be performed.

Credits And Feedback

The formatting of this document was inspired by Chip Rosenthal's Reply-To Munging Considered Harmful document. I also used some of his text in various spots. (As one former elm user and contributor to another: thanks, Chip.)

If you have comments, suggestions, or counterarguments, please feel free to let me know.

If your goal is to change my mind about Subject header tagging, I'll be perfectly honest: you probably won't. However, if you provide a good counterargument, I'll add it to the list above.

Addendum

Some people have assumed that my point is that Subject header tagging should never be used. That's not true. My point is that Subject header tagging should never be used by default.

In fact, as amazing as it might seem after reading this document, I am the list administrator of several mailing lists which perform Subject header tagging. When I created the lists, I didn't configure them to tag, but the topic came up on the lists, and the following points were raised:

After careful consideration and discussion, I felt that Subject header tagging would be beneficial to many people, and wouldn't terribly inconvenience the people who didn't need it. So I enabled it.

This is the correct way to approach the issue of Subject header tagging: by default, don't tag, and require a convincing argument (and consensus) before enabling tagging.

If more list administrators behaved this way, the world would be a happier (and more efficient) place.


Feedback is welcome and encouraged; please see my contact information.

Last updated: $Date: 2003-11-11 00:46:21-05 $