How To Set Up Exim to Allow User-Customized Spam Threshold Settings

Over the past month, I have been writing my own panel-like application for clients who use my hosting services. This “one-stop-shop” portal login will allow users to create/edit/delete accounts for their domain, access the file manager for their website, create or delete a database (or access one they have already created), set up mail forwarding from e-mail addresses to their own mailbox or to an external address, and use many other features.

One thing that I had a very hard time figuring out was how to allow users to set up and modify their own spam threshold settings. The approach here works if you use SpamAssassin with Exim4. In addition, you will need some kind of database with at least three new fields added. Since I use Horde GroupWare, I simply added three fields to the “horde_users” table, which hold the following:

Whether the user has opted to have spam checking on or off (spam_enable)
User-specified threshold at which the subject is tagged as spam (spam_flag)
User-specified threshold at which the message is rejected to prevent delivery (spam_delete)

I tried to figure it out by asking on the Exim-users mailing list, but the responses I received indicated it would be quite difficult to do. Scouring the Internet, I finally uncovered a solution that works!

The issue is this: when the mail server runs the acl_check_data routine, you cannot use the $local_part and $domain variables, which would hold the recipient’s e-mail address. You can use them in the acl_check_rcpt routine, because that routine runs once for each recipient. The acl_check_data routine runs afterwards, once the message body has been received but just before the mail is delivered to the user’s inbox. The variables are not usable there because, by that point, the message may have several recipients.

So, the trick is to look up the user-specified spam settings in the acl_check_rcpt routine. There is another issue with this, however. If a message is sent to multiple recipients and they have different user-specified settings, you must defer the message for the recipients whose settings differ. As long as the second, third, and later recipients have the same settings as the first, the mail will be delivered to all of them at once. For any recipient whose settings differ from the first recipient’s, the message is deferred for that recipient, and the sending mail server will retry that recipient according to its own retry schedule.
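The accept-or-defer behaviour described above can be modelled outside Exim. The following is only a rough Python sketch of the idea, not the Exim configuration itself: the `SpamSettings` type, the `rcpt_pass` function, and the sample addresses are made up for illustration, and it ignores the case where a lookup returns nothing.

```python
from typing import NamedTuple


class SpamSettings(NamedTuple):
    enabled: str  # "Y" or "N"
    flag: int     # subject-tagging threshold
    reject: int   # rejection threshold


def rcpt_pass(recipients, settings):
    """Model one SMTP transaction: return (accepted, deferred) recipients."""
    accepted, deferred = [], []
    first = None  # settings of the first accepted recipient
    for rcpt in recipients:
        current = settings[rcpt]
        if first is None or current == first:
            first = current
            accepted.append(rcpt)
        else:
            # Settings differ from the first recipient: ask the sender to retry.
            deferred.append(rcpt)
    return accepted, deferred


settings = {
    "bob@bob.com": SpamSettings("Y", 45, 53),
    "amy@bob.com": SpamSettings("Y", 45, 53),
    "joe@bob.com": SpamSettings("Y", 30, 60),
}
accepted, deferred = rcpt_pass(list(settings), settings)
print(accepted)  # ['bob@bob.com', 'amy@bob.com']
print(deferred)  # ['joe@bob.com']
```

In this model the third recipient is the only one deferred, because it is the only one whose settings differ from the first recipient's.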

OK, so here is the code that I used. It was added to the acl_check_rcpt routine. Note that order IS IMPORTANT when adding this to the acl_check_rcpt routine, so make sure that you put this entire block of code in the correct position relative to your other ACL statements.

# Lines below set acl_m0 to the recipient e-mail address, then replace it
# with the real login if the address turns out to be an alias
require set acl_m0 = ${local_part}@${domain}
require set acl_m7 = ${lookup mysql{SELECT login FROM aliases WHERE alias='${quote_mysql:$acl_m0}'}}
require set acl_m0 = ${if !eq{$acl_m7}{}{$acl_m7}{$acl_m0}}

# Lines below get the spam_flag setting from the user for checking
require set acl_m1 = ${lookup mysql{SELECT spam_flag FROM horde_users WHERE user_uid='${quote_mysql:$acl_m0}'}}

# Defer this recipient if the threshold differs from an earlier recipient's
defer
  message = Spam Threshold Mismatch
  condition = ${if and{{def:acl_m2}{!={$acl_m1}{$acl_m2}}}}

require set acl_m2 = $acl_m1

# Lines below get the spam_delete setting from the user for checking
require set acl_m3 = ${lookup mysql{SELECT spam_delete FROM horde_users WHERE user_uid='${quote_mysql:$acl_m0}'}}

# Defer this recipient if the threshold differs from an earlier recipient's
defer
  message = Spam Delete Mismatch
  condition = ${if and{{def:acl_m4}{!={$acl_m3}{$acl_m4}}}}

require set acl_m4 = $acl_m3

# Lines below get the spam_enable setting from the user for checking
require set acl_m5 = ${lookup mysql{SELECT spam_enable FROM horde_users WHERE user_uid='${quote_mysql:$acl_m0}'}}

# Defer this recipient if the setting differs from an earlier recipient's
defer
  message = Spam Checking Mismatch
  condition = ${if and{{def:acl_m6}{!eq{$acl_m5}{$acl_m6}}}}

require set acl_m6 = $acl_m5

# Lastly, set the values to the defaults if the values cannot be looked up.
require set acl_m2 = ${if eq{$acl_m2}{}{45}{$acl_m2}}
require set acl_m4 = ${if eq{$acl_m4}{}{53}{$acl_m4}}
require set acl_m6 = ${if eq{$acl_m6}{}{Y}{$acl_m6}}

# Accept straight away if the user has spam checking disabled
accept
  condition = ${if eq{$acl_m6}{N}}

OK, now that you have the above added, hopefully I can explain what is going on.

In the first three lines (the three “require set” lines), the acl_m0 variable is set to the actual e-mail address that the message should be delivered to. For instance, if a user set up the e-mail account me@bob.com to forward to their real mailbox of bob@bob.com, these lines make sure that bob@bob.com is used, because the “user_uid” field in the “horde_users” table is bob@bob.com, not me@bob.com. If the REAL e-mail address isn’t captured, the lookups above will not return any row from the database (because there won’t be a row in the “horde_users” table with a “user_uid” of me@bob.com).

So, acl_m0 is first set to the e-mail address that the sender wants to send to (for instance, me@bob.com). Next, acl_m7 holds the result of looking that address up as an alias. A MySQL lookup is done on the “aliases” table (which contains two fields, login and alias). It captures the REAL e-mail address, which is in the “login” field, by looking up the alias. Realize that the result of this check may be empty. If it is empty, the address the sender wants to send to is already the REAL user account e-mail address (and the login ID that is in the user_uid field).

So the next line will then set acl_m0 to the REAL e-mail address (the “login” field) IF and ONLY IF acl_m7 is not empty.

Moving on to the next part. Variable acl_m1 is then set to the user’s subject-flagging threshold. In this case, the field in the “horde_users” table is called “spam_flag”. The query fetches the row where the “user_uid” field matches the user’s login (the REAL e-mail address).
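To make the alias step and the threshold lookup concrete, here is a small Python sketch that mirrors the two queries. It is only an illustration under some assumptions: an in-memory SQLite database stands in for MySQL, the column types are guesses, and the function names `resolve_mailbox` and `spam_flag_for` are invented for this example.

```python
import sqlite3

# In-memory stand-in for the MySQL database; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE aliases (login TEXT, alias TEXT);
    CREATE TABLE horde_users (
        user_uid    TEXT PRIMARY KEY,
        spam_enable TEXT,
        spam_flag   INTEGER,
        spam_delete INTEGER
    );
    INSERT INTO aliases VALUES ('bob@bob.com', 'me@bob.com');
    INSERT INTO horde_users VALUES ('bob@bob.com', 'Y', 45, 53);
""")


def resolve_mailbox(address):
    """Return the login an alias points to, or the address itself."""
    row = conn.execute(
        "SELECT login FROM aliases WHERE alias = ?", (address,)
    ).fetchone()
    return row[0] if row and row[0] else address


def spam_flag_for(address):
    """Return spam_flag for the resolved mailbox, or None if no such user."""
    row = conn.execute(
        "SELECT spam_flag FROM horde_users WHERE user_uid = ?",
        (resolve_mailbox(address),),
    ).fetchone()
    return row[0] if row else None


print(resolve_mailbox("me@bob.com"))   # bob@bob.com
print(resolve_mailbox("bob@bob.com"))  # bob@bob.com
print(spam_flag_for("me@bob.com"))     # 45
```

The parameterized queries play the same role as quoting the address before placing it in the SQL string in the Exim lookup.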

Now for the two lines under the “defer” ACL. The first one is the message, which simply tells the sending mail server that delivery to this recipient is deferred, so it needs to try again later. Why do this? Well, this brings me to the condition. The condition says that IF acl_m2 is defined and acl_m2 and acl_m1 do NOT match, then defer. This is where the configuration defers the message for a recipient whose subject-tagging threshold differs from that of the recipients already accepted. If there are 50 recipients and all of them have the exact same settings, all 50 will get the message at the same time. If there are 50 recipients and all except two have the same settings as the first user, then the other 48 will get the message at the same time. The remaining two will be “deferred”, so the sending mail server will retry just those two users after its retry period (usually five minutes, or maybe 15 minutes on some systems).

Now to the next line of code, the “require set acl_m2 = $acl_m1”. This line copies the threshold value held in $acl_m1 into acl_m2, and it is what makes the defer code work. The first user WILL get past the defer code, so acl_m2 will be set to their subject-tagging threshold. Subsequent users go through the same process, but if acl_m2 (the previous user’s threshold) does not match acl_m1 (the current user’s threshold), then the “defer” ACL runs.

All of the above is done three times. The first time handles the subject-tagging threshold (uses variables acl_m1 and acl_m2). The second time handles the e-mail reject threshold (uses variables acl_m3 and acl_m4). And lastly, the third bit of code checks whether the user even has spam checking turned on (uses variables acl_m5 and acl_m6).

After that is run, as noted in the code above, I set default values IF the user data cannot be looked up. When would the user data fail to be looked up? Well, when a forwarding account points to another provider. Using the example of me@bob.com: if that were a forwarding address to me@yahoo.com, clearly I do not host Yahoo, so I won’t have a REAL user in my system for “me@yahoo.com”, because the me@bob.com address is nothing but a forwarder to the Yahoo account. This is why I set default values; the defaults are what I have found to be good, reliable settings for SpamAssassin (at least on my system). Each of the “require set acl_mX” lines checks the data. If the acl_mX variable is empty, it is set to the default value. For subject tagging, that is 45. For rejecting, that is 53. For spam checking, it is set to “Y”, which means checking is enabled. Note that these thresholds are compared against $spam_score_int, which is the spam score multiplied by ten, so 45 corresponds to a score of 4.5 and 53 to a score of 5.3.
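The fallback logic can be summarized in a few lines of Python. This is just an illustrative sketch: the `DEFAULTS` dictionary and the `apply_defaults` and `as_points` helpers are names made up here, and the values are kept as strings because Exim variables are strings.

```python
# Default values taken from the three "require set" lines in the ACL.
DEFAULTS = {"flag": "45", "reject": "53", "enabled": "Y"}


def apply_defaults(looked_up):
    """Replace empty or missing lookup results with the defaults."""
    return {key: looked_up.get(key) or default
            for key, default in DEFAULTS.items()}


def as_points(score_int):
    """Convert a threshold on the spam_score_int scale to spam points."""
    return int(score_int) / 10


# A forwarder to an external address: every lookup came back empty.
external = apply_defaults({"flag": "", "reject": "", "enabled": ""})
print(external)                     # {'flag': '45', 'reject': '53', 'enabled': 'Y'}
print(as_points(external["flag"]))  # 4.5

# A local user with their own settings keeps them unchanged.
local = apply_defaults({"flag": "30", "reject": "60", "enabled": "N"})
print(local)                        # {'flag': '30', 'reject': '60', 'enabled': 'N'}
```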

And now lastly, the “accept” code at the bottom. This code takes effect IF and only IF the user has decided that they do not want spam checking enabled. So if the acl_m6 variable is equal to “N”, the mail will be automatically accepted in the acl_check_rcpt routine. This is set because I have “deny” conditions below the above configuration, and I don’t want those to run if the user has spam checking disabled.

Setting the acl_check_data Routine to Use The User-Specified Spam Values in Exim

OK, now you need to use those variables in the acl_check_data routine, where SpamAssassin actually checks the e-mail. Again, make sure you place these in the proper location, because order is important in each of the routines.

# Will accept if the user has disabled spam checking
accept
  condition = ${if eq{$acl_m6}{N}}

# Deny/Check for Spam
warn
  spam = Debian-exim:true
  message = X-Spam_score: $spam_score\n\
            X-Spam_report: $spam_report
  condition = ${if <{$spam_score_int}{$acl_m4}}

warn
  spam = Debian-exim:true
  message = Subject: SPAM SCORE: $spam_score $h_Subject
  condition = ${if >{$spam_score_int}{$acl_m2}}

deny
  spam = Debian-exim:true
  message = E-mail cannot be delivered: $spam_score spam points. $h_Subject
  condition = ${if >{$spam_score_int}{$acl_m4}}

In the code above, it first checks to see if the user has disabled spam-checking. Again, if variable acl_m6 is equal to “N” (spam-checking disabled), it will automatically accept the message and skip the other three conditions.

The next condition, the first “warn” condition, adds the spam score and spam report to the header of the e-mail message if the spam score is less than the reject threshold. The user-specified reject threshold is stored in variable “acl_m4”, which is why you see it in the code.

The next “warn” condition is set to only tag the subject line. It will add “SPAM SCORE: <score>” to the subject line of the e-mail. Now, “acl_m2” was set to the subject-tagging threshold – so this is why the condition contains that. If the spam score is higher than the “acl_m2” variable, the e-mail subject will be tagged.

Lastly – the “deny” condition. This simply rejects the e-mail at SMTP time to prevent the message from even getting to the user’s mailbox. In the condition, you see that “acl_m4” is specified. Again, this variable holds the user-specified reject threshold. So if the spam score is higher than the “acl_m4” variable, the message is rejected and will not be delivered.
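As a rough cross-check of the four statements, the following Python sketch lists which of them would fire for a given score. It is a simplified model rather than Exim behaviour: the `fired_statements` function and its labels are invented for this example, and the score and thresholds are all on the $spam_score_int scale.

```python
def fired_statements(spam_score_int, flag, reject, enabled="Y"):
    """Return which statements of the data ACL fire, in order."""
    if enabled == "N":
        return ["accept"]  # spam checking disabled: nothing else runs
    fired = []
    if spam_score_int < reject:
        fired.append("add X-Spam headers")
    if spam_score_int > flag:
        fired.append("tag subject")
    if spam_score_int > reject:
        fired.append("deny")
    return fired


print(fired_statements(20, 45, 53))  # ['add X-Spam headers']
print(fired_statements(50, 45, 53))  # ['add X-Spam headers', 'tag subject']
print(fired_statements(60, 45, 53))  # ['tag subject', 'deny']
print(fired_statements(60, 45, 53, enabled="N"))  # ['accept']
```

One detail the model makes visible: a score exactly equal to the reject threshold is neither rejected nor given the X-Spam headers, because both of those comparisons are strict.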

So there you have it. It is not a very easy solution for allowing user-specified spam settings in Exim, but it does work. The only part that may cause trouble is the “defer” conditions, when the spam settings are not the same for each of the users that the message should be sent to. If there are ten users and all ten of them have different settings, the message will be accepted for only one user at a time, and the last user will be deferred nine times before receiving it. Nine deferrals may add up to a substantial delay, depending upon the sending mail server’s retry settings.
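The worst case can be estimated with a short Python sketch. It assumes, purely for illustration, that on every retry the sending server presents all still-pending recipients in a single transaction and in the same order; real senders may behave differently. The `delivery_rounds` function is a made-up name.

```python
def delivery_rounds(settings_by_rcpt):
    """Count the SMTP transactions needed until every recipient is accepted."""
    pending = list(settings_by_rcpt)
    rounds = 0
    while pending:
        first = settings_by_rcpt[pending[0]]
        # Everyone sharing the first recipient's settings is accepted this round.
        pending = [r for r in pending if settings_by_rcpt[r] != first]
        rounds += 1
    return rounds


# Ten users, each with different (flag, reject, enabled) settings.
all_different = {f"user{i}@bob.com": (45 + i, 53 + i, "Y") for i in range(10)}
print(delivery_rounds(all_different))  # 10

# Ten users who all share the same settings.
all_same = {f"user{i}@bob.com": (45, 53, "Y") for i in range(10)}
print(delivery_rounds(all_same))       # 1
```

Under that assumption, the number of transactions equals the number of distinct settings combinations among the recipients.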