filtermail(1)

filtermail - extensive mail filter
(filtermail_1.05.00)

2023

NAME

filtermail - Filter incoming e-mail to accepted, spam or ignored

SYNOPSIS

filtermail [OPTIONS] base
[OPTIONS] - cf. section OPTIONS
base: the absolute path to the user's home directory

Paths specified with options or in the configuration file may start with ~/, in which case ~ is replaced by base.

DESCRIPTION

Filtermail filters incoming e-mail as either accepted, spam, or ignored e-mail. It uses rule files, which are inspected in sequence until the incoming e-mail matches a rule. Once that happens the rule's associated action (accept, spam, or ignore) is executed. If the e-mail is not matched by any rule then the e-mail is accepted.

Alternatively, when using option --inspect filtermail can be used to find the domain name of the sender, its IP address, its country of origin and the cidr-range containing the received IP address (see sections OPTIONS and FILTERMAIL INSPECT below).

Accepted e-mail normally is appended to the mail file which is used by the incoming mail server when receiving mail for the current user. E.g., if the user's username is frank then incoming mail is appended to the file /var/mail/frank. Users may also define directories to contain saved e-mails (e.g., ~/Mail), and filtermail can be configured to append e-mail considered as spam to, e.g., ~/Mail/spam. Likewise, e-mail matching the 'ignore' criteria could be appended to ~/Mail/ignore.

Instead of appending the complete e-mail to its destination file, only the received e-mail's From: and Subject: headers can be appended to its destination file. This is achieved by prefixing :HDRS: to the name of the destination file. Alternatively, such e-mail can also be ignored, losing it completely, by not specifying a destination file.

The option to merely log the received e-mail's From: and Subject: headers may come in handy if the received e-mail is also kept elsewhere (e.g., in another account, which forwards the received e-mail to the computer running filtermail) and the ignoring rules might result in occasional false positive decisions (see also section OPTIONS below).

Filtermail uses three types of files:

If filtermail detects a syntax error in the rules file or in a rule specification file the incoming mail is accepted. To avoid this situation the --syntax option (see section OPTIONS) should be used when modifying, adding or removing rule files to verify that the specified rules were correctly formulated.

To use filtermail the incoming mail server must recognize it as a valid mail handling program (see section EXAMPLES).

CONFIGURATION

Options (see section OPTIONS) not flagged with `NO_CONFIG' can also be specified in a configuration file. By default the configuration file ~/.filtermail/config is used.

Command line options always take precedence over specifications in the configuration file.

The configuration file must exist, but may be empty. It must exist because its directory defines the directory where the files defining the filtering rules are located. Empty lines and the content of lines starting at the #-character are ignored.
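
The configuration handling described above can be pictured with a small Python sketch. It is not filtermail's own code: the function names are hypothetical, the `key: value` line format is taken from the configuration example in section EXAMPLES, and it is an assumption that comment text runs from a #-character to the end of the line. Values are kept in lists because lines such as dont-expire: may occur more than once.

```python
def parse_config(text):
    """Parse 'key: value' lines; skip empty lines and '#' comment text."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # assumed: '#' starts a comment
        if not line:
            continue
        key, _, value = line.partition(":")
        settings.setdefault(key.strip(), []).append(value.strip())
    return settings


def effective_options(config, command_line):
    """Command-line values take precedence over configuration-file values."""
    return {**config, **command_line}


sample = """\
rules:  rules
log:    log/log

# as illustration of a 'dont-expire:' specification:
#dont-expire: ./ignore/from
spam:   ~/Mail/spam
"""

config = parse_config(sample)
# hypothetical command-line value overriding the spam: line
options = effective_options(config, {"spam": ["~/Mail/junk"]})
print(options)
```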

Option --expire is used to remove patterns whose date stamps (cf. section PATTERNS) indicate dates before the date specified by --expire. The configuration file may contain dont-expire: lines. Each dont-expire: line specifies the name of a pattern file whose entries don't expire. Files specified in dont-expire: lines need not exist, but to avoid inspecting pattern files for expired dates the name(s) of those files must be identical to the names of the files used in the rules file (cf. section RULES below).
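
As an illustration of expiring patterns, the following Python sketch drops pattern lines whose date stamp lies before a cutoff date. It rests on assumptions not stated in this manual page: the date stamp is read as yy-mm-dd in the 21st century, the helper names are hypothetical, and the second pattern line is invented for the example.

```python
from datetime import date


def stamp_to_date(stamp):
    """Convert a 'yy-mm-dd' stamp to a date (assumed format, years 20yy)."""
    yy, mm, dd = (int(part) for part in stamp.split("-"))
    return date(2000 + yy, mm, dd)


def expire(pattern_lines, cutoff):
    """Keep only the pattern lines whose date stamp is not before cutoff."""
    kept = []
    for line in pattern_lines:
        fields = line.split(None, 2)        # Nr, Date, rest of the pattern
        if stamp_to_date(fields[1]) >= cutoff:
            kept.append(line)
    return kept


patterns = [
    "1   23-05-10    s '2a03:f480:2:'",
    "4   22-11-02    p '[0-9]'",            # hypothetical, older pattern
]
remaining = expire(patterns, date(2023, 1, 1))
print(remaining)
```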

Other files in the configuration file may use the [~]/path format or must be plain filenames (i.e., not starting with /-characters). Plain files are relative to the directory containing the configuration file (cf. section EXAMPLES).

RULES

All mail filtering rules are defined in the rules file. Mail filtering starts at the first rule until either the incoming e-mail matches a rule, or until all rules have been processed and the e-mail does not match any rule. In the latter case the e-mail is considered accepted.
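
The first-match behavior can be summarized in a few lines of Python. This is only a sketch: the dictionaries and the `matches` callback are hypothetical stand-ins for filtermail's rule representation and its header/pattern inspection.

```python
def classify(rules, matches):
    """Return the action of the first matching rule, else 'accept'."""
    for rule in rules:
        if matches(rule):
            return rule["action"]
    return "accept"                 # no rule matched: the e-mail is accepted


rules = [
    {"name": "ignore-from", "action": "ignore"},
    {"name": "spam-cidr", "action": "spam"},
]

# 'matches' stands in for the real inspection of headers against patterns
second_rule_hit = classify(rules, lambda rule: rule["name"] == "spam-cidr")
no_rule_hit = classify(rules, lambda rule: False)
print(second_rule_hit, no_rule_hit)
```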

Empty lines and lines whose first non-blank character is a #-character are ignored. The rules themselves cannot contain #-characters.

Rules are written according to the following syntax. Elements between square brackets are optional; the content of bracketed sections followed by a * character may be repeated (not using the square brackets). Lowercase words are keywords and cannot be used otherwise. Capitalized words are described below. Each rule is specified on its own line; line continuation (using, e.g., \ at the end of a line) is not supported. Here's the rule's syntax:


    if Header File [and Header File]* Action
    

Header is the name of a mail header (e.g., From:, Received:). Header specifications must be identical to the first words of header lines of the received e-mail. So to match the From header in the e-mail's first line specify From (i.e., no colon). Some headers have variants. E.g., Received: and Received-SPF:. To select all headers sharing their initial characters append a +-character to the initial part (e.g., to select all headers starting with Received use Received+, and use From+ to select all From: headers including the e-mail's first line).
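
The selection of header lines by a Header specification might be sketched as follows in Python. The function name `selects` is hypothetical, the sample header lines are invented, and case-sensitive comparison is an assumption.

```python
def selects(spec, header_line):
    """Tell whether a rule's Header specification selects header_line."""
    if spec.endswith("+"):
        return header_line.startswith(spec[:-1])   # shared initial characters
    words = header_line.split(None, 1)
    return bool(words) and words[0] == spec        # identical first word


lines = [
    "From frank@example.com Wed May 10 12:00:00 2023",   # e-mail's first line
    "From: Frank <frank@example.com>",
    "Received: from mail.example.com",
    "Received-SPF: pass",
]

for spec in ("From", "From:", "From+", "Received:", "Received+"):
    print(spec, [line for line in lines if selects(spec, line)])
```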

File is the name of the file containing patterns to inspect. Filenames must start with ./ and define the locations of files below the configuration file's directory. E.g., ./spam/subject specifies the file subject in a subdirectory spam containing patterns considered by the rule.
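
To make the rule syntax concrete, here is a minimal Python sketch that splits a rule line into its (Header, File) conditions and its Action. The function `parse_rule` is hypothetical; it covers only the syntax shown above and does not validate the Action keyword.

```python
def parse_rule(line):
    """Split 'if Header File [and Header File]* Action' into its parts."""
    words = line.split()
    if len(words) < 4 or words[0] != "if":
        raise ValueError("a rule must look like: if Header File ... Action")
    action = words[-1]
    body = words[1:-1]              # Header File [and Header File]*
    if len(body) % 3 != 2:
        raise ValueError("expected Header File pairs joined by 'and'")
    conditions = []
    for index in range(0, len(body), 3):
        header, file = body[index], body[index + 1]
        if index + 2 < len(body) and body[index + 2] != "and":
            raise ValueError("conditions must be joined by 'and'")
        if not file.startswith("./"):
            raise ValueError("pattern file names must start with ./")
        conditions.append((header, file))
    return conditions, action


conditions, action = parse_rule(
    "if To:  ./match/noto  and  Cc: ./match/noto ignore")
print(conditions, action)
```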

Action specifies the action to execute when e-mail matches a rule. Action can be

PATTERNS

Files specified in rules files define patterns which may be found in the headers defined by the rules. The header lines which are selected by the Header specifications in the rules file are matched against those patterns after removing the header labels from those lines. So Subject: hello world is passed to the patterns as the (trimmed) line hello world. Once a header's content matches a pattern inspection ends with a successful match (the rule itself may specify not, in which case a successful match results in a failing match of the rule).
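
Removing the header label before matching can be illustrated with a short Python sketch. The helper `header_content` is hypothetical and assumes that the label is the first whitespace-delimited word of the header line.

```python
def header_content(header_line):
    """Return the header's content without its label, trimmed."""
    parts = header_line.split(None, 1)      # label, remainder of the line
    return parts[1].strip() if len(parts) > 1 else ""


print(repr(header_content("Subject: hello world")))
print(repr(header_content("To:")))          # an empty header yields ''
```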

Pattern files may start with file-specific comment (i.e., empty lines and lines whose first non-blank character is #) up to a comment line equal to #=. The patterns themselves may also be preceded by comment lines. Once a pattern is matched it is moved one position upward in the pattern file (including its associated comment).
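
The reordering of matched patterns can be sketched in Python as a swap with the preceding entry. The representation (a list of comment-lines/pattern pairs) and the name `promote` are assumptions made for this illustration, as are the comment texts.

```python
def promote(entries, index):
    """Move the matched entry one position up (no-op for the first entry)."""
    if index > 0:
        entries[index - 1], entries[index] = entries[index], entries[index - 1]
    return entries


# each entry: (comment lines preceding the pattern, the pattern line)
entries = [
    (["# networks"], "1   23-05-10    s '2a03:f480:2:'"),
    ([], "1   23-05-10    p  not  '.'"),
    (["# uppercase only"], "1   23-05-10    p not '[a-z]'"),
]
promote(entries, 2)                 # the third pattern matched
for comments, pattern in entries:
    print(comments, pattern)
```

A consequence of this scheme is that patterns which match often gradually move toward the top of their file.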

Pattern specifications use the following syntax (elements between square brackets are optional (when used, the square brackets are not specified), capitalized words are described below, each pattern is defined on a single line, line continuation is not supported):


    Nr Date Expression [and Expression]*
    

This pattern indicates a match when all Expressions match.

This syntax uses the following elements:

The Expressions themselves use the following syntax:

MatchMode [not] Spec
The selected headers are matched against Spec using the specified MatchMode. The not keyword is optional. When specified (omit the square brackets) the result of the match is negated. When multiple Expression specifications are joined by and keywords, then the final pattern results in a match if all Expression specifications indicate a match. Once an Expression does not indicate a match, then subsequent Expressions are not evaluated and headers do not match the pattern.
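
The evaluation of joined Expressions, including negation and the early stop at the first failing Expression, might look like this in Python. It is a sketch only: `pattern_matches` is a hypothetical name, each Expression is modeled as a (matcher, negate) pair, and Python's `re` module stands in for the extended regular expressions used by filtermail.

```python
import re


def pattern_matches(expressions, text):
    """All expressions must match; evaluation stops at the first failure."""
    for matcher, negate in expressions:
        result = matcher(text)
        if negate:
            result = not result     # the 'not' keyword negates the match
        if not result:
            return False            # later expressions are not evaluated
    return True


# models an expression like: p not '.'
no_content = [(lambda text: re.search(".", text) is not None, True)]
print(pattern_matches(no_content, ""))                    # empty header
print(pattern_matches(no_content, "frank@example.com"))   # header with content
```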

When not is used, the Expression matches only if the header does not match the Spec specification. E.g., when e-mail should contain a To: header or a Cc: header the following rule can be used (cf. section EXAMPLES):


    if To: ./match/noto and Cc: ./match/noto  spam
        
with match/noto:

     1   23-05-10  p not  '.'
        

Note that Spec must be surrounded by single quotes. To use a single quote inside a Spec escape it (as \'). In general: the character following a backslash is used as-is, removing the backslash from the Spec (e.g., to construct a Spec containing '\n specify \'\\n).
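
The quoting rule can be illustrated with a small Python sketch that strips the surrounding single quotes and resolves backslash escapes. The function `unescape_spec` is hypothetical and does not reproduce filtermail's own parser.

```python
def unescape_spec(quoted):
    """Strip the surrounding single quotes and resolve backslash escapes."""
    if len(quoted) < 2 or quoted[0] != "'" or quoted[-1] != "'":
        raise ValueError("Spec must be surrounded by single quotes")
    result = []
    chars = iter(quoted[1:-1])
    for char in chars:
        if char == "\\":
            char = next(chars, "")  # the escaped character is used as-is
        result.append(char)
    return "".join(result)


# the manual's example: \'\\n yields a quote, a backslash and the letter n
print(unescape_spec(r"'\'\\n'"))
```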

There are five types of MatchModes. MatchModes using regular expressions use extended regular expression patterns: prefix multipliers and bounding-characters by backslashes when they should be interpreted as ordinary characters (i.e., *, +, ?, ^, $, |, (, ), [, ], {, } should be escaped when used as literal characters).

OPTIONS

Short options, when defined, are provided between parentheses immediately following their long option equivalents. Several parameters specify locations of files written or used by filtermail. If a location specification starts with ~/ then the tilde-character is replaced by the base directory specified as filtermail's argument. Otherwise, if the location does not start with a slash (/) character then the location is prefixed by the path of the directory containing the configuration file.
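
The location rules above amount to the following Python sketch. The function `resolve_location` and its parameter names are hypothetical; the directory used in the example follows the default configuration file location mentioned in section CONFIGURATION.

```python
import posixpath


def resolve_location(spec, base, config_dir):
    """Resolve a location specification to an absolute path."""
    if spec.startswith("~/"):
        return base + spec[1:]                  # ~ is replaced by base
    if spec.startswith("/"):
        return spec                             # absolute paths are kept
    return posixpath.join(config_dir, spec)     # relative to the config dir


base = "/home/frank"
config_dir = "/home/frank/.filtermail"
for spec in ("~/Mail/spam", "log/log", "/var/mail/frank"):
    print(spec, "->", resolve_location(spec, base, config_dir))
```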

Some options can also be specified in the configuration file (cf. section CONFIGURATION). Options that cannot be specified in the configuration file are marked as NO-CONFIG.

FILTERMAIL INSPECT

When the --inspect (-I) option is specified filtermail expects a received e-mail file at its standard input and shows the domain name of the sender, its IP address, its country of origin and the cidr-range containing the received IP address. E.g.,


    from renxincj.com (unknown [104.223.188.228])
    IP = `104.223.188.228', Country: US,  CIDR = 104.223.128.0/17
        
Mail handling programs (e.g., mutt(1)) allow their users to pipe an e-mail file to a program, so the received e-mail can be inspected from inside the mail handling program. E.g., with mutt typing | shows the prompt

    Pipe to command:
        
and assuming that the filtermail program is available in the user's PATH environment variable enter `filtermail -I' to pass the received e-mail to filtermail:

    Pipe to command: filtermail -I
        
Depending on the content of the Received: headers filtermail's output shows the domain name of the sender, its IP address, its country of origin and the cidr-range containing the received IP address. E.g.,

    from renxincj.com (unknown [104.223.188.228])
    IP = `104.223.188.228', Country: US,  CIDR = 104.223.128.0/17
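
How an IP address might be taken from such a Received: line is sketched below in Python, using the standard `ipaddress` module for validation. This is an illustration under assumptions, not filtermail's implementation: the function `bracketed_address` is hypothetical, and the country and cidr-range lookups are not covered.

```python
import ipaddress
import re


def bracketed_address(received):
    """Return the first valid IP address found between square brackets."""
    for candidate in re.findall(r"\[([0-9A-Fa-f:.]+)\]", received):
        try:
            return ipaddress.ip_address(candidate)
        except ValueError:
            continue                # not an address: try the next candidate
    return None


ipv4 = bracketed_address("from renxincj.com (unknown [104.223.188.228])")
ipv6 = bracketed_address(
    "from mail.resoascijournal.info "
    "(s857e6ba3.fastvps-server.com [2a03:f480:2:8::3f])")
print(ipv4, ipv4.version)
print(ipv6, ipv6.version)
```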
        

IP version 6 addresses are also inspected, producing output like


    from mail.resoascijournal.info (s857e6ba3.fastvps-server.com \ 
                                                [2a03:f480:2:8::3f])
    IP = `2a03:f480:2:8::3f', Country: EE,  CIDR = 2a03:f480:2::/48
        

If the received e-mail is considered conspicuous (e.g., spam or mail to ignore) then the cidr range could be added to a file like suspect.cidr. Once more e-mails from the suspected cidr-range are received, the range could be added to, e.g., ~/etc/filtermail/spam/cidr or to ~/etc/filtermail/ignore/cidr, using a pattern line like


        1   23-05-10    s '2a03:f480:2:'
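
Before adding a range to such a file it can be useful to confirm that the reported address really lies inside the reported cidr-range. A minimal check using Python's standard `ipaddress` module follows; the helper name `in_range` is hypothetical.

```python
import ipaddress


def in_range(address, cidr):
    """Tell whether address lies inside the given cidr-range."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


checks = [
    ("104.223.188.228", "104.223.128.0/17"),
    ("2a03:f480:2:8::3f", "2a03:f480:2::/48"),
    ("104.223.100.1", "104.223.128.0/17"),      # just below the /17 range
]
for address, cidr in checks:
    print(address, cidr, in_range(address, cidr))
```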
        

When the option --cls is specified as yes (either as command-line option or in the configuration file) then the terminal screen will be cleared before showing --inspect's output. When the option --received is specified Received: headers appearing before the Received: header containing the content specified at the --received option are ignored.

EXAMPLES

Commonly incoming mail servers define a directory where valid mail handling programs (or links to those programs) are listed. E.g., sendmail(8) uses the `sendmail restricted shell' (/etc/mail/smrsh) directory. If filtermail is installed in a standard user-accessible directory (e.g., /usr/bin) then the smrsh directory should contain the link

    filtermail -> /usr/bin/filtermail

Once the filtermail program is recognized by the incoming mail server users may filter incoming e-mail through filtermail using, e.g., a ~/.forward file. Such .forward files ignore empty lines and end-of-line comment (starting at #). Assuming a standard filtermail-configuration (cf. section CONFIGURATION) and assuming that user frank's home-directory is /home/frank, then /home/frank/.forward should contain the following line:

    "|/usr/bin/filtermail /home/USER"
Note the double quotes: they are required because filtermail is called with an argument. USER stands for the user's username (frank in this example).

The following configuration file specifies that the rules and log files are located in the configuration file's directory and defines paths for all three mail categories:

    rules:  rules
    log:    log/log
    accept: /var/spool/mail/USER
    spam:   ~/Mail/spam
    ignore: ~/Mail/ignore
 
    # as illustration of a 'dont-expire:' specification:
    #dont-expire: ./ignore/from

Filtering rules are defined in the file specified by the --rules option or in the rules: line of the configuration file. Note that the pattern files must start with ./.


    if From:                    ./ignore/from           ignore
    if Subject:                 ./spam/nolowercase      spam

        # inspect all Received... headers:
    if Received+                ./spam/cidr             spam

        # a To: or Cc: header is required:
    if To:  ./match/noto  and  Cc: ./match/noto ignore

The final rule uses the pattern in ./match/noto (shown below), which specifies the `any character' ('.') regular expression: if the To: header is empty (which is also the case if there is no To: header), then the not '.' condition matches. The same holds true for the second condition. So if both conditions match there is neither a To: nor a Cc: header, in which case the rule's action is executed. Also note that, according to De Morgan's rule, not X and not Y is equivalent to not (X or Y): the rule matches exactly when the requirement `a To: or a Cc: header is present' is not met.


    # the ./spam/nolowercase pattern:
    1   23-05-10    p not   '[a-z]'

    # e.g., the ./match/noto as used in the above example
    #       requiring either a To: or a Cc: header:
    1   23-05-10    p  not  '.'
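
The logic of the To:/Cc: rule can be checked exhaustively with a few lines of Python. This sketch only models the boolean structure of the rule; `rule_matches` is a hypothetical name.

```python
from itertools import product


def rule_matches(has_to, has_cc):
    """Both 'p not .' conditions match: no To: content and no Cc: content."""
    return (not has_to) and (not has_cc)


# the rule matches exactly when 'a To: or a Cc: header is present' is false
table = [(has_to, has_cc, rule_matches(has_to, has_cc))
         for has_to, has_cc in product([False, True], repeat=2)]
for row in table:
    print(row)
```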
    

FILES

By default the configuration file is expected in the subdirectory etc/filtermail of the directory specified as filtermail's argument.

SEE ALSO

mutt(1), pattern(3bobcat), regcomp(3), sendmail(8), syslog(3), tput(1), whois(1)

BUGS

None reported.

AUTHOR

Frank B. Brokken (f.b.brokken@rug.nl).