                                 uucpSF.EXE
                                    v1.00
                      The Spam Filter for UUCP accounts.
                               by Steven Libis


WHAT IS IT?:
------------
UUCPSF is a DOS spam filter for UUCP accounts.  It rejects incoming
messages based on who the message is addressed to, and also based on
who sent the message.

It does *NOT* review subject or message contents (other than to look
for e-mail addresses included in the message).


WHAT DOES IT DO?:
-----------------
UUCPSF reads SPAM.CFG and REJECT.CFG looking for prior spam e-mail
addresses.  It reads ACCEPT.CFG looking for valid user e-mail names.
It reads LISTSERV.CFG looking for e-mail messages it should not process.

SPAM.CFG, REJECT.CFG, ACCEPT.CFG and LISTSERV.CFG are plain ASCII text files.

UUCPSF reads the incoming UUCP (control file *.X) file, as well as
the header of the message (message file *.D) looking for alternate
sender e-mail addresses (Reply-To:, Sender:, From:, ...).
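
As a rough illustration of the header scan (not UUCPSF's actual code),
the Python sketch below collects sender addresses from a message.  The
function name and the exact list of header fields are assumptions; the
list above ends in "...", so the real program may look at more fields.

```python
from email import message_from_string
from email.utils import getaddresses

# Assumed field list: the README names these three and then trails off.
SENDER_FIELDS = ("From", "Sender", "Reply-To")

def sender_addresses(message_text):
    """Return the sorted, lower-cased sender addresses found in the headers."""
    msg = message_from_string(message_text)
    values = []
    for field in SENDER_FIELDS:
        # get_all() returns every occurrence of the field, or the default.
        values.extend(msg.get_all(field, []))
    # getaddresses() splits "Name <addr>" forms into (name, addr) pairs.
    return sorted({addr.lower() for _, addr in getaddresses(values) if addr})
```

Lower-casing the addresses is a design choice of this sketch so that
later comparisons ignore case; the original program may compare
differently.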

If a message is addressed to anyone in REJECT.CFG it is considered
spam, and all of the sender's e-mail addresses will be added to SPAM.CFG.

UUCPSF compares the sender addresses from the new message against all
of the sender addresses it has previously stored in SPAM.CFG.  If any
of the new e-mail addresses match, then the message is spam, and any
of the new e-mail addresses are added to SPAM.CFG if they are not
already there.
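
The matching rule above can be sketched in Python as follows.  This is
only an illustration under stated assumptions: the function name is
hypothetical, the comparison is assumed to ignore case, and the sketch
leaves out the "root"/"postmaster" exclusion described in this README.

```python
def check_senders(new_senders, spam_list):
    """Return True if any new sender is already in spam_list.

    On a match, every new sender address not yet present is appended
    to spam_list (the in-memory stand-in for SPAM.CFG).
    """
    known = {addr.lower() for addr in spam_list}
    candidates = [addr.lower() for addr in new_senders]
    if not any(addr in known for addr in candidates):
        return False  # no known spam address; leave the list unchanged
    for addr in candidates:
        if addr not in known:
            spam_list.append(addr)
            known.add(addr)
    return True
```

For example, if SPAM.CFG already holds bulk@spam.example and a new
message carries both that address and alt@spam.example, the message is
flagged and alt@spam.example is added to the list.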

UUCPSF will NOT automatically add any e-mail address that includes the
words "root", "webmaster", "postmaster" or "uucp".  However, you can
manually add them to the SPAM.CFG, and they can then be filtered out.
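
A minimal Python sketch of this exclusion, assuming a case-insensitive
substring test (the exact matching rule UUCPSF uses is not stated here,
and the function name is hypothetical):

```python
# Words taken from the paragraph above.
PROTECTED_WORDS = ("root", "webmaster", "postmaster", "uucp")

def may_auto_add(address):
    """Return False if the address contains any protected word."""
    lowered = address.lower()
    return not any(word in lowered for word in PROTECTED_WORDS)
```

With a substring test, an address such as postmaster@isp.example is
never added automatically, while sales@bulk.example can be.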

UUCPSF also checks to see how many users the message is addressed to.
You can select how many valid e-mail addresses are considered acceptable
vs. how many it takes to be classified as spam.  It doesn't matter how
many valid e-mail addresses a message is addressed to: if any are on the
REJECT.CFG list, then the message is spam, and all of the sender's
addresses are added to SPAM.CFG.
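
The two recipient checks can be sketched in Python as below.  The
function name and the returned reason strings are hypothetical, and
counting every recipient (rather than only valid ones) is an assumption
of this sketch.

```python
def classify_by_recipients(recipients, reject_list, max_recipients):
    """Return (is_spam, reason) based only on who the message is for."""
    rejected = {addr.lower() for addr in reject_list}
    # A single REJECT.CFG name makes the message spam, whatever the count.
    if any(addr.lower() in rejected for addr in recipients):
        return (True, "addressed to a REJECT.CFG name")
    # Otherwise the message is spam only if it exceeds the allowed count.
    if len(recipients) > max_recipients:
        return (True, "too many recipients")
    return (False, "")
```

Checking REJECT.CFG first mirrors the rule that a rejected addressee
overrides the recipient count.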

UUCPSF has the option to save (in a directory of your choice) or delete
the incoming spam.  I did this so that I could run the program in 'save'
mode until I feel comfortable enough to start running it in 'delete' mode.
I suggest saving and reviewing messages until you see how the program works.

UUCPSF also reads the ACCEPT.CFG.  If there are any messages addressed to
someone NOT in this file, you can either accept and let the system continue
processing, or save in a separate directory.

You can manually add e-mail addresses to the SPAM.CFG to help get it started.
I use a plain ASCII text editor that does *NOT* put a Ctrl-Z at the end
of the file as an end-of-file marker.
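
If your editor does append a Ctrl-Z (byte 0x1A), something like the
Python sketch below could remove it before the file is used.  The
function names are hypothetical, and this README does not say how
UUCPSF itself reacts to the marker, so treat this as a precaution only.

```python
CTRL_Z = b"\x1a"  # DOS end-of-file marker

def strip_ctrl_z(data):
    """Return data with any trailing Ctrl-Z bytes removed."""
    return data.rstrip(CTRL_Z)

def clean_file(path):
    """Rewrite path without a trailing Ctrl-Z; return True if changed."""
    with open(path, "rb") as handle:
        data = handle.read()
    cleaned = strip_ctrl_z(data)
    if cleaned == data:
        return False  # nothing to remove, file left untouched
    with open(path, "wb") as handle:
        handle.write(cleaned)
    return True
```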


WHAT DOESN'T IT DO?:
--------------------
It doesn't reject messages based on content of the message or subject line.
I understand that there can be censorship issues if you try filtering out
some messages, but some of the filtered messages get in anyway.

Also, I have noticed that many people quote a whole message when replying
to messages (including spam).  Some people forward particularly noteworthy
spam.  In those cases, the contents would be the same as the spam, but a
valid user was trying to pass on and comment on something to someone else.


WHY DID I DO THIS?:
-------------------
UUCPSF was written because I would prefer *NOT* to look at incoming spam
(either in my mail program, or on the hard drive) if I don't have to.
And the number of users of my BBS is decreasing, while the number
of (spam) messages is increasing.  Also, in case some spammer has not
forged an e-mail address (unlikely I know, but it is possible) then I
wouldn't want spam bounce messages going back, letting them know
they have a valid domain, and therefore should continue spamming every
name they have, as well as making up a few new names in the hopes of
reaching someone new.


CHANGES FROM v1.00
------------------
1) None.  This is v1.00.  :-)


SYSTEM REQUIREMENTS/WHERE HAVE I TESTED IT?:
--------------------------------------------
I have tested/used this with Wildcat v4.20 and wcGate v4.20 running
on a NetWare v3.20 file server, with DOS workstations running on
Novell DOS v7.0 and MS-DOS v6.22.  I have tested it on 486-66,
Pentium 75 and Pentium 233 machines.


SETUP/INSTALL:
--------------
The configuration files, UUCPSF.CFG, REJECT.CFG and ACCEPT.CFG, must be
in whatever directory the program file, UUCPSF.EXE, is in.  The program
file must be on whatever drive the processing directory is on
(Line 2 of the configuration file).

If SPAM.CFG doesn't already exist, UUCPSF will create one as it runs.
I have included a sample SPAM.CFG from my system.  Feel free to modify
or delete it.

UUCPSF.EXE will not run unless all the *.CFG files exist (except SPAM.CFG)
and are in the same directory as the program.

The UNIX2DOS and DOS2UNIX programs can be in either the directory where
UUCPSF.EXE is installed, or in any directory in the path.

UUCPSF.CFG:
-----------
Line 1: The directory where the incoming UUCP messages are stored.
        This must be a valid directory, or this program won't run.
Line 2: A directory where you can safely process the messages.
        This must be a valid directory, or this program won't run.
Line 3: A directory where you can store rejected messages.
        This must be a valid directory, or this program won't run.
Line 4: A directory where you can store messages to unknown recipients.
        This must be a valid directory, or this program won't run.

The program will not run unless all four directories in this
configuration file are valid *AND* are not duplicates of each other.

Line 5: Your choice of whether to "delete" or "save" suspected spam messages.
        You must put either "delete" or "save" (without the quotes) or
        this program won't run.
Line 6: This must be a number.
        If a message is addressed to more than this number of users, it is spam.
Line 7: "accept" or "reject" addressee if not in "accept.cfg" or "reject.cfg".
        accept = do nothing to message, let rest of normal processing take place.
        reject = copy message to the unknown recipients directory (Line 4).
        (After all, this could be a valid message, but someone just
        misspelled the user's name.)
Line 8: The name of your ISP, as listed on the first line of the *.X files.
        This is needed in case the *.X file is missing and needs to be recreated.

UUCPSF.CFG must be at least 8 lines long.  It can be longer, but UUCPSF
only reads the first 8 lines of this file.

The rest of the lines in the configuration file are ignored, so you can
put whatever comments you want there.
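
The rules for UUCPSF.CFG can be summarized with a small Python sketch.
It is not the program's own code: the function name, the dictionary
keys, the case-insensitive comparisons and the injectable dir_exists
check are all assumptions made for illustration.

```python
import os

def load_config(lines, dir_exists=os.path.isdir):
    """Validate the first 8 lines of UUCPSF.CFG and return them as a dict."""
    if len(lines) < 8:
        raise ValueError("UUCPSF.CFG must be at least 8 lines long")
    fields = [line.strip() for line in lines[:8]]  # later lines are ignored
    dirs = fields[:4]
    for directory in dirs:
        if not dir_exists(directory):
            raise ValueError("not a valid directory: " + directory)
    # DOS paths ignore case, so compare lower-cased without a trailing slash.
    if len({d.rstrip("\\/").lower() for d in dirs}) != 4:
        raise ValueError("the four directories must not be duplicates")
    if fields[4].lower() not in ("delete", "save"):
        raise ValueError('line 5 must be "delete" or "save"')
    try:
        max_recipients = int(fields[5])
    except ValueError:
        raise ValueError("line 6 must be a number") from None
    if fields[6].lower() not in ("accept", "reject"):
        raise ValueError('line 7 must be "accept" or "reject"')
    return {
        "incoming_dir": dirs[0],
        "work_dir": dirs[1],
        "rejected_dir": dirs[2],
        "unknown_dir": dirs[3],
        "spam_action": fields[4].lower(),
        "max_recipients": max_recipients,
        "unknown_action": fields[6].lower(),
        "isp_name": fields[7],
    }
```

Passing dir_exists as a parameter lets the checks be tried out without
the real directories being present.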


LOG FILES:
----------
UUCPSF creates a new log file with a unique name (based on the current
date and time) every time it is run.  The log file lists the file name
of each message processed, who the sender is, who the receiver is,
whether it thinks the message is spam, and why it thinks the message is
spam.  It also gives you totals for the number of messages processed.

It also creates an INCOMING.LOG file that lists each incoming message
on a separate line.  It lists the file name, the sender and the recipient.
When it finishes each run, it also puts the date and time finished,
and how many incoming messages were reviewed.


THIRD PARTY PROGRAMS:
---------------------
These third party programs are required to make this program functional:
UNIX2DOS and DOS2UNIX.
(they are included with this program)

They must be accessible to UUCPSF.EXE, either by being in the same
directory as the program or (better yet) somewhere in the path.


ASSUMPTIONS/LIMITATIONS:
------------------------
Limitations built into the software by me:

1,000 names in the REJECT.CFG
1,000 names in the ACCEPT.CFG
4,000 names in the SPAM.CFG

All these limits are arbitrary, and can be increased.  But they seemed
more than adequate when processing the messages I had been saving over
the past few weeks.
(If you need any of these limits increased, please let me know.)


UUCP CONFIGURATION:
-------------------
This program has only been tested with UUCP accounts where the message file
and the control file are both 6 character file names starting with 0C and the
file extension is 1 character (the message file is .D and the control file
is .X).  (Newsgroups on my system start with G, but still have the .X or .D
extension).  The control file (.X) is five lines long and follows this
specific format (this is true for my ISP and a friend's ISP).
Note: the leading 0 of the file name is dropped inside the .X file.

      0C01AA.X               - file name on hard drive.
      U daemon ispname       - line 1 - ISPname.
      F D.ispnameC01AA       - line 2 - ISPname and message name.
      I D.ispnameC01AA       - line 3 - ISPname and message name.
      R sender@there.com     - line 4 - Who sent the message.
      C rmail me@ispname.net - line 5 - Who message is addressed to.

If this is NOT the case in your situation, please let me know so I can
update this program (please send a message file and control file so I can
see how your ISP is doing this).
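
For readers who want to check their own control files against the
layout above, here is a rough Python sketch of a parser.  The function
name and dictionary keys are hypothetical, and treating everything
after "C rmail" as a list of recipients is an assumption; it is not
taken from UUCPSF itself.

```python
def parse_control_file(text):
    """Parse a five-line .X control file into a dict."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    # The layout above expects exactly these five line types, in order.
    if [line[0] for line in lines] != ["U", "F", "I", "R", "C"]:
        raise ValueError("unexpected control file layout")
    user_line, file_line, _input_line, sender_line, command_line = lines
    if (len(user_line) < 3 or len(file_line) < 2 or len(sender_line) < 2
            or len(command_line) < 3 or command_line[1] != "rmail"):
        raise ValueError("control file line is incomplete")
    return {
        "isp_name": user_line[2],         # "U daemon ispname"
        "data_file": file_line[1],        # "F D.ispnameC01AA"
        "sender": sender_line[1],         # "R sender@there.com"
        "recipients": command_line[2:],   # everything after "C rmail"
    }
```

A file that fails this check is a good candidate to send in along with
its message file.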


ANYTHING ELSE?:
---------------
Nothing that I can think of for the moment.


REGISTRATION?:
--------------
Send $5.00 to:

Steven Libis
P.O.Box 8106
Mission Hills, CA 91346-8106


THE END?:
---------
Questions, comments, complaints and suggestions are appreciated.

Steven Libis
Sysop, Earthquake City BBS
sysop@eqcity.ktb.net
http://eqcitybbs.tripod.com/
