Thread: Our Anti-spam is Ready! - Accuspam

  1. #121
    Maneater JawZ's Avatar
    Join Date
    Feb 2001
    Yeah man...been plugging your site when I can to those who need it. If you haven't already, I would introduce yourself and your very fine product to the people at

    Even though I'm not using your product at this time, I do feel it is important for us all to work together to fight spam.

    ...formerly the omnipotent UOD

  2. #122
    Quote Originally Posted by UOD
    ...I would introduce yourself and your very fine product to the people at
    Thanks but I personally do not have time to contact the EFF.

    Apparently AccuSpam meets the EFF's desired anti-spam criteria 100%, and better than any other existing method. Perhaps you can contact them on our behalf and let them know that Bayesian filters and Spam Assassin, which they are prominently linking to, do not meet their own criteria, but AccuSpam does:

    "...any measure for stopping spam must ensure that all non-spam messages reach their intended recipients. Proposed solutions that do not fulfill these minimal goals are themselves a form of Internet abuse..."

    "...we would like to see the development of better filtration software on servers, something that could work interactively with the mail recipient in defining what he or she regards as spam using pattern recognition. That is, every time somebody gets a message of a sort he or she does not want, s/he could send it to the filter, thereby making that filter smarter over time, as well as giving it the ability to "learn" as spam techniques develop..."

  3. #123
    Maneater JawZ's Avatar
    Join Date
    Feb 2001
    Quote Originally Posted by accuspam
    Thanks but I personally do not have time to contact the EFF.

    Apparently AccuSpam meets the EFF's desired anti-spam criteria 100%, and better than any other existing method. Perhaps you can contact them on our behalf and let them know that Bayesian filters and Spam Assassin, which they are prominently linking to, do not meet their own criteria, but AccuSpam does:

    "...any measure for stopping spam must ensure that all non-spam messages reach their intended recipients. Proposed solutions that do not fulfill these minimal goals are themselves a form of Internet abuse..."

    "...we would like to see the development of better filtration software on servers, something that could work interactively with the mail recipient in defining what he or she regards as spam using pattern recognition. That is, every time somebody gets a message of a sort he or she does not want, s/he could send it to the filter, thereby making that filter smarter over time, as well as giving it the ability to "learn" as spam techniques develop..."

    Well, I'll see what I can do. I am a member of the EFF and contribute yearly with monetary donations.

    ...formerly the omnipotent UOD

  4. #124
    Quote Originally Posted by UOD
    Well, I'll see what I can do. I am a member of the EFF and contribute yearly with monetary donations.
    Thanks! Any such prominent links will help accelerate the snowball effect of ramping up the statistics AccuSpam uses to detect spammers.

    I am also working on a Bayesian content filter that uses the statistics of all AccuSpam users and will work essentially the same way as the domain blocking. In essence, the domain blocking hypothesis is that some domains send 99.9+% spam and < 0.1% non-spam. The naive Bayesian content filtering espoused by Paul Graham (and afaik used by all current Bayesian anti-spam, e.g. Spam Assassin, Spam Bayes, etc.) attempts to correlate spam features which have much less than 99% probability (especially if measured globally for all correlated users), thus it needs to balance the probabilities with "good words". In contrast, I am working on a Bayesian method that works the same as the domain blocking hypothesis and looks only for the features of spam content which are in 99.9+% of spam and in < 0.1% of non-spam. This improved form of Bayesian content analysis will have advantages over Paul Graham's Bayesian content filter:

    1. Fundamentally it is correlating not the content of spam and non-spam (which is inherently noisy), but the volume of spam and non-spam. What makes spam is its volume, not its message. So this Bayesian does not try to decide what is bad content and good content, as Paul Graham's Bayesian does; it instead just tries to find the features of spam sent in bulk that are distinct from the features of non-spam on the whole.

    2. The risk of a false positive (even in future) will be near 0, e.g. 1 in a million (same as for domain blocking), because it takes into account the patterns of many correlated users and the many permutations of legitimate email.

    3. No way for spammers to corrupt the "good words" (words in non-spam) probability, because my approach does not use the probability of the "good words", only the probability of very "bad words" (words always in spam and never in non-spam).

    4. The effort to identify and train on patterns is shared (divided) amongst all users, so it is many orders of magnitude less effort than per-user (Paul Graham) Bayesian.

    5. The only way for spammers to corrupt the very "bad words" is to fight with other spammers by adding more spam weight to the very "bad words" of other spammers. This is the same as for the domain blocking. The only way for one spammer to defeat AccuSpam for his domain(s) is to correlate well by disapproving the domains of the other spammers. If all spammers are fighting each other, then they actually cancel their attempts to defeat AccuSpam, and aid AccuSpam in detecting them. For example, say there are 1000 spammers; then 999 are against each 1 of them, so they add 999 times more disapproval data than approval data. If a spammer joins AccuSpam and does not disapprove his fellow spammers, then his votes are ignored because they won't correlate well with AccuSpam users who are disapproving the spammers. It is like a dog chasing its tail: there is no way for them to catch it.

    6. Since most single words occur both in spam and non-spam, my improved global Bayesian will look at n-grams of word combinations, since it is not usually the word "sexy" but the context in which "sexy" is used in a phrase that can uniquely identify a spam.
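
    A minimal sketch of the idea in points 1 to 6, assuming messages pooled from all users are already labelled spam or non-spam. The function names, the bigram choice and both thresholds are hypothetical illustrations, not AccuSpam's actual code or values. It keeps only n-grams that appear almost exclusively in spam and never consults "good word" probabilities.

```python
from collections import Counter


def ngrams(text, n=2):
    """Return the set of lowercase word n-grams in a message."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def very_bad_features(spam_msgs, ham_msgs, n=2, min_spam_share=0.999, min_count=1000):
    """Pick n-grams seen almost only in spam, pooled over all users.

    min_spam_share and min_count are illustrative thresholds (assumptions).
    """
    spam_counts = Counter(g for m in spam_msgs for g in ngrams(m, n))
    ham_counts = Counter(g for m in ham_msgs for g in ngrams(m, n))
    bad = set()
    for gram, s in spam_counts.items():
        total = s + ham_counts[gram]  # Counter returns 0 for unseen n-grams
        if s >= min_count and s / total >= min_spam_share:
            bad.add(gram)
    return bad


def looks_like_spam(message, bad_features, n=2):
    """Flag a message only if it contains a very 'bad' n-gram."""
    return not ngrams(message, n).isdisjoint(bad_features)
```

    With a toy corpus and `min_count=2`, the word "sexy" alone is not flagged, but a phrase that only ever shows up in spam is.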

  5. #125
    I have thought of an easier way that spammers can defeat the Bayesian used in afaik all existing (Paul Graham) Bayesian anti-spam, easier than what I wrote before:

    For each spam run, they add 4 or 5 random letters (chosen from a-z and A-Z) to the end of each word that is often used in spam (e.g. ViagraAgtU). Do not insert HTML, space, punctuation or anything between the random letters and the spam word. Simple. Done. All existing (Paul Graham) Bayesian defeated 100%.

    The reason is that, given 26 letters times 2 for capitals, the number of random 4-letter combinations is (26*2)^4 = 7.3 million. Thus it will take at least 7.3 million spam runs before, on average, a Bayesian filter will see the same spam word in more than one spam. Given that a Bayesian filter needs to see a word many times before giving it a high spam probability, probably a billion spam runs will still not be detected. Spammers do not have to ask the other spammers not to use their combinations, because all spammers choose randomly.

    Once the 4-letter combinations start getting caught by Bayesian, then just switch to 5 letters, which gives about 380 million combinations. With 6 letters it is about 20 billion, i.e. trillions of spam runs before detection by Bayesian.
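
    The combination counts are easy to check. This small sketch (the function name is just for illustration) computes the number of distinct suffixes drawn from a-z and A-Z: 7,311,616 for 4 letters, 380,204,032 for 5 and 19,770,609,664 for 6.

```python
ALPHABET = 26 * 2  # a-z plus A-Z


def suffix_combinations(length):
    """Number of distinct random letter suffixes of the given length."""
    return ALPHABET ** length


for length in (4, 5, 6):
    print(length, suffix_combinations(length))
```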

    Since my improved Bayesian correlates all users, using the same combinations for all spams in a spam run will be detected by my improved Bayesian. The way for spammers to avoid detection with my improved Bayesian is to randomize the letters for each spam in a spam run.

    Anti-spam could attempt to identify word stems which end (or start) with randomized letter combinations, but analyzing the last letters for randomness could create false positives. Some legitimate non-spam emails contain unlikely letter combinations, e.g. hexadecimal numbers.

    Anti-spam could attempt to use a dictionary of words to extract the beginning word stem, and ignore words with stems not found in the dictionary, but the dictionary would have to contain all possible spellings of each spam word stem; otherwise the anti-spam would miss unknown and misspelled spam words (e.g. Viaqra).

    So to be most clever, spammers should combine misspellings with random letter appendages (e.g. ViaqraAgtU) and avoid using anything (no html) but letters a-z and A-Z in their emails. They can defeat all Bayesian that way, even my improved Bayesian if they randomize each spam of a spam run.

    Spammers would also have to randomize any urls they insert in their spams. They probably should do it more intelligently than just adding a random "?xxxx" to the end, as this is easy for anti-spam to ignore. Instead they must randomize the domain (or the portion after a non-spam domain). It is much more costly for spammers to randomize their domains and urls. As long as spammers have a correlatable url or reply email address in their spam, then BrightMail and Bayesian can correlate them, but my improved Bayesian can correlate it much faster (since it has many users' data, and spammers can change urls frequently, compared to only one user's data).

    And spammers could combine this with my previous ideas to insert normal prose to help defeat and pollute the good word probabilities of Paul Graham type Bayesian:

    Here is some interesting analysis and examples from another person who believes Bayesian content filtering can and will be defeated:
    Last edited by accuspam; 08-14-04 at 05:09 AM.

  6. #126
    Maneater JawZ's Avatar
    Join Date
    Feb 2001
    What are your thoughts on email encryption/digital signatures? Any problems in how encrypted email interfaces with your service?

    ...formerly the omnipotent UOD

  7. #127
    Quote Originally Posted by UOD
    ...Any problems in how encrypted email interfaces with your service?
    As far as I know, no conflicts in terms of the sender address statistical blocking. AccuSpam does not care about what you put in the email, as long as the normal headers exist.

    However, for the global Bayesian content blocking we are considering, if the body of the email is encrypted, then that aspect of spam detection would be defeated. However, I think you are referring to a signature which identifies a sender, not the encryption of the email content. In that case, I see no conflict with AccuSpam.

    Note that AccuSpam does not currently propagate all the headers (only for non-Approved Senders in the free version, or all senders in the yet unreleased paid version), so any special headers (that normal email does not need) would be lost. Fixing this is on my medium-term To Do list.

    Quote Originally Posted by UOD
    What are your thoughts on email encryption/digital signatures?...
    If you are referring to encrypting the content of an email using public/private key (e.g. PGP) so that only the sender and recipient can decrypt, then I think that is really not needed or practical for vast majority of users.

    What we really need is secure transport (e.g. SMTP and POP over SSL), so that the email can not be sniffed during transmission, which is especially important now with wireless transmission. Minimally, every user needs to demand that their ISP support APOP or POP over SSL (it is amazing how many major ISPs do not!), and then set their email program to use it, to prevent the sending of their email passwords in clear text. My ISP supports APOP, but my Host (which is also the Host of AccuSpam) still does not support APOP (even after 2 years of me asking them to), and it was a source of irritation when a college student walked up to me in a coffee shop where I was connected via wireless and showed me my email password. Since then, I always change my email password before doing a wireless session, and then change it back afterwards. Note that other than this, it is a very secure and excellent Host. They do support most other major secure connection mechanisms, such as SSH (a secure replacement for telnet), SFTP (file transfer over SSH), etc.
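
    To illustrate why APOP keeps the password off the wire: the client sends an MD5 digest of the server's greeting timestamp followed by the shared secret, never the secret itself. This is a sketch of that digest step only; the timestamp, user name and password below are made-up values. In practice Python's `poplib` already offers `POP3.apop()` and `POP3_SSL`, so nobody needs to hand-roll this.

```python
import hashlib


def apop_digest(server_timestamp, password):
    """APOP digest: MD5 of the server's timestamp followed by the shared secret."""
    return hashlib.md5((server_timestamp + password).encode("ascii")).hexdigest()


# Hypothetical greeting timestamp and password, for illustration only.
timestamp = "<1234.5678@mail.example.com>"
digest = apop_digest(timestamp, "my-secret")
command = "APOP someuser " + digest  # what actually crosses the network
```

    Because the timestamp changes with every connection, a sniffed digest cannot simply be replayed later. MD5 is considered weak today, so of the two options mentioned above, POP over SSL is the stronger one.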

    If you are instead referring to the use of a digital signature to identify that an email really came from you, then we think this is so important that it is actually part of the way AccuSpam will detect email forgery. Soon there will be a new feature on AccuSpam, where you insert a value in your signature so that all AccuSpam users can receive your email. If you don't sign up for it, AccuSpam users will still get your email, but your email address can be forged by a spammer. Initially we expect major corporations to sign up for this once we have many AccuSpam users, so that they can stop spammers from doing phishing scams using their corporate email addresses. This will also be a free service available for instant signup to individual users.

    -Shelby Moore
    Last edited by accuspam; 08-18-04 at 07:49 AM.

  8. #128
    Improved the correlation of AccuSpam users by only correlating to the target user on domains the target user thinks are spammers. This was done to ensure that any attempt to approve a spammer by joining AccuSpam to pollute the global stats would be ineffective, because they would also have to disapprove a greater number of other spammers in order to correlate with other users.

    An unexpected benefit is that it increased the number of correlations by 50%! So we are 50% closer to critical mass. In hindsight, this makes sense (thinking to myself "why didn't I realize that!"). Many users will disagree on the % of spam received from non-spam domains, with estimates ranging from 0% up to 80 or 90%. But most users will agree that the spammer domains are sending more than 90% spam.

    Some users may see an instant and significant decrease in the length of their Daily Summaries from this simple improvement.
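
    A rough sketch of the correlation change described in this post, under the assumption that each user's votes can be represented as a mapping from sender domain to spam / not spam. The data structure and function name are hypothetical; the real scoring is not published here. Only domains the target user marked as spam are compared, so disagreement about ordinary domains no longer lowers the score.

```python
def correlation(target_votes, other_votes):
    """Agreement between two users, measured only on domains the target marked as spam.

    Each votes dict maps a sender domain to True (spam) or False (not spam).
    Returns None when the other user has voted on none of those domains.
    """
    target_spam = [d for d, is_spam in target_votes.items() if is_spam]
    shared = [d for d in target_spam if d in other_votes]
    if not shared:
        return None
    agree = sum(1 for d in shared if other_votes[d])
    return agree / len(shared)
```

    A user who approves a spammer domain that the target disapproves scores lower, while a user who merely disagrees about an ISP domain is unaffected.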

  9. #129
    I am currently having doubts whether I will implement the "improved Bayesian content filtering" I outlined in previous post:

    I have realized that any Bayesian filter which recognizes urls and domains could be effectively used by a spammer to blacklist any less frequently seen domain on the web, by sending out a lot of spam containing that domain. Chalk that up as yet another hole that could be exploited by spammers against Bayesian.

    I could ignore domains and urls in content, and may do that as a defense against current day spam until our critical mass builds for statistical sender blocking, but then as outlined previously, defeating all Bayesian content filters is fairly trivial for spammers if the Bayesian is not considering the urls and domains in the content:

    AccuSpam's statistical sender blocking can NOT be polluted so easily by spammers because we can detect forgery of sender. We have no corresponding way to detect forgery of content.

  10. #130
    Another major improvement has been made to AccuSpam.

    The Daily Summary now has emails ranked by order of greatest chance to be a non-spam first.

    And the chance of being a non-spam is listed below each email summary in the Daily Summary.

    Thus the AccuSpam user can decide how far down to browse the Daily Summary based on his/her desired false positive risk.

    Now there is no excuse not to reply to the Daily Summary. There are some users who are not replying to the Daily Summary, and they will have no one to blame but themselves if they lose a legitimate email. We can not hold their quarantine indefinitely. We will probably automatically purge emails from the quarantine which are 14 days old and have less than a 1 in 1000 chance to be a non-spam. Or something reasonable like that.

    Additionally, the statistical domain blocking algorithm is run again for a user's previously processed emails just before sending the Daily Summary, so that any global data that accumulated since the first processing has another chance to detect the spam (as having < 1 in a million chance to be non-spam) and not include it in the Daily Summary.

    I already noticed this has reduced the lengths of some users' Daily Summaries.
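
    For readers wondering how the ranking and the proposed purge rule fit together, here is a small sketch. The record type, field names and function names are hypothetical; only the ordering (most likely non-spam first) and the "14 days old and less than 1 in 1000" rule come from the post above.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Quarantined:
    subject: str
    p_nonspam: float  # estimated chance the email is legitimate
    received: date


def daily_summary(entries):
    """Order entries with the most likely non-spam first."""
    return sorted(entries, key=lambda e: e.p_nonspam, reverse=True)


def should_purge(entry, today, max_age_days=14, p_floor=1 / 1000):
    """Purge rule floated in the post: old enough and very unlikely to be non-spam."""
    age = today - entry.received
    return age >= timedelta(days=max_age_days) and entry.p_nonspam < p_floor
```

    A user who accepts, say, a 1 in 1000 false positive risk can stop reading the summary at the first entry whose `p_nonspam` falls below that value.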

  11. #131
    We made an error in the improvement we made this morning:

    which caused the Daily Summaries to contain blank entries.

    This has been fixed and replacement Daily Summaries have been emailed to all users.

    Do not worry. No email was lost. It was merely an error in the display of the information in the database. No information in the database was affected.
    Last edited by accuspam; 08-16-04 at 02:18 AM.

  12. #132
    Junior Member
    Join Date
    Aug 2004
    sounds dangerous

  13. #133
    Quote Originally Posted by cigamkcalb
    sounds dangerous
    Absolutely not dangerous.

    Before sending the Daily Summaries, the data from the database is copied into an array in memory. The error was that we were not reading correctly from that array when writing the values into the text of the Daily Summary email. No manipulations are performed on the database when composing the Daily Summary, because it is purely a display operation. That is why it wasn't as crucial to test it exhaustively before release. Be confident that any code that changes the database is tested exhaustively both before and during release and continually monitored.

    Besides, the database is backed up frequently.

  14. #134
    I am very happy to report that the backlog in some users' quarantines is being automatically reduced by the improvement I made to apply the statistical blocking again before sending Daily Summary.

    We had a 20% increase in enabled AccuSpam users overnight!

    The global statistical blocking among correlated AccuSpam users is starting to catch up with the rate that spammers use new domains.

    I am confident we will see the Daily Summaries reduce from here.

    The remaining major work for me is to figure out how to deal with UNSPOOFED spam from domains which do not send 100% spam, e.g. major ISP domains. We already delete the spoofed spam in most cases. Luckily this UNSPOOFED spam from non-spammer domains is a small % of the spam being received because ISPs have incentive to stop spam coming from their networks. I will probably have to apply some sort of "safe" Bayesian to UNSPOOFED spam from non-spammer domains. And may be able to apply "safe" reverse DNS on free email that are exclusively Webmail oriented. As well, the statistical blocking by sender email address (not just domain which only needs < 1000 users) will kick in once we have 10,000 AccuSpam users.

    As always, in no case should you receive spam in your Inbox with (paid version of) AccuSpam (and only minute amount if free version used correctly as detailed in the website FAQ).

  15. #135
    We had the first user complain angrily about AccuSpam, and I feel it is important to explain the scenario where AccuSpam will absolutely not work.

    (not counting the old version of AccuSpam in 2003, which was a totally different product and algorithm).

    We did not bother to ask the user why s/he wanted to disable AccuSpam, but s/he was asking for instructions to disable and they seemed angry about not receiving any of their email. We simply pointed them to the instructions that are already on the website for disabling.

    We realized that the user could not have been receiving the Daily Summaries or was not properly replying to the Daily Summaries. That is the only way they could have received none of their email.

    So I realized that the user is probably running another anti-spam (probably at their ISP, possibly even without the user's knowledge) and that anti-spam is erroneously blocking the Daily Summary emails from AccuSpam to the user.

    It could very well happen that some people (possibly other anti-spam companies) who feel competition to AccuSpam will try to hurt us by blacklisting our IP address.

    So if you are using AccuSpam and you do not receive the Daily Summaries, then complain to your ISP that they are erroneously blocking your legitimate email.

    AccuSpam does not delete legitimate email. Sadly most other anti-spam does. So do not blame AccuSpam if you run another anti-spam that blocks AccuSpam. Users would be much wiser to run one anti-spam at a time.


    Apparently we were wrong and the problem was the user was not replying to the Daily Summaries.

    Here is copy of our email response to her/his further explanation of the problem they were having. Note that no personal information has been disclosed. This is merely to answer a problem that other users may run into:

    ======AccuSpam wrote to AccuSpam user========
    Thanks for explaining your problem further, especially since we did not even ask you to. That is appreciated. We had incorrectly assumed you were not receiving the Daily Summaries from us:

    Sounds to me like you are trying to type into the Daily Summary email you received from us.

    You must click "Reply" first to create a new reply email. Make sure your email program is configured to include the sender's email at the bottom when you reply to a sender. Otherwise you need to copy and paste the email from AccuSpam into the reply.

    Then you can type into the [ ] boxes in the reply email.

    NOTE: you do NOT need to type a space for each message you wanted deleted. The spaces are already inserted by default in the Daily Summary AccuSpam sends to you.

    At 08:37 PM 8/15/2004 -0400, AccuSpam user wrote:
    >I am trying to place letters (A & R) in the Message ID brackets. I am not
    >able to type anything in the spaces ... nor can I put a space in the Message
    >Brackets for the messages that I want deleted.
    Last edited by accuspam; 08-16-04 at 06:00 AM.

  16. #136
    Improved the Daily Summary instructions so users understand that they do not have to manually type an empty space for each email they wish to delete. They merely reply. They only need to type the A and the R. Added:

       Note the [ ] below already have an empty space by default.

    to the bottom of:

    -  Place empty space in Message ID brackets [ ] for messages you want deleted
       permanently and to permanently block sender.
       Future emails from sender will be deleted.
       Use empty space if sure is spam.
       Note the [ ] below already have an empty space by default.

  17. #137
    As I predicted, the spammers are getting more astute at attacking the popular Bayesian content filtering used by most anti-spam (not used by AccuSpam).

    The following content is an attempt at normal prose, but I think it is still too non-random and, as I said, the urls are what can still be correlated by Bayesian, provided you do not mind your legit email getting blocked by Bayesian when spammers insert non-spam urls in their spams:

    Subject: joke inside

    <DIV><FONT face=Arial size=2><A
    href="">Three blondes were taking a walk in the country when they came upon a line of tracks. The first blonde said, "Those must be deer tracks!" The second blonde said, "No, stupid, anyone can tell those are rabbit tracks!" The third blondie said, "No, you idiots, those are horse tracks!" They where still arguing ten minutes later when a train hit them.</A></FONT></DIV>
    <DIV><FONT face=Arial size=2><A
    href=""><IMG alt="" hspace=0
    src="" align=baseline border=0></A></FONT></DIV>
    <DIV><FONT face=Arial size=2><A
    href="">A blonde got a dent in her car and took it in to the repair shop. The repairman, noticing that the woman was a blonde, decided to have a wee bit of fun. So he told her all she had to do was take it home and blow in the tailpipe until the dent popped itself out. After 15 minutes of this, the blonde's blonde friend came over and asked what she was doing. "I'm trying to pop out this dent, but it's not really working." "Duh. You have to roll up the windows first!"</A></FONT></DIV>

  18. #138
    Request for help.

    To further improve AccuSpam, I need a list of the mail domains for subscribers of major and stable ISPs all over the world. Specifically ones that we know have a bona fide non-spam subscriber base of a size that would be worth a spammer attacking.

    And for each one, I need a copy of the email headers (specifically the "Received" header lines) from an email sent by a subscriber of that ISP.

    The reason I need this information is to compile a database of ISP domains that support "Reverse DNS" (e.g. PTR records in DNS for their IPs) and also a list of the nameservers for each ISP (e.g. NS records). I can lookup this information from DNS given the email headers.
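
    As an illustration of the first step (getting from headers to something that can be looked up in DNS), here is a sketch that pulls the bracketed IP addresses out of the "Received" lines using Python's standard `email` package. The function name and regular expression are assumptions made for the example, and it only handles IPv4 addresses written in square brackets.

```python
import re
from email import message_from_string

# Matches an IPv4 address in square brackets, e.g. "(unverified [192.0.2.10])".
IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_ips(raw_headers):
    """Return bracketed IPv4 addresses from Received headers, topmost header first."""
    msg = message_from_string(raw_headers)
    ips = []
    for value in msg.get_all("Received", []):
        ips.extend(IP_IN_BRACKETS.findall(value))
    return ips
```

    Each address can then be handed to a PTR lookup, for example `socket.gethostbyaddr()` from the standard library or a `dig -x` query.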

    It seems that we can delete a lot of the spam that AccuSpam is currently summarizing in the Daily Summaries simply by looking for forged Reverse DNS records! This is different than how most anti-spam use Reverse DNS. I have noticed that many spammers set a Reverse DNS record for their IP to match the lie they give in the email headers, but that then of course the nameservers do not match the major ISP they are pretending to be sending from.

    For spams from IPs which do not have a Reverse DNS record, we will not delete this (as some anti-spam do), as this would cause false positives, but we can assign a probability to this which when combined with other metrics can help detect the spam.
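
    The check described above could be sketched roughly like this. It is only an illustration under stated assumptions: the DNS answers (PTR name and the NS names from the authority section) are passed in by the caller, `known_isp_zones` is a hypothetical hand-built table, and taking the last two labels of a hostname is a crude stand-in for proper registered-domain handling.

```python
def zone(hostname):
    """Crude registered-domain guess: the last two labels of a hostname."""
    return ".".join(hostname.lower().rstrip(".").split(".")[-2:])


def classify_sender(claimed_domain, ptr_name, authority_ns, known_isp_zones):
    """Classify a connecting IP from its PTR record and the NS records for that PTR.

    known_isp_zones maps an ISP mail domain to the set of zones its
    nameservers live in (a hypothetical table built from collected headers).
    """
    if ptr_name is None:
        return "no-ptr"  # not deleted; only raises the spam probability
    if zone(ptr_name) != zone(claimed_domain):
        return "mismatch"
    expected = known_isp_zones.get(zone(claimed_domain))
    if expected is None:
        return "unknown-isp"
    ns_zones = {zone(ns) for ns in authority_ns}
    # A PTR that names the ISP but is served by someone else's nameservers is forged.
    return "ok" if ns_zones & expected else "forged-ptr"
```

    Only the "forged-ptr" result would justify deletion under the approach in this post; "no-ptr" and "mismatch" would feed into a combined probability instead.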

    Start here for lists of major ISPs:

    Any contributions can be emailed to me at:

    If you subscribe to one of those ISPs above, simply send me an email with subject "Here is an ISP header you requested".

    Shelby Moore

  19. #139
    Implemented the pseudo-"Reverse DNS" test (somewhat different than the way other anti-spam use "Reverse DNS"), and have populated it to detect and delete sender email address forgeries from and I can see many of these forgeries now being deleted:

    You should see much less (if any) spam in your Daily Summaries from senders that have and in their email address.

    For major non-webmail ISPs, it will not delete because we can not be sure legitimate email will pass "Reverse DNS" in that case, but it will place a higher probability of spam on those that fail "Reverse DNS". Most importantly it will delete those that forge the "Reverse DNS" of major ISPs.

    FYI, this may seem non-intuitive, but it is MORE important to block forgery of free email domains than paid email domains, because AccuSpam deletes spam from non-existent senders, and thus it is much more costly for spammers to obtain paid email accounts (or to use their mailing list as the senders), or obtain their own domains, than to obtain free email accounts and forge them. The reason the spammer must forge the free email account they created is because they can not send huge volumes of emails through the webmail interface of the free email provider.
    Last edited by accuspam; 08-18-04 at 02:34 PM.

  20. #140
    Forged spam from hotmail and yahoo is now eliminated from the Daily Summary. The only way to get a spam from hotmail and yahoo in your Daily Summary is if the spammer actually sent the spam from the webmail of hotmail or yahoo (directly or via a program which interfaces to the webmail, e.g. Outlook, Hot Popper, Yahoo Pops, etc.), and I think yahoo and hotmail prevent sending a huge volume of email from their webmail.

    Here is an example that AccuSpam detected and deleted (with "xxxxxxxx" used to obscure private AccuSpam user data):

    Return-path: <>
    Received: from (unverified []) by
     (Rockliffe SMTPRA 5.3.11) with ESMTP id <>;
     Tue, 17 Aug 2004 10:17:14 -0400
    Received: from (Not Verified[]) by
    xxxxxxxxxxxxxxx with DEFSCAN (v3)
            id <BH0cca5226>; Tue, 17 Aug 2004 10:17:12 -0400
    Message-ID: <>
    Reply-To: "Katina Thornton" <>
    From: "Katina Thornton" <>
    To: xxxxxxxxxxxxxxxxx
    Subject: V.iagra on s.ale, save moolah ;    bdlkvdpkgogsc 
    Date: Tue, 17 Aug 2004 12:12:53 -0300
    MIME-Version: 1.0 (produced by decimatesportsman 1.7)
    Content-Type: multipart/alternative;

    Here is the analysis AccuSpam did, where "38k" means it deleted the email as forgery. You can see the spammer was actually sending from, which is "" probably on "" network:

    28: V.iagra on s.ale, save moolah ;  bdlkvdpkgogsc
    ; <<>> DiG 9.2.3rc4 <<>> -x209.153.138.124
    ;; global options:  printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10170
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 1
    ;        IN        PTR
    ;; ANSWER SECTION: 86400 IN        PTR
    ;; AUTHORITY SECTION: 86400        IN        NS 86400        IN        NS 86400        IN        NS
    ;; ADDITIONAL SECTION:        164766        IN        A
    ;; Query time: 25 msec
    ;; SERVER:
    ;; WHEN: Tue Aug 17 10:19:51 2004
    ;; MSG SIZE  rcvd: 169
    Last edited by accuspam; 08-17-04 at 01:36 PM.
