Author: Alexander

  • Importance of comments and automatic spam protection.

    The importance of comments on a website, blog or forum is understood by everyone by now, because comments directly influence traffic. To search engines, comments signal that the website content is actively discussed, and many users judge the usefulness of a product by the comments on it. But there is a catch: some users comment only to get a backlink and traffic from your website. As a result, all websites and forums are affected by spam, automatic or manual.

    Today I would like to talk about protecting a web resource from spam. Spam in comments, on blogs or on a forum creates a considerable workload for moderators, and if you run your own personal resource, blog or forum, all this work falls on you and takes up a lot of time on routine tasks. No user will read a forum or website overwhelmed with spam, so protective measures have to be taken.

    So what methods or principles of spam protection do various software products offer?

    In fact, all spam protection methods come down to making it harder for spam bots to reach the comment form. But what makes access harder for spam bots makes it harder for users as well; that is the other side of the coin. Entering a CAPTCHA, adding up numbers and answering questions is a nuisance for everyone. Such methods drive visitors away, and comments and forum messages are a source of traffic. And the more sophisticated the protection methods become, the more sophisticated the bots become, as they learn how to get around the protection.

    Another aspect is protection from manual spamming. It consists of pre-moderation of messages, i.e. constant control by a moderator, which also drives visitors away, as they want to communicate in real time rather than wait for their messages to be moderated. If you do not use pre-moderation, you can delete spam messages manually after they appear, which is not a good option either.

    You have to hire staff for moderation, and this means extra costs.

    And if it is your personal website and you are the moderator, it is YOUR time that is wasted. So why not automate the process?

    So let’s consider another spam protection method, which lacks the drawbacks of the previous ones. Here users can post messages and comments in real time, and they are freed from CAPTCHA and the other tools imposed on them by anti-spam methods.

    Let’s take a closer look at this solution, the CleanTalk.Spam protect plug-in. The plug-in currently works with the following platforms: Joomla, phpBB, WordPress, DataLife Engine, IP Board, vBulletin.

    Its principle of operation is based on message relevance: it evaluates how relevant a comment is to the topic of the website, blog or article it is written for.

    If a comment is written by a user and fits the topic, it is published automatically. If the message does not match the topic, it is sent to manual moderation and the user is notified of the reason for blocking. The user can then correct the message, while a bot will have to move on to a less protected resource.
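
    To make the idea concrete, here is a deliberately naive sketch of a relevance check. It is not CleanTalk’s actual algorithm, which is not described here; the word-overlap score, the 0.3 threshold and the function names are all assumptions chosen for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and return the set of its words of 3+ letters."""
    return set(re.findall(r"[a-z]{3,}", text.lower()))

def relevance(comment, article):
    """Share of the comment's words that also occur in the article (0.0 to 1.0)."""
    comment_words = tokenize(comment)
    if not comment_words:
        return 0.0
    return len(comment_words & tokenize(article)) / len(comment_words)

def decide(comment, article, threshold=0.3):
    """Publish on-topic comments; send everything else to manual moderation.

    The threshold is a hypothetical value, not one taken from the service.
    """
    return "publish" if relevance(comment, article) >= threshold else "moderate"

article = "How to protect a WordPress blog from spam comments without captcha"
on_topic = "Removing captcha from my WordPress blog reduced spam complaints"
off_topic = "Cheap watches and pills, visit our online shop today"

print(decide(on_topic, article))
print(decide(off_topic, article))
```

    A real service would need far more than word overlap, but the decision has the same shape: a score, a threshold, and two outcomes (publish or manual moderation).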

    As a result, the website is protected from spam and off-topic messages, and visitors get a simple and convenient registration procedure and comment form, which is undoubtedly an advantage, as comments are a source of content. Resources previously spent on moderation can now be redirected to more productive activity.

    Additional features of the CleanTalk plug-in make it possible to block messages containing stop words and to build your own list of stop words. The service can also send messages that contain obscene language, swearing or incitement to ethnic hatred to manual moderation.
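
    A stop-word filter of this kind can be sketched in a few lines. This is only an illustration under my own assumptions (whole-word, case-insensitive matching; the names `contains_stop_word` and `route` are hypothetical), not the plug-in’s code.

```python
import re

def contains_stop_word(message, stop_words):
    """Return True if any stop word occurs in the message as a whole word."""
    words = set(re.findall(r"\w+", message.lower()))
    return not words.isdisjoint(word.lower() for word in stop_words)

def route(message, stop_words):
    """Block messages that contain a stop word; let the others through."""
    return "block" if contains_stop_word(message, stop_words) else "publish"

# A user-defined list, as the plug-in allows; the words are just examples.
my_stop_words = ["casino", "viagra"]

print(route("Best online CASINO bonuses here", my_stop_words))
print(route("Great post, thank you", my_stop_words))
```

    Matching whole words rather than substrings avoids blocking harmless words that merely contain a stop word.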

  • Comparative statistics of spam activity on 4 popular CMS

    We have briefly analysed spam activity on 4 CMS platforms protected by the CleanTalk service and got the following results:

    • First place — WordPress, 99.46% of spam messages, 88 thousand queries in selection.
    • Second place — DataLife Engine, 91.12% of spam queries (registrations + messages), 3 thousand queries in selection.
    • Third place — phpBB3, 86.98% of spam queries (registrations + messages), 655 thousand queries in selection.
    • Fourth place — Joomla, 42.08% of spam messages, 2 thousand queries in selection.
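
    The percentages can be turned into approximate absolute numbers. The sketch below simply multiplies each spam share by the sample size quoted above; since the sample sizes are rounded to thousands, the results are rough estimates and not figures from our statistics.

```python
# (spam share in percent, queries in the sample), as listed above
samples = {
    "WordPress": (99.46, 88_000),
    "DataLife Engine": (91.12, 3_000),
    "phpBB3": (86.98, 655_000),
    "Joomla": (42.08, 2_000),
}

def estimated_spam(share_percent, total_queries):
    """Approximate number of spam queries in a sample."""
    return round(total_queries * share_percent / 100)

estimates = {
    cms: estimated_spam(share, total) for cms, (share, total) in samples.items()
}

# Print the platforms from the largest estimate to the smallest.
for cms, count in sorted(estimates.items(), key=lambda item: item[1], reverse=True):
    print(f"{cms}: about {count} spam queries")
```

    With these rounded inputs, phpBB3 has the largest absolute estimate because its sample is much bigger, even though WordPress has the highest share of spam.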

    The figures show that the undisputed “winner” is WordPress. This platform is the most popular among spammers.

    Joomla’s low percentage is explained by the fact that CleanTalk has so far worked only with the JComments component; support for the standard feedback form component was added only recently. JComments does not work if the visitor’s browser does not support JavaScript, which accordingly affects the number of spam queries.

    We broadened our analysis a bit.

    First, we drew a diagram of the data above to make it easier to read. We also counted the average number of spam attacks per website for each platform. We used the queries from August 2012 for our analysis. Here are the results:

    • phpBB3 — 180 spam attacks a day on average.
    • Joomla — less than 1.
    • WordPress — 338 spam attacks a day on average.
    • DataLife Engine — 101 spam attacks a day on average.
  • CleanTalk has expanded its spam protection services for websites: a protection app for the vBulletin CMS has been released

    CleanTalk has finished developing an application for vBulletin that protects a forum from spam.

    This application protects against automatically distributed spam, as well as against spam bot registrations. As an extra service, users can take advantage of additional functions: they can block messages using stop words or compile their own dictionary of stop words. The service can send messages with obscene words and pejoratives to manual moderation. The protection method offered by CleanTalk makes it possible to switch from methods that get in the way of communication (CAPTCHA, question-and-answer, etc.) to a more convenient one.

    CleanTalk is a cloud-based automatic spam protection service. To detect spam, its multi-stage verification system checks the relevance of a message to the topic of the page, the time taken to fill in the form, etc.
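
    As an illustration of one such stage, here is a minimal sketch of a form fill time check plugged into a simple chain of checks. The `Submission` class, the 3-second minimum and the function names are assumptions made for this example; the real service’s stages and thresholds are not described here.

```python
from dataclasses import dataclass

@dataclass
class Submission:
    text: str
    rendered_at: float   # time the form was shown, in seconds
    submitted_at: float  # time the form was sent, in seconds

def check_fill_time(submission, min_seconds=3.0):
    """Return a rejection reason if the form was filled in implausibly fast."""
    elapsed = submission.submitted_at - submission.rendered_at
    if elapsed < min_seconds:
        return "form filled in too quickly"
    return None

def run_checks(submission, checks):
    """Run the stages in order and return the first rejection reason, or None."""
    for check in checks:
        reason = check(submission)
        if reason is not None:
            return reason
    return None

bot = Submission("Buy now", rendered_at=100.0, submitted_at=100.4)
human = Submission("Nice article", rendered_at=100.0, submitted_at=145.0)

print(run_checks(bot, [check_fill_time]))
print(run_checks(human, [check_fill_time]))
```

    Further stages, such as a relevance check, could be added to the list passed to `run_checks` without changing the rest of the code.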

    The company provides automatic spam protection services for popular CMS: Joomla, phpBB, WordPress, DataLife Engine, IP Board.

    © cleantalk.org 2013

  • Will CAPTCHA help?

    You or your organization have a site or a forum on the internet. You have spent a lot of time and resources on creating it, filling it with content, developing it and attracting visitors. Your resource is popular; visitors discuss the information presented, express their opinions and leave comments. But…

    Every owner of a site or forum faces the problem of spam. And it really is a big problem: in a couple of days, an unprotected site turns into a rubbish heap of ads and links. From then on the site will not interest a single visitor. If your site lets people communicate, you will have to moderate it both against spam and against visitors who want to spoil your site.

    Let’s single out two threats to the site, spam and dishonest visitors, and examine them in detail.

    How to protect a site from bots?

    To lower moderation costs, various spam protection methods have been developed. The principle is always the same: to automatically tell people and bots apart.

    The most popular method is CAPTCHA: an annoying picture with curved and slanted symbols which the visitor is asked to type in. The assumption is that spam bots will not recognise these symbols but a visitor will. CAPTCHA causes great irritation, but anyone who wants to have their say has to type in these symbols time after time, making mistakes and starting over. At the sight of a CAPTCHA, or after a few input errors, many visitors leave the resource. Thus, CAPTCHA helps to protect the resource from both bots and visitors.

    Other methods of protection use fields hidden from visitors. Supposedly, a bot will fill in all the fields and a visitor only the ones they can see, so if all the fields are filled in, the comment is considered spam and is not published. A checkbox is also used: the visitor is asked to tick the box if they are not a bot. But it is quite easy to write a script that ticks the box and detects the hidden fields.
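
    The hidden field trick is simple enough to show in a few lines. This is a minimal server-side sketch; the field name `website_url` and the dictionary form of the submitted data are assumptions for illustration.

```python
def is_probably_bot(form_data, hidden_field="website_url"):
    """Treat the submission as spam if the hidden field was filled in.

    A human never sees the hidden field, so for a human it stays empty.
    """
    return bool(form_data.get(hidden_field, "").strip())

human_post = {"name": "Anna", "comment": "Thanks for the article", "website_url": ""}
bot_post = {"name": "x", "comment": "Buy now", "website_url": "http://spam.example"}

print(is_probably_bot(human_post))
print(is_probably_bot(bot_post))
```

    The weakness is exactly the one described above: a bot that skips fields hidden by the page’s styles leaves the field empty and passes the check.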

    These methods will protect you from spam for no more than 2 months; if they last longer, it means your resource has little traffic and is of no interest to spammers.

    How does CAPTCHA protect against spam bots?

    As text recognition algorithms and technologies improve, CAPTCHA becomes less and less effective. Most CAPTCHAs are broken by connecting to the FineReader or OmniPage libraries or by using the Cocke–Younger–Kasami (CYK) algorithm.

    Writing a self-organising algorithm that recognises the symbols in a CAPTCHA does not take much time, yet its recognition rate is more than 50%. It is not difficult for spammers to find ready-made algorithms for the most popular CAPTCHAs on the internet.

    If a CAPTCHA has few variants, for example when you use a nonstandard CAPTCHA, it is passed by plain guessing: a robot sends random answers and some of them turn out to be correct.
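
    A quick calculation shows why guessing works. Assume, for the sake of the example, that the CAPTCHA has a fixed number of equally likely answers and that the robot makes independent random guesses; then the chance that at least one guess succeeds is 1 - (1 - 1/variants)^attempts.

```python
def guess_success_probability(variants, attempts):
    """Chance that at least one of `attempts` random guesses is correct.

    Assumes `variants` equally likely answers and independent guesses.
    """
    return 1 - (1 - 1 / variants) ** attempts

# A hypothetical CAPTCHA with only 10 possible answers.
for attempts in (1, 10, 50):
    print(attempts, round(guess_success_probability(10, attempts), 3))
```

    Under these assumptions a few dozen attempts are almost certain to get through, and sending a few dozen requests costs a robot nothing.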

    One more way to pass a CAPTCHA is to use human labour. The robot forwards the CAPTCHA to a high-traffic resource (pornographic sites, file hosting services and so on), a user of that site is asked to type in the answer, and the robot submits it on the target site.

    The use of artificial neural networks for training robots heightens their efficiency at recognising CAPTCHAs. As a rule, recognition systems recognise the symbols better than people do, which gives the impression that CAPTCHAs are created not to keep out robots but to keep out people.

    CAPTCHA is not a panacea for spam: manual recognition costs $1-2 per 1000 recognitions. Using this method of spam protection will ease your moderation work, but it will also annoy visitors.
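
    To see how little this is, here is the arithmetic for a hypothetical campaign of 10,000 solved CAPTCHAs at the prices quoted above. The campaign size is an assumption made up for the example.

```python
def solving_cost(captchas, price_per_thousand):
    """Cost in dollars of having `captchas` CAPTCHAs solved by people."""
    return captchas / 1000 * price_per_thousand

campaign = 10_000  # hypothetical number of spam submissions
low = solving_cost(campaign, 1.0)   # at $1 per 1000 recognitions
high = solving_cost(campaign, 2.0)  # at $2 per 1000 recognitions
print(f"${low:.2f} to ${high:.2f}")
```

    For the spammer this is a negligible expense, so CAPTCHA raises the price of an attack only slightly.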

    Do you still have doubts about the need for CAPTCHA on your resource?

    How much time a day do you or your employees spend on moderation? Estimate the cost of this time. What if this task did not exist and these resources could be spent on something else, for example on development? Adding up the actual losses and the missed profit, you will get the real cost.

    Is there an alternative? One that would protect the site from spam, be invisible to visitors and take less time for moderation?

    Yes, there is – CleanTalk service.

    CleanTalk is an automatic site moderation service that uses means invisible to the visitor. It uses multistep checks to filter out spam bots. The plug-in works by determining message relevance: it estimates how well the written message matches the topic of the site, blog or article the comment was written for.

    The main advantages of the service are automatic protection from spam bot registrations, protection from automatic and manual spam distribution, automatic message moderation and adjustable lists of stop words. It does not load the database, as the messages do not get into the database. It also reacts quickly to the appearance of new spam bots.

    Moderation costs are reduced, and the freed-up resources can be used more efficiently than on looking through messages and deleting spam.

    Ready-to-install modules have been developed for popular CMS: Joomla, phpBB, WordPress, DataLife Engine, IP Board, vBulletin. The API can be used for other CMS.