Tag: api

  • Recaptcha v3 always returns 0.9 score – research by CleanTalk

    Who is this article for?

    We’ve been closely following the thread https://github.com/google/recaptcha/issues/235 and noticed that, despite being closed, users continue to report issues.

    We’ve decided to investigate the problem and share our findings with you.

    • How ReCaptcha v3 works
    • What is a score
    • Why you might get a score other than 0.9 in ReCaptcha v2
    • Why you always get a score of 0.9 in ReCaptcha v3
    • Our testing process
    • How to get an accurate score in a test environment
    • CleanTalk’s solutions

    Research Objective

    Users complain that when testing ReCaptcha v3, they always receive the same score of 0.9. However, in the same environments with ReCaptcha v2, the score varies.

    What is a Score?

    The score is the result of the ReCaptcha check. The closer it is to 1, the more likely the visitor is human. The closer it is to 0, the more likely the visitor is a bot.

    How ReCaptcha v3 Works

    Note: The following findings are based on publicly available code and our interpretation.

    1. A user integrates the ReCaptcha script on a form page.
    2. A unique frontend token is added to each form.
    3. The script loads additional obfuscated code.
    4. The obfuscated code collects frontend data (a “black box” not accessible due to Google’s code obfuscation).
    5. The aggregated, encoded data and the frontend token are sent to Google’s cloud to obtain a result token.
    6. The result token is sent to the backend of the testing environment.
    7. The backend validates the token via Google’s API, sending the backend token, result token, and the visitor’s IP address.
    8. Based on the score result, the backend environment can decide whether to allow the visitor to proceed.
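
    Steps 7 and 8 above can be sketched in Python roughly as follows. This is a minimal illustration, not the code used in our tests: the endpoint and the secret, response, and remoteip fields come from Google's public siteverify documentation, while the function names and the 0.5 threshold are assumptions made for the example.

```python
import json
import urllib.parse
import urllib.request

# Google's documented verification endpoint for reCAPTCHA tokens.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret_key, result_token, visitor_ip=None):
    """Step 7: build the POST request (backend secret + result token + optional IP)."""
    fields = {"secret": secret_key, "response": result_token}
    if visitor_ip:
        fields["remoteip"] = visitor_ip
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(VERIFY_URL, data=body, method="POST")


def verify(secret_key, result_token, visitor_ip=None, timeout=5):
    """Send the request and return the decoded JSON answer (needs network access)."""
    request = build_verify_request(secret_key, result_token, visitor_ip)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))


def allow_visitor(verification, min_score=0.5):
    """Step 8: allow only a successful check whose score reaches the threshold.

    min_score=0.5 is an illustrative assumption, not a recommended value.
    """
    if not verification.get("success"):
        return False
    return verification.get("score", 0.0) >= min_score
```

    In a backend handler, the result of verify() would be passed to allow_visitor() to decide whether the form submission proceeds.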

    We believe ReCaptcha v3 relies on machine learning based on the traffic environment. The exact decision-making algorithms are proprietary and remain a trade secret of Google.

    Why You Get a Score Other Than 0.9 in ReCaptcha v2

    ReCaptcha v2 does not use machine learning for decision-making.
    It operates in one of two modes:

    1. User interaction mode (a click-the-checkbox mechanism is present on the page).
    2. Silent mode (the reCaptcha v2 badge is shown on the page).

    The data collection and processing occur in real time, allowing for accurate, immediate results. Learn more: https://developers.google.com/recaptcha/docs/versions.

    Why You Always Get a Score of 0.9 in ReCaptcha v3

    ReCaptcha v3 relies on machine learning based on traffic data.
    A consistent score of 0.9 indicates the system lacks sufficient data about your typical traffic to make an accurate decision. To avoid false positives, the system grants a 0.9 score to all visitors until trained.

    Our Testing Process

    Test Environment

    • A PHP website running WordPress 6.2.
    • ReCaptcha v3 integrated according to instructions.

    Bot

    A simple bot created in Python using Selenium.

    The bot was run from three IP addresses, emulating the following parameters:

    • headless
    • user agents
    • headers
    • clicks
    • form submissions

    Process

    The bot ran for 24 hours, performing sequential visits and form submissions with random parameters.

    No live traffic was sent to the site.

    Results

    • All bot requests returned a score of 0.9.
    • The score did not change over time.
    • No statistics appeared in Google Analytics.
      We hypothesize that traffic presence, volume, and quality in Google Analytics may act as a training marker for the ReCaptcha system.

    How to Get an Accurate Score in a Test Environment

    The ReCaptcha v3 model assumes long-term training on live traffic.

    This means that the test environment must be loaded in the same way as the production environment, which will undoubtedly make it difficult to deploy such an environment and generate the load.

    We believe that to get a meaningful score, a user will have to test in a production environment.

    However, the policy of most companies we know of (including CleanTalk of course) restricts any testing in a production environment.

    Unfortunately, we couldn’t find specific terms for the duration of training in Google’s official documentation. We believe that the duration of training depends on the following parameters:

    • Traffic load
    • Ratio of bots to real users
    • Percentage of “intelligent” bots among total bot traffic

    Without live traffic, no settings or configurations will yield an accurate score in a test environment.

    CleanTalk’s Solutions

    CleanTalk Check Bot

    • Decisions are made online without machine learning.
    • Simpler integration—no need to manually add tokens to forms.
    • Extensive documentation available: GitHub CleanTalk API
    • Immediate and relevant testing results.
    • Technical support response within 24 hours.

    Anti-Spam SAAS for CMS

    CleanTalk provides a cloud-based anti-spam service for websites, blocking spam in real time without CAPTCHAs. It integrates with CMS platforms like WordPress and Joomla, securing comments, registrations, and contact forms. Features include SpamFireWall to block spambots, email validation, and detailed logs, ensuring seamless protection and improved user experience.

    Anti-Spam CleanTalk API

    CleanTalk offers a suite of APIs that integrate anti-spam functionalities into various applications. The Anti-Spam API includes methods like

    • check_newuser() for registration checks;
    • check_message() for evaluating comments and contact form submissions;
    • send_feedback() for moderator inputs.

    The Database (Blacklists) API provides

    • spam_check() to verify IP and email records against CleanTalk’s database;
    • backlinks_check() to detect domains associated with spam;
    • ip_info() to return country codes for IP addresses.

    For managing personal lists and uptime monitoring, the Dashboard API offers dedicated methods. These APIs enable developers to enhance their applications’ security and spam prevention capabilities effectively.

  • 9 new fields added in the Blacklists Database

    There are several new fields in both the Blacklists API and the Offline Database. They will help you see the exact time of the checks and a few other things.

    Offline Database

    In the Offline Database there are 4 new fields for the IP files:

    • IP updated time – time from the corresponding date fields. Format: HH:MM:SS
    • Net updated time – time from the corresponding date fields. Format: HH:MM:SS
    • AS updated time – time from the corresponding date fields. Format: HH:MM:SS
    • Last updated – date and time of the last status update. Format: YYYY-MM-DD HH:MM:SS

    And 3 new fields for email files:

    • Submitted time – time from the corresponding date fields. Format: HH:MM:SS
    • Updated time – time from the corresponding date fields. Format: HH:MM:SS
    • Last updated – date and time of the last status update. Format: YYYY-MM-DD HH:MM:SS

    spam_check() API method

    As for the spam_check() API method, there were two new fields added for email checking:

    • in_antispam_updated – date and time of the last status update. Format: YYYY-MM-DD HH:MM:SS. API response example:
    JSON
    {"data":{"we******@cl*******.org":{"frequency": 0,"submitted": "2024-08-05 09:13:35","updated": "2024-08-05 09:13:35",
    "spam_rate": 0,"exists": 1,"spam_frequency_24h": 0,"appears": 0,"disposable_email": 0,
    "in_antispam_updated": "2024-08-05 09:13:35","in_antispam_previous": 1,"sha256": "be25f8b7b9fa76bdf8a2a3275f60dd7603c758598e77b332857a9867f0d6598e"}}}
    • in_antispam_previous – the previous Anti-Spam blacklist status. It shows whether the record was blacklisted before (0 – wasn’t blacklisted, 1 – was blacklisted, NULL – no change). Format: 1. API response example:
    JSON
    {"data":{"8.8.8.8":{"domains_count": 3011,"domains_list": null,"spam_rate": 1,"submitted": "2022-02-09 21:05:06",
    "updated": "2024-10-11 15:25:25","frequency": 6,"in_antispam": 0,"in_security": 0,"in_antispam_previous": 1,
    "in_antispam_updated": "2024-03-05 16:20:44","spam_frequency_24h": 0,"appears": 0,"network_type": "hosting",
    "country": "US","sha256": "838c4c2573848f58e74332341a7ca6bc5cd86a8aec7d644137d53b4d597f10f5"}}}

    Learn more about using the Blacklists Database API in our Help.
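
    As a rough sketch of how the two new spam_check() fields could be read on the client side, the Python snippet below parses a shortened copy of the IP response example above. The helper name and the wording of the status labels are assumptions made for illustration; only the field names and formats come from the examples in this post.

```python
import json
from datetime import datetime

# Shortened copy of the IP response example shown in this post.
SAMPLE = '''{"data":{"8.8.8.8":{"in_antispam": 0,"in_antispam_previous": 1,
"in_antispam_updated": "2024-03-05 16:20:44","spam_frequency_24h": 0}}}'''

# Meaning of in_antispam_previous as described above (JSON null becomes None).
PREVIOUS_STATUS = {0: "was not blacklisted", 1: "was blacklisted", None: "no change"}


def describe_status_change(response_text, record_key):
    """Return the previous blacklist status and the time of the last status update."""
    record = json.loads(response_text)["data"][record_key]
    updated = record.get("in_antispam_updated")
    return {
        "previous": PREVIOUS_STATUS.get(record.get("in_antispam_previous"), "unknown"),
        "updated_at": datetime.strptime(updated, "%Y-%m-%d %H:%M:%S") if updated else None,
    }


info = describe_status_change(SAMPLE, "8.8.8.8")
print(info["previous"], "as of", info["updated_at"])
```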

  • How to protect mobile app from bots

    Why it is important to protect a mobile app from spam bots

    Spam bots are a serious threat to your website, and they affect your mobile app just the same. More than 54% of traffic comes from mobile devices, and 76% of internet traffic comes from bad bots. This means that bad bots generate up to 41% of your mobile traffic. Below we list 5 reasons to protect your mobile app from bots and stop bad bots before they harm it.

    Why it is important to protect your mobile app from spam bots

    1. User experience
      Spam bots can flood your app with fake accounts, comments, and messages, which can eventually degrade the user experience for legitimate users.
    2. Security
      Spam bots can carry out malicious activities such as spreading malware, phishing attacks, and stealing sensitive information from users.
    3. Resource consumption
      Spam bots can overwhelm your servers and consume valuable resources, leading to slower performance and increased costs.
    4. Reputation
      If your app is known for being overrun by spam bots, it can damage your reputation and deter legitimate users from using your app.
    5. Compliance
      Depending on the nature of your app, you may be required to comply with regulations related to data privacy and security. Allowing spam bots to operate unchecked can put you at risk of violating these regulations.

     

    How it works

    The Bot Detector works in the background and is not visible to the user. It does not require users to confirm that they are not bots.

    How to install your mobile app spam protection

    If you need to protect mobile apps from spam, you will definitely need a solution that uses API to check registrations for spam. The Bot Detector service uses the CleanTalk check_bot API method via a special library that you can download and integrate with just 1 line of code. You can check out our detailed instructions on GitHub below.

    Go to GitHub

     

  • Updates for spam_check() API method

    Updates for spam_check() API method

    Here are the latest changes to the spam_check() API method.

    The following 3 parameters were removed from the spam_check() method due to lack of demand:

    • frequency_time_10m – 10 minutes activity
    • frequency_time_1h – 1 hour activity
    • frequency_time_24h – 24 hours activity

    Instead of these 3 parameters, we added the “spam_frequency_24h” parameter, which shows the number of spam requests from the address over the past 24 hours.

    You can always find the API method description here: https://cleantalk.org/help/api-spam-check.

    And the description of all parameters here: https://cleantalk.org/help/api-spam-check#response-explanation.
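
    If your integration reads spam_check() responses, the change described above can be handled along these lines. This is only a sketch in Python: the helper names and the sample field list are hypothetical, and only the parameter names are taken from this post.

```python
# Parameters this post lists as removed from spam_check() responses.
REMOVED_FIELDS = ("frequency_time_10m", "frequency_time_1h", "frequency_time_24h")


def stale_fields_expected(expected_fields):
    """Return the removed parameters that client code still expects."""
    return [name for name in REMOVED_FIELDS if name in expected_fields]


def spam_requests_last_24h(record):
    """Read the replacement counter, treating a missing value as zero."""
    return int(record.get("spam_frequency_24h") or 0)


# Hypothetical list of fields an older integration used to read.
old_integration_fields = ["frequency", "frequency_time_1h", "updated"]
print(stale_fields_expected(old_integration_fields))
print(spam_requests_last_24h({"frequency": 6, "spam_frequency_24h": 0}))
```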

  • 7 useful functions Drupal API that everyone should know!

    In this article we will look at 7 Drupal API functions that are very helpful when developing sites on Drupal 7.

    check_plain($text) – encodes special characters as HTML entities.

    Parameters:

    • $text – the string for conversion

    The return value: the processed string to display as HTML.

    This function can be used to treat all kinds of data coming to the site from a variety of sources: user input, import data from another site, Twitter, etc.

    t($string, array $args = array(), array $options = array()) – translates the string into the user-selected language.

    Parameters:

    • $string – the string to be translated
    • $args – an associative array of wildcard patterns (placeholders)
    • $options – an associative array of additional options with two possible keys: langcode, the code of the language to translate the string into, and context, which sets the translation context.

    The return value: the translated string.

    Example of the function t():

    t('Good afternoon, @first_name @last_name.', array('@first_name' => 'John', '@last_name' => 'Smith')); // Returns 'Good afternoon, John Smith.'

    There are three types of wildcard patterns:

    • !name – the value is substituted without processing
    • @name – the value is processed by check_plain(), so HTML special characters are escaped
    • %name – the value is escaped like @name, and the result is additionally wrapped in an <em> tag (drupal_placeholder() in Drupal 7, theme_placeholder in Drupal 6)
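
    To make the difference between the three placeholder types listed above concrete, here is a small Python imitation of the substitution rules. It is not Drupal code and only approximates the behaviour: html.escape stands in for check_plain(), the function name is hypothetical, and the <em class="placeholder"> wrapper follows the markup Drupal 7 produces for %name.

```python
import html
import re


def format_placeholders(template, args):
    """Substitute !name, @name and %name placeholders following the rules above."""
    def replace(match):
        key = match.group(0)
        if key not in args:
            return key  # leave unknown placeholders untouched
        value = str(args[key])
        if key.startswith("!"):
            return value  # !name: inserted as-is
        escaped = html.escape(value, quote=True)
        if key.startswith("@"):
            return escaped  # @name: escaped only
        return '<em class="placeholder">' + escaped + "</em>"  # %name: escaped and emphasized

    return re.sub(r"[!@%]\w+", replace, template)


print(format_placeholders("Hello, @name", {"@name": "<b>Bob</b>"}))
print(format_placeholders("Hello, !name", {"!name": "<b>Bob</b>"}))
print(format_placeholders("Hello, %name", {"%name": "Bob"}))
```

    The @name call prints the tags in escaped form, the !name call passes them through unchanged, and the %name call wraps the escaped value in an emphasis tag.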

    format_plural($count, $singular, $plural, array $args = array(), array $options = array()) – builds a string containing a quantity.

    Parameters:

    • $count – the quantity
    • $singular – the string used if $count == 1
    • $plural – the string used if $count != 1
    • $args – an associative array of wildcard patterns (placeholders)
    • $options – the same as in the t() function

    The return value: a string translated with the t() function; the $count parameter determines which of the two strings is translated.

    Example:

    $comment_count = 1;
    format_plural($comment_count, '1 comment', '@count comments'); // returns '1 comment'

    $comment_count = 5;
    format_plural($comment_count, '1 comment', '@count comments'); // returns '5 comments'

    drupal_get_title() – returns the current page title.

    This function can be used in combination with drupal_set_title() to process the current title and set a new one.

    drupal_set_title($title = NULL, $output = CHECK_PLAIN) – sets the title of the page.

    Parameters:

    • $title – a string that will be used as the title of the page
    • $output – a flag that determines whether $title is processed by the check_plain() function.

    The return value: the updated page title.

    url($path = NULL, array $options = array()) – builds an internal or external URL.

    Parameters:

    • $path – an internal relative path or an external absolute path
    • $options – an associative array of options:
      • query – an array of query parameters as key/value pairs
      • fragment – the anchor (fragment identifier) on the page
      • absolute – a flag (default FALSE); if TRUE, the URL is built as an absolute URL
      • alias – a flag (default FALSE); if TRUE, the path is treated as already being an alias, so the alias lookup in the database is skipped, which speeds things up
      • external – a flag; if TRUE, the URL is treated as external
      • language – a language object that defines the language used to look up the alias
      • https – a flag; if TRUE, the URL uses the https protocol, if FALSE, http
      • base_url – a value that replaces the standard base path
      • prefix – the language path prefix

    The return value: the generated URL.

    drupal_goto($path = '', array $options = array(), $http_response_code = 302) – redirects the user to another page.

    Parameters:

    • $path – the relative or absolute path to redirect to
    • $options – a list of options, the same as for the url() function
    • $http_response_code – the HTTP status code of the redirect

    This text is a translation of the article “7 полезных функций Drupal API который должен знать каждый!” published on drupal-learning.com.

    Forums and blogs without spam

    CleanTalk is a SaaS spam protection service for websites. CleanTalk uses protection methods that are invisible to site visitors. Connecting to the service eliminates the need for CAPTCHA, questions and answers, and other protection methods that complicate the exchange of information on the site.

  • Drupal API functions for working with taxonomy

    The Drupal API has a number of useful taxonomy functions that let you fetch the nodes classified under given terms, find parent or child terms, and so on.

    Load the term object by its tid

    The taxonomy_term_load() function, by analogy with node_load(), returns the term object for a given tid:

    <?php
      $term = taxonomy_term_load(1);
      print $term->name; // the name of the term
      print $term->vid; // taxonomy vocabulary identifier to which the term belongs
    ?>

    By analogy with node_load_multiple(), the Drupal API also provides taxonomy_term_load_multiple().

    Find terms by name

    To load terms by name, use taxonomy_get_term_by_name(), which returns an array of terms with the given name.

    Get all vocabulary terms

    To get all the terms of a vocabulary together with their hierarchy, use taxonomy_get_tree(), passing it the vid, the ID of the taxonomy vocabulary. The function returns term objects with two additional properties: “depth” (the depth of the term in the hierarchy) and “parents” (an array of the tids of the parent terms). Sample code for displaying a vocabulary “tree” can be found on the function’s documentation page.

    Get the child terms

    A fairly common task is to get the child terms of a given term. The taxonomy_get_children() function does this. Note that it takes the tid of the term and returns the full objects of the child terms (if any). It is therefore not a good fit if you only need, for example, the tids or names of the child terms. In such cases, for performance reasons, you should query the site database directly using db_select() (the query can be based on the body of taxonomy_get_children()).

    Get the parent terms

    To get the parents of a given term, the Drupal 7 API provides two functions: taxonomy_get_parents() and taxonomy_get_parents_all().

    Despite their nearly identical names, the functions differ substantially. The first returns only the direct “parents” of the given term. Suppose we have a vocabulary “Electronics” with the parent terms Sony and Panasonic, and a term “TV” that is a child of both Sony and Panasonic. Calling taxonomy_get_parents() with the tid of the term “TV” then returns the term objects for Sony and Panasonic.

    The second function returns the objects of all “ancestors” of the term, not only its direct parents, i.e. it takes the entire depth of the vocabulary into account.

    Load all the nodes of the term

    To get the content classified under a given term, use taxonomy_select_nodes(). Although its purpose is obvious, beginners sometimes run into problems because they do not take all of the function’s arguments into account. So let’s look at an example. Suppose 25 nodes are assigned to the term with tid = 1. It is then logical to assume that the line of code:

    <?php
      $nids = taxonomy_select_nodes(1);
    ?>

    returns an array of 25 elements, the nids of the nodes. However, it does not. Pay attention to the second argument, the boolean $pager, which defaults to TRUE. This means that the result is split into pages. If we want to get all the nodes of the term at once, we need to change the line of code to:

    <?php
      $nids = taxonomy_select_nodes(1, false);
    ?>

    Also, when using taxonomy_select_nodes(), you can limit the number of loaded nodes and set the sort order with the $limit and $order parameters.

    In conclusion, it is worth mentioning two more useful functions: taxonomy_get_vocabularies(), which loads all taxonomy vocabularies, and its simplified version taxonomy_vocabulary_get_names(), which returns an array of objects whose properties are the names, machine names, and vid identifiers of the vocabularies.

    This text is a translation of the article “Функции Drupal API для работы с таксономией” published by Sergey Belyaev on sergeybelyaev.name.

  • Perl, Python anti-spam API to web-site spam protection

    A ready-to-use API for protecting websites from spam: in addition to the existing PHP class, we have added anti-spam modules for Perl and Python. The libraries let you check both new comments and registrations for spam. Examples:

    Perl API:

    [perl]
    use strict;
    use WebService::Antispam;

    my $ct = WebService::Antispam->new({
        auth_key => '12345', # API key, get one on cleantalk.org
    });

    my $response = $ct->request({
        message => 'abc', # The visitor's comment
        example => undef, # The text of the article the visitor commented on
        sender_ip => '196.19.250.114', # IP address of the visitor
        sender_email => 'st********@ex*****.com', # Email of the visitor
        sender_nickname => 'spam_bot', # Nickname of the visitor
        submit_time => 12, # Time taken to fill in the comment form, in seconds
        js_on => 1, # Whether the visitor has JavaScript enabled, 0|1
    });
    [/perl]

    Python API:

    [python]
    from cleantalk import CleanTalk

    ct = CleanTalk(auth_key='yourkey')
    ct_result = ct.request(
        message='abc',  # Visitor comment
        sender_ip='196.19.250.114',  # Visitor IP address
        sender_email='st********@ex*****.com',  # Visitor email
        sender_nickname='spam_bot',  # Visitor nickname
        js_on=1,  # Whether the visitor has JavaScript enabled
        submit_time=12  # Seconds from the start of form filling until the form POST
    )
    # Check the verdict
    if ct_result['allow']:
        print('Comment allowed. Reason ' + ct_result['comment'])
    else:
        print('Comment blocked. Reason ' + ct_result['comment'])
    [/python]

    The Python module is compatible with both Python 2 and Python 3. An API for the .NET platform will be available soon.

    Perl anti-spam module to the web site

    Python anti-spam module to the web site

  • Updated PHP API to version 1.21.9

    Version 1.21.9 of the PHP API is ready to use. The changes are as follows:

    • Requests to the server now use HTTP+JSON. We dropped RPC::XML because it was redundant for our service.
    • The Cleantalk class has a new option, data_codepage, which lets you specify the code page of the transferred data; the class then automatically converts the data to UTF-8.

    The input and output names of variables and functions are fully compatible with the previous version of the API, so to update it is enough to replace cleantalk.class.php.

    Download cleantalk-anti-spam-script-1.21.9.zip.