Unwanted advertising in e-mails, forums and guest books is a current topic. This section explains what it is and describes some methods used by attackers. It then shows what has been done to handle these threats.
There is a fine difference between "Spam" and "Floods".
The word "Spam", as it is used here, means the repeated, unsolicited transmission of content to a person or a group, for example sending unsolicited e-mails with promotional content to third parties. Mass unsolicited phone calls are also a form of "spam". Sending "spam messages" is illegal in many jurisdictions (for example in Europe and the United States) and may be prosecuted.
The site "spamhaus.org" defines "spam" as follows:
The word "Spam" as applied to Email means Unsolicited Bulk Email ("UBE").
A message is Spam only if it is both Unsolicited and Bulk.
The word "Flooding" means the deliberate, unsolicited "flooding" of a medium with content. Floods are meant to disturb the normal operation of a medium, for example by mass-posting senseless texts in a public forum. A person who disturbs a public meeting by constant interruptions is also engaging in a form of flooding. Flooding may be illegal in some cases, for example if it causes economic damage.
In most cases "Spam" is produced to publish advertisements, while "Floods" are for the most part a form of vandalism.
Spam in forums and guest books is an increasing problem. This also applies to web logs and wikis. Often, these entries are quickly identified and then removed, but the removal of unwanted entries of this kind takes time and money. The following section describes what an attacker could do and ways to react to it.
The attacker might proceed as follows. First, the attacker looks for a possibly vulnerable script such as a guestbook or forum. He examines the forms that are used to create new entries and notes the names of the required form fields. The attacker then searches for characteristic texts within the targeted application. Using these texts as search terms, the attacker identifies web sites where the script is installed.
The attacker creates a list of URLs for the pages and stores the list in a file. With the help of a short program, the messages are published to all forms on the list.
If the attacker is lucky enough, search engines like Google index these pages before the administrator removes the ad. This way the embedded hyperlinks also increase the PageRank of the so-advertised web sites.
There are multiple ways to react to such attacks.
The easiest way is to simply change the URL. The scripts of most attackers work asynchronously, that is, they send their requests without waiting for the server's response. The attacker will therefore usually not notice right away that the targeted URL is no longer available. You may expect that the attacker will either wait for the search engine to refresh the URL, which can easily take up to 8 weeks or even longer, or re-visit the site and extract the new link manually. In both cases, the victim should be safe for a while.
For the Framework, changing the URL is easy. The Framework can handle multiple instances as "configuration profiles", each identified by a unique id. The administrator sets which profiles are active and which are not.
Thus, to change the URL of an application you only have to rename the profile and correct the links, which takes just a few minutes. A positive side effect is that the attacker does not receive a 404 error (page not found), neither during synchronous transmission nor when an automated program checks the URL on the server. Even on manual inspection he does not notice that the profile is inactive. Depending on the configuration of the Framework, the attacker will continue to submit messages to the inactive forms and (unless the profile is write-protected) write entries.
From personal experience I can report that attackers may be led astray for months without noticing their error.
A further measure is to block IPs or IP ranges. This is most useful if the attacker always uses the same proxies for the attacks. Unfortunately, this is not often the case. Nonetheless, this possibility was considered and is included as an optional feature of the Framework. The function can also be used for applications that run within a local network, to prevent unauthorized access from the Internet.
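The idea of blocking single IPs and whole IP ranges can be sketched in a few lines. This is an illustration in Python, not the Framework's own (PHP) code; the function name is_blocked and the blocklist are hypothetical, and the addresses are taken from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical blocklist, for illustration only.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),   # a whole range
    ipaddress.ip_network("198.51.100.7/32"),  # a single address
]


def is_blocked(remote_addr, blocked=BLOCKED_NETWORKS):
    """Return True if remote_addr falls into any blocked network."""
    try:
        address = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Assumption: an unparseable address is rejected rather than let through.
        return True
    return any(address in network for network in blocked)
```

For the local-network case mentioned above, the same check could be inverted: keep a list of allowed networks and reject every address that is not in it.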
To prevent an attacker from writing entries with an automated program in the first place, you may add CAPTCHAs to threatened forms. To write an entry, the visitor must type in the code shown. While this is an easy task for a human, it is a lot more difficult for an automated program.
This creates several problems for the attacker. The program needs to retrieve the code from the server, and even if the code is successfully extracted, this costs valuable time. Asynchronous communication is no longer possible, since the request has to be sent and the code retrieved first. There is also the risk that the attacker's program is slowed down by an intentionally modified server. All of this means considerable effort for the attacker.
It can be assumed that an attacker will avoid web applications protected in this way.
The Framework implements such a check. The standard library provides the actions "security_get_image" and "security_check_image", which produce the required graphics in PNG format using the PHP GD library and maintain a code table to check the user input. This table is hidden from the public via ".htaccess" and is refreshed automatically.
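How a code table of this kind might work can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the code behind "security_get_image" and "security_check_image": the class CodeTable, the ten-minute lifetime, the six-character codes and the one-attempt-per-code rule are all hypothetical, and rendering the code as an image is left out.

```python
import secrets
import time

CODE_LIFETIME_SECONDS = 600  # assumed expiry, not taken from the Framework
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"  # leaves out look-alikes: I, O, 0, 1


class CodeTable:
    """Maps an opaque token (sent with the form) to the expected code."""

    def __init__(self, lifetime=CODE_LIFETIME_SECONDS):
        self.lifetime = lifetime
        self._entries = {}  # token -> (code, creation time)

    def issue(self, now=None):
        """Create a new code; the code would be drawn into the image."""
        now = time.time() if now is None else now
        self._purge(now)
        token = secrets.token_hex(16)
        code = "".join(secrets.choice(ALPHABET) for _ in range(6))
        self._entries[token] = (code, now)
        return token, code

    def check(self, token, user_input, now=None):
        """Compare the user's input with the stored code (case-insensitive)."""
        now = time.time() if now is None else now
        self._purge(now)
        entry = self._entries.pop(token, None)  # each code allows one attempt
        if entry is None:
            return False
        code, _created = entry
        typed = user_input.strip().upper()
        return secrets.compare_digest(code.encode(), typed.encode())

    def _purge(self, now):
        """Drop expired entries so the table refreshes itself."""
        expired = [token for token, (_code, created) in self._entries.items()
                   if now - created > self.lifetime]
        for token in expired:
            del self._entries[token]
```

Only the token travels with the form; the code itself stays on the server, which is why the table must not be readable from outside.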
Flooding can be both unintentional and intentional.
It may happen unintentionally if the server temporarily experiences high traffic. In this case a user may have to wait longer for the server's response. Out of impatience he may refresh the page and resend the request, which creates a second, identical entry. Even if the web site itself gets little traffic, this may still occur if the web site is hosted on a "shared server", that is, if the server on which the Framework is installed is shared with other customers. It is not uncommon for 1000 or even 10000 customers to share the resources of a single server.
Intentional flooding is often a kind of "vandalism". The goal of the attacker is to disturb or hinder the normal use of the targeted medium.
Whether intended or accidental - flooding can be reduced by automated measures.
Two methods by which attackers could try to disturb the operation of the application are explained here. The first is to break the layout of the application by entering extremely long character strings without line breaks, excessive chains of line breaks, extremely long (senseless) texts in general, or extra large graphics. As a result, parts of the graphical layout may no longer be shown as intended by the author of the web site. In addition, it can force other users to scroll the page and impair the presentation of contributions by other authors.
The second method is mass submission of irrelevant contributions. This makes it difficult for other users to recognize the important messages and take note of their contents. For this attack, the attacker sends several identical or similar contributions to the server.
For the second method, the Framework provides the following solution. When two successive contributions are sent, it checks whether they have the same content. If so, the second entry is rejected. This especially prevents accidental flooding, but also makes attacks in general a little more difficult.
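A check of this kind can be sketched as follows (Python, for illustration only; the function names are hypothetical). Note one assumption that goes beyond the description above: the sketch ignores differences in whitespace and letter case, so that trivially altered copies are also caught.

```python
def normalize(text):
    """Collapse whitespace and case (an assumption of this sketch)."""
    return " ".join(text.split()).lower()


def is_duplicate(previous_entry, new_entry):
    """Return True if new_entry repeats the most recent entry."""
    if previous_entry is None:  # nothing has been written yet
        return False
    return normalize(previous_entry) == normalize(new_entry)
```

A resent request after an impatient page refresh produces exactly this situation: the new entry equals the previous one and is rejected.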
Additionally, there is an option that restricts the number of consecutive entries written by a user. The user is identified by the IP address. If the option is active and the user writes more consecutive entries than allowed, the user is denied further contributions until another user writes an entry.
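The rule can be illustrated with a short Python sketch. The function name may_post and the default limit of three entries are assumptions made for this example, not the Framework's actual settings.

```python
def may_post(author_ips, remote_addr, max_consecutive=3):
    """Decide whether remote_addr may add another entry.

    author_ips lists the IP of each existing entry, oldest first.
    """
    streak = 0
    for ip in reversed(author_ips):
        if ip != remote_addr:
            break  # an entry by someone else ends the streak
        streak += 1
    return streak < max_consecutive
```

Because only the most recent run of entries counts, the block lifts by itself as soon as another user writes an entry.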
To counter the first method, the Framework provides a number of filters. These filters restrict the length of text, limit the dimensions of graphics to maximum values, remove conspicuous repetitions, and force line breaks when a string is more than 80 characters long.
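The text filters might look roughly like the following Python sketch. Only the limit of 80 characters comes from the description above; the function names, the maximum text length and the thresholds for repetitions are assumptions chosen for illustration, and the filter for graphics is not covered.

```python
import re

MAX_TEXT_LENGTH = 2000  # assumed limit
MAX_WORD_LENGTH = 80    # the limit named in the text
MAX_CHAR_RUN = 5        # assumed: longest allowed run of one character


def limit_length(text, max_length=MAX_TEXT_LENGTH):
    """Cut off overlong texts."""
    return text[:max_length]


def collapse_repetitions(text, max_run=MAX_CHAR_RUN):
    """Shorten runs of one repeated character to at most max_run."""
    pattern = re.compile(r"(.)\1{%d,}" % max_run)
    return pattern.sub(lambda match: match.group(1) * max_run, text)


def collapse_blank_lines(text):
    """Reduce chains of line breaks to a single blank line."""
    return re.sub(r"\n{3,}", "\n\n", text)


def force_line_breaks(text, max_length=MAX_WORD_LENGTH):
    """Break any unbroken string longer than max_length characters."""
    pattern = re.compile(r"(\S{%d})(?=\S)" % max_length)
    return pattern.sub(r"\1\n", text)


def clean_entry(text):
    """Apply all text filters in sequence."""
    text = limit_length(text)
    text = collapse_repetitions(text)
    text = collapse_blank_lines(text)
    return force_line_breaks(text)
```

The threshold for repetitions is a judgment call: set too low, it would also shorten legitimate input such as a number with many zeros.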
These methods make it more difficult to disturb normal operation, yet they cannot offer 100% protection. That would not be desirable anyway, because of a well-known dilemma. If one narrows the range the filters cover, this reduces the number of false alarms ("false positives"), that is, submissions that are recognized as flooding attempts but in fact are none. At the same time, however, a higher number of real attacks will slip through the net of the filters. Conversely, if one widens the range the filters cover, they will prevent a larger number of attacks, but the number of false positives increases accordingly. Ultimately, it is always a trade-off between adequate security and a reasonable tolerance that avoids false alarms.
Thomas Meyer, www.yanaframework.net