Functions
	tln_tagprint ($tagname, $attary, $tagtype)

	tln_casenormalize (&$val)

	tln_skipspace ($body, $offset)

	tln_findnxstr ($body, $offset, $needle)

	tln_findnxreg ($body, $offset, $reg)

	tln_getnxtag ($body, $offset)

	tln_deent (&$attvalue, $regex, $hex=false)

	tln_defang (&$attvalue)

	tln_unspace (&$attvalue)

	tln_fixatts ( $tagname, $attary, $rm_attnames, $bad_attvals, $add_attr_to_tag, $trans_image_path, $block_external_images)

	tln_fixurl ($attname, &$attvalue, $trans_image_path, $block_external_images)

	tln_fixstyle ($body, $pos, $trans_image_path, $block_external_images)

	tln_body2div ($attary, $trans_image_path)

	tln_sanitize ( $body, $tag_list, $rm_tags_with_content, $self_closing_tags, $force_tag_closing, $rm_attnames, $bad_attvals, $add_attr_to_tag, $trans_image_path, $block_external_images)

	HTMLFilter ($body, $trans_image_path, $block_external_images=false)

Function Documentation

◆ HTMLFilter()

HTMLFilter	(	$body,
		$trans_image_path,
		$block_external_images = `false`
	)

References tln_sanitize().

◆ tln_body2div()

tln_body2div	(	$attary,
		$trans_image_path
	)

References $text.

Referenced by tln_sanitize().

◆ tln_casenormalize()

tln_casenormalize ( & $val )

A small helper function to use with array_walk. Modifies a by-ref value and makes it lowercase.

Parameters

string $val a value passed by-ref.

Returns: void since it modifies a by-ref value.
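The by-ref pattern described above can be sketched as follows. This is a minimal illustration, not the library's code: sketch_casenormalize() is a hypothetical stand-in for tln_casenormalize(), and the sample tag names are made up.

```php
<?php
// Hypothetical stand-in for tln_casenormalize(): lowercases a value in place.
function sketch_casenormalize(&$val)
{
    $val = strtolower($val);
}

// array_walk() hands each element to the callback by reference,
// so the array itself is modified and nothing needs to be returned.
$tags = array('SCRIPT', 'Style', 'iFrame');
array_walk($tags, 'sketch_casenormalize');
```

Because the callback takes its argument by reference, $tags ends up holding only lowercase names.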

◆ tln_deent()

tln_deent	(	&	$attvalue,
			$regex,
			$hex = `false`
	)

Translates entities into literal values so they can be checked.

Parameters

string	$attvalue	the by-ref value to check.
string	$regex	the regular expression to check against.
boolean	$hex	whether the entities are hexadecimal.

Returns: boolean True or False depending on whether there were matches.

References $i.

Referenced by tln_defang().
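A rough sketch of what translating entities into literal values might look like, assuming the regular expression captures the numeric part of the entity in its first group. sketch_deent() and both sample patterns are assumptions made for illustration; the patterns actually passed by tln_defang() are not shown in this documentation.

```php
<?php
// Hypothetical stand-in for tln_deent(): replaces every numeric entity matched
// by $regex with the single byte it encodes. Group 1 must capture the number.
function sketch_deent(&$attvalue, $regex, $hex = false)
{
    if (!preg_match_all($regex, $attvalue, $matches)) {
        return false; // no entities found, value left untouched
    }
    $repl = array();
    foreach ($matches[0] as $i => $entity) {
        $num = $hex ? hexdec($matches[1][$i]) : (int) $matches[1][$i];
        $repl[$entity] = chr($num % 256); // keep the result an 8-bit character
    }
    $attvalue = strtr($attvalue, $repl);
    return true;
}

$decimal = 'j&#97;vascript:alert(1)';
$foundDecimal = sketch_deent($decimal, '/&#(\d+);*/');

$hexadecimal = 'j&#x61;vascript:alert(1)';
$foundHex = sketch_deent($hexadecimal, '/&#x([0-9a-f]+);*/i', true);
```

Both sample values decode to the same plain string, which is what lets later checks spot the disguised "javascript:" prefix.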

◆ tln_defang()

tln_defang ( & $attvalue )

This function checks attribute values for entity-encoded values and returns them translated into 8-bit strings so we can run checks on them.

Parameters

string $attvalue A string to run the entity check against.

Skip this if there aren't ampersands or backslashes.

References tln_deent().

Referenced by tln_fixatts() and tln_fixstyle().

◆ tln_findnxreg()

tln_findnxreg	(	$body,
		$offset,
		$reg
	)

This function takes a PCRE-style regexp and tries to match it within the string.

Parameters

string	$body	The string to look for needle in.
integer	$offset	Start looking from here.
string	$reg	A PCRE-style regex to match.

Returns

array|boolean false if no matches are found, or an array with the following members:

integer with the location of the match within $body
string with whatever content lies between the offset and the match
string with whatever it is we matched

References $offset.

Referenced by tln_getnxtag().
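The three-member return value can be illustrated with a short sketch built on preg_match() and PREG_OFFSET_CAPTURE. sketch_findnxreg() is a hypothetical stand-in, and it assumes the pattern contains no unescaped "/" because it uses "/" as the delimiter; the library may well build its pattern differently.

```php
<?php
// Hypothetical stand-in for tln_findnxreg().
// Assumes $reg has no unescaped "/" since "/" is used as the delimiter here.
function sketch_findnxreg($body, $offset, $reg)
{
    if (!preg_match('/' . $reg . '/', $body, $m, PREG_OFFSET_CAPTURE, $offset)) {
        return false; // nothing matched at or after $offset
    }
    list($matched, $pos) = $m[0];
    return array(
        $pos,                                   // location of the match in $body
        substr($body, $offset, $pos - $offset), // content between offset and match
        $matched,                               // the matched text itself
    );
}

// Find the first character after "<" that cannot belong to a tag name.
$result = sketch_findnxreg('<img src="x">', 1, '[^\w\-_]');
```

Here the match is the space at position 4, and the content skipped over is the tag name "img".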

◆ tln_findnxstr()

tln_findnxstr	(	$body,
		$offset,
		$needle
	)

This function looks for the next character within a string. It's really just a glorified "strpos", except it catches the failures nicely.

Parameters

string	$body	The string to look for needle in.
integer	$offset	Start looking from this position.
string	$needle	The character/string to look for.

Returns: integer location of the next occurrence of the needle, or strlen($body) if needle wasn't found.

References $offset.

Referenced by tln_getnxtag().
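The "glorified strpos" behaviour can be sketched in a few lines. sketch_findnxstr() is a hypothetical stand-in and assumes $offset does not exceed the length of $body.

```php
<?php
// Hypothetical stand-in for tln_findnxstr(): strpos() with a safe fallback.
// Assumes 0 <= $offset <= strlen($body).
function sketch_findnxstr($body, $offset, $needle)
{
    $pos = strpos($body, $needle, $offset);
    // strpos() returns false on failure; report the end of the body instead.
    return ($pos === false) ? strlen($body) : $pos;
}

$found   = sketch_findnxstr('a<b>c', 0, '>');
$missing = sketch_findnxstr('abc', 0, '>');
```

Returning strlen($body) rather than false lets a caller keep using the result as an offset without a separate failure branch.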

◆ tln_fixatts()

tln_fixatts	(	$tagname,
		$attary,
		$rm_attnames,
		$bad_attvals,
		$add_attr_to_tag,
		$trans_image_path,
		$block_external_images
	)

This function runs various checks against the attributes.

Parameters

string	$tagname	String with the name of the tag.
array	$attary	Array with all tag attributes.
array	$rm_attnames	See description for tln_sanitize
array	$bad_attvals	See description for tln_sanitize
array	$add_attr_to_tag	See description for tln_sanitize
string	$trans_image_path
boolean	$block_external_images

Returns: array with modified attributes.

See if this attribute should be removed.

Remove any backslashes, entities, or extraneous whitespace.

Now let's run checks on the attvalues. I don't expect anyone to comprehend this. If you do, get in touch with me so I can drive to where you live and shake your hand personally. :)

There are two arrays in valary. The first is matches; the second is replacements.

See if we need to append any attributes to this tag.

References tln_defang(), tln_fixurl(), and tln_unspace().

Referenced by tln_sanitize().
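The "matches and replacements" pair mentioned above maps naturally onto preg_replace(), which accepts parallel arrays. The sketch below is only an illustration of that idea: the two rules and the helper sketch_apply_valary() are hypothetical, and the real rules come from the $bad_attvals argument.

```php
<?php
// Hypothetical rule set shaped like the documented $valary:
// the first array holds patterns, the second holds their replacements.
$valary = array(
    array('/javascript\s*:/i', '/expression\s*\(/i'),
    array('blocked:', 'blocked('),
);

// Apply every pattern/replacement pair to one attribute value.
function sketch_apply_valary($attvalue, $valary)
{
    list($matches, $replacements) = $valary;
    return preg_replace($matches, $replacements, $attvalue);
}

$href  = sketch_apply_valary('"JavaScript:alert(1)"', $valary);
$style = sketch_apply_valary('"width: expression(1)"', $valary);
```

Keeping patterns and replacements in parallel arrays means new rules can be added without touching the code that applies them.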

◆ tln_fixstyle()

tln_fixstyle	(	$body,
		$pos,
		$trans_image_path,
		$block_external_images
	)

First look for a general BODY style declaration, which would look like this: body {background: blah-blah}, and change it to .bodyclass so we can just assign it to a <div>.

Fix url('blah') declarations.

Remove any backslashes, entities, and extraneous whitespace.

References $content, $i, tln_defang(), tln_fixurl(), and tln_unspace().

Referenced by tln_sanitize().
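The body-to-class rewrite described above might be sketched like this. The pattern is an assumption chosen for illustration and is deliberately simple; for instance, it would also touch a selector such as "tbody {...}".

```php
<?php
// Sample stylesheet content; the values are made up for the example.
$css = 'body {background: blue} p {color: red}';

// Turn the "body" selector into a class selector so the rules can be
// attached to a wrapper element instead of the page body.
$rewritten = preg_replace('/body(\s*\{.*?\})/si', '.bodyclass$1', $css);
```

After the rewrite, the declaration block is untouched and only the selector has changed to .bodyclass.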

◆ tln_fixurl()

tln_fixurl	(		$attname,
		&	$attvalue,
			$trans_image_path,
			$block_external_images
	)

Replace empty src attributes with the blank image. src is only used for frames, images, and image inputs. Doing the replacement should not affect how they work; however, it will stop IE from being kicked off when the src of an img tag is not set.

Referenced by tln_fixatts() and tln_fixstyle().
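A minimal sketch of the empty-src replacement, assuming attribute values arrive with their surrounding quotes still attached. sketch_fixurl() is a hypothetical, much-reduced stand-in (it ignores $block_external_images entirely), and the image path is made up.

```php
<?php
// Hypothetical, reduced stand-in for tln_fixurl(): only handles the empty-src case.
// Assumes $attvalue still carries its quotes, e.g. '""' or '"a.png"'.
function sketch_fixurl($attname, &$attvalue, $trans_image_path)
{
    $inner = trim($attvalue, "\"' \t\r\n"); // strip quotes and whitespace
    if ($attname === 'src' && $inner === '') {
        $attvalue = '"' . $trans_image_path . '"';
    }
}

$empty = '""';
sketch_fixurl('src', $empty, 'images/blank.png');

$kept = '"photo.png"';
sketch_fixurl('src', $kept, 'images/blank.png');
```

Only the empty value is swapped for the placeholder; a src that already points somewhere is left alone in this sketch.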

◆ tln_getnxtag()

tln_getnxtag	(	$body,
		$offset
	)

This function looks for the next tag.

Parameters

string	$body	String where to look for the next tag.
integer	$offset	Start looking from here.

Returns

array|boolean false if no more tags exist in the body, or an array with the following members:

string with the name of the tag
array with attributes and their values
integer with tag type (1, 2, or 3)
integer where the tag starts (starting "<")
integer where the tag ends (ending ">")

The first three members will be false if the tag is invalid.

We are here:

blah blah <tag attribute="value">
----------^

There are 3 kinds of tags:

1. Opening tag, e.g.: <a href="blah">
2. Closing tag, e.g.: </a>
3. XHTML-style content-less tag, e.g.: <img src="blah" />

A comment or an SGML declaration.

Assume tagtype 1 for now. If it's type 3, we'll switch values later.

Look for the next character that cannot be part of a tag name (anything other than a word character, "-", or "_"), which will indicate the end of the tag name.

$match can be any of these:
'>' indicating the end of the tag entirely.
whitespace indicating the end of the tag name.
'/' indicating that this is a type-3 XHTML tag.

Whatever else we find there indicates an invalid tag.

This is an XHTML-style tag with a closing / at the end, like so: <img src="blah" />. Check if it's followed by the closing bracket. If not, then this tag is invalid.

Check if it's whitespace

This is an invalid tag! Look for the next closing ">".

At this point we're here:

<tagname attribute="blah">
--------^

At this point we loop in order to find all attributes.

Non-closed tag.

See if we arrived at a ">" or "/>", which means that we reached the end of the tag.

Yep. So we did.

There are several types of attributes, with optional [:space:] between members:
Type 1: attrname[:space:]=[:space:]'CDATA'
Type 2: attrname[:space:]=[:space:]"CDATA"
Type 3: attr[:space:]=[:space:]CDATA
Type 4: attrname

We leave types 1 and 2 the same. For type 3 we check for '"' and convert it to "&quot;" if needed, then wrap the value in double quotes. Type 4 we convert into attrname="yes".

Looks like body ended before the end of tag.

We arrived at the end of the attribute name. Several things are possible here:
'>' means the end of the tag, and this is attribute type 4.
'/' if followed by '>' means the same thing as above.
whitespace means a lot of things; look at what it's followed by.
Anything else means the attribute is invalid.

This is an XHTML-style tag with a closing / at the end, like so: <img src="blah" />. Check if it's followed by the closing bracket. If not, then this tag is invalid.

Skip whitespace and see what we arrive at.

Two things are valid here:
'=' means this is attribute type 1, 2, or 3.
A word character means this was attribute type 4.
Anything else we ignore and re-loop. End of tag and invalid stuff will be caught by our checks at the beginning of the loop.

Here are 3 possibilities:
"'" means attribute type 1.
'"' means attribute type 2.
Everything else is the content of attribute type 3.

These are hateful. Look for whitespace or '>'.

If it's ">" it will be caught at the top.

That was attribute type 4.

An illegal character. Find next '>' and return.

The fact that we got here indicates that the tag end was never found. Return invalid tag indication so it gets stripped.

References $offset, tln_findnxreg(), tln_findnxstr(), and tln_skipspace().

Referenced by tln_sanitize().
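The attribute normalization rules listed above (types 1 and 2 kept, type 3 escaped and quoted, type 4 turned into "yes") can be sketched as a small helper. sketch_normalize_attvalue() is hypothetical and covers only this one step, not the tag scanning itself; passing null to mean "no value was given" is also an assumption of the sketch.

```php
<?php
// Hypothetical helper covering only the value-normalization step.
// $raw is the value as found in the tag, or null for a bare attribute name.
function sketch_normalize_attvalue($raw)
{
    if ($raw === null) {
        return '"yes"'; // type 4: attrname becomes attrname="yes"
    }
    $first = substr($raw, 0, 1);
    if ($first === "'" || $first === '"') {
        return $raw; // types 1 and 2: already quoted, leave untouched
    }
    // Type 3: escape embedded double quotes, then wrap in double quotes.
    return '"' . str_replace('"', '&quot;', $raw) . '"';
}

$type1 = sketch_normalize_attvalue("'left'");
$type3 = sketch_normalize_attvalue('a"b');
$type4 = sketch_normalize_attvalue(null);
```

After this step every attribute carries a quoted value, which simplifies the later checks on attribute values.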

◆ tln_sanitize()

tln_sanitize	(	$body,
		$tag_list,
		$rm_tags_with_content,
		$self_closing_tags,
		$force_tag_closing,
		$rm_attnames,
		$bad_attvals,
		$add_attr_to_tag,
		$trans_image_path,
		$block_external_images
	)

Parameters

string	$body	The HTML you wish to filter
array	$tag_list	see description above
array	$rm_tags_with_content	see description above
array	$self_closing_tags	see description above
boolean	$force_tag_closing	see description above
array	$rm_attnames	see description above
array	$bad_attvals	see description above
array	$add_attr_to_tag	see description above
string	$trans_image_path
boolean	$block_external_images

Returns: string Sanitized HTML safe to show on your pages.

Normalize rm_tags and rm_tags_with_content.

See if tag_list is a list of tags to remove or tags to allow: false means remove these tags, true means allow these tags.

Take care of Netscape's stupid JavaScript entities like &{alert('boo')};

Take care of <style>

Got to the end of tag we needed to remove.

$rm_tags_with_content

See if this is a self-closing type and change tagtype appropriately.

See if we should skip this tag and any content inside it.

Convert body into div.

This is where we run other checks.

References tln_body2div(), tln_fixatts(), tln_fixstyle(), tln_getnxtag(), and tln_tagprint().

Referenced by HTMLFilter().
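One of the steps above, neutralizing Netscape-style JavaScript entities, might look like the following. The pattern is an assumption written for this example; it escapes the leading ampersand so the construct is displayed as text rather than evaluated.

```php
<?php
// Sample input containing a Netscape-style JavaScript entity.
$body = "<b>hi</b> &{alert('boo')};";

// Escape the ampersand in front of any {...}; construct.
$safe = preg_replace('/&(\{.*?\};)/si', '&amp;$1', $body);
```

The surrounding markup is untouched; only the entity's leading "&" becomes "&amp;".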

◆ tln_skipspace()

tln_skipspace	(	$body,
		$offset
	)

This function skips any whitespace from the current position within a string and to the next non-whitespace value.

Parameters

string	$body	the string
integer	$offset	the offset within the string where we should start looking for the next non-whitespace character.

Returns: integer the location within the $body where the next non-whitespace char is located.

References $count and $offset.

Referenced by tln_getnxtag().
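Skipping whitespace from a given offset can be sketched with strspn(), which counts how many leading characters belong to a given set. sketch_skipspace() is a hypothetical stand-in; the exact set of characters the library treats as whitespace is an assumption here.

```php
<?php
// Hypothetical stand-in for tln_skipspace().
// Assumes whitespace means space, tab, CR, LF, form feed, and vertical tab.
function sketch_skipspace($body, $offset)
{
    // strspn() returns the length of the whitespace run starting at $offset.
    return $offset + strspn($body, " \t\r\n\f\v", $offset);
}

$next = sketch_skipspace("a  \tb", 1);
$same = sketch_skipspace('abc', 0);
```

When the character at $offset is already non-whitespace, the offset comes back unchanged.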

◆ tln_tagprint()

tln_tagprint	(	$tagname,
		$attary,
		$tagtype
	)

htmlfilter.inc

This set of functions allows you to filter html in order to remove any malicious tags from it. Useful in cases when you need to filter user input for any cross-site-scripting attempts.

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

Author: Konstantin Riabitsev <icon@linux.duke.edu>
Author: Jim Jagielski <jim@jaguNET.com / jimjag@gmail.com>
Version: 1.1 ($Date$)

This function builds the final tag from the tag name, an array of attributes, and the type of the tag. It is called internally by tln_sanitize.

Parameters

string	$tagname	the name of the tag.
array	$attary	the array of attributes and their values
integer	$tagtype	The type of the tag (see in comments).

Returns: string A string with the final tag representation.

Referenced by tln_sanitize().
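Assembling a tag from its parts might be sketched as below. sketch_tagprint() is a hypothetical stand-in; it assumes tag type 1 is an opening tag, 2 a closing tag, and 3 an XHTML-style self-closing tag, and that attribute values already include their quotes.

```php
<?php
// Hypothetical stand-in for tln_tagprint().
// Assumes: type 1 = opening, 2 = closing, 3 = self-closing;
// attribute values are already quoted, e.g. array('href' => '"x"').
function sketch_tagprint($tagname, $attary, $tagtype)
{
    if ($tagtype == 2) {
        return '</' . $tagname . '>'; // closing tags carry no attributes
    }
    $out = '<' . $tagname;
    foreach ($attary as $name => $value) {
        $out .= ' ' . $name . '=' . $value;
    }
    return $out . ($tagtype == 3 ? ' />' : '>');
}

$open  = sketch_tagprint('a', array('href' => '"x"'), 1);
$close = sketch_tagprint('a', array(), 2);
$self  = sketch_tagprint('br', array(), 3);
```

Printing tags from the parsed name and attribute array, rather than echoing the original markup, means only what survived the earlier checks reaches the output.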

◆ tln_unspace()

tln_unspace ( & $attvalue )

Kill any tabs, newlines, or carriage returns. Our friends the makers of the browser with 95% market share decided that it'd be funny to make "java[tab]script" be just as good as "javascript".

Parameters

string $attvalue The attribute value before extraneous spaces are removed.

Referenced by tln_fixatts() and tln_fixstyle().
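The "java[tab]script" problem can be illustrated with a short sketch that strips the three characters named above. sketch_unspace() is a hypothetical stand-in; the library may remove additional characters that this description does not mention.

```php
<?php
// Hypothetical stand-in for tln_unspace(): removes tabs, newlines,
// and carriage returns from the value in place.
function sketch_unspace(&$attvalue)
{
    $attvalue = str_replace(array("\t", "\r", "\n"), '', $attvalue);
}

$value = "java\tscript:alert(1)";
sketch_unspace($value);
```

Once the tab is gone, the value reads as a plain "javascript:" URL and can be caught by the attribute value checks.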
