Joomla CMS 3.10.11 (with JPlatform 13.1 included)
API documentation for Joomla CMS version 3.10.11 and the integrated Joomla Platform framework
htmlfilter.php File Reference

Functions

 tln_tagprint ($tagname, $attary, $tagtype)
 
 tln_casenormalize (&$val)
 
 tln_skipspace ($body, $offset)
 
 tln_findnxstr ($body, $offset, $needle)
 
 tln_findnxreg ($body, $offset, $reg)
 
 tln_getnxtag ($body, $offset)
 
 tln_deent (&$attvalue, $regex, $hex=false)
 
 tln_defang (&$attvalue)
 
 tln_unspace (&$attvalue)
 
 tln_fixatts ( $tagname, $attary, $rm_attnames, $bad_attvals, $add_attr_to_tag, $trans_image_path, $block_external_images)
 
 tln_fixurl ($attname, &$attvalue, $trans_image_path, $block_external_images)
 
 tln_fixstyle ($body, $pos, $trans_image_path, $block_external_images)
 
 tln_body2div ($attary, $trans_image_path)
 
 tln_sanitize ( $body, $tag_list, $rm_tags_with_content, $self_closing_tags, $force_tag_closing, $rm_attnames, $bad_attvals, $add_attr_to_tag, $trans_image_path, $block_external_images)
 
 HTMLFilter ($body, $trans_image_path, $block_external_images=false)
 

Function Documentation

◆ HTMLFilter()

HTMLFilter (   $body,
  $trans_image_path,
  $block_external_images = false 
)

References tln_sanitize().

◆ tln_body2div()

tln_body2div (   $attary,
  $trans_image_path 
)

References $text.

Referenced by tln_sanitize().

◆ tln_casenormalize()

tln_casenormalize ( &$val)

A small helper function to use with array_walk. It modifies a by-ref value and makes it lowercase.

Parameters
string $val A value passed by-ref.
Returns
void, since it modifies a by-ref value.
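
The array_walk usage mentioned above can be sketched as follows. This is a minimal illustration, not the library's code: `sketch_casenormalize` and the sample `$tags` array are hypothetical names introduced here.

```php
<?php
// Hypothetical stand-in for a by-ref lowercasing helper.
function sketch_casenormalize(&$val)
{
    $val = strtolower($val);
}

// Sample data (an assumption for illustration only).
$tags = array('SCRIPT', 'Style', 'iframe');

// array_walk passes each element by reference because the
// callback declares its first parameter with "&".
array_walk($tags, 'sketch_casenormalize');

// $tags is now array('script', 'style', 'iframe')
```

Because the callback takes its argument by reference, the array is changed in place and nothing needs to be returned.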

◆ tln_deent()

tln_deent ( &$attvalue,
  $regex,
  $hex = false 
)

Translates entities into literal values so they can be checked.

Parameters
string $attvalue The by-ref value to check.
string $regex The regular expression to check against.
boolean $hex Whether the entities are hexadecimal.
Returns
boolean True or false depending on whether there were matches.

References $i.

Referenced by tln_defang().
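
A rough sketch of this kind of entity translation is given here, assuming the regular expression captures the numeric part of the entity in its first group. The function name `sketch_deent` and the patterns used in the example are assumptions, not taken from the library.

```php
<?php
// Hypothetical sketch: replace numeric entities matched by $regex
// with the literal character they encode. Group 1 of $regex is
// assumed to capture the number.
function sketch_deent(&$attvalue, $regex, $hex = false)
{
    if (!preg_match_all($regex, $attvalue, $matches)) {
        return false; // no entities found, value left untouched
    }
    $repl = array();
    foreach ($matches[0] as $i => $entity) {
        $num = $hex ? hexdec($matches[1][$i]) : (int) $matches[1][$i];
        $repl[$entity] = chr($num % 256); // keep within a single byte
    }
    $attvalue = strtr($attvalue, $repl);
    return true;
}

// Decimal entity: "&#97;" encodes "a".
$value = 'j&#97;vascript:alert(1)';
$found = sketch_deent($value, '/&#0*(\d+);*/');
// $found is true, $value is 'javascript:alert(1)'

// Hexadecimal entity: "&#x61;" also encodes "a".
$hexValue = 'j&#x61;vascript:';
sketch_deent($hexValue, '/&#x0*([0-9a-f]+);*/i', true);
// $hexValue is 'javascript:'
```

Decoding first is what makes later pattern checks meaningful: a blacklist looking for "javascript:" would otherwise miss the encoded form.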

◆ tln_defang()

tln_defang ( &$attvalue)

This function checks attribute values for entity-encoded values and translates them, in place, into 8-bit strings so we can run checks on them.

Parameters
string $attvalue A string to run entity check against.

Skip this if there aren't ampersands or backslashes.

References tln_deent().

Referenced by tln_fixatts(), and tln_fixstyle().

◆ tln_findnxreg()

tln_findnxreg (   $body,
  $offset,
  $reg 
)

This function takes a PCRE-style regexp and tries to match it within the string.

Parameters
string $body The string to look for needle in.
integer $offset Start looking from here.
string $reg A PCRE-style regex to match.
Returns
array|boolean Returns false if no matches are found, or an array with the following members:
  • integer with the location of the match within $body
  • string with whatever content lies between offset and the match
  • string with whatever it is we matched

References $offset.

Referenced by tln_getnxtag().
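
The return shape described above can be illustrated with a small sketch. The name `sketch_findnxreg` is hypothetical, and the sketch assumes `$reg` does not contain the `%` character, which it uses as the pattern delimiter.

```php
<?php
// Hypothetical sketch: find the first match of $reg at or after
// $offset and report where it is, what preceded it, and what matched.
function sketch_findnxreg($body, $offset, $reg)
{
    // Group 1: text before the match (non-greedy). Group 2: the match.
    $rule = '%^(.*?)(' . $reg . ')%s';
    if (!preg_match($rule, substr($body, $offset), $m) || $m[0] === '') {
        return false;
    }
    return array($offset + strlen($m[1]), $m[1], $m[2]);
}

// Look for the end of a tag name: whitespace or ">".
$hit = sketch_findnxreg('<img src="x">', 1, '[\s>]');
// $hit is array(4, 'img', ' ')

$miss = sketch_findnxreg('abc', 0, '>');
// $miss is false
```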

◆ tln_findnxstr()

tln_findnxstr (   $body,
  $offset,
  $needle 
)

This function looks for the next character within a string. It's really just a glorified "strpos", except it catches the failures nicely.

Parameters
string $body The string to look for needle in.
integer $offset Start looking from this position.
string $needle The character/string to look for.
Returns
integer Location of the next occurrence of the needle, or strlen($body) if the needle wasn't found.

References $offset.

Referenced by tln_getnxtag().
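
The "strpos that never fails" behavior described above might look like the following sketch. The name `sketch_findnxstr` is hypothetical.

```php
<?php
// Hypothetical sketch: like strpos, but returns the string length
// instead of false when the needle is absent.
function sketch_findnxstr($body, $offset, $needle)
{
    $pos = strpos($body, $needle, $offset);
    return ($pos === false) ? strlen($body) : $pos;
}

$body = '<b>bold</b>';
$found = sketch_findnxstr($body, 0, '>');   // 2
$absent = sketch_findnxstr($body, 0, '#');  // 11, i.e. strlen($body)
```

Returning the length rather than false lets a caller treat "not found" as "runs to the end of the body" without a separate type check.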

◆ tln_fixatts()

tln_fixatts (   $tagname,
  $attary,
  $rm_attnames,
  $bad_attvals,
  $add_attr_to_tag,
  $trans_image_path,
  $block_external_images 
)

This function runs various checks against the attributes.

Parameters
string $tagname String with the name of the tag.
array $attary Array with all tag attributes.
array $rm_attnames See description for tln_sanitize.
array $bad_attvals See description for tln_sanitize.
array $add_attr_to_tag See description for tln_sanitize.
string $trans_image_path
boolean $block_external_images
Returns
array with modified attributes.

See if this attribute should be removed.

Remove any backslashes, entities, or extraneous whitespace.

Now let's run checks on the attvalues. I don't expect anyone to comprehend this. If you do, get in touch with me so I can drive to where you live and shake your hand personally. :)

There are two arrays in valary. The first is matches; the second is replacements.

See if we need to append any attributes to this tag.

References tln_defang(), tln_fixurl(), and tln_unspace().

Referenced by tln_sanitize().
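
The "matches and replacements" pair mentioned above maps naturally onto preg_replace, which accepts parallel arrays. The following is only a sketch under that assumption: the patterns, the replacement strings, and the variable contents are hypothetical and are not the rules shipped with the library.

```php
<?php
// Hypothetical rule pair: index 0 holds patterns to match,
// index 1 holds the replacement for each pattern.
$valary = array(
    array('/^\s*javascript:/i', '/expression\s*\(/i'),
    array('blocked:', 'blocked('),
);

// preg_replace applies each pattern in order with its replacement.
$href = preg_replace($valary[0], $valary[1], 'javascript:alert(1)');
// $href is 'blocked:alert(1)'

$style = preg_replace($valary[0], $valary[1], 'width: expression (1+1)');
// $style is 'width: blocked(1+1)'
```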

◆ tln_fixstyle()

tln_fixstyle (   $body,
  $pos,
  $trans_image_path,
  $block_external_images 
)

First look for a general BODY style declaration, which would be like so: body {background: blah-blah}, and change it to .bodyclass so we can just assign it to a <div>.

Fix url('blah') declarations.
Remove any backslashes, entities, and extraneous whitespace.

References $content, $i, tln_defang(), tln_fixurl(), and tln_unspace().

Referenced by tln_sanitize().

◆ tln_fixurl()

tln_fixurl (   $attname,
  &$attvalue,
  $trans_image_path,
  $block_external_images 
)

Replace empty src attributes with the blank image. src is only used for frames, images, and image inputs. Doing a replace should not affect how they work, but it will stop IE from being kicked off when the src of an img tag is not set.

Referenced by tln_fixatts(), and tln_fixstyle().

◆ tln_getnxtag()

tln_getnxtag (   $body,
  $offset 
)

This function looks for the next tag.

Parameters
string $body String where to look for the next tag.
integer $offset Start looking from here.
Returns
array|boolean false if no more tags exist in the body, or an array with the following members:
  • string with the name of the tag
  • array with attributes and their values
  • integer with tag type (1, 2, or 3)
  • integer where the tag starts (starting "<")
  • integer where the tag ends (ending ">")

The first three members will be false if the tag is invalid.

We are here:

  blah blah <tag attribute="value">
  ----------^

There are 3 kinds of tags:

  1. Opening tag, e.g.: <a href="blah">
  2. Closing tag, e.g.: </a>
  3. XHTML-style content-less tag, e.g.: <img src="blah"/>

A comment or an SGML declaration.

Assume tagtype 1 for now. If it's type 3, we'll switch values later.

Look for next [\W-_], which will indicate the end of the tag name.

$match can be either of these:

  • '>' indicating the end of the tag entirely.
  • '\s' indicating the end of the tag name.
  • '/' indicating that this is a type-3 XHTML tag.

Whatever else we find there indicates an invalid tag.

This is an XHTML-style tag with a closing / at the end, like so: <img src="blah"/>. Check if it's followed by the closing bracket. If not, then this tag is invalid.

Check if it's whitespace

This is an invalid tag! Look for the next closing ">".

At this point we're here:

  <tagname attribute="blah">
  --------^

At this point we loop in order to find all attributes.

Non-closed tag.

See if we arrived at a ">" or "/>", which means that we reached the end of the tag.

Yep. So we did.

There are several types of attributes, with optional [:space:] between members:

  • Type 1: attrname[:space:]=[:space:]'CDATA'
  • Type 2: attrname[:space:]=[:space:]"CDATA"
  • Type 3: attrname[:space:]=[:space:]CDATA
  • Type 4: attrname

We leave types 1 and 2 the same. For type 3 we check for '"' and convert it to "&quot;" if needed, then wrap the value in double quotes. Type 4 we convert into: attrname="yes".

Looks like body ended before the end of tag.

We arrived at the end of the attribute name. Several things are possible here:

  • '>' means the end of the tag, and this is attribute type 4.
  • '/' if followed by '>' means the same thing as above.
  • '\s' means a lot of things; look at what it's followed by.
  • Anything else means the attribute is invalid.

This is an XHTML-style tag with a closing / at the end, like so: <img src="blah"/>. Check if it's followed by the closing bracket. If not, then this tag is invalid.

Skip whitespace and see what we arrive at.

Two things are valid here:

  • '=' means this is attribute type 1, 2, or 3.
  • \w means this was attribute type 4.

Anything else we ignore and re-loop. End of tag and invalid stuff will be caught by our checks at the beginning of the loop.

Here are 3 possibilities:

  • "'" means attribute type 1.
  • '"' means attribute type 2.
  • Everything else is the content of attribute type 3.

These are hateful. Look for \s, or >.

If it's ">" it will be caught at the top.

That was attribute type 4.

An illegal character. Find next '>' and return.

The fact that we got here indicates that the tag end was never found. Return invalid tag indication so it gets stripped.

References $offset, tln_findnxreg(), tln_findnxstr(), and tln_skipspace().

Referenced by tln_sanitize().
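
The normalization of the four attribute types described in this section can be sketched in isolation. This is a simplified illustration, not the parser itself: `sketch_normalize_attvalue` is a hypothetical helper that receives the raw text found after "=", or null when the attribute had no value at all.

```php
<?php
// Hypothetical helper: normalize one attribute value by type.
function sketch_normalize_attvalue($raw)
{
    if ($raw === null) {
        return '"yes"';                 // type 4: bare attribute name
    }
    $first = substr($raw, 0, 1);
    if ($first === "'" || $first === '"') {
        return $raw;                    // types 1 and 2: left as they are
    }
    // Type 3: escape embedded double quotes, then wrap in double quotes.
    return '"' . str_replace('"', '&quot;', $raw) . '"';
}

$single = sketch_normalize_attvalue("'a b'");  // 'a b' with its single quotes kept
$bare   = sketch_normalize_attvalue('left');   // "left" wrapped in double quotes
$quoted = sketch_normalize_attvalue('a"b');    // "a&quot;b"
$flag   = sketch_normalize_attvalue(null);     // "yes"
```

After this step every attribute carries a quoted value, which keeps the later checks and the final tag printing uniform.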

◆ tln_sanitize()

tln_sanitize (   $body,
  $tag_list,
  $rm_tags_with_content,
  $self_closing_tags,
  $force_tag_closing,
  $rm_attnames,
  $bad_attvals,
  $add_attr_to_tag,
  $trans_image_path,
  $block_external_images 
)
Parameters
string $body The HTML you wish to filter.
array $tag_list See description above.
array $rm_tags_with_content See description above.
array $self_closing_tags See description above.
boolean $force_tag_closing See description above.
array $rm_attnames See description above.
array $bad_attvals See description above.
array $add_attr_to_tag See description above.
string $trans_image_path
boolean $block_external_images
Returns
string Sanitized HTML that is safe to show on your pages.

Normalize rm_tags and rm_tags_with_content.

See if tag_list is a list of tags to remove or tags to allow: false means remove these tags; true means allow these tags.

Take care of netscape's stupid javascript entities like &{alert('boo')};

Take care of <style>

Got to the end of tag we needed to remove.

$rm_tags_with_content

See if this is a self-closing type and change tagtype appropriately.

See if we should skip this tag and any content inside it.

Convert body into div.

This is where we run other checks.

References tln_body2div(), tln_fixatts(), tln_fixstyle(), tln_getnxtag(), and tln_tagprint().

Referenced by HTMLFilter().

◆ tln_skipspace()

tln_skipspace (   $body,
  $offset 
)

This function skips any whitespace from the current position within a string and to the next non-whitespace value.

Parameters
string $body The string.
integer $offset The offset within the string where we should start looking for the next non-whitespace character.
Returns
integer The location within $body where the next non-whitespace character is located.

References $count, and $offset.

Referenced by tln_getnxtag().
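
One way to express this whitespace skipping is sketched here. The sketch uses strspn to count the leading whitespace, which is a choice made for this illustration and may differ from how the library does it; `sketch_skipspace` is a hypothetical name.

```php
<?php
// Hypothetical sketch: advance $offset past any run of whitespace.
function sketch_skipspace($body, $offset)
{
    // Count consecutive whitespace characters starting at $offset.
    $count = strspn($body, " \t\r\n\v\f", $offset);
    return $offset + $count;
}

$afterSpaces = sketch_skipspace('<a   href', 2);  // 5, the "h" of href
$unchanged   = sketch_skipspace('<a   href', 1);  // 1, already on a non-space
```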

◆ tln_tagprint()

tln_tagprint (   $tagname,
  $attary,
  $tagtype 
)

htmlfilter.inc

This set of functions allows you to filter html in order to remove any malicious tags from it. Useful in cases when you need to filter user input for any cross-site-scripting attempts.

Copyright (C) 2002-2004 by Duke University

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

Author: Konstantin Riabitsev <icon@linux.duke.edu>
Author: Jim Jagielski <jim@jaguNET.com / jimjag@gmail.com>
Version: 1.1 ($Date$)

This function returns the final tag out of the tag name, an array of attributes, and the type of the tag. This function is called by tln_sanitize internally.

Parameters
string $tagname The name of the tag.
array $attary The array of attributes and their values.
integer $tagtype The type of the tag (see in comments).
Returns
string A string with the final tag representation.

Referenced by tln_sanitize().
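
Assembling a tag from its three parts could look like the sketch below. It assumes the tag types are 1 for opening, 2 for closing, and 3 for XHTML-style self-closing, and that attribute values arrive already quoted. `sketch_tagprint` is a hypothetical name.

```php
<?php
// Hypothetical sketch: build the textual form of a tag.
function sketch_tagprint($tagname, $attary, $tagtype)
{
    if ($tagtype == 2) {
        return '</' . $tagname . '>';       // closing tags carry no attributes
    }
    $tag = '<' . $tagname;
    $atts = is_array($attary) ? $attary : array();
    foreach ($atts as $name => $value) {
        $tag .= ' ' . $name . '=' . $value; // value is assumed pre-quoted
    }
    if ($tagtype == 3) {
        $tag .= ' /';
    }
    return $tag . '>';
}

$open  = sketch_tagprint('a', array('href' => '"x"'), 1);  // <a href="x">
$close = sketch_tagprint('a', array(), 2);                 // </a>
$empty = sketch_tagprint('br', array(), 3);                // <br />
```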

◆ tln_unspace()

tln_unspace ( &$attvalue)

Kill any tabs, newlines, or carriage returns. Our friends the makers of the browser with 95% market value decided that it'd be funny to make "java[tab]script" be just as good as "javascript".

Parameters
string $attvalue The attribute value before extraneous whitespace is removed.

Referenced by tln_fixatts(), and tln_fixstyle().
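
The "java[tab]script" case described above can be sketched as follows. The sketch strips only the three characters named in the description (tab, newline, carriage return); `sketch_unspace` is a hypothetical name and the real function may remove more.

```php
<?php
// Hypothetical sketch: remove tabs, newlines, and carriage returns
// from a by-ref attribute value.
function sketch_unspace(&$attvalue)
{
    $attvalue = str_replace(array("\t", "\n", "\r"), '', $attvalue);
}

$value = "java\tscript:alert(1)";
sketch_unspace($value);
// $value is 'javascript:alert(1)'
```

With the control characters gone, a later check for "javascript:" sees the same string the browser would.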