author: Bernhard Posselt <nukeawhale@gmail.com> 2013-05-04 00:15:41 +0200
committer: Bernhard Posselt <nukeawhale@gmail.com> 2013-05-04 00:15:41 +0200
commit: 10831dd274ff65d4852b47dbc398adae61845206
tree: 9f9397bb7433fd53bfacf88d8c8b3cf2ef50e27d /3rdparty/htmlpurifier/docs
parent: 7b628a3e4d105f2e571d0fe142d59f201d6a10d0
use html purifier for sanitation
Diffstat (limited to '3rdparty/htmlpurifier/docs')
-rw-r--r-- 3rdparty/htmlpurifier/docs/dev-advanced-api.html | 26
-rw-r--r-- 3rdparty/htmlpurifier/docs/dev-code-quality.txt | 29
-rw-r--r-- 3rdparty/htmlpurifier/docs/dev-config-bcbreaks.txt | 79
-rw-r--r-- 3rdparty/htmlpurifier/docs/dev-config-naming.txt | 164
-rw-r--r-- 3rdparty/htmlpurifier/docs/dev-config-schema.html | 412
-rw-r--r-- 3rdparty/htmlpurifier/docs/dev-flush.html | 68
-rw-r--r-- 3rdparty/htmlpurifier/docs/dev-includes.txt | 281
-rw-r--r-- 3rdparty/htmlpurifier/docs/dev-naming.html | 83
-rw-r--r-- 3rdparty/htmlpurifier/docs/dev-optimization.html | 33
-rw-r--r-- 3rdparty/htmlpurifier/docs/dev-progress.html | 309
-rw-r--r-- 3rdparty/htmlpurifier/docs/dtd/xhtml1-transitional.dtd | 1201
-rw-r--r-- 3rdparty/htmlpurifier/docs/enduser-customize.html | 850
-rw-r--r-- 3rdparty/htmlpurifier/docs/enduser-id.html | 148
-rw-r--r-- 3rdparty/htmlpurifier/docs/enduser-overview.txt | 59
-rw-r--r-- 3rdparty/htmlpurifier/docs/enduser-security.txt | 18
-rw-r--r-- 3rdparty/htmlpurifier/docs/enduser-slow.html | 120
-rw-r--r-- 3rdparty/htmlpurifier/docs/enduser-tidy.html | 231
-rw-r--r-- 3rdparty/htmlpurifier/docs/enduser-uri-filter.html | 204
-rw-r--r-- 3rdparty/htmlpurifier/docs/enduser-utf8.html | 1060
-rw-r--r-- 3rdparty/htmlpurifier/docs/enduser-youtube.html | 153
-rw-r--r-- 3rdparty/htmlpurifier/docs/entities/xhtml-lat1.ent | 196
-rw-r--r-- 3rdparty/htmlpurifier/docs/entities/xhtml-special.ent | 80
-rw-r--r-- 3rdparty/htmlpurifier/docs/entities/xhtml-symbol.ent | 237
-rw-r--r-- 3rdparty/htmlpurifier/docs/examples/basic.php | 23
-rw-r--r-- 3rdparty/htmlpurifier/docs/fixquotes.htc | 9
-rw-r--r-- 3rdparty/htmlpurifier/docs/index.html | 188
-rw-r--r-- 3rdparty/htmlpurifier/docs/proposal-colors.html | 49
-rw-r--r-- 3rdparty/htmlpurifier/docs/proposal-config.txt | 23
-rw-r--r-- 3rdparty/htmlpurifier/docs/proposal-css-extraction.txt | 34
-rw-r--r-- 3rdparty/htmlpurifier/docs/proposal-errors.txt | 211
-rw-r--r-- 3rdparty/htmlpurifier/docs/proposal-filter-levels.txt | 137
-rw-r--r-- 3rdparty/htmlpurifier/docs/proposal-language.txt | 64
-rw-r--r-- 3rdparty/htmlpurifier/docs/proposal-new-directives.txt | 44
-rw-r--r-- 3rdparty/htmlpurifier/docs/proposal-plists.txt | 218
-rw-r--r-- 3rdparty/htmlpurifier/docs/ref-content-models.txt | 50
-rw-r--r-- 3rdparty/htmlpurifier/docs/ref-css-length.txt | 30
-rw-r--r-- 3rdparty/htmlpurifier/docs/ref-devnetwork.html | 47
-rw-r--r-- 3rdparty/htmlpurifier/docs/ref-html-modularization.txt | 166
-rw-r--r-- 3rdparty/htmlpurifier/docs/ref-proprietary-tags.txt | 26
-rw-r--r-- 3rdparty/htmlpurifier/docs/ref-whatwg.txt | 26
-rw-r--r-- 3rdparty/htmlpurifier/docs/specimens/LICENSE | 10
-rw-r--r-- 3rdparty/htmlpurifier/docs/specimens/html-align-to-css.html | 165
-rw-r--r-- 3rdparty/htmlpurifier/docs/specimens/img.png | bin 0 -> 2138 bytes
-rw-r--r-- 3rdparty/htmlpurifier/docs/specimens/jochem-blok-word.html | 129
-rw-r--r-- 3rdparty/htmlpurifier/docs/specimens/windows-live-mail-desktop-beta.html | 74
-rw-r--r-- 3rdparty/htmlpurifier/docs/style.css | 76
46 files changed, 7840 insertions, 0 deletions
diff --git a/3rdparty/htmlpurifier/docs/dev-advanced-api.html b/3rdparty/htmlpurifier/docs/dev-advanced-api.html
new file mode 100644
index 000000000..4002fb8be
--- /dev/null
+++ b/3rdparty/htmlpurifier/docs/dev-advanced-api.html
@@ -0,0 +1,26 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head>
+<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+<meta name="description" content="Specification for HTML Purifier's advanced API for defining custom filtering behavior." />
+<link rel="stylesheet" type="text/css" href="style.css" />
+
+<title>Advanced API - HTML Purifier</title>
+
+</head><body>
+
+<h1>Advanced API</h1>
+
+<div id="filing">Filed under Development</div>
+<div id="index">Return to the <a href="index.html">index</a>.</div>
+<div id="home"><a href="http://htmlpurifier.org/">HTML Purifier</a> End-User Documentation</div>
+
+<p>
+ Please see <a href="enduser-customize.html">Customize!</a>
+</p>
+
+</body></html>
+
+<!-- vim: et sw=4 sts=4
+-->
diff --git a/3rdparty/htmlpurifier/docs/dev-code-quality.txt b/3rdparty/htmlpurifier/docs/dev-code-quality.txt
new file mode 100644
index 000000000..afce502f4
--- /dev/null
+++ b/3rdparty/htmlpurifier/docs/dev-code-quality.txt
@@ -0,0 +1,29 @@
+
+Code Quality Issues
+
+Okay, face it. Programmers can get lazy, cut corners, or make mistakes. They
+also can do quick prototypes, and then forget to rewrite them later. Well,
+while I can't list mistakes in here, I can list prototype-like segments
+of code that should be aggressively refactored. This does not list
+optimization issues; those need to be addressed after intense profiling.
+
+docs/examples/demo.php - ad hoc HTML/PHP soup to the extreme
+
+AttrDef - a lot of duplication, more generic classes need to be created;
+a lot of strtolower() calls, no legit casing
+ Class - doesn't support Unicode characters (fringe); uses regular expressions
+ Lang - code duplication; premature optimization
+ Length - easily mistaken for CSSLength
+ URI - multiple regular expressions; missing validation for parts (?)
+ CSS - parser doesn't accept advanced CSS (fringe)
+ Number - constructor interface inconsistent with Integer
+Strategy
+ FixNesting - cannot bubble nodes out of structures, duplicated checks
+ for special-case parent node
+ RemoveForeignElements - should be run in parallel with MakeWellFormed
+URIScheme - needs to have callable generic checks
+ mailto - doesn't validate emails, doesn't validate querystring
+ news - doesn't validate opaque path
+ nntp - doesn't constrain path
+
+    vim: et sw=4 sts=4
diff --git a/3rdparty/htmlpurifier/docs/dev-config-bcbreaks.txt b/3rdparty/htmlpurifier/docs/dev-config-bcbreaks.txt
new file mode 100644
index 000000000..31114b2b7
--- /dev/null
+++ b/3rdparty/htmlpurifier/docs/dev-config-bcbreaks.txt
@@ -0,0 +1,79 @@
+
+Configuration Backwards-Compatibility Breaks
+
+In version 4.0.0, the configuration subsystem (composed of the outwards
+facing Config class, as well as the ConfigSchema and ConfigSchema_Interchange
+subsystems), was significantly revamped to make use of property lists.
+While most of the changes are internal, some internal APIs were changed for the
+sake of clarity. HTMLPurifier_Config was kept completely backwards compatible,
+although some of the functions were retrofitted with an unambiguous alternate
+syntax. Both of these changes are discussed in this document.
+
+
+
+1. Outwards Facing Changes
+--------------------------------------------------------------------------------
+
+The HTMLPurifier_Config class now takes an alternate syntax. The general rule
+is:
+
+ If you passed $namespace, $directive, pass "$namespace.$directive"
+ instead.
+
+An example:
+
+ $config->set('HTML', 'Allowed', 'p');
+
+becomes:
+
+ $config->set('HTML.Allowed', 'p');
+
+New configuration options may have more than one namespace; they might
+look something like %Filter.YouTube.Blacklist. While you could technically
+set it with ('Filter', 'YouTube.Blacklist'), the logical extension
+('Filter', 'YouTube', 'Blacklist') does not work.
+
+The old API will still work, but will emit E_USER_NOTICEs.
+
+
+
+2. Internal API Changes
+--------------------------------------------------------------------------------
+
+Some overarching notes: we've completely eliminated the notion of namespace;
+it's now an informal construct for organizing related configuration directives.
+
+Also, the validation routines for keys (formerly "$namespace.$directive")
+have been completely relaxed. I don't think such validation should really be necessary.
+
+2.1 HTMLPurifier_ConfigSchema
+
+First off, if you're interfacing with this class, you really shouldn't.
+HTMLPurifier_ConfigSchema_Builder_ConfigSchema is really the only class that
+should ever be creating HTMLPurifier_ConfigSchema, and HTMLPurifier_Config the
+only class that should be reading it.
+
+All namespace related methods were removed; they are completely unnecessary
+now. Any $namespace, $name arguments must be replaced with $key (where
+$key == "$namespace.$name"), including for addAlias().
+
+The $info and $defaults member variables are no longer indexed as
+[$namespace][$name]; they are now indexed as ["$namespace.$name"].
+
+All deprecated methods were finally removed, after having yelled at you as
+an E_USER_NOTICE for a while now.
+
+2.2 HTMLPurifier_ConfigSchema_Interchange
+
+Member variable $namespaces was removed.
+
+2.3 HTMLPurifier_ConfigSchema_Interchange_Id
+
+Member variables $namespace and $directive removed; member variable $key added.
+Any method that took $namespace, $directive now takes $key.
+
+2.4 HTMLPurifier_ConfigSchema_Interchange_Namespace
+
+Removed.
+
+    vim: et sw=4 sts=4
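
Reviewer note on dev-config-bcbreaks.txt: the key migration described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not HTML Purifier code; `normalize_key` and `flatten_schema` are hypothetical helper names, and the `DeprecationWarning` only stands in for the E_USER_NOTICE that the old API emits.

```python
import warnings


def normalize_key(*args):
    """Hypothetical helper: map old-style (namespace, directive) arguments
    to the new single dotted key, e.g. ('HTML', 'Allowed') -> 'HTML.Allowed'."""
    if len(args) == 1:
        # New-style call: the key is already "Namespace.Directive".
        return args[0]
    if len(args) == 2:
        # Old-style call: still accepted, but flagged as deprecated
        # (a stand-in for the E_USER_NOTICE mentioned in the document).
        warnings.warn("pass 'Namespace.Directive' as a single key",
                      DeprecationWarning, stacklevel=2)
        return f"{args[0]}.{args[1]}"
    # Three or more parts, e.g. ('Filter', 'YouTube', 'Blacklist'),
    # are not supported by the old syntax.
    raise ValueError("use a single dotted key for nested namespaces")


def flatten_schema(nested):
    """Hypothetical helper: convert a table indexed as [namespace][name]
    into one indexed as ["namespace.name"]."""
    return {
        f"{namespace}.{name}": value
        for namespace, names in nested.items()
        for name, value in names.items()
    }
```

Under these assumptions, `normalize_key('HTML', 'Allowed')` and `normalize_key('HTML.Allowed')` yield the same key, which mirrors the "pass "$namespace.$directive" instead" rule in section 1, and `flatten_schema` mirrors the re-indexing of `$info` and `$defaults` described in section 2.1.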
diff --git a/3rdparty/htmlpurifier/docs/dev-config-naming.txt b/3rdparty/htmlpurifier/docs/dev-config-naming.txt
new file mode 100644
index 000000000..1f85b6545
--- /dev/null
+++ b/3rdparty/htmlpurifier/docs/dev-config-naming.txt
@@ -0,0 +1,164 @@
+Configuration naming
+
+HTML Purifier 4.0.0 features a new configuration naming system that
+allows arbitrary nesting of namespaces. While there are certain cases
+in which using two namespaces is obviously better (the canonical example
+is where we were using AutoFormatParam to contain directives for AutoFormat
+parameters), it is unclear whether a general migration to highly
+namespaced directives is a good idea.
+
+== Case studies ==
+
+=== Attr.* ===
+
+We have a dead duck HTML.Attr.Name.UseCDATA which migrated before we decided
+to think this out thoroughly.
+
+We currently have a large number of directives in the Attr.* namespace.
+These directives tweak the behavior of some HTML attributes. They have
+the properties:
+
+* While they apply to only one attribute at a time, the attribute can
+ span over multiple elements (not necessarily all elements, either).
+ The information of which elements it impacts is either omitted or
+ informally stated (EnableID applies to all elements, DefaultImageAlt
+ applies to <img> tags, AllowedRev doesn't say but only applies to <a> tags).
+
+* There is a certain degree of clustering that could be applied, especially
+ to the ID directives. The clustering could be done with respect to
+ what element/attribute was used, i.e.
+
+ *.id -> EnableID, IDBlacklistRegexp, IDBlacklist, IDPrefixLocal, IDPrefix
+ img.src -> DefaultInvalidImage
+ img.alt -> DefaultImageAlt, DefaultInvalidImageAlt
+ bdo.dir -> DefaultTextDir
+ a.rel -> AllowedRel
+ a.rev -> AllowedRev
+ a.target -> AllowedFrameTargets
+ a.name -> Name.UseCDATA
+
+* The directives often reference generic attribute types that were specified
+ in the DTD/specification. However, some of the behavior specifically relies
+ on the fact that other use cases of the attribute are not, at current,
+ supported by HTML Purifier.
+
+ AllowedRel, AllowedRev -> heavily <a> specific; if <link> ends up being
+ allowed, we will also have to give users specificity there (we also
+ want to preserve generality) DTD %Linktypes, HTML5 distinguishes
+ between <link> and <a>/<area>
+ AllowedFrameTargets -> heavily <a> specific, but also used by <area>
+ and <form>. Transitional DTD %FrameTarget, not present in strict,
+ HTML5 calls them "browsing contexts"
+ Default*Image* -> as a default parameter, is almost entirely exclusive
+ to <img>
+ EnableID -> global attribute
+ Name.UseCDATA -> heavily <a> specific, but has heavy other usage by
+ many things
+
+== AutoFormat.* ==
+
+These have the fairly normal pluggable architecture that lends itself to
+large amounts of namespaces (pluggability may be the key to figuring
+out when gratuitous namespacing is good.) Properties:
+
+* Boolean directives are fair game for being namespaced: for example,
+ RemoveEmpty.RemoveNbsp triggers RemoveEmpty.RemoveNbsp.Exceptions,
+ the latter of which only makes sense when RemoveEmpty.RemoveNbsp
+ is set to true. (The same applies to RemoveNbsp too)
+
+The AutoFormat string is a bit long, but is the only bit of repeated
+context.
+
+== Core.* ==
+
+Core is the potpourri of directives, mostly regarding some minor behavioral
+tweaks for HTML handling abilities.
+
+ AggressivelyFixLt
+ ConvertDocumentToFragment
+ DirectLexLineNumberSyncInterval
+ LexerImpl
+ MaintainLineNumbers
+ Lexer
+ CollectErrors
+ Language
+ Error handling (Language is ostensibly a little more general, but
+ it's only used for error handling right now)
+ ColorKeywords
+ CSS and HTML
+ Encoding
+ EscapeNonASCIICharacters
+ Character encoding
+ EscapeInvalidChildren
+ EscapeInvalidTags
+ HiddenElements
+ RemoveInvalidImg
+ Lexing/Output
+ RemoveScriptContents
+ Deprecated
+
+== HTML.* ==
+
+ AllowedAttributes
+ AllowedElements
+ AllowedModules
+ Allowed
+ ForbiddenAttributes
+ ForbiddenElements
+ Element set tuning
+ BlockWrapper
+ Child def advanced twiddle
+ CoreModules
+ CustomDoctype
+ Advanced HTMLModuleManager twiddles
+ DefinitionID
+ DefinitionRev
+ Caching
+ Doctype
+ Parent
+ Strict
+ XHTML
+ Global environment
+ MaxImgLength
+ Attribute twiddle? (applies to two attributes)
+ Proprietary
+ SafeEmbed
+ SafeObject
+ Trusted
+ Extra functionality/tagsets
+ TidyAdd
+ TidyLevel
+ TidyRemove
+ Tidy
+
+== Output.* ==
+
+These directly affect the output of Generator. These are all advanced
+twiddles.
+
+== URI.* ==
+
+ AllowedSchemes
+ OverrideAllowedSchemes
+ Scheme tuning
+ Base
+ DefaultScheme
+ Host
+ Global environment
+ DefinitionID
+ DefinitionRev
+ Caching
+ DisableExternalResources
+ DisableExternal
+ DisableResources
+ Disable
+ Contextual/authority tuning
+ HostBlacklist
+ Authority tuning
+ MakeAbsolute
+ MungeResources
+ MungeSecretKey
+ Munge
+ Transformation behavior (munge can be grouped)
+
+
diff --git a/3rdparty/htmlpurifier/docs/dev-config-schema.html b/3rdparty/htmlpurifier/docs/dev-config-schema.html
new file mode 100644
index 000000000..39d866d4a
--- /dev/null
+++ b/3rdparty/htmlpurifier/docs/dev-config-schema.html
@@ -0,0 +1,412 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
+ "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+ <head>
+ <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
+ <meta name="description" content="Describes config schema framework in HTML Purifier." />
+ <link rel="stylesheet" type="text/css" href="./style.css" />
+ <title>Config Schema - HTML Purifier</title>
+ </head>
+ <body>
+
+ <h1>Config Schema</h1>
+
+ <div id="filing">Filed under Development</div>
+ <div id="index">Return to the <a href="index.html">index</a>.</div>
+ <div id="home"><a href="http://htmlpurifier.org/">HTML Purifier</a> End-User Documentation</div>
+
+ <p>
+ HTML Purifier has a fairly complex system for configuration. Users
+ interact with a <code>HTMLPurifier_Config</code> object to
+ set configuration directives. The values they set are validated according
+ to a configuration schema, <code>HTMLPurifier_ConfigSchema</code>.
+ </p>
+
+ <p>
+ The schema is mostly transparent to end-users, but if you're doing development
+ work for HTML Purifier and need to define a new configuration directive,
+ you'll need to interact with it. We'll also talk about how to define
+ userspace configuration directives at the very end.
+ </p>
+
+ <h2>Write a directive file</h2>
+
+ <p>
+ Directive files define configuration directives to be used by
+ HTML Purifier. They are placed in <code>library/HTMLPurifier/ConfigSchema/schema/</code>
+ in the form <code><em>Namespace</em>.<em>Directive</em>.txt</code> (I
+ couldn't think of a more descriptive file extension.)
+ Directive files are actually what we call <code>StringHash</code>es,
+ i.e. associative arrays represented in a string form reminiscent of
+ <a href="http://qa.php.net/write-test.php">PHPT</a> tests. Here's a
+ sample directive file, <code>Test.Sample.txt</code>:
+ </p>
+
+ <pre>Test.Sample
+TYPE: string/null
+DEFAULT: NULL
+ALLOWED: 'foo', 'bar'
+VALUE-ALIASES: 'baz' => 'bar'
+VERSION: 3.1.0
+--DESCRIPTION--
+This is a sample configuration directive for the purposes of the
+&lt;code&gt;dev-config-schema.html&lt;/code&gt; documentation.
+--ALIASES--
+Test.Example</pre>
+
+ <p>
+ Each of these segments has a specific meaning:
+ </p>
+
+ <table class="table">
+ <thead>
+ <tr>
+ <th>Key</th>
+ <th>Example</th>
+ <th>Description</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>ID</td>
+ <td>Test.Sample</td>
+ <td>The name of the directive, in the form Namespace.Directive
+ (implicitly the first line)</td>
+ </tr>
+ <tr>
+ <td>TYPE</td>
+ <td>string/null</td>
+ <td>The type of variable this directive accepts. See below for
+ details. You can also add <code>/null</code> to the end of
+ any basic type to allow null values too.</td>
+ </tr>
+ <tr>
+ <td>DEFAULT</td>
+ <td>NULL</td>
+ <td>A parseable PHP expression of the default value.</td>
+ </tr>
+ <tr>
+ <td>DESCRIPTION</td>
+ <td>This is a...</td>
+ <td>An HTML description of what this directive does.</td>
+ </tr>
+ <tr>
+ <td>VERSION</td>
+ <td>3.1.0</td>
+ <td><em>Recommended</em>. The version of HTML Purifier in which this directive was added.
+ Directives that have been around since 1.0.0 don't have this,
+ but any new ones should.</td>
+ </tr>
+ <tr>
+ <td>ALIASES</td>
+ <td>Test.Example</td>
+ <td><em>Optional</em>. A comma separated list of aliases for this directive.
+ This is most useful for backwards compatibility and should
+ not be used otherwise.</td>
+ </tr>
+ <tr>
+ <td>ALLOWED</td>
+ <td>'foo', 'bar'</td>
+ <td><em>Optional</em>. Set of allowed values for a directive,
+ a comma separated list of parseable PHP expressions. This
+ is only allowed for string, istring, text and itext TYPEs.</td>
+ </tr>
+ <tr>
+ <td>VALUE-ALIASES</td>
+ <td>'baz' =&gt; 'bar'</td>
+ <td><em>Optional</em>. Mapping of one value to another; it
+ should be a comma separated list of key-value pairs. This
+ is only allowed for string, istring, text and itext TYPEs.</td>
+ </tr>
+ <tr>
+ <td>DEPRECATED-VERSION</td>
+ <td>3.1.0</td>
+ <td><em>Not shown</em>. Indicates that the directive was
+ deprecated in this version.</td>
+ </tr>
+ <tr>
+ <td>DEPRECATED-USE</td>
+ <td>Test.NewDirective</td>
+ <td><em>Not shown</em>. Indicates what new directive should be
+ used instead. Note that the two directives will behave
+ differently, although they should cover the same use case.
+ If they are identical, use an alias instead.</td>
+ </tr>
+ <tr>
+ <td>EXTERNAL</td>
+ <td>CSSTidy</td>
+ <td><em>Not shown</em>. Indicates if there is an external library
+ the user will need to download and install to use this configuration
+ directive. As of right now, this is merely a Google-able name; future
+ versions may also provide links and instructions.</td>
+ </tr>
+ </tbody>
+ </table>
+
+ <p>
+ Some notes on format and style:
+ </p>
+
+ <ul>
+ <li>
+ Each of these keys can be expressed in the short format
+ (<code>KEY: Value</code>) or the long format
+ (<code>--KEY--</code> with value beneath). You must use the
+ long format if multiple lines are needed, or if a long format
+ has been used already (that's why <code>ALIASES</code> in our
+ example is in the long format); otherwise, it's user preference.
+ </li>
+ <li>
+ The HTML descriptions should be wrapped at about 80 columns; do
+ not rely on editor word-wrapping.
+ </li>
+ </ul>
+
+ <p>
+ Also, as promised, here is the set of possible types:
+ </p>
+
+ <table class="table">
+ <thead>
+ <tr>
+ <th>Type</th>
+ <th>Example</th>
+ <th>Description</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td>string</td>
+ <td>'Foo'</td>
+ <td><a href="http://docs.php.net/manual/en/language.types.string.php">String</a> without newlines</td>
+ </tr>
+ <tr>
+ <td>istring</td>
+ <td>'foo'</td>
+ <td>Case insensitive ASCII string without newlines</td>
+ </tr>
+ <tr>
+ <td>text</td>
+ <td>"A<em>\n</em>b"</td>
+ <td>String with newlines</td>
+ </tr>
+ <tr>
+ <td>itext</td>
+ <td>"a<em>\n</em>b"</td>
+ <td>Case insensitive ASCII string with newlines</td>
+ </tr>
+ <tr>
+ <td>int</td>
+ <td>23</td>
+ <td>Integer</td>
+ </tr>
+ <tr>
+ <td>float</td>
+ <td>3.0</td>
+ <td>Floating point number</td>
+ </tr>
+ <tr>
+ <td>bool</td>
+ <td>true</td>
+ <td>Boolean</td>
+ </tr>
+ <tr>
+ <td>lookup</td>
+ <td>array('key' =&gt; true)</td>
+