Diffstat (limited to 'vendor/ezyang/htmlpurifier/docs')
46 files changed, 7840 insertions, 0 deletions
diff --git a/vendor/ezyang/htmlpurifier/docs/dev-advanced-api.html b/vendor/ezyang/htmlpurifier/docs/dev-advanced-api.html new file mode 100644 index 000000000..5b7aaa3c8 --- /dev/null +++ b/vendor/ezyang/htmlpurifier/docs/dev-advanced-api.html @@ -0,0 +1,26 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head> +<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> +<meta name="description" content="Specification for HTML Purifier's advanced API for defining custom filtering behavior." /> +<link rel="stylesheet" type="text/css" href="style.css" /> + +<title>Advanced API - HTML Purifier</title> + +</head><body> + +<h1>Advanced API</h1> + +<div id="filing">Filed under Development</div> +<div id="index">Return to the <a href="index.html">index</a>.</div> +<div id="home"><a href="http://htmlpurifier.org/">HTML Purifier</a> End-User Documentation</div> + +<p> + Please see <a href="enduser-customize.html">Customize!</a> +</p> + +</body></html> + +<!-- vim: et sw=4 sts=4 +--> diff --git a/vendor/ezyang/htmlpurifier/docs/dev-code-quality.txt b/vendor/ezyang/htmlpurifier/docs/dev-code-quality.txt new file mode 100644 index 000000000..bceedebc4 --- /dev/null +++ b/vendor/ezyang/htmlpurifier/docs/dev-code-quality.txt @@ -0,0 +1,29 @@ + +Code Quality Issues + +Okay, face it. Programmers can get lazy, cut corners, or make mistakes. They +also can do quick prototypes, and then forget to rewrite them later. Well, +while I can't list mistakes in here, I can list prototype-like segments +of code that should be aggressively refactored. This does not list +optimization issues, that needs to be done after intense profiling. 
+ +docs/examples/demo.php - ad hoc HTML/PHP soup to the extreme + +AttrDef - a lot of duplication, more generic classes need to be created; +a lot of strtolower() calls, no legit casing + Class - doesn't support Unicode characters (fringe); uses regular expressions + Lang - code duplication; premature optimization + Length - easily mistaken for CSSLength + URI - multiple regular expressions; missing validation for parts (?) + CSS - parser doesn't accept advanced CSS (fringe) + Number - constructor interface inconsistent with Integer +Strategy + FixNesting - cannot bubble nodes out of structures, duplicated checks + for special-case parent node + RemoveForeignElements - should be run in parallel with MakeWellFormed +URIScheme - needs to have callable generic checks + mailto - doesn't validate emails, doesn't validate querystring + news - doesn't validate opaque path + nntp - doesn't constrain path + + vim: et sw=4 sts=4 diff --git a/vendor/ezyang/htmlpurifier/docs/dev-config-bcbreaks.txt b/vendor/ezyang/htmlpurifier/docs/dev-config-bcbreaks.txt new file mode 100644 index 000000000..29a58ca2f --- /dev/null +++ b/vendor/ezyang/htmlpurifier/docs/dev-config-bcbreaks.txt @@ -0,0 +1,79 @@ + +Configuration Backwards-Compatibility Breaks + +In version 4.0.0, the configuration subsystem (composed of the outwards +facing Config class, as well as the ConfigSchema and ConfigSchema_Interchange +subsystems), was significantly revamped to make use of property lists. +While most of the changes are internal, some internal APIs were changed for the +sake of clarity. HTMLPurifier_Config was kept completely backwards compatible, +although some of the functions were retrofitted with an unambiguous alternate +syntax. Both of these changes are discussed in this document. + + + +1. Outwards Facing Changes +-------------------------------------------------------------------------------- + +The HTMLPurifier_Config class now takes an alternate syntax. 
The general rule +is: + + If you passed $namespace, $directive, pass "$namespace.$directive" + instead. + +An example: + + $config->set('HTML', 'Allowed', 'p'); + +becomes: + + $config->set('HTML.Allowed', 'p'); + +New configuration options may have more than one namespace, they might +look something like %Filter.YouTube.Blacklist. While you could technically +set it with ('HTML', 'YouTube.Blacklist'), the logical extension +('HTML', 'YouTube', 'Blacklist') does not work. + +The old API will still work, but will emit E_USER_NOTICEs. + + + +2. Internal API Changes +-------------------------------------------------------------------------------- + +Some overarching notes: we've completely eliminated the notion of namespace; +it's now an informal construct for organizing related configuration directives. + +Also, the validation routines for keys (formerly "$namespace.$directive") +have been completely relaxed. I don't think it really should be necessary. + +2.1 HTMLPurifier_ConfigSchema + +First off, if you're interfacing with this class, you really shouldn't. +HTMLPurifier_ConfigSchema_Builder_ConfigSchema is really the only class that +should ever be creating HTMLPurifier_ConfigSchema, and HTMLPurifier_Config the +only class that should be reading it. + +All namespace related methods were removed; they are completely unnecessary +now. Any $namespace, $name arguments must be replaced with $key (where +$key == "$namespace.$name"), including for addAlias(). + +The $info and $defaults member variables are no longer indexed as +[$namespace][$name]; they are now indexed as ["$namespace.$name"]. + +All deprecated methods were finally removed, after having yelled at you as +an E_USER_NOTICE for a while now. + +2.2 HTMLPurifier_ConfigSchema_Interchange + +Member variable $namespaces was removed. + +2.3 HTMLPurifier_ConfigSchema_Interchange_Id + +Member variable $namespace and $directive removed; member variable $key added. 
+Any method that took $namespace, $directive now takes $key. + +2.4 HTMLPurifier_ConfigSchema_Interchange_Namespace + +Removed. + + vim: et sw=4 sts=4 diff --git a/vendor/ezyang/htmlpurifier/docs/dev-config-naming.txt b/vendor/ezyang/htmlpurifier/docs/dev-config-naming.txt new file mode 100644 index 000000000..66db5bce3 --- /dev/null +++ b/vendor/ezyang/htmlpurifier/docs/dev-config-naming.txt @@ -0,0 +1,164 @@ +Configuration naming + +HTML Purifier 4.0.0 features a new configuration naming system that +allows arbitrary nesting of namespaces. While there are certain cases +in which using two namespaces is obviously better (the canonical example +is where we were using AutoFormatParam to contain directives for AutoFormat +parameters), it is unclear whether or not a general migration to highly +namespaced directives is a good idea or not. + +== Case studies == + +=== Attr.* === + +We have a dead duck HTML.Attr.Name.UseCDATA which migrated before we decided +to think this out thoroughly. + +We currently have a large number of directives in the Attr.* namespace. +These directives tweak the behavior of some HTML attributes. They have +the properties: + +* While they apply to only one attribute at a time, the attribute can + span over multiple elements (not necessarily all attributes, either). + The information of which elements it impacts is either omitted or + informally stated (EnableID applies to all elements, DefaultImageAlt + applies to <img> tags, AllowedRev doesn't say but only applies to a tags). + +* There is a certain degree of clustering that could be applied, especially + to the ID directives. The clustering could be done with respect to + what element/attribute was used, i.e. 
+ + *.id -> EnableID, IDBlacklistRegexp, IDBlacklist, IDPrefixLocal, IDPrefix + img.src -> DefaultInvalidImage + img.alt -> DefaultImageAlt, DefaultInvalidImageAlt + bdo.dir -> DefaultTextDir + a.rel -> AllowedRel + a.rev -> AllowedRev + a.target -> AllowedFrameTargets + a.name -> Name.UseCDATA + +* The directives often reference generic attribute types that were specified + in the DTD/specification. However, some of the behavior specifically relies + on the fact that other use cases of the attribute are not, at current, + supported by HTML Purifier. + + AllowedRel, AllowedRev -> heavily <a> specific; if <link> ends up being + allowed, we will also have to give users specificity there (we also + want to preserve generality) DTD %Linktypes, HTML5 distinguishes + between <link> and <a>/<area> + AllowedFrameTargets -> heavily <a> specific, but also used by <area> + and <form>. Transitional DTD %FrameTarget, not present in strict, + HTML5 calls them "browsing contexts" + Default*Image* -> as a default parameter, is almost entirely exclusive + to <img> + EnableID -> global attribute + Name.UseCDATA -> heavily <a> specific, but has heavy other usage by + many things + +== AutoFormat.* == + +These have the fairly normal pluggable architecture that lends itself to +large amounts of namespaces (pluggability may be the key to figuring +out when gratuitous namespacing is good.) Properties: + +* Boolean directives are fair game for being namespaced: for example, + RemoveEmpty.RemoveNbsp triggers RemoveEmpty.RemoveNbsp.Exceptions, + the latter of which only makes sense when RemoveEmpty.RemoveNbsp + is set to true. (The same applies to RemoveNbsp too) + +The AutoFormat string is a bit long, but is the only bit of repeated +context. + +== Core.* == + +Core is the potpourri of directives, mostly regarding some minor behavioral +tweaks for HTML handling abilities. 
+ + AggressivelyFixLt + ConvertDocumentToFragment + DirectLexLineNumberSyncInterval + LexerImpl + MaintainLineNumbers + Lexer + CollectErrors + Language + Error handling (Language is ostensibly a little more general, but + it's only used for error handling right now) + ColorKeywords + CSS and HTML + Encoding + EscapeNonASCIICharacters + Character encoding + EscapeInvalidChildren + EscapeInvalidTags + HiddenElements + RemoveInvalidImg + Lexing/Output + RemoveScriptContents + Deprecated + +== HTML.* == + + AllowedAttributes + AllowedElements + AllowedModules + Allowed + ForbiddenAttributes + ForbiddenElements + Element set tuning + BlockWrapper + Child def advanced twiddle + CoreModules + CustomDoctype + Advanced HTMLModuleManager twiddles + DefinitionID + DefinitionRev + Caching + Doctype + Parent + Strict + XHTML + Global environment + MaxImgLength + Attribute twiddle? (applies to two attributes) + Proprietary + SafeEmbed + SafeObject + Trusted + Extra functionality/tagsets + TidyAdd + TidyLevel + TidyRemove + Tidy + +== Output.* == + +These directly affect the output of Generator. These are all advanced +twiddles. 
+ +== URI.* == + + AllowedSchemes + OverrideAllowedSchemes + Scheme tuning + Base + DefaultScheme + Host + Global environment + DefinitionID + DefinitionRev + Caching + DisableExternalResources + DisableExternal + DisableResources + Disable + Contextual/authority tuning + HostBlacklist + Authority tuning + MakeAbsolute + MungeResources + MungeSecretKey + Munge + Transformation behavior (munge can be grouped) + + diff --git a/vendor/ezyang/htmlpurifier/docs/dev-config-schema.html b/vendor/ezyang/htmlpurifier/docs/dev-config-schema.html new file mode 100644 index 000000000..07aecd35a --- /dev/null +++ b/vendor/ezyang/htmlpurifier/docs/dev-config-schema.html @@ -0,0 +1,412 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> + <head> + <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> + <meta name="description" content="Describes config schema framework in HTML Purifier." /> + <link rel="stylesheet" type="text/css" href="./style.css" /> + <title>Config Schema - HTML Purifier</title> + </head> + <body> + + <h1>Config Schema</h1> + + <div id="filing">Filed under Development</div> + <div id="index">Return to the <a href="index.html">index</a>.</div> + <div id="home"><a href="http://htmlpurifier.org/">HTML Purifier</a> End-User Documentation</div> + + <p> + HTML Purifier has a fairly complex system for configuration. Users + interact with a <code>HTMLPurifier_Config</code> object to + set configuration directives. The values they set are validated according + to a configuration schema, <code>HTMLPurifier_ConfigSchema</code>. + </p> + + <p> + The schema is mostly transparent to end-users, but if you're doing development + work for HTML Purifier and need to define a new configuration directive, + you'll need to interact with it. 
We'll also talk about how to define + userspace configuration directives at the very end. + </p> + + <h2>Write a directive file</h2> + + <p> + Directive files define configuration directives to be used by + HTML Purifier. They are placed in <code>library/HTMLPurifier/ConfigSchema/schema/</code> + in the form <code><em>Namespace</em>.<em>Directive</em>.txt</code> (I + couldn't think of a more descriptive file extension.) + Directive files are actually what we call <code>StringHash</code>es, + i.e. associative arrays represented in a string form reminiscent of + <a href="http://qa.php.net/write-test.php">PHPT</a> tests. Here's a + sample directive file, <code>Test.Sample.txt</code>: + </p> + + <pre>Test.Sample +TYPE: string/null +DEFAULT: NULL +ALLOWED: 'foo', 'bar' +VALUE-ALIASES: 'baz' => 'bar' +VERSION: 3.1.0 +--DESCRIPTION-- +This is a sample configuration directive for the purposes of the +<code>dev-config-schema.html</code> documentation. +--ALIASES-- +Test.Example</pre> + + <p> + Each of these segments has a specific meaning: + </p> + + <table class="table"> + <thead> + <tr> + <th>Key</th> + <th>Example</th> + <th>Description</th> + </tr> + </thead> + <tbody> + <tr> + <td>ID</td> + <td>Test.Sample</td> + <td>The name of the directive, in the form Namespace.Directive + (implicitly the first line)</td> + </tr> + <tr> + <td>TYPE</td> + <td>string/null</td> + <td>The type of variable this directive accepts. See below for + details. You can also add <code>/null</code> to the end of + any basic type to allow null values too.</td> + </tr> + <tr> + <td>DEFAULT</td> + <td>NULL</td> + <td>A parseable PHP expression of the default value.</td> + </tr> + <tr> + <td>DESCRIPTION</td> + <td>This is a...</td> + <td>An HTML description of what this directive does.</td> + </tr> + <tr> + <td>VERSION</td> + <td>3.1.0</td> + <td><em>Recommended</em>. The version of HTML Purifier in which this directive was added. 
+ Directives that have been around since 1.0.0 don't have this, + but any new ones should.</td> + </tr> + <tr> + <td>ALIASES</td> + <td>Test.Example</td> + <td><em>Optional</em>. A comma separated list of aliases for this directive. + This is most useful for backwards compatibility and should + not be used otherwise.</td> + </tr> + <tr> + <td>ALLOWED</td> + <td>'foo', 'bar'</td> + <td><em>Optional</em>. Set of allowed values for a directive, + a comma separated list of parseable PHP expressions. This + is only allowed for string, istring, text and itext TYPEs.</td> + </tr> + <tr> + <td>VALUE-ALIASES</td> + <td>'baz' => 'bar'</td> + <td><em>Optional</em>. Mapping of one value to another; it + should be a comma separated list of key/value pairs. This + is only allowed for string, istring, text and itext TYPEs.</td> + </tr> + <tr> + <td>DEPRECATED-VERSION</td> + <td>3.1.0</td> + <td><em>Not shown</em>. Indicates that the directive was + deprecated in this version.</td> + </tr> + <tr> + <td>DEPRECATED-USE</td> + <td>Test.NewDirective</td> + <td><em>Not shown</em>. Indicates what new directive should be + used instead. Note that the directives will functionally be + different, although they should offer the same functionality. + If they are identical, use an alias instead.</td> + </tr> + <tr> + <td>EXTERNAL</td> + <td>CSSTidy</td> + <td><em>Not shown</em>. Indicates if there is an external library + the user will need to download and install to use this configuration + directive. As of right now, this is merely a Google-able name; future + versions may also provide links and instructions.</td> + </tr> + </tbody> + </table> + + <p> + Some notes on format and style: + </p> + + <ul> + <li> + Each of these keys can be expressed in the short format + (<code>KEY: Value</code>) or the long format + (<code>--KEY--</code> with value beneath). 
You must use the + long format if multiple lines are needed, or if a long format + has been used already (that's why <code>ALIASES</code> in our + example is in the long format); otherwise, it's user preference. + </li> + <li> + The HTML descriptions should be wrapped at about 80 columns; do + not rely on editor word-wrapping. + </li> + </ul> + + <p> + Also, as promised, here is the set of possible types: + </p> + + <table class="table"> + <thead> + <tr> + <th>Type</th> + <th>Example</th> + <th>Description</th> + </tr> + </thead> + <tbody> + <tr> + <td>string</td> + <td>'Foo'</td> + <td><a href="http://docs.php.net/manual/en/language.types.string.php">String</a> without newlines</td> + </tr> + <tr> + <td>istring</td> + <td>'foo'</td> + <td>Case insensitive ASCII string without newlines</td> + </tr> + <tr> + <td>text</td> + <td>"A<em>\n</em>b"</td> + <td>String with newlines</td> + </tr> + <tr> + <td>itext</td> + <td>"a<em>\n</em>b"</td> + <td>Case insensitive ASCII string with newlines</td> + </tr> + <tr> + <td>int</td> + <td>23</td> + <td>Integer</td> + </tr> + <tr> + <td>float</td> + <td>3.0</td> + <td>Floating point number</td> + </tr> + <tr> + <td>bool</td> + <td>true</td> + <td>Boolean</td> + </tr> + <tr> + <td>lookup</td> + <td>array('key' => true)</td> + <td>Lookup array, used with <code>isset($var[$key])</code></td> + </tr> + <tr> + <td>list</td> + <td>array('f', 'b')</td> + <td>List array, with ordered numerical indexes</td> + </tr> + <tr> + <td>hash</td> + <td>array('key' => 'val')</td> + <td>Associative array of keys to values</td> + </tr> + <tr> + <td>mixed</td> + <td>new stdclass</td> + <td>Any PHP variable is fine</td> + </tr> + </tbody> + </table> + + <p> + The examples represent what will be returned 