function FilterHtml::getHTMLRestrictions

Same name in other branches
  1. 9 core/modules/filter/src/Plugin/Filter/FilterHtml.php \Drupal\filter\Plugin\Filter\FilterHtml::getHTMLRestrictions()
  2. 8.9.x core/modules/filter/src/Plugin/Filter/FilterHtml.php \Drupal\filter\Plugin\Filter\FilterHtml::getHTMLRestrictions()
  3. 10 core/modules/filter/src/Plugin/Filter/FilterHtml.php \Drupal\filter\Plugin\Filter\FilterHtml::getHTMLRestrictions()

Overrides FilterBase::getHTMLRestrictions

2 calls to FilterHtml::getHTMLRestrictions()
FilterHtml::filterAttributes in core/modules/filter/src/Plugin/Filter/FilterHtml.php
Provides filtering of tag attributes into accepted HTML.
FilterHtml::process in core/modules/filter/src/Plugin/Filter/FilterHtml.php
Performs the filter processing.

File

core/modules/filter/src/Plugin/Filter/FilterHtml.php, line 245

Class

FilterHtml
Provides a filter to limit allowed HTML tags.

Namespace

Drupal\filter\Plugin\Filter

Code

public function getHTMLRestrictions() {
    if ($this->restrictions) {
        return $this->restrictions;
    }
    // Parse the allowed HTML setting, and gradually make the list of allowed
    // tags more specific.
    $restrictions = [
        'allowed' => [],
    ];
    $html = $this->settings['allowed_html'];
    // Protect any trailing * characters in attribute names, since DomDocument
    // strips them as invalid.
    // cSpell:disable-next-line
    $star_protector = '__zqh6vxfbk3cg__';
    $html = str_replace('*', $star_protector, $html);
    // Use HTML5 parser with a custom tokenizer to correctly parse tags that
    // normally use text mode, such as iframe.
    $events = new DOMTreeBuilder(FALSE, [
        'disable_html_ns' => TRUE,
    ]);
    $scanner = new Scanner('<body>' . $html);
    $parser = new class ($scanner, $events) extends Tokenizer {
        public function setTextMode($textMode, $untilTag = NULL) {
            // Do nothing, we never enter text mode.
        }
    };
    $parser->parse();
    $dom = $events->document();
    $xpath = new \DOMXPath($dom);
    foreach ($xpath->query('//body//*') as $node) {
        $tag = $node->tagName;
        // All attributes are already allowed on this tag. This is the most
        // permissive configuration, so no additional processing is required.
        if (isset($restrictions['allowed'][$tag]) && $restrictions['allowed'][$tag] === TRUE) {
            continue;
        }
        if ($node->hasAttributes()) {
            // If the tag is not yet present, prepare to add attribute restrictions.
            // Otherwise, check if a more restrictive configuration (FALSE, meaning
            // no attributes were allowed) is present: then override the existing
            // value to prepare to add attribute restrictions.
            if (!isset($restrictions['allowed'][$tag]) || $restrictions['allowed'][$tag] === FALSE) {
                $restrictions['allowed'][$tag] = [];
            }
            // Iterate over any attributes, and mark them as allowed.
            foreach ($node->attributes as $name => $attribute) {
                // Only add specific attribute values if all values are not already
                // allowed.
                if (isset($restrictions['allowed'][$tag][$name]) && $restrictions['allowed'][$tag][$name] === TRUE) {
                    continue;
                }
                // Put back any trailing * on wildcard attribute name.
                $name = str_replace($star_protector, '*', $name);
                // Put back any trailing * on wildcard attribute value and parse out
                // the allowed attribute values.
                $allowed_attribute_values = preg_split('/\\s+/', str_replace($star_protector, '*', $attribute->value), -1, PREG_SPLIT_NO_EMPTY);
                // Sanitize the attribute value: it lists the allowed attribute values
                // but one allowed attribute value that some may be tempted to use
                // is specifically nonsensical: the asterisk. A prefix is required for
                // allowed attribute values with a wildcard. A wildcard by itself
                // would mean allowing all possible attribute values. But in that
                // case, one would not specify an attribute value at all.
                $allowed_attribute_values = array_filter($allowed_attribute_values, function ($value) {
                    return $value !== '*';
                });
                if (empty($allowed_attribute_values)) {
                    // If no values remain (the value was the empty string or
                    // only a bare '*'), all values are allowed.
                    $restrictions['allowed'][$tag][$name] = TRUE;
                }
                else {
                    // A non-empty attribute value is assigned, mark each of the
                    // specified attribute values as allowed.
                    foreach ($allowed_attribute_values as $value) {
                        $restrictions['allowed'][$tag][$name][$value] = TRUE;
                    }
                }
            }
        }
        if (empty($restrictions['allowed'][$tag])) {
            // Mark the tag as allowed, but with no attributes allowed.
            $restrictions['allowed'][$tag] = FALSE;
        }
    }
    // The 'style' and 'on*' ('onClick' etc.) attributes are always forbidden,
    // and are removed by Xss::filter().
    // The 'lang' and 'dir' attributes apply to all elements and are always
    // allowed. The list of allowed values for the 'dir' attribute is enforced
    // by self::filterAttributes(). Note that those two attributes are in the
    // short list of globally usable attributes in HTML5. They are always
    // allowed since the correct values of lang and dir may only be known to
    // the content author. The other global attributes are not usually added
    // by hand to content, and the class attribute in particular can have
    // undesired visual effects by allowing content authors to apply any
    // available style, so specific values should be explicitly allowed.
    // @see http://www.w3.org/TR/html5/dom.html#global-attributes
    $restrictions['allowed']['*'] = [
        'style' => FALSE,
        'on*' => FALSE,
        'lang' => TRUE,
        'dir' => [
            'ltr' => TRUE,
            'rtl' => TRUE,
        ],
    ];
    // Save this calculated result for re-use.
    $this->restrictions = $restrictions;
    return $restrictions;
}
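
Example

The following is a minimal sketch of the array shape this method appears to produce, worked out by hand from the loop above rather than captured from a running site. The 'allowed_html' value in the comment and the helper describe_tag_restriction() are hypothetical and are not part of Drupal; the helper only illustrates how the nested TRUE / FALSE / array values can be read.

```php
<?php

// Assumed 'allowed_html' setting (hypothetical):
//   <a href hreflang> <em> <p class="text-align-left text-align-right"> <img src alt data-*>
$restrictions = [
  'allowed' => [
    // Attributes listed without a value: any value is allowed (TRUE).
    'a' => ['href' => TRUE, 'hreflang' => TRUE],
    // Tag listed without attributes: tag allowed, no attributes (FALSE).
    'em' => FALSE,
    // Attribute listed with values: only those values are allowed.
    'p' => ['class' => ['text-align-left' => TRUE, 'text-align-right' => TRUE]],
    // A trailing * in an attribute name survives as a wildcard name.
    'img' => ['src' => TRUE, 'alt' => TRUE, 'data-*' => TRUE],
    // Global entry that the method always appends last.
    '*' => [
      'style' => FALSE,
      'on*' => FALSE,
      'lang' => TRUE,
      'dir' => ['ltr' => TRUE, 'rtl' => TRUE],
    ],
  ],
];

/**
 * Hypothetical helper (not part of Drupal): summarizes one tag's entry.
 */
function describe_tag_restriction(array $restrictions, string $tag): string {
  if (!array_key_exists($tag, $restrictions['allowed'])) {
    return 'tag not allowed';
  }
  $entry = $restrictions['allowed'][$tag];
  if ($entry === TRUE) {
    return 'all attributes allowed';
  }
  if ($entry === FALSE) {
    return 'no attributes allowed';
  }
  $parts = [];
  foreach ($entry as $name => $values) {
    if ($values === FALSE) {
      // Explicitly forbidden attribute, e.g. 'style' in the '*' entry.
      $parts[] = '!' . $name;
    }
    elseif (is_array($values)) {
      // Only the listed values are allowed for this attribute.
      $parts[] = $name . '=' . implode('|', array_keys($values));
    }
    else {
      // TRUE: any value is allowed for this attribute.
      $parts[] = $name;
    }
  }
  return 'attributes: ' . implode(', ', $parts);
}

echo describe_tag_restriction($restrictions, 'p'), "\n";
```

In real code the array would come from calling getHTMLRestrictions() on a configured filter plugin instance rather than from a literal. The actual enforcement of these rules, including wildcard matching, is done by FilterHtml::filterAttributes(), not by a helper like this one.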
