drupal_xml_parser_create

Definition

drupal_xml_parser_create(&$data)
includes/common.inc, line 1639

Description

Prepare a new XML parser.

This is a wrapper around xml_parser_create() which extracts the encoding from the XML data first and sets the output encoding to UTF-8. This function should be used instead of xml_parser_create(), because PHP's XML parser doesn't check the input encoding itself.

This is also where unsupported encodings will be converted. Callers should take this into account: $data might have been changed after the call.
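
The detection step described above can be sketched in isolation. This is a minimal illustration, not the Drupal implementation: the helper name example_detect_xml_encoding() is hypothetical, it uses preg_match() in place of the ereg() call in the listing under Code, and it lowercases the detected name for convenience. It shows how a BOM or a prolog declaration determines the encoding, and why the caller's $data can differ after the call.

```php
<?php
/**
 * Hypothetical helper (not part of Drupal): strips a UTF-8 byte order mark
 * from $data and returns the encoding to assume for it, in lowercase.
 */
function example_detect_xml_encoding(&$data) {
  // XML defaults to UTF-8 when nothing else is declared.
  $encoding = 'utf-8';

  // A UTF-8 byte order mark settles the encoding; remove it from the data.
  if (strncmp($data, "\xEF\xBB\xBF", 3) === 0) {
    $data = substr($data, 3);
    return $encoding;
  }

  // Otherwise look for an encoding declaration in the XML prolog.
  if (preg_match('/^<\?xml[^>]+encoding="([^"]+)"/', $data, $match)) {
    $encoding = strtolower($match[1]);
  }
  return $encoding;
}

// The caller's variable is modified because it is passed by reference:
// after this call, $xml no longer starts with the three BOM bytes.
$xml = "\xEF\xBB\xBF<?xml version=\"1.0\"?><feed/>";
$encoding = example_detect_xml_encoding($xml);
```

Because the data is changed in place, a caller should hand the same variable to the parser afterwards rather than a copy made before the call.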

Parameters

&$data The XML data which will be parsed later.

Return value

An XML parser object, or 0 if the data uses an unsupported encoding that could not be converted to UTF-8.

Related topics

Name              Description
Input validation  Functions to validate user input.

Code

<?php
function drupal_xml_parser_create(&$data) {
  // Default XML encoding is UTF-8
  $encoding = 'utf-8';
  $bom = false;

  // Check for UTF-8 byte order mark (PHP5's XML parser doesn't handle it).
  if (!strncmp($data, "\xEF\xBB\xBF", 3)) {
    $bom = true;
    $data = substr($data, 3);
  }

  // Check for an encoding declaration in the XML prolog if no BOM was found.
  if (!$bom && ereg('^<\?xml[^>]+encoding="([^"]+)"', $data, $match)) {
    $encoding = $match[1];
  }

  // Unsupported encodings are converted here into UTF-8.
  $php_supported = array('utf-8', 'iso-8859-1', 'us-ascii');
  if (!in_array(strtolower($encoding), $php_supported)) {
    $out = drupal_convert_to_utf8($data, $encoding);
    if ($out !== false) {
      $encoding = 'utf-8';
      $data = ereg_replace('^(<\?xml[^>]+encoding)="([^"]+)"', '\\1="utf-8"', $out);
    }
    else {
      watchdog('php', t("Could not convert XML encoding '%s' to UTF-8.", array('%s' => theme('placeholder', $encoding))), WATCHDOG_WARNING);
      return 0;
    }
  }

  $xml_parser = xml_parser_create($encoding);
  xml_parser_set_option($xml_parser, XML_OPTION_TARGET_ENCODING, 'utf-8');
  return $xml_parser;
}
?>
 
 
