Same name and namespace in other branches
- 4.7.x includes/unicode.inc \drupal_xml_parser_create()
- 5.x includes/unicode.inc \drupal_xml_parser_create()
- 6.x includes/unicode.inc \drupal_xml_parser_create()
- 7.x includes/unicode.inc \drupal_xml_parser_create()
- 8.9.x core/includes/unicode.inc \drupal_xml_parser_create()
Prepare a new XML parser.
This is a wrapper around xml_parser_create() which extracts the encoding from the XML data first and sets the output encoding to UTF-8. This function should be used instead of xml_parser_create(), because PHP's XML parser doesn't check the input encoding itself.
This is also where data in an encoding that PHP's XML parser does not support is converted to UTF-8, and where a leading UTF-8 byte order mark is stripped. Callers should take this into account: $data may have been modified by the call.
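The detection step described above can be sketched in isolation. This is a minimal standalone illustration, not the Drupal source: the function name sketch_detect_xml_encoding() is hypothetical, and preg_match() stands in for the ereg() call used in this version of Drupal. It shows why $data is taken by reference.

```php
<?php
// Hypothetical helper, not part of Drupal: mirrors the detection step only.
function sketch_detect_xml_encoding(&$data) {
  // XML defaults to UTF-8 when nothing says otherwise.
  $encoding = 'utf-8';
  // A leading UTF-8 byte order mark settles the encoding. Strip it so the
  // parser never sees it; this is the in-place change callers must expect.
  if (strncmp($data, "\xEF\xBB\xBF", 3) === 0) {
    $data = substr($data, 3);
    return $encoding;
  }
  // Otherwise look for an encoding declaration in the XML prolog.
  if (preg_match('/^<\?xml[^>]+encoding="([^"]+)"/', $data, $match)) {
    $encoding = $match[1];
  }
  return $encoding;
}

// Sample inputs (made up for illustration).
$with_bom = "\xEF\xBB\xBF<?xml version=\"1.0\"?><feed/>";
$latin = '<?xml version="1.0" encoding="ISO-8859-2"?><feed/>';
$bom_encoding = sketch_detect_xml_encoding($with_bom);
$latin_encoding = sketch_detect_xml_encoding($latin);
```

After the first call, $with_bom no longer starts with the three BOM bytes, so a caller that kept a separate copy of the original string would be parsing different data than the parser was prepared for.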
Parameters
&$data: The XML data which will be parsed later.
Return value
An XML parser object, or 0 if the data uses an unsupported encoding that could not be converted to UTF-8.
1 call to drupal_xml_parser_create()
- aggregator_parse_feed in modules/aggregator.module
File
- includes/common.inc, line 1639 - Common functions that many Drupal modules will need to reference.
Code
function drupal_xml_parser_create(&$data) {
  // Default XML encoding is UTF-8
  $encoding = 'utf-8';
  $bom = false;
  // Check for UTF-8 byte order mark (PHP5's XML parser doesn't handle it).
  if (!strncmp($data, "\xEF\xBB\xBF", 3)) {
    $bom = true;
    $data = substr($data, 3);
  }
  // Check for an encoding declaration in the XML prolog if no BOM was found.
  if (!$bom && ereg('^<\\?xml[^>]+encoding="([^"]+)"', $data, $match)) {
    $encoding = $match[1];
  }
  // Unsupported encodings are converted here into UTF-8.
  $php_supported = array(
    'utf-8',
    'iso-8859-1',
    'us-ascii',
  );
  if (!in_array(strtolower($encoding), $php_supported)) {
    $out = drupal_convert_to_utf8($data, $encoding);
    if ($out !== false) {
      $encoding = 'utf-8';
      $data = ereg_replace('^(<\\?xml[^>]+encoding)="([^"]+)"', '\\1="utf-8"', $out);
    }
    else {
      watchdog('php', t("Could not convert XML encoding '%s' to UTF-8.", array(
        '%s' => theme('placeholder', $encoding),
      )), WATCHDOG_WARNING);
      return 0;
    }
  }
  $xml_parser = xml_parser_create($encoding);
  xml_parser_set_option($xml_parser, XML_OPTION_TARGET_ENCODING, 'utf-8');
  return $xml_parser;
}
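A caller would typically check the return value before parsing, since the function returns 0 on a failed conversion. The following is a rough sketch of that pattern, assuming PHP 5.3 or later with the xml extension. So that it runs outside Drupal, it repeats the wrapper's last two calls directly instead of calling drupal_xml_parser_create(); the sample data and variable names are made up for illustration.

```php
<?php
// Sample feed fragment declared as ISO-8859-1; \xE9 is "é" in that encoding.
$data = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><title>caf\xE9</title>";

// Inside Drupal this would be: $parser = drupal_xml_parser_create($data);
// The two calls below repeat what the wrapper does for a supported encoding.
$parser = xml_parser_create('ISO-8859-1');
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'utf-8');

$text = '';
$ok = 0;
// The wrapper returns 0 when conversion fails, so check before parsing.
if ($parser) {
  // Collect character data; the handler may be called more than once.
  xml_set_character_data_handler($parser, function ($p, $chunk) use (&$text) {
    $text .= $chunk;
  });
  // Third argument true marks this as the final (and only) chunk of data.
  $ok = xml_parse($parser, $data, true);
}
```

Because the target encoding is set to UTF-8, the handler receives the title text as UTF-8 even though the input was declared as ISO-8859-1.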