filter.module

Framework for handling filtering of content.

Functions & methods

| Name | Description |
| --- | --- |
| check_output | Run all the enabled filters on a piece of text. |
| filter_access | Returns true if the user is allowed to access this format. |
| filter_admin_add | Add a new input format. |
| filter_admin_configure | Menu callback; display settings defined by filters. |
| filter_admin_delete | Menu callback; confirm deletion of a format. |
| filter_admin_filters | Menu callback; configure the filters for a format. |
| filter_admin_filters_save | Save enabled/disabled status for filters in a format. |
| filter_admin_order | Menu callback; display form for ordering filters for a format. |
| filter_admin_order_save | Save the weights of filters in a format. |
| filter_admin_overview | Menu callback; allows administrators to set up input formats. |
| filter_admin_save | Save input formats on the overview page. |
| filter_filter | Implementation of hook_filter(). Contains a basic set of essential filters. |
| filter_filter_tips | Implementation of hook_filter_tips(). |
| filter_form | Generate a selector for choosing a format in a form. |
| filter_formats | Retrieve a list of input formats. |
| filter_format_allowcache | Check if text in a certain input format is allowed to be cached. |
| filter_help | Implementation of hook_help(). |
| filter_list_all | Build a list of all filters. |
| filter_list_format | Retrieve a list of filters for a certain format. |
| filter_menu | Implementation of hook_menu(). |
| filter_perm | Implementation of hook_perm(). |
| filter_tips_long | Menu callback; show a page with long filter tips. |
| filter_xss | Filters XSS. Based on kses by Ulf Harnhammar, see http://sourceforge.net/projects/kses |
| filter_xss_bad_protocol | Processes an HTML attribute value and ensures it does not contain a URL with a disallowed protocol (e.g. javascript:). |
| theme_filter_tips | Format a set of filter tips. |
| _filter_autop | Convert line breaks into <p> and <br> in an intelligent fashion. Based on: http://photomatt.net/scripts/autop |
| _filter_html | HTML filter. Provides filtering of input into accepted HTML. |
| _filter_html_settings | Settings for the HTML filter. |
| _filter_list_cmp | Helper function for sorting the filter list by filter name. |
| _filter_tips | Helper function for fetching filter tips. |
| _filter_xss_attributes | Processes a string of HTML attributes. |
| _filter_xss_split | Processes an HTML tag. |
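
The flow that check_output applies to a piece of text (normalize newlines, let every enabled filter prepare the text, then let every filter process it, in weight order) can be sketched in standalone PHP. This is only an illustration under stated assumptions: the callbacks, the array layout, and the name sketch_run_filters are hypothetical and not part of filter.module, which dispatches through module_invoke() and also handles access checks and caching.

```php
<?php
// Hypothetical stand-ins for the enabled filters of one input format.
// In filter.module these would be hook_filter() implementations.
$filters = array(
  array(
    'weight'  => 1,
    'prepare' => function ($text) { return $text; },
    // Turn newlines into <br /> tags.
    'process' => function ($text) { return nl2br($text); },
  ),
  array(
    'weight'  => 0,
    'prepare' => function ($text) { return $text; },
    // Escape HTML special characters.
    'process' => function ($text) { return htmlspecialchars($text, ENT_QUOTES, 'UTF-8'); },
  ),
);

// Illustrative helper (assumed name): runs the two passes over the text.
function sketch_run_filters($text, array $filters) {
  // Lighter filters run first, heavier ones "sink" to the bottom.
  usort($filters, function ($a, $b) { return $a['weight'] - $b['weight']; });

  // Convert Windows and old Mac newlines so filters see only "\n".
  $text = str_replace(array("\r\n", "\r"), "\n", $text);

  // First pass: every filter may escape or protect content.
  foreach ($filters as $filter) {
    $text = call_user_func($filter['prepare'], $text);
  }
  // Second pass: every filter performs its actual transformation.
  foreach ($filters as $filter) {
    $text = call_user_func($filter['process'], $text);
  }
  return $text;
}

echo sketch_run_filters("1 < 2\r\nsecond line", $filters);
```

Because the escaping callback has the lower weight, it runs before the line-break callback, so the inserted <br /> tags are not escaped themselves. Swapping the weights would escape them, which is the kind of ordering conflict the "rearrange" tab is meant to resolve.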

Constants

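| Name | Description |
| --- | --- |
| FILTER_FORMAT_DEFAULT | Special format ID meaning "use the default format"; equivalent to not passing an explicit format (value 0). |
| FILTER_HTML_ESCAPE | Value 2. |
| FILTER_HTML_STRIP | Value 1. |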
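
The source listing describes FILTER_FORMAT_DEFAULT as a special ID that stands for "use the default format". A minimal standalone sketch of how such a placeholder might be resolved, and how the resolved ID feeds into a cache key of the form used by check_output, is shown here. The $settings array and the sketch_* function names are assumptions made for illustration; the module itself reads the default through variable_get('filter_default_format', 1).

```php
<?php
define('FILTER_FORMAT_DEFAULT', 0);

// Illustrative helper (assumed name): map the placeholder ID to a real format ID.
// $settings stands in for Drupal's variable storage.
function sketch_resolve_format($format, array $settings) {
  if ($format == FILTER_FORMAT_DEFAULT) {
    // Fall back to format 1 when no default has been configured.
    return isset($settings['filter_default_format']) ? $settings['filter_default_format'] : 1;
  }
  return $format;
}

// Illustrative helper (assumed name): build a cache ID from format and text.
function sketch_cache_id($text, $format, array $settings) {
  return 'filter:' . sketch_resolve_format($format, $settings) . ':' . md5($text);
}

echo sketch_cache_id('hello', FILTER_FORMAT_DEFAULT, array('filter_default_format' => 3));
```

Resolving the placeholder before building the key means text filtered with the default format and text filtered with that same format passed explicitly share one cache entry.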

File

modules/filter.module
  1. /**
  2. * @file
  3. * Framework for handling filtering of content.
  4. */
  5. // This is a special format ID which means "use the default format". This value
  6. // can be passed to the filter APIs as a format ID: this is equivalent to not
  7. // passing an explicit format at all.
  8. define('FILTER_FORMAT_DEFAULT', 0);
  9. define('FILTER_HTML_STRIP', 1);
  10. define('FILTER_HTML_ESCAPE', 2);
  11. /**
  12. * Implementation of hook_help().
  13. */
  14. function filter_help($section) {
  15. switch ($section) {
  16. case 'admin/modules#description':
  17. return t('Handles the filtering of content in preparation for display.');
  18. case 'admin/filters':
  19. return t('
  20. <p><em>Input formats</em> define a way of processing user-supplied text in Drupal. Every input format has its own settings of which <em>filters</em> to apply. Possible filters include stripping out malicious HTML and making URLs clickable.</p>
  21. <p>Users can choose between the available input formats when submitting content.</p>
  22. <p>Below you can configure which input formats are available to which roles, as well as choose a default input format (used for imported content, for example).</p>');
  23. case 'admin/filters/'. arg(2):
  24. return t('
  25. <p>Every <em>filter</em> performs one particular change on the user input, for example stripping out malicious HTML or making URLs clickable. Choose which filters you want to apply to text in this input format.</p>
  26. <p>If you notice some filters are causing conflicts in the output, you can <a href="%order">rearrange them</a>.</p>', array('%order' => check_url(url('admin/filters/'. arg(2) .'/order'))));
  27. case 'admin/filters/'. arg(2) .'/configure':
  28. return t('
  29. <p>If you cannot find the settings for a certain filter, make sure you\'ve enabled it on the <a href="%url">list filters</a> tab first.</p>', array('%url' => check_url(url('admin/filters/'. arg(2) .'/list'))));
  30. case 'admin/filters/'. arg(2) .'/order':
  31. return t('
  32. <p>Because of the flexible filtering system, you might encounter a situation where one filter prevents another from doing its job. For example: a word in an URL gets converted into a glossary term, before the URL can be converted in a clickable link. When this happens, you will need to rearrange the order in which filters get executed.</p>
  33. <p>Filters are executed from top-to-bottom. You can use the weight column to rearrange them: heavier filters \'sink\' to the bottom.</p>');
  34. }
  35. }
  36. /**
  37. * Implementation of hook_filter_tips().
  38. */
  39. function filter_filter_tips($delta, $format, $long = false) {
  40. global $base_url;
  41. switch ($delta) {
  42. case 0:
  43. if (variable_get("filter_html_$format", FILTER_HTML_STRIP) == FILTER_HTML_STRIP) {
  44. if ($allowed_html = variable_get("allowed_html_$format", '<a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd>')) {
  45. switch ($long) {
  46. case 0:
  47. return t('Allowed HTML tags') .': '. check_plain($allowed_html);
  48. case 1:
  49. $output = '<p>'. t('Allowed HTML tags') .': '. check_plain($allowed_html) .'</p>';
  50. if (!variable_get("filter_html_help_$format", 1)) {
  51. return $output;
  52. }
  53. $output .= t('<p>This site allows HTML content. While learning all of HTML may feel intimidating, learning how to use a very small number of the most basic HTML "tags" is very easy. This table provides examples for each tag that is enabled on this site.</p>
  54. <p>For more information see W3C\'s <a href="http://www.w3.org/TR/html/">HTML Specifications</a> or use your favorite search engine to find other sites that explain HTML.</p>');
  55. $tips = array(
  56. 'a' => array( t('Anchors are used to make links to other pages.'), '<a href="'. $base_url .'">'. variable_get('site_name', 'drupal') .'</a>'),
  57. 'br' => array( t('By default line break tags are automatically added, so use this tag to add additional ones. Use of this tag is different because it is not used with an open/close pair like all the others. Use the extra " /" inside the tag to maintain XHTML 1.0 compatibility'), t('Text with <br />line break')),
  58. 'p' => array( t('By default paragraph tags are automatically added, so use this tag to add additional ones.'), '<p>'. t('Paragraph one.') .'</p> <p>'. t('Paragraph two.') .'</p>'),
  59. 'strong' => array( t('Strong'), '<strong>'. t('Strong'). '</strong>'),
  60. 'em' => array( t('Emphasized'), '<em>'. t('Emphasized') .'</em>'),
  61. 'cite' => array( t('Cited'), '<cite>'. t('Cited') .'</cite>'),
  62. 'code' => array( t('Coded text used to show programming source code'), '<code>'. t('Coded') .'</code>'),
  63. 'b' => array( t('Bolded'), '<b>'. t('Bolded') .'</b>'),
  64. 'u' => array( t('Underlined'), '<u>'. t('Underlined') .'</u>'),
  65. 'i' => array( t('Italicized'), '<i>'. t('Italicized') .'</i>'),
  66. 'sup' => array( t('Superscripted'), t('<sup>Super</sup>scripted')),
  67. 'sub' => array( t('Subscripted'), t('<sub>Sub</sub>scripted')),
  68. 'pre' => array( t('Preformatted'), '<pre>'. t('Preformatted') .'</pre>'),
  69. 'blockquote' => array( t('Block quoted'), '<blockquote>'. t('Block quoted') .'</blockquote>'),
  70. 'q' => array( t('Quoted inline'), '<q>'. t('Quoted inline') .'</q>'),
  71. // Assumes and describes tr, td, th.
  72. 'table' => array( t('Table'), '<table> <tr><th>'. t('Table header') .'</th></tr> <tr><td>'. t('Table cell') .'</td></tr> </table>'),
  73. 'tr' => NULL, 'td' => NULL, 'th' => NULL,
  74. 'del' => array( t('Deleted'), '<del>'. t('Deleted') .'</del>'),
  75. 'ins' => array( t('Inserted'), '<ins>'. t('Inserted') .'</ins>'),
  76. // Assumes and describes li.
  77. 'ol' => array( t('Ordered list - use the &lt;li&gt; to begin each list item'), '<ol> <li>'. t('First item') .'</li> <li>'. t('Second item') .'</li> </ol>'),
  78. 'ul' => array( t('Unordered list - use the &lt;li&gt; to begin each list item'), '<ul> <li>'. t('First item') .'</li> <li>'. t('Second item') .'</li> </ul>'),
  79. 'li' => NULL,
  80. // Assumes and describes dt and dd.
  81. 'dl' => array( t('Definition lists are similar to other HTML lists. &lt;dl&gt; begins the definition list, &lt;dt&gt; begins the definition term and &lt;dd&gt; begins the definition description.'), '<dl> <dt>'. t('First term') .'</dt> <dd>'. t('First definition') .'</dd> <dt>'. t('Second term') .'</dt> <dd>'. t('Second definition') .'</dd> </dl>'),
  82. 'dt' => NULL, 'dd' => NULL,
  83. 'h1' => array( t('Header'), '<h1>'. t('Title') .'</h1>'),
  84. 'h2' => array( t('Header'), '<h2>'. t('Subtitle') .'</h2>'),
  85. 'h3' => array( t('Header'), '<h3>'. t('Subtitle three') .'</h3>'),
  86. 'h4' => array( t('Header'), '<h4>'. t('Subtitle four') .'</h4>'),
  87. 'h5' => array( t('Header'), '<h5>'. t('Subtitle five') .'</h5>'),
  88. 'h6' => array( t('Header'), '<h6>'. t('Subtitle six') .'</h6>')
  89. );
  90. $header = array(t('Tag Description'), t('You Type'), t('You Get'));
  91. preg_match_all('/<([a-z0-9]+)[^a-z0-9]/i', $allowed_html, $out);
  92. foreach ($out[1] as $tag) {
  93. if (array_key_exists($tag, $tips)) {
  94. if ($tips[$tag]) {
  95. $rows[] = array(
  96. array('data' => $tips[$tag][0], 'class' => 'description'),
  97. array('data' => '<code>'. check_plain($tips[$tag][1]) .'</code>', 'class' => 'type'),
  98. array('data' => $tips[$tag][1], 'class' => 'get')
  99. );
  100. }
  101. }
  102. else {
  103. $rows[] = array(
  104. array('data' => t('No help provided for tag %tag.', array('%tag' => check_plain($tag))), 'class' => 'description', 'colspan' => 3),
  105. );
  106. }
  107. }
  108. $output .= theme('table', $header, $rows);
  109. $output .= t('<p>Most unusual characters can be directly entered without any problems.</p>
  110. <p>If you do encounter problems, try using HTML character entities. A common example looks like &amp;amp; for an ampersand &amp; character. For a full list of entities see HTML\'s <a href="http://www.w3.org/TR/html4/sgml/entities.html">entities</a> page. Some of the available characters include:</p>');
  111. $entities = array(
  112. array( t('Ampersand'), '&amp;'),
  113. array( t('Greater than'), '&gt;'),
  114. array( t('Less than'), '&lt;'),
  115. array( t('Quotation mark'), '&quot;'),
  116. );
  117. $header = array(t('Character Description'), t('You Type'), t('You Get'));
  118. unset($rows);
  119. foreach ($entities as $entity) {
  120. $rows[] = array(
  121. array('data' => $entity[0], 'class' => 'description'),
  122. array('data' => '<code>'. check_plain($entity[1]) .'</code>', 'class' => 'type'),
  123. array('data' => $entity[1], 'class' => 'get')
  124. );
  125. }
  126. $output .= theme('table', $header, $rows);
  127. return $output;
  128. }
  129. }
  130. }
  131. else {
  132. return t('No HTML tags allowed');
  133. }
  134. break;
  135. case 1:
  136. switch ($long) {
  137. case 0:
  138. return t('You may post PHP code. You should include &lt;?php ?&gt; tags.');
  139. case 1:
  140. return t('
  141. <h4>Using custom PHP code</h4>
  142. <p>If you know how to script in PHP, Drupal gives you the power to embed any script you like. It will be executed when the page is viewed and dynamically embedded into the page. This gives you amazing flexibility and power, but of course with that comes danger and insecurity if you don\'t write good code. If you are not familiar with PHP, SQL or with the site engine, avoid experimenting with PHP because you can corrupt your database or render your site insecure or even unusable! If you don\'t plan to do fancy stuff with your content then you\'re probably better off with straight HTML.</p>
  143. <p>Remember that the code within each PHP item must be valid PHP code - including things like correctly terminating statements with a semicolon. It is highly recommended that you develop your code separately using a simple test script on top of a test database before migrating to your production environment.</p>
  144. <p>Notes:</p><ul><li>You can use global variables, such as configuration parameters, within the scope of your PHP code but remember that global variables which have been given values in your code will retain these values in the engine afterwards.</li><li>register_globals is now set to <strong>off</strong> by default. If you need form information you need to get it from the "superglobals" $_POST, $_GET, etc.</li><li>You can either use the <code>print</code> or <code>return</code> statement to output the actual content for your item.</li></ul>
  145. <p>A basic example:</p>
  146. <blockquote><p>You want to have a box with the title "Welcome" that you use to greet your visitors. The content for this box could be created by going:</p>
  147. <pre>
  148. print t("Welcome visitor, ... welcome message goes here ...");
  149. </pre>
  150. <p>If we are however dealing with a registered user, we can customize the message by using:</p>
  151. <pre>
  152. global $user;
  153. if ($user->uid) {
  154. print t("Welcome $user->name, ... welcome message goes here ...");
  155. }
  156. else {
  157. print t("Welcome visitor, ... welcome message goes here ...");
  158. }
  159. </pre></blockquote>
  160. <p>For more in-depth examples, we recommend that you check the existing Drupal code and use it as a starting point, especially for sidebar boxes.</p>');
  161. }
  162. case 2:
  163. switch ($long) {
  164. case 0:
  165. return t('Lines and paragraphs break automatically.');
  166. case 1:
  167. return t('Lines and paragraphs are automatically recognized. The &lt;br /&gt; line break, &lt;p&gt; paragraph and &lt;/p&gt; close paragraph tags are inserted automatically. If paragraphs are not recognized simply add a couple blank lines.');
  168. break;
  169. }
  170. }
  171. }
  172. /**
  173. * Implementation of hook_menu().
  174. */
  175. function filter_menu($may_cache) {
  176. $items = array();
  177. if ($may_cache) {
  178. $items[] = array('path' => 'admin/filters', 'title' => t('input formats'),
  179. 'callback' => 'filter_admin_overview',
  180. 'access' => user_access('administer filters'));
  181. $items[] = array('path' => 'admin/filters/delete', 'title' => t('delete input format'),
  182. 'callback' => 'filter_admin_delete',
  183. 'type' => MENU_CALLBACK,
  184. 'access' => user_access('administer filters'));
  185. $items[] = array('path' => 'filter/tips', 'title' => t('compose tips'),
  186. 'callback' => 'filter_tips_long', 'access' => TRUE,
  187. 'type' => MENU_SUGGESTED_ITEM);
  188. }
  189. else {
  190. if (arg(0) == 'admin' && arg(1) == 'filters' && is_numeric(arg(2))) {
  191. $formats = filter_formats();
  192. if (isset($formats[arg(2)])) {
  193. $items[] = array('path' => 'admin/filters/'. arg(2), 'title' => t("'%format' input format", array('%format' => $formats[arg(2)]->name)),
  194. 'callback' => 'filter_admin_filters',
  195. 'type' => MENU_CALLBACK,
  196. 'access' => user_access('administer filters'));
  197. $items[] = array('path' => 'admin/filters/'. arg(2) .'/list', 'title' => t('list'),
  198. 'callback' => 'filter_admin_filters',
  199. 'type' => MENU_DEFAULT_LOCAL_TASK,
  200. 'weight' => 0,
  201. 'access' => user_access('administer filters'));
  202. $items[] = array('path' => 'admin/filters/'. arg(2) .'/configure', 'title' => t('configure'),
  203. 'callback' => 'filter_admin_configure',
  204. 'type' => MENU_LOCAL_TASK,
  205. 'weight' => 1,
  206. 'access' => user_access('administer filters'));
  207. $items[] = array('path' => 'admin/filters/'. arg(2) .'/order', 'title' => t('rearrange'),
  208. 'callback' => 'filter_admin_order',
  209. 'type' => MENU_LOCAL_TASK,
  210. 'weight' => 2,
  211. 'access' => user_access('administer filters'));
  212. }
  213. }
  214. }
  215. return $items;
  216. }
  217. /**
  218. * Implementation of hook_perm().
  219. */
  220. function filter_perm() {
  221. return array('administer filters');
  222. }
  223. /**
  224. * Menu callback; allows administrators to set up input formats.
  225. */
  226. function filter_admin_overview() {
  227. // Process form submission
  228. switch ($_POST['op']) {
  229. case t('Save input formats'):
  230. filter_admin_save();
  231. break;
  232. case t('Add input format'):
  233. filter_admin_add();
  234. break;
  235. }
  236. // Overview of all formats.
  237. $formats = filter_formats();
  238. $roles = user_roles();
  239. $error = false;
  240. $header = array(t('Name'), t('Default'));
  241. foreach ($roles as $name) {
  242. $header[] = $name;
  243. }
  244. $header[] = array('data' => t('Operations'), 'colspan' => 2);
  245. $rows = array();
  246. foreach ($formats as $id => $format) {
  247. $row = array();
  248. $default = ($id == variable_get('filter_default_format', 1));
  249. $row[] = form_textfield('', "name][$id", $format->name, 16, 255);
  250. $row[] = form_radio('', 'default', $id, $default);
  251. foreach ($roles as $rid => $name) {
  252. $checked = strstr($format->roles, ",$rid,");
  253. $row[] = form_checkbox('', "roles][$id][$rid", 1, $default || $checked, NULL, $default ? array('disabled' => 'disabled') : NULL);
  254. }
  255. $row[] = l(t('configure'), 'admin/filters/'. $id);
  256. $row[] = $default ? '' : l(t('delete'), 'admin/filters/delete/'. $id);
  257. $rows[] = $row;
  258. }
  259. $group = theme('table', $header, $rows);
  260. $group .= form_submit(t('Save input formats'));
  261. $output = '<h2>'. t('Permissions and settings') . '</h2>' . form($group);
  262. // Form to add a new format.
  263. $group = t("<p>To add a new input format, type its name here. After it has been added, you can configure its options.</p>");
  264. $form = form_textfield(t('Name'), 'name', '', 40, 255);
  265. $form .= form_submit(t('Add input format'));
  266. $group .= form($form);
  267. $output .= '<h2>'. t('Add new input format') .'</h2>'. $group;
  268. print theme('page', $output);
  269. }
  270. /**
  271. * Save input formats on the overview page.
  272. */
  273. function filter_admin_save() {
  274. $edit = $_POST['edit'];
  275. variable_set('filter_default_format', $edit['default']);
  276. foreach ($edit['name'] as $id => $name) {
  277. $name = trim($name);
  278. if (strlen($name) == 0) {
  279. drupal_set_message(t('You must enter a name for this input format.'));
  280. drupal_goto('admin/filters');
  281. }
  282. else {
  283. db_query("UPDATE {filter_formats} SET name='%s' WHERE format = %d", $name, $id);
  284. }
  285. }
  286. // We store the roles as a string for ease of use.
  287. // We use leading and trailing commas to allow easy substring matching.
  288. foreach ($edit['roles'] as $id => $format) {
  289. $roles = ',';
  290. foreach ($format as $rid => $value) {
  291. if ($value) {
  292. $roles .= $rid .',';
  293. }
  294. }
  295. db_query("UPDATE {filter_formats} SET roles = '%s' WHERE format = %d", $roles, $id);
  296. }
  297. drupal_set_message(t('The input format settings have been updated.'));
  298. drupal_goto('admin/filters');
  299. }
  300. /**
  301. * Add a new input format.
  302. */
  303. function filter_admin_add() {
  304. $edit = $_POST['edit'];
  305. $name = trim($edit['name']);
  306. if (strlen($name) == 0) {
  307. drupal_set_message(t('You must enter a name for this input format.'));
  308. drupal_goto('admin/filters');
  309. }
  310. else {
  311. db_query("INSERT INTO {filter_formats} (name) VALUES ('%s')", $name);
  312. }
  313. drupal_set_message(t('Added input format %format.', array('%format' => theme('placeholder', $edit['name']))));
  314. drupal_goto('admin/filters');
  315. }
  316. /**
  317. * Menu callback; confirm deletion of a format.
  318. */
  319. function filter_admin_delete() {
  320. $edit = $_POST['edit'];
  321. if ($edit['confirm']) {
  322. if ($edit['format'] != variable_get('filter_default_format', 1)) {
  323. db_query("DELETE FROM {filter_formats} WHERE format = %d", $edit['format']);
  324. db_query("DELETE FROM {filters} WHERE format = %d", $edit['format']);
  325. $default = variable_get('filter_default_format', 1);
  326. db_query("UPDATE {node} SET format = %d WHERE format = %d", $default, $edit['format']);
  327. db_query("UPDATE {comments} SET format = %d WHERE format = %d", $default, $edit['format']);
  328. db_query("UPDATE {boxes} SET format = %d WHERE format = %d", $default, $edit['format']);
  329. cache_clear_all('filter:'. $edit['format'], true);
  330. drupal_set_message(t('Deleted input format %format.', array('%format' => theme('placeholder', $edit['name']))));
  331. }
  332. drupal_goto('admin/filters');
  333. }
  334. $format = arg(3);
  335. $format = db_fetch_object(db_query('SELECT * FROM {filter_formats} WHERE format = %d', $format));
  336. $extra = form_hidden('format', $format->format);
  337. $extra .= form_hidden('name', $format->name);
  338. $output = theme('confirm',
  339. t('Are you sure you want to delete the input format %format?', array('%format' => theme('placeholder', $format->name))),
  340. 'admin/filters',
  341. t('If you have any content left in this input format, it will be switched to the default input format. This action cannot be undone.'),
  342. t('Delete'),
  343. t('Cancel'),
  344. $extra);
  345. print theme('page', $output);
  346. }
  347. /**
  348. * Menu callback; configure the filters for a format.
  349. */
  350. function filter_admin_filters() {
  351. $format = arg(2);
  352. // Handle saving of the enabled/disabled status of filters.
  353. if ($_POST['op']) {
  354. filter_admin_filters_save($format, $_POST['edit']);
  355. }
  356. $all = filter_list_all();
  357. $enabled = filter_list_format($format);
  358. // Table with filters
  359. $header = array(t('Enabled'), t('Name'), t('Description'));
  360. $rows = array();
  361. foreach ($all as $id => $filter) {
  362. $row = array();
  363. $row[] = form_checkbox('', $id, 1, isset($enabled[$id]));
  364. $row[] = $filter->name;
  365. $row[] = module_invoke($filter->module, 'filter', 'description', $filter->delta);
  366. $rows[] = $row;
  367. }
  368. $form = theme('table', $header, $rows);
  369. if (!empty($rows)) {
  370. $form .= form_submit(t('Save configuration'));
  371. }
  372. $output = '<h2>'. t('Filters') .'</h2>'. form($form);
  373. // Composition tips (guidelines)
  374. $tips = _filter_tips($format, false);
  375. $extra = l(t('More information about formatting options'), 'filter/tips');
  376. $tiplist = theme('filter_tips', $tips, false, $extra);
  377. if (!$tiplist) {
  378. $tiplist = t('<p>No guidelines available.</p>');
  379. }
  380. $group = t('<p>These are the guidelines that users will see for posting in this input format. They are automatically generated from the filter settings.</p>');
  381. $group .= $tiplist;
  382. $output .= '<h2>'. t('Formatting guidelines') .'</h2>'. $group;
  383. print theme('page', $output);
  384. }
  385. /**
  386. * Save enabled/disabled status for filters in a format.
  387. */
  388. function filter_admin_filters_save($format, $toggles) {
  389. $current = filter_list_format($format);
  390. $cache = true;
  391. db_query("DELETE FROM {filters} WHERE format = %d", $format);
  392. foreach ($toggles as $id => $checked) {
  393. if ($checked) {
  394. list($module, $delta) = explode('/', $id);
  395. // Add new filters to the bottom
  396. $weight = isset($current[$id]->weight) ? $current[$id]->weight : 10;
  397. db_query("INSERT INTO {filters} (format, module, delta, weight) VALUES (%d, '%s', %d, %d)", $format, $module, $delta, $weight);
  398. // Check if there are any 'no cache' filters
  399. $cache &= !module_invoke($module, 'filter', 'no cache', $delta);
  400. }
  401. }
  402. // Update the format's 'no cache' flag.
  403. db_query('UPDATE {filter_formats} SET cache = %d WHERE format = %d', (int)$cache, $format);
  404. cache_clear_all('filter:'. $format, true);
  405. drupal_set_message(t('The input format has been updated.'));
  406. drupal_goto('admin/filters/'. arg(2) .'/list');
  407. }
  408. /**
  409. * Menu callback; display form for ordering filters for a format.
  410. */
  411. function filter_admin_order() {
  412. $format = arg(2);
  413. if ($_POST['op']) {
  414. filter_admin_order_save($format, $_POST['edit']);
  415. }
  416. // Get the list of filters for this format, ordered by weight.
  417. $filters = filter_list_format($format);
  418. $header = array(t('Name'), t('Weight'));
  419. $rows = array();
  420. foreach ($filters as $id => $filter) {
  421. $rows[] = array($filter->name, form_weight('', $id, $filter->weight));
  422. }
  423. $form = theme('table', $header, $rows);
  424. $form .= form_submit(t('Save configuration'));
  425. $output = form($form);
  426. print theme('page', $output);
  427. }
  428. /**
  429. * Save the weights of filters in a format.
  430. */
  431. function filter_admin_order_save($format, $weights) {
  432. foreach ($weights as $id => $weight) {
  433. list($module, $delta) = explode('/', $id);
  434. db_query("UPDATE {filters} SET weight = %d WHERE format = %d AND module = '%s' AND delta = %d", $weight, $format, $module, $delta);
  435. }
  436. drupal_set_message(t('The filter weights have been saved.'));
  437. cache_clear_all('filter:'. $format, true);
  438. drupal_goto('admin/filters/'. arg(2) .'/order');
  439. }
  440. /**
  441. * Menu callback; display settings defined by filters.
  442. */
  443. function filter_admin_configure() {
  444. $format = arg(2);
  445. system_settings_save();
  446. $list = filter_list_format($format);
  447. $form = "";
  448. foreach ($list as $filter) {
  449. $form .= module_invoke($filter->module, 'filter', 'settings', $filter->delta, $format);
  450. }
  451. if (trim($form) != '') {
  452. $output = system_settings_form($form);
  453. }
  454. else {
  455. $output = t('No settings are available.');
  456. }
  457. print theme('page', $output);
  458. }
  459. /**
  460. * Retrieve a list of input formats.
  461. */
  462. function filter_formats() {
  463. global $user;
  464. static $formats;
  465. // Administrators can always use all input formats.
  466. $all = user_access('administer filters');
  467. if (!isset($formats)) {
  468. $formats = array();
  469. $query = 'SELECT * FROM {filter_formats}';
  470. // Build query for selecting the format(s) based on the user's roles.
  471. if (!$all) {
  472. $where = array();
  473. foreach ($user->roles as $rid => $role) {
  474. $where[] = "roles LIKE '%%,%d,%%'";
  475. $args[] = $rid;
  476. }
  477. $query .= ' WHERE '. implode(' OR ', $where) . ' OR format = %d';
  478. $args[] = variable_get('filter_default_format', 1);
  479. }
  480. $result = db_query($query, $args);
  481. while ($format = db_fetch_object($result)) {
  482. $formats[$format->format] = $format;
  483. }
  484. }
  485. return $formats;
  486. }
  487. /**
  488. * Build a list of all filters.
  489. */
  490. function filter_list_all() {
  491. $filters = array();
  492. foreach (module_list() as $module) {
  493. $list = module_invoke($module, 'filter', 'list');
  494. if (is_array($list)) {
  495. foreach ($list as $delta => $name) {
  496. $filters[$module .'/'. $delta] = (object)array('module' => $module, 'delta' => $delta, 'name' => $name);
  497. }
  498. }
  499. }
  500. uasort($filters, '_filter_list_cmp');
  501. return $filters;
  502. }
  503. /**
  504. * Helper function for sorting the filter list by filter name.
  505. */
  506. function _filter_list_cmp($a, $b) {
  507. return strcmp($a->name, $b->name);
  508. }
  509. /**
  510. * Check if text in a certain input format is allowed to be cached.
  511. */
  512. function filter_format_allowcache($format) {
  513. static $cache = array();
  514. if (!isset($cache[$format])) {
  515. $cache[$format] = db_result(db_query('SELECT cache FROM {filter_formats} WHERE format = %d', $format));
  516. }
  517. return $cache[$format];
  518. }
  519. /**
  520. * Retrieve a list of filters for a certain format.
  521. */
  522. function filter_list_format($format) {
  523. static $filters = array();
  524. if (!isset($filters[$format])) {
  525. $filters[$format] = array();
  526. $result = db_query("SELECT * FROM {filters} WHERE format = %d ORDER BY weight ASC", $format);
  527. while ($filter = db_fetch_object($result)) {
  528. $list = module_invoke($filter->module, 'filter', 'list');
  529. if (is_array($list) && isset($list[$filter->delta])) {
  530. $filter->name = $list[$filter->delta];
  531. $filters[$format][$filter->module .'/'. $filter->delta] = $filter;
  532. }
  533. }
  534. }
  535. return $filters[$format];
  536. }
  537. /**
  538. * @name Filtering functions
  539. * @{
  540. * Modules which need to have content filtered can use these functions to
  541. * interact with the filter system.
  542. *
  543. * For more info, see the hook_filter() documentation.
  544. *
  545. * Note: because filters can inject JavaScript or execute PHP code, security is
  546. * vital here. When a user supplies a $format, you should validate it with
  547. * filter_access($format) before accepting/using it. This is normally done in
  548. * the validation stage of the node system. You should for example never make a
  549. * preview of content in a disallowed format.
  550. */
  551. /**
  552. * Run all the enabled filters on a piece of text.
  553. *
  554. * You can do a filter_access() check on $format automatically by passing
  555. * $check = TRUE. Note that this will check the permissions of the current user,
  556. * so you should specify $check = FALSE when viewing other people's content.
  557. *
  558. * @param $text
  559. * The text to be filtered.
  560. * @param $format
  561. * The format of the text to be filtered. Specify FILTER_FORMAT_DEFAULT for
  562. * the default format.
  563. * @param $check
  564. * Whether to check the $format with filter_access() first. If set to false,
  565. * make sure the check has been performed at some point earlier.
  566. */
  567. function check_output($text, $format = FILTER_FORMAT_DEFAULT, $check = FALSE) {
  568. // When $check = true, do an access check on $format.
  569. if (isset($text) && (!$check || filter_access($format))) {
  570. if ($format == FILTER_FORMAT_DEFAULT) {
  571. $format = variable_get('filter_default_format', 1);
  572. }
  573. // Check for a cached version of this piece of text.
  574. $id = 'filter:'. $format .':'. md5($text);
  575. if ($cached = cache_get($id)) {
  576. return $cached->data;
  577. }
  578. // See if caching is allowed for this format.
  579. $cache = filter_format_allowcache($format);
  580. // Convert all Windows and Mac newlines to a single newline,
  581. // so filters only need to deal with one possibility.
  582. $text = str_replace(array("\r\n", "\r"), "\n", $text);
  583. // Get a complete list of filters, ordered properly.
  584. $filters = filter_list_format($format);
  585. // Give filters the chance to escape HTML-like data such as code or formulas.
  586. foreach ($filters as $filter) {
  587. $text = module_invoke($filter->module, 'filter', 'prepare', $filter->delta, $format, $text);
  588. }
  589. // Perform filtering.
  590. foreach ($filters as $filter) {
  591. $text = module_invoke($filter->module, 'filter', 'process', $filter->delta, $format, $text);
  592. }
  593. // Store in cache with a minimum expiration time of 1 day.
  594. if ($cache) {
  595. cache_set($id, $text, time() + (60 * 60 * 24));
  596. }
  597. }
  598. else {
  599. $text = message_na();
  600. }
  601. return $text;
  602. }
/**
 * Generate a selector for choosing a format in a form.
 *
 * @param $name
 *   The internal name used to refer to the form element.
 * @param $value
 *   The ID of the format that is currently selected.
 * @return
 *   HTML for the form element.
 */
function filter_form($name = 'format', $value = FILTER_FORMAT_DEFAULT) {
  if ($value == FILTER_FORMAT_DEFAULT) {
    $value = variable_get('filter_default_format', 1);
  }
  $formats = filter_formats();

  $extra = l(t('More information about formatting options'), 'filter/tips');

  if (count($formats) > 1) {
    // Multiple formats available: display radio buttons with tips.
    $output = '';
    foreach ($formats as $format) {
      $tips = _filter_tips($format->format, false);

      // TODO: get support for block-level radios so the <br /> is not output?
      $output .= '<div>';
      $output .= '<label class="option"><input type="radio" class="form-radio" name="edit['. $name .']" value="'. $format->format .'"'. ($format->format == $value ? ' checked="checked"' : '') .' /> '. $format->name .'</label>';
      $output .= theme('filter_tips', $tips);
      $output .= '</div>';
    }
    return theme('form_element', t('Input format'), $output, $extra, NULL, _form_get_error($name));
  }
  else {
    // Only one format available: use a hidden form item and only show tips.
    $format = array_shift($formats);
    $output = form_hidden($name, $format->format);
    $tips = _filter_tips(variable_get('filter_default_format', 1), false);
    $output .= form_item(t('Formatting guidelines'), theme('filter_tips', $tips, false, $extra), $extra);
    return $output;
  }
}

/**
 * Returns true if the user is allowed to access this format.
 */
function filter_access($format) {
  if (user_access('administer filters') || ($format == FILTER_FORMAT_DEFAULT) || ($format == variable_get('filter_default_format', 1))) {
    return true;
  }
  else {
    $formats = filter_formats();
    return isset($formats[$format]);
  }
}

/**
 * @} End of "Filtering functions".
 */

/**
 * Menu callback; show a page with long filter tips.
 */
function filter_tips_long() {
  $format = arg(2);
  if ($format) {
    $output = theme('filter_tips', _filter_tips($format, true), true);
  }
  else {
    $output = theme('filter_tips', _filter_tips(-1, true), true);
  }
  print theme('page', $output);
}

/**
 * Helper function for fetching filter tips.
 */
function _filter_tips($format, $long = false) {
  if ($format == -1) {
    $formats = filter_formats();
  }
  else {
    $formats = array(db_fetch_object(db_query("SELECT * FROM {filter_formats} WHERE format = %d", $format)));
  }

  $tips = array();

  foreach ($formats as $format) {
    $filters = filter_list_format($format->format);

    $tips[$format->name] = array();
    foreach ($filters as $id => $filter) {
      if ($tip = module_invoke($filter->module, 'filter_tips', $filter->delta, $format->format, $long)) {
        $tips[$format->name][] = array('tip' => $tip, 'id' => $id);
      }
    }
  }

  return $tips;
}

/**
 * Format a set of filter tips.
 *
 * @ingroup themeable
 */
function theme_filter_tips($tips, $long = false, $extra = '') {
  $output = '';

  $multiple = count($tips) > 1;
  if ($multiple) {
    $output = t('Input formats') .':';
  }

  if (count($tips)) {
    if ($multiple) {
      $output .= '<ul>';
    }
    foreach ($tips as $name => $tiplist) {
      if ($multiple) {
        $output .= '<li>';
        $output .= '<strong>'. $name .'</strong>:<br />';
      }

      $tips = '';
      foreach ($tiplist as $tip) {
        $tips .= '<li'. ($long ? ' id="filter-'. str_replace("/", "-", $tip['id']) .'">' : '>') . $tip['tip'] . '</li>';
      }

      if ($tips) {
        $output .= "<ul class=\"tips\">$tips</ul>";
      }

      if ($multiple) {
        $output .= '</li>';
      }
    }
    if ($multiple) {
      $output .= '</ul>';
    }
  }

  return $output;
}

/**
 * @name Standard filters
 * @{
 * Filters implemented by the filter.module.
 */

/**
 * Implementation of hook_filter(). Contains a basic set of essential filters.
 * - HTML filter:
 *     Validates user-supplied HTML, transforming it as necessary.
 * - PHP evaluator:
 *     Executes PHP code.
 * - Line break converter:
 *     Converts newlines into paragraph and break tags.
 */
function filter_filter($op, $delta = 0, $format = -1, $text = '') {
  switch ($op) {
    case 'list':
      return array(0 => t('HTML filter'), 1 => t('PHP evaluator'), 2 => t('Line break converter'));

    case 'no cache':
      return $delta == 1; // No caching for the PHP evaluator.

    case 'description':
      switch ($delta) {
        case 0:
          return t('Allows you to restrict if users can post HTML and which tags to filter out.');
        case 1:
          return t('Runs a piece of PHP code. The usage of this filter should be restricted to administrators only!');
        case 2:
          return t('Converts line breaks into HTML (i.e. &lt;br&gt; and &lt;p&gt; tags).');
        default:
          return;
      }

    case 'process':
      switch ($delta) {
        case 0:
          return _filter_html($text, $format);
        case 1:
          return drupal_eval($text);
        case 2:
          return _filter_autop($text);
        default:
          return $text;
      }

    case 'settings':
      switch ($delta) {
        case 0:
          return _filter_html_settings($format);
        default:
          return;
      }

    default:
      return $text;
  }
}

/**
 * Settings for the HTML filter.
 */
function _filter_html_settings($format) {
  $group = form_radios(t('Filter HTML tags'), "filter_html_$format", variable_get("filter_html_$format", FILTER_HTML_STRIP), array(FILTER_HTML_STRIP => t('Strip disallowed tags'), FILTER_HTML_ESCAPE => t('Escape tags')), t('How to deal with HTML tags in user-contributed content. If set to "Strip disallowed tags", dangerous tags are removed (see below). If set to "Escape tags", all HTML is escaped and presented as it was typed.'));
  $group .= form_textfield(t('Allowed HTML tags'), "allowed_html_$format", variable_get("allowed_html_$format", '<a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd>'), 64, 255, t('If "Strip disallowed tags" is selected, optionally specify tags which should not be stripped. Javascript event attributes are always stripped.'));
  $group .= form_checkbox(t('Display HTML help'), "filter_html_help_$format", 1, variable_get("filter_html_help_$format", 1), t('If enabled, Drupal will display some basic HTML help in the long filter tips.'));
  $group .= form_checkbox(t('Spam link deterrent'), "filter_html_nofollow_$format", 1, variable_get("filter_html_nofollow_$format", FALSE), t('If enabled, Drupal will add rel="nofollow" to all links, as a measure to reduce the effectiveness of spam links. Note: this will also prevent valid links from being followed by search engines, therefore it is likely most effective when enabled for anonymous users.'));
  // Assign rather than append: $output has no earlier value in this function.
  $output = form_group(t('HTML filter'), $group);

  return $output;
}

/**
 * HTML filter. Provides filtering of input into accepted HTML.
 */
function _filter_html($text, $format) {
  if (variable_get("filter_html_$format", FILTER_HTML_STRIP) == FILTER_HTML_STRIP) {
    $allowed_tags = preg_split('/\s+|<|>/', variable_get("allowed_html_$format", '<a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd>'), -1, PREG_SPLIT_NO_EMPTY);
    $text = filter_xss($text, $allowed_tags);
  }

  if (variable_get("filter_html_$format", FILTER_HTML_STRIP) == FILTER_HTML_ESCAPE) {
    // Escape HTML
    $text = check_plain($text);
  }

  if (variable_get("filter_html_nofollow_$format", FALSE)) {
    $text = preg_replace('/<a([^>]+)>/i', '<a\\1 rel="nofollow">', $text);
  }

  return trim($text);
}

/**
 * Convert line breaks into <p> and <br> in an intelligent fashion.
 * Based on: http://photomatt.net/scripts/autop
 */
function _filter_autop($text) {
  // All block level tags
  $block = '(?:table|thead|tfoot|caption|colgroup|tbody|tr|td|th|div|dl|dd|dt|ul|ol|li|pre|select|form|blockquote|address|p|h[1-6])';

  // Split at <pre>, <script>, <style> and </pre>, </script>, </style> tags.
  // We don't apply any processing to the contents of these tags to avoid messing
  // up code. We look for matched pairs and allow basic nesting. For example:
  // "processed <pre> ignored <script> ignored </script> ignored </pre> processed"
  $chunks = preg_split('@(</?(?:pre|script|style)[^>]*>)@i', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
  // Note: PHP ensures the array consists of alternating delimiters and literals
  // and begins and ends with a literal (inserting NULL as required).
  $ignore = false;
  $ignoretag = '';
  $output = '';
  foreach ($chunks as $i => $chunk) {
    if ($i % 2) {
      // Opening or closing tag? Square-bracket offsets replace the older
      // $chunk{1} syntax, and preg_split() replaces the removed split().
      $open = ($chunk[1] != '/');
      list($tag) = preg_split('/[ >]/', substr($chunk, 2 - $open), 2);
      if (!$ignore) {
        if ($open) {
          $ignore = true;
          $ignoretag = $tag;
        }
      }
      // Only allow a matching tag to close it.
      else if (!$open && $ignoretag == $tag) {
        $ignore = false;
        $ignoretag = '';
      }
    }
    else if (!$ignore) {
      $chunk = preg_replace('|\n*$|', '', $chunk) ."\n\n"; // just to make things a little easier, pad the end
      $chunk = preg_replace('|<br />\s*<br />|', "\n\n", $chunk);
      $chunk = preg_replace('!(<'. $block .'[^>]*>)!', "\n$1", $chunk); // Space things out a little
      $chunk = preg_replace('!(</'. $block .'>)!', "$1\n\n", $chunk); // Space things out a little
      $chunk = preg_replace("/\n\n+/", "\n\n", $chunk); // take care of duplicates
      $chunk = preg_replace('/\n?(.+?)(?:\n\s*\n|\z)/s', "<p>$1</p>\n", $chunk); // make paragraphs, including one at the end
      $chunk = preg_replace('|<p>\s*?</p>|', '', $chunk); // under certain strange conditions it could create a P of entirely whitespace
      $chunk = preg_replace("|<p>(<li.+?)</p>|", "$1", $chunk); // problem with nested lists
      $chunk = preg_replace('|<p><blockquote([^>]*)>|i', "<blockquote$1><p>", $chunk);
      $chunk = str_replace('</blockquote></p>', '</p></blockquote>', $chunk);
      $chunk = preg_replace('!<p>\s*(</?'. $block .'[^>]*>)!', "$1", $chunk);
      $chunk = preg_replace('!(</?'. $block .'[^>]*>)\s*</p>!', "$1", $chunk);
      $chunk = preg_replace('|(?<!<br />)\s*\n|', "<br />\n", $chunk); // make line breaks
      $chunk = preg_replace('!(</?'. $block .'[^>]*>)\s*<br />!', "$1", $chunk);
      $chunk = preg_replace('!<br />(\s*</?(?:p|li|div|th|pre|td|ul|ol)>)!', '$1', $chunk);
      $chunk = preg_replace('/&([^#])(?![A-Za-z0-9]{0,7};)/', '&amp;$1', $chunk);
    }
    $output .= $chunk;
  }
  return $output;
}

/**
 * Filters XSS. Based on kses by Ulf Harnhammar, see
 * http://sourceforge.net/projects/kses
 *
 * For examples of various XSS attacks, see:
 * http://ha.ckers.org/xss.html
 *
 * This code does four things:
 * - Removes characters and constructs that can trick browsers
 * - Makes sure all HTML entities are well-formed
 * - Makes sure all HTML tags and attributes are well-formed
 * - Makes sure no HTML tags contain URLs with a disallowed protocol (e.g. javascript:)
 *
 * @param $string
 *   The string with raw HTML in it. It will be stripped of everything that can cause
 *   an XSS attack.
 * @param $allowed_tags
 *   An array of allowed tags.
 */
function filter_xss($string, $allowed_tags = array('a', 'em', 'strong', 'cite', 'code', 'ul', 'ol', 'li', 'dl', 'dt', 'dd')) {
  // Store the allowed tags for the _filter_xss_split() callback.
  _filter_xss_split($allowed_tags, TRUE);
  // Remove NUL characters (ignored by some browsers)
  $string = str_replace(chr(0), '', $string);
  // Remove Netscape 4 JS entities
  $string = preg_replace('%&\s*\{[^}]*(\}\s*;?|$)%', '', $string);

  // Defuse all HTML entities
  $string = str_replace('&', '&amp;', $string);
  // Change back only well-formed entities in our whitelist
  // Named entities
  $string = preg_replace('/&amp;([A-Za-z][A-Za-z0-9]*;)/', '&\1', $string);
  // Decimal numeric entities
  $string = preg_replace('/&amp;#0*([0-9]+;)/', '&#\1', $string);
  // Hexadecimal numeric entities
  $string = preg_replace('/&amp;#[Xx]0*((?:[0-9A-Fa-f]{2})+;)/', '&#x\1', $string);

  return preg_replace_callback('%
    (
    <[^>]*.(>|$)  # a string that starts with a <, up until the > or the end of the string
    |             # or
    >             # just a >
    )%x', '_filter_xss_split', $string);
}

/**
 * Processes an HTML tag.
 *
 * @param $m
 *   An array with various meaning depending on the value of $store.
 *   If $store is TRUE then the array contains the allowed tags.
 *   If $store is FALSE then the array has one element, the HTML tag to process.
 * @param $store
 *   Whether to store $m.
 * @return
 *   If the element isn't allowed, an empty string. Otherwise, the cleaned up
 *   version of the HTML element.
 */
function _filter_xss_split($m, $store = FALSE) {
  static $allowed_html;

  if ($store) {
    $allowed_html = array_flip($m);
    return;
  }

  $string = $m[1];

  if (substr($string, 0, 1) != '<') {
    // We matched a lone ">" character
    return '&gt;';
  }

  if (!preg_match('%^<\s*(/\s*)?([a-zA-Z0-9]+)([^>]*)>?$%', $string, $matches)) {
    // Seriously malformed
    return '';
  }

  $slash = trim($matches[1]);
  $elem = &$matches[2];
  $attrlist = &$matches[3];

  if (!isset($allowed_html[strtolower($elem)])) {
    // Disallowed HTML element
    return '';
  }

  if ($slash != '') {
    return "</$elem>";
  }

  // Is there a closing XHTML slash at the end of the attributes?
  // Test $attrlist; the variable $attr is not defined in this function.
  $xhtml_slash = preg_match('%\s/\s*$%', $attrlist) ? '/' : '';

  // Clean up attributes
  $attr2 = implode(' ', _filter_xss_attributes($attrlist));
  $attr2 = preg_replace('/[<>]/', '', $attr2);

  return "<$elem $attr2$xhtml_slash>";
}

/**
 * Processes a string of HTML attributes.
 *
 * @return
 *   Cleaned up version of the HTML attributes.
 */
function _filter_xss_attributes($attr) {
  $attrarr = array();
  $mode = 0;
  $attrname = '';
  $skip = FALSE;

  while (strlen($attr) != 0) {
    // Was the last operation successful?
    $working = 0;

    switch ($mode) {
      case 0:
        // Attribute name, href for instance
        if (preg_match('/^([-a-zA-Z]+)/', $attr, $match)) {
          $attrname = strtolower($match[1]);
          $skip = ($attrname == 'style' || substr($attrname, 0, 2) == 'on');
          $working = $mode = 1;
          $attr = preg_replace('/^[-a-zA-Z]+/', '', $attr);
        }

        break;

      case 1:
        // Equals sign or valueless ("selected")
        if (preg_match('/^\s*=\s*/', $attr)) {
          $working = 1; $mode = 2;
          $attr = preg_replace('/^\s*=\s*/', '', $attr);
          break;
        }

        if (preg_match('/^\s+/', $attr)) {
          $working = 1; $mode = 0;
          if (!$skip) {
            $attrarr[] = $attrname;
          }
          $attr = preg_replace('/^\s+/', '', $attr);
        }

        break;

      case 2:
        // Attribute value, a URL after href= for instance
        if (preg_match('/^"([^"]*)"(\s+|$)/', $attr, $match)) {
          $thisval = filter_xss_bad_protocol($match[1]);

          if (!$skip) {
            $attrarr[] = "$attrname=\"$thisval\"";
          }
          $working = 1;
          $mode = 0;
          $attr = preg_replace('/^"[^"]*"(\s+|$)/', '', $attr);
          break;
        }

        if (preg_match("/^'([^']*)'(\s+|$)/", $attr, $match)) {
          $thisval = filter_xss_bad_protocol($match[1]);

          if (!$skip) {
            $attrarr[] = "$attrname='$thisval'";
          }
          $working = 1; $mode = 0;
          $attr = preg_replace("/^'[^']*'(\s+|$)/", '', $attr);
          break;
        }

        if (preg_match("%^([^\s\"']+)(\s+|$)%", $attr, $match)) {
          $thisval = filter_xss_bad_protocol($match[1]);

          if (!$skip) {
            $attrarr[] = "$attrname=\"$thisval\"";
          }
          $working = 1; $mode = 0;
          $attr = preg_replace("%^[^\s\"']+(\s+|$)%", '', $attr);
        }

        break;
    }

    if ($working == 0) {
      // not well formed, remove and try again
      $attr = preg_replace('/
        ^
        (
        "[^"]*("|$)     # - a string that starts with a double quote, up until the next double quote or the end of the string
        |               # or
        \'[^\']*(\'|$)  # - a string that starts with a quote, up until the next quote or the end of the string
        |               # or
        \S              # - a non-whitespace character
        )*              # any number of the above three
        \s*             # any number of whitespaces
        /x', '', $attr);
      $mode = 0;
    }
  }

  // The attribute list ends with a valueless attribute like "selected".
  // Honor $skip here too, so a trailing "style" or "on*" name is not emitted.
  if ($mode == 1 && !$skip) {
    $attrarr[] = $attrname;
  }
  return $attrarr;
}

/**
 * Processes an HTML attribute value and ensures it does not contain a URL
 * with a disallowed protocol (e.g. javascript:)
 *
 * @param $string
 *   The string with the attribute value.
 * @param $decode
 *   Whether to decode entities in the $string. Set to FALSE if the $string
 *   is in plain text, TRUE otherwise. Defaults to TRUE.
 * @return
 *   Cleaned up and HTML-escaped version of $string.
 */
function filter_xss_bad_protocol($string, $decode = TRUE) {
  static $allowed_protocols;
  if (!isset($allowed_protocols)) {
    $allowed_protocols = array_flip(variable_get('filter_allowed_protocols', array('http', 'https', 'ftp', 'news', 'nntp', 'telnet', 'mailto', 'irc', 'ssh', 'sftp', 'webcal')));
  }

  // Get the plain text representation of the attribute value (i.e. its meaning)
  if ($decode) {
    $string = decode_entities($string);
  }

  // Iteratively remove any invalid protocol found.
  do {
    $before = $string;
    $colonpos = strpos($string, ':');
    if ($colonpos > 0) {
      // We found a colon, possibly a protocol. Verify.
      $protocol = substr($string, 0, $colonpos);
      // If a colon is preceded by a slash, question mark or hash, it cannot
      // possibly be part of the URL scheme. This must be a relative URL,
      // which inherits the (safe) protocol of the base document.
      if (preg_match('![/?#]!', $protocol)) {
        break;
      }
      // Check if this is a disallowed protocol
      if (!isset($allowed_protocols[$protocol])) {
        $string = substr($string, $colonpos + 1);
      }
    }
  } while ($before != $string);

  return check_plain($string);
}

/**
 * @} End of "Standard filters".
 */
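
The protocol check in filter_xss_bad_protocol() strips disallowed schemes repeatedly until the string stops changing, so nested prefixes such as "javascript:javascript:" cannot slip through. The following is a minimal standalone sketch of that loop, not the module's code: the function name sketch_strip_bad_protocol() is hypothetical, the allowed list is passed in instead of being read with variable_get(), entity decoding is left out, and htmlspecialchars() stands in for check_plain().

```php
<?php
// Hypothetical standalone helper; it only illustrates the loop used by
// filter_xss_bad_protocol() and does not depend on Drupal.
function sketch_strip_bad_protocol($string, $allowed = array('http', 'https', 'ftp', 'mailto')) {
  $allowed = array_flip($allowed);
  do {
    $before = $string;
    $colonpos = strpos($string, ':');
    if ($colonpos > 0) {
      $protocol = substr($string, 0, $colonpos);
      // A slash, question mark or hash before the colon means this is a
      // relative URL, so the colon is not a scheme separator.
      if (preg_match('![/?#]!', $protocol)) {
        break;
      }
      // Drop a disallowed scheme, then re-check whatever remains.
      if (!isset($allowed[$protocol])) {
        $string = substr($string, $colonpos + 1);
      }
    }
  } while ($before !== $string);
  // Assumption: htmlspecialchars() replaces Drupal's check_plain() here.
  return htmlspecialchars($string, ENT_QUOTES);
}

echo sketch_strip_bad_protocol('javascript:javascript:alert(1)'), "\n";
echo sketch_strip_bad_protocol('http://example.com/'), "\n";
echo sketch_strip_bad_protocol('/node?x=a:b'), "\n";
```

Looping until a fixed point is the design choice worth noting: a single pass would turn "javascript:javascript:x" into "javascript:x", which is still dangerous.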
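
filter_xss() handles entities in two steps: it first escapes every ampersand, then restores only the entities that match a well-formed named, decimal or hexadecimal pattern. The sketch below isolates those four statements in a hypothetical function, sketch_defuse_entities(), so the effect can be tried outside Drupal; the regular expressions are copied from the listing, and the sample input is made up for illustration.

```php
<?php
// Hypothetical wrapper around the entity-handling statements of filter_xss().
function sketch_defuse_entities($string) {
  // Defuse all HTML entities by escaping every ampersand.
  $string = str_replace('&', '&amp;', $string);
  // Change back only well-formed named entities.
  $string = preg_replace('/&amp;([A-Za-z][A-Za-z0-9]*;)/', '&\1', $string);
  // Change back decimal numeric entities, dropping leading zeros.
  $string = preg_replace('/&amp;#0*([0-9]+;)/', '&#\1', $string);
  // Change back hexadecimal numeric entities, dropping leading zeros.
  $string = preg_replace('/&amp;#[Xx]0*((?:[0-9A-Fa-f]{2})+;)/', '&#x\1', $string);
  return $string;
}

// A bare ampersand and an unterminated entity stay escaped, while the
// well-formed entities are restored.
echo sketch_defuse_entities('AT&T &amp; &#0065; &bogus'), "\n";
```

Escaping everything first and then restoring a whitelist means anything the patterns do not recognize stays harmless by default.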