class SearchTextProcessorTest

Same name and namespace in other branches
  1. 11.x core/modules/search/tests/src/Kernel/SearchTextProcessorTest.php \Drupal\Tests\search\Kernel\SearchTextProcessorTest
  2. 10 core/modules/search/tests/src/Kernel/SearchTextProcessorTest.php \Drupal\Tests\search\Kernel\SearchTextProcessorTest
  3. 9 core/modules/search/tests/src/Kernel/SearchTextProcessorTest.php \Drupal\Tests\search\Kernel\SearchTextProcessorTest

Test search text preprocessing functionality.

Attributes

#[Group('search')] #[RunTestsInSeparateProcesses]

Hierarchy

Expanded class hierarchy of SearchTextProcessorTest

File

core/modules/search/tests/src/Kernel/SearchTextProcessorTest.php, line 16

Namespace

Drupal\Tests\search\Kernel
View source
class SearchTextProcessorTest extends KernelTestBase {
  
  /**
   * {@inheritdoc}
   */
  protected static $modules = [
    'search',
  ];
  
  /**
   * Tests that text processing handles Unicode characters correctly.
   */
  public function testSearchTextProcessorUnicode() : void {
    // This test uses a file that was constructed so that the even lines are
    // boundary characters, and the odd lines are valid word characters. (It
    // was generated as a sequence of all the Unicode characters, and then the
    // boundary characters (punctuation, spaces, etc.) were split off into
    // their own lines). So the even-numbered lines should simplify to nothing,
    // and the odd-numbered lines we need to split into shorter chunks and
    // verify that text processing doesn't lose any characters.
    $input = file_get_contents($this->root . '/core/modules/search/tests/UnicodeTest.txt');
    $base_strings = explode(chr(10), $input);
    $strings = [];
    $text_processor = \Drupal::service('search.text_processor');
    assert($text_processor instanceof SearchTextProcessorInterface);
    foreach ($base_strings as $key => $string) {
      if ($key % 2) {
        // Even line - should simplify down to a space.
        $simplified = $text_processor->analyze($string);
        $this->assertSame(' ', $simplified, "Line {$key} is excluded from the index");
      }
      else {
        // Odd line, should be word characters.
        // Split this into 30-character chunks, so we don't run into limits of
        // truncation in
        // \Drupal\search\SearchTextProcessorInterface::analyze().
        $start = 0;
        while ($start < mb_strlen($string)) {
          $new_string = mb_substr($string, $start, 30);
          // Special case: leading zeros are removed from numeric strings,
          // and there's one string in this file that is numbers starting with
          // zero, so prepend a 1 on that string.
          if (preg_match('/^[0-9]+$/', $new_string)) {
            $new_string = '1' . $new_string;
          }
          $strings[] = $new_string;
          $start += 30;
        }
      }
    }
    foreach ($strings as $key => $string) {
      $simplified = $text_processor->analyze($string);
      $this->assertGreaterThanOrEqual(mb_strlen($string), mb_strlen($simplified), "Nothing is removed from string {$key}.");
    }
    // Test the low-numbered ASCII control characters separately. They are not
    // in the text file because they are problematic for diff, especially \0.
    $string = '';
    for ($i = 0; $i < 32; $i++) {
      $string .= chr($i);
    }
    $this->assertSame(' ', $text_processor->analyze($string), 'Search simplify works for ASCII control characters.');
  }
  
  /**
   * Tests that text analysis does the right thing with punctuation.
   */
  public function testSearchTextProcessorPunctuation() : void {
    $cases = [
      [
        '20.03/94-28,876',
        '20039428876',
        'Punctuation removed from numbers',
      ],
      [
        'great...drupal--module',
        'great drupal module',
        'Multiple dot and dashes are word boundaries',
      ],
      [
        'very_great-drupal.module',
        'verygreatdrupalmodule',
        'Single dot, dash, underscore are removed',
      ],
      [
        'regular,punctuation;word',
        'regular punctuation word',
        'Punctuation is a word boundary',
      ],
    ];
    $text_processor = \Drupal::service('search.text_processor');
    assert($text_processor instanceof SearchTextProcessorInterface);
    foreach ($cases as $case) {
      $out = trim($text_processor->analyze($case[0]));
      $this->assertEquals($case[1], $out, $case[2]);
    }
  }

}

Members

Title Sort descending Modifiers Object type Summary Overriden Title Overrides
AssertContentTrait::$content protected property The current raw content.
AssertContentTrait::$drupalSettings protected property The drupalSettings value from the current raw $content.
AssertContentTrait::$elements protected property The XML structure parsed from the current raw $content.
AssertContentTrait::$plainTextContent protected property The plain-text content of raw $content (text nodes).
AssertContentTrait::assertEscaped protected function Passes if the raw text IS found escaped on the loaded page, fail otherwise.
AssertContentTrait::assertField protected function Asserts that a field exists with the given name or ID.
AssertContentTrait::assertFieldByName protected function Asserts that a field exists with the given name and value.
AssertContentTrait::assertFieldByXPath protected function Asserts that a field exists in the current page by the given XPath.
AssertContentTrait::assertFieldsByValue protected function Asserts that a field exists in the current page with a given Xpath result.
AssertContentTrait::assertLink protected function Passes if a link with the specified label is found.
AssertContentTrait::assertLinkByHref protected function Passes if a link containing a given href (part) is found.
AssertContentTrait::assertNoLink protected function Passes if a link with the specified label is not found.
AssertContentTrait::assertNoPattern protected function Triggers a pass if the perl regex pattern is not found in raw content.
AssertContentTrait::assertNoRaw protected function Passes if the raw text is NOT found on the loaded page, fail otherwise.
AssertContentTrait::assertNoText protected function Passes if the page (with HTML stripped) does not contains the text.
AssertContentTrait::assertPattern protected function Triggers a pass if the Perl regex pattern is found in the raw content.
AssertContentTrait::assertRaw protected function Passes if the raw text IS found on the loaded page, fail otherwise.
AssertContentTrait::assertText protected function Passes if the page (with HTML stripped) contains the text.
AssertContentTrait::assertTextHelper protected function Helper for assertText and assertNoText.
AssertContentTrait::assertThemeOutput protected function Asserts themed output.
AssertContentTrait::assertTitle protected function Pass if the page title is the given string.
AssertContentTrait::buildXPathQuery protected function Builds an XPath query.
AssertContentTrait::constructFieldXpath protected function Helper: Constructs an XPath for the given set of attributes and value.
AssertContentTrait::cssSelect protected function Searches elements using a CSS selector in the raw content.
AssertContentTrait::getAllOptions protected function Get all option elements, including nested options, in a select.
AssertContentTrait::getDrupalSettings protected function Gets the value of drupalSettings for the currently-loaded page.
AssertContentTrait::getRawContent protected function Gets the current raw content.
AssertContentTrait::getSelectedItem protected function Get the selected value from a select field.
AssertContentTrait::getTextContent protected function Retrieves the plain-text content from the current raw content.
AssertContentTrait::parse protected function Parse content returned from curlExec using DOM and SimpleXML.
AssertContentTrait::removeWhiteSpace protected function Removes all white-space between HTML tags from the raw content.
AssertContentTrait::setDrupalSettings protected function Sets the value of drupalSettings for the currently-loaded page.
AssertContentTrait::setRawContent protected function Sets the raw content (e.g. HTML).
AssertContentTrait::xpath protected function Performs an xpath search on the contents of the internal browser.
BrowserHtmlDebugTrait::$htmlOutputBaseUrl protected property The Base URI to use for links to the output files.
BrowserHtmlDebugTrait::$htmlOutputClassName protected property Class name for HTML output logging.
BrowserHtmlDebugTrait::$htmlOutputCounter protected property Counter for HTML output logging.
BrowserHtmlDebugTrait::$htmlOutputCounterStorage protected property Counter storage for HTML output logging.
BrowserHtmlDebugTrait::$htmlOutputDirectory protected property Directory name for HTML output logging.
BrowserHtmlDebugTrait::$htmlOutputEnabled protected property HTML output enabled.
BrowserHtmlDebugTrait::$htmlOutputTestId protected property HTML output test ID.
BrowserHtmlDebugTrait::formatHtmlOutputHeaders protected function Formats HTTP headers as string for HTML output logging.
BrowserHtmlDebugTrait::getHtmlOutputHeaders protected function Returns headers in HTML output format. 1
BrowserHtmlDebugTrait::getResponseLogHandler protected function Provides a Guzzle middleware handler to log every response received.
BrowserHtmlDebugTrait::getTestMethodCaller protected function Retrieves the current calling line in the class under test. 1
BrowserHtmlDebugTrait::htmlOutput protected function Logs a HTML output message in a text file.
BrowserHtmlDebugTrait::initBrowserOutputFile protected function Creates the directory to store browser output.
ConfigTestTrait::configImporter protected function Returns a ConfigImporter object to import test configuration.
ConfigTestTrait::copyConfig protected function Copies configuration objects from source storage to target storage.
DrupalTestCaseTrait::$root protected property The Drupal root directory.
DrupalTestCaseTrait::checkErrorHandlerOnTearDown public function Checks the test error handler after test execution. 1
DrupalTestCaseTrait::getDrupalRoot protected static function Returns the Drupal root directory. 1
DrupalTestCaseTrait::setDebugDumpHandler public static function Registers the dumper CLI handler when the DebugDump extension is enabled.
ExtensionListTestTrait::getModulePath protected function Gets the path for the specified module.
ExtensionListTestTrait::getThemePath protected function Gets the path for the specified theme.
HttpKernelUiHelperTrait::$mink protected property Mink session manager.
HttpKernelUiHelperTrait::assertSession public function Returns WebAssert object.
HttpKernelUiHelperTrait::buildUrl protected function Builds a URL from a system path or a URL object.
HttpKernelUiHelperTrait::clickLink protected function Follows a link by complete name.
HttpKernelUiHelperTrait::drupalGet protected function Retrieves a Drupal path.
HttpKernelUiHelperTrait::getDefaultDriverInstance protected function Gets an instance of the default Mink driver.
HttpKernelUiHelperTrait::getNodeElementsByXpath protected function Performs an xpath search on the contents of the internal browser.
HttpKernelUiHelperTrait::getSession public function Returns Mink session.
HttpKernelUiHelperTrait::getUrl protected function Gets the current URL from the browser.
HttpKernelUiHelperTrait::initMink protected function Initializes Mink sessions.
KernelTestBase::$classLoader protected property The class loader.
KernelTestBase::$configImporter protected property The configuration importer.
KernelTestBase::$configSchemaCheckerExclusions protected static property An array of config object names that are excluded from schema checking. 4
KernelTestBase::$container protected property The test container.
KernelTestBase::$databasePrefix protected property The test database prefix.
KernelTestBase::$keyValue protected property The key_value service that must persist between container rebuilds.
KernelTestBase::$siteDirectory protected property The relative path to the test site directory.
KernelTestBase::$strictConfigSchema protected property Set to TRUE to strict check all configuration saved. 9
KernelTestBase::$usesSuperUserAccessPolicy protected property Set to TRUE to make user 1 a super user. 1
KernelTestBase::$vfsRoot protected property The virtual filesystem root directory.
KernelTestBase::assertPostConditions protected function 1
KernelTestBase::bootEnvironment protected function Bootstraps a basic test environment.
KernelTestBase::bootKernel protected function Bootstraps a kernel for a test. 1
KernelTestBase::config protected function Configuration accessor for tests. Returns non-overridden configuration.
KernelTestBase::disableModules protected function Disables modules for this test.
KernelTestBase::enableModules protected function Enables modules for this test. 2
KernelTestBase::getConfigSchemaExclusions protected function Gets the config schema exclusions for this test.
KernelTestBase::getDatabaseConnectionInfo protected function Returns the Database connection info to be used for this test. 3
KernelTestBase::getDatabasePrefix public function Gets the database prefix used for test isolation.
KernelTestBase::getExtensionsForModules private function Returns Extension objects for $modules to install.
KernelTestBase::getModulesToEnable protected static function Returns the modules to install for this test.
KernelTestBase::initFileCache protected function Initializes the FileCache component.
KernelTestBase::installConfig protected function Installs default configuration for a given list of modules.
KernelTestBase::installEntitySchema protected function Installs the storage schema for a specific entity type.
KernelTestBase::installSchema protected function Installs database tables from a module schema definition.
KernelTestBase::register public function Registers test-specific services. Overrides ServiceProviderInterface::register 40
KernelTestBase::render protected function Renders a render array. 1
KernelTestBase::setInstallProfile protected function Sets the install profile and rebuilds the container to update it.
KernelTestBase::setSetting protected function Sets an in-memory Settings variable.
KernelTestBase::setUp protected function 454
KernelTestBase::setUpFilesystem protected function Sets up the filesystem, so things like the file directory. 3
KernelTestBase::tearDown protected function 10
KernelTestBase::tearDownCloseDatabaseConnection public function Additional tear down method to close the connection at the end.
KernelTestBase::vfsDump protected function Dumps the current state of the virtual filesystem to STDOUT.
KernelTestBase::__sleep public function Prevents serializing any properties.
RandomGeneratorTrait::getRandomGenerator protected function Gets the random generator for the utility methods.
RandomGeneratorTrait::randomMachineName protected function Generates a unique random string containing letters and numbers.
RandomGeneratorTrait::randomObject public function Generates a random PHP object.
RandomGeneratorTrait::randomString public function Generates a pseudo-random string of ASCII characters of codes 32 to 126.
SearchTextProcessorTest::$modules protected static property Modules to install. Overrides KernelTestBase::$modules
SearchTextProcessorTest::testSearchTextProcessorPunctuation public function Tests that text analysis does the right thing with punctuation.
SearchTextProcessorTest::testSearchTextProcessorUnicode public function Tests that text processing handles Unicode characters correctly.
StorageCopyTrait::replaceStorageContents protected static function Copy the configuration from one storage to another and remove stale items.

Buggy or inaccurate documentation? Please file an issue. Need support? Need help programming? Connect with the Drupal community.