Does DNN Lucene search support foreign languages?


This article explains whether the DNN search engine supports languages other than English.





DNN Search is built on Lucene, an open-source search engine library. DNN uses Lucene's indexing and querying engine to store and retrieve searchable content.

At the time of writing (10/8/2020), the search engine uses Lucene version 3.0.3. The available languages are listed as namespaces in the Lucene 3.0.3 documentation.

Lucene supports the following foreign languages:

  • Arabic
  • Brazilian (Portuguese)
  • Chinese
  • Czech
  • German
  • Greek
  • Persian
  • French
  • Dutch
  • Russian
  • Thai

Characters from these languages can be extracted from site content and stored in the Lucene index files.

By default, only English characters are stored in the index. You can, however, enable the characters of one specific foreign language to be indexed in addition to English characters. To do so:

  1. Go to the Persona Bar > Settings > Site Settings > Search > Basic Settings.
  2. Set the Custom Analyzer Type to the desired foreign language from the list above.
  3. Click Save.
  4. Click Re-Index Content to delete the existing index content and rebuild it.

One caveat with this method is that only one custom analyzer type can be set per installation, so only one foreign language can be indexed at a time.
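
To illustrate why the analyzer choice determines which characters end up in the index, here is a minimal toy sketch in Python. It is not Lucene's or DNN's implementation: the functions `ascii_analyzer`, `unicode_analyzer`, `build_index`, and `search` are hypothetical names invented for this example, and the two analyzers are simplified stand-ins for a default English-only analyzer and a language-aware one.

```python
import re
from collections import defaultdict

# Toy analyzers: simplified stand-ins, NOT Lucene's actual analyzers.
def ascii_analyzer(text):
    """Keep only runs of ASCII letters and digits, lowercased."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

def unicode_analyzer(text):
    """Keep runs of any Unicode word characters, lowercased."""
    return [t.lower() for t in re.findall(r"\w+", text)]

def build_index(documents, analyzer):
    """Build an inverted index (token -> set of document ids) with ONE analyzer."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in analyzer(text):
            index[token].add(doc_id)
    return index

def search(index, query, analyzer):
    """Return the ids of documents that contain every token of the query."""
    tokens = analyzer(query)
    if not tokens:
        return set()  # the analyzer discarded the whole query
    return set.intersection(*[index.get(t, set()) for t in tokens])

# Hypothetical sample content: one English page and one Russian page.
documents = {
    1: "Welcome to our site",
    2: "Добро пожаловать на наш сайт",
}

ascii_index = build_index(documents, ascii_analyzer)
unicode_index = build_index(documents, unicode_analyzer)

print(search(ascii_index, "welcome", ascii_analyzer))
print(search(ascii_index, "сайт", ascii_analyzer))
print(search(unicode_index, "сайт", unicode_analyzer))
```

In this toy model, the index built with the ASCII-only analyzer contains no tokens for the Russian page, so a Russian query finds nothing until the content is indexed again with an analyzer that accepts those characters. This mirrors the need to re-index after changing the Custom Analyzer Type, and the fact that `build_index` takes a single analyzer mirrors the one-analyzer-per-installation caveat.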



To test that the search is successfully indexing foreign characters:

  1. Add an HTML module containing foreign-character content to a page.
  2. Go to Settings > Scheduler > Scheduler, edit the Search: Site Crawler task, and click Run Now.
  3. In the search bar, search for the foreign characters you just added.

Text containing those characters should appear in the search results.


