Regex Pattern to Exclude Search Results

Overview

You can optimize the search index by marking generated pages as duplicates if they differ only in their dynamic content. The crawler indexes only one page from each set of pages that match a specified regex pattern.
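As a rough illustration of the idea, the sketch below keeps one URL from each set of pages that match a pattern. It is not the crawler's actual implementation. The function name, the sample URLs, the `date` parameter, and the rule that two matching URLs belong to the same set when they are identical once the matched part is removed are all assumptions made for this example.

```python
import re


def keep_one_per_duplicate_set(urls, pattern):
    """Return the URLs that would remain after dropping duplicates (illustrative model only)."""
    regex = re.compile(pattern)
    seen_keys = set()
    kept = []
    for url in urls:
        if regex.search(url) is None:
            # The pattern does not cover this URL, so it is always kept.
            kept.append(url)
            continue
        # Assumption: matching URLs are duplicates of each other when they
        # are identical after the matched (dynamic) part is removed.
        key = regex.sub("", url)
        if key not in seen_keys:
            seen_keys.add(key)
            kept.append(url)
    return kept


# Hypothetical URLs for a module that renders one page per date.
sample_urls = [
    "https://example.com/events?date=2024-01-01",
    "https://example.com/events?date=2024-01-02",
    "https://example.com/about",
]

kept = keep_one_per_duplicate_set(sample_urls, r"[?&]date=[^&]*")
for url in kept:
    print(url)
```

In this model, only the first of the two events URLs is kept, together with the unrelated page.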

 

Prerequisites

 

Solution

  1. Go to Persona Bar > Settings > Site Settings.

  2. Go to the Search tab, and then the Crawling subtab.
  3. In Duplicates, click/tap + Add Regex Pattern.

  4. Define the regex pattern for duplicates.


    Field            Description
    Description      A name for the regex pattern entry. Typically, the name of the module that generates the dynamic content in pages that match the specified pattern.
    Regex Pattern    The regular expression that describes the various URLs and URL parameters of pages that are considered duplicates.
  5. Save.
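Before saving a pattern, it can help to try it against a few URLs from your site. The following is a minimal sketch of such a check. The pattern, the `page` parameter, and the sample URLs are hypothetical, and the site may evaluate the pattern with a different regex engine than Python's, so keep to constructs that both engines are likely to share.

```python
import re


def matching_urls(pattern, urls):
    """Return the URLs in which the pattern is found."""
    regex = re.compile(pattern)
    return [url for url in urls if regex.search(url)]


# Hypothetical pattern for a paged list: the "page" query parameter
# followed by one or more digits.
candidate_pattern = r"[?&]page=\d+"

# Hypothetical URLs: two that should be treated as duplicates and two
# that should not.
sample_urls = [
    "https://example.com/news?page=1",
    "https://example.com/news?page=27",
    "https://example.com/news",
    "https://example.com/contact?pageview=full",
]

matched = matching_urls(candidate_pattern, sample_urls)
for url in sample_urls:
    label = "duplicate" if url in matched else "unique"
    print(f"{label:10}{url}")
```

Including URLs that should not match, such as the `pageview` one here, helps catch a pattern that is broader than intended.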

Testing

To test that the regex pattern has successfully excluded the URL:

  1. Go to the site's Search Results page.
  2. Search for the keyword that you've excluded in your regular expression, and check that the excluded URL does not appear in the results.

 
