You can optimize the search index by marking generated pages as duplicates if they differ only in dynamic content. The crawler indexes only one page from each set of pages that match a specified regex pattern.
- Access to a SuperUser account
- Go to .
- Go to the Search tab, and then the Crawling subtab.
- In Duplicates, click/tap + Add Regex Pattern.
- Define the regex pattern for duplicates.
| Field | Description |
| --- | --- |
| Description | A name for the regex pattern entry. Typically, the name of the module that generates the dynamic content in pages that match the specified pattern. |
| Regex Pattern | The regular expression that describes the various URLs and URL parameters of pages that are considered duplicates. |
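
Before saving a pattern, it can help to try it against a few sample URLs. The following is a minimal sketch in Python, not part of the product: the `example.com` URLs, the pattern itself, and the `split_duplicates` helper are hypothetical, and the crawler's regex flavor and matching rules may differ from Python's `re` module.

```python
import re

# Hypothetical pattern: event-calendar pages that differ only in their
# query string (the dynamic part of the URL).
DUPLICATE_PATTERN = re.compile(
    r"^https?://example\.com/events(\?.*)?$", re.IGNORECASE
)

# Sample URLs (assumed for illustration only).
urls = [
    "https://example.com/events",
    "https://example.com/events?month=1&year=2024",
    "https://example.com/events?month=2&year=2024",
    "https://example.com/about",
]


def split_duplicates(urls, pattern):
    """Return (matching, non_matching) URL lists for a candidate pattern."""
    matching = [u for u in urls if pattern.search(u)]
    others = [u for u in urls if not pattern.search(u)]
    return matching, others


matching, others = split_duplicates(urls, DUPLICATE_PATTERN)
print("Matched as one duplicate set:", matching)
print("Not matched by the pattern:", others)
```

If the pattern is too broad, pages with distinct content end up in the matched list, so it is worth checking a few URLs that should not match as well.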
To test that the regex pattern has successfully excluded the duplicate URLs:
- Go to the site's Search Results page.
- Search for a keyword from the pages that match your regular expression. Only one page from each set of duplicates should appear in the results.