CFLib.org – Common Function Library Project

solrClean(input)

Last updated December 30, 2011

Version: 2 | Requires: ColdFusion 9 | Library: UtilityLib

 
Rated 1 time(s). Average Rating: 5.0

Description:
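Like VerityClean, massages text input to make it Solr compatible: it replaces commas with " or ", strips special characters, collapses runs of whitespace, removes leading wildcard characters, and normalizes keyword case. NOTE: Requires the uCaseWordsForSolr UDF.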

Return Values:
Returns a string.

Example:

<cfset cleanSolrSearchText = solrClean(userSearchText) />
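
The cleaned string would typically be passed on as search criteria. A minimal sketch, assuming a Solr collection named "siteContent" already exists and that the raw text arrives in a form field named q (both names are hypothetical and not part of the UDF):

```cfml
<!--- Hypothetical: form.q holds the raw text from the search box --->
<cfparam name="form.q" default="" />
<cfset userSearchText = form.q />
<cfset cleanSolrSearchText = solrClean(userSearchText) />

<!--- Only search when something is left after cleaning --->
<cfif len(cleanSolrSearchText)>
    <!--- "siteContent" is an assumed collection name --->
    <cfsearch collection="siteContent" criteria="#cleanSolrSearchText#" name="searchResults" />
    <cfoutput>#searchResults.recordCount# result(s) found.</cfoutput>
</cfif>
```

Checking the length first avoids sending an empty criteria string when the input consisted only of stripped characters.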

Parameters:

Name    Description             Required
input   String to run against   Yes

Full UDF Source:

<!---
 Like VerityClean, massages text input to make it Solr compatible.
 v2 by Daria Norris to deal with wildcard characters used as the first letter of the search
 
 @param input      String to run against (Required)
 @return Returns a string. 
 @author Sami Hoda (sami@bytestopshere.com) 
 @version 2, December 30, 2011 
--->

<cffunction name="solrClean" access="public" output="false" returntype="string" >
    <cfargument name="input" type="string" default="" required="true" hint="String to run against" />

    <cfset var cleanText = trim(arguments.input) />

    <!--- // List of special characters to remove --->
    <cfset var reBadChars = "\\|@|'|<|>|\(|\)|!|=|\[|\]|\{|\}|\#chr(44)#|`" />


    <cfscript>
    //=-=-=-=-=-=-=-=-
    // Replace comma with OR
    //=-=-=-=-=-=-=-=-
    cleanText = replace(cleanText, "," , " or " , "all");

    //=-=-=-=-=-=-=-=-
    // Strip double spaces
    //=-=-=-=-=-=-=-=-
    cleanText = reReplace(cleanText," {2,}"," ","all");

    //=-=-=-=-=-=-=-=-=-
    // Strip bad characters
    //=-=-=-=-=-=-=-=-=
    cleanText = reReplace(cleanText,reBadChars," ","all");

    //=-=-=-=-=-=-=-=-
    // Clean up sequences of space characters
    //=-=-=-=-=-=-=-=-
    cleanText = reReplace(cleanText,"[[:space:]]+"," ","all");

    // Remove wildcard characters (* and ?) from the start of the search text
    cleanText = reReplace(cleanText,'(^[\*\?]{1,})','');

    //=-=-=-=-=-=-=-=-=-
    // uCaseWords: uppercase operator keywords (and -> AND, etc.) and lowercase the rest.
    // If a keyword is mixed case, Solr treats it as case-sensitive!
    //=-=-=-=-=-=-=-=-=
    cleanText = uCaseWordsForSolr(cleanText);
    
    </cfscript>

    <cfreturn trim(cleanText) />
</cffunction>
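
solrClean() calls uCaseWordsForSolr(), which is a separate UDF and is not reproduced on this page. For trying out solrClean() in isolation, a stand-in along the following lines could be used. It is only a sketch of the behaviour described in the source comment (operator keywords uppercased, everything else lowercased), not the actual CFLib implementation, and the operator list is an assumption:

```cfml
<!--- Stand-in sketch only; NOT the real uCaseWordsForSolr UDF from CFLib --->
<cffunction name="uCaseWordsForSolr" access="public" output="false" returntype="string">
    <cfargument name="input" type="string" required="true" hint="Space-delimited search text" />

    <!--- Assumed operator list; the real UDF may handle more keywords --->
    <cfset var operators = "and,or,not" />
    <!--- Lowercase everything first, then split on spaces --->
    <cfset var words = listToArray(lCase(arguments.input), " ") />
    <cfset var i = 0 />

    <!--- Uppercase only the words that match an operator exactly --->
    <cfloop from="1" to="#arrayLen(words)#" index="i">
        <cfif listFind(operators, words[i])>
            <cfset words[i] = uCase(words[i]) />
        </cfif>
    </cfloop>

    <cfreturn arrayToList(words, " ") />
</cffunction>
```

Because the words are matched whole, terms such as "android" or "north" are left in lowercase rather than being partially uppercased.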

Created by Raymond Camden / Design by Justin Johnson