CFLib.org – Common Function Library Project

popularWords(qQuery, targetCol[, returnCount][, ignoreWords])

Last updated August 12, 2005

Author: C. Hatton Humphrey

Version: 2 | Requires: CF6 | Library: StrLib

Description:
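Builds a query of the most frequently occurring words in a query column, together with the number of times each word occurs. Pass the query and the name of the column to scan; optionally pass the number of rows to return and a comma-delimited stop list of words to ignore. Each row's text is split on spaces, stop-list matching is case-insensitive, and one-letter words are always skipped.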

Return Values:
Returns a query with two columns, word and times, ordered by times descending and limited to returnCount rows.

Example:

<!--- This is just a query to get the initial column --->
<cfquery name="qGetColumn" datasource="xxxxxx">
SELECT Description
FROM Events
</cfquery>

<cfdump var="#popularWords(qGetColumn, "Description", 3)#">
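
To try the function without a datasource, the input query can be built in memory. This is only a sketch: the query name qSample, the variable qTop, and the three rows are made up for illustration and are not part of the library.

```cfml
<!--- Hypothetical sample data; qSample and its rows are illustrative only --->
<cfset qSample = queryNew("Description")>
<cfset queryAddRow(qSample, 3)>
<cfset querySetCell(qSample, "Description", "Annual charity dinner", 1)>
<cfset querySetCell(qSample, "Description", "Charity auction and dinner", 2)>
<cfset querySetCell(qSample, "Description", "Charity fun run", 3)>

<!--- Ask for the two most frequent words; "and" is on the default stop list --->
<cfset qTop = popularWords(qSample, "Description", 2)>

<!--- The result has the columns word and times --->
<cfoutput query="qTop">
    #word#: #times#<br>
</cfoutput>
```

In this sample data, "charity" occurs three times and "dinner" twice, so those are the two rows to expect. Because the text is split on spaces only, a word followed by punctuation (for example "dinner,") would be counted separately from the bare word.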

Parameters:

Name | Description | Required
qQuery | The query to inspect. | Yes
targetCol | The column to inspect. | Yes
returnCount | Number of top words to return. Defaults to 10. | No
ignoreWords | Words to ignore. Defaults to: I,me,the,and,if,but,not,as,a,an,for,of,this,on,to,is | No
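
A value passed for ignoreWords replaces the default list rather than adding to it, since the default applies only when the argument is omitted. The sketch below shows one way to extend the default instead. The variable names defaultStops, extraStops and qTop are hypothetical, the extra words are arbitrary, and qGetColumn is the query from the example above.

```cfml
<!--- Sketch: extend the default stop list instead of replacing it.
      defaultStops is copied from the parameter table; extraStops is illustrative. --->
<cfset defaultStops = "I,me,the,and,if,but,not,as,a,an,for,of,this,on,to,is">
<cfset extraStops = "with,at,by">

<!--- returnCount has to be supplied positionally in order to reach ignoreWords --->
<cfset qTop = popularWords(qGetColumn, "Description", 5, listAppend(defaultStops, extraStops))>

<cfdump var="#qTop#">
```

Named arguments are an alternative when only the stop list needs to change, for example popularWords(qQuery=qGetColumn, targetCol="Description", ignoreWords=defaultStops), which leaves returnCount at its default of 10.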

Full UDF Source:

<!---
 Returns the most popular words in a query column and their count.
 Version 2 mods by Raymond Camden
 
 @param qQuery       The query to inspect. (Required)
 @param targetCol    The column to inspect. (Required)
 @param returnCount  Number of top words to return. Defaults to 10. (Optional)
 @param ignoreWords  Words to ignore. Defaults to: I,me,the,and,if,but,not,as,a,an,for,of,this,on,to,is (Optional)
 @return Returns a query. 
 @author C. Hatton Humphrey (hat@guardian-web.com) 
 @version 2, August 12, 2005 
--->
<cffunction name="popularWords" returntype="query" output="No">
    <cfargument name="qQuery" type="query" required="true">
    <cfargument name="targetCol" type="string" required="true">
    <cfargument name="returnCount" type="numeric" required="false" default="10">
    <cfargument name="ignoreWords" type="string" required="false" default="I,me,the,and,if,but,not,as,a,an,for,of,this,on,to,is">

    <cfset var thisRow = "">
    <cfset var thisLine = "">
    <cfset var thisWord = "">
    <cfset var wordData = structNew()>
    <cfset var qFinalResults = "">
    
    <!--- Create a query to hold the results. QoQ cannot INSERT or UPDATE,
    so rows are added with queryAddRow() and querySetCell() further down --->
    <cfset var qResults = queryNew("word,times")>

    <!--- Begin the looping, go through the query to check --->
    <cfloop from="1" to="#arguments.qQuery.RecordCount#" index="thisRow">
        <!--- Ease of use; set a "nickname" for the current line --->
        <cfset thisLine = arguments.qQuery[arguments.targetCol][thisRow]>

        <!--- Loop through the line, treating it as a space-delimited list --->
        <cfloop list="#thisLine#" delimiters=" " index="thisWord">

            <!--- Skip stop-list words and anything one character or shorter --->
            <cfif not listFindNoCase(arguments.ignoreWords, thisWord) and len(trim(thisWord)) gt 1>
                <cfif not structKeyExists(wordData, thisWord)>
                    <cfset wordData[thisWord] = 0>
                </cfif>
                <cfset wordData[thisWord] = wordData[thisWord] + 1>
            </cfif>

        </cfloop>
    </cfloop>

    <cfloop item="thisWord" collection="#wordData#">
        <cfset queryAddRow(qResults)>
        <cfset querySetCell(qResults, "word", thisWord)>
        <cfset querySetCell(qResults, "times", wordData[thisWord])>
    </cfloop>
    
    <!--- We've built our query; now use QoQ to return the top returnCount rows by count --->
    <cfquery name="qFinalResults" dbtype="query" maxrows="#arguments.returnCount#">
    SELECT word, times
    FROM qResults
    ORDER BY times DESC
    </cfquery>
    
    <cfreturn qFinalResults>
</cffunction>
