CFLib.org – Common Function Library Project

popularWords(qQuery, targetCol[, returnCount][, ignoreWords])

Last updated August 12, 2005

Author: C. Hatton Humphrey

Version: 2 | Requires: CF6 | Library: StrLib

Description:
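Builds a query of the most popular words in a query column, along with the number of times each word occurs. Pass the query and the name of the column to count; optionally pass the number of rows to return (default 10) and a comma-delimited stop list of words to ignore. Words are split on spaces and counted case-insensitively, and single-character words are always skipped.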

Return Values:
Returns a query with the columns word and times, ordered by times, highest first.

Example:

<!--- This is just a query to get the initial column --->
<cfquery name="qGetColumn" datasource="xxxxxx">
SELECT Description
FROM Events
</cfquery>

<cfdump var="#popularWords(qGetColumn, "Description", 3)#">
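
The same call can be tried without a datasource. The following is a minimal sketch, assuming popularWords has already been included on the page; the query name qEvents, the variable qTop, and the sample descriptions are made up for illustration.

```cfml
<!--- Build a small query in memory so the example needs no datasource --->
<cfset qEvents = queryNew("Description")>
<cfset queryAddRow(qEvents, 3)>
<cfset querySetCell(qEvents, "Description", "Charity dinner and charity auction", 1)>
<cfset querySetCell(qEvents, "Description", "Auction for the library", 2)>
<cfset querySetCell(qEvents, "Description", "Charity book sale", 3)>

<!--- Ask for the two most frequent words, using the default stop list --->
<cfset qTop = popularWords(qEvents, "Description", 2)>

<!--- The result query exposes the columns word and times --->
<cfoutput query="qTop">#qTop.word#: #qTop.times#<br></cfoutput>
```

With this sample data the two rows should be "charity" with a count of 3 and "auction" with a count of 2, since words that differ only in case share one counter.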

Parameters:

Name | Description | Required
qQuery | The query to inspect. | Yes
targetCol | The column to inspect. | Yes
returnCount | Number of top words to return. Defaults to 10. | No
ignoreWords | Words to ignore. Defaults to: I,me,the,and,if,but,not,as,a,an,for,of,this,on,to,is | No
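
Passing ignoreWords replaces the default stop list rather than adding to it. The sketch below shows one way to extend the default list instead. It reuses the qGetColumn query from the example above; the variable names defaultStops and myStops and the extra words are illustrative assumptions.

```cfml
<!--- Copy of the UDF's default stop list, taken from the parameter table --->
<cfset defaultStops = "I,me,the,and,if,but,not,as,a,an,for,of,this,on,to,is">

<!--- Append a few extra words; listAppend joins them with the comma delimiter --->
<cfset myStops = listAppend(defaultStops, "with,at,by")>

<!--- Return the top 5 words, skipping both the default and the extra words --->
<cfdump var="#popularWords(qGetColumn, "Description", 5, myStops)#">
```

Because ignoreWords is the fourth positional argument, returnCount has to be supplied whenever a custom stop list is passed this way.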

Full UDF Source:

<!---
 Returns the most popular words in a query column and their count.
 Version 2 mods by Raymond Camden
 
 @param qQuery 	 The query to inspect. (Required)
 @param targetCol 	 The column to inspect. (Required)
 @param returnCount 	 Number of top words to return. Defaults to 10. (Optional)
 @param ignoreWords 	 Words to ignore. Defaults to: I,me,the,and,if,but,not,as,a,an,for,of,this,on,to,is (Optional)
 @return Returns a query. 
 @author C. Hatton Humphrey (hat@guardian-web.com) 
 @version 2, August 12, 2005 
--->
<cffunction name="popularWords" returntype="query" output="No">
	<cfargument name="qQuery" type="query" required="true">
	<cfargument name="targetCol" type="string" required="true">
	<cfargument name="returnCount" type="numeric" required="false" default="10">
	<cfargument name="ignoreWords" type="string" required="false" default="I,me,the,and,if,but,not,as,a,an,for,of,this,on,to,is">

	<cfset var thisRow = "">
	<cfset var thisLine = "">
	<cfset var thisWord = "">
	<cfset var wordData = structNew()>
	<cfset var qFinalResults = "">
	
	<!--- Create the intermediate query that holds each word and its count.
	It is filled from wordData below, then sorted and trimmed with a Query of Queries. --->
	<cfset var qResults = queryNew("word,times")>

	<!--- Walk every row of the source query --->
	<cfloop from="1" to="#arguments.qQuery.RecordCount#" index="thisRow">
		<!--- Shorthand for the target column's value in the current row --->
		<cfset thisLine = arguments.qQuery[arguments.targetCol][thisRow]>

		<!--- Split the line on spaces and visit each word --->
		<cfloop list="#thisLine#" delimiters=" " index="thisWord">

			<!--- Skip stop words and any word that is a single character long --->
			<cfif not listFindNoCase(arguments.ignoreWords, thisWord) and len(trim(thisWord)) gt 1>
				<cfif not structKeyExists(wordData, thisWord)>
					<cfset wordData[thisWord] = 0>
				</cfif>
				<cfset wordData[thisWord] = wordData[thisWord] + 1>
			</cfif>

		</cfloop>
	</cfloop>

	<!--- Copy each word and its count from the struct into the intermediate query --->
	<cfloop item="thisWord" collection="#wordData#">
		<cfset queryAddRow(qResults)>
		<cfset querySetCell(qResults, "word", thisWord)>
		<cfset querySetCell(qResults, "times", wordData[thisWord])>
	</cfloop>
	
	<!--- The intermediate query is built; use QoQ to keep the top returnCount words by count --->
	<cfquery name="qFinalResults" dbtype="query" maxrows="#arguments.returnCount#">
	select word, times
	from qresults
	order by times desc
	</cfquery>
	
	<cfreturn qFinalResults>
</cffunction>
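
Because the UDF splits each line on spaces only, punctuation stays attached to words, so "sale" and "sale," are counted separately. One possible workaround, sketched here rather than taken from the UDF itself, is to clean the column before calling it. The query name qClean is hypothetical, and qGetColumn is the query from the example above.

```cfml
<!--- Copy the column into a new query with punctuation replaced by spaces --->
<cfset qClean = queryNew("Description")>
<cfloop query="qGetColumn">
	<cfset queryAddRow(qClean)>
	<!--- [[:punct:]] is the POSIX punctuation class; "all" replaces every match --->
	<cfset querySetCell(qClean, "Description", reReplace(qGetColumn.Description, "[[:punct:]]", " ", "all"))>
</cfloop>

<!--- Count words in the cleaned copy instead of the original query --->
<cfdump var="#popularWords(qClean, "Description", 3)#">
```

Note that this also splits contractions: "don't" becomes "don" and "t", and the single-letter piece is then dropped by the UDF's length check.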