Little Girl's Mostly Linux Blog

Word frequency scripts

This page was last updated on May 16, 2016.

About


Since there aren’t a whole lot of word frequency programs easily found for GNU/Linux, perhaps some scripts would be useful for some of you out there. Below are a couple of Bash scripts and a Python script, each of which can take pretty much any text as input and will display a list of all the words (and numbers, which are treated as words) in the text, and how many times each was used.

The Bash scripts were written by Frank Pirrone. The Python script was written by me, building on a Python Tkinter script Frank and I collaborated on a while back. The Bash Yad script has been tested on Linux Mint Cinnamon. The Bash Zenity script and the Python script have each been tested on Linux Mint Cinnamon and Linux Mint MATE.

Please excuse the purple title bars and the magenta scrollbar in some of the screenshots. When you run the scripts, your system’s own title bar and scrollbar colors will be used instead.

Similarities between the scripts

  • All three scripts launch a GUI and let you browse to a file or paste the input in.
  • All three scripts are fully commented to let you know what each line of code does.
  • All three scripts offer the option to save a copy of the result to a file in the location you specify.
  • All three scripts allow you to choose whether to display case sensitive or case insensitive results.
  • All three scripts display a word count in addition to a word frequency count.
  • All three scripts offer the option to sort the results numerically.
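For anyone curious how the case and sort options fit together, here is a minimal Python 3 sketch of the idea. It is not the code from any of the three scripts below. The function name word_frequencies, its parameters, the punctuation it trims, and the tie-breaking in the numeric sort are all assumptions made for illustration.

```python
from collections import Counter


def word_frequencies(text, case_insensitive=True, numeric=False):
    """Return (total, rows), where rows is a list of (count, word) pairs.

    Hypothetical helper for illustration only, not part of the scripts.
    """
    if case_insensitive:
        # Fold everything to lower case so "The" and "the" count together.
        text = text.lower()
    # Split on whitespace, then trim some common punctuation from both ends.
    words = [w.strip('.,;:!?"()') for w in text.split()]
    # Drop tokens that were nothing but punctuation.
    words = [w for w in words if w]
    counts = Counter(words)
    if numeric:
        # Highest count first; ties are broken alphabetically (an assumption).
        ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    else:
        # Plain alphabetical order (upper case sorts before lower case).
        ordered = sorted(counts.items())
    return len(words), [(count, word) for word, count in ordered]


total, rows = word_frequencies("I love apple pie. I love banana pie.", numeric=True)
print("Total words:", total)
for count, word in rows:
    print(count, word, sep="\t")
```

The word count and the word frequency list come from the same cleaned list of words, which is why the total always equals the sum of the individual counts.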

Differences in the scripts

What these scripts don’t do


These scripts are intended for use as light-weight word frequency utilities whose focus is on words and numbers. Although they handle telephone numbers, times, URLs, IPs, email addresses, and American dollar amounts, they were not intended to elegantly handle code or text with a lot of non-alphanumeric characters. As a result, when using that sort of input, you may see some not so pretty results. They may also fail to elegantly handle some common types of text that we hadn’t thought to test them with. My apologies in advance if that occurs.
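One way to picture why things like email addresses and times survive while stray punctuation around them does not is to trim punctuation only from the edges of each word. The Python 3 sketch below illustrates that idea under my own assumptions: trim_token and LEADING are hypothetical names, and the punctuation set comes from Python's string.punctuation rather than from the scripts themselves.

```python
import string

# Keep a leading "$" so dollar amounts stay recognizable (an assumption).
LEADING = string.punctuation.replace("$", "")


def trim_token(token):
    """Strip punctuation from the edges of a token, leaving the inside alone."""
    return token.lstrip(LEADING).rstrip(string.punctuation)


samples = ["user@example.com,", "(555-1234)", "$5.00.", "10:30", "..."]
for sample in samples:
    print(repr(sample), "->", repr(trim_token(sample)))
```

A token made up entirely of punctuation comes back empty, so it can simply be skipped. Text that is full of symbols in the middle of words, such as code, is where this kind of trimming stops producing pretty results.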

Example source file used in the screenshots below


An example of how these scripts work can be seen by creating a file named pie.txt with these contents:

I love apple pie.
I love banana pie.
I love coconut pie.
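If you'd like to check the counts for pie.txt by hand, this small Python 3 sketch tallies them. It is separate from the scripts and only strips the trailing period, which is all this particular file needs. The variable names are my own.

```python
from collections import Counter

# The contents of pie.txt, inlined so the sketch runs on its own.
pie = "I love apple pie.\nI love banana pie.\nI love coconut pie.\n"

# Split on whitespace and drop the period at the end of each sentence.
words = [word.rstrip(".") for word in pie.split()]
counts = Counter(words)

print("Total words:", len(words))
for word in sorted(counts):
    print(counts[word], word, sep="\t")
```

The file holds 12 words in all: I, love, and pie appear three times each, and apple, banana, and coconut appear once each.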

Bash Yad main interface


Word_Frequency_Bash_YAD_Main_Interface

Bash Yad example result alphabetically sorted (case insensitive)


If you choose File or Paste to insert text into the text box, choose Alphabetic, choose Insensitive, and choose Save or Don’t save, you will get this result:

Word_Frequency_Bash_YAD_Result_Alphabetic_Insensitive

Bash Yad example result alphabetically sorted (case sensitive)


If you choose File or Paste to insert text into the text box, choose Alphabetic, choose Sensitive, and choose Save or Don’t save, you will get this result:

Word_Frequency_Bash_YAD_Result_Alphabetic_Sensitive

Bash Yad example result numerically sorted (case insensitive)


If you choose File or Paste to insert text into the text box, choose Numeric, choose Insensitive, and choose Save or Don’t save, you will get this result:

Word_Frequency_Bash_YAD_Result_Numeric_Insensitive

Bash Yad example result numerically sorted (case sensitive)


If you choose File or Paste to insert text into the text box, choose Numeric, choose Sensitive, and choose Save or Don’t save, you will get this result:

Word_Frequency_Bash_YAD_Result_Numeric_Sensitive

Bash Zenity main interface


Word_Frequency_Bash_Zenity_Main_Interface

Bash Zenity example result alphabetically sorted (case insensitive)


If you choose File or Paste to insert text into the text box, choose Alphabetic, choose Insensitive, and choose Save or Don’t save, you will get this result:

Word_Frequency_Bash_Zenity_Result_Alphabetic_Insensitive

Bash Zenity example result alphabetically sorted (case sensitive)


If you choose File or Paste to insert text into the text box, choose Alphabetic, choose Sensitive, and choose Save or Don’t save, you will get this result:

Word_Frequency_Bash_Zenity_Result_Alphabetic_Sensitive

Bash Zenity example result numerically sorted (case insensitive)


If you choose File or Paste to insert text into the text box, choose Numeric, choose Insensitive, and choose Save or Don’t save, you will get this result:

Word_Frequency_Bash_Zenity_Result_Numeric_Insensitive

Bash Zenity example result numerically sorted (case sensitive)


If you choose File or Paste to insert text into the text box, choose Numeric, choose Sensitive, and choose Save or Don’t save, you will get this result:

Word_Frequency_Bash_Zenity_Result_Numeric_Sensitive

Python main interface


Word_Frequency_Python_Main_Interface

Python example result alphabetically sorted (case insensitive)


If you insert text into the text box, process the text, say yes to Insensitive, and say no to Numeric, you will get this result:

Word_Frequency_Python_Result_Alphabetic_Insensitive

Python example result alphabetically sorted (case sensitive)


If you insert text into the text box, process the text, say no to Insensitive, and say no to Numeric, you will get this result:

Word_Frequency_Python_Result_Alphabetic_Sensitive

Python example result numerically sorted (case insensitive)


If you insert text into the text box, process the text, say yes to Insensitive, and say yes to Numeric, you will get this result:

Word_Frequency_Python_Result_Numeric_Insensitive

Python example result numerically sorted (case sensitive)


If you insert text into the text box, process the text, say no to Insensitive, and say yes to Numeric, you will get this result:

Word_Frequency_Python_Result_Numeric_Sensitive

Get the scripts


The Python script has now been updated and is at version 2, which contains an Export button that saves the result to a file in the current directory and a Help button that displays some basic help for running the GUI.

See the Bash Yad script

#!/bin/bash

# Make program settings choices using YAD and format the output for capture: 
for i in `yad --list --print-all --title "Word Frequency Application Settings"\
 --text "<b>Click</b> to select words processing item from each first and second row pairs:\n"\
 --column=col1 --column=1:rd --column=col2 --column=2:rd\
 --column=col3 --column=3:rd --column=col4 --column=4:rd\
 "File Text" true "Sensitive Case" true  "Alphabetic Sort" true "Save File" false\
 "Paste Text" false "Insensitive Case" false "Numeric Sort" false "No Save" true\
 --no-headers --height=150 --width=450\
 | sed 's/|FALSE|/\n/g' | sed 's/|TRUE|/-\n/g' | grep - | cut -d' ' -f1 | tr '\n' ' '`
 
# Capture formatted output of program setting choices by setting variables with Case:
do
	case "$i" in
	"File")       file=TRUE;;
	"Sensitive")  sensitive=TRUE;;
	"Alphabetic") alphabetic=TRUE;;
	"Save")       save=TRUE;;
	*)
	esac
done

# Set file or paste text as input source and process paste output with Printf:
if [ $file ]; then 
	file=$(yad --file-selection --title "File Source Text" --height=400 --width=600)
	text=$(cat "$file")
else
	text=$(yad --form --text "<b>Paste</b> text into box:" --field "":txt --title "Paste Source Text" \
	--height=400 --width=400)
	text=$(printf %b "$text")
fi

# Set sensitive or insensitive case of words using Translate:
if [ -z $sensitive ]; then
	text=$(echo "$text" | tr "[:upper:]" "[:lower:]")
fi

# Create text processing function with common text and I/O utilities:
# Note, each command is followed by a pipe to send its output to the next command:
process(){
	# Use echo to output contents of text to command stream
	echo "$text" |
	# Use tr to preserve desired alphanum and punct characters:
	tr -cd [[:alnum:]"\n@\'-/' ',:\$"] |
	# Use tr to remove specific extra punctuation not handled above:
	tr -d [\(\)+*] |
	# Use sed to replace spaces with newlines using global:
	sed "s/[[:space:]]/\n/g" |
	# Use sed to remove trailing commas colons and periods:
	sed "s/[,:.]*$//" |
	# Use sed to remove any remaining blank lines:
	sed "/^$/d" |
	# Use sed to remove lines containing just valid punctuation:
	sed "/^[@,./\$\'-]*$/d" |
	# Use sort to alphabetize word list in natural dictionary order:
	sort |
	# Use uniq to generate word counts:
	uniq -c	|
	# Use awk to print tab separated count and word columns:
	awk '{print $1" \t"$2}'
}

# Store alphabetic or numeric output of text processing function in results variable:
if [ $alphabetic ]; then
	results=$(process)
else
	results=$(process | sort -nr)
fi

# Cut word frequencies column and get word total and fill display variable:
total=$(echo "$results" | cut -d" " -f1 |  paste -sd+ | bc)
spacer="---------------"
display="Word Count:\n$spacer\n$total\n$spacer\nWord Frequency\n$spacer\n$results"

# Display alphabetic or numeric results and save file if option chosen:
# Note, printf %b expands the \n escapes without treating % in the text as a format:
if [ $alphabetic ]; then
	printf %b "$display" | yad --text-info --title "Word Counts Sorted Alphabetically" \
	--height=400 --width=435
else
	printf %b "$display" | yad --text-info --title "Word Counts Sorted Numerically" \
	--height=400 --width=435
fi
if [ $save ]; then
	outfile=$(yad --file-selection --save --confirm-overwrite="File Exists, Overwrite?" \
	--file-filter="*.txt" --title "Save Results File" --height=400 --width=500)
	# Strip a trailing .txt if present so the extension is not doubled below:
	outfile="${outfile%.txt}"
	printf %b "$display" > "$outfile.txt"
fi

See the Bash Zenity script

#!/bin/bash
# Originally by Frank Pirrone on 04/22/2016 with very slight contributions by Little Girl.

# This script launches a GUI that lets you provide text for it to process, either by
# importing a file or pasting in its contents so they can be processed for display of
# all of the words and how frequently each was used, and the option to save the results
# to your hard drive in a text file.

# Choose File or Paste as the source of words to be counted:
choice=$(zenity --list --radiolist --title "Choose Text Source" --text "" --hide-header \
--column "choice" --column "Item" FALSE "File" FALSE "Paste");

# If File was chosen:
if [ "$choice" == "File" ];then
	# Load variable with the selected file:
	file=$(zenity --file-selection --title "File Source Text" );
	# Load variable with the file read in:
	text=$(cat "$file")
# Else if Paste was chosen:
else
	# Load variable with text pasted in:
	text=$(zenity --text-info --editable "Paste text into box:"  --title "Paste Source Text" \
	--height=400 --width=435) 
	text=$(printf %b "$text")	
fi		

# Choose whether alphabetic or numeric sort:
sort=$(zenity --list --radiolist --title "Choose sort type" --text "" --hide-header \
--column "Choice" --column "Item" FALSE "Alphabetic" FALSE "Numeric");

# If alphabetic was chosen then set the variable:
if [ "$sort" == "Alphabetic" ];then 
	# Load variable with boolean:
	alphabetic=TRUE
fi

# Choose whether case insensitive or case sensitive:
case=$(zenity --list --radiolist --title "Choose case type" --text "" --hide-header \
--column "Choice" --column "Item" FALSE "Insensitive" FALSE "Sensitive");

# If Insensitive was chosen, else Sensitive:
if [ "$case" == "Insensitive" ];then 
	# Load text variable with text piped to the tr command to convert case:
	text=$(echo "$text" | tr "[:upper:]" "[:lower:]");
else
	# Load text variable with text unconverted:
	text=$(echo "$text");
fi

# Choose whether or not to save file:
saveit=$(zenity --list --radiolist --title "Choose save file" --text "" --hide-header \
--column "Choice" --column "Item" FALSE "Save" FALSE "Don't Save");

# If Save was chosen then set the variable:
if [ "$saveit" == "Save" ];then 
	# Load variable with boolean:
	save=TRUE
fi

# From here to the end, the code is pasted from the YAD version:

# Create text processing function with common text and I/O utilities:
# Note, each command is followed by a pipe to send its output to the next command:
process(){
	# Use echo to output contents of text to command stream
	echo "$text" |
	# Use tr to preserve desired alphanum and punct characters:
	tr -cd [[:alnum:]"\n@\'-/' ',:\$"] |
	# Use tr to remove specific extra punctuation not handled above:
	tr -d [\(\)+*] |
	# Use sed to replace spaces with newlines using global:
	sed "s/[[:space:]]/\n/g" |
	# Use sed to remove trailing commas colons and periods:
	sed "s/[,:.]*$//" |
	# Use sed to remove any remaining blank lines:
	sed "/^$/d" |
	# Use sed to remove lines containing just valid punctuation:
	sed "/^[@,./\$\'-]*$/d" |
	# Use sort to alphabetize word list in natural dictionary order:
	sort |
	# Use uniq to generate word counts:
	uniq -c	|
	# Use awk to print tab separated count and word columns:
	awk '{print $1" \t"$2}'
}

# Store alphabetic or numeric output of text processing function in results variable:
if [ $alphabetic ]; then
	results=$(process)
else
	results=$(process | sort -nr)
fi

# Cut word frequencies column and get word total and fill display variable:
total=$(echo "$results" | cut -d" " -f1 |  paste -sd+ | bc)
spacer="---------------"
display="Word Count:\n$spacer\n$total\n$spacer\nWord Frequency\n$spacer\n$results"

# Display alphabetic or numeric results and save file if option chosen:
# Note, printf %b expands the \n escapes without treating % in the text as a format:
if [ $alphabetic ]; then
	printf %b "$display" | zenity --text-info --title "Word Counts Sorted Alphabetically" \
	--height=400 --width=435
else
	printf %b "$display" | zenity --text-info --title "Word Counts Sorted Numerically" \
	--height=400 --width=435
fi
if [ $save ]; then
	outfile=$(zenity --file-selection --save --confirm-overwrite="File Exists, Overwrite?" \
	--file-filter="*.txt" --title "Save Results File" --height=400 --width=500)
	# Strip a trailing .txt if present so the extension is not doubled below:
	outfile="${outfile%.txt}"
	printf %b "$display" > "$outfile.txt"
fi

See the Python script

#!/usr/bin/env python
# This script was written by Little Girl on the foundation of a script that she and
# Frank Pirrone collaborated on a while back. Some nice guys in IRC contributed a bit
# of code, and Frank added the finishing touches: case insensitive/case sensitive
# processing and alphabetic/numeric sorting.
import ScrolledText as st
import string
import Tkinter as tk
from Tkinter import *
import tkFileDialog, tkMessageBox
from collections import Counter

helptext = """
This script launches a GUI that processes your text for display
of all of the words and how frequently each was used.

The Import button can be used to import the contents of a
file into the text box, or you can type or paste text into the
text box.

The Process button offers case insensitive or case sensitive
processing and numeric or alphabetic sort and displays the
word frequency of the words in the text box.

The Export button exports the results to a text file you
specify the name of in the current directory, creating it if
it doesn't exist, and overwriting its contents if it does.

The Reset button clears the text box.

The Help button displays this help text.
"""

###############
# ROOT WINDOW #
###############

# Define the window:
root = tk.Tk()

# Give the window a title:
root.title("Word-Frequency-O-Matic")

# Define the window size and center function:
def size_and_center_window(target, w=0, h=0):
	# Define the variable to hold the screen width:
	ws = target.winfo_screenwidth()
	# Define the variable to hold the screen height:
	hs = target.winfo_screenheight()
	# Define the variable to hold half the screen width:
	x = (ws/2) - (w/2)
	# Define the variable to hold half the screen height:
	y = (hs/2) - (h/2)
	# Use the combined measurements to center the window:
	target.geometry('%dx%d+%d+%d' % (w, h, x, y))

# Run the window size and center function to create the root window at the specified size:
size_and_center_window(root, 500, 440)

#####################
# ROOT WINDOW FRAME #
#####################

# Create and configure a frame in the window:
frame = tk.Frame(root)
# Color the frame's background:
frame.configure(bg = "lightgreen")
# Position the frame and determine how much space it takes up:
frame.pack(fill='both', expand=True)

###########################
# ROOT WINDOW FRAME LABEL #
###########################

# Create a label in the frame:
label = Label(frame)
# Define the label's text and color its background:
label.configure(text = "Import text from a file or paste text into the text box.", bg = "lightgreen")
# Position the label and determine how much white space surrounds it:
label.pack(side=BOTTOM, pady=2)

########################
# ROOT WINDOW TEXT BOX #
########################

# Create a scrolled text box in the frame:
textbox = st.ScrolledText(master = frame, bg = "lightyellow")
# Give the text box focus:
textbox.focus()
# Position the text box and determine how much white space surrounds it:
textbox.pack(fill='both', expand=True, padx=8, pady=8)

#####################################
# ROOT WINDOW TEXT BOX CONTEXT MENU #
#####################################

# Create the context menu:
menu = Menu(textbox, tearoff=0, cursor="top_left_arrow", activebackground="lavender", background="lightgreen")

# Define the context menu open function:
def open_context_menu(event):
	menu.post(event.x_root, event.y_root)

# Define the context menu close function:
def close_context_menu(event):
	menu.unpost()

# Define the context menu select all function:
def selectall():
    textbox.tag_add("sel","1.0","end")

# Add the Select all entry to the context menu:
menu.add_command(label="Select all", command=lambda: selectall())
# Add the Cut entry to the context menu:
menu.add_command(label="Cut", command=lambda: textbox.event_generate("<<Cut>>"))
# Add the Copy entry to the context menu:
menu.add_command(label="Copy", command=lambda: textbox.event_generate("<<Copy>>"))
# Add the Paste entry to the context menu:
menu.add_command(label="Paste", command=lambda: textbox.event_generate("<<Paste>>"))

# Bind the text box widget and context menu together so the menu opens on right click:
textbox.bind("<Button-3>", lambda(event): open_context_menu(event))
# Bind the text box widget and context menu together so the menu closes on left click:
textbox.bind("<Button-1>", lambda(event): close_context_menu(event))

###############
# FILE IMPORT #
###############

# Define the file import function:
def readfile():
	# Open a browsing interface to open a file in read only mode:
	file = tkFileDialog.askopenfile(parent=root,mode='rb',title='Choose a file')
	# Determine what to do if a file is opened:
	if file != None:
		# Put the contents of the file into the data variable:
		data = file.read()
		# Insert the data into the text box:
		textbox.insert(INSERT, data)
		# Close the file:
		file.close()

#########################
# PROCESS TEXT FUNCTION #
######################### 
# Define the text processing function:
def process():
	# Read data from text box into the mydata variable:
	mydata = textbox.get(1.0, END)
	# Ask if case insensitive sort and if True convert to lower:
	case = tkMessageBox.askyesno("Sort Option", "Case insensitive?")
	if case == True: mydata = mydata.lower()
	# Ask if numerical sort and hold result for text box display:
	num = tkMessageBox.askyesno("Sort Option", "Sort numerically?")
	# Split string into a list at whitespace:
	mylist = mydata.split()
	# Remove punctuation from the beginning of each list item:
	mylist = [x.lstrip('`~!@#%^&*()-_=+[{]}\\|;:",<.>? ') for x in mylist]
	# Remove punctuation from the end of each list item:
	mylist = [x.rstrip('`~!@#$%^&*()-_=+[{]}\\|;:",<.>? ') for x in mylist]
	# Remove pure punctuation items:
	mycleanlist = filter (lambda string:any([x.isalnum() for x in string]), mylist)
	# Remove duplicates by converting the list to a set:
	myset=set(mycleanlist)
	# Define the function used to sort the list (thanks to 2 nice guys in IRC):
	def inverse_case_key(s):
		# Define the result variable as a list:
		result = []
		# For x in each Unicode character:
		for x in s:
			# If the character is lower case:
			if x.islower():
				# Add it as an upper case character to the result:
				result.append(x.upper())
			# If the character is upper case:
			elif x.isupper():
				# Add it as a lower case character to the result:
				result.append(x.lower())
			# Otherwise:
			else:
				# Add it as is to the result:
				result.append(x)
		# Return the sort key: the lower case word first, then the case-swapped characters as a tie-breaker:
		return (s.lower(), result)
	# Do an inverse case sort of the list:
	mylist = sorted(set(myset), key=inverse_case_key)

	# Clear the text box display:
	textbox.delete(1.0, END)
	# Determine what to do for each word in the list:
	if num == False:
		# Display results alphabetically:
		for word in mylist:
			# Insert the number of occurrences followed by a tab followed by the word into the text box:
			textbox.insert(INSERT, mycleanlist.count(word), INSERT, '\t', INSERT, word, INSERT, '\n')
		# Insert the total words above the counts and sorted words:
		textbox.insert(1.0, 'Total Words\n------------\n', INSERT, sum(Counter(mycleanlist).values()), INSERT, '\n------------\n' )		
	else:
		# Display results numerically:
		for value, count in Counter(mycleanlist).most_common():
			# Insert the number of occurrences followed by a tab followed by the word into the text box:
			textbox.insert(INSERT, count, INSERT, '\t', INSERT, value, INSERT, '\n')
		# Insert the total words above the counts and sorted words:
		textbox.insert(1.0, 'Total Words\n------------\n', INSERT, sum(Counter(mycleanlist).values()), INSERT, '\n------------\n' )		

######################
# SAVE FILE FUNCTION #
######################

def savefile():
	# Load the myfile variable with a text file in write mode in the current directory:
	myfile = tkFileDialog.asksaveasfile(mode='w', filetypes=[("text files", "*.txt")])
	# If the Cancel button was pressed or the window was closed:
	if myfile is None:
		# Exit the savefile function:
		return
	# Load the result variable with the contents of the text box:
	result=textbox.get(1.0, END)
	# Write the result variable contents to myfile:
	myfile.write(result)
	# Close myfile:
	myfile.close()

#################
# HELP FUNCTION #
#################

# Define the help window function:
def help():
	# Define the top window as a TopLevel window:
	top = Toplevel()
	# Define the title for the help window:
	top.title("HELP")
	# Run the function to create the centered help window at the specified size:
	size_and_center_window(top, 400, 340)
	# Create the label for the help window, filling it with the helptext variable contents:
	label = Label(top, anchor=CENTER, justify=LEFT, text=helptext, wraplength=375)
	# Define the background color for the label:
	label.configure(bg = "orchid1")
	# Set the label into place and get it to fill the entire top window:
	label.pack(fill='both', expand=True)
	# Run the top window:
	top.mainloop()

########################
# WINDOW FRAME BUTTONS #
########################

# Create an Import button in the frame:
Import = tk.Button(frame, text='Import', command = lambda: readfile())
# Define the Import button's background color, border width, and hide the highlight around it:
Import.configure(bg = "lightblue", bd = 5, highlightthickness = 0)
# Position the Import button and determine how much white space surrounds it on either side:
Import.pack(side=LEFT, fill=BOTH, expand=True, padx=(15,10))

# Create a Process button in the frame:
Process = tk.Button(frame, text='Process', command = lambda: process())
# Define the Process button's background color, border width, and hide the highlight around it:
Process.configure(bg = "lightblue", bd = 5, highlightthickness = 0)
# Position the Process button and determine how much white space surrounds it on either side:
Process.pack(side=LEFT, fill=BOTH, expand=True, padx=(10,10))

# Create an Export button in the frame:
Export = tk.Button(frame, text='Export', command = lambda: savefile())
# Define the Export button's background color, border width, and hide the highlight around it:
Export.configure(bg = "lightblue", bd = 5, highlightthickness = 0)
# Position the Export button and determine how much white space surrounds it on either side:
Export.pack(side=LEFT, fill=BOTH, expand=True, padx=(10,10))

# Create a Reset button in the frame:
Reset = tk.Button(frame, text='Reset', command = lambda: textbox.delete(1.0,END))
# Define the Reset button's background color, border width, and hide the highlight around it:
Reset.configure(bg = "lightblue", bd = 5, highlightthickness = 0)
# Position the Reset button and determine how much white space surrounds it on either side:
Reset.pack(side=LEFT, fill=BOTH, expand=True, padx=(10,10))

# Create a Help button in the frame:
Help = tk.Button(frame, text='Help', command = help)
# Define the Help button's background color, border width, and hide the highlight around it:
Help.configure(bg = "lightblue", bd = 5, highlightthickness = 0)
# Position the Help button and determine how much white space surrounds it on either side:
Help.pack(side=LEFT, fill=BOTH, expand=True, padx=(10,15))

#############
# LAUNCH IT #
#############

# Run the whole script:
root.mainloop()

Some extra goodies


Here’s the Bash algorithm used in the Bash scripts above in the form of a “one-liner” you can paste into a terminal window inside any directory that contains text files (files ending in the .txt extension) and it will display the word frequency of all the words in all the text files in that directory:

cat *.txt | tr -cd [[:alnum:]"\n@\'-/' ',\$"] | tr -d [\(\)+*] | sed "s/[[:space:]]/\n/g" | sed "s/\.$//g" | sed "s/[,:.]*$//" | sed "/^$/d" | sed "/^[@,./\$\'-]*$/d" | sort | uniq -c | awk '{print $1" \t"$2}'

Here’s the same algorithm with one space removed before the \t near the end in order to compact the tab between the word frequency count and the words:

cat *.txt | tr -cd [[:alnum:]"\n@\'-/' ',\$"] | tr -d [\(\)+*] | sed "s/[[:space:]]/\n/g" | sed "s/\.$//g" | sed "s/[,:.]*$//" | sed "/^$/d" | sed "/^[@,./\$\'-]*$/d" | sort | uniq -c | awk '{print $1"\t"$2}'
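If the one-liner is hard to read, the same stages can be sketched in Python 3. This is only a rough approximation of the pipeline, not a translation Frank or I have tested against it: the function name directory_word_frequencies is hypothetical, and the regular expressions are my guess at a keep-list similar to the one the tr and sed commands use.

```python
import glob
import re
from collections import Counter


def directory_word_frequencies(pattern="*.txt"):
    """Count words across every file matching pattern (hypothetical helper)."""
    counts = Counter()
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8", errors="replace") as handle:
            # Splitting on whitespace stands in for the "spaces to newlines" step.
            for token in handle.read().split():
                # Keep letters, digits, and a few punctuation marks (assumed set).
                token = re.sub(r"[^A-Za-z0-9@'\-./,:$]", "", token)
                # Remove trailing commas, colons, and periods.
                token = re.sub(r"[,:.]+$", "", token)
                # Skip anything with no letters or digits left.
                if re.search(r"[A-Za-z0-9]", token):
                    counts[token] += 1
    return counts


if __name__ == "__main__":
    # Print a tab separated count and word for each entry, sorted by word.
    for word, count in sorted(directory_word_frequencies().items()):
        print(f"{count}\t{word}")
```

Run from inside a directory of text files, it prints one count and word per line, much like the compact version of the one-liner.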

You can get a detailed explanation of the algorithm written by Frank Pirrone here: https://app.box.com/s/0nr5l0eqczlhdog0srs0t56bt3kpes99

Last, but far from least, you can get a Word Frequency Diatribe written by Frank Pirrone about learning from example with the Bash scripts above here: https://app.box.com/s/6ic13hwaahifhxftks7s15rpncp17rwi


Obligatory Happy Ending

And they all lived happily ever after. The end.
