Little Girl's Mostly Linux Blog

ManipulateText

Change or manipulate text

This page was last updated on July 18, 2009.

Introduction

Linux offers many ways of changing or manipulating the contents of text files. Below are some examples of how to do this from the command line. The programs used in the examples are:

  • awk
  • cut
  • find and xargs
  • perl
  • sed
  • sort
  • tr

Before you begin

Since the programs below manipulate text, the text you want to change will sometimes contain characters that the programs would interpret as part of the command. Examples of this are the forward slash (/), backslash (\), period (.), apostrophe ('), and ampersand (&). To use one of these characters literally, put a backslash to the left of it so the programs treat it as plain text instead of interpreting it. One exception: a backslash can’t escape an apostrophe inside a command that is wrapped in apostrophes, so wrap the command in double quotes when the text contains one.
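
As a rough sketch of what that escaping looks like in practice, here is sed run on a few made-up strings (the sample text and the variable names are mine, not from any real file). An unescaped period matches any character, an unescaped & in the replacement stands for the text that matched, and a / has to be escaped because it is also the separator in the s command.

```shell
# Made-up sample text; nothing here comes from a real file.
sample='a.b axb'

# Unescaped: the period matches any single character, so both words change.
unescaped=$(printf '%s\n' "$sample" | sed 's/a.b/X/g')
echo "$unescaped"    # X X

# Escaped: the period only matches a literal period.
escaped=$(printf '%s\n' "$sample" | sed 's/a\.b/X/g')
echo "$escaped"      # X axb

# A forward slash must be escaped because it also separates the parts of s///.
slashes=$(printf '%s\n' '/usr/local' | sed 's/\/usr/\/opt/')
echo "$slashes"      # /opt/local

# In the replacement, a bare & means "the text that matched".
bare_amp=$(printf '%s\n' 'salt' | sed 's/salt/& pepper/')
echo "$bare_amp"     # salt pepper

# \& is a literal ampersand.
literal_amp=$(printf '%s\n' 'salt' | sed 's/salt/salt \& pepper/')
echo "$literal_amp"  # salt & pepper
```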

Examples



Change case of letters

  • If I have a file named file.txt with lowercase letters in it and I’d like the file to contain only uppercase letters, I’d type this command:
  • tr 'a-z' 'A-Z' < file.txt
    • Result:
    • Contents of file.txt:
      Attention!
      All employees must wash hands!
    • Result of the tr command:
      ATTENTION!
      ALL EMPLOYEES MUST WASH HANDS!

  • To change all uppercase letters to lowercase letters, I’d type this command:
  • tr 'A-Z' 'a-z' < file.txt
    • Result:
    • Contents of file.txt:
      Attention!
      All employees must wash hands!
    • Result of the tr command:
      attention!
      all employees must wash hands!
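
Here is a small sketch of both case changes that can be run as-is. It assumes a throwaway directory made with mktemp, and the variable names (workdir, upper, lower, original) are only for illustration. It uses the [:lower:] and [:upper:] character classes, which do the same job as 'a-z' and 'A-Z' for plain English letters.

```shell
# Hypothetical sample file in a temporary directory.
workdir=$(mktemp -d)
printf 'Attention!\nAll employees must wash hands!\n' > "$workdir/file.txt"

# Lowercase to uppercase, using character classes instead of 'a-z' 'A-Z'.
upper=$(tr '[:lower:]' '[:upper:]' < "$workdir/file.txt")
echo "$upper"

# Uppercase to lowercase.
lower=$(tr '[:upper:]' '[:lower:]' < "$workdir/file.txt")
echo "$lower"

# tr only reads the file, so the original is still mixed case.
original=$(cat "$workdir/file.txt")
echo "$original"

rm -r "$workdir"
```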


Display a range of characters from each line

  • If I have a file named file.txt and I only want to see the third through the fifth characters of each line, I can use cut to display them in the terminal window without changing the file:
  • cut -c 3-5 file.txt
    • Result:
    • Contents of file.txt:
      1234 6
      1 3456
      123 56
    • Result of the cut command:
      34
      345
      3 5

      Note: Change 3 and 5 to the column numbers you’d like to see.

  • To create a new text file named results.txt with the results of this command in it, I can modify the command slightly, like this:
  • cut -c 3-5 file.txt > results.txt
    • The result will be the same as above, and will be saved to results.txt while file.txt remains unchanged.
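
To try the cut commands without touching a real file, something like the following should work. The sample lines and the names workdir, middle, tail_part, and picked are assumptions of mine. Besides the 3-5 range, it shows two other list forms that cut accepts: an open-ended range and a comma-separated list.

```shell
# Hypothetical sample file in a temporary directory.
workdir=$(mktemp -d)
printf 'abcdefg\n1234567\n' > "$workdir/file.txt"

# Characters 3 through 5 of each line.
middle=$(cut -c 3-5 "$workdir/file.txt")
echo "$middle"       # cde, then 345

# Leave off the end of the range to keep everything from character 3 onward.
tail_part=$(cut -c 3- "$workdir/file.txt")
echo "$tail_part"    # cdefg, then 34567

# A comma-separated list picks individual characters.
picked=$(cut -c 1,7 "$workdir/file.txt")
echo "$picked"       # ag, then 17

# Saving the output to results.txt leaves file.txt unchanged.
cut -c 3-5 "$workdir/file.txt" > "$workdir/results.txt"
saved=$(cat "$workdir/results.txt")
unchanged=$(cat "$workdir/file.txt")

rm -r "$workdir"
```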


Display specific columns from each line

  • If I have a text file with columns of information that are separated by either spaces or tabs, I can use awk to display specific columns in the terminal window without changing the file:
    • To display just the first column, I would type this command:
    • awk '{print $1}' file.txt
      • Result:
      • Contents of file.txt:
        1 2 3 4 5
        Mary Dave Steve Lucy Dan
      • Result of the awk command:
        1
        Mary

    • To display the first and third column, I would type this command:
    • awk '{print $1, $3}' file.txt
      • Result:
      • Contents of file.txt:
        1 2 3 4 5
        Mary Dave Steve Lucy Dan
      • Result of the awk command:
        1 3
        Mary Steve

    • To display the first, second, and fifth columns, I would type this command:
    • awk '{print $1, $2, $5}' file.txt
      • Result:
      • Contents of file.txt:
        1 2 3 4 5
        Mary Dave Steve Lucy Dan
      • Result of the awk command:
        1 2 5
        Mary Dave Dan

    • To create a new text file named results.txt with the results of any of these commands in it, I can modify the commands by adding > results.txt to the end of any of the commands, like this:
    • awk '{print $1, $3}' file.txt > results.txt
      • The results will be the same as above, and will be saved to results.txt while file.txt remains unchanged.
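
The awk commands above can be tried on a scratch file like this. The file contents and variable names are made up for the example. The first line is separated by spaces and the second by tabs, to show that awk splits on either by default. It also shows $NF, which is awk's name for the last column, and what happens when the comma between columns is left out.

```shell
# Hypothetical sample file: spaces on the first line, tabs on the second.
workdir=$(mktemp -d)
printf '1 2 3 4 5\nMary\tDave\tSteve\tLucy\tDan\n' > "$workdir/file.txt"

# First and third columns; the comma puts a space between them in the output.
first_and_third=$(awk '{print $1, $3}' "$workdir/file.txt")
echo "$first_and_third"   # 1 3, then Mary Steve

# $NF is the last column, however many columns the line has.
last=$(awk '{print $NF}' "$workdir/file.txt")
echo "$last"              # 5, then Dan

# Without the comma, the columns are printed with nothing between them.
glued=$(awk '{print $1 $3}' "$workdir/file.txt")
echo "$glued"             # 13, then MarySteve

rm -r "$workdir"
```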


Remove all blank lines

  • Let’s say I have a shopping list named file.txt that’s double-spaced. If I’d like to remove all the blank lines in it so that it’s single-spaced, I’d type this command:
  • tr -s '\n' < file.txt
    • Result:
    • Contents of file.txt:
      Apples.

      Bananas.

      Oranges.
    • Result of the tr command:
      Apples.
      Bananas.
      Oranges.
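
One thing worth knowing is that tr -s only squeezes runs of newlines down to one, so a blank line at the very top of the file leaves one empty line behind. The sketch below shows that on a made-up shopping list, next to sed '/^$/d', which is a different technique that deletes every empty line. The file contents and variable names are assumptions for the example.

```shell
# Hypothetical list that starts with a blank line and has extra blank lines.
workdir=$(mktemp -d)
printf '\nApples.\n\n\nBananas.\n\nOranges.\n' > "$workdir/file.txt"

# tr -s squeezes each run of newlines into one newline.
# The leading blank line survives as a single empty first line.
squeezed=$(tr -s '\n' < "$workdir/file.txt")
echo "$squeezed"

# sed deletes every line that is completely empty, including the first one.
deleted=$(sed '/^$/d' "$workdir/file.txt")
echo "$deleted"

rm -r "$workdir"
```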


Remove all + from each line

  • Let’s say I have a file named file.txt that accidentally has a lot of + in it. If I’d like to remove them all, I’d type this command:
  • tr -d + < file.txt
  • Or I’d type this command:
  • sed -e 's/+//g' file.txt
    • Result:
    • Contents of file.txt:
      This+ is+ an+ important+ file+ that+ should+ have+ no+ symbols.
      It+ should+ be+ easy+ to+ read+.
    • Result of the tr or sed command:
      This is an important file that should have no symbols.
      It should be easy to read.
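
A quick way to check that the two commands agree is to run both on the same scratch file, as sketched below with made-up contents and variable names. The + needs no backslash here because it is not a special character in sed's default (basic) regular expressions. The last command shows that tr -d takes a set of characters, so several symbols can be removed at once.

```shell
# Hypothetical sample file in a temporary directory.
workdir=$(mktemp -d)
printf 'It+ should+ be+ easy+ to+ read+.\n' > "$workdir/file.txt"

# Both commands delete every + and print the same line.
with_tr=$(tr -d '+' < "$workdir/file.txt")
with_sed=$(sed -e 's/+//g' "$workdir/file.txt")
echo "$with_tr"       # It should be easy to read.
echo "$with_sed"      # It should be easy to read.

# tr -d deletes every character in the set, so this removes + and . together.
both_symbols=$(tr -d '+.' < "$workdir/file.txt")
echo "$both_symbols"  # It should be easy to read

rm -r "$workdir"
```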


Remove the first 0 from each line

  • Let’s say I have a file named file.txt with a numbered list. If I’d like to remove the first 0 from each line, I’d type this command:
  • sed -e 's/0//' file.txt
    • Result:
    • Contents of file.txt:
      01. Buy 20 stamps.
      02. Work out for 30 minutes.
      03. Put 10 gallons of fuel in the car.
    • Result of the sed command:
      1. Buy 20 stamps.
      2. Work out for 30 minutes.
      3. Put 10 gallons of fuel in the car.
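
The command above removes the first 0 wherever it appears on the line, which is fine for items 01 through 09. As a sketch of what happens once the list reaches item 10, the made-up example below compares it with a variant anchored by ^, which only removes a 0 at the very start of the line. The sample lines and variable names are assumptions.

```shell
# Hypothetical numbered list that includes a two-digit item.
workdir=$(mktemp -d)
printf '01. Buy 20 stamps.\n10. Call home.\n' > "$workdir/file.txt"

# First 0 anywhere on the line: item 10 loses its zero too.
first_zero=$(sed -e 's/0//' "$workdir/file.txt")
echo "$first_zero"    # 1. Buy 20 stamps., then 1. Call home.

# ^ anchors the match to the start of the line, so only a leading 0 goes.
leading_zero=$(sed -e 's/^0//' "$workdir/file.txt")
echo "$leading_zero"  # 1. Buy 20 stamps., then 10. Call home.

rm -r "$workdir"
```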


Replace all + on each line with -

  • Let’s say I have a file named file.txt with + between the columns of text. If I’d like to replace every + with - on each line, I’d type this command:
  • sed -e 's/+/-/g' file.txt
    • Result:
    • Contents of file.txt:
      user1 + user2 + user3
      user4 + user5 + user6
      user7 + user8 + user9
    • Result of the sed command:
      user1 - user2 - user3
      user4 - user5 - user6
      user7 - user8 - user9


Replace the first = on each line with +

  • Let’s say I have a file named file.txt with a series of numbers separated by =. If I’d like the first = on each line replaced by +, I’d type this command:
  • sed -e 's/=/+/' file.txt
    • Result:
    • Contents of file.txt:
      1 = 1 = 2
      2 = 2 = 4
      3 = 3 = 6
    • Result of the sed command:
      1 + 1 = 2
      2 + 2 = 4
      3 + 3 = 6
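
The difference between replacing the first match and replacing all of them comes down to what follows the last slash of the s command. Here is a short sketch on one made-up line (the variable names are mine): no flag changes the first match only, g changes every match, and a number changes only that occurrence.

```shell
# Made-up sample line.
line='1 = 1 = 2'

# No flag: only the first = on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed -e 's/=/+/')
echo "$first_only"    # 1 + 1 = 2

# g flag: every = on the line is replaced.
every=$(printf '%s\n' "$line" | sed -e 's/=/+/g')
echo "$every"         # 1 + 1 + 2

# A number: only that occurrence is replaced, here the second =.
second_only=$(printf '%s\n' "$line" | sed -e 's/=/+/2')
echo "$second_only"   # 1 = 1 + 2
```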


Replace word1 with word2 on each line

  • Let’s say I have a file named file.txt with the word rainy in it. If I’d like to replace that word on each line with foggy, I’d type this command:
  • sed "s/rainy/foggy/g" file.txt
    • Result:
    • Contents of file.txt:
      The weather is rainy today.
      We hope you enjoy it when
      it’s rainy.
    • Result of the sed command:
      The weather is foggy today.
      We hope you enjoy it when
      it’s foggy.


Replace word1 with word2 on each line in multiple files

  • This replaces word1 with word2 on each line of every text file in the current directory and in all of its subdirectories. Unlike the examples above, it changes the files themselves, and the i flag makes the match case-insensitive:
  • find . -name '*.txt' -print | xargs perl -pi -e 's/word1/word2/ig'


Sort a text file

This sorts a text file alphabetically and places any blank lines in the file at the top.

  • If I have a file named file.txt and I want to sort its contents, I can use sort to display the sorted content in the terminal window without changing the file:
  • sort file.txt
  • To create a new text file named results.txt containing the sorted contents, I can modify the command slightly, like this:
  • sort file.txt > results.txt

    The result will be the same as above, and will be saved to results.txt while file.txt remains unchanged.
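
Here is a small sketch of the sort commands on a scratch file with a blank line in it. The file contents and variable names are made up, and LC_ALL=C is an assumption I added so the ordering doesn't depend on the language settings of the machine it runs on.

```shell
# Hypothetical unsorted file with one blank line in the middle.
workdir=$(mktemp -d)
printf 'pear\n\napple\nbanana\n' > "$workdir/file.txt"

# Sorted output: the blank line comes first, then the words in order.
sorted=$(LC_ALL=C sort "$workdir/file.txt")
echo "$sorted"

# The first line of the sorted output is the blank line.
first_line=$(LC_ALL=C sort "$workdir/file.txt" | head -n 1)

# Saving to results.txt leaves file.txt unchanged.
LC_ALL=C sort "$workdir/file.txt" > "$workdir/results.txt"
saved=$(cat "$workdir/results.txt")
unchanged=$(cat "$workdir/file.txt")

rm -r "$workdir"
```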


Turn all spaces into newlines

  • Let’s say I have a file named file.txt with a list of names separated by spaces. If I’d like each name on its own line, I’d type this command:
  • tr ' ' '\n' < file.txt
    • Result:
    • Contents of file.txt:
      Lucy Steve Mary Dave
    • Result of the tr command:
      Lucy
      Steve
      Mary
      Dave
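
If the names are sometimes separated by more than one space, each extra space turns into an empty line. The sketch below shows that on a made-up line, along with adding -s so tr squeezes the repeated newlines after translating. The sample text and variable names are assumptions.

```shell
# Made-up line with extra spaces between some of the names.
line='Lucy  Steve Mary   Dave'

# Every space becomes a newline, so extra spaces leave empty lines.
plain=$(printf '%s\n' "$line" | tr ' ' '\n')
echo "$plain"

# With -s, runs of the resulting newlines are squeezed into one.
squeezed=$(printf '%s\n' "$line" | tr -s ' ' '\n')
echo "$squeezed"
```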



Obligatory Happy Ending

And they all lived happily ever after. The end.
