This chapter will give an overview of ruby syntax for command line usage and some examples to show what kind of problems are typically suited for one-liners.

Why use Ruby for one-liners?

I assume you are already familiar with use cases where the command line is more productive than a GUI. See also this series of articles titled Unix as IDE.

A shell utility like bash provides built-in commands and scripting features to make it easier to solve and automate various tasks. External *nix commands like grep, sed, awk, sort, find, parallel, etc. can be combined to work with each other. Depending upon your familiarity with those tools, you can either use ruby as a single replacement or use it to complement them for specific use cases.

Here are some one-liners (options will be explained later):

  • ruby -e 'puts readlines.uniq' *.txt — retain only one copy if lines are duplicated from the given list of input file(s)
  • ruby -e 'puts readlines.uniq {|s| s.split[1]}' *.txt — retain only first copy of duplicate lines using second field as duplicate criteria
  • ruby -rcommonregex -ne 'puts CommonRegex.get_links($_)' *.md — extract only the URLs, using a third-party CommonRegexRuby library
  • stackoverflow: merge duplicate key values while preserving order — a recent Q&A that I answered with a simpler ruby solution compared to awk
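
To make the difference between the first two one-liners concrete, here is a minimal sketch in script form. The sample lines are made up for illustration; in the one-liners they would come from the *.txt files via readlines.

```ruby
# Hypothetical sample lines (not from the book's example files)
lines = ["a 1\n", "b 2\n", "a 1\n", "c 2\n"]

# Without a block, uniq compares whole lines and keeps the first copy
whole_line = lines.uniq

# With a block, the block's return value is the comparison key,
# here the second whitespace-separated field
by_field = lines.uniq { |s| s.split[1] }

puts whole_line   # a 1, b 2, c 2
puts by_field     # a 1, b 2
```

In both cases the order of first occurrences is preserved, which is why no sorting is needed beforehand.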

The main advantages of ruby over tools like grep, sed and awk are its feature-rich regular expression engine, the standard library and third-party libraries. If you don’t already know the syntax and idioms for sed and awk, learning the command line options for ruby would be the easier option. The main disadvantage is that ruby is likely to be slower compared to those tools.

Command line options

Option                  Description
-0[octal]               specify record separator (\0, if no argument)
-a                      autosplit mode with -n or -p (splits $_ into $F)
-c                      check syntax only
-Cdirectory             cd to directory before executing your script
-d                      set debugging flags (set $DEBUG to true)
-e 'command'            one line of script. Several -e's allowed. Omit [programfile]
-Eex[:in]               specify the default external and internal character encodings
-Fpattern               split() pattern for autosplit (-a)
-i[extension]           edit ARGV files in place (make backup if extension supplied)
-Idirectory             specify $LOAD_PATH directory (may be used more than once)
-l                      enable line ending processing
-n                      assume 'while gets(); ... end' loop around your script
-p                      assume loop like -n but print line also like sed
-rlibrary               require the library before executing your script
-s                      enable some switch parsing for switches after script name
-S                      look for the script using PATH environment variable
-v                      print the version number, then turn on verbose mode
-w                      turn warnings on for your script
-W[level=2|:category]   set warning level; 0=silence, 1=medium, 2=verbose
-x[directory]           strip off text before #!ruby line and perhaps cd to directory
--jit                   enable JIT with default options (experimental)
--jit-[option]          enable JIT with an option (experimental)
-h                      show this message, --help for more info

This chapter will show examples with -e, -n, -p and -a options. Some more options will be covered in later chapters, but not all of them are discussed in this book.

Executing Ruby code

If you want to execute a ruby program file, one way is to pass the filename as argument to the ruby command.

$ echo 'puts "Hello Ruby"' > hello.rb
$ ruby hello.rb
Hello Ruby

For short programs, you can also directly pass the code as an argument to the -e option.

$ ruby -e 'puts "Hello Ruby"'
Hello Ruby

$ # multiple statements can be issued separated by ;
$ ruby -e 'x=25; y=12; puts x**y'
59604644775390625
$ # or use -e option multiple times
$ ruby -e 'x=25' -e 'y=12' -e 'puts x**y'
59604644775390625

Filtering

ruby one-liners can be used for filtering lines matched by a regexp, similar to grep, sed and awk. And similar to many command line utilities, ruby can accept input from both stdin and file arguments.

$ # sample stdin data
$ printf 'gate\napple\nwhat\nkite\n'
gate
apple
what
kite

$ # print all lines containing 'at'
$ # same as: grep 'at' and sed -n '/at/p' and awk '/at/'
$ printf 'gate\napple\nwhat\nkite\n' | ruby -ne 'print if /at/'
gate
what

$ # print all lines NOT containing 'e'
$ # same as: grep -v 'e' and sed -n '/e/!p' and awk '!/e/'
$ printf 'gate\napple\nwhat\nkite\n' | ruby -ne 'print if !/e/'
what

By default, grep, sed and awk will automatically loop over input content line by line (with \n as the line distinguishing character). The -n or -p option will enable this feature for ruby. As seen before, the -e option accepts code as a command line argument. Many shortcuts are available to reduce the amount of typing needed.

In the above examples, a regular expression (defined by the pattern between a pair of forward slashes) has been used to filter the input. When the input string isn’t specified in a conditional context (for example: if), the test is performed against the global variable $_, which has the contents of the input line (the correct term would be input record, see the Record separators chapter). To summarize, in a conditional context:

  • /regexp/ is a shortcut for $_ =~ /regexp/
  • !/regexp/ is a shortcut for $_ !~ /regexp/
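
As a rough sketch of what these shortcuts stand for, the filtering one-liners can be written out as an explicit loop. The StringIO object is just a stand-in for stdin and the variable names are my own; this is an approximation of what -n does, not the interpreter's actual implementation.

```ruby
require 'stringio'

# Stand-in for stdin, same data as the printf examples
input = StringIO.new("gate\napple\nwhat\nkite\n")

matched = []
not_matched = []

# -n wraps the code in a loop that reads one record at a time
while (line = input.gets)
  matched << line if line =~ /at/      # explicit form of: if /at/
  not_matched << line if line !~ /e/   # explicit form of: if !/e/
end

print(*matched)       # gate, what
print(*not_matched)   # what
```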

$_ is also the default argument for the print method, which is why it is generally preferred in one-liners over the puts method. More such defaults that apply to the print method will be discussed later.

info See ruby-doc: Pre-defined global variables for documentation on $_, $&, etc.

Here’s an example with file input instead of stdin.

$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14

$ # same as: grep -oE '[0-9]+$' table.txt
$ ruby -ne 'puts $& if /\d+$/' table.txt
42
7
14
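
The output has 7 and 14 instead of -7 and 3.14 because \d+$ matches only the run of digits at the end of the line. Here is a small sketch of that behavior, along with one possible pattern (my own, not from the book) for cases where the sign and decimal part are wanted as well.

```ruby
lines = [
  "brown bread mat hair 42\n",
  "blue cake mug shirt -7\n",
  "yellow banana window shoes 3.14\n"
]

# $& holds the text matched by the most recent successful match
digits_only = lines.map { |line| line =~ /\d+$/ ? $& : nil }

# Assumed alternative: optional minus sign and optional fractional part
signed = lines.map { |line| line[/-?\d+(?:\.\d+)?$/] }

puts digits_only   # 42, 7, 14
puts signed        # 42, -7, 3.14
```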

info The learn_ruby_oneliners repo has all the files used in examples.

Substitution

Use the sub and gsub methods for search and replace requirements. By default, these methods operate on $_ when the input string isn’t provided. For these examples, the -p option is used instead of the -n option, so that the value of $_ is automatically printed after processing each input line.

$ # for each input line, change only first ':' to '-'
$ # same as: sed 's/:/-/' and awk '{sub(/:/, "-")} 1'
$ printf '1:2:3:4\na:b:c:d\n' | ruby -pe 'sub(/:/, "-")'
1-2:3:4
a-b:c:d

$ # for each input line, change all ':' to '-'
$ # same as: sed 's/:/-/g' and awk '{gsub(/:/, "-")} 1'
$ printf '1:2:3:4\na:b:c:d\n' | ruby -pe 'gsub(/:/, "-")'
1-2-3-4
a-b-c-d

You might wonder how $_ is modified without the use of ! methods. The reason is that these methods are part of Kernel (see ruby-doc: Kernel for details) and are available only when -n or -p options are used.

  • sub(/regexp/, repl) is a shortcut for $_.sub(/regexp/, repl) and $_ will be updated if substitution succeeds
  • gsub(/regexp/, repl) is a shortcut for $_.gsub(/regexp/, repl) and $_ gets updated if substitution succeeds
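
For comparison, here is a minimal sketch of how the regular String methods behave outside of the -n and -p options, where you have to choose between the returning and the in-place (!) versions yourself. The variable names are only for illustration.

```ruby
line = "1:2:3:4\n"

# Non-bang methods return a new string and leave the receiver untouched
first_only = line.sub(/:/, "-")
all = line.gsub(/:/, "-")

# Bang methods modify the receiver in place
copy = line.dup
copy.gsub!(/:/, "-")

# sub! and gsub! return nil when nothing was replaced
no_change = line.dup.sub!(/x/, "-")

print first_only   # 1-2:3:4
print all          # 1-2-3-4
p no_change        # nil
```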

info This book assumes you are already familiar with regular expressions. If not, you can check out my free Ruby Regexp book.

Field processing

Consider the sample input file shown below with fields separated by a single space character.

$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14

Here are some examples that are based on a specific field rather than the entire line. The -a option will cause the input line to be split based on whitespace and the field contents can be accessed using the $F global variable. Leading and trailing whitespace will be suppressed and won’t result in empty fields. More details are discussed in the Default field separation section.

$ # print the second field of each input line
$ # same as: awk '{print $2}' table.txt
$ ruby -ane 'puts $F[1]' table.txt
bread
cake
banana

$ # print lines only if last field is a negative number
$ # same as: awk '$NF<0' table.txt
$ ruby -ane 'print if $F[-1].to_f < 0' table.txt
blue cake mug shirt -7

$ # change 'b' to 'B' only for the first field
$ # same as: awk '{gsub(/b/, "B", $1)} 1' table.txt
$ ruby -ane '$F[0].gsub!(/b/, "B"); puts $F * " "' table.txt
Brown bread mat hair 42
Blue cake mug shirt -7
yellow banana window shoes 3.14
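
Here is a sketch of the same three operations in script form, assuming that -a behaves like calling split (with no arguments) on each input line. The array of lines stands in for table.txt.

```ruby
lines = [
  "brown bread mat hair 42\n",
  "blue cake mug shirt -7\n",
  "yellow banana window shoes 3.14\n"
]

# split with no arguments ignores leading and trailing whitespace
padded = "  brown   bread  \n".split

# second field of each line
second = lines.map { |line| line.split[1] }

# lines whose last field is a negative number
negative = lines.select { |line| line.split[-1].to_f < 0 }

# change 'b' to 'B' only in the first field, then join with a space
rebuilt = lines.map do |line|
  f = line.split
  f[0] = f[0].gsub(/b/, "B")
  f * " "   # Array#* with a string argument is the same as join
end

puts second, negative, rebuilt
```

Note that rebuilding the line from fields squeezes any run of whitespace into a single space, which doesn't matter for this input but could for others.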

BEGIN and END

You can use a BEGIN{} block when you need to execute something before the input is read and an END{} block to execute something after all of the input has been processed.

$ # same as: awk 'BEGIN{print "---"} 1; END{print "%%%"}'
$ # note the use of ; after BEGIN block
$ seq 4 | ruby -pe 'BEGIN{puts "---"}; END{puts "%%%"}'
---
1
2
3
4
%%%
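
The order of execution can be sketched with an explicit loop, as shown below. This is only a rough emulation to show where the BEGIN and END code runs relative to the implicit -p loop; the StringIO object stands in for the output of seq 4.

```ruby
require 'stringio'

input = StringIO.new("1\n2\n3\n4\n")   # stand-in for: seq 4
out = []

out << "---"             # BEGIN block: runs once, before any input is read

while (line = input.gets)
  out << line.chomp      # -p prints each record after the code has run
end

out << "%%%"             # END block: runs once, after all input is consumed

puts out
```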

ENV hash

When it comes to automation and scripting, you'd often need to construct commands that can accept input from the user, a file, the output of a shell command, etc. As mentioned before, this book assumes bash as the shell being used. To access environment variables of the shell, you can call the special hash variable ENV with the name of the environment variable as a string key.

$ # existing environment variable
$ # output shown here is for my machine, would differ for you
$ ruby -e 'puts ENV["HOME"]'
/home/learnbyexample
$ ruby -e 'puts ENV["SHELL"]'
/bin/bash

$ # defined along with ruby command
$ # note that the variable is placed before the shell command
$ word='hello' ruby -e 'puts ENV["word"]'
hello
$ # the input characters are preserved as is
$ ip='hi\nbye' ruby -e 'puts ENV["ip"]'
hi\nbye

Here's another example, where a regexp is passed as the content of an environment variable.

$ cat word_anchors.txt
sub par
spar
apparent effort
two spare computers
cart part tart mart

$ # assume 'r' is a shell variable that has to be passed to the ruby command
$ r='\Bpar\B'
$ rgx="$r" ruby -ne 'print if /#{ENV["rgx"]}/' word_anchors.txt
apparent effort
two spare computers
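
Here is a minimal sketch of the same idea in script form. The environment variable is set inside the script only to keep the example self-contained; in the one-liner it comes from the shell. Regexp.escape is shown as an option for when the content should be matched literally instead of as a regexp, which the example above does not need.

```ruby
# Set in-process for this sketch; normally provided by the shell
ENV["rgx"] = '\Bpar\B'

lines = [
  "sub par\n", "spar\n", "apparent effort\n",
  "two spare computers\n", "cart part tart mart\n"
]

# Interpolating inside // and Regexp.new both build a regexp from the string
matches = lines.grep(Regexp.new(ENV["rgx"]))
same = lines.grep(/#{ENV["rgx"]}/)

# Regexp.escape neutralizes metacharacters for literal matching
literal = Regexp.escape("a.b")

print(*matches)   # apparent effort, two spare computers
puts literal      # a\.b
```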

info As an example, see my repo ch: command help for a practical shell script, where commands are constructed dynamically.

Executing external commands

You can call external commands using the system Kernel method. See ruby-doc: system for documentation.

$ ruby -e 'system("echo Hello World")'
Hello World

$ ruby -e 'system("wc -w  out.txt")'
$ cat out.txt
1,2,3,4,5,6,7,8,9,10

The return value of system or the global variable $? can be used to act upon the exit status of the command issued.

$ ruby -e 'es=system("ls word_anchors.txt"); puts es'
word_anchors.txt
true
$ ruby -e 'system("ls word_anchors.txt"); puts $?'
word_anchors.txt
pid 6087 exit 0

$ ruby -e 'system("ls xyz.txt"); puts $?'
ls: cannot access 'xyz.txt': No such file or directory
pid 6164 exit 2
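
Here is a small sketch of the possible return values of system and of reading the numeric exit code from $?. It assumes a *nix environment where the true and sh commands are available, and the last command name is deliberately made up so that it won't be found.

```ruby
# Command ran and exited with status zero
ok = system("true")
ok_status = $?.exitstatus

# Command ran but exited with a non-zero status
bad = system("sh", "-c", "exit 3")
bad_status = $?.exitstatus

# Command could not be executed at all (hypothetical, nonexistent name)
missing = system("hypothetical-no-such-command-xyz")

p ok, ok_status     # true, 0
p bad, bad_status   # false, 3
p missing           # nil
```

$? is a Process::Status object, so methods like exitstatus and success? are usually more convenient than parsing the printed form.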

To save the result of an external command, use backticks or %x.

$ ruby -e 'words = `wc -w 

info See also stackoverflow: difference between exec, system and %x() or backticks
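
As a minimal sketch of capturing command output (my own example commands, assuming a *nix shell with echo and wc available), backticks and %x both return the standard output of the command as a string, including the trailing newline.

```ruby
# Backticks and %x are two spellings of the same feature
out = `echo Hello World`
same = %x(echo Hello World)

# The result is a string, so convert it when a number is needed
count = `echo a b c | wc -w`.to_i

# $? is set after backticks as well
succeeded = $?.success?

puts out.chomp   # Hello World
puts count       # 3
```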

Summary

This chapter introduced some of the common options for ruby cli usage, along with typical cli text processing examples. While specific purpose cli tools like grep, sed and awk are usually faster, ruby has a much more extensive standard library and ecosystem. And you do not have to learn a lot if you are comfortable with ruby but not familiar with those cli tools. The next section has a few exercises for you to practice the cli options and text processing use cases.

Exercises

info Exercise related files are available from exercises folder of learn_ruby_oneliners repo.

info All the exercises are also collated together in one place at Exercises.md. For solutions, see Exercise_solutions.md.

a) For the input file ip.txt, display all lines containing is.

$ cat ip.txt
Hello World
How are you
This game is good
Today is sunny
12345
You are funny

##### add your solution here
This game is good
Today is sunny

b) For the input file ip.txt, display first field of lines not containing y. Consider space as the field separator for this file.

##### add your solution here
Hello
This
12345

c) For the input file ip.txt, display all lines containing no more than 2 fields.

##### add your solution here
Hello World
12345

d) For the input file ip.txt, display all lines containing is in the second field.

##### add your solution here
Today is sunny

e) For each line of the input file ip.txt, replace first occurrence of o with 0.

##### add your solution here
Hell0 World
H0w are you
This game is g0od
T0day is sunny
12345
Y0u are funny

f) For the input file table.txt, calculate and display the product of numbers in the last field of each line. Consider space as the field separator for this file.

$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14

##### add your solution here
-923.1600000000001

g) Append . to all the input lines for the given stdin data.

$ printf 'last\nappend\nstop\n' | ##### add your solution here
last.
append.
stop.

h) Use contents of s variable to display all matching lines from the input file ip.txt. Assume that s doesn't have any regexp metacharacters. Construct the solution such that there's at least one word character immediately preceding the contents of s variable.

$ s='is'

##### add your solution here
This game is good

i) Use system to display contents of filename present in second field (space separated) of the given input line.

$ s='report.log ip.txt sorted.txt'
$ echo "$s" | ##### add your solution here
Hello World
How are you
This game is good
Today is sunny
12345
You are funny

$ s='power.txt table.txt'
$ echo "$s" | ##### add your solution here
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14
