You've already seen some built-in functions in detail, such as sub, gsub and gensub. This chapter discusses many more built-ins that are often used in one-liners. You'll also see more examples with arrays.
See gawk manual: Functions for details about all the built-in functions as well as how to define your own functions.
length
The length function returns the number of characters in the given string argument. Without an argument, it acts on $0. A numeric argument is automatically converted to a string.
$ awk 'BEGIN{print length("road"); print length(123456)}'
4
6

$ # recall that record separator isn't part of $0
$ # so, line ending won't be counted here
$ printf 'fox\ntiger\n' | awk '{print length()}'
3
5

$ awk 'length($1) < 6' table.txt
brown bread mat hair 42
blue cake mug shirt -7
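As a further illustration, length can be combined with a running maximum to report the longest input line. This is only a sketch: the `longest_line` wrapper and the sample input are made up for this example, and the script is written without the `$` prompt so it can be pasted as is.

```shell
# hypothetical helper: print the length of the longest line, then the line itself
# 'max' starts out uninitialized, so it compares as 0 for the first record
longest_line() {
    awk 'length() > max {max = length(); line = $0} END{print max, line}'
}

# sample input invented for illustration
printf 'fox\ntiger\nbear\n' | longest_line
# prints: 5 tiger
```

If several lines share the maximum length, this version keeps the first one because the comparison is strictly greater than.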
If you need the number of bytes instead of the number of characters, use the -b command line option. The locale also affects the result.
$ echo 'αλεπού' | awk '{print length()}'
6
$ echo 'αλεπού' | awk -b '{print length()}'
12
$ echo 'αλεπού' | LC_ALL=C awk '{print length()}'
12
By default, looping over an array with the for(key in array) format visits the elements in an arbitrary order that you cannot rely on. By assigning a special value to PROCINFO["sorted_in"], you can control the order in which the elements are retrieved. See gawk manual: Using Predefined Array Scanning Orders for other options and details.
$ # by default, array is traversed in random order
$ awk 'BEGIN{a["z"]=1; a["x"]=12; a["b"]=42; for(i in a) print i, a[i]}'
x 12
z 1
b 42

$ # index (i.e. keys) sorted in ascending order as strings
$ awk 'BEGIN{PROCINFO["sorted_in"] = "@ind_str_asc";
         a["z"]=1; a["x"]=12; a["b"]=42; for(i in a) print i, a[i]}'
b 42
x 12
z 1

$ # value sorted in ascending order as numbers
$ awk 'BEGIN{PROCINFO["sorted_in"] = "@val_num_asc";
         a["z"]=1; a["x"]=12; a["b"]=42; for(i in a) print i, a[i]}'
z 1
x 12
b 42
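The predefined orders also come in descending variants. The sketch below uses "@val_num_desc" on the same three-element array. It assumes that awk is GNU awk, since PROCINFO["sorted_in"] is a gawk feature. The `by_value_desc` wrapper is a hypothetical name used only to keep the example self-contained.

```shell
# hypothetical wrapper: traverse the array by value, largest number first
# assumes 'awk' is GNU awk (PROCINFO["sorted_in"] is gawk-specific)
by_value_desc() {
    awk 'BEGIN{PROCINFO["sorted_in"] = "@val_num_desc";
         a["z"]=1; a["x"]=12; a["b"]=42;
         for(i in a) print i, a[i]}'
}

by_value_desc
# with gawk, prints:
# b 42
# x 12
# z 1
```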
Here's an example of sorting input lines in ascending order based on the second column, treating the data as strings.
$ awk 'BEGIN{PROCINFO["sorted_in"] = "@ind_str_asc"}
       {a[$2]=$0} END{for(k in a) print a[k]}' table.txt
yellow banana window shoes 3.14
brown bread mat hair 42
blue cake mug shirt -7
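The same approach can sort on a numeric column by switching to "@ind_num_asc", which treats the array keys as numbers. The following is a sketch under a few assumptions: awk is GNU awk, the `sort_by_last_field` wrapper is a hypothetical name, and the three input lines are the ones visible in the table.txt outputs above, fed through printf so the example does not depend on the file. Since array keys are unique, lines that share the same key would overwrite each other and only the last one would be printed.

```shell
# hypothetical wrapper: sort lines numerically by their last field
# assumes 'awk' is GNU awk (PROCINFO["sorted_in"] is gawk-specific)
# caveat: lines with an identical last field overwrite each other in the array
sort_by_last_field() {
    awk 'BEGIN{PROCINFO["sorted_in"] = "@ind_num_asc"}
         {a[$NF] = $0}
         END{for(k in a) print a[k]}'
}

# input lines taken from the table.txt outputs shown earlier
printf '%s\n' 'brown bread mat hair 42' \
              'blue cake mug shirt -7' \
              'yellow banana window shoes 3.14' | sort_by_last_field
# with gawk, prints:
# blue cake mug shirt -7
# yellow banana window shoes 3.14
# brown bread mat hair 42
```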