FPAT
FS lets you define the input field separator. In contrast, FPAT (field pattern) lets you define what the fields themselves should be made up of.
$ s='Sample123string42with777numbers'

$ # define fields to be one or more consecutive digits
$ echo "$s" | awk -v FPAT='[0-9]+' '{print $2}'
42

$ # define fields to be one or more consecutive alphabetic characters
$ echo "$s" | awk -v FPAT='[a-zA-Z]+' -v OFS=, '{$1=$1} 1'
Sample,string,with,numbers
FPAT is often used for csv input where fields can contain embedded delimiter characters. For example, a field may have the content "fox,42" when , is the delimiter.
$ s='eagle,"fox,42",bee,frog'

$ # simply using , as separator isn't sufficient
$ echo "$s" | awk -F, '{print $2}'
"fox
For such simple csv input, FPAT helps to define a field as either starting and ending with double quotes or consisting only of non-comma characters.
$ # * is used instead of + to allow empty fields
$ echo "$s" | awk -v FPAT='"[^"]*"|[^,]*' '{print $2}'
"fox,42"
The above will not work for all kinds of csv files, for example when fields contain escaped double quotes, newline characters, etc. See stackoverflow: What's the most robust way to efficiently parse CSV using awk? for such cases. You could also use other programming languages such as Perl, Python, Ruby, etc which come with standard csv parsing libraries or have easy access to third party solutions. There are also specialized command line tools such as xsv.

If IGNORECASE is set, it will affect field matching. Unlike FS, a single character pattern is not treated differently from any other pattern.
$ # count number of 'e' in the input string
$ echo 'Read Eat Sleep' | awk -v FPAT='e' '{print NF}'
3
$ echo 'Read Eat Sleep' | awk -v IGNORECASE=1 -v FPAT='e' '{print NF}'
4
$ echo 'Read Eat Sleep' | awk -v IGNORECASE=1 -v FPAT='[e]' '{print NF}'
4