


Date: 09.03.2023
Learn GNU AWK

Relying on default initial value


Uninitialized variables are useful, but sometimes they don't translate well if you are converting a command from single file input to multiple files. You have to work out which ones need a reset at the beginning of each file being processed.
$ # step 1 - works for single file
$ awk '{sum += $NF} END{print sum}' table.txt
38.14

$ # step 2 - prepare code to work for multiple file
$ awk '{sum += $NF} ENDFILE{print FILENAME ":" sum}' table.txt
table.txt:38.14

$ # step 3 - check with multiple file input
$ # oops, default numerical value '0' for sum works only once
$ awk '{sum += $NF} ENDFILE{print FILENAME ":" sum}' table.txt marks.txt
table.txt:38.14
marks.txt:530.14

$ # step 4 - correctly initialize variables
$ awk '{sum += $NF} ENDFILE{print FILENAME ":" sum; sum=0}' table.txt marks.txt
table.txt:38.14
marks.txt:492
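Another way to sidestep the reset problem is to keep one accumulator per input file. The following is only a sketch under stated assumptions: the contents of table.txt and marks.txt are recreated here so that they agree with the sums shown above, and the shell variable name per_file is made up for this illustration. It uses an array keyed by FILENAME and walks ARGV in the END block so that the output follows the order of the command line arguments.

```shell
# recreate the sample inputs in a scratch directory (file contents are assumed)
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 'brown bread mat hair 42' 'blue cake mug shirt -7' \
    'yellow banana window shoes 3.14' > table.txt
printf '%s\n' 'Dept Name Marks' 'ECE Raj 53' 'ECE Joel 72' 'EEE Moi 68' \
    'CSE Surya 81' 'EEE Tia 59' 'ECE Om 92' 'CSE Amy 67' > marks.txt

# one array element per input file, so there is nothing to reset
# ARGV[1] to ARGV[ARGC-1] hold the filenames in command line order
per_file=$(awk '{sum[FILENAME] += $NF}
    END{for (i = 1; i < ARGC; i++) print ARGV[i] ":" sum[ARGV[i]]}' table.txt marks.txt)
printf '%s\n' "$per_file"
```

With the assumed input files, this prints table.txt:38.14 followed by marks.txt:492. Since it does not rely on ENDFILE, this form should also work in awk implementations other than GNU awk, at the cost of delaying all output until the END block.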

Code in replacement section


The replacement section in substitution functions can accept any expression, which is converted to a string whenever necessary. What happens if the regexp doesn't match the input string, but the expression changes the value of a variable, for example through an increment/decrement operator? The expression is still evaluated, which may or may not be what you need.
$ # no match for second line, but 'c' was still modified
$ awk '{sub(/^(br|ye)/, ++c ") &")} 1' table.txt
1) brown bread mat hair 42
blue cake mug shirt -7
3) yellow banana window shoes 3.14

$ # check for matching line first before applying substitution
$ # that may help to simplify the regexp for substitution
$ # or, you could save the regexp in a variable to avoid duplication
$ awk '/^(br|ye)/{sub(/^/, ++c ") ")} 1' table.txt
1) brown bread mat hair 42
blue cake mug shirt -7
2) yellow banana window shoes 3.14
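The option of saving the regexp in a variable could look like the sketch below. The awk variable name r, the shell variable name numbered and the recreated table.txt contents are assumptions made for this illustration. The regexp is passed as a string with -v and used both for the match test and inside sub(), so it is written only once.

```shell
# recreate the sample input in a scratch directory (file contents are assumed)
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 'brown bread mat hair 42' 'blue cake mug shirt -7' \
    'yellow banana window shoes 3.14' > table.txt

# the regexp is stored in the awk variable r (a name chosen for this sketch)
# the pattern $0 ~ r guards the sub() call, so ++c runs only for matching lines
numbered=$(awk -v r='^(br|ye)' '$0 ~ r {sub(r, ++c ") &")} 1' table.txt)
printf '%s\n' "$numbered"
```

With the assumed table.txt, the first and third lines get the prefixes 1) and 2), and the second line is printed unchanged. Keep in mind that a regexp given as a string goes through an extra round of escape sequence processing, so patterns containing backslashes need more care than this one.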
Also, the expression is executed only once per function call, not for every match.
$ # first line has two matches but 'c' is modified only once
$ awk '{gsub(/\
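Here is a small, independent illustration of the once-per-call behavior. It is not a reconstruction of the command above: the input text and the shell variable name once_per_call are made up for this sketch.

```shell
# the first line has two matches for 'x', the second line has one
# ++c is evaluated once per gsub() call, not once per match
once_per_call=$(printf '%s\n' 'x y x' 'z x' | awk '{gsub(/x/, ++c)} 1')
printf '%s\n' "$once_per_call"
```

This prints 1 y 1 for the first line and z 2 for the second. Both matches on the first line receive the same number, and the counter moves on only when gsub() is called again for the next line.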