When it comes to command line text processing, from an abstract point of view, there are three major pillars



Date: 09.03.2023
Learn GNU AWK

Summary


This chapter discussed a few cases where you need to compare the contents of two files. The NR==FNR trick is handy for such cases. The getline function is helpful for line number based comparisons.
The next chapter will discuss how to handle duplicate contents.
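As a quick recap, here is a minimal sketch of both techniques. The files colors_1.txt and colors_2.txt and their contents are hypothetical, created only for this illustration, and the variable names (seen, other, f) are arbitrary choices.

```shell
# Hypothetical sample files, created only for this illustration
tmp=$(mktemp -d)
printf '%s\n' 'teal' 'light blue' 'green' 'yellow' > "$tmp/colors_1.txt"
printf '%s\n' 'light blue' 'black' 'dark green' 'yellow' > "$tmp/colors_2.txt"

# NR==FNR is true only while the first file argument is being read:
# save its lines as array keys, then print lines of the second file
# that are present as keys
common=$(awk 'NR==FNR{seen[$0]; next} $0 in seen' \
    "$tmp/colors_1.txt" "$tmp/colors_2.txt")
printf '%s\n' "$common"

# getline reads the file named by 'f' one line at a time, in step with
# the main input: print lines of the first file that differ from the
# line with the same number in the second file
diff_lines=$(awk -v f="$tmp/colors_2.txt" \
    '{ if ((getline other < f) > 0 && $0 != other) print }' \
    "$tmp/colors_1.txt")
printf '%s\n' "$diff_lines"

rm -rf "$tmp"
```

With these sample files, the first command prints light blue and yellow, and the second prints teal, light blue and green. Note that the NR==FNR condition assumes the first file is not empty.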

Exercises


a) Use the contents of the match_words.txt file to display matching lines from jumbled.txt and sample.txt. The matching criterion is that the second word of lines from these files should match the third word of lines from match_words.txt.

    $ cat match_words.txt
    %whole(Hello)--{doubt}==ado==
    just,\joint*,concession<=nice

    $ # 'concession' is one of the third words from 'match_words.txt'
    $ # and second word from 'jumbled.txt'
    $ awk ##### add your solution here
    wavering:concession/woof\retailer
    No doubt you like it too

b) Interleave contents of secrets.txt with the contents of a file passed via -v option as shown below.

    $ awk -v f='table.txt' ##### add your solution here
    stag area row tick
    brown bread mat hair 42
    ---
    deaf chi rate tall glad
    blue cake mug shirt -7
    ---
    Bi tac toe - 42
    yellow banana window shoes 3.14
    ---

c) The file search_terms.txt contains one search string per line (these have no regexp metacharacters). Construct an awk command that reads this file and displays search terms (matched case insensitively) that were found in all of the other file arguments. Note that these terms should be matched with any part of the line, not just whole words.

    $ cat search_terms.txt
    hello
    row
    you
    is
    at

    $ awk ##### add your solution here
    ##file list## search_terms.txt jumbled.txt mixed_fs.txt secrets.txt table.txt
    at
    row

    $ awk ##### add your solution here
    ##file list## search_terms.txt addr.txt sample.txt
    is
    you
    hello

Dealing with duplicates


Often, you need to eliminate duplicates from an input file. This could be based on the entire line content or on certain fields. Such tasks are typically solved with the sort and uniq commands. Advantages of awk include regexp based field and record separators, no requirement for the input to be sorted, and in general more flexibility because it is a programming language.
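As a first taste, here is a minimal sketch of both kinds of deduplication using an associative array. The file items.txt and its contents are hypothetical, made up for this illustration, and seen is an arbitrary array name.

```shell
# Hypothetical sample file, created only for this illustration
tmp=$(mktemp -d)
printf '%s\n' 'apple,10' 'banana,20' 'apple,10' 'apple,30' 'banana,20' \
    > "$tmp/items.txt"

# Whole line as the key: seen[$0]++ evaluates to 0 (false) the first time
# a line appears, so the negation is true and the line is printed once.
# The input does not need to be sorted and the original order is kept.
uniq_lines=$(awk '!seen[$0]++' "$tmp/items.txt")
printf '%s\n' "$uniq_lines"

# First comma separated field as the key: keep only the first line
# seen for each distinct value of that field
uniq_field=$(awk -F, '!seen[$1]++' "$tmp/items.txt")
printf '%s\n' "$uniq_field"

rm -rf "$tmp"
```

With this sample input, the first command prints apple,10 then banana,20 then apple,30, and the second prints apple,10 and banana,20.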
