awk

Advanced · ~35 min read

awk is a small language for scanning text line by line: it splits each line into fields and runs actions on the lines that match a pattern. It works well for simple delimited data (CSV/TSV) and for report generation.

Basics: Fields and Filters
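
A minimal sketch of fields and filters, assuming a made-up whitespace-separated sample of name, age, and city (the data and the variable names `data`, `fields`, and `over30` are illustrative only):

```shell
# Sample input (hypothetical data): name, age, city separated by spaces.
data='alice 31 paris
bob 24 rome
carol 45 paris'

# Print selected fields: $1 is the first field, $3 the third.
fields=$(printf '%s\n' "$data" | awk '{ print $1, $3 }')
printf '%s\n' "$fields"

# Filter: a pattern placed before the action limits it to matching lines.
# Here only lines whose second field is greater than 30 are printed.
over30=$(printf '%s\n' "$data" | awk '$2 > 30 { print $1 }')
printf '%s\n' "$over30"
```

The first command prints the name and city of every line; the second prints only alice and carol, whose ages exceed 30.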

Advanced: BEGIN/END and Arrays
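
A minimal sketch of BEGIN, END, and associative arrays, assuming a hypothetical comma-separated sample of city and amount with a header row. BEGIN runs once before any input, the middle rule runs per line, and END runs once after the last line:

```shell
# Hypothetical sample: city and amount, comma-separated, with a header row.
data='city,amount
paris,10
rome,5
paris,7'

totals=$(printf '%s\n' "$data" | awk '
BEGIN { FS = "," }              # runs once, before any input is read
NR > 1 { total[$1] += $2 }      # runs per line; NR > 1 skips the header
END {                           # runs once, after the last line
    for (city in total) print city, total[city]
}' | sort)
printf '%s\n' "$totals"
# paris 17
# rome 5
```

The output is piped through sort because awk does not define the order in which `for (key in array)` visits keys.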

Note: Set the field separator with -F or inside a BEGIN block.
Pro Tip: Use printf for aligned columns and reports.
Caution: Beware of CSV corner cases (quoted commas). Consider dedicated CSV tools when needed.
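
The three notes above can be sketched as follows, assuming hypothetical sample data (a tab-separated name/score pair and one CSV row with a quoted comma):

```shell
# Hypothetical tab-separated sample: name and score.
tsv=$(printf 'alice\t9.5\nbob\t10\n')

# Two equivalent ways to set the field separator to a tab.
with_flag=$(printf '%s\n' "$tsv" | awk -F'\t' '{ print $2 }')
with_begin=$(printf '%s\n' "$tsv" | awk 'BEGIN { FS = "\t" } { print $2 }')

# printf gives fixed-width columns: %-8s left-aligns the name in 8
# characters, %6.2f right-aligns the score with two decimals.
report=$(printf '%s\n' "$tsv" | awk -F'\t' '{ printf "%-8s %6.2f\n", $1, $2 }')
printf '%s\n' "$report"

# A quoted field that contains a comma is still split by -F, so this
# three-column CSV row is seen as four fields.
nf=$(printf '%s\n' '"Doe, Jane",42,Oslo' | awk -F, '{ print NF }')
printf '%s\n' "$nf"
```

The last command prints 4 rather than 3, which is the kind of corner case where a dedicated CSV tool is the safer choice.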

Common Mistakes

1) Forgetting NR>1 to skip headers

When processing CSV with headers, include a condition to skip the first line.
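
A small sketch of what goes wrong, assuming a hypothetical two-row sample. Without NR>1 the header text "age" counts as 0 in the sum but still increments the row count, so the average comes out too low:

```shell
# Hypothetical sample with a header row.
data='name,age
ann,30
bob,40'

# Header included: 70 is divided by 3 rows instead of 2.
wrong=$(printf '%s\n' "$data" | awk -F, '{ s += $2; n++ } END { printf "%.2f\n", s / n }')

# Header skipped with NR > 1: 70 is divided by 2 rows.
right=$(printf '%s\n' "$data" | awk -F, 'NR > 1 { s += $2; n++ } END { printf "%.2f\n", s / n }')

printf 'with header: %s\nskipping header: %s\n' "$wrong" "$right"
# with header: 23.33
# skipping header: 35.00
```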

2) Mixing separators

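Make sure the field separator matches your data (e.g., comma vs. tab). If it does not, awk finds no separator, puts the whole line in $1, and leaves the other fields empty.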
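
A quick way to spot a separator mismatch is to print NF, the number of fields awk found. A sketch, assuming a hypothetical tab-separated line:

```shell
# Tab-separated input (hypothetical sample).
tsv=$(printf 'ann\t30\tparis\n')

# Wrong separator: the line contains no commas, so awk sees one field.
wrong_nf=$(printf '%s\n' "$tsv" | awk -F, '{ print NF }')

# Matching separator: three fields, and $3 is the city.
right_nf=$(printf '%s\n' "$tsv" | awk -F'\t' '{ print NF }')
city=$(printf '%s\n' "$tsv" | awk -F'\t' '{ print $3 }')

printf 'comma: %s field(s), tab: %s field(s), city: %s\n' "$wrong_nf" "$right_nf" "$city"
# comma: 1 field(s), tab: 3 field(s), city: paris
```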

Exercise: Average by City

Task: From name,age,city CSV on stdin, print the average age per city.

Solution:
awk -F, 'NR>1 { sum[$3]+=$2; count[$3]++ } END{ for(c in sum) printf "%s %.2f\n", c, sum[c]/count[c] }'
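
A sample run of the solution, assuming a hypothetical four-line input in the name,age,city layout. The trailing sort is an addition for this example only, since awk leaves the order of `for (c in sum)` unspecified:

```shell
# Hypothetical input in the name,age,city layout from the task.
result=$(printf '%s\n' 'name,age,city' 'Ann,30,Paris' 'Bob,40,Paris' 'Cy,25,Rome' |
  awk -F, 'NR>1 { sum[$3]+=$2; count[$3]++ } END{ for(c in sum) printf "%s %.2f\n", c, sum[c]/count[c] }' |
  sort)
printf '%s\n' "$result"
# Paris 35.00
# Rome 25.00
```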

Summary

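  • Split lines with -F and access fields via $1 through $NF.
  • Write pattern { action } rules; each action runs on every line its pattern matches.
  • Accumulate into arrays as lines are read, then report the results in END.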

What's Next?

Handle OS signals and graceful cleanup with Signal Handling.