AWK is a powerful programming language for text processing. Created by Alfred Aho, Peter Weinberger, and Brian Kernighan (whose initials give AWK its name), it is particularly effective for manipulating structured data such as log files, CSV, and command output.
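1. Introduction to AWK and Basic Syntax
Structure of an AWK Program
pattern { action }
pattern { action }
- Pattern: The condition that must hold for the action to run. If the pattern is omitted, the action runs for every line.
- Action: The commands executed when the pattern matches. If the action is omitted, the matching line is printed.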
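To make the rule structure concrete, here is a minimal sketch run against a small made-up inventory file (the file contents and the threshold are illustrative assumptions, not data from this article). It shows the three forms a rule can take: pattern only, action only, and both together.

```shell
# Sample data (hypothetical): item name and stock count.
tmp=$(mktemp)
printf 'apple 12\nbanana 0\ncherry 7\n' > "$tmp"

# Pattern only: the default action prints every matching line.
pattern_only=$(awk '$2 == 0' "$tmp")

# Action only: with no pattern, the action runs on every line.
action_only=$(awk '{ print $1 }' "$tmp")

# Pattern and action together: act only on the matching lines.
both=$(awk '$2 > 5 { print $1, "in stock" }' "$tmp")

printf '%s\n' "$pattern_only" "$action_only" "$both"
rm -f "$tmp"
```

The first command prints only the banana line, the second prints the three item names, and the third prints "apple in stock" and "cherry in stock".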
AWK One-Liners
# Print the entire file
awk '{print}' file.txt
# Print a specific line (line 5)
awk 'NR==5' file.txt
# Print lines matching a pattern
awk '/pattern/' file.txt
# Print specific fields (default delimiter: whitespace)
awk '{print $1}' file.txt # First field
awk '{print $NF}' file.txt # Last field
awk '{print $1, $3}' file.txt # Fields 1 and 3
2. Field Processing and Delimiters
Using a Custom Delimiter
# CSV file with a comma delimiter
awk -F',' '{print $1, $2}' data.csv
# TSV file with a tab delimiter
awk -F'\t' '{print $1}' data.tsv
# Multiple delimiters
awk -F'[:,]' '{print $1}' file.txt # Delimiter is : or ,
# Regular expression as the delimiter
awk -F'[ \t]+' '{print $1}' file.txt # One or more spaces or tabs
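The following sketch compares how the same input splits under different delimiters. The sample strings are made up for illustration. It also shows one difference that is easy to miss: default splitting ignores leading blanks, while an explicit whitespace regex treats them as a separator and produces an empty first field.

```shell
line='alice:dev,ops:42'

# Split on ":" only: the second field keeps its comma.
by_colon=$(printf '%s\n' "$line" | awk -F':' '{ print $2 }')

# Split on either ":" or ",": the bracket expression is a regex.
by_either=$(printf '%s\n' "$line" | awk -F'[:,]' '{ print NF, $2, $3 }')

# Default splitting collapses runs of blanks and ignores leading blanks.
by_default=$(printf '  a   b\tc\n' | awk '{ print NF, $1 }')

# An explicit regex treats the leading blanks as a separator,
# so the first field is empty and NF is one higher.
by_regex=$(printf '  a   b\tc\n' | awk -F'[ \t]+' '{ print NF }')

printf '%s\n' "$by_colon" "$by_either" "$by_default" "$by_regex"
```

If the input may start with blanks, the default field splitting is usually the safer choice.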
Field Operations
# Print with a custom separator
awk '{print $1 " - " $2}' file.txt
# Calculate the total of a field
awk '{sum += $3} END {print sum}' numbers.txt
# Average of a field (guarded against empty input)
awk '{sum += $1; count++} END {if (count) print sum/count}' data.txt
# Find max/min
awk 'max < $1 || NR==1 {max = $1} END {print max}' data.txt
awk 'min > $1 || NR==1 {min = $1} END {print min}' data.txt
# Count non-empty fields
awk '{for(i=1;i<=NF;i++) if($i!="") count++} END {print count}' file.txt
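As a worked example, the aggregates above can be computed in a single pass. This is a sketch on three made-up numbers. The NR == 1 condition seeds max and min from the first record, which matters because an uninitialized variable compares as 0 and would give a wrong maximum for all-negative data.

```shell
tmp=$(mktemp)
printf '10\n4\n25\n' > "$tmp"

stats=$(awk '
    NR == 1 || $1 > max { max = $1 }   # seed on the first record
    NR == 1 || $1 < min { min = $1 }
    { sum += $1 }
    END {
        if (NR > 0)                    # avoid dividing by zero on empty input
            printf "sum=%d avg=%.2f min=%d max=%d\n", sum, sum / NR, min, max
    }
' "$tmp")

echo "$stats"
rm -f "$tmp"
```

For this input the command prints `sum=39 avg=13.00 min=4 max=25`.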
3. Pattern Matching
Pattern Types
# Exact match
awk '$1 == "value"' file.txt
# Regex match
awk '$2 ~ /regex/' file.txt
# Negation
awk '$2 !~ /regex/' file.txt
# Numeric comparison
awk '$3 > 100' file.txt
awk '$3 < 50' file.txt
awk '$3 >= 100 && $3 <= 200' file.txt
# String comparison
awk '$1 > "m"' file.txt # Lexicographic comparison
# Multiple conditions
awk '$1 == "admin" && $3 > 1000' file.txt
awk '$1 == "user" || $1 == "admin"' file.txt
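Whether a comparison is numeric or lexicographic depends on the operands, which can be surprising. The sketch below uses made-up input to show the rule: a field that looks like a number compares numerically against a numeric constant, but comparing it against a string constant forces a string comparison.

```shell
# The field "10" looks numeric, so comparing it with the number 9 is numeric.
numeric=$(echo '10' | awk '{ print ($1 > 9) }')

# Comparing with the string constant "9" forces a string comparison,
# and "10" sorts before "9" character by character.
string=$(echo '10' | awk '{ print ($1 > "9") }')

# Adding 0 coerces a value to a number when the intent is numeric.
coerced=$(echo '10abc' | awk '{ print ($1 + 0 > 9) }')

printf '%s\n' "$numeric" "$string" "$coerced"
```

The three commands print 1, 0, and 1. The parentheses are needed because a bare `>` after print is parsed as output redirection.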
Built-in Patterns and Variables
# BEGIN - runs before any input is processed
awk 'BEGIN {print "Header"} {print}' file.txt
# END - runs after all input has been processed
awk '{sum += $1} END {print "Total:", sum}' file.txt
# NR - record (line) number
awk 'NR==1 {print "First line"}' file.txt
awk 'NR%2==0' file.txt # Even lines
awk 'NR>1 && NR<=10' file.txt # Lines 2-10
# NF - number of fields
awk 'NF > 5' file.txt # Lines with more than 5 fields
awk 'NF == 0 {print "Empty line"}' file.txt
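BEGIN, END, NR, and NF are often combined in one program. The sketch below assumes a made-up, space-separated file with a header line. It skips the header, counts blank lines, flags lines with an unexpected number of fields, and prints a summary at the end.

```shell
tmp=$(mktemp)
printf 'name qty\nbolt 5\n\nnut 3 extra\n' > "$tmp"

report=$(awk '
    BEGIN   { print "start" }                   # before any input is read
    NR == 1 { next }                            # skip the header line
    NF == 0 { blank++; next }                   # count empty lines
    NF != 2 { print "line " NR ": " NF " fields"; next }
            { total += $2 }
    END     { print "total=" total, "blank=" blank, "lines=" NR }
' "$tmp")

echo "$report"
rm -f "$tmp"
```

Rules are tried in order for each line, and `next` stops processing the current line, so only well-formed data lines reach the rule that adds to the total.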
4. Log Processing with AWK
Apache/Nginx Log Analysis
# Count requests per IP
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -20
# Or do the counting in AWK itself
awk '{count[$1]++} END {for(ip in count) print count[ip], ip}' access.log | sort -rn | head -20
# Find 404 errors with their referer
awk '$9 == 404 {print $1, $7, $11}' access.log
# Calculate the average response time
# (assumes the log format appends the request time in ms as the last field;
# the default combined format does not log it, and $10 there is the byte count)
awk '{sum += $NF; count++} END {if (count) print "Avg response time:", sum/count "ms"}' access.log
# Find the most requested URLs
awk '{url[$7]++} END {for(u in url) print url[u], u}' access.log | sort -rn | head -10
# Bandwidth usage per IP
awk '{bytes[$1] += $10} END {for(ip in bytes) print bytes[ip], ip}' access.log | sort -rn | head -20
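To see which field is which, here is a sketch that runs three of the commands above on made-up log lines in the combined log format. The addresses, paths, and sizes are illustrative assumptions. With default whitespace splitting, $1 is the client IP, $7 the request path, $9 the status code, $10 the response size in bytes, and $11 the quoted referer.

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1000 "-" "curl/8.0"
10.0.0.2 - - [01/Jan/2024:10:00:01 +0000] "GET /missing HTTP/1.1" 404 200 "http://example.com/" "curl/8.0"
10.0.0.1 - - [01/Jan/2024:10:00:02 +0000] "GET /index.html HTTP/1.1" 200 1500 "-" "curl/8.0"
EOF

# Requests per IP, busiest first.
hits=$(awk '{ count[$1]++ } END { for (ip in count) print count[ip], ip }' "$tmp" | sort -rn)

# 404 responses with client, path, and referer.
not_found=$(awk '$9 == 404 { print $1, $7, $11 }' "$tmp")

# Bytes sent per IP, largest first.
bytes=$(awk '{ b[$1] += $10 } END { for (ip in b) print b[ip], ip }' "$tmp" | sort -rn)

printf '%s\n' "$hits" "$not_found" "$bytes"
rm -f "$tmp"
```

The order of a `for (key in array)` loop is unspecified, which is why the results are piped through sort rather than printed directly.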
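System Log Analysis
# Error count per hour from syslog (assumes the traditional "Mon DD HH:MM:SS" timestamp)
awk '/error/ {hour=substr($3,1,2); count[hour]++} END {for(h in count) print h, count[h]}' /var/log/syslog
# Failed SSH login attempts per source IP
# (the IP is counted from the end of the line because "invalid user" entries shift the field positions)
awk '/Failed password/ {print $(NF-3)}' /var/log/auth.log | sort | uniq -c | sort -rn | head -10
# Total disk space used, from df output
# (-k reports 1 KiB blocks so the values can be summed; -h values such as "12G" cannot)
df -kP | awk 'NR>1 {sum += $3} END {print "Total used:", sum/1024/1024 " GB"}'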
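Field positions in free-form log messages are fragile. The sketch below uses made-up auth.log lines in the usual OpenSSH wording to show that a fixed position such as $11 holds the IP for an existing user but holds the user name when the message says "invalid user". Counting from the end of the line with $(NF-3) gives the IP in both cases, assuming every line ends in "from IP port N ssh2".

```shell
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Jan  1 10:00:00 host sshd[100]: Failed password for root from 192.0.2.1 port 4242 ssh2
Jan  1 10:00:05 host sshd[101]: Failed password for invalid user bob from 192.0.2.2 port 4243 ssh2
Jan  1 10:00:09 host sshd[102]: Failed password for root from 192.0.2.1 port 4250 ssh2
EOF

# Fixed position: on the "invalid user" line, $11 is the user name.
by_position=$(awk '/invalid user/ { print $11 }' "$tmp")

# Counting from the end of the line finds the IP on every line.
from_end=$(awk '/Failed password/ { n[$(NF-3)]++ }
                END { for (ip in n) print n[ip], ip }' "$tmp" | sort -rn)

printf '%s\n' "$by_position" "$from_end"
rm -f "$tmp"
```

Here the fixed position yields "bob", while the end-relative version reports two attempts from 192.0.2.1 and one from 192.0.2.2.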
5. AWK Scripting and File Processing
Multi-line AWK Script
Save to a file named process.awk:
#!/usr/bin/awk -f
BEGIN {
    FS = ","
    OFS = " | "
    print "Name", "Department", "Salary"
    print "----", "----------", "------"
}
{
    total += $3
    count++
    if ($3 > 50000) {
        print $1, $2, "$" $3
    }
}
END {
    print ""
    if (count > 0)
        print "Average Salary: $" total/count
    print "Total Employees: " count
}
Run it with:
chmod +x process.awk
./process.awk employees.csv
# or
awk -f process.awk employees.csv
Data Transformation
# Convert CSV to TSV
awk 'BEGIN {FS=","; OFS="\t"} {$1=$1; print}' input.csv > output.tsv
# Format currency
awk '{printf "$%.2f\n", $1}' prices.txt
# Pad numbers with leading zeros
awk '{printf "%04d\n", $1}' numbers.txt
# Date formatting
awk '{gsub(/-/,"/"); print}' dates.txt # Replace - with /
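The `$1=$1` in the CSV-to-TSV command looks like a no-op, but it is what makes OFS take effect. The sketch below uses made-up input to show the difference, plus a printf call that combines zero padding and fixed precision. Note that splitting on a plain comma is only a rough approximation of CSV: it does not handle quoted fields that contain commas.

```shell
# Without an assignment to a field, $0 is printed unchanged and OFS is ignored.
unchanged=$(echo 'a,b,c' | awk 'BEGIN { FS = ","; OFS = "\t" } { print }')

# Assigning to any field (even to itself) rebuilds $0 using OFS.
rebuilt=$(echo 'a,b,c' | awk 'BEGIN { FS = ","; OFS = "\t" } { $1 = $1; print }')

# printf gives explicit control over width and precision.
formatted=$(echo '7 3.14159' | awk '{ printf "%04d|%.2f\n", $1, $2 }')

printf '%s\n' "$unchanged" "$rebuilt" "$formatted"
```

The first command prints `a,b,c` untouched, the second prints the same fields separated by tabs, and the third prints `0007|3.14`.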
Conclusion
AWK is a very powerful tool for text processing and data extraction. By combining pattern matching, field processing, and arithmetic, AWK can replace many more complex text-processing tools.
When to Use AWK:
– Processing structured text data (CSV, TSV, logs)
– Extracting and transforming data
– Calculations on data
– Reporting and summarization
– One-liner text processing
Alternatives:
– sed for simple text substitution
– grep for pattern matching
– cut for simple field extraction
– perl for complex scripting
Tips:
– Always test with sample data first
– Use -F to set the delimiter
– Print intermediate results when debugging
– Combine with pipes for complex workflows
Written by
Hendra Wijaya