Question
I have a huge text file that looks like this:
2013/255-12:23:34 - Some event happen
2013/255-12:23:34 - Some event happen
2013/255-12:23:34 - Some event happen
2013/255-12:23:34 - Some event happen
***Connection Lost
2013/256-12:23:34 - Some event happen
2013/256-12:23:34 - Some event happen
2013/256-12:23:34 - Some event happen
2013/256-12:23:34 - Some event happen
2013/257-12:23:34 - Some event happen
2013/257-12:23:34 - Some event happen
I would like to use Perl to split this file into smaller files with the date as the file name.
So it would look like this when done:
2013_255.txt will contain:
2013/255-12:23:34 - Some event happen
2013/255-12:23:34 - Some event happen
2013/255-12:23:34 - Some event happen
2013/255-12:23:34 - Some event happen
***Connection Lost
2013_256.txt will contain:
2013/256-12:23:34 - Some event happen
2013/256-12:23:34 - Some event happen
2013/256-12:23:34 - Some event happen
2013_257.txt will contain:
2013/257-12:23:34 - Some event happen
2013/257-12:23:34 - Some event happen
Thanks,
Explanation / Answer
#!/bin/sh if [ "$#" -lt 5 ] || [ ! -f "$1" ] then echo "usage: $0 inputfile numfiles opdir fileext region" >&2 exit 1 fi awk -v NFILES="$2" -v FNAME="$3/$5_%d.$4" ' # Do not print the first file -- just count lines NR==FNR { next } # First line of the second read through the file. FNR==1 { HEADER=$0 MAXLINES=sprintf("%d", (NR-1)/NFILES); LINES=MAXLINES next } # skip to the next file and print header if exceeded maxlines (LINES >= MAXLINES) { LINES=0; FILE++; print HEADER > sprintf(FNAME,FILE); } # Print all lines into the current file { print > sprintf(FNAME, FILE); LINES++ } # Yes, we give awk the same file twice. On the first read, it just counts # lines. On the second, it decides which lines go into what file. ' "$1" "$1"