Write a bash script to sort the lines in a text file by the number of characters
Question
Write a bash script to sort the lines in a text file by the number of characters on each line.
Conditions:
The input file will contain only printing characters, spaces, tabs, and newlines.
The input file will not contain lines longer than 512 characters.
Requirements:
Read from standard input, write to standard output.
Output all of the input lines unchanged.
Output the line with the fewest characters first, the line with the next fewest characters next, ... , and the line with the most characters last.
For lines with the same number of characters, output the line with the fewest nonblank characters first, the line with the next fewest nonblank characters next, ... , and the line with the most nonblank characters last.
Include all of the following:
comments with your name, the date, and the assignment
comments with instructions for using the program
descriptive names and/or comments explaining variables & functions
comments explaining any non-obvious control flow
indentation of code blocks
Hints:
work on the simplest cases first
think about the structure of your data and logic before you write any code
start with the smallest possible amount of code
make sure your existing code works before adding new code
add the smallest possible amount of new code, then test again
output variable values, loop counts, etc. for debugging
test your code with a variety of inputs
Explanation / Answer
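To sort by length while leaving equal-length lines in their input order, prefix each line with its character count, sort numerically on that count with sort -n -s, and then strip the count again. Or, to do your original (perhaps unintentional) sub-sorting of any equal-length lines, drop the -s.
In both cases, we have solved your stated problem by moving away from awk for your final cut.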
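As a rough sketch of the two variants, assuming awk is used only to prefix each line with its length and cut strips that prefix afterwards (the function names are made up for illustration, and -s is a GNU/BSD sort option rather than a POSIX one):

```shell
#!/bin/bash
# Both functions read lines on standard input and write them to standard output.

# Shortest line first; equal-length lines keep their input order (-s = stable).
sort_by_length_stable() {
    awk '{ print length($0), $0 }' | sort -n -s | cut -d' ' -f2-
}

# Shortest line first; equal-length lines are additionally ordered by their text,
# because without -s sort falls back to comparing the whole line.
sort_by_length_then_text() {
    awk '{ print length($0), $0 }' | sort -n | cut -d' ' -f2-
}
```

Because cut only removes everything up to the first space, runs of spaces and tabs inside the original line come through unchanged, for example with printf 'x   y\nab\n' | sort_by_length_stable.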
Lines of matching length - what to do in the case of a tie:
The question did not specify whether or not further sorting was wanted for lines of matching length. I've assumed that this is unwanted and suggested the use of -s (--stable) to prevent such lines being sorted against each other, and keep them in the relative order in which they occur in the input.
(Those who want more control of sorting these ties might look at sort's --key option.)
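The assignment as quoted above does ask for a tie-break: among lines of equal length, the one with fewer nonblank characters comes first. One way this might be handled with sort's key options is sketched below. It assumes "nonblank" means anything other than a space or tab, and the function name is hypothetical.

```shell
#!/bin/bash
# Reads lines on standard input, writes them sorted to standard output.
sort_by_length_then_nonblank() {
    awk '{
        nonblank = $0
        gsub(/[ \t]/, "", nonblank)   # remove spaces and tabs from the copy only
        # Prefix: total length, nonblank count, then the untouched line.
        print length($0), length(nonblank), $0
    }' |
    sort -s -k1,1n -k2,2n |           # key 1: length, key 2: nonblank count; -s keeps remaining ties in input order
    cut -d' ' -f3-                    # drop the two numeric prefixes
}
```

Only the copy held in the variable is modified, so $0 is never rebuilt and the original spacing survives.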
Why the question's attempted solution fails (awk line-rebuilding):
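It is interesting to note the difference between printing a record untouched and printing it after assigning to one of its fields.
They yield respectively the line with its original spacing, and the line with every run of separators collapsed to a single space.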
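A small experiment shows the contrast. The sample string is an assumption chosen for illustration, not the input from the original question.

```shell
#!/bin/bash
# Sample input with runs of three spaces between words (assumed example).
sample='hello   awk   world'

# Printing the record untouched keeps the original spacing.
unchanged=$(printf '%s\n' "$sample" | awk '{ print }')

# Assigning to any field makes awk rebuild $0, joining the fields
# with OFS, which is a single space by default.
rebuilt=$(printf '%s\n' "$sample" | awk '{ $1 = $1; print }')

printf '[%s]\n[%s]\n' "$unchanged" "$rebuilt"
# [hello   awk   world]
# [hello awk world]
```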
The relevant section of the gawk manual mentions only as an aside that awk rebuilds the whole of $0 (using the current fields and the output field separator, OFS) whenever you change one field. I guess it's not crazy behaviour. The manual has this:
"Finally, there are times when it is convenient to force awk to rebuild the entire record, using the current value of the fields and OFS. To do this, use the seemingly innocuous assignment:"
$1 = $1   # force record to be reconstituted
"This forces awk to rebuild the record."