Question
I have a client who is given a tab-delimited .txt file containing hundreds of thousands of rows.
I have a user story as follows:
As a user I want to take the text file and add a new value at the end of each line which contains the concatenated value of two of the columns.
For example, if the file read:
text_one text_two
I need to output the following (preferably to a .txt file):
text_one text_two text_onetext_two
My first approach was to ask the vendor supplying the file to do the concatenation before providing it. The easiest way to solve a problem is to eliminate it, right? However, they are very uncooperative and have point-blank refused.
I've looked at building a simple JavaScript application that does this client-side, so a non-technical user could select the file using a file selector. This approach has a few problems:
The file could be over a GB in size, so it can't be loaded straight into memory. I've tried, and the browser crashes.
As far as I know, there is no means to write a file from JavaScript, so I'd need to output the content to the screen and have the user save it (somehow).
I was thinking that if I could get around the file size limitations, I could just output the edited content to the page and have the user save the page as a .txt file. However, I think there is a better way than using JavaScript that will still accommodate the user's lack of technical know-how.
Please consider this question to be stack-agnostic, but bear in mind that a nice little shell script or Python script would be deemed unsuitable unless there is a way of "packaging" it nicely for a non-technical user.
Explanation / Answer
The code bit is trivial. The best way to do the process is to put something between the vendor and the user.
My preference with these things is to get the vendor to transfer the file via FTP (which I hope they are already doing, given its size). Write your code to grab the file, process it, and put it where the user expects to find it, then schedule it to run as appropriate (every 5 minutes, every day, or whatever).
This is a very common problem.
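For the "process it" step, here is a minimal sketch in Python of what the trivial code bit might look like. It reads one line at a time, so a file over a GB never has to fit in memory. The function name, the UTF-8 encoding, the default of concatenating the first two columns, and the choice to copy short or blank rows through unchanged are all assumptions made for illustration; none of them come from the question.

```python
def append_concatenated_column(src_path, dst_path, first=0, second=1):
    """Copy src_path to dst_path, adding one tab-separated field per row.

    The new field is column `first` followed directly by column `second`.
    Column indexes are zero-based. The defaults are assumptions.
    """
    # newline="" leaves the original line endings untouched.
    # UTF-8 is assumed; change it if the vendor uses another encoding.
    with open(src_path, "r", encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:  # streams the file one line at a time
            row = line.rstrip("\r\n")
            ending = line[len(row):]  # "\n", "\r\n", or "" on the last line
            fields = row.split("\t")
            if len(fields) <= max(first, second):
                # Assumption: rows with too few columns (such as blank
                # lines) are written through unchanged.
                dst.write(line)
                continue
            dst.write(row + "\t" + fields[first] + fields[second] + ending)
```

A scheduled job could call something like append_concatenated_column("incoming/data.txt", "processed/data.txt") after each transfer (both paths are hypothetical), so the user only ever sees the finished file.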