


Question

Really open question here. I'm not after an answer.. only advice. Any past experience people have had with re-factoring legacy systems that they could pass on would be amazing. Here's some information about the software I am to re-factor (you'll cringe!):

Web application
Database driven (MySQL)
PHP4 and PHP5
Most of the code is PHP4
Nearly all the code is procedural
Code that is PHP5 isn't OO.. example: 10,000 line+ file with one class and one function
Global variables used everywhere
No source control was used to write the software (you can see from the comments in the code)
Massive amounts of code repetition
No separation of concerns - user interface and logic are combined everywhere
Application relies on order of database tables
Few code comments
250,000+ lines of code
Application in heavy use

Basically, the software is our core product and I have been hired to do a major re-factor (amongst other things). It's a massive task and I can't just dive in and fix all the little things.. I need an overall strategy. I've written some scripts to tidy indentation up, removed commented-out code everywhere and made the project into a repo but now it's time to do the real stuff.

I kind of have a vague idea but I'm not sure how to go about it. I could leave the current code alone and write a layer of software over it that abstracts away all the horribleness. It would be good if the new layer used some sort of MVC architecture. At the same time I would go into the current code and remove redundancies, because otherwise the new layer would still be calling bad code and the application could slow down even more.

As you can see.. need some clues/hints/tips/advice/experiences!

Thanks very much

Explanation / Answer

Test-Driven Reverse Engineering.

Reverse engineer some user epics; not all the stories, but the "big picture" suites of functionality that comprise the system.

Prioritize based on quality, replaceability, value, etc., etc. This is subjective, but you need to pick a piece to work on.

Write unit tests as best you can that the legacy code will pass.
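As a rough illustration of what "tests that the legacy code will pass" can look like: record what the legacy routine actually returns for a handful of inputs, then assert against those recordings. This is a language-agnostic sketch written in Python; `legacy_order_total`, its discount rule, and the recorded cases are all hypothetical stand-ins, not anything from the real application.

```python
def legacy_order_total(items, customer_type):
    """Hypothetical stand-in for an untouched legacy routine."""
    total = 0
    for price, qty in items:
        total = total + price * qty
    if customer_type == "trade":
        total = total * 0.9
    return round(total, 2)


# Expected values are recorded by running the legacy code once,
# not derived from a spec. They pin down current behaviour, quirks included.
GOLDEN_CASES = [
    (([(10.0, 2), (5.5, 1)], "retail"), 25.5),
    (([(10.0, 2), (5.5, 1)], "trade"), 22.95),
    (([], "retail"), 0),
]


def characterize(fn, cases):
    """Return every case where fn's output differs from the recording."""
    mismatches = []
    for args, expected in cases:
        actual = fn(*args)
        if actual != expected:
            mismatches.append((args, expected, actual))
    return mismatches


if __name__ == "__main__":
    # An empty list means the code still behaves as recorded.
    print(characterize(legacy_order_total, GOLDEN_CASES))
```

The same `characterize` helper can later be pointed at a replacement function, so the recordings act as the acceptance test for the rewrite.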

Write new code for just that piece, so that it passes the same tests.

Now comes the hard part. Bridges.

You need to deploy the new part and maintain the legacy while you're moving forward. You'll need to "bridge" data from legacy to new (and new back to legacy).
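A bridge can be as small as a pair of translation functions that are tested as a round trip. The sketch below is only one possible shape, again in Python for brevity. The legacy column names (`cust_nm`, `ord_dt`, `amt_pence`), their formats, and the `Order` type are assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import date, datetime
from decimal import Decimal


@dataclass(frozen=True)
class Order:
    """Hypothetical record shape used by the new code."""
    customer_name: str
    placed_on: date
    total: Decimal


def legacy_to_new(row: dict) -> Order:
    """Translate one legacy row (assumed all-string columns) into an Order."""
    return Order(
        customer_name=row["cust_nm"].strip(),
        placed_on=datetime.strptime(row["ord_dt"], "%Y%m%d").date(),
        total=Decimal(row["amt_pence"]) / 100,
    )


def new_to_legacy(order: Order) -> dict:
    """Translate an Order back into the row shape the legacy code expects."""
    return {
        "cust_nm": order.customer_name,
        "ord_dt": order.placed_on.strftime("%Y%m%d"),
        "amt_pence": str(int(order.total * 100)),
    }


if __name__ == "__main__":
    row = {"cust_nm": "Ada Lovelace", "ord_dt": "20240131", "amt_pence": "1999"}
    order = legacy_to_new(row)
    print(order)
    print(new_to_legacy(order) == row)
```

Keeping both directions in one place makes it obvious what to delete once the legacy side of that data is gone.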

You'll also have to write a bunch of Apache rewrite rules to route requests to the new code so that the URLs don't appear to change too much or too disruptively. Some change is inevitable as you rewrite. Some change is disruptive to users and requires mod_rewrite or 301 redirects or both.
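For a sense of scale, the rules for one migrated epic might look something like the fragment below. The paths (`order_list.php`, `/orders/`, `/new/index.php`) are invented for the example, and it assumes mod_rewrite is enabled and that the new code sits behind a single front controller.

```apache
RewriteEngine On

# Old script URL that users have bookmarked: send a permanent redirect
# to the new, cleaner URL (hypothetical paths).
RewriteRule ^/?order_list\.php$ /orders/ [R=301,L]

# Internally hand everything under /orders/ to the new front controller.
# The URL in the browser does not change.
RewriteRule ^/?orders/(.*)$ /new/index.php/orders/$1 [L]
```

Anything not matched by a rule still falls through to the legacy scripts, which is what lets the two systems run side by side.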

Once you've got that first epic (set of stories) deployed in production, you need to then rethink your epics and decomposition of what's left in the legacy that's still valuable. Reprioritize.

Write unit tests for the next epic. Rewrite the code. Create and revise the bridges. Deploy the next chunk.

You'll repeat the "partition, prioritize, test, refactor, deploy" loop until what's left over as legacy is of so little value that you no longer need any bridges. Then you decommission the old. Delete the code. Burn the bridges. No going back.