SF Active Blues

I finally got down to refactoring the “Page” class and the pages created by that class.  It has been pretty painful.

It hurts, mainly, because I started with publish.php, which loads publish.inc, a script that’s over 600 lines long.

The code has the GET and POST requests handled in one large function, execute().

So, the first fix was to split up the GET and POST handlers, and rewrite execute() to dispatch correctly.

Then, the POST had three different modes: preview, comment, and publish. Those were broken out into three methods.

It’s not that simple, because these three modes are entangled, and they all touch the $_POST global.

It’s still like old-style PHP, where you build up a bunch of fragments of HTML and text, and then meld them all into a template.  Logic pyramids control the execution path.

It’s completely untestable.

I was hoping that the classes in the framework could be adapted so these front-end scripts could execute without modification.  Unfortunately, that’s not going to work.

There’s just too much code smell. Partly, the original code had problems, but adding anti-spam and other code added problems.

Technique

When I have to disentangle code, I start out by digging into the logic pyramids and converting the deepest code into methods, then turning them into functions:

Before:

if ($foo==1) {
    if ($bar==1) {
        // this is the code here
        // many lines of code
    }
}

After:

if ($foo==1) {
    if ($bar==1) {
        $this->fooAndBar();
    }
}

function fooAndBar() {
    // this is the code here
    // many lines of code
}

It’s longer, but the tip of the pyramid is shorter.

This can also be done one level up, so the logic can be retained.

if ($foo==1) {
    if ($bar==1) {
        if ($baz==1) {
            // this is the code here
            // many lines of code
        }
    }
    // some more code
    if ($bar==1) {
        // additional code
    }
}

After:

if ($foo==1) {
    $this->doFoo($bar);
}

function doFoo($bar) {
    if ($bar==1) {
        if ($baz) $this->fooAndBarAndBaz();
    }
    // some more code
    if ($bar==1) {
        // additional code
    }
}

function fooAndBarAndBaz() {
    // this is the code here
    // many lines of code
}

These changes don’t alter the code – they just make it more readable.  The thing is, once  I have everything rearranged, I can see more lines of code on the page, and the indentation is shallower, so it’s easier to alter the code.

With everything in class methods, it’s also testable with PHPUnit or any other testing framework.

Additionally, you can see if variable scopes permeated the original code.

In the example, the doFoo() function takes a $bar argument: it needs some input that alters the logic.  So, it should be tested.

fooAndBarAndBaz() doesn’t require any arguments. That’s good – it means that the function is really a constant.  You don’t need to test it.

Additionally, all the globals like $_POST need to be passed into the functions.  Some objects are replaced with calls to the Pimple dependency injection container. This enables better testing.

Architecture

The other problem with these controller-like page is that it subverted some of the original intent of “Page”s – it was clearly intended to hold mainly anything related to rendering a page.

So, there are a lot of lines of code to translate code, and then set up arrays to pass to the templating system.

Several lines in there take care of data validation, and check for spam.  Several lines in there make calls to the Article object to save the data, and to regenerate a few pages.  Spread throughout are some lines that sanitize data, or escape the output.

In other words: there’s risky code throughout the 600 lines of code.  It might be only 20 or 30 lines, but those are the dangerous lines: like little bits of glass in the lawn.

The architecture needs to be rearranged so there are three distinct phases:

  1. Input validation and conversion of text into PHP data.
  2. Using that data, transforming it, and possibly saving it.
  3. Escaping and merging the data with a template for output.

Additionally, phases 1 and 3 should add in “middleware” to filter the input and output.

File Uploads

There’s yet another problem, and that’s file uploads.

While it’s simple to accept an uploaded file, and it’s simple to preview a form’s data before saving it, it’s not possible to do both in the current code.

This is not just a huge problem for users – it reveals a real flaw in the architecture.

The fix is a little complicated:

  1. Save all forms to a drafts table, regardless of whether it’s a publish or preview.
  2. Save all files with each post, and add it to the drafts table, same as in #1.
  3. On “publishing”, copy the rows from the drafts table to the webcast table, and then delete the rows from the drafts table.

So, the new process is that all posts go into a drafts table.  If the “Publish” button was pressed, we do an importation into webcast.

If “Preview” was pressed, we just accept the data and re-display it as part of the preview.  Upon previewing, the user can add an additional upload.

To reduce confusion, the old method of querying the user for the number of uploads is removed, and the user must upload one file at a time.  The “Upload” button will be the same as a press on “Preview”.

This is a serious change, but it fixes one real US problem with the software.

Conclusion

So, I’m in the middle of this refactoring.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s