Two Algorithms Go To a Foo Bar

A friend and I had an interesting discussion this week about PHP foreach loops, array_* functions, and the subjectivity of code readability. It was interesting enough that I wanted to document it here and share it with my broader network for consideration and discussion.

Let’s say you have an array of options you’ve assigned to a class property:

$this->options = [
    [
        'text' => 'Good',
        'value' => 'the-good-one',
    ],
    [
        'text' => 'Better',
        'value' => 'the-better-one',
    ],
    [
        'text' => 'Best',
        'value' => 'the-best-one',
    ],
];

This array gets used in a system to select a default option for the list. The calling code searches within the array to see whether there is a match, and if there is, it returns the value of that match. So, for instance, if I passed the string the-good-one to my search lookup, I would get the-good-one back.

If someone wanted to get the value using the text key, the same behavior applies: I pass in Good, and once again, I get the-good-one back. If, then, I pass in something that’s not located within the array, I simply get back the value I passed in – the-worst-one or Worst would return the-worst-one or Worst, respectively.

You’re tasked with writing a function that performs these steps. How do you solve it?

The Loop Approach

One possible algorithm for finding a matching value in this set is to use a loop. It might look something like this:

public function get_matching_option_by_value_or_text( $value ) {
    foreach ( $this->options as $option ) {
        if ( $option['value'] === $value ) {
            return $value;
        }
    }

    foreach ( $this->options as $option ) {
        if ( $option['text'] === $value ) {
            return $option['value'];
        }
    }

    return $value;
}
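To try the method against the behavior described earlier, here is a minimal sketch that drops it into a small class. The class name OptionList and the private property are my own assumptions for illustration; only the method body comes from the example above.

```php
<?php

// Hypothetical wrapper class, used only to make the method runnable.
class OptionList {
    private $options = [
        [ 'text' => 'Good',   'value' => 'the-good-one' ],
        [ 'text' => 'Better', 'value' => 'the-better-one' ],
        [ 'text' => 'Best',   'value' => 'the-best-one' ],
    ];

    public function get_matching_option_by_value_or_text( $value ) {
        // First pass: the input is already a known value.
        foreach ( $this->options as $option ) {
            if ( $option['value'] === $value ) {
                return $value;
            }
        }

        // Second pass: the input is a display text, so map it to its value.
        foreach ( $this->options as $option ) {
            if ( $option['text'] === $value ) {
                return $option['value'];
            }
        }

        // No match in either pass: hand the input back unchanged.
        return $value;
    }
}

$list = new OptionList();

echo $list->get_matching_option_by_value_or_text( 'the-good-one' ), PHP_EOL;  // the-good-one
echo $list->get_matching_option_by_value_or_text( 'Good' ), PHP_EOL;          // the-good-one
echo $list->get_matching_option_by_value_or_text( 'the-worst-one' ), PHP_EOL; // the-worst-one
echo $list->get_matching_option_by_value_or_text( 'Worst' ), PHP_EOL;         // Worst
```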

Loops are one of the first control structures that new developers learn, and they make these kinds of problems easy to solve, but they are not without tradeoffs. Though simple, this example carries some inherent complexity: 1) moderate nesting of control flow, and 2) developer intent that is clear-ish but not fully clear (reading the code, you have to stop and consider why there is a return statement within the loop).

The complexity comes from the algorithm having two jobs: determine whether a given value exists in the value index, and, if it doesn’t, find the matching value for it if it exists in the text index.

Let’s look at an alternative using built-in PHP array functions.

The PHP array_* Approach

The same result can be achieved without a loop, using two built-in PHP array functions: array_search and array_column. The array_search function returns the key of the first matching element, or false if nothing matches, and array_column pulls the values for a single key out of each row of an array into a flat list. Note that the resulting list is re-indexed from 0 rather than preserving the original keys. Let’s take a look:

public function get_matching_option_by_value_or_text( $value ) {
    // Strict comparison (third argument) mirrors the === used in the loop version.
    $value_index = array_search( $value, array_column( $this->options, 'value' ), true );

    if ( false !== $value_index ) {
        return $value;
    }

    $text_index = array_search( $value, array_column( $this->options, 'text' ), true );

    if ( false !== $text_index ) {
        return $this->options[ $text_index ]['value'];
    }

    return $value;
}

There is both more and less complexity in this approach. More, because it relies on two functions that fewer developers might be familiar with, and because we’re making nested inline function calls (e.g., passing the result of array_column as the second argument to array_search). It also has more variable assignments: one because we want to know whether the input was found under the value key, and another because we want to know whether it was found under the text key.
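To make the nested call easier to follow, here is a small sketch that pulls the intermediate values out into their own variables. The standalone $options variable and the other variable names are assumptions for illustration, and I pass true as the third argument to array_search so the comparison is strict, like the === in the loop.

```php
<?php

$options = [
    [ 'text' => 'Good',   'value' => 'the-good-one' ],
    [ 'text' => 'Better', 'value' => 'the-better-one' ],
    [ 'text' => 'Best',   'value' => 'the-best-one' ],
];

// Step 1: flatten one column into a plain list.
// $values is [ 'the-good-one', 'the-better-one', 'the-best-one' ]
$values = array_column( $options, 'value' );

// Step 2: look for the input in that list.
$found   = array_search( 'the-better-one', $values, true ); // int 1
$first   = array_search( 'the-good-one', $values, true );   // int 0
$missing = array_search( 'the-worst-one', $values, true );  // false

// A match at key 0 is falsy, which is why the method checks
// `false !== $index` instead of a plain `if ( $index )`.
var_dump( $found, $first, $missing );
```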

At the same time, there is also less complexity, because the control flow is subjectively easier to follow: I simply need to read top to bottom to understand what’s happening, and if I need context around the specific functions being used, I can read the PHP documentation to understand how those functions work. In plain language, I can see “If there is a value index, return the value. If there is a text index, return the value that matches it from the options. Otherwise, return the value.” The intent behind this approach, I’d argue, is clearer, because any questions the algorithm raises can be answered from the documentation, whereas questions about the first approach might need to be answered by running the code and stepping through it with a debugger, or by asking the developer who wrote it.

Programming “The Right Way”

Spoiler alert: there is no “right” way. The beauty of programming is that there are lots of different ways to tackle problems. To me, the most important thing is how easily someone assigned at some future date to modify your code can understand what you’ve written. In my view, it’s easier to follow control flows that have reduced levels of nesting and that use native language constructs. However, I’m just one person, with one worldview, and one collection of knowledge. Everyone’s technical level and programming approaches differ, so it’s important not to be dogmatic about a particular approach, and to appreciate that broader audiences might understand how things work in a different way.

At the end of the day, the one true measure of your algorithm is how many tests support it, so that you can more confidently and easily change it when the tides shift. :)
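As a rough illustration of what that safety net might look like, here is a sketch that runs both approaches against the same table of cases. The function names find_with_loop and find_with_array_functions are hypothetical, the options are passed in as a parameter rather than read from a class property, and a real project would more likely use a test framework than a hand-rolled loop.

```php
<?php

// Loop approach, adapted to take the options as a parameter.
function find_with_loop( array $options, $value ) {
    foreach ( $options as $option ) {
        if ( $option['value'] === $value ) {
            return $value;
        }
    }
    foreach ( $options as $option ) {
        if ( $option['text'] === $value ) {
            return $option['value'];
        }
    }
    return $value;
}

// array_* approach, using strict comparison to mirror the loop's ===.
function find_with_array_functions( array $options, $value ) {
    if ( false !== array_search( $value, array_column( $options, 'value' ), true ) ) {
        return $value;
    }
    $text_index = array_search( $value, array_column( $options, 'text' ), true );
    if ( false !== $text_index ) {
        return $options[ $text_index ]['value'];
    }
    return $value;
}

$options = [
    [ 'text' => 'Good',   'value' => 'the-good-one' ],
    [ 'text' => 'Better', 'value' => 'the-better-one' ],
    [ 'text' => 'Best',   'value' => 'the-best-one' ],
];

// Input => expected result, taken from the behavior described in the post.
$cases = [
    'the-good-one'  => 'the-good-one',
    'Good'          => 'the-good-one',
    'Best'          => 'the-best-one',
    'the-worst-one' => 'the-worst-one',
    'Worst'         => 'Worst',
];

foreach ( $cases as $input => $expected ) {
    foreach ( [ 'find_with_loop', 'find_with_array_functions' ] as $fn ) {
        $actual = $fn( $options, $input );
        if ( $actual !== $expected ) {
            throw new RuntimeException( "$fn( '$input' ) returned '$actual', expected '$expected'" );
        }
    }
}

echo 'All cases passed', PHP_EOL;
```

With a table like this in place, swapping one implementation for the other becomes a much less nervous change.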

Addendum

I edited this post after a conversation with another friend, Sal Ferrarello, as I’d realized the original examples for the loop were incorrect. Both the loop and array_search examples have been revised to behave the same way: returning $value if it was found in the 'value' index of the original array, otherwise returning the 'value' index if $value was found in the 'text' index. Finally, both methods return the $value parameter if it was not found in either search. Thank you, Sal, for noticing the inconsistencies in the original post.

There are also some notable limitations with array_column in PHP, which do in fact make the loop example preferable. The array_column function creates a new array from the values, so if any option does not contain that key, the indexes of the new array will not match those of the $options array. Thus, it would seem, once again, that the simpler approach, even with its additional “grok-factor” (because of the early returns in a loop), might in fact be the superior one.
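Here is a small sketch of that mismatch. The data is hypothetical: I removed the 'text' key from the first option to trigger the problem, and the variable names are my own.

```php
<?php

// Hypothetical data: the first option has no 'text' key.
$options = [
    [ 'value' => 'the-good-one' ],
    [ 'text' => 'Better', 'value' => 'the-better-one' ],
    [ 'text' => 'Best',   'value' => 'the-best-one' ],
];

// array_column skips rows that lack the key and re-indexes from 0,
// so $texts is [ 0 => 'Better', 1 => 'Best' ].
$texts = array_column( $options, 'text' );

// 'Better' is found at index 0 of $texts, but it lives at $options[1].
$text_index = array_search( 'Better', $texts, true );

echo $options[ $text_index ]['value'], PHP_EOL; // the-good-one (the wrong option)
```

The lookup quietly returns the-good-one instead of the-better-one, with no error to hint that anything went wrong.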