rm -rf
Current happenings
So, a thing happened. A legacy piece of software turned belly up and left its users in a situation that was not only frustrating but also costly for the business and employees (due to this piece of software affecting payroll). At the time this happened, this particular application was fairly low on my priority list for refactoring. I intended on getting to it in several weeks, once I knocked out the bulk of the other work I have to do on my board. Well, best intentions are kind of like empty promises – eventually they come back to bite you in the ass.
Consequently I’ve dived into the cesspool of this application’s codebase and have been experiencing some hybrid of emotion (not sure what to call it) betweeen frustration, depression, discomfort and that feeling you get when you’re ready to start flipping tables. There’s markup blended in with business logic, more duplication than a suburb in Southern California, unclear variable names (including multiple single letter variables), quadruple nested loops, and home-baked solutions that achieve what native PHP language constructs are capable of. Talk about the perfectly imperfect piece of code that I dare say is actually one of the seven rooms in Developer Hell.
The good news: apparently the years of experience helps with kind of deducing wtf this person was thinking when they wrote it (I think they missed the Balmer peak by a few glasses of whiskey). The bad news: it’s forced me to spend more time afk as this is some really dense, unclear logic for doing something that, on the presentation layer, looks to be much more simple. I’m going to highlight the process I’ve been going through on this codebase, and hopefully give some good reference material for others in a similar situation.
Getting comfortable
As one does when digging into unfamiliar territory, I started off by getting a high-level view of what exactly this code is doing. In this case, it wasn’t the easiest task. For instance, within the first 100 lines, this gem was sitting there, staring at me like Samuel L Jackon in Black Snake Moan:
$max = sizeof($hoursArray);
$cleanedHours = 0.00;
$tmpArray = array();
$origType = "";
$totalHours = 0.00;
$firstDay = $hoursArray[0]['Date'];
for ($i = 0; $i < $max; $i++) {
if (($hoursArray[$i]['recordtype'] != 'Absent') && ($hoursArray[$i]['recordtype'] != 'ABSENT')) {
if ($hoursArray[$i]['Date'] == $firstDay) {
$totalHours = $totalHours + $hoursArray[$i]['Hours']; //$this->AddHours($totalHours,$hoursArray[$i]['Hours'])
} else {
$firstDay = $hoursArray[$i]['Date'];
$totalHours = $hoursArray[$i]['Hours'];
}
}
if (true) { //$hoursArray[$i]['recordtype'] != $origType || $hoursArray[$i]['recordtype'] == 'ABSENT'
$day = $hoursArray[$i]['Date'];
$origType = $hoursArray[$i]['recordtype'];
$origHours = $hoursArray[$i]['Hours'];
$cleanedHours = $hoursArray[$i]['Hours'];
for ($j = $i + 1; $j < $max; ++$j) {
if (($hoursArray[$j]['Date'] == $day) && ($hoursArray[$j]['recordtype'] == $origType)) {
$cleanedHours = $origHours + $hoursArray[$j]['Hours']; //$this->AddHours($origHours,$hoursArray[$j]['Hours'])
}
}
$status = "";
for ($h = 0; $h < sizeof($hoursApp); $h++) {
if (((string) $hoursApp[$h]['Date'] == (string) $day) && ($hoursApp[$h]['RecordType'] == $origType) && ($hoursApp[$h]['Hours'] == $cleanedHours)) {
$status = $hoursApp[$h]['Status'];
}
}
$tmpArray = array(
"Date" => $day,
"recordtype" => $origType,
"Hours" => $cleanedHours,
"Total" => $totalHours,
"Status" => $status
);
$cleanedArray[] = $tmpArray;
}
}
Or simplified:
Now, as any decently experienced developer knows that working with arrays can get really foggy – quickly. Throw in some for
loops and complexity starts to multiply. So, this code is doing a relatively basic operation. it takes in an array (results returned from a database query) and begins iterating over those results. It’s using sizeof($hoursArray)
for bounds checking; fairly normal.
The first if
block then checks if the agent was absent for that given record. If so, it then checks if it’s the first record in the array, and stores that record’s hours in the $totalHours
variable. If it’s not the first day, it overwrites the $firstDay
and $totalHours
variables. I’m still not sure why it does this, but that’s what it does. Next comes one of my favorite constructs in this codebase. The developer that wrote this absolutely loved the if (true)
checks. If what is true? How will this ever change? It doesn’t make sense to me, but hey, it made sense to them and they loved doing this.
So, if true (whatever that means), we dive into this next block that makes a bit more sense. Some variables are initialized with the current iteration’s values, and we dive into another loop. This loop starts iterating over the original array, just one index ahead. It then starts comparing values. As a side bar, this is what the original array comes back from the database as:
array (size=18)
0 =>
array (size=3)
'Date' => string '10/17/2019' (length=10)
'recordtype' => string 'Active' (length=6)
'Hours' => string '7.56' (length=4)
1 =>
array (size=3)
'Date' => string '10/17/2019' (length=10)
'recordtype' => string 'Break' (length=5)
'Hours' => string '.49' (length=3)
2 =>
array (size=3)
'Date' => string '10/16/2019' (length=10)
'recordtype' => string 'Active' (length=6)
'Hours' => string '7.49' (length=4)
3 =>
array (size=3)
'Date' => string '10/16/2019' (length=10)
'recordtype' => string 'Break' (length=5)
'Hours' => string '.51' (length=3)
...
As you can see, there are records for the same day. Each record has a different record type and hours value. This loop is iterating over the array, comparing the dates and adding up the hours on the same date where record types are the same. The reason for this is some records can have multiple Active
states for one day – we want those combined for an accurate overall view of that particular employee’s state. Anyways, if those conditions are met, we add up the hours for the current index $i
and the next index $j
.
Now we proceed to this next loop – this is one of those moments where I start to want to flip tables. This could be avoided with proper normalization across the databases. Instead, the same values are stored in two different databases, in two different tables, with one database table having a single column that contains the record’s status. So, in this next loop, we start iterating over that second array ($hoursApp
; keep in mind we’re iterating over $hoursArray
already), and doing some comparisons. If the dates match, the record types match, and the hours are the same, then we grab the status from the $hoursApp
array for this index.
A temporary array is set up, then we dump all those values in and push that temporary array into another array. Rinse, repeat for each record in $hoursArray
. Hopefully you’re still following me, because that section took nearly a day to reverse engineer and determine wtf exactly was going on. Subsequently, I’ve rewritten this section of code to look something like:
private function calculateDailyTotals($times)
{
$dailyTotals = [];
foreach ($times as $k => $v) {
$totaltime = 0;
// These are simply to promote readility. Multidimensional arrays can make the
// brain foggy when dealing with stuff like this.
$storedRecordType = $dailyTotals[$v['Date']][$v['recordtype']];
$storedHours = $dailyTotals[$v['Date']][$v['recordtype']]['hours'];
if ($storedRecordType == $v['recordtype']) {
$totaltime = $storedHours + $v['Hours'];
} else {
$totaltime = $v['Hours'];
}
$dailyTotals[$v['Date']][$v['recordtype']] = [
'hours' => $totaltime
];
}
return $dailyTotals;
}
public function getStatuses(&$agentStatuses, &$agentHours)
{
foreach ($agentStatuses as $k => $v) {
foreach ($agentHours as $i => $j) {
if ($i == $v['Date'] && $j[$v['RecordType']]['hours'] == $v['Hours']) {
$agentHours[$i][$v['RecordType']]['status'] = $v['Status'];
}
}
}
return $agentHours;
}
While it’s still not totally clear, this is the first iteration of refactoring and makes what’s going on much simpler to understand at a higher level. At this point, the code starts digging into generating the presentation code and gets even more confusing. Given I’m not completely done refactoring that section, and this post is already over 1k words, I’ll save that for another article.
If you made it here, thanks for sticking through that. It’s not glamorous, or exciting work, but someone’s gotta do it and I hope the next guy will appreciate the work I’ve done to make this all make more sense.