I had a chat with my boss the other day about how good software should be developed. He mentioned that we should use unit testing, and I told him that it was rubbish. Then he mentioned TDD, where tests are written before the actual application, and I told him that he risked lying to himself about the quality of the code being developed.
Okay, while this is not an entry bashing software testing, I will highlight why testing fails. As Charlie Miller, a prominent computer security researcher, points out:
One researcher with three computers shouldn’t be able to beat the efforts of entire teams, Miller argued. “It doesn’t mean that they don’t do [fuzzing], but that they don’t do it very well.”
This is because we end up relying on tools to do the testing. Not that there is anything wrong with tools, but a tool is only as good as its operator, and that is where the problem lies. When tools are used, the operators tend to become less astute, or simply lazy. They start to rely on the tool to catch problems, and if the tool fails to catch any, they rely on the tool to sign off on the quality of the code. Tools are inanimate objects and cannot be held liable for the stuff that they test; the operator should always dig in manually to verify the results. Through my 20+ years of active programming, I have been caught out by tools so many times that I have developed a distinct distrust of any tool. I even verify my compiler’s output by disassembling the binaries and checking them by hand, before running them through a simulator to observe the operations.
objdump -dSC ./a.out   # -d disassemble, -S interleave source lines, -C demangle C++ symbols; ./a.out stands in for your binary
I argued with my boss that there is only one way to develop good code, and that is to just frakin’ write good code, which is not as hard as some people imagine it to be. Writing good code is like writing good prose: there are certain styles that can be followed that will result in better code. Unfortunately, good writing habits are hard to develop and need to be hammered into our apprentices from the very get-go. Our universities are failing at this by not driving the point home, instead teaching our programmers to be ever lazier and to rely on automated tools to generate code. But code generation is a whole different can of worms that I won’t go into here.
I have seen several senior engineers do what I call haphazard coding, which produces code that sort of works without anyone knowing why. A slight change in any part of the code, even in the compiler optimisation settings, can bork it completely. This is usually the result of a cut-and-paste coding style combined with ineffectual debugging skills. I had to teach my CODE8 apprentice that in debugging, you always follow the flow of the data. In one instance, I asked her to show me the result obtained from an LDAP query, and she modified her code to output the data onto the web application. Unfortunately, that involved taking the result of the LDAP query, extracting the relevant fields, and reformatting the text before sending it out to the web app. I explained to her that this was not the output of the LDAP query; it was the output of the query after several processing steps. Garbage in, garbage out: if we do not know that the LDAP query output is accurate, how can we be sure that the extraction and reformatting routines are not frakin’ things up?
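The quickest way to check the raw query result is to bypass the application entirely and ask the directory server yourself. As a hypothetical check with the standard ldapsearch tool (the server, base DN and filter below are placeholders), you would run:

ldapsearch -x -H ldap://ldap.example.com -b "ou=people,dc=example,dc=com" "(uid=jdoe)"

If the raw entry looks right here, then any garbage showing up in the web app must have been introduced by the extraction or reformatting stages, and you can chase it downstream from there.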
In the realm of hardware design, good coding practices are generally easier to enforce, because if your code sucks, the hardware tools are not smart enough to do what you want and will usually just bork out. However, if the code is badly written but just good enough to be understood by the synthesisers, the tools will happily produce a very bad hardware design that sucks up power and drags down performance. As a result, hardware coders are told to write code in only one specific way in order to produce the hardware we desire. These coding practices get drummed into us in a class usually called design for synthesis.
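The same verify-the-tool habit applies here. As a hypothetical example using the open-source Yosys synthesiser (the file name is a placeholder), you can make the tool report exactly what hardware it inferred from your code:

yosys -p "read_verilog top.v; synth; stat"

The stat report lists the cells that were actually generated. Unintended latches or a ballooning cell count are the tell-tale signs of code that was just good enough to synthesise, but not good enough to produce the hardware you meant.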
As for me, I tend to step through code line by line to see if it is doing what it is supposed to do. Unfortunately, with a language like Java, that can be very difficult because of the added complexity of the virtual-machine layer. You can compile Java code into bytecode and inspect the bytecode, but you won’t know whether the JVM actually runs your bytecode the way it is meant to be run; you just have to trust it to do so.
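Inspecting the bytecode itself is at least easy, using the javap disassembler that ships with the JDK (the class name is a placeholder):

javap -c -p MyClass

But that only shows you what the compiler produced. What the JVM then does with it, through interpretation, JIT compilation and runtime optimisation, remains a matter of trust.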
I subscribe to the principle: “trust and verify”.