RSS

Robustness in Programming

07 Aug

(For my regular readers, I know I promised this post would detail ‘a method by which anyone could send me a message securely, without knowing anything else about me other than my e-mail address, in a way I could read online or my mobile device, in a way that no one can subpoena or snoop on in between.’  A tall order, for sure, but still something I am working to complete in an RFC format.  In the meantime…)

I have the benefit of supporting an engineering group that is seeing tremendous change and growth well past ideation and proof of concept, but at the validation and scaling phases of a product timeline.  One observation I’ve made about the many lessons taught and learned as part of this company and product growth spurt have been the misapplication of the Jon Postel’s Robustness Principle.  Many technical folks are at least familiar with, but often can quote the adage: “Be conservative in what you do, be liberal in what you accept from others“.  Unfortunately, like many good pieces of advice, this is taken out of context when it relates to software development.

First off, robustness, while it sounds positive, it not a trait you always want.  This can be confusing for the uninitiated, considering antonyms of the word include “unfitness” and “weakness”.  On a macro-scale, you want a system to be robust; you a product to be robust.  However, if you decompose an enterprise software solution into its components, and those pieces into their individual parts, the concerns do not always need to, and in some cases should not, be robust.

For instance, should a security audit log be robust?  Imagine a highly secure software application that must carefully log each access attempt to the system.  This system is probably designed so that many different components of the system can write data to this log, and imagine the logging system is simple and writes its output to a file.  If this particular part of the system were robust, as many developers define it, it must, as well as possible, attempt to accept and log any messages posted to it.  However, implemented this way, it is subject to CRLF attacks, whereby a component that can connect to it and insert a delimiter that would allow it to add false entries to the security log.  Of course, you developers say, you need to do input checking and not allow such a condition to pass through to the log.  I would go much further and state you must be as meticulous as possible about parsing and throwing exceptions or raising errors for as many conditions as possible.  Each exception that is not thrown is an implicit assumption, and assumptions are the root cause of 9 out the OWASP Top 10 vulnerabilities in web applications.

Robustness can, and is often, an excuse predicated by laziness.  Thinking about edge cases and about the assumptions software developers make with each method they write is tedious.  It is time consuming.  It does not advance a user story along its path in an iteration.  It adds no movement towards delivering functionality to your end users.  Recognizing and mitigating your incorrect assumptions, however, is an undocumented but critical requirement for the development of every piece of a system that does store, or may ever come in contact with, protected information.  Those that rely on the Robustness Principle must not interpret “liberal” to mean “passive” or “permissive”, but rather “extensible”.

In the previous example I posited about a example logging system, consider how such a system could remove assumptions but still be extensible.  The number and format of each argument that comprises a log entry should be carefully inspected – if auditing text must be descriptive, then shouldn’t such a system reject a zero or two-character event description?  While information systems should be localizable and multilingual, shouldn’t all logs be written in one language and any characters that are not of that language omitted and unique system identifiers within the log languages’ character set used instead?  If various elements are co-related, such as an account number and a username, shouldn’t they be checked for an association instead of blindly accepting them as stated by the caller?  If the log should be chronological, shouldn’t an event specified in the future or too far in the past be rejected?  Each of these leading questions exposes a vulnerability a careful assessment of input checking can address, but which is wholly against most developers’ interpretations of the Robustness Principle.

However, robustness is not about taking whatever is given to you, it is about very carefully checking what you get, and if and only if it passes a litany of qualifying checks, accepting it as an answer to an open-ended question, rather than relying on a defined set of responses, when possible.  A junior developer might enumerate all the error states he or she can imagine in a set list or “enum”, and only accept that value as valid input to a method.  While that’s a form of input checking, it is wholly inextensible, as the next error state any other contributor wishes to add will require a recompile/redeploy of the logging piece, and potentially every other consumer of that component.  Robustness need not require all data be free-form, it must simply be written with foresight.

Postel, wrote his “law” with reference to TCP implementations, but he never suggested that TCP stack implementers liberally accept TCP segments with such boundless blitheness that they infer the syntax of whatever bits they received, but rather, they should not impose an understanding of the data elements that were not pertinent to the task at hand, nor enforce one specific interpretation of a specification upon upstream callers.  And therein lies my second point — robustness is not about disregarding syntax, but about imposing a convention.  Robust systems must fail as early and as quickly as possible when syntax, especially, has been violated or cannot be accurately and unambiguously interpreted, or if the context or state of a system is deemed to be invalid for the operation.  For instance, if a receives a syntactically valid message but can determine the context is wrong, such as a request for information from a user who lacks an authorization to that data, every conceivable permutation of invalid context should be checked, not fail to consider each in a blasé fashion to leave room for a future feature that may, someday, require an assumption made in the present, if it is ever to be developed.  This crosses another threshold beyond extensibility to culpable disregard.

In conclusion, building a robust system requires discretion in interpretation of programming “laws” and “axioms”, and an expert realization that no one-liner assertions were meant by their authors as principles so general to apply to every level of technical scale of the architecture and design of a system.  To those who would disagree with me, I would say, then to be “robust” yourself, you have to accept my argument. 😉

 
Leave a comment

Posted by on August 7, 2013 in Programming

 

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: