Archive for May 26th, 2008

Book review: “UML 2.0 in Action: A project-based tutorial”

Monday, May 26th, 2008

A while ago I received from Packt a copy of “UML 2.0 in Action: A project-based tutorial” by Patrick Grassle, Philippe Baumann and Henriette Baumann. The book certainly lives up to its byline of being “a detailed and practical walk-through showing how to apply UML to real world development projects”.

UML, short for “Unified Modeling Language”, is a standardized visual specification language for object modeling. It includes a graphical notation used to create an abstract model of a system, referred to as a UML model. Many software tools can be used for code generation and reverse engineering; one example is the new PEAR package PHP_UML, which generates a UML representation of existing PHP source code.

This book assumes no prior knowledge of UML, and this works very well. It is by no means comprehensive, but it doesn’t set out to be: the book is focused on being a practical tutorial for learning the essentials of modelling business systems, IT systems and systems integration, no more, no less. It does this admirably, and I’d recommend it as a reference and introduction for developers performing system analysis and design activities.

Validation in Depth – a retort to using just regular expressions

Monday, May 26th, 2008

I’ve noticed that Richard Heyes, who professes himself to be a PHP guru, deleted my comment on his “Some common regular expressions” posting. The comment simply pointed out that his expressions didn’t quite do the job, and suggested a few PEAR packages that should be used instead of the expressions he proffered for the following:

  • Email addresses
  • Usernames
  • Telephone numbers
  • Postal codes
  • IP addresses
  • An SQL date
  • A domain
  • A UK sort code

Why he deleted it is anybody’s guess – he deleted a few others too.

Anyway, for the record, I thought I’d reproduce my comment from memory (I didn’t think to make a backup copy, for obvious reasons, but hey, nobody expects the Spanish Inquisition).

The problem with relying on just a regular expression to validate data is that the solution has no “defense in depth”. Sure, the expression might catch the bulk of the bad data entered, but there’s always going to be data that gets through.

For example, a simple regular expression for validating phone numbers won’t catch area codes or country codes that don’t actually exist, and one used for validating entered dates might not catch leap-year-based exceptions.
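To make the leap-year point concrete, here is a minimal sketch of a two-layer date check. The function names and the pattern are my own illustrations (not the expression from the posting), and it uses PHP's built-in checkdate() rather than a PEAR package, purely so the example has no dependencies.

```php
<?php
// Hypothetical helper names; the pattern is illustrative only.

// Layer 1: does the string have the yyyy-mm-dd shape?
function looksLikeSqlDate($value)
{
    // \z anchors at the very end, so a trailing newline is rejected too.
    return preg_match('/^\d{4}-\d{2}-\d{2}\z/', $value) === 1;
}

// Layer 2: is it also a real calendar date?
// checkdate() knows about month lengths and leap years.
function isRealSqlDate($value)
{
    if (!looksLikeSqlDate($value)) {
        return false;
    }
    list($year, $month, $day) = array_map('intval', explode('-', $value));
    return checkdate($month, $day, $year);
}

var_dump(looksLikeSqlDate('2007-02-29')); // bool(true): the shape is fine
var_dump(isRealSqlDate('2007-02-29'));    // bool(false): 2007 was not a leap year
var_dump(isRealSqlDate('2008-02-29'));    // bool(true): 2008 was a leap year
```

The regular expression still earns its keep as a cheap first filter; the second layer is what catches the values that merely look right.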

  • Email addresses – use the PEAR Validate package for email address validation
  • Usernames
  • Telephone numbers – use Validate_UK; this package will also validate UK-specific details such as:
    • SSN (National Insurance/NI number)
    • Postal code
    • Sort code
    • Bank account number
    • Car registration numbers
    • Passports
    • Driving licence
  • Postal codes – use Validate_UK or counterpart as appropriate.
  • IP addresses – use the Net_Check PHP5 port of Net_CheckIP, or the original Net_CheckIP for PHP 4 if you really have to.
  • An SQL date – what Richard provided validates that the value has the yyyy-mm-dd form, but not that it is a real date; one could enter 2008-13-42. Again, I’d suggest using the Validate package.
  • A domain – you could, in theory, use the Validate package’s uri method, prefixing the domain with ‘http://’.
  • A UK sort code – Validate_UK.

If you follow these suggestions it should make your input validation more robust than simply relying on regular expressions and nothing more.
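The IP address item in the list above shows the same gap between shape and validity. The sketch below is only an illustration: the pattern and function names are hypothetical, and PHP's built-in filter_var() (available from PHP 5.2) stands in for the PEAR packages so that the example runs without installing anything.

```php
<?php
// Hypothetical helper names; the pattern is illustrative only.

// Shape check: four dot-separated groups of one to three digits.
function looksLikeIpv4($value)
{
    return preg_match('/^\d{1,3}(\.\d{1,3}){3}\z/', $value) === 1;
}

// Real check: filter_var() also verifies that each octet is in range.
function isValidIpv4($value)
{
    return filter_var($value, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
}

var_dump(looksLikeIpv4('999.1.1.1')); // bool(true): the shape matches
var_dump(isValidIpv4('999.1.1.1'));   // bool(false): 999 is not a valid octet
var_dump(isValidIpv4('192.168.0.1')); // bool(true)
```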