
Saturday, January 30, 2010

Regular expressions for non-string complex data?

Introduction


RawDev is a sustainable framework that is built from the ground up with unit tests. Before I get deeper into unit testing I want to introduce a helper library that makes it easy to validate any data structure (a simple scalar or a complex array structure). This library is a key building block for asserting functionality in RawDev's unit testing, but its use does not end there. Other uses I can think of are debugging and (non-string) validation. Perhaps you can think of more. As you might expect, this library is also fully extendable, and ample examples are available in the download.

Synopsis


<?php

$expected = 'Hello World';
$value = 'Hello World';

$obj = new RMatch($expected);
$obj->match($value);

?>

This is a silly example, since scalar values can be compared far more simply. But stick with it. The RMatch utility library simply expects an $expected value (or definition), to which it then compares the target $value.

Array example

<?php
$expected = array('person' => array('name' => 'Jane', 'spouse' => array('name' => 'John', 'address' => array('city' => 'Philadelphia'))));
$value = array('person' => array('name' => 'Jane', 'spouse' => array('name' => 'John', 'address' => array('city' => 'New York'))));

$value = $expected;
RMatch::construct('RMatch', $expected)->match($value); #fluid interface
?>

This is getting more interesting because RMatch not only compares the two arrays, it also suggests where the error occurs. RMatch is equipped to compare arrays, objects and scalars. You can tell RMatch exactly what flavor of matching you seek (e.g. scalar type checking). You can also do things like check whether three particular attributes of an array or object match while ignoring the others. Finally, you can extend RMatch with your own definition of matching.
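
To make the idea of "suggesting where the error occurs" concrete, here is a minimal sketch of a recursive comparison that reports the path of the first difference. It is not RMatch's actual code: the function name firstMismatch and the dotted path format are assumptions made for illustration.

```php
<?php
// Hypothetical helper, not part of RawDev: returns the path of the first
// difference between two structures, or null when they match.
function firstMismatch($expected, $value, $path = '') {
    $label = ($path === '') ? '(root)' : $path;
    if (!is_array($expected)) {
        // Scalars are compared strictly (type and value).
        return ($expected === $value) ? null : $label;
    }
    if (!is_array($value)) return $label;
    foreach ($expected as $key => $item) {
        $childPath = ($path === '') ? (string)$key : $path . '.' . $key;
        if (!array_key_exists($key, $value)) return $childPath;
        $found = firstMismatch($item, $value[$key], $childPath);
        if ($found !== null) return $found;
    }
    // Keys present in $value but absent from $expected also count as a mismatch.
    foreach ($value as $key => $item) {
        if (!array_key_exists($key, $expected)) {
            return ($path === '') ? (string)$key : $path . '.' . $key;
        }
    }
    return null;
}

$expected = array('person' => array('name' => 'Jane', 'spouse' => array('name' => 'John', 'address' => array('city' => 'Philadelphia'))));
$value    = array('person' => array('name' => 'Jane', 'spouse' => array('name' => 'John', 'address' => array('city' => 'New York'))));

print firstMismatch($expected, $value) . "\n"; // prints: person.spouse.address.city
```

Returning the path rather than a plain true/false is what turns a comparison into a debugging aid: the caller learns which leaf of the tree differs.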

Your own matching algorithm


It is easy to extend RMatch with your own matching algorithm. In the example below, I briefly illustrate how to implement a "like match" (just like LIKE in SQL).

<?php

class RLikeMatch extends RMatch {
  function matchScalar($expected, $value, $path) {
    # str_replace() takes (search, replace, subject); preg_quote() escapes regex characters but leaves % alone
    $pattern = '/^'.str_replace('%', '.*', preg_quote($expected, '/')).'$/s';
    if (!preg_match($pattern, $value)) {
      throw new RException('nonlike_match_scalar', 'Value [%s] does not like-match expected value [%s] for item [%s].', $value, $expected, $path);
    }
  }
}

RMatch::construct('RLikeMatch', 'dog%')->match('dog person');

?>
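
The pattern translation at the heart of a like match can also be tried on its own, without RawDev. The sketch below is a hypothetical standalone helper (likeToRegex is not a RawDev function) that additionally handles the SQL single-character wildcard _, on the assumption that you want both wildcards supported.

```php
<?php
// Hypothetical helper, not part of RawDev: turns a SQL LIKE pattern into an
// anchored regular expression. % matches any run of characters, _ exactly one.
function likeToRegex($like) {
    // preg_quote() escapes regex metacharacters but leaves % and _ untouched.
    $quoted = preg_quote($like, '/');
    return '/^' . strtr($quoted, array('%' => '.*', '_' => '.')) . '$/s';
}

var_dump(preg_match(likeToRegex('dog%'), 'dog person') === 1); // bool(true)
var_dump(preg_match(likeToRegex('dog%'), 'hot dog') === 1);    // bool(false)
var_dump(preg_match(likeToRegex('d_g%'), 'dig deeper') === 1); // bool(true)
var_dump(preg_match(likeToRegex('1.5%'), '1x5 apples') === 1); // bool(false)
```

Anchoring with ^ and $ matters: without it, 'dog%' would also match 'hot dog', which is not how LIKE behaves.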

Conclusion

The RMatch library is modular, powerful, lean and easy to use. Although its initial use was to help assert any test result for unit testing, it can be used in many other situations such as validation and debugging. Although you are likely to just use the defaults, it is easy to change the algorithm by which complex structures are matched. In addition, any part of a complex tree can be extended to have its own custom match capability.


Future

Using this library reminded me a lot of doing regular expressions on arrays. This idea could be expanded more in the future. A simple syntax could describe definitions such as the count, order, required, etc.

Links

RMatch API Documentation
Download RawDev

A Vision of RawDev

Imagine a scenario:
I have survey results in a Google Doc that gives me a list of websites that my users like. I have a website like http://www.alexa.com/ that provides rankings of websites worldwide. I have a table of data on websites in Oracle.

Three data sources, three disparate data sets. But looking at it, we know that there are linkages there. Imagine an application framework that allowed you to query all three, simultaneously, and seamlessly present the data to your users in a convenient front-end.

Now imagine that framework is called RawDev.

This sort of project is exactly what RawDev is designed to do: bring together independent data and give it to an end-user in a seamless presentation. You set up the datastore once, and then build apps to use that data again and again. Integrate data of different types, in different locations, even in different storage engines, behind one interface. View all these different data sets as one data set in your favorite database GUI application. Oh, and RawDev will throw in powerful XML-based access controls as well.

You can download a beta version of RawDev this summer after a live demo at the Higher Ed Web Symposium at the University of Pennsylvania on July 21st and July 22nd. In the meantime you can get early bird access and follow the development process on this blog and http://rawdev.bokenkamp.com.

Friday, January 29, 2010

Dealing with fatal errors in PHP

When a fatal error occurs in PHP, the program halts, resulting in a blank screen or a system error message. This leaves your end-users confused and you, as the programmer, unaware that these fatal errors might be occurring.

RawDev offers a simple way to handle fatal errors more gracefully so that you can (a) log/email the error and (b) display an appropriate message and a path to continue so that the end-user is not staring at a system message or worse, a blank screen unsure what to do next.


How it works

PHP displays the fatal error right before it halts. RawDev reads the output buffer and detects when a fatal error occurs. That's when it calls your custom handler.

Basically all you have to do is register a fatal error handler. See example below.

Example


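<?php
require_once('RawDev/RawDev.php');

# 'fatalErrorHandler' names a function you define yourself (not shown here)
RFatal::setHandler('fatalErrorHandler');

hello(); # since this function does not exist it will trigger a fatal error

# this area is never reached: your handler is called, but PHP still halts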

RFatal::unsetHandler(); #optional function to remove the fatal error handler

?>
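
The buffer-reading approach described under "How it works" can be sketched in plain PHP. This is not RFatal's implementation: the function names and the replacement message are assumptions, and the detection simply looks for the text "Fatal error" that PHP prints when display_errors is on.

```php
<?php
// Hypothetical helpers, not RawDev's RFatal implementation.

// True when the buffered output contains PHP's fatal error text.
// Caveat: a page that legitimately prints "Fatal error" would also match.
function containsFatalError($buffer) {
    return strpos($buffer, 'Fatal error') !== false;
}

// Output callback: PHP hands over the buffered output and sends whatever
// this function returns to the browser instead.
function friendlyOnFatal($buffer) {
    if (containsFatalError($buffer)) {
        // This is the place to log or email the raw $buffer.
        return "Something went wrong. <a href=\"/\">Return to the home page</a>.\n";
    }
    return $buffer;
}

// Call this once, before any output is produced.
function installFatalTrap() {
    ob_start('friendlyOnFatal');
}
```

Calling installFatalTrap() at the top of a request routes all output through friendlyOnFatal(), including the error text PHP emits just before halting, so the end-user sees the friendly message instead of the raw error.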

Display Errors


Note that display_errors needs to be set for the RFatal class to work, which is the default in PHP. Usually this config parameter is set by the sysadmin in the php.ini file. Sometimes (typically in production) this is turned off for security reasons with non-favorable usability impact. In RawDev this needs to be turned on. The security issue no longer applies because an appropriate message to the end-user should be displayed by you, the programmer, in the fatal error handler. Typically something similar to a 404 with a simple message and a path to continue.



Conclusion

By trapping the output buffer you can actually get a handle on fatal errors, and RawDev makes it easy to do this. There are cases where you do not want to use output buffering, for example when dealing with very large output. In that case, don't use fatal error trapping.



Links

API Doc for RFatal

Wednesday, January 27, 2010

Unit Testing - Part I

Introduction




Unit testing matters because (a) your code will be more reliable, especially when your code is updated over time and (b) you will reduce programming hours over time. It is tempting to solve the problem and hack a solution together as quickly as possible, doing ad-hoc testing as you go. Taking the time to unit test may add approximately 15% of initial coding time.

A common scenario is that everything is crystal clear when the initial coding is done. However, we live in a dynamic world and things change. So when you add a feature months later, all of a sudden it becomes a lot harder to verify that everything is working properly. You may spend a full day on a silly bug that unit tests could easily have pointed out. This is frustrating and not sustainable over time. The investment in unit testing will quickly earn back the 15% of extra effort you put in initially.





How it works

Let's consider a simple black box model. A function has input parameters and as output it has the returned result. In addition, an exception can occur, in which case no output is returned. This is diagrammed below.


 

RawDev makes the setup of unit tests as easy as possible by using a fluid interface, so that you can typically fit a test on one line of code. The time it takes to write unit tests now depends solely on how deeply you are going to test. Typically you would like to produce few tests that cover as many scenarios as possible. By nature, certain functions require more testing based on how critical they are. Tests are just like code: you have to maintain them over time.

The fluid methods to set up your unit test are listed below. Each one returns the test object, so calls can be chained:

setTitle(string $title) # optional but useful: the title is displayed when a test fails
setInput(mixed $param[, mixed $...]) # sets the input parameters
setOutput(mixed $output) # sets the expected returned result
setExpectedException(string $type) # sets the expected RException type


The final test execution method is:

bool function test()
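
For readers new to fluid interfaces, the mechanism is simply that every setter returns the object itself. The sketch below shows this with a hypothetical MiniCase class; it is an illustration only, and RFunctionUnitTest may be implemented differently.

```php
<?php
// Hypothetical class for illustration only; RawDev's RFunctionUnitTest
// may be implemented differently.
class MiniCase {
    public $function;
    public $title = '';
    public $input = array();
    public $output = null;

    public function __construct($function) {
        $this->function = $function;
    }

    // Static factory so a chain can start without a temporary variable.
    public static function construct($function) {
        return new MiniCase($function);
    }

    // Each setter returns $this, which is what makes chaining possible.
    public function setTitle($title) {
        $this->title = $title;
        return $this;
    }

    public function setInput() {
        $this->input = func_get_args();
        return $this;
    }

    public function setOutput($output) {
        $this->output = $output;
        return $this;
    }

    // Runs the function with the stored input and compares the result.
    public function test() {
        return call_user_func_array($this->function, $this->input) === $this->output;
    }
}

var_dump(MiniCase::construct('max')->setTitle('Largest of three')->setInput(1, 5, 3)->setOutput(5)->test()); // bool(true)
```

Only test() breaks the chain, because it returns the boolean result rather than the object.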

Example


Consider a simple function that returns the sum of multiple floats or integers based on two or more input parameters. The function sum is displayed below:

<?php
function sum() {
  $params = func_get_args();

  if (count($params) < 2) throw new RException('math_too_few_params', 'Two or more parameters please.');

  $sum = 0;
  foreach($params as $param) $sum += $param;

  return $sum;
}
?>

My test strategy is to quickly test 0, 1, 2 and 3 params and see if the right output is produced. See the RawDev code below. In addition, for demo purposes I include a test for the invalid output, an expected exception that didn't happen and an exception that wasn't trapped.




RFunctionUnitTest::construct('sum')->setTitle('Zero Params')->setInput()->setExpectedException('math_too_few_params')->test();
RFunctionUnitTest::construct('sum')->setTitle('One Params')->setInput(1)->setExpectedException('math_too_few_params')->test();
RFunctionUnitTest::construct('sum')->setTitle('Two Params')->setInput(1, 2)->setOutput(3)->test();
RFunctionUnitTest::construct('sum')->setTitle('Three Params')->setInput(1, 2, 3)->setOutput(6)->test();
RFunctionUnitTest::construct('sum')->setTitle('Invalid Output')->setInput(1, 2)->setOutput(4)->test();
RFunctionUnitTest::construct('sum')->setTitle('Expected Exception Incorrect')->setInput(1, 2)->setExpectedException('math_too_few_params')->test();
RFunctionUnitTest::construct('sum')->setTitle('Untrapped Exception')->setInput(1)->setOutput(1)->test();
Output:

....X?E

X: Invalid Output              : Value is [3] but should be [4].
?: Expected Exception Incorrect: Exception [math_too_few_params] was expected.
E: Untrapped Exception         : Two or more parameters please.
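
The four result symbols follow from the black box model: a call either returns a value or raises an exception, and each outcome is compared with what the test expected. Below is a minimal sketch of that decision logic. SketchException and classify() are hypothetical names introduced here, not RawDev classes, and sum() is restated so the block runs on its own.

```php
<?php
// Hypothetical stand-in for RException: an exception carrying a string type.
class SketchException extends Exception {
    public $type;
    public function __construct($type, $message) {
        parent::__construct($message);
        $this->type = $type;
    }
}

function sum() {
    $params = func_get_args();
    if (count($params) < 2) throw new SketchException('math_too_few_params', 'Two or more parameters please.');
    return array_sum($params);
}

// Hypothetical runner: returns '.', 'X', '?' or 'E' for one test case.
// $expectedType is null when a return value, not an exception, is expected.
function classify($function, $input, $expectedOutput, $expectedType = null) {
    try {
        $actual = call_user_func_array($function, $input);
    } catch (SketchException $e) {
        // An exception passes only if it is the one the test asked for.
        return ($expectedType === $e->type) ? '.' : 'E';
    }
    if ($expectedType !== null) return '?'; // the expected exception never came
    return ($actual === $expectedOutput) ? '.' : 'X';
}

print classify('sum', array(), null, 'math_too_few_params');     // .
print classify('sum', array(1, 2), 3);                           // .
print classify('sum', array(1, 2), 4);                           // X
print classify('sum', array(1, 2), null, 'math_too_few_params'); // ?
print classify('sum', array(1), 1);                              // E
print "\n";
```

Run as written, the five cases print ..X?E, the same symbols as the RawDev output above but with two passing cases instead of four.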

Conclusion

RawDev makes it easy to define the expected output (result/exception) of a function based on the input. It also has an easy way of labeling a test and displaying the results of many tests. What is not yet incorporated is (a) the output of non-scalar variables such as hashes and objects (not such a big deal) and (b) testing of object functions that change the state of an object. These topics will be added and discussed in the near future.

Links

Exceptions
Fluid functions
Function Test API Doc

Tuesday, January 26, 2010

Exception Handling

Examples of exceptions in a web application are model or data layer errors such as "Cannot connect to the database" or user errors such as "Email format incorrect". RawDev offers a simple consistent way of raising these exceptions so that they can (a) be properly logged when necessary and (b) be properly displayed to the end user.

RawDev simply extends the existing PHP Exception with the RException object. Mainly, the added functionality is that you can raise an error with a type (e.g. "division_by_zero"). In addition you can specify a message with parameters (such as used in sprintf). This is (a) handy for the programmer and (b) allows for different languages (i18n) down the road.

Let's get to it:

Example:

<?php
require_once('RawDev/RawDev.php');

try {
  throw new RException('email_illegal_format', 'Email [%s] has an illegal format', 'test@test');
}
catch (RException $e) {
  if ($e->type == 'email_illegal_format') print $e->message."\n";
}

?>
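
To show how little is needed to get a typed exception with sprintf-style parameters, here is a standalone sketch. TypedException is a hypothetical class written for this illustration; RException's real implementation may differ (for instance, the sketch reads the message through getMessage() rather than a public property).

```php
<?php
// Hypothetical class for illustration; RException's real implementation
// may differ.
class TypedException extends Exception {
    public $type;

    // Extra arguments after $format are substituted as in sprintf().
    public function __construct($type, $format) {
        $args = array_slice(func_get_args(), 2);
        parent::__construct(vsprintf($format, $args));
        $this->type = $type;
    }
}

try {
    throw new TypedException('email_illegal_format', 'Email [%s] has an illegal format', 'test@test');
} catch (TypedException $e) {
    if ($e->type == 'email_illegal_format') print $e->getMessage() . "\n";
    // prints: Email [test@test] has an illegal format
}
```

Because the format string and its parameters arrive separately, a translation layer could later swap the format per language without touching the code that raises the error.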

In conclusion, RawDev offers a hybrid between two approaches: using the basic PHP Exception, in which you distinguish errors by integer code, and the more advanced PHP capability of giving every error type its own Exception class (e.g. class EmailIllegalForm extends Exception).

RawDev seeks to avoid (a) using integer codes as error identifiers, because they are hard to keep track of, and (b) creating hundreds of classes, one for each error type.

Links:
RawDev API Doc

Comments ?

Wednesday, January 6, 2010

Introducing RawDev : a practical PHP framework

I want to let you know that I have decided to publish the framework RawDev that I am working on under the MIT license.


RawDev is a practical MVC framework for Rapid Web Development in PHP. Using RawDev you can program web applications faster in your team or by yourself.


RawDev seeks balance between the following principles: simplicity, modularity, flexibility, power, usability, speediness (execution), security, and reliability (well tested).


RawDev is also intended to be well-documented, collaborative and consistent.

Currently two production applications are using RawDev: http://HealthPanda.com and http://Fodius.com

Version 0.01 is a much more basic version (for now); it includes the error handling core library. The reason for this approach is that I will add documentation and unit tests every week before each library or module is published. The good news is that through weekly updates you will get a unique learning experience and a sense of how RawDev works. The bad news is that if you are eager to start building web apps with RawDev you will have to wait until more modules are made available.


These weekly updates will lead to a presentation this summer at the Higher Education Web Symposium on July 21 & 22, 2010 at the University of Pennsylvania.  

RawDev can be downloaded from :
http://rawdev.bokenkamp.com

Also, expect weekly updates on:
http://blog.bokenkamp.com

Please forward this to anyone you know who might find it interesting as well.

Contact me for feedback by clicking here.

--Raymond.
raymond@bokenkamp.com