PHP5 Overview

PHP5 is still far from being finished even though you can already download a beta version. The performance improvements, the better object orientation and XML support, the new PDO extension and many other small and large changes make many of us wish it was already here. After attending the PHP Conference 2003 in Frankfurt I wrote this article to give you an overview of the major changes in PHP5 from PHP4 and what to expect when porting old applications.

General improvements

Performance

PHP5 will generally be a lot faster than PHP4, primarily because the execution model has changed. A PHP script is compiled into arrays of opcodes before it is executed. PHP4 essentially uses a big switch statement when executing operations from these arrays. PHP5 uses function pointers instead, and the hash table lookups for those function pointers are cached. Simply calling PHP functions should be around 40% faster. The algorithms for comparisons have also been reworked and should now be much more efficient.

Constants

Real constants are finally being introduced. Previously, define() was used for declaring constants. This was slow since it is actually a function call that is resolved at runtime. This wouldn’t matter too much were it not for the fact that many projects use defines for storing translations of strings; it might even be the most common way of providing translations in PHP projects. Constants, on the other hand, are resolved at compile time. So, as in most other programming languages, each constant in the code is replaced with the value it represents before execution.
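As a rough illustration of the difference, the sketch below declares constants with the class-level const keyword, as it ended up in released PHP 5 versions, next to a classic define() call. The class name Messages and the constant names are made up for the example.

```php
<?php
// Hypothetical class, used only to illustrate class constants.
class Messages
{
    const GREETING  = 'Welcome';
    const MAX_ITEMS = 10;
}

// define() is still available; it is an ordinary function call that
// registers the constant when this line is executed.
define('FAREWELL', 'Goodbye');

echo Messages::GREETING . "\n";
echo FAREWELL . "\n";
```

Class constants are read with the Class::NAME syntax and cannot be reassigned once declared.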

XML support

Small revolution

The other area that is close to a revolution is the improved XML support in PHP5. Like many other things, XML support in PHP4 was really just a big hack, squeezed in without much thought. Both the DOM and the XSLT extensions were marked as experimental and had many problems. SAX, on the other hand, worked pretty well but is a mess to work with. You have to write a lot of code to do very little XML parsing.

All the new XML extensions use libxml and libxslt, which means expat and sablotron are being dropped. Libxml is one of the fastest implementations available and has very good support for the standards. The XML extensions in PHP5 are xml, dom, simplexml, xsl and soap. The xml, dom and simplexml extensions will most likely be compiled in by default, which should improve XML support in hosting environments. The xml extension supports loading of non-well-formed HTML documents, and most DOM Level 2 methods are implemented.

In PHP5 both the XSL and DOM extensions are new. The XSLT extension is called ext/xsl. It is called xsl so that the old ext/xslt extension can stay for those who wish to use it. It should be noted, however, that libxslt is much faster than sablotron, which is used by the old xslt extension. The xsl API methods are derived from the API methods in the Mozilla implementation.

Problematic error handling

There is a problem with the error handling, though. The DOM methods throw exceptions where the DOM standard requires it. In some other circumstances old-style PHP errors are produced. This mix of error handling methods is a bit confusing, and hopefully it will be fixed by the time PHP5 is ready. Some DOM methods are not completely ready either. The method getElementsByTagName() currently returns an array, which is not correct. This will change so that a real node list is returned instead.
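The mix of the two error styles can be sketched as follows. This assumes the DOM API as it exists in released PHP versions (DOMDocument, DOMException, DOMNodeList), which may differ in detail from the beta discussed here.

```php
<?php
$doc = new DOMDocument();

// Style 1: an exception, because the DOM standard demands one.
// An element name containing a space is illegal.
$caught = null;
try {
    $doc->createElement('not valid');
} catch (DOMException $e) {
    $caught = $e->getCode();
}
echo ($caught === DOM_INVALID_CHARACTER_ERR ? 'DOM exception' : 'no exception') . "\n";

// Style 2: a classic PHP warning plus a false return value.
// The @ operator hides the warning so only the return value is shown.
$ok = @$doc->loadXML('<booklist><book></booklist>');
var_dump($ok);

// In released versions getElementsByTagName() returns a node list object.
$doc->loadXML('<booklist><book/><book/></booklist>');
$nodes = $doc->getElementsByTagName('book');
echo get_class($nodes) . ' with ' . $nodes->length . " items\n";
```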

Interoperability

All the XML extensions are now completely interoperable. They all load XML and create DOM trees in memory. A DOM tree can then be shared and passed around between the different XML extensions. The best thing about this is that you can load an XML document into memory, use DOM methods and XPath expressions to locate data and change it in memory, and then finally load an XSL template into its own DOM tree and apply it to the original DOM tree. You might not even have to load the XSL template, since you could have it cached in shared memory from a previous request. The ability to cache DOM trees in memory should give you a huge performance increase.
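The load, locate, change and transform workflow could look roughly like this. It is a sketch that assumes the released DOMDocument, DOMXPath and XSLTProcessor classes; the document and stylesheet contents are invented, and the transform step is skipped when ext/xsl is not installed.

```php
<?php
// Load the XML document into a DOM tree.
$xml = new DOMDocument();
$xml->loadXML('<booklist><book><title>PHP 4</title></book></booklist>');

// Locate a node with XPath and change it in memory.
$xpath = new DOMXPath($xml);
$title = $xpath->query('/booklist/book/title')->item(0);
$title->nodeValue = 'PHP 5';

// The stylesheet lives in its own DOM tree.
$xsl = new DOMDocument();
$xsl->loadXML(
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    . '<xsl:output method="text"/>'
    . '<xsl:template match="/">'
    . '<xsl:for-each select="booklist/book"><xsl:value-of select="title"/></xsl:for-each>'
    . '</xsl:template>'
    . '</xsl:stylesheet>'
);

// Apply the stylesheet tree to the modified document tree.
$result = null;
if (class_exists('XSLTProcessor')) {
    $processor = new XSLTProcessor();
    $processor->importStylesheet($xsl);
    $result = $processor->transformToXml($xml);
    echo $result . "\n";
}
```

Neither tree is serialized to text between the steps; the same in-memory objects are handed from one extension to the next.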

SimpleXML

This is all well and good. But the best part is yet to come: SimpleXML. This new extension, written by Sterling Hughes, makes it so easy to work with XML data that, as he put it, any monkey with enough bananas can do it. I am inclined to agree with him. SimpleXML integrates nicely with all the other XML extensions. What it does is enable you to work on XML documents as if they were native PHP structures of associative arrays. The simple code example below loads an XML file with a book list and displays the titles of the books. All in just five lines of code.

<?php
$booklist = simplexml_load_file('mybookfile');
foreach ($booklist->book as $book) {
	echo $book->title . "\n";
}
?>

It can hardly be simpler. The extension allows you to convert back and forth between SimpleXML and a DOM tree. Instead of reading the XML from a file, you can load it from a database and use the function simplexml_load_string(). Some of the design decisions that make SimpleXML this simple can also make it weird to work with, though. Node attributes, for example, are treated as just another level of nodes.
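A sketch of loading from a string and converting to and from DOM follows. It assumes the released SimpleXML API, where attributes are read with array syntax; this may differ from the beta behaviour described above. The isbn attribute and the sample data are invented.

```php
<?php
// XML as it might come out of a database column.
$xml = '<booklist>'
     . '<book isbn="111"><title>First</title></book>'
     . '<book isbn="222"><title>Second</title></book>'
     . '</booklist>';

$booklist = simplexml_load_string($xml);

foreach ($booklist->book as $book) {
    // Child elements use property syntax, attributes use array syntax.
    echo $book['isbn'] . ': ' . $book->title . "\n";
}

// Convert to a DOM element and back again.
$domNode = dom_import_simplexml($booklist);
$again   = simplexml_import_dom($domNode->ownerDocument);
echo $again->book[1]->title . "\n";
```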

More advanced XML

Through the use of libxml, PHP5 will support validation of XML documents against DTD, XML Schema and the simpler, independent standard RelaxNG. Schema support in libxml is not yet 100% complete. All the basic things should be supported, however. Validation is an important feature since it lets you be confident, when you are working on a DOM tree, that the data in it is syntactically correct. This is especially important when you start working with SOAP.
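Validation against an XML Schema could be sketched like this, assuming the DOMDocument::schemaValidateSource() method of released PHP versions. The schema, the helper name isValidBooklist() and the sample documents are made up for illustration.

```php
<?php
// A small schema: a booklist contains books, each book has one title.
$schema = <<<'XSD'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="booklist">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

// Hypothetical helper: true when $xml parses and matches $schema.
function isValidBooklist($xml, $schema)
{
    // Collect libxml errors quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $valid = $doc->loadXML($xml) && $doc->schemaValidateSource($schema);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $valid;
}

var_dump(isValidBooklist('<booklist><book><title>A</title></book></booklist>', $schema));
var_dump(isValidBooklist('<booklist><book/></booklist>', $schema));
```

The second document is well formed but lacks the required title element, so it fails validation.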

Unfortunately SOAP support is not finalized and has yet to be added to PHP5. What is clear, though, is that SOAP will be supported natively by PHP5. Right now the options are NuSOAP, the smaller but faster soap extension available through PECL, or any of the numerous other PHP implementations available. NuSOAP lets you create WSDL specifications programmatically through a set of API methods. The PECL soap extension is, however, the faster choice since it does not have the overhead of loading the rather large NuSOAP class file written in PHP.

The xml extensions will also support PHP streams. Using the stream support you can register your own stream handlers. Supported out of the box are php, http, https, ftp, ftps and compress.bz2. One example of using your own stream handlers is in conjunction with xinclude to dynamically change the content of a static xml document.
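Registering your own stream handler and feeding it to an XML extension might look like the following sketch. The booklist:// scheme and the BooklistStream class are hypothetical; the method names follow the user-space stream wrapper interface of released PHP versions.

```php
<?php
// Hypothetical wrapper that generates XML for URLs such as booklist://demo.
class BooklistStream
{
    public $context;        // populated by PHP when a context is supplied
    private $data = '';
    private $position = 0;

    public function stream_open($path, $mode, $options, &$openedPath)
    {
        // Use the part after "booklist://" as the book title.
        $name = parse_url($path, PHP_URL_HOST);
        $this->data = '<booklist><book><title>' . htmlspecialchars($name)
                    . '</title></book></booklist>';
        $this->position = 0;
        return true;
    }

    public function stream_read($count)
    {
        $chunk = (string) substr($this->data, $this->position, $count);
        $this->position += strlen($chunk);
        return $chunk;
    }

    public function stream_eof()
    {
        return $this->position >= strlen($this->data);
    }

    public function stream_stat()
    {
        return array('size' => strlen($this->data));
    }

    // The XML extensions check that the URL exists before opening it.
    public function url_stat($path, $flags)
    {
        return array();
    }

    public function stream_close()
    {
    }
}

stream_wrapper_register('booklist', 'BooklistStream');

$booklist = simplexml_load_file('booklist://demo');
echo $booklist->book->title . "\n";
```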

Database support

The new PDO extension

This is another area that is getting a big makeover. There is a new extension called PDO, PHP Data Objects. The name is obviously derived from ADO. It is not, however, supposed to look like ADO or work in the same fashion.

The PDO extension gives you database abstraction at the function layer and provides a uniform way of connecting to and using databases from different vendors. This is of course a huge leap forward from the mess in PHP4, where each database had its own extension with different naming and parameter conventions.

More choice

The extension provides several new ways of sending queries to a database. It supports unbuffered queries, where only a result handle is returned from the database. Unbuffered queries also support destructive reads: data is destroyed in the engine as soon as you have used it. In other words, you cannot go back and reread the data returned in a result set. This is said to be somewhere between 50% and 70% faster than buffered, non-destructive reads.

You will still have the option of fetching rows as associative arrays or regular arrays. As associative arrays are much more expensive to create in terms of performance, numerical arrays will be returned by default. Several other query functions have also been created to increase speed in certain circumstances. One method will allow you to send a query and directly get the whole result returned as a normal PHP structure. Yet another is the fastest choice when you only return one column.
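The different fetch styles can be sketched with the PDO API as it was eventually released, which may not match the design described at the conference in every detail. The example assumes the pdo_sqlite driver is available and uses an in-memory database with a made-up book table.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)');
$pdo->exec("INSERT INTO book (title) VALUES ('First')");
$pdo->exec("INSERT INTO book (title) VALUES ('Second')");

$sql = 'SELECT id, title FROM book ORDER BY id';

// One row as a numerically indexed array.
$numeric = $pdo->query($sql)->fetch(PDO::FETCH_NUM);

// One row as an associative array.
$assoc = $pdo->query($sql)->fetch(PDO::FETCH_ASSOC);

// The whole result set as a plain PHP array of rows.
$all = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);

// A single column of the first row.
$firstTitle = $pdo->query('SELECT title FROM book ORDER BY id')->fetchColumn();

echo $numeric[1] . ', ' . $assoc['title'] . ', ' . count($all) . ' rows, ' . $firstTitle . "\n";
```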

A weird but neat feature is that you can query the database and at the same time provide a class. An object will automatically be instantiated and initialized with the values from the database.
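In the released PDO API this is done with the PDO::FETCH_CLASS fetch mode. A minimal sketch, again assuming pdo_sqlite; the BookRow class and the book table are invented for the example.

```php
<?php
// Hypothetical class whose properties match the selected column names.
class BookRow
{
    public $id;
    public $title;

    public function label()
    {
        return '#' . $this->id . ' ' . $this->title;
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)');
$pdo->exec("INSERT INTO book (title) VALUES ('First')");

// Ask PDO to create a BookRow object for each fetched row.
$stmt = $pdo->query('SELECT id, title FROM book');
$stmt->setFetchMode(PDO::FETCH_CLASS, 'BookRow');
$book = $stmt->fetch();

echo $book->label() . "\n";
```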

Iterators

Iterators, which are also new in PHP5, allow you to loop over a result set without doing so explicitly in your PHP code. Previously you could do this using arrays and a foreach loop. The foreach loop has now been extended to work with objects. Since the looping is done internally in the PHP core it will be much faster.
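How foreach works with objects can be sketched with a small class that implements the IteratorAggregate interface. The Booklist class is hypothetical, and the return type declaration assumes PHP 7 or later; on PHP 5 it would simply be left out.

```php
<?php
// Hypothetical collection class that can be used directly in foreach.
class Booklist implements IteratorAggregate
{
    private $titles;

    public function __construct(array $titles)
    {
        $this->titles = $titles;
    }

    // foreach asks the object for an iterator and drives it internally.
    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->titles);
    }
}

$seen = array();
foreach (new Booklist(array('First', 'Second')) as $index => $title) {
    $seen[] = $index . ': ' . $title;
}

echo implode("\n", $seen) . "\n";
```

A database result object that offers the same kind of interface can be looped over in exactly the same way.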

The consensus is that with the pdo extension you will have more options than before and a greater opportunity to profile your code for speed.

Mysql support

Mysqli and PDO

Work on PDO, described above, was started quite recently. The need for better database APIs, however, was recognized a long time ago. MySQL has evolved over the years and is now a very mature database management system featuring transactions, subqueries, replication and many other things. Views and foreign keys are still not there but are slowly in the making. Foreign keys can even be tested in the early versions of MySQL 5.0. The mysql extension, on the other hand, has not evolved much. Its API does not closely follow the C API and many things are missing. Because many mysql functions take optional parameters, it is also hard to add parameters to them without breaking backwards compatibility.

Therefore work started on a brand new mysqli extension. Its API matches the C API much more closely and supports the MySQL 4.1 API. It is not 100% backwards compatible. Since the old mysql extension will stick around for some time, this should not be a major problem. It is more than likely that some PHP installations will have both extensions enabled at the same time, allowing both old and new applications to work.

Prepared SQL statements

New in mysqli is support for prepared SQL statements, which lets the MySQL engine compile them before you use them. If you use the same query several times with different data, this will give you significant speed improvements over sending the full SQL statement every time. Since you supply the data together with a resource handle to the compiled query instead of a full SQL statement, you will no longer have to bother with magic quotes and addslashes. The extension will take care of that for you. You should then obviously have magic_quotes_gpc set to off. You can of course still send the whole statement at once exactly as before. However, when including large chunks of text or blobs that exceed the maximum packet size allowed, you should use prepared insert statements.
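Preparing a statement once and executing it several times might look like the sketch below, written against the object-oriented mysqli API of released PHP versions. The helper insertTitles(), the book table and its title column are hypothetical, and a live connection is needed to actually run it.

```php
<?php
// Hypothetical helper: inserts each title using one prepared statement.
// Returns the number of rows that were inserted successfully.
function insertTitles(mysqli $db, array $titles)
{
    // The statement is sent to the server and compiled once.
    $stmt = $db->prepare('INSERT INTO book (title) VALUES (?)');
    if ($stmt === false) {
        return 0;
    }

    // Bind the placeholder to a variable; no manual escaping is needed.
    $title = '';
    $stmt->bind_param('s', $title);

    $inserted = 0;
    foreach ($titles as $title) {
        // Each execute() sends only the current value of $title.
        if ($stmt->execute()) {
            $inserted++;
        }
    }

    $stmt->close();
    return $inserted;
}
```

Called as insertTitles($db, array('First', "O'Reilly")) with an open connection in $db, the apostrophe in the second title needs no addslashes() call.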

Currently it is not possible to store prepared SQL statements between requests. Zak Greant promised to look into this later, as it would be very beneficial for performance.

Encryption of passwords

One thing that might give some applications trouble is that MySQL has stopped using the old password hashing in favour of SHA-1. This is of course much safer, producing a 160-bit one-way hash (40 hexadecimal characters) of each password. Applications that use PHP functions to hash passwords with either SHA-1 or MD5 are not affected since this change is on the MySQL side.
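For applications that hash on the PHP side, the size of a SHA-1 digest is easy to check with the built-in sha1() function. The password value here is just a placeholder.

```php
<?php
$password = 'secret';

$hex = sha1($password);        // digest as hexadecimal text
$raw = sha1($password, true);  // digest as raw binary

echo strlen($hex) . " hex characters\n";  // 40
echo (strlen($raw) * 8) . " bits\n";      // 160
```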

What now?

A second beta of PHP5 was recently released. It should probably not be considered beta quality, though. Most of the new features have been implemented already. Only five or six people are still in the business of discussing new or changed features for PHP5.

So it is not really beta quality. It is, however, stable enough to download, install and tinker with. For those of you who want to be prepared, it is probably time to start working on PHP5 support for your applications. There are enough changes in PHP5 that it would not be wise to trust it to be fully backwards compatible. We now know for a fact that it won’t be.

At the conference both Sterling Hughes and Marcus Börger answered the question of when. The optimistic Börger said March 2004, and the more pessimistic Hughes thought summer was more realistic.

Object oriented support

Hybrid language

Object orientation is one of the areas that is really changing in PHP5. Whereas there was only rudimentary support for object orientation before, PHP5 now implements most features required by OO purists. PHP will of course remain a hybrid language, and apart from a few areas you can still ignore the object orientation and code procedurally if that is your cup of tea. The fact remains that PHP is becoming a more object-oriented language.

Changes in PHP5

One major change, which will probably decrease the bug count significantly in any project using classes and objects, is that objects are now passed by reference instead of by value by default. You will no longer have to make sure you use ampersands in the right places. PHP5 has a reference count that keeps track of object usage. An object is freed when its reference count reaches zero, either through unset() or by going out of scope.
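The new default can be seen in a few lines. This is a minimal sketch; the Counter class and the increment() function are made up for the example.

```php
<?php
class Counter
{
    public $value = 0;
}

function increment($counter)
{
    // No ampersand needed: the function works on the caller's object.
    $counter->value++;
}

$a = new Counter();
$b = $a;           // a second handle to the same object, not a copy
increment($b);

$c = clone $a;     // an explicit copy has to be requested
$c->value = 100;

echo $a->value . "\n";  // 1
echo $c->value . "\n";  // 100
```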

A quick list of what object orientation in PHP5 introduces:

Handling the memory

To destruct an object you call unset(). However, unset() only decreases the reference count, so you cannot be 100% sure that the object will be destructed. PHP is a language where you should not have to care about memory that much, and it might be that destructors aren’t that good an idea in PHP.

Another big gotcha is that parent constructors and destructors are not called automatically; you must call them explicitly. Skipping the implicit calls makes it easier to make the code run faster. What might be thought of as a bit inconsistent is that if the subclass has no constructor of its own, the parent constructor will be called automatically.
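Both points, the reference counting behind unset() and the explicit parent calls, can be sketched together. The class names are hypothetical and the static log array exists only to record the order of events.

```php
<?php
class Connection
{
    public static $log = array();

    public function __construct()
    {
        self::$log[] = 'Connection constructed';
    }

    public function __destruct()
    {
        self::$log[] = 'Connection destructed';
    }
}

class LoggedConnection extends Connection
{
    public function __construct()
    {
        parent::__construct();   // not called automatically
        self::$log[] = 'LoggedConnection constructed';
    }

    public function __destruct()
    {
        self::$log[] = 'LoggedConnection destructed';
        parent::__destruct();    // not called automatically either
    }
}

$first  = new LoggedConnection();
$second = $first;

unset($first);    // one reference remains, so nothing is destructed yet
$entriesAfterFirstUnset = count(Connection::$log);

unset($second);   // the last reference is gone and the destructor runs

echo implode("\n", Connection::$log) . "\n";
```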

Structured programming

Type hinting is also a step forward to a more structured programming model. This allows you to enforce that a parameter can only accept an object of a certain class. This is currently only planned to work with classes and objects – not on base types like strings and numbers.
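A class type hint could be used as in the sketch below. The Book class and describe() function are invented; the way a mismatch is reported is version dependent, and the try/catch shown assumes PHP 7 or later, where a TypeError is thrown (PHP 5 raised a fatal error instead).

```php
<?php
class Book
{
    public $title;

    public function __construct($title)
    {
        $this->title = $title;
    }
}

// Only Book objects are accepted for $book.
function describe(Book $book)
{
    return 'Book: ' . $book->title;
}

echo describe(new Book('PHP 5')) . "\n";

// Anything else is rejected before the function body runs.
try {
    describe('not a book');
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($rejected);
```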

The one new feature you cannot ignore is the introduction of try/catch error handling. The problem is that some extensions of PHP5 will use try/catch and others will stick to the old error handling where error codes are returned.
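In its simplest form the new error handling looks like this. The loadTitle() function is hypothetical and stands in for any code that reports failure by throwing instead of returning an error code.

```php
<?php
// Hypothetical function that throws instead of returning an error code.
function loadTitle($id)
{
    if ($id < 1) {
        throw new Exception('Unknown book id: ' . $id);
    }
    return 'Book ' . $id;
}

try {
    $title = loadTitle(0);
} catch (Exception $e) {
    // Control jumps here as soon as the exception is thrown.
    $title = 'error: ' . $e->getMessage();
}

echo $title . "\n";
```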

With the above new features it is possible to implement most design patterns, for example the common factory and singleton patterns. Take the singleton pattern: you can now ensure that only one object is created by using static members and then overriding the __clone() method and declaring it private.
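One way to write such a singleton is sketched below. The Registry class and its set() and get() methods are made up; the private constructor is an addition commonly used alongside the private __clone() method.

```php
<?php
class Registry
{
    // The single shared instance is kept in a static member.
    private static $instance = null;
    private $values = array();

    // Private constructor and __clone() prevent new or copied instances.
    private function __construct()
    {
    }

    private function __clone()
    {
    }

    public static function getInstance()
    {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }

    public function set($key, $value)
    {
        $this->values[$key] = $value;
    }

    public function get($key)
    {
        return isset($this->values[$key]) ? $this->values[$key] : null;
    }
}

Registry::getInstance()->set('lang', 'en');
echo Registry::getInstance()->get('lang') . "\n";
```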
