Drupal 6 performance suffers seriously from the fact that it always loads full objects (nodes, users, etc.) up front. For example, invoking user_load() triggers a cascade of additional database queries, in this case at least one extra query to load the user's roles.
Whether the roles are ever accessed afterwards makes no difference; Drupal loads them regardless. Even if you only want to display the user name and picture, Drupal loads data that is completely irrelevant to that task.
I would therefore like to propose the following:
Replace anonymous objects with dedicated objects implementing a getter method
If the currently anonymous objects (nodes, users, etc.) returned from database queries could be replaced with "real" objects, uninitialized properties of these objects could be loaded on demand at the time of access by using the getter method (__get()) of PHP 5 objects. To make this possible, every such first-class object would first have to ask all modules for the list of properties they implement. When an uninitialized property is accessed, the object would then delegate the loading to the responsible module.
Example:
- A user object is requested and initialized with just a user id (by invoking user_load()).
- An empty object is created, and initialized with the available properties (in this case no more than the user id, but could be more if available).
- An uninitialized property is accessed.
- The getter method of the object determines the module in charge.
- The module is ordered to load the property.
- The property is returned to the caller.
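The steps above can be sketched roughly as follows. This is only a minimal illustration, assuming a hypothetical LazyObject class and a hypothetical loader function example_profile_load_user(); neither exists in Drupal 6, and the loader fakes the database row instead of querying anything. The map of property names to loaders stands in for "asking all modules beforehand".

```php
<?php
// Hypothetical sketch; none of these names are part of Drupal 6.

// Counts how often a loader runs, so the on-demand behaviour is visible.
class ExampleQueryLog {
  public static $count = 0;
}

// Stand-in for a module's loader. A real one would run one query and
// return the whole row it is responsible for; here the row is faked.
function example_profile_load_user($uid) {
  ExampleQueryLog::$count++;
  return array(
    'name' => 'user' . $uid,
    'picture' => 'pictures/' . $uid . '.png',
  );
}

class LazyObject {
  // Properties that have been initialized so far.
  private $values = array();
  // Property name => loader callback, collected from modules beforehand.
  private $loaders = array();

  public function __construct(array $values, array $loaders) {
    $this->values = $values;
    $this->loaders = $loaders;
  }

  // Called by PHP whenever an inaccessible or undefined property is read.
  public function __get($name) {
    if (!array_key_exists($name, $this->values)) {
      if (!isset($this->loaders[$name])) {
        return NULL; // No module claims this property.
      }
      // Delegate to the responsible module and keep everything it returns,
      // without overwriting values that were already initialized.
      $row = call_user_func($this->loaders[$name], $this->values['uid']);
      $this->values = $this->values + $row;
      if (!array_key_exists($name, $this->values)) {
        $this->values[$name] = NULL; // Avoid asking the loader again.
      }
    }
    return $this->values[$name];
  }
}

// Usage: only the user id is known at first.
$user = new LazyObject(
  array('uid' => 7),
  array(
    'name' => 'example_profile_load_user',
    'picture' => 'example_profile_load_user',
  )
);
$name = $user->name;       // First access triggers the loader.
$picture = $user->picture; // Served from the already loaded row.
```

In this sketch the second property access does not call the loader again, because the first call stored every value the loader returned.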
At this point it is important to understand that modules should not load single fields, but whole database rows. This saves additional queries on subsequent accesses to other properties handled by the same module (we basically get them for free).
Also, SQL queries could be kept as simple as possible (avoiding complex JOINs), which in certain cases gives better database performance. With MySQL's query cache, for example, a cached result is invalidated whenever one of the tables it involves changes, so simpler queries might live longer in the cache because fewer tables are involved.
A concrete example is a node query that currently also returns some user data (usually the user id and author name), which in some cases is insufficient. With dedicated objects it is no longer necessary to JOIN the user table, because the user id contained in the node table is sufficient to initialize a user object. All properties that are accessed later will be loaded on demand, including often neglected items like the user picture, which are a requirement for modules such as User Display API, which lets you customize the display of user information.
Since all the loading is encapsulated and handled by the object itself instead of being hard-coded in the source module, we'd have transparent access to any defined property, at any time.
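To illustrate the node example, here is a rough sketch of a node loader that skips the JOIN and attaches a deferred user object built from the user id alone. All names (DeferredUser, example_node_load(), ExampleSqlLog) are hypothetical, the SQL strings are only logged rather than executed, and the returned rows are made up for the illustration.

```php
<?php
// Hypothetical sketch; not Drupal 6 API. Queries are logged, not executed.

class ExampleSqlLog {
  public static $queries = array();
}

class DeferredUser {
  private $uid;
  private $row = NULL; // Filled on first access to a non-id property.

  public function __construct($uid) {
    $this->uid = $uid;
  }

  public function __get($name) {
    if ($name === 'uid') {
      return $this->uid; // Already known, no query needed.
    }
    if ($this->row === NULL) {
      // One simple query for the whole row instead of a JOIN up front.
      ExampleSqlLog::$queries[] = 'SELECT * FROM users WHERE uid = ' . (int) $this->uid;
      // Made-up row standing in for the query result.
      $this->row = array('name' => 'alice', 'picture' => 'pictures/alice.png');
    }
    return isset($this->row[$name]) ? $this->row[$name] : NULL;
  }
}

function example_node_load($nid) {
  // The node query touches only the node table.
  ExampleSqlLog::$queries[] = 'SELECT nid, uid, title FROM node WHERE nid = ' . (int) $nid;
  // Made-up row standing in for the query result.
  $row = array('nid' => $nid, 'uid' => 3, 'title' => 'Hello');

  $node = new stdClass();
  $node->nid = $row['nid'];
  $node->title = $row['title'];
  // The user id from the node row is enough to initialize the author.
  $node->author = new DeferredUser($row['uid']);
  return $node;
}

$node = example_node_load(1);
$queries_after_node_load = count(ExampleSqlLog::$queries);
$author_picture = $node->author->picture; // Triggers the user query.
$author_name = $node->author->name;       // Comes from the same row.
```

A page that never touches the author would, in this sketch, never issue the second query at all.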
Adding support for methods ("mixins")
Since this concept bears a certain similarity to the mixin pattern, one question comes to mind almost immediately: would it be useful to allow the same technique for methods? There are approaches that simulate mixins in PHP, albeit rather hackish ones (or, let's say, ones that leverage what PHP has to offer). In the end, contrib modules would be able not just to extend object properties, but also to specify additional methods.
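One possible way to simulate this, sketched here with PHP's __call() method, is to let modules register callbacks under method names. The class ExtensibleObject, its addMethod() registration call, and the function example_module_display_name() are assumptions made up for this illustration, not an existing Drupal or PHP API.

```php
<?php
// Hypothetical sketch of module-provided methods; not Drupal 6 API.

class ExtensibleObject {
  public $name;
  // Method name => callback registered by a module.
  private $methods = array();

  public function addMethod($name, $callback) {
    $this->methods[$name] = $callback;
  }

  // Called by PHP whenever an undefined method is invoked on the object.
  public function __call($name, $arguments) {
    if (!isset($this->methods[$name])) {
      throw new BadMethodCallException('No module provides method ' . $name . '()');
    }
    // Pass the object itself as the first argument, like an explicit $this.
    array_unshift($arguments, $this);
    return call_user_func_array($this->methods[$name], $arguments);
  }
}

// A contrib module would supply a plain function such as this one.
function example_module_display_name($object, $prefix) {
  return $prefix . ' ' . $object->name;
}

$user = new ExtensibleObject();
$user->name = 'alice';
$user->addMethod('displayName', 'example_module_display_name');
$label = $user->displayName('User:');
```

Because the callback receives the object as a plain argument, it can only reach public members, which is one of the limitations that makes this kind of simulation feel hackish.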