Model hierarchy, avoiding lazy loading

Hi,

I've just started learning Phalcon, so forgive me if I'm posting an already discussed topic.

What I'd like to achieve with the model hierarchy, is to load multiple entity records organized in the expected model hieararchy. Maybe it's not clear, I show a very basic sample.

Let's have a User and an Address entity where a user can have multiple addresses. The generated models:

class User extends \Phalcon\Mvc\Model
{
    public $id;
    public $name;

    public function initialize()
    {
        $this->hasMany('id', 'Address', 'user_id', NULL);
    }   
}

class Address extends \Phalcon\Mvc\Model
{
    public $id;
    public $city;
    public $user_id;

    public function initialize()
    {
        $this->belongsTo('user_id', 'User', 'id', NULL);
    }   
}

My task would be to list all users with all their addresses. So what I'd like is to go through all User records, display their name and in a list, display city fields from ALL of their address records.

First thought, let's query User records:

$users = User::find();
foreach ($users as $user)
{
    echo "\n" . $user->name . ", cities: ";
    foreach ($user->address as $address)
        echo $address->city . ", ";
}

It's trivial, the output is generated as expected. What's not good with this solution is the number of SQL commands generated! This way the addresses are lazy loaded, therefore extra SQL query is generated for EACH user record. For long lists (let's say 100 user records) we would have 100 extra SQL commands!

I think it arises 2 problems:

  1. performance problems (too many database activity because of lazy loading, for many users it could kill the db)
  2. integrity problems

1st is obvious I think, 2nd means that while I go through all the user records, some of them could be deleted meanwhile, and when I reach such deleted user, no addresses will be loaded by the lazy $user->address, I will see that this user exists but has no address. But that's not true, that user doesn't exists... So the integrity of my data is corrupted! I maybe could avoid it be special isolation levels and putting the whole processing in transaction (by I guess only for multi versioning databases).

What I imagine to work, would solve both problems.

I'm looking for a solution where only one SQL is generated, and all the records are loaded in User records and all User records have all corresponding Address records loaded in their 'address' properties (as the hierachy is constructed).

I've tried with PHQL inner joining, I see the result but each entity is loaded independently in the result set, couldn't find a solution where the hierarchy is loaded up fully in their corresponding list properties.

What I'd imagine is a way to tell PHSQL to loaded User records, load it's 'address' list as well (could befurther filtered as well?), this way a full hierarchy could be described that we'd like to load and work with and could avoid lazy loading each lists (and of course 'hasone', single references as well)!

I know, I could trick a few things (load users, gather their IDs, issue only one more SQL with the gathered IDs and load all the needed address record), but none of them would solve organizing them in the model hierarchy and not even persistency which I have with the mentioned basic method.

Is it possible somehow to solve this example by avoiding lazy loading and executing only one query? Of course I'm looking for a general solution for more complex hierarchies (even with multiple relations) but this sample could be a good start to have a solution for.



18.5k

As far as I know, you retrieve all from the DB and set it into the variable (so, your $user->address would be an bidimensional array), not executing SQL every time you want to know a related record.

I think that's what you want to know.



1.9k
edited Sep '14

Thx for your reponse.

Do you mean I should load all related records once and store them separately and find the corresponding ones in this list when I loop through the users list? If so, that's the main thing I want to avoid. The model hierarchy should work the way it automatically assigns the related entities to their corresponding entities. For complex hierarchies I would need to manage multiple lists for multiple related object which is not efficient and even not easy to manage.

If it's not what you said, please describe it.



18.5k

The fact is that the model loads automatically the related records when finding one of them. Do this to see an example:

After you setted belongsTo, hasMany, etc.
...
$model = Model::find();
echo "<pre>";
var_dump($model->nameOfTheModel);
die();


1.9k
edited Sep '14

I think you misunderstood my post.

I know that the model will load automatically the related model's instance on demand (lazy loading at your $mode->nameOfTheModel statement), that's what I want to avoid, because it generates as many extra SQL commands in the background to the DB server, as many lazy loading happens. I want to avoid lazy loading and force the model to auto load the relations that I define somehow, and the question is if I can achieve it somehow or not. Do you see my point?



18.5k

So, when you init the app, you want to automatically bring all that relations to the model, and then search from that result instead of on the DB, isn't it?



1.9k

I don't want this on init app, it's just in a case of a simple query. I just need to have $user->address relations to already be ready for all the queried User objects and don't want Phalcon to lazy load them only when I access the $user->address field. So basically I just want to turn off lazy loading for this field and force Phalcon to preload elements for this relation.

I expected the "join" to make this happen by the $this->modelsManager->createBuilder()->from("User")->join("Address") command, but I get the same user object as many times in a result of this command as many address object belongs to each user objects. So if user (id#1) has 3 related address objects, 3 row will contain the id#1 user with the different address object (as a simple SQL inner join would return 3 rows for such join, which is normal). But Phalcon, as a model mapper, should build the hierarchy from this, I guess. So I should get only 1 User object returned with its 3 address objects already loaded in its $user->address field!

Do you see the goal I'd like to achieve?



18.5k

I don't get it.

But when reading this...

I know that the model will load automatically the related model's instance on demand...

I think you didn't get it. Phalcon related models instance are stored when you find a model record, not when you want to access it. So, where doing Model::find(), you are also retrieving the records of the related model, and setting them into the variable.

But as I said, I don't see the point.



1.9k
edited Sep '14

Aha, here is our misunderstanding of each other!

You're saying that when I query for all User records then all the related $user->address resultsets (list of associated Address records) will also be loaded at once with the User reocrds itselves? Because in this case I have to say NO! It won't! Exactly that's my problem! It will load from the DB only when PHP executes the $user->address statement! This is lazy loading, exactly that's what I'd like to avoid!

To make you believe me, I'm showing you the logged SQL commands that I see after each executed PHP statement (from the first sample in my original post):

$users = User::find();

SQL: "SELECT user.id, user.name FROM user

foreach ($users as $user)
{
    echo "\n" . $user->name . ", cities: ";
    foreach ($user->address as $address)

HERE IT IS!!! An SQL command is executed right after calling $user->address in this foreach:

SQL: SELECT address.id, address.city, address.user_id FROM address WHERE address.user_id = :0

This is where lazy loading happens, and will happen as many times as many iteration the outer foreach($users as $user).. record will produce, so as many User records we have! This is the wasting I wanna avoid. Test it if you don't believe me.

Maybe I'm doing something wrong in the setup of my DbAdapter, but lazy loading happens by default and I didn't find a way to specify some relations at particular queries where I know I will need that relationship and I don't want to waste DB resources to query the related models one-by-one in the background via lazy loading!

Now do you see my point?



18.5k

AHHHHHHHHHHHHHHHHHHHH now I see it! I thinked that it was loaded automatically in the find() method.

BTW, how I see the SQL commands? So I can tell you if your DbAdapter is bad (if you did it as the tutorial, I think I will also get that commands).



1.9k

I don't think either that this behaviour can depend on adapter settings. Btw. I set up sql query logging from the documentation:

http://docs.phalconphp.com/en/latest/reference/db.html#logging-sql-statements



18.5k

I did it and yes, you are right, Phalcon does lazy loading.

But it's logic: when you want a record from a table that has related tables, maybe you only want some of the data from it, not related data from the other tables. This is an advantage in the way that you do not overload in a unnecessary way the model variable. But for you, is a disadvantage.

Maybe you could try a workaround creating your own find() method (with other name), that calls the original find(), and then, do the lazy loading you want to avoid, so it returns to you the model variable full loaded.



1.9k

Yes, the other side is true as well: it wouldn't be efficient if all related models would be loaded. That's why I'm looking for a solution where I can define the list of related models I need to be loaded automatically. If a relation is not defined this way, should work as already works: through lazy loading.

Overloading the find() method is my only thought as well, I possibly could (and if I find no other solution I will) implement my own way, but I'd be glad if it's implemented in Phalcon already (mostly for performance reasons). But after checking the documentation many times, I see no chance for solving it through basic Phalcon behaviour.

Or does somebody see chance for a built-in solution?



18.5k

Maybe you could do a push with that feature in the github repo. But let's wait to @phalcon to appear here and say something about if it isn't implement what you want.



5.9k
edited Sep '14

whenever i have such scenarios, where i need records from nested tables i create a VIEW in mySQL and then a model to describe the view

model users ->describes Users table, has many addresses
model addresses ->describes addresses and belongs to users
the above two models are used for editing the data (add, edit, delete)

and then i create a view in mySQL and also a model in phalcon
model DetailedUsers describes the view in MySQL where the data are created by a query
i use the above model whenever i have to present the data to a visitor, just like your example

there are a few advantages in performance and consistency, at least from my point of view

i do not know if that helps in your case
i can provide an example if you think it is helpfull



1.9k
edited Sep '14

I understand your solution, it really has performance and most importantly consistency advantage. The consistency is also a big problem for me in Phalcon, because if we use lazy loading, our loaded model 'hierarchy' could end up in an incosistent state. Imagine an Address record belonging to the first User record we're processing, and meanwhile we process all User records, this Address object gets reassigned to the LAST User record (in the database, through a different request). If after this modification (while we're going through the User objects) we reach the last User, we will see this Address record again, and our User list will contain this Address object twice! This is obviously an inconsistency.

Maybe we could surround our processing in a transaction, but because of long processing of each User record one-by-one (and lazy loading each Address relation), it could take long, and could lead to performance problems.

So compared to these problems, your solution is better. However, I don't prefer to have Views (in DB) for each relations I need to have together (preloaded). Such a View is related to displaying, I wouldn't like it to be present in a database, because it's not a structural data thing, it's only for displaying! Which, I think, should be independent from models and databases. I know that views are kind of for this situation, but for separating model / view, I don't like the idea to have as many views in my DB as many relations I need for models. It could end up in a huge amount of Views which could be difficult to manage (especially with additional Model object in Phalcon).

What I'm looking for is a general solution, where I can have the needed models ready in the loaded models. Phalcon's ORM is exactly about it: models and relations, that's why it's a bit suprising for me that Phalcon can handle relations only by lazy-loading the relations and only on demand! There should be a way where Phalcon (both for performance and consistency reasons) would prepare (preload) the needed models and I could receive a fully consistent and complete hierarchy with loaded models that I need to work with.

The only solution I see now is to extend the Model::find() method and accept a list of relations to 'preload' for a query. This could work, however with a small performance impact (because I'm extending it in PHP, not in C and there always will be an extra SQL in the background but in an optimized way so there will be only one SQL / relation, not one SQL / model instance like with lazy-loading). I'm working on it, will share if I have something working.

But still hoping for a built in, core solution :)



5.9k
edited Sep '14

i understand that flooding a database with views is not very attractive, i actually use more generic views and use the same view in different spots of the application. Overkill in general but on the other hand no need to edit an sql query or mySQL view whenever i decide to change the columns of a table.

anyway, another option I use is the Model Manager, where i provide raw SQL queries to get complicatet data.
Phalcon\Mvc\Model\Manager(); the resutl is almost the same as with views, you get data from nested tables and use them for presentation purposes
I actually use manager where i need to setup queries that are hard to describe in a model, or have aggregation functions such as AVG, MAX, SUM etc

it has some pros and cons as well, it might be suitable for your case



1.9k

I found another topic with the same problem, the keyword for my problem was 'eager loading':

https://github.com/phalcon/cphalcon/issues/1117

Dear @phalcon, I hope the core team will take consideration about the importance of this feature. It's really a MUST HAVE thing when working with entity relations.

edited Sep '14

I also really want the abiltiy to eager load models! As said in the this post eager loading isn't possible and should be done with the query builder.

I'm guessing you should implement your own function in the model class which uses the query builder to get related records. This won't be one query, but it sure will be allot faster instead of running a single query foreach record. Would like to see an example of how this would be done though.

edited Apr '15

I look forward to it too. We need to avoid unnecesary db queries. It must be easy to phalcon framework to do it since there is already all the needed foundation.

Related to the interface I really like the eloquent one. It uses the method "with" to especify the related models to eager load. (I implemented it on https://github.com/Surt/Granada too) http://laravel.com/docs/5.0/eloquent#eager-loading

On Granada it was as easy as using the with parameters to assign the relations resultset to the parent property. Don't have any idea on zephyr though.