Ok, so I spent alot of time researching relational algorithms for inputing some hierarchical data into Mysql.

I wanted :

  • no excessive db writes if the tree gets modified.
  • fast to query the data
  • easy to modify tree hierarchy (move nodes)

Nested set

The Nested set algorithm in the incubator looks great except my tree is rather large. Inserts to the bottom of the tree, trigger recursive updates all the way up the tree.

Let's say i have a company hierarchy with the whole employee structure and frequently add new customers to the bottom most part/leaf of the hierarchy tree. The nested set algorithm is clearly not made for this : Each insert triggers updates right up to the top !!

The nested set could be improved and use the nested interval algorithm but this isn't my point.

Closure table, Transitive closure

I think it would be interesting to have an official phalcon php implementation of "closure tables" (improved adjacency list). It's basically a self referential many to many relationship, with triggers and some data redundancy.

  • Inserts to the tree do not trigger updates on other nodes. They instead, populate the paths to the inserted node in the closure table.
  • Moving a node only triggers updates to the node's own children.
  • Using a "depth" or "path_length" column make's it easy to represent the whole tree structure. You can also easily query any specific level in the tree.

My solution for now

For my particular project i'm using mysql triggers and some stored procedures but would have prefered to keep this on the php side of things because now i have to run migrations on all my remote boxes that will be running this app if i need to change something.

I used this as a reference : https://github.com/developerworks/hierarchy-data-closure-table


More reading : http://dirtsimple.org/2010/11/simplest-way-to-do-tree-based-queries.html


Video overview of different algorithms : https://www.youtube.com/watch?v=wuH5OoPC3hA