Best Practice for Calculated Fields

I am looking for advice on what the best practice is for implementing my calculated fields in CakePHP 5.

I have a model that is Projects hasMany ProjectExpenses which hasMany ProjectExpenseLineItems.

Users record expenses, which are made up of one or more individual line items, against their projects. They have access to a Project Dashboard view that displays a lot of high level summary information about the project financials - total expense, total income, RoI, etc. which are all calculated from the expenses data.

I have getters in my Project entity that calculates these values and I call/display with a simple $project->(getter_name) structure but I don’t think this is the best option as each getter interrogates the same data of which there can be 1,000’s of expenses and expense line items.

I have now implemented several different ways to get the figures I need for the dashboard but don’t know the best option. What I have tested and have working are the following approaches:

  1. Getters in the entity
    This works well for the views as it’s simple to display a getter field in a template. I retrieve the expenses as part of the overall Project finder using “contains” and then create a collection from those child records on the project. My getters interrogate collection for each calculated field I use. Fairly simple but results in the same collection being interrogated multiple times.

  2. View variables
    I’ve created a function in the entity that I call from the controller. The function interrogates the expenses data once and, with each expense record evaluated, updates multiple variables as required. I return those variables to the controller as an array and pass that to the view where I can display the relevant value from the array as required.

  3. Finders
    I have also got finders for each field working where the getter field doesn’t interrogate any data but executes the finder, which returns a single value with the number I need. This results in multiple hits on the database to query the same data but with a different calculated function to return the desired result.

  4. Callbacks on Insert/Update/Delete Operations
    I can execute a callback when saving expense information that calculates the required figures from the updated expenses data and saves that value on the Project record in a single field. This makes it much easier, and faster, to display the data on my project dashboard but moves any performance issues to the insert/update/delete operation instead. The issue is more obvious when a user is adding lots of expenses in one sitting as the callback runs for every update, whereas, with the other options, it is only run when the user selects to view the project dashboard. I can make this faster by just adding the current update to the existing single value field on the project but I’m concerned that something else may make the stored “calculated” field out of sync and it’s therefore better to recalculate the whole value each time.

I feel option 2 is the best as it’s the one that only requires a single pass of the expenses data, and doesn’t impact the “save” performance but it doesn’t seem like the right way to implement my calculated fields.

Thanks.

Consider does your users more frequently save (update) or display. In most cases users more displays then saves. If so, maybe 4 is ok. If you are worried about sync, maybe put it to cron daily in the middle of the night?

Also maybe you can put some function in model, that will just return right set of values (like described in 2)? Ask yourself do you need this code in any other places? Sometimes if you want high performance you need to hand tune it.

Consider does your users more frequently save (update) or display

Exactly. (4) here sounds like the most appropriate and scalabla approach.

These kind of situations where otherwise an extra db lookup/calc needs to be done or too much related data needs to be fetched and processed, using so called “CounterCache” behavior would the right thing to do.
Either that behavior or a similar custom functionality to “cache” that calculation for simple read access.
If its a lot of calculation, offload it into a Queue task run right after save/delete etc.

You are mostly talking about reading data and caluclating a value which is composed of multiple associated entity fields. For me this is primarily a job for a virtual field

A virtual field access’s data of the current entity and its associated entities (if they are eager loaded (with ->contain()) with everything it needs. Therefore the calculation happens in PHP after all the SQL has been executed and transformed into entity objects.

As long as your calculation logic isn’t too “complicated”, so that your PHP process takes significantly longer to execute this should be fine.


If that is the case, than caluclating in PHP is probably not the right way to go. Then I’d go with a finder which adds your calculated field via SQL (which in most cases is much faster than PHP).
This if course requires you to maybe write some more complicated SQL


Saving data shouldn’t affect anything from above, since - after saving - the data is re-fechted and the entity is rebuilt.

Thank you all for the responses.

I’ll take some comfort in that, apart from Queues, it seems my options were at least viable and working on the right concepts even if I don’t have the experience to know the best option (yet).

For Queues, I looked at those for a different reason when working in CakePHP 4 and remember having an issue due to being on shared hosting but maybe I’ll have a look at them again.

For what happens more, saving or viewing, overall, viewing is more common as every project team member can view a project but only some can update expenses. When using the site navigation, rather than a direct link, the first page a user accesses on a project is the dashboard so it gets a lot of hits. For an individual user, who as authority to update expenses, they may save more than they view as they can add 5, 10 or more expenses each time they access a project.

I’ll give the callback/CounterCache option another go and monitor how it goes. Between testing and maybe just a bit of trust I won’t need to worry about the sync issues. Having all the values I need stored as data would certainly make things easier on the front end.

Quick update, in case it helps others.

I took what @KevinPfeifer said and found this (old) blog from @dereuromark Virtual query fields in CakePHP – DerEuroMark and looked into virtual fields more. I had always created my virtual fields at an entity level with the “_get” structure, which didn’t feel right in this instance. Having them as part of my finder makes a lot more sense, in some cases.

I’ve now, probably unsurprisingly, implemented the various calculated fields in the way that seems most appropriate for each specific calculation. With a little additional logic around the finder, I can restrict when the virtual fields are added to my query so as there’s no impact if they are not needed while keeping my custom finders DRY.

Thanks again all for the help.