How to ensure a millimetre-precise Price Calculation Engine?
The importance of the price calculation engine (PCE)
For a successful e-commerce website, price is one of the key points. Throughout the client’s navigation, the price must remain consistent with the items selected. A price that changes from one page to another creates frustration and uncertainty, which turn into mistrust. Mistrust leads to a bad experience, and a bad experience means the client never comes back. A client who doesn’t come back is not just a single lost sale, but also the loss of all the potential clients around them. This is how a single component can deeply impact the profit of a website.
Why does the PCE so often turn into a black box during the development of a website? The functional features that define the PCE grow at the same rate as today’s websites. This means continuously increasing complexity, an outdated history, and knowledge that can’t keep up. Little by little, the PCE becomes a black box, and nobody wants to work with it. Nobody really knows how it works, but everybody knows that a single modification will end up with a long list of problems.
Below, I describe how I handled this problem for a website which lets you obtain an online quote for a vehicle repair or maintenance. Our PCE covered multiple repair types (tyres, glass, mechanical or body), but the trickiest part was the discounts (applicable by repair type, by minimum amount, by convention, cumulative or not…). The problems were that prices were sometimes inconsistent from the beginning to the end of the journey, and that adding a new repair type or a promotion was very complicated.
Now I’ll go through the steps I followed to solve our case and make the PCE more manageable. Our PCE had grown too fast, and too many rules had appeared without anyone thinking about the conflicts between them, which produced a kind of unmanageable monster. In the beginning, everything starts with a good design; the conflicts between topics only appear later.
Most of the time, the black box doesn’t really exist as such. Instead, pieces of the PCE are scattered all over the application. This causes several problems:
- Finding errors: there are multiple places to look (and some are hidden);
- Understanding its behaviour: it’s complicated to follow the execution flow correctly;
- Testing: it has too many dependencies.
To solve those problems, we first have to sweep everything under the same carpet: bring the whole PCE into a single place. The result was ugly, but beauty wasn’t the goal. If we can get almost all the rules into a single place, we have defined boundaries with nothing else inside. This was our first victory.
With all the logic in the same box, we can now see that the PCE calls multiple other services. The PCE obviously needs a lot of information (discount information, spare parts prices, hourly rates…) from the other services to deliver the correct price.
Where is the problem now?
- Too many entry points;
- Too many dependencies;
- The context needed to call the PCE is too complicated.
How will the Hexagonal Architecture help us? The Hexagonal Architecture promotes the independence of each part of the system, which means that the PCE shouldn’t need any other part to work correctly.
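As an illustration of that independence, here is a minimal sketch of a port and an adapter in Python. It is not the code of our application: the names `HourlyRateProvider`, `FixedRates` and `labour_cost` are hypothetical and only show the idea that the engine depends on an interface, not on a concrete service.

```python
from decimal import Decimal
from typing import Protocol


class HourlyRateProvider(Protocol):
    """Port: the only thing the engine knows about the outside world."""

    def hourly_rate(self, repair_type: str) -> Decimal: ...


class FixedRates:
    """Adapter backed by a dictionary (hypothetical, handy in tests).

    A production adapter could call another service instead.
    """

    def __init__(self, rates: dict):
        self._rates = rates

    def hourly_rate(self, repair_type: str) -> Decimal:
        return self._rates[repair_type]


def labour_cost(rates: HourlyRateProvider, repair_type: str, hours: Decimal) -> Decimal:
    """Engine-side code: it only uses the port, never a concrete service."""
    return (rates.hourly_rate(repair_type) * hours).quantize(Decimal("0.01"))
```

With this shape, swapping the source of the hourly rates does not require touching the calculation itself.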
How do we get there?
- I identified all the data that the PCE needs;
- I identified all the entry points to the PCE;
- I created an object which includes all the information identified in the first step;
- I modified all the entry points listed in the second step to accept only the object created in the third step.
This way, all the entry points share the same format and, even better, the PCE becomes easier to test. We ended up with a big object to call the PCE, and it carried too much information (sometimes unnecessary), but that wasn’t our problem at that moment.
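The idea of a single input object can be sketched as follows. This is only an illustration under assumptions: the field names, the `Discount` shape and the pricing formula are hypothetical and much simpler than real rules.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass(frozen=True)
class Discount:
    """Hypothetical discount: a percentage tied to one repair type."""
    repair_type: str
    percent: Decimal


@dataclass(frozen=True)
class PriceRequest:
    """Everything the engine needs, gathered by the caller beforehand."""
    repair_type: str          # e.g. "tyres", "glass", "mechanical", "body"
    parts_total: Decimal      # spare parts price, already looked up
    labour_hours: Decimal
    hourly_rate: Decimal
    discounts: Tuple[Discount, ...] = ()


def compute_price(request: PriceRequest) -> Decimal:
    """Single entry point: the result depends only on the request."""
    subtotal = request.parts_total + request.labour_hours * request.hourly_rate
    for discount in request.discounts:
        # Assumed rule: a discount applies only to its own repair type.
        if discount.repair_type == request.repair_type:
            subtotal -= subtotal * discount.percent / Decimal(100)
    return subtotal.quantize(Decimal("0.01"))
```

Because the function reads nothing but its argument, a test only has to build a `PriceRequest`; no service needs to be started or mocked.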
Here we are, at the step which guarantees that future modifications won’t impact the old rules. But this step is far from easy. Here is the sequence I followed to put the needed tests in place:
1. Create a single test per repair type with the basic parameters;
2. Add more tests with the most common parameters and discounts;
3. Continue with the less common parameters and discounts;
4. Add the extreme cases;
5. Add the impossible cases (if it’s impossible, we must ensure that the client is aware of it).
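A test sequence like this one could look like the sketch below, written with Python’s `unittest`. The `compute_price` function is a stand-in with a hypothetical signature and formula; in practice the tests would call the real entry point, and the expected values would be the prices the engine returns today.

```python
import unittest
from decimal import Decimal


def compute_price(repair_type, parts, hours, rate, discount_percent=Decimal(0)):
    """Stand-in for the real entry point (hypothetical signature and rule)."""
    if parts < 0 or hours < 0:
        raise ValueError("negative quantities are impossible")
    subtotal = parts + hours * rate
    discounted = subtotal * (Decimal(100) - discount_percent) / Decimal(100)
    return discounted.quantize(Decimal("0.01"))


class PriceCharacterizationTests(unittest.TestCase):
    # Each row pins down one known price, from basic to extreme cases.
    CASES = [
        # label, repair type, parts, hours, rate, discount %, expected
        ("basic tyres", "tyres", "120", "0.5", "60", "0", "150.00"),
        ("common discount", "body", "300", "2", "55", "10", "369.00"),
        ("extreme: fully discounted", "glass", "80", "1", "50", "100", "0.00"),
    ]

    def test_known_prices(self):
        for label, kind, parts, hours, rate, discount, expected in self.CASES:
            with self.subTest(label):
                price = compute_price(
                    kind, Decimal(parts), Decimal(hours), Decimal(rate), Decimal(discount)
                )
                self.assertEqual(price, Decimal(expected))

    def test_impossible_case_is_reported(self):
        # An impossible input must fail loudly so that the caller can tell the client.
        with self.assertRaises(ValueError):
            compute_price("tyres", Decimal("10"), Decimal("-1"), Decimal("60"))
```

Keeping the cases in a table makes it cheap to add a row each time a new rule or a forgotten combination is discovered.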
This was a long task. I needed to talk with several people to understand some functional rules, and I investigated the history of functional requests. I also found some existing discrepancies between the expected and the actual results. But I didn’t spend time trying to correct them at that point; I would handle them later. In the end, I could guarantee the stability of the PCE.
Having done a lot of investigation into the past of the PCE, I must not discard all this information. I should now document the system as much as possible. But where? The best place remains inside the application. If the documentation and the algorithms are separated, they won’t evolve together. Some complex parts were described directly in the code, but most parts of the application were described by the tests. We must ensure that the mechanisms are well described and understood (and not only by the person who wrote them).
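One way to keep documentation and algorithm together in Python is a docstring whose examples are executed by `doctest`. The sketch below is an assumption for illustration, not our actual rule: the function `apply_discounts` and the "largest discount wins" behaviour are hypothetical.

```python
import doctest
from decimal import Decimal


def apply_discounts(amount, percents, cumulative):
    """Apply percentage discounts to an amount.

    Rule (hypothetical): when discounts cannot be accumulated,
    only the largest one applies.

    >>> apply_discounts(Decimal("100"), [Decimal("10"), Decimal("20")], cumulative=False)
    Decimal('80.00')
    >>> apply_discounts(Decimal("100"), [Decimal("10"), Decimal("20")], cumulative=True)
    Decimal('72.00')
    """
    chosen = percents if cumulative else [max(percents, default=Decimal(0))]
    for percent in chosen:
        amount -= amount * percent / Decimal(100)
    return amount.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Fails visibly if the documented examples drift from the code.
    doctest.testmod()
```

If someone changes the rule without updating the description, the examples in the docstring stop passing, so the two cannot silently diverge.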
Last step. I’ve already managed to isolate the PCE. I’ve also managed to stabilize it against modifications. And finally, I’ve documented the system to improve its readability. Now it’s time to rethink the logic and find an architecture which best matches the current needs and future features. It’s time to delete the rules which are outdated.
There is no procedure I can give for this task, because it will depend on the architecture, on the technologies used, and on many other aspects. Nevertheless, the system is now stable and ready to accept a big refactoring of the PCE without fear.
In the end, we succeeded in getting a stable PCE:
- The tests guard against regressions;
- The documentation allows quick onboarding and eases modifications;
- The isolation helps with the integration of new repair types;
- And some other advantages.
Maybe in a few years the current architecture will be outdated, but as the system is isolated, it won’t cause more headaches.
Procedures like the one I’ve described in this article can be used to stabilize and refactor any part of a system. I’ve mainly focused on the PCE because it’s one of the most sensitive parts and one of the most important for the business. The procedure I’ve described will ensure stability, reveal hidden features and problems, and allow you to start refactoring any critical part.