Why Is My Excel Spreadsheet Slow?

Identifying the common causes of slow Excel spreadsheets. Advice on how to diagnose and resolve problems through understanding how Excel works. Tips on functions to avoid in your spreadsheet.

 

Carbon offsetting in Excel

It may surprise you to learn that every time you enter a calculation in Excel you’re either planting a new tree or growing an existing one… How carbon friendly of you!

If you have, or know of, a spreadsheet that is slow…or super slow…or super duper slow, this article will shed some light on the likely reasons and suggest some ways to resolve the problem.

 

Dependency Trees

These trees I mentioned you were planting, are in fact Excel’s “Dependency Trees”, and are grown to make the calculation process of a spreadsheet more efficient. To explain, when a cell is updated in a workbook, assuming it has other cells ‘downstream’ that are dependent on it, Excel knows that some cells will need to be recalculated to ensure we have correct answers everywhere.

Now, Excel could update every calculation in the workbook. Every time. Just to be safe. This sounds remarkably conscientious. But of course, calculations need to be updated in a certain order or the final answer is sure to be the wrong answer. And if you’ve ever pressed Ctrl+Alt+F9 in a large spreadsheet or model, the lag you may get as all calculations are re-run can be quite annoying. You don’t want this every time any cell is updated.

Excel’s answer… as you build your calculations or model, it constantly maintains a record of which cells impact which cells which impact which cells which impact which cells…(snore!) Incredibly conscientious! It’s growing a Dependency Tree.

 

Warning: Dodgy tree analogy ensues

Each cell that has no precedents (i.e. no cells that feed into its calculation) - usually input cells, where we input our assumptions - can be thought of as the tip of a root. Roots converge into a trunk, then on to branches and finally leaves. Leaves would be cells with no dependents, only precedents, and often would be found on output worksheets in the spreadsheet. Water (or assumptions) is fed into the roots and travels to the leaves…this is the calculation process.

Now, the water fed into a root in this Dependency Tree doesn’t usually reach every leaf, it has a defined pathway, known as a “Calculation Chain”, which Excel is constantly keeping its eye on. It could reach one leaf, several leaves, many leaves, or them all. When an input is updated Excel marks every cell along the calculation chain as a “Dirty Cell” (actually official Microsoft terminology!) – these dirty cells must be recalculated. The water travels up and they are recalculated in the correct order and once the water has passed through they are….well, clean cells presumably!

This is an efficient calculation process as only the cells that need updating, are updated. And note, there will be many calculation chains in the spreadsheet.

 

Who cares?

You may be thinking… Who cares? Why do I need to know this? Have I wasted three minutes of my life reading dodgy tree analogies? To many Excel users this won’t matter, but if we are building or working with large or complex spreadsheets, or we are aware of some that our colleagues use, such horticultural knowledge can pay off. And it’s mainly to do with model performance.

 

Diagnosing performance issues

Almost every company has ‘one of those spreadsheets’… it’s clunky, slow, forever calculating and everybody hates it… except sometimes ‘that guy’… you know, the one who built it ages ago and is convinced this poor performance is somehow proof of just how advanced the spreadsheet is… but just no one else is on their level to understand why. I have heard the terms ‘Mega Model’ and the ‘Model to end all Models’ in two separate companies I’ve worked with. In both cases the model should have been trashed and rebranded the ‘Model to end all Models…that that person creates’!

The knowledge we’ve learnt so far can help diagnose and resolve such issues. And it’s good theory to set us along the path of designing and building efficient, high-performance spreadsheets. When the above performance issues present themselves, the problem(s) will be found somewhere in that Dependency Tree – we need to locate and resolve them. The two most common causes are bottleneck calculations and volatile functions.

 

Bottleneck calculations

Bottleneck calculations are exactly as they sound – a calculation or set of calculations that take forever and hold things up! For instance, a large block of SUMIFS or SUMPRODUCT functions that all reference a large array, such as a few thousand cells, can take a long time to calculate – they are memory intensive. And there may be several of them within the same formula. Excel may calculate other tributaries of the calculation chain quickly, but this part lets down the team. Locate these bottlenecks and revise them.

That said, the locating can be the tricky part. You’ll need to think about this Dependency Tree. For example, you’re seeing calculation lags when you’re updating which inputs? Which leaves are ultimately dependent on these inputs? Take a copy of the model and start deleting the leaves and the branches backwards, retesting the inputs and calculation speed as you go. Eventually you’ll delete the bottleneck calculations and know it because the calculation speed improves. So now you can solve the problem in the original model.

 

Volatile functions

But what if you have calculation speed issues every time you update any cell? That’d be annoying. And how would you locate the problem? Short of it being due to bottleneck calculations absolutely everywhere (which is a possibility), the problem can be due to the use of “Volatile Functions”. A volatile function is a function which Excel can’t be sure whether needs recalculating when any input in the spreadsheet is changed, regardless of this whole conscientious Dependency Tree business, and so recalculates anyway, just in case. In fact, many actions you take in Excel will trigger all volatile functions recalculating – whether it’s deleting a worksheet, applying a filter, or copying and pasting.

So if any input is changed, as well as that input’s calculation chain being marked as dirty and in need of recalculating, all cells containing volatile functions are updated as well…and all their calculation chains are marked as dirty and recalculated too.

Volatile functions include: INDIRECT(), OFFSET(), RAND(), RANKDBETWEEN(), NOW(), TODAY(), CELL() and INFO(). The closer they sit to the leaves, the less of a threat they pose to performance as there are fewer dependent cells to recalculate. The closer to the roots, the bigger the threat, particularly if there are knock-on memory intensive calculations. Either way, good spreadsheet design will always absolutely avoid them wherever possible. In my experience the most common culprits are INDIRECT() and OFFSET().

 

Replanting a forest

Finally, if you press Ctrl+Alt+Shift+F9 in a spreadsheet, Excel rebuilds all Dependency Trees in the spreadsheet (in fact all open spreadsheets) and recalculates everything. It’d be like replanting a forest. Now that is fairly useless knowledge…

Close

Two Step

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.