Newsletter 23-July-2007

Welcome to the cpearson.com Excel Newsletter for Monday, 23-July-2007. In this inaugural issue, we will cover two topics: Circular References In Formulas and Understanding Rounding In Excel.

Our first topic is an introduction to Circular References in formulas. A circular reference in a formula is a cell reference that refers to the same cell containing the formula. A circular reference may be direct or indirect. A direct circular reference is when a formula refers to itself; for example, if the formula in cell A1 refers to cell A1. An indirect circular reference is a formula that refers to another cell and that cell refers back to the cell containing the formula, either directly or indirectly. For example, if A1 refers to B1, and B1 refers to C1, and C1 refers back to A1, then A1 is said to have an indirect circular reference. Most of the time, Excel quite rightly flags a circular reference as an error.
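To make the direct/indirect distinction concrete, here is a minimal sketch in Python (outside of Excel) that follows references from a cell and reports the path that leads back to it. The function name and the dictionary of references are hypothetical and used only for illustration; this is not how Excel itself detects circular references.

```python
def find_circular_reference(precedents, start):
    """Return a path of references from start back to start, or None.

    precedents maps a cell address to the list of cells its formula refers to.
    """
    stack = [(start, [start])]
    visited = set()
    while stack:
        cell, path = stack.pop()
        for ref in precedents.get(cell, []):
            if ref == start:
                # The chain of references has come back to the starting cell.
                return path + [ref]
            if ref not in visited:
                visited.add(ref)
                stack.append((ref, path + [ref]))
    return None


# Hypothetical worksheets: which cells each formula refers to.
direct = {"A1": ["A1"]}
indirect = {"A1": ["B1"], "B1": ["C1"], "C1": ["A1"]}
no_cycle = {"A1": ["B1"], "B1": ["C1"], "C1": []}

direct_path = find_circular_reference(direct, "A1")
indirect_path = find_circular_reference(indirect, "A1")
no_cycle_path = find_circular_reference(no_cycle, "A1")

print(direct_path)    # ['A1', 'A1']
print(indirect_path)  # ['A1', 'B1', 'C1', 'A1']
print(no_cycle_path)  # None
```

A path of length two (A1 back to A1) is a direct circular reference; anything longer is indirect.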

However, you can configure Excel to allow circular references and use them to your advantage. We will look at two examples of circular references.

ENABLE CIRCULAR REFERENCES: You can enable circular references from the Calculation tab of the Options dialog. In Excel, go to the Tools menu, choose Options and then the Calculation tab. There, check the Iteration box and, for these examples, set the Maximum Iterations to 1.

The first example is to find and persist the maximum value of the cells in a range. Normally, you find the maximum of a range by using the MAX worksheet function. For example, =MAX(A1:A5) will return the maximum value in the range A1:A5. However, let's put a twist in the requirements. Instead of just calculating the current maximum, suppose we want to find the maximum value ever reached by any cell in A1:A5. That is, if the maximum at one point in time is 22, and then later the maximum is 15, we want to keep the value 22 rather than dropping down to 15. This is easily accomplished with a circular reference. First, enable circular references as described above. Then, in cell E5, for example, enter the formula =MAX(A1:A5,IF(E5=0,MAX(A1:A5),E5)). If you enter the formula in a different cell, change the references to E5 to that cell. When you first enter the formula and it calculates the first time, the result is the maximum of cells A1:A5. The next time a cell in A1:A5 is changed, the formula compares the current maximum of A1:A5 with the value of E5 itself and returns the larger of the two. Therefore, the formula returns the maximum value ever reached by the values in A1:A5. If the current maximum of A1:A5 falls below a previous maximum, that previous maximum remains in E5.

This formula can be modified to return a persistent minimum. With circular references enabled, enter the following in E7:
=MIN(A1:A5,IF(E7=0,MIN(A1:A5),E7))
This will return the persistent minimum value of the cells in A1:A5. If a new minimum is reached, that new minimum is stored in E7. If the current minimum in A1:A5 is greater than a previous minimum stored in E7, the value of E7 is retained.

It should be noted that if you edit the formulas, they will reset themselves and return the current maximum or minimum of A1:A5 and the persisted value will be lost.
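The behavior of these two formulas can be sketched outside of Excel. The Python below is only a model, under the assumption that each recalculation makes a single pass (Maximum Iterations set to 1) and that the formula cell reads as 0 before its first calculation. The function names and sample values are hypothetical.

```python
def recalc_persistent_max(values, e5):
    """Model one pass of =MAX(A1:A5,IF(E5=0,MAX(A1:A5),E5))."""
    current_max = max(values)
    previous = current_max if e5 == 0 else e5
    return max(current_max, previous)


def recalc_persistent_min(values, e7):
    """Model one pass of =MIN(A1:A5,IF(E7=0,MIN(A1:A5),E7))."""
    current_min = min(values)
    previous = current_min if e7 == 0 else e7
    return min(current_min, previous)


e5 = 0  # value of the formula cell before it first calculates
e7 = 0
history = []
# Three successive states of A1:A5 (made-up sample data).
for a1_to_a5 in ([3, 22, 7, 1, 9], [3, 15, 7, 1, 9], [3, 30, 7, 4, 9]):
    e5 = recalc_persistent_max(a1_to_a5, e5)
    e7 = recalc_persistent_min(a1_to_a5, e7)
    history.append((e5, e7))

print(history)  # [(22, 1), (22, 1), (30, 1)]
```

One consequence visible in the model: because the formulas use 0 to mean "no value stored yet", a persisted value that is legitimately 0 is treated as empty and replaced by the current maximum or minimum on the next calculation.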

Time Sensitive Lookups

Another use of circular references is to control when a VLOOKUP should return a result. Suppose you have a table of stock prices in D1:E5, with the ticker symbol in column D and the share price in column E. Further assume that these prices are updated daily, but you want to take the share price only on the first day of the month. For all other days of the month, you don't want to get an updated price. The price returned by the formula should always be the price on the first day of the month. The following formula in cell C10 will do this, assuming that the ticker symbol to look up is in cell C1 and the current date is in A1 (e.g., calculated with the formula =NOW()).

=IF($C10=0,VLOOKUP($C$1,$D$1:$E$5,2,FALSE),
IF(DAY($A$1)=1,VLOOKUP($C$1,$D$1:$E$5,2,FALSE),$C10))

Note that the formula above is split into two lines for readability. In Excel, the formula should be entered as a single line.

This formula first looks at itself, $C10, and if that is 0 it does a VLOOKUP into the share price table $D$1:$E$5 to get the current price. This will occur only when the formula is first entered or when you edit it later. If $C10 is not 0, the formula tests the DAY of $A$1 and if that is equal to 1, indicating the first day of the month, the formula does a VLOOKUP into the share price table and returns the price for the ticker symbol in cell $C$1. If the DAY is not equal to 1, the formula returns its own value, $C10.
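The decision logic of this formula can be modeled in a few lines of Python. This is a sketch only, assuming one calculation pass per day; the function name, the ticker "ABC", and the prices are made up for illustration.

```python
def recalc_c10(c10, day_of_month, prices, ticker):
    """Model one pass of the time-sensitive lookup formula in C10."""
    if c10 == 0:
        # Formula was just entered or edited: take the current price.
        return prices[ticker]
    if day_of_month == 1:
        # First day of the month: refresh from the price table.
        return prices[ticker]
    # Any other day: keep the value already in the cell.
    return c10


c10 = 0
results = []
# (day of month, price table on that day), hypothetical data.
for day, table in [(28, {"ABC": 10.0}), (29, {"ABC": 11.5}),
                   (1, {"ABC": 12.25}), (2, {"ABC": 13.0})]:
    c10 = recalc_c10(c10, day, table, "ABC")
    results.append(c10)

print(results)  # [10.0, 10.0, 12.25, 12.25]
```

The price taken when the formula is first entered (10.0) is held until the first day of the month, when it is replaced by that day's price (12.25).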

Circular references can be a powerful tool when building formulas. However, care should be used with them. Because they cause Excel to loop during calculation, they can cause slow performance if you have a large number of circular references. Moreover, once you enable circular references, they are enabled for all cells, so you will not be alerted if you have a circular reference where one does not belong.

Understanding Rounding In Excel

You may have found cases in which Excel does not give you the answer you expect -- the calculated result of a formula is off by a small amount. There are two concepts to understand in order to properly interpret these errors. The first is the number formatting used to display a cell's value. The second is an inherent limitation of how computers store fractional numbers.

Number Formatting And Rounding

Number formatting determines how values are displayed on the screen. You may, for example, have a range of numbers formatted to display results to two decimal places. It is important to understand that Excel always stores numbers to the greatest precision allowed by hardware and software. (This limitation is discussed later.) Typically, this is about 15 digits, counting numbers on both sides of the decimal point. Even if you are displaying only two places after the decimal point, Excel still stores the number out to the full 15 digits and uses that in calculations. This can cause your displayed results to disagree with the "right" answer.

For example, suppose you have the formula =1/3 in cells A1, A2 and A3, and you format these cells to display two decimal places. You will see on the screen 0.33, 0.33, and 0.33. If you then add up these cells with the SUM function, =SUM(A1:A3), you get the value 1.00, which disagrees with what you might expect. Doesn't 0.33 + 0.33 + 0.33 equal 0.99? Is this a bug in Excel? No, it is not a bug. Regardless of how you display the result, the values in A1, A2, and A3 are stored to the full 15 digits of precision, 0.333333333333333, and it is this value that is used in calculations. When this number is added three times, it adds up to 0.999999999999999, which is rounded to 1.0. Again, this is all independent of how you have the cells formatted to display. Excel always uses all 15 digits of precision. This discrepancy between the calculated value and the displayed value is often called the "penny off" effect, because results seem to be off by a penny (or more).

There are two methods you can use to cope with this.

The first and better solution is to use the ROUND function to calculate with a specified number of digits. Using ROUND, you round each element of the sum to the desired number of decimal places. In the example above, instead of =SUM(A1:A3) you could use the formula =ROUND(A1,2)+ROUND(A2,2)+ROUND(A3,2). Note that this formula is NOT the same as =ROUND(SUM(A1:A3),2). The former formula will give you the result 0.99 and the latter will still be a "penny off" with a result of 1.00. While this works, it would be tedious to write such a formula for a range with many cells. Instead, you can use the array formula =SUM(ROUND(A1:A3,2)). Array formulas will be discussed in a later newsletter, so suffice it to say that when you type in the array formula, you must press CTRL+SHIFT+ENTER rather than just ENTER. You must do this when you first enter the formula and whenever you edit it later. If you do this properly, Excel will display the formula enclosed in curly braces { }. Array formulas are discussed in detail at www.cpearson.com/Excel/ArrayFormulas.aspx.
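The difference between rounding each element and rounding the total is not specific to Excel. As a rough illustration, the Python sketch below performs the same two calculations; the variable names are hypothetical. One caveat: Python's round() resolves exact ties differently from Excel's ROUND, but no ties occur in this example.

```python
values = [1 / 3, 1 / 3, 1 / 3]  # the three cells A1:A3

# Like =SUM(ROUND(A1:A3,2)): round each element, then add.
sum_of_rounded = sum(round(v, 2) for v in values)

# Like =ROUND(SUM(A1:A3),2): add at full precision, then round.
rounded_sum = round(sum(values), 2)

# Format to two decimal places, as the worksheet cells would display.
print(f"{sum_of_rounded:.2f}")  # 0.99
print(f"{rounded_sum:.2f}")     # 1.00
```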

The second and less desirable method is to set the Precision As Displayed option to True in the Calculation tab of the Options dialog. This is less desirable because number precision will be lost permanently. With two decimal places, using our example above, this will result in =SUM(A1:A3) having a result of 0.99 regardless of the number of digits. Thus, Excel will calculate 1/3 + 1/3 + 1/3 to be equal to 0.9900000000, which is clearly incorrect.

Internal Number Storage And Rounding

A second cause of numeric "discrepancies" is due to the nature of how numbers are stored in the binary system of a computer. First, let's look at a decimal number. In base 10, the number 0.4321 is the result of (0 * 1) + (4 * 0.1) + (3 * 0.01) + (2 * 0.001) + (1 * 0.0001). This number is accurately represented in base 10 arithmetic. But now consider the number 1/3. No matter how many decimal places you carry this number out, 0.33333...33 is still only an approximation of the real number 1/3.

The same sort of restriction occurs in computers. Unless a number can be stored as a sum of powers of 2 (such as 1/2, 1/4, or 1/8), the number stored and used by the computer is an approximation of the real number. Arithmetic done with approximated numbers can have very small discrepancies due to the approximation. These small discrepancies can compound one another in complex formulas. To see this in action, enter 200.41 in cell A1, 200 in cell A2, =A1-A2 in cell A3, and =(A3-0.41)=0 in cell A4. Mathematically speaking, A4 should return TRUE, indicating that the result of the subtraction is equal to zero. Instead, you'll get FALSE. If you increase the number of displayed decimal places in A3 far enough, you'll see that the real result is 0.409999999999997. This is because 200.41 cannot be stored exactly in binary. The value Excel stores is the closest available approximation, and subtracting 200 exposes the small error in that approximation.
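Because this behavior comes from the way double-precision numbers are stored rather than from Excel itself, the same worksheet can be reproduced in any language that uses the same number format. The Python sketch below is one such illustration; the variable names simply mirror the cell addresses, and the Fraction class is used only to inspect the exact value that is stored.

```python
from fractions import Fraction

a1 = 200.41
a2 = 200
a3 = a1 - a2            # like =A1-A2
a4 = (a3 - 0.41) == 0   # like =(A3-0.41)=0

print(f"{a3:.15f}")  # 0.409999999999997
print(a4)            # False

# 0.5 is a power of 2, so it is stored exactly.
half_is_exact = Fraction(0.5) == Fraction(1, 2)
# 200.41 is not a sum of powers of 2, so the stored value is an approximation.
value_is_exact = Fraction(200.41) == Fraction(20041, 100)

print(half_is_exact)   # True
print(value_is_exact)  # False
```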

This is neither a bug nor a flaw in Excel. Indeed, Excel adheres to the accepted industry standards for "double-precision floating point" numbers, as established by the Institute of Electrical and Electronics Engineers (IEEE). The "discrepancy" in the result is well known and well documented. To work around this inherent limitation, you can modify your formulas such that instead of testing whether a number is, say, zero, you test whether its absolute value is less than a very small number. For example, in A5, enter the formula =ABS(A3-0.41)<0.000000001. The ABS function is needed because the difference may be negative, and without it any negative difference, however large, would pass the test. This formula will correctly return TRUE, indicating that the difference is very small, small enough to be considered equal to zero for nearly any practical purpose. This is a common technique in both worksheet functions and programming languages to take into account the approximations of floating point numbers.
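The same tolerance technique looks like this in a programming language. The sketch below is in Python; nearly_equal is a hypothetical helper written for this example, while math.isclose is a standard library function that does the same job. The comparison uses the absolute difference so that a large negative difference is not mistaken for equality.

```python
import math


def nearly_equal(x, y, tolerance=1e-9):
    """Treat x and y as equal when they differ by less than tolerance."""
    return abs(x - y) < tolerance


a3 = 200.41 - 200

exact_match = (a3 == 0.41)
tolerant_match = nearly_equal(a3, 0.41)
library_match = math.isclose(a3, 0.41, abs_tol=1e-9)

print(exact_match)     # False
print(tolerant_match)  # True
print(library_match)   # True

# Without the absolute value, a large negative difference slips through.
one_sided = (-5 - 0.41) < 1e-9
two_sided = nearly_equal(-5, 0.41)
print(one_sided)  # True, which is the wrong answer
print(two_sided)  # False
```

The size of the tolerance is a judgment call that depends on the magnitude of the numbers being compared; 0.000000001 is reasonable for values like those in this example.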

Excel and the IEEE standard provide accuracy to about 15 digits (the total number of digits on both sides of the decimal point). Anything beyond that is undefined. You can see this with whole numbers if you try to enter a very large number (more than 15 digits) in a worksheet cell. Excel will round 1234567890123456 to 1234567890123450. Hint: If you need to store a number such as a long phone number or a credit card number in a cell and you will not be doing any mathematical operations with that number, prefix the number with an apostrophe. The apostrophe will not display in the cell, but will appear in the formula bar. The apostrophe tells Excel to treat what follows as literal text, not to be converted in any way.
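The same trade-off between storing a long identifier as a number or as text can be sketched in Python. This is an illustration only: the 19-digit value is made up, and a raw double-precision number keeps slightly more than the 15 digits Excel retains, so the exact digits lost differ from what Excel shows.

```python
long_id = 1234567890123456789  # hypothetical 19-digit identifier

# Stored as a double-precision number, the trailing digits are not preserved.
as_number = float(long_id)
survived = int(as_number) == long_id
print(survived)  # False

# Stored as text, every digit is kept, like the apostrophe prefix in Excel.
as_text = str(long_id)
print(as_text)  # 1234567890123456789

# Above 2**53, not every whole number has its own double-precision value.
collide = float(2**53) == float(2**53 + 1)
print(collide)  # True
```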

An in-depth discussion of rounding and a technical description of the IEEE standard for floating point numbers can be found at www.cpearson.com/Excel/Rounding.htm .