howtofilterdataformulas.png

Dynamically Filter Data Using Only a Formula

In my previous post I went over advanced filters in Excel. This time around I’ll go over how to achieve the same result using just a formula. No macros, no VBA, just through a not-so-simple formula that can dynamically update based on your selections.

I’m again going to use my sample database file for this example. Here’s an excerpt of what that looks like:

Filtering based on one criteria


I’m going to start by filtering all that entries for a specific sales rep.

First, I’m going to use the INDEX function to select the range from where I’m pulling data from.

=INDEX(SampleDatabase!$A$1:$G$1000

For my results, I’m going to want them to show up in the order they appear in the database. For example, in the excerpt above Rep D shows up on lines 3 and 5, and I want that same order to stay intact.

In order to do this, I’m going to use the IF, SMALL and ROW functions, which will be inserted in the INDEX function.

SMALL(IF(SampleDatabase!$C$1:$C$1000=$I$2,ROW(SampleDatabase!$A$1:$A$1000),””),ROW(A1)),1)

$C$1:$C$1000=$I$2 : In this argument, assume that $I$2 is where I have my sales rep name, in this case it would be Rep D. Because I’m only interested in rows that relate to Rep D, this is the main argument that I want to evaluate.

ROW(SampleDatabase!$A$1:$A$1000), “”) : This will return the row number if the above argument is true. It doesn’t matter whether I reference column A, B, C or any other since I’m only pulling the row number. If it isn’t a match, the result will be a blank value.

ROW(A1) : This returns a value of 1, and what this will accomplish is that it will pull the smallest row number from the above list. For instance, for Rep D we know that lines 3 and 5 will be a match, but the smallest number, or the first time that there is a match, is 3. As I drag this formula onto subsequent lines, the row number, because it isn’t frozen, will change and on line 2 it will pull the second smallest row number, on the third line it will be the third smallest, and so on.

The last argument is which column you want to extract. I left it as 1, and that will return the date since that is the first column in my INDEX argument. However, if I wanted to pull the total sales, I could change that to 7, since that would indicate column G, which is the seventh column in the data set that I specified.

The completed formula will look as follows:

=INDEX(SampleDatabase!$A$1:$G$1000,SMALL(IF(SampleDatabase!$C$1:$C$1000=$I$2,ROW(SampleDatabase!$A$1:$A$1000),””),ROW(A1)),1)

This formula will need to be entered as an array, so be sure to hit CTRL+SHIFT+ENTER.

The first five results look as follows:

The one caveat is that if you don’t know how many entries you’ll have and copy the formula down too far, you’ll inevitably end up with #NUM! errors because the formula has not found any more matches. What you can do in this case is use the IFERROR function and include it in the formula:

=IFERROR(INDEX(SampleDatabase!$A$1:$G$1000,SMALL(IF(SampleDatabase!$C$1:$C$1000=$I$2,ROW(SampleDatabase!$A$1:$A$1000),””),ROW(A1)),1),””)

What this will do is now show a blank value if there are no more matches.

Filtering for multiple criteria


While it’s nice to be able to filter for just one criteria, what if you wanted to look for the entries with multiple conditions? Although this makes our already long formula even longer, it is still possible.
Much of the formula stays the same, and the key to making it work is by changing the first argument in the IF statement. Previously, It was only looking for the Sales Rep to be a match:
$C$1:$C$1000=$I$2

I’ll add another criteria, this time for records that include Product E, and I’ll put the product criteria in the cell below in I3. I will add the following to the formula:
$D$1:$D$1000=$I$3
How I combine the two arguments is by multiplying them by one another:
($C$1:$C$1000=$I$2)*($D$1:$D$1000=$I$3)=1

I add the =1 at the end because if both conditions are true then they will result in a 1 value for that line. For example, in the first condition it will look at whether the sales rep is a match, if it is the value will be true (or 1), and if not, it will be false (or 0). The same will happen if the product matches. 
Therefore, if either one of those conditions is false then a 0 will be returned and the two conditions multiplied against one another will not equal 1.
Below is how the new formula looks:
=IFERROR(INDEX(SampleDatabase!$A$1:$G$1000,SMALL(IF(($C$1:$C$1000=$I$2)*($D$1:$D$1000=$I$3)=1,ROW(SampleDatabase!$A$1:$A$1000),””),ROW(A1)),1),””)
You’ll notice much fewer matches in column L (multiple criteria) than in column J where I was only looking for the sales rep to be a match.
If you go back to the original excerpt I showed, you’ll see that for the 3/21/2017 entry, it was for both Rep D and Product E. If I change the values in column I then my calculations will adjust accordingly.

RAND Functions

Use Excel’s Random Number Generators to Populate Sample Data

Excel has random number generator functions which are useful if you need to test a template or create some sample / dummy data.

The RAND function returns a number between 0 and 1. You could multiply this by a factor of 10 or 100 to get a much larger number if you need it. Once you are happy with the data that has been populated then you will want to copy and paste it as values otherwise the numbers will change every time a recalculation occurs. This is true of any random number function in Excel.
Similarly, the RANDBETWEEN function will return a random  number between a range that you specify. In the below example I use 1 and 100:
RAND Functions
If I used the RAND function and just multiplied by 100 I could get similar results to RANDBETWEEN. The latter just saves you that extra step by being able to specify your parameters right in the formula. It allows you also to be more specific (say for example I wanted a random number between 1 and 35, the RANDBETWEEN function would certainly be easier to use)
If you do not like the random numbers you have generated, you can simply just hit the delete key on an empty cell and your random numbers will be regenerated. If you don’t want your numbers to change anymore then you will want to copy them and paste as values.
To use this to create sample data I am going to make two lists, one for employees, one for stores. From there, I can use the RANDBETWEEN function in conjunction with the INDEX function to extract values from the lists:
RANDBETWEEN Function
In columns A and B I am just using the RANDBETWEEN function to select a number between 1 and 5, as that is the number of different employees and stores I have listed in columns C and D. In columns E and F I use the INDEX function to extract from those lists using the random numbers generated in columns A and B.
I will break down the INDEX formula in column E a little bit here:
=INDEX($C$2:$C$6
In the first argument I am selecting my employee list since this is where I want the result to come from. I also lock the cells using the $ sign to ensure that as I copy the formula down that range is locked and will always reference C2:C6.
=INDEX($C$2:$C$6,A3
The second argument in the INDEX function is the row number from where I want to pull my value. Cell A3 is my first random value – which in this case is 4. This means that from the Employee List range (C2:C6) I want the value on the fourth row of that range – which is not row 4, it is row 5 since I start counting from the start of the range, which is on row 2. As I copy this formula down the row number will change to the corresponding value in column A. 
Because the lists only have one column you could actually stop here.
=INDEX($C$2:$C$6,A3, 1)
The last argument in the INDEX function is the column number. In this example I only have 1 column in my lists so the value is equal to 1. You could skip this argument and it will still work however it’s good habit to always enter the column number.
The above formula tells me to look at range C2:C6, and extract the value from the row that is referenced on A3 (which is 4), and from column 1. That point of intersection is Employee 4, since it is on the fourth line of that range, and in the first and only column.
If I copy the formula down a cell it will do the same except this time pull the value from the row referenced in cell A4 – this time it is 2. As a result, my result is Employee 2 since that value is on the second row of the range and again in the first column. 
Column F is the same formula as column E except this time it references the store list (column D) and the second RANDBETWEEN column (column B). This is just to show you how you might fill in multiple items. You wouldn’t want to use column A again otherwise you are guaranteed that the same row will be returned and you will always have Employee 4 tied to Store 4, and thus, not truly random combinations. 
In this example I have my lists, random numbers, and results all in the same area for illustrative purposes but they do not need to be even on the same sheet.

VLOOKUP vs INDEX and MATCH

Lookups are popular in Excel and here I’ll look at the more popular one – VLOOKUP. However it’s not always the best option for doing lookups in Excel, and I’ll show you why.

VLOOKUP
Exact Matches
What Vlookup does is look for a value you have selected, and if it finds it, will return a value from the same table that corresponds to the matched value.
One of the key limitations of VLOOKUP is it cannot return results left of the matched item, only to the right. This is where I recommend the INDEX & MATCH formula (see later down), as that combination will allow you to go left or right and won’t require you to re-arrange your worksheet just to accommodate a formula.
Example 1

If I wanted to lookup value B in the table, I would enter the following formula:
=VLOOKUP(A7,A1:D4,2,false)
This will equal the value in cell B2, the number 2. If I changed the column number from 2 to 3:
=VLOOKUP(A7,A1:D4,3,false)
It would return the number 22, or cell C2.
If I selected column 5, it would result in an error because my table range (A1:D4) only contains four columns.
If instead of looking up letters in column A I wanted to lookup numbers in column B, I would have to change my table range from A1:D4 to B1:D4, and it would look like this:
=VLOOKUP(A7,B1:D4,3,false)
I would also have to change the value in A7 so that it is a number. But again, if I change the formula this way I cannot move to the left and find out what letter corresponds to my value. Not without re-arranging my table.
Note that before I changed the range column 3 related to column C, now it relates to D because the table has shifted. Column 3 relates to the column number in the table, not in the spreadsheet.
                                                                                   
Approximate Matches

The one strength of VLOOKUP is determining what category or range a value falls into. By changing the last argument in the formula to true, Vlookup no longer looks for an exact match. Why would this be useful? Let’s say you have the first 3 letters/numbers of a postal/zip code. Because there are so many combinations possible, you would have to list each one out to find an exact match. 

With VLOOKUP’s approximate match, it will determine the closest match (e.g. shipping rate codes won’t spell out an entire postal/zip code, but will often cite a range). Similarly, if you have tax brackets and need to know what bracket an income level falls into, this is where it would be useful as well. The one caveat is that the values in the table must be in ascending order.
Hlookup is a parallel formula to Vlookup, only that it looks horizontally rather than vertically.
Example 2
In this example, my formula looks like this:
=VLOOKUP(A7,A1:B4,2,TRUE)
The argument at the end has changed from FALSE to TRUE, meaning an exact match is no longer needed. If I had set it to TRUE, it would return an error. But in this case, it returns cell B2, or 2. The reason for this, is because the values are in ascending order, it correctly identifies that V3A falls between V2B and V3C. Since it has not yet reached V3C, it belongs to V2B. If V3A is changed to V4D, X, Y, or a value greater than V4D, it will equal 4, as it will recognize that it belongs to the highest category.

INDEX & MATCH
I mentioned using INDEX & MATCH will give you a more versatile formula. This formula is structured differently than VLOOKUP in that it will pull the coordinates from the row and column number you specify. The match function will allow you to determine the proper row number based on your search criteria, and the column number you can decide – whether it is left or right of the matched value, it doesn’t matter here.
Example 3
Going back to the Example 1, I’ll show you how using the INDEX & MATCH formula will be able to now move to the left and pull values from column A:
=INDEX(A1:D4,MATCH(A7,B1:B4,0),1)
The range is unchanged, but the second argument in the INDEX formula (relating to row number) is calculated using the MATCH formula. The MATCH formula looks for the value in A7 (1), in the range B1:B4 and returns the row number. The 0 in the MATCH formula represents an exact match. After the MATCH formula, the last argument in the INDEX formula is the column number, which has been set to 1, which will return the values in column A. The advantage of using INDEX and MATCH as you can see is you can change the column number to 1, 2, 3, or 4. Similarly, if you wanted to look up the values in column C instead of B, you would change the formula as follows:
=INDEX(A1:D4,MATCH(A7,C1:C4,0),1)
In this case you don’t need to change the column number, it doesn’t move since you don’t have to rearrange the table.
In short, when you should use either formula:
Looking up a value – INDEX & MATCH*
Finding a value based on ranges – VLOOKUP
*unless you just want a quick formula and the values you need are to the right