Google-sheets – Duplicate values in dynamic sorting

duplicategoogle sheetsgoogle-sheets-arrays

I have an issue with duplicate sorting values for a formula in Google Sheets.
I'm using two tables to order 10 scores (see pic). The values in Column D are dragged from another sheet. The values in Column B have a simple RANK formula based on Column D:

=rank(D4,D$4:D$13)

The values in the second table are based on the first in order to dynamically order the second table.

Formula in Column G using an INDEX and MATCH:

=index($C$4:$C$13,match(H4,$D$4:$D$13,0),1)

Formula in Column H uses a simple LARGE formula:

=large($D$4:$D$13,1)

Which all works fine until the values in Column D are the same and the formula in Column G repeats the first value it comes across in Column C (see attached file).

Anyone have a solution for when there are duplicate values in dynamic sorting?

Best Answer

The problem to be resolved is that neither the RANK command, nor the formula in column G are designed to recognise and avoid a duplicate values.

Both ROC and Fod both scored 17, but there is nothing to distinguish between them.

Thankfully Prashanth at infospired.com recognised the underlying problem and solved it in "Flexible Array Formula to Rank Without Duplicates in Google Sheets".

Here's a link to my worked spreadsheet.

Here's a screenshot of Fergal's data versus my version (using the infospired solution).

infospired created a new formula to create a "Rank without Duplicates". This is the formula in the screenshot:

=array_constrain(sort({row(D19:D28)-18,sort({row(D19:D28)-18,D19:D28},2,false)},2,true),9^9,1)

This is a complicated formula, but becomes easier to understand when it is broken into its component pieces. There are three steps:

sort#1
sort#2
produce the ranking.

Sort#1 - sort({row(D19:D28)-18,D19:D28},2,false)
There are three steps
1 - Get the column of raw scores (as-is)
2 - Create a new column consisting of the numbers 1 to ten (inclusive), and assign one number to each score. Note this isn't ranking, it's simply assigning numbers to each score with no duplicates. The numbers are created using ROW with this code row(D19:D28)-18. "row(D19:D28)" would create the numbers 19 to 28 but there is a second element "minus 18". This has the effect of deducting 18 from each row number; so instead of creating number 19, it creates number one (nineteen minus eighteen); two instead of twenty; three instead of twenty one; four instead of twenty two and so on.
3 - User SORT to sort the columns by score descending.

Sort #2 - sort({row(D19:D28)-18,sort({row(D19:D28)-18,D19:D28},2,false)},2,true)
This also consists of three steps
1 - Take the columns from sort#1
2 - Create a new column consisting of the numbers 1 to ten (inclusive), and assign one number to row from Sort#1. The same method applies as in Sort#1. The ROW command generates row numbers, but "minus 18" reduces the actual sequence to numbers one to ten inclusive.
3 - Sort the output by the number column assigned in Sort#1-Step#2; and sort the column ascending.

Produce the ranking This step uses the ARRAY_CONSTRAIN command . The output is an array.

array_constrain(<<column output>>),9^9,1)

1 - The input is the results from Sort#2-Step3.
2 - Create an array up to 9^9 rows long. This is a LOT; scope for over 387 million rows.
3 - 1 column wide. This is the column assigned in Sort#2. This number is now a rank, but there are no duplicate values.

The final table can be created using VLOOKUP commands for both Name and Score. Base the lookup on the Rank, and use the Rank, Name and Score table as data to be interrogated.

Related Solutions

Google-sheets – Sorting unique rows by a related SUMIF column

I have had a similar issue, with trying to sort a UNIQUE dataset.

the way i got around it was by creating another sheet and using the '=QUERY()' function to call the sheet with the UNIQUE list on it. This way you can add a ORDER BY statement to the query.

so in your example, Expenses has the source data, the summary table has the Summary data (with the UNIQUE function),

and then add a new sheet called "Summary Sorted" and then add the query function like this:

=query(Summary!A1:B4, "SELECT A,B ORDER BY B DESC")

Here is an example spreadsheet to demonstrate:

Google-sheets – How to filter a full table based on only unique values in a single column

I found a way to do it:

Select Column A and "Insert 1 left" to add a column. Your table data is moved over one to columns B - D.
In your newly blank Cell A2, enter formula
=IF(COUNTIF($B$2:$B2,B2)=1, "Unique", "Duplicate")
Copy the formula from A1 and paste it down the A column range for as far as you have data. The "$" absolute references will ensure everything works smoothly.
Add a new sheet. Copy and paste your old sheet's B - D column headers to the new sheet's A - C.
In the your new sheet's cell A2, enter formula
=FILTER(OldSheet!B2:D5, OldSheet!A2:A5 = "Unique")
Your new sheet now has your old sheets data -- Deduplicated!

I use it with ImportRange to separate Data Entry from Data Processing, so ImportData shouldn't present a problem.

You can hide the A column on your old sheet if you like.

Best of all, everything updates live on edit. So you can continue adding data to OldSheet and NewSheet will be always be current with filtered rows.

Best Answer

Related Solutions

Google-sheets – Sorting unique rows by a related SUMIF column

Google-sheets – How to filter a full table based on only unique values in a single column

Related Topic