I'm trying to set up a kind of a bid thing, where the top 5 respondents receive a prize. I was hoping to use SQL to get the top 5, but I can't remove duplicate bidders and only take their top bid. I wanted to do something like this:
=QUERY(range, "SELECT timestamp, bid, DISTINCT name, department ORDER BY bid DESC LIMIT 5")
But DISTINCT doesn't work in Google Sheets. Any idea as to what I should do in this situation?
Here is my dummy data set (ignore the timestamps):
Timestamp | Bid ($) | Name | Department |
---|---|---|---|
2/16/2021 10:45:07 | 18 | Jim | Accounting |
2/16/2021 9:42:40 | 22 | Bob | Accounting |
2/16/2021 9:26:15 | 21 | Samantha | Sales |
2/16/2021 9:26:15 | 30 | Jim | Accounting |
2/16/2021 10:45:07 | 43 | Jim | Accounting |
2/16/2021 9:42:40 | 23 | Susan | Engineering |
2/16/2021 9:26:15 | 45 | Susan | Engineering |
2/16/2021 9:26:15 | 26 | Alex | HR |
2/16/2021 10:45:07 | 12 | Geoff | Sales |
2/16/2021 9:42:40 | 12 | Sarah | Sales |
This is my desired outcome:
Timestamp | Bid ($) | Name | Department |
---|---|---|---|
2/16/2021 9:26:15 | 45 | Susan | Engineering |
2/16/2021 10:45:07 | 43 | Jim | Accounting |
2/16/2021 9:26:15 | 26 | Alex | HR |
2/16/2021 9:42:40 | 22 | Bob | Accounting |
2/16/2021 9:26:15 | 21 | Samantha | Sales |
This is what I have so far. You can see that Jim and Susan are in the table twice and that Bob and Samantha have missed out on placing as a result of this.
Using the query:
=QUERY(A3:D12, "SELECT A, B, C, D ORDER BY B DESC LIMIT 5")
Timestamp | Bid ($) | Name | Department |
---|---|---|---|
2/16/2021 9:26:15 | 45 | Susan | Engineering |
2/16/2021 10:45:07 | 43 | Jim | Accounting |
2/16/2021 9:26:15 | 30 | Jim | Accounting |
2/16/2021 9:26:15 | 26 | Alex | HR |
2/16/2021 9:42:40 | 23 | Susan | Engineering |
Best Answer
Assuming that you don't have anything unrelated underneath your main A:D data, try this formula somewhere in Column E or rightward:
=ArrayFormula({"Timestamp","Bid ($)","Name","Department";VLOOKUP(ARRAY_CONSTRAIN(UNIQUE(SORT(C3:C,B3:B,0)),5,1),QUERY({C3:C,A3:D},"Select * Order By Col3 Desc"),{2,3,4,5},FALSE)})
How It Works:
{"Timestamp","Bid ($)","Name","Department"; ... }
Headers are generated. The semicolon means what follows will be placed below that in the virtual array formed within the curly brackets.
SORT(C3:C,B3:B,0)
Creates a virtual array of all names in order from highest to lowest bid.
UNIQUE(SORT(C3:C,B3:B,0))
Reduces that list of names to only UNIQUE entries (but still in order from highest to lowest bid).
ARRAY_CONSTRAIN(UNIQUE(SORT(C3:C,B3:B,0)),5,1)
Constrains that list to only five entries (i.e., the top five names)
QUERY({C3:C,A3:D},"Select * Order By Col3 Desc")
Forms a virtual array that starts with the names column followed by all four columns (five columns total), ordered by the bids in descending order (i.e., highest to lowest).
VLOOKUP([array one],[array two],{2,3,4,5},FALSE)
This will look up the first array (i.e. the five top names) within in the second array (i.e., the QUERY) and return the matching data from columns 2,3,4,5 (i.e., the original four columns of data, recalling that the virtual first column being used by the lookup is a repeat of all the names).
FALSE
simply means we're looking for exact matches on the five top names.Your returned dates may appear as numbers in the 40000 range with decimal components. That is Google Sheets raw format for date/time (the number of days and partial days since December 31, 1899). Just format that column in the date format you prefer using Format > Number.