Sql – Fetch the row which has the Max value for a column

greatest-n-per-grouporaclesql

Table:

UserId, Value, Date.

I want to get the UserId, Value for the max(Date) for each UserId. That is, the Value for each UserId that has the latest date. Is there a way to do this simply in SQL? (Preferably Oracle)

Update: Apologies for any ambiguity: I need to get ALL the UserIds. But for each UserId, only that row where that user has the latest date.

Best Answer

I see many people use subqueries or else window functions to do this, but I often do this kind of query without subqueries in the following way. It uses plain, standard SQL so it should work in any brand of RDBMS.

SELECT t1.*
FROM mytable t1
  LEFT OUTER JOIN mytable t2
    ON (t1.UserId = t2.UserId AND t1."Date" < t2."Date")
WHERE t2.UserId IS NULL;

In other words: fetch the row from t1 where no other row exists with the same UserId and a greater Date.

(I put the identifier "Date" in delimiters because it's an SQL reserved word.)

In case if t1."Date" = t2."Date", doubling appears. Usually tables has auto_inc(seq) key, e.g. id. To avoid doubling can be used follows:

SELECT t1.*
FROM mytable t1
  LEFT OUTER JOIN mytable t2
    ON t1.UserId = t2.UserId AND ((t1."Date" < t2."Date") 
         OR (t1."Date" = t2."Date" AND t1.id < t2.id))
WHERE t2.UserId IS NULL;

Re comment from @Farhan:

Here's a more detailed explanation:

An outer join attempts to join t1 with t2. By default, all results of t1 are returned, and if there is a match in t2, it is also returned. If there is no match in t2 for a given row of t1, then the query still returns the row of t1, and uses NULL as a placeholder for all of t2's columns. That's just how outer joins work in general.

The trick in this query is to design the join's matching condition such that t2 must match the same userid, and a greater date. The idea being if a row exists in t2 that has a greater date, then the row in t1 it's compared against can't be the greatest date for that userid. But if there is no match -- i.e. if no row exists in t2 with a greater date than the row in t1 -- we know that the row in t1 was the row with the greatest date for the given userid.

In those cases (when there's no match), the columns of t2 will be NULL -- even the columns specified in the join condition. So that's why we use WHERE t2.UserId IS NULL, because we're searching for the cases where no row was found with a greater date for the given userid.

Related Solutions

Mysql – How to select the nth row in a SQL database table

There are ways of doing this in optional parts of the standard, but a lot of databases support their own way of doing it.

A really good site that talks about this and other things is http://troels.arvin.dk/db/rdbms/#select-limit.

Basically, PostgreSQL and MySQL supports the non-standard:

SELECT...
LIMIT y OFFSET x

Oracle, DB2 and MSSQL supports the standard windowing functions:

SELECT * FROM (
  SELECT
    ROW_NUMBER() OVER (ORDER BY key ASC) AS rownumber,
    columns
  FROM tablename
) AS foo
WHERE rownumber <= n

(which I just copied from the site linked above since I never use those DBs)

Update: As of PostgreSQL 8.4 the standard windowing functions are supported, so expect the second example to work for PostgreSQL as well.

Update: SQLite added window functions support in version 3.25.0 on 2018-09-15 so both forms also work in SQLite.

Sql – Add a column with a default value to an existing table in SQL Server

Syntax:

ALTER TABLE {TABLENAME} 
ADD {COLUMNNAME} {TYPE} {NULL|NOT NULL} 
CONSTRAINT {CONSTRAINT_NAME} DEFAULT {DEFAULT_VALUE}
WITH VALUES

Example:

ALTER TABLE SomeTable
        ADD SomeCol Bit NULL --Or NOT NULL.
 CONSTRAINT D_SomeTable_SomeCol --When Omitted a Default-Constraint Name is autogenerated.
    DEFAULT (0)--Optional Default-Constraint.
WITH VALUES --Add if Column is Nullable and you want the Default Value for Existing Records.

Notes:

Optional Constraint Name:
If you leave out CONSTRAINT D_SomeTable_SomeCol then SQL Server will autogenerate
a Default-Contraint with a funny Name like: DF__SomeTa__SomeC__4FB7FEF6

Optional With-Values Statement:
The WITH VALUES is only needed when your Column is Nullable
and you want the Default Value used for Existing Records.
If your Column is NOT NULL, then it will automatically use the Default Value
for all Existing Records, whether you specify WITH VALUES or not.

How Inserts work with a Default-Constraint:
If you insert a Record into SomeTable and do not Specify SomeCol's value, then it will Default to 0.
If you insert a Record and Specify SomeCol's value as NULL (and your column allows nulls),
then the Default-Constraint will not be used and NULL will be inserted as the Value.

Notes were based on everyone's great feedback below.
Special Thanks to:
@Yatrix, @WalterStabosz, @YahooSerious, and @StackMan for their Comments.