Google-sheets – Report/process only using the last entry made by the user in Google Forms

google sheetsgoogle-apps-scriptgoogle-forms

I have a form where students can enter choices: for instance the specialisation they want to pursue, and the semester when they will start that specialisation.

An example entry form is here, but it has not much interest, other than add addition data to the example processing sheet. You can copy the sheet locally to edit it, and it will still pull the form data over.

I decided not to use the option on the form to allow only one entry per student as students often change their mind, make a new entry several months later. They were told that only their last entry will be considered at a specific deadline.

I wish to have some reporting in place that shows an overview of all choices, lists the students per specialisation etc… but only taking into account their last entry, and without deleting previous entries made.

To look up the last entry for a specific user, it is easy using:

query(importrange("1L3V0HIN69aVvCV-4Hx7m39w6zsuBZEaK2L0JDULP0NY","Form Responses 1!A:E"),"select Max(Col1), Col2, Col3, Col4 where Col2 contains '"&C1&"' group by Col2,Col3,Col4 order by Max(Col1) desc limit 1",0)

The e-mail (or part of it would be stored in C1). The trick is the limit 1 clause at the end combined with the order by clause.

But this no longer works when trying to filter for the whole cohort following a specialisation at a specific start date, or counting the amount of students per specialisation, etc.

This is where I could use some help. Maybe adding a column "outdated" and filling that in via an array formula could be an option, but I struggled with making that work on the original response spreadsheet.

Best Answer

One approach is to add a status column next to the data input: for example, in L10 it would be

=if(iserror(match(H10, H11:H, 0)), "current", "obsolete")

which returns "current" if no entry below the present row has the same email. Then your query can filter out the obsolete results.


The above does not work with arrayformula. I don't think there is an arrayformula-compatible approach, because the number of comparisons is quadratic in the size of array. Here is a script solution that auto-expands: put status(H10:H) in a new column, with the following function in a script.

function status(arr) {
  flat = arr.reduce(function(a, b) { return a.concat(b); });
  var output = [];
  for (var i=0; i<flat.length; i++) {
    if (flat[i] === '') {
      output.push(['']);
    }
    else if (flat.slice(i+1).indexOf(flat[i]) === -1) {
      output.push(['current']);
    }
    else {
      output.push(['obsolete']);
    }
  }
  return output;
} 

The approach I took in this answer also works reasonably well here: use unique to get unique users (identified by emails), then filter rows by email, and use vlookup to pick the data just from the last row. The result will be a neat summary table that has only the latest data.

Using your example, I build such a table starting from the cell A20. In A20, I enter

=unique(H10:H)

obtaining the list of emails. Then in B20 I enter

=vlookup(999999, filter($G$10:$K, $H$10:$H=$A20), column()+1)

and extend this formula to the range B20:D. This is the result:

+-----------------------------+-----------------+--------+-----------------+
| peter.goedtkindt@school.edu | SpecializationB | 2016.2 | changed my mind |
| student1@school.edu         | SpecializationB | 2016.1 |                 |
| studnent2@school.edu        | SpecializationC | 2016.1 |                 |
+-----------------------------+-----------------+--------+-----------------+

(I don't know if you needed comments, but I included them anyway.) Using column()+1 as the last parameter of vlookup allows the formula to pick the relevant column, after vlookup with absurdly large search key 999999 zooms in on the last entry.