C++ – vector::erase and reverse_iterator

citeratorstlvector

I have a collection of elements in a std::vector that are sorted in a descending order starting from the first element. I have to use a vector because I need to have the elements in a contiguous chunk of memory. And I have a collection holding many instances of vectors with the described characteristics (always sorted in a descending order).

Now, sometimes, when I find out that I have too many elements in the greater collection (the one that holds these vectors), I discard the smallest elements from these vectors some way similar to this pseudo-code:

grand_collection: collection that holds these vectors
T: type argument of my vector
C: the type that is a member of T, that participates in the < comparison (this is what sorts data before they hit any of the vectors).

std::map<C, std::pair<T::const_reverse_iterator, std::vector<T>&>> what_to_delete;
iterate(it = grand_collection.begin() -> grand_collection.end())
{
     iterate(vect_rit = it->rbegin() -> it->rend())
     {
         // ...
          what_to_delete <- (vect_rit->C, pair(vect_rit, *it))
          if (what_to_delete.size() > threshold)
               what_to_delete.erase(what_to_delete.begin());
         // ...  
     }
}

Now, after running this code, in what_to_delete I have a collection of iterators pointing to the original vectors that I want to remove from these vectors (overall smallest values). Remember, the original vectors are sorted before they hit this code, which means that for any what_to_delete[0 - n] there is no way that an iterator on position n - m would point to an element further from the beginning of the same vector than n, where m > 0.

When erasing elements from the original vectors, I have to convert a reverse_iterator to iterator. To do this, I rely on C++11's §24.4.1/1:

The relationship between reverse_iterator and iterator is
&*(reverse_iterator(i)) == &*(i- 1)

Which means that to delete a vect_rit, I use:

vector.erase(--vect_rit.base());

Now, according to C++11 standard §23.3.6.5/3:

iterator erase(const_iterator position); Effects: Invalidates
iterators and references at or after the point of the erase.

How does this work with reverse_iterators? Are reverse_iterators internally implemented with a reference to a vector's real beginning (vector[0]) and transforming that vect_rit to a classic iterator so then erasing would be safe? Or does reverse_iterator use rbegin() (which is vector[vector.size()]) as a reference point and deleting anything that is further from vector's 0-index would still invalidate my reverse iterator?

Edit:

Looks like reverse_iterator uses rbegin() as its reference point. Erasing elements the way I described was giving me errors about a non-deferenceable iterator after the first element was deleted. Whereas when storing classic iterators (converting to const_iterator) while inserting to what_to_delete worked correctly.

Now, for future reference, does The Standard specify what should be treated as a reference point in case of a random-access reverse_iterator? Or this is an implementation detail?

Thanks!

Best Answer

From a standardese point of view (and I'll admit, I'm not an expert on the standard): From §24.5.1.1:

namespace std {
    template <class Iterator>
    class reverse_iterator ...
    {
        ...
            Iterator base() const; // explicit
        ...
        protected:
            Iterator current;
        ...
    };
}

And from §24.5.1.3.3:

Iterator base() const; // explicit
    Returns: current.

Thus it seems to me that so long as you don't erase anything in the vector before what one of your reverse_iterators points to, said reverse_iterator should remain valid.

Of course, given your description, there is one catch: if you have two contiguous elements in your vector that you end up wanting to delete, the fact that you vector.erase(--vector_rit.base()) means that you've invalidated the reverse_iterator "pointing" to the immediately preceeding element, and so your next vector.erase(...) is undefined behavior.

Just in case that's clear as mud, let me say that differently:

std::vector<T> v=...;
...
// it_1 and it_2 are contiguous
std::vector<T>::reverse_iterator it_1=v.rend();
std::vector<T>::reverse_iterator it_2=it_1;
--it_2;

// Erase everything after it_1's pointee:

// convert from reverse_iterator to iterator
std::vector<T>::iterator tmp_it=it_1.base();

// but that points one too far in, so decrement;
--tmp_it;

// of course, now tmp_it points at it_2's base:
assert(tmp_it == it_2.base());

// perform erasure
v.erase(tmp_it);  // invalidates all iterators pointing at or past *tmp_it
                  // (like, say it_2.base()...)

// now delete it_2's pointee:
std::vector<T>::iterator tmp_it_2=it_2.base(); // note, invalid iterator!

// undefined behavior:
--tmp_it_2;
v.erase(tmp_it_2);

In practice, I suspect that you'll run into two possible implementations: more commonly, the underlying iterator will be little more than a (suitably wrapped) raw pointer, and so everything will work perfectly happily. Less commonly, the iterator might actually try to track invalidations/perform bounds checking (didn't Dinkumware STL do such things when compiled in debug mode at one point?), and just might yell at you.