C++ STL – Merit of Using Beginning Iterator Instead of Reference to std::vector

I'm working on our company's lib. I see a lot of code like:

std::vector<int>::iterator it = market.vec.begin();
for (size_t i = 0; i < market.vec.size(); ++i)
    it[i] = i + 1;

I think a reference should be better:

std::vector<int>& ref_vec = market.vec;
for (size_t i = 0; i < market.vec.size(); ++i)
    ref_vec[i] = i + 1;

After all, if we call

market.vec.resize(20);

somewhere, the iterator it may be invalidated.

So, is there any benefit I'm not aware of to using an iterator instead of a reference?

Thanks.

Best Answer

Only a limited number of containers in the C++ standard library are required to be implemented with contiguous memory. vector is one of them, so given a reference to the first element, the location of the next element in memory is known, and so is the way to get there.

For other containers, the memory layout is not known: where the elements are and how to traverse from one element to the next. Iterators provide that mechanism:

An iterator knows where the next element is (++), possibly where the previous element is (--), and, for random access iterators, how to get to the element 3 places away from the current one (+= 3).
It knows how to get to the value of the element being referenced (dereferencing the iterator).
It can be compared to .end() to detect that the end of the container has been reached.

Once the above semantics are established, it becomes very easy to write general algorithms that are agnostic of the container being used and the memory layout.
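As a minimal sketch of such a container-agnostic algorithm (the names sum_range, sum_vector_example and sum_list_example are made up for illustration and are not part of the question's code), the function below relies only on ++, * and !=, so the same body serves a contiguous vector and a node-based list:

```cpp
#include <list>
#include <vector>

// Hypothetical helper: sums any range described by a pair of iterators.
// It relies only on ++, * and !=, so it makes no assumption about
// how the container lays out its elements in memory.
template <typename Iterator>
int sum_range(Iterator first, Iterator last)
{
    int total = 0;
    for (; first != last; ++first)
        total += *first;
    return total;
}

// Contiguous container.
inline int sum_vector_example()
{
    std::vector<int> v{1, 2, 3, 4};
    return sum_range(v.begin(), v.end());
}

// Node-based container, same algorithm.
inline int sum_list_example()
{
    std::list<int> l{1, 2, 3, 4};
    return sum_range(l.begin(), l.end());
}
```

In real code std::accumulate from <numeric> already does this; the hand-written loop is only there to show which iterator operations the algorithm depends on.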

The general algorithms can also apply some optimisations based on the nature of the iterators (random access iterators vs. forward iterators); the iterator category is carried by the iterator type itself. More specialised algorithms that know the memory layout and can be optimised further do exist, but they generally end up being members of the containers.
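A rough illustration of how the iterator type carries its category: std::iterator_traits exposes an iterator_category tag, and library functions such as std::distance can use it to pick a constant-time or a linear-time implementation. The helper names is_random_access and count_between are assumptions made for this sketch.

```cpp
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical helper: reports whether an iterator type supports
// random access, by inspecting the category tag it advertises.
template <typename Iterator>
constexpr bool is_random_access()
{
    using category =
        typename std::iterator_traits<Iterator>::iterator_category;
    return std::is_base_of<std::random_access_iterator_tag, category>::value;
}

// Hypothetical helper: std::distance is typically O(1) for random access
// iterators and O(n) for weaker categories; the caller does not have to
// care which one applies.
template <typename Iterator>
typename std::iterator_traits<Iterator>::difference_type
count_between(Iterator first, Iterator last)
{
    return std::distance(first, last);
}
```

With this, is_random_access<std::vector<int>::iterator>() is true while is_random_access<std::list<int>::iterator>() is false, yet count_between works for both.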

Note: elements can be accessed by reference with members such as .front(), .back() and .at(), as supported by the container. Support for these is defined per container, and front and back only reach the first and last elements, so general algorithm support for, and usage of, these members is limited.

Note on the resize: the call to resize the vector could invalidate both iterators and element references (depending on whether a reallocation takes place), so it should generally be assumed that the resize does invalidate them.
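One way to stay safe around a resize, sketched under the assumption that the position of interest can be remembered as an index: keep the index, and obtain a fresh iterator only after the resize. The function name value_after_resize is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: remembers a position as an index, resizes the
// vector, then obtains a fresh iterator instead of reusing an old one.
inline int value_after_resize()
{
    std::vector<int> vec{10, 20, 30};
    const std::vector<int>::difference_type index = 1;

    // Growing the vector may reallocate its storage. Any iterator or
    // element reference taken before this line must be treated as invalid.
    vec.resize(1000);

    // Re-acquire the iterator after the resize; the index is still valid
    // because resize keeps the existing elements in order.
    assert(static_cast<std::vector<int>::size_type>(index) < vec.size());
    std::vector<int>::iterator it = vec.begin();
    return it[index];
}
```

A reference to the vector object itself (as opposed to a reference to one of its elements) is not affected by the resize, which is why the ref_vec variant in the question keeps working.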

A common technique when using iterators is to loop from the beginning to the end. Adapted to the for loop in your sample, it looks as follows:

// auto used for the sample
// if the container is non-const, then "it" is of type
// std::vector<int>::iterator
for (auto it = market.vec.begin(); it != market.vec.end(); ++it)
    // code

Additionally, indexes etc. could be added as required. Given the code sample, std::iota (from <numeric>) could be a good replacement for the loop as well. The start value is 1 so that the result matches it[i] = i + 1.

std::iota(market.vec.begin(), market.vec.end(), 1);
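For completeness, here is a small self-contained sketch of the std::iota approach. The function make_sequence is a made-up name, and the start value of 1 is chosen to mirror the it[i] = i + 1 loop in the question.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: builds a vector holding 1, 2, ..., n,
// which is what the loop in the question writes into market.vec.
inline std::vector<int> make_sequence(std::size_t n)
{
    std::vector<int> vec(n);
    // Fills the range with consecutive values starting at 1.
    std::iota(vec.begin(), vec.end(), 1);
    return vec;
}
```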

Giving more consideration to the exact sample code provided, there is little reason to use either the iterator or the reference; just access the elements of the vector using the index operator []. I don't see any synchronisation code, so I assume the vector won't be resized whilst the for loop runs.

That said, favour working with iterators and dereference them to get to the element's value.

Related Solutions

C++ – How to store various sized values in a vector

Firstly, command classes are decoupled from the medium and protocol. That means you can design the command classes for your own programming convenience, rather than having to design them to match the specifics of each protocol exactly (which would be impossible, since different protocols may have different bit widths for the same command and field).

When I mention convenience, what I mean is that you can use the maximum bit width you'll ever need for each command's fields.

However, unless you don't plan to implement any validation at all, you may still need device- or protocol-specific validation code, since each device or protocol imposes its own limits on what values can be in those fields.

When it comes to validation, there are several choices:

Not doing it at all, if you will be doing all of the programming yourself, and if it is a hobby project such that mistakes do not result in damages.
Validating it eagerly, i.e. in the command class. This may be difficult, since a command class might not know which device or protocol it will be sent to.
Validating it late, i.e. in the protocol class where the command values are being converted into bytes.

For example, even if a validation rule says that a particular field can only have a value in the range 0 - 100, it doesn't stop you from using a uint32_t or int32_t for that field in the command class.
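A minimal sketch of the "validate late" option, assuming a made-up command (SetLevelCommand) and a made-up protocol (ExampleProtocol) that encodes the level as a single byte limited to 0 - 100. None of these names come from the original question; they only illustrate where the check can live.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical command class: uses a wide field for convenience and
// knows nothing about any particular protocol's limits.
struct SetLevelCommand
{
    std::uint32_t level;
};

// Hypothetical protocol class: the range check happens here, at the
// point where the command is converted into bytes.
struct ExampleProtocol
{
    static std::vector<std::uint8_t> encode(const SetLevelCommand& cmd)
    {
        if (cmd.level > 100)
            throw std::out_of_range("level must be in 0..100 for this protocol");
        // The value is known to fit in one byte after the check above.
        return { static_cast<std::uint8_t>(cmd.level) };
    }
};
```

A different protocol class could accept the same SetLevelCommand and apply a different limit or byte width without any change to the command itself.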

As for the second question, about having an overloaded method that takes in various built-in number types and appends the bytes to an internal byte vector, do take note of the caveats.

In my opinion, if you only need to work with the fundamental integer types, you don't need templates. Instead, you simply provide function overloads for each of the types, and you call the functions with a value of the appropriate type.

void CPacket::addData(uint32_t data) { ... }
void CPacket::addData(int32_t data) { ... }
void CPacket::addData(uint16_t data) { ... }
void CPacket::addData(int16_t data) { ... }
...

Regarding the code inside, there are several choices:

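Type punning with union. This assumes that your code will work exclusively with one byte-endianness, thus not needing to consider the possibility of porting to a different byte-endianness. Note that reading a union member other than the one last written is undefined behaviour in standard C++ (it is allowed in C), although major compilers support it as an extension.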

union
{
    uint32_t value;
    uint8_t bytes[4];
} pun = { data };
// after that, add the bytes to the vector one-by-one, according to the byte endianness of the communication.

Explicitly extracting the bytes with endian-agnostic bitwise arithmetic: (see note on casting)

// only if value is unsigned. For signed value, it must first be cast to unsigned
uint8_t byte0 = (uint8_t)value;
uint8_t byte1 = (uint8_t)(value >> 8ul);
uint8_t byte2 = (uint8_t)(value >> 16ul);
uint8_t byte3 = (uint8_t)(value >> 24ul);
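Putting the overloads and the shift-based extraction together, a sketch could look like the following. The class name ExamplePacket, the helper appendLittleEndian and the choice of little-endian wire order are assumptions made for illustration; the original CPacket class is not shown in the question.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical packet class: one overload per integer type, all of them
// funnelling into a single shift-based helper.
class ExamplePacket
{
public:
    void addData(std::uint16_t data) { appendLittleEndian(data, 2); }
    void addData(std::uint32_t data) { appendLittleEndian(data, 4); }
    // Signed values are first converted to the unsigned type of the same width.
    void addData(std::int16_t data) { addData(static_cast<std::uint16_t>(data)); }
    void addData(std::int32_t data) { addData(static_cast<std::uint32_t>(data)); }

    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    // Appends the lowest `count` bytes of `value`, least significant first.
    // Shifts operate on the value, not on its memory representation, so
    // the result does not depend on the host's byte order.
    void appendLittleEndian(std::uint32_t value, int count)
    {
        assert(count >= 0 && count <= 4);
        for (int i = 0; i < count; ++i)
            bytes_.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
    }

    std::vector<std::uint8_t> bytes_;
};
```

Calling addData(std::uint32_t{0x11223344}) appends the bytes 0x44, 0x33, 0x22, 0x11 in that order. A protocol that expects big-endian order would iterate the shifts from the highest byte down instead.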

C++ – Idiomatic usage of exceptions in C++

First, I feel obliged to point out that std::exception and its children were designed a long time ago. There are a number of parts that would probably (almost certainly) be different if they were being designed today.

Don't get me wrong: there are parts of the design that have worked out pretty well, and are pretty good examples of how to design an exception hierarchy for C++ (e.g., the fact that, unlike most other classes, they all share a common root).

Looking specifically at logic_error, we have a bit of a conundrum. On one hand, if you have any reasonable choice in the matter, the advice you quoted is right: it's generally best to fail as fast and noisily as possible so it can be debugged and corrected.

For better or worse, however, it's hard to define the standard library around what you should generally do. If it defined these to exit the program (e.g., calling abort()) when given incorrect input, that would be what always happened for that circumstance--and there are actually quite a few circumstances under which this probably isn't really the right thing to do, at least in deployed code.

That would apply in code with (at least soft) real-time requirements, and minimal penalty for an incorrect output. For example, consider a chat program. If it's decoding some voice data, and gets some incorrect input, chances are a user will be a lot happier to live with a millisecond of static in the output than a program that just shuts down completely. Likewise when doing video playback, it may be more acceptable to live with producing the wrong values for some pixels for a frame or two than have the program summarily exit because the input stream got corrupted.

As for whether to use exceptions to report certain types of errors: you're right--the same operation might qualify as an exception or not, depending on how it's being used.

On the other hand, you're also wrong--using the standard library doesn't (necessarily) force that decision on you. In the case of opening a file, you'd normally be using an iostream. Iostreams aren't exactly the latest and greatest design either, but in this case they get things right: they let you set an error mode, so you can control whether failing to open a file will result in an exception being thrown or not. So, if you have a file that's really necessary for your application, and failing to open it means you have to take some serious remedial action, then you can have it throw an exception if it fails to open that file. For most files that you'll try to open, if they don't exist or aren't accessible, the open will just fail without throwing (this is the default).
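The error mode mentioned here is set with the stream's exceptions() member. A minimal sketch of both behaviours follows; the function names opens_quietly and opens_or_throws are made up for this example.

```cpp
#include <fstream>
#include <string>

// Default behaviour: a failed open is reported through the stream state.
inline bool opens_quietly(const std::string& path)
{
    std::ifstream in(path);
    return in.is_open();
}

// Opt-in behaviour: with failbit in the exception mask, a failed open
// throws std::ios_base::failure instead of failing silently.
inline bool opens_or_throws(const std::string& path)
{
    std::ifstream in;
    in.exceptions(std::ios::failbit | std::ios::badbit);
    in.open(path);  // throws if the file cannot be opened
    return true;
}
```

Note that the mask stays active for later operations on the same stream, so a read that hits end-of-file and sets failbit would also throw. It may be worth resetting the mask after the open if only the open itself is critical.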

As for how you decide: I don't think there is an easy answer. For better or worse, "exceptional circumstances" isn't always easy to measure. While there are certainly cases that are easy to decide must be [un]exceptional, there are (and probably always will be) cases where it's open to question, or requires knowledge of context that's outside the domain of the function at hand. For cases like that, it may at least be worth considering a design roughly similar to this part of iostreams, where the user can decide whether failure results in an exception being thrown or not. Alternatively, it's entirely possible to have two separate sets of functions (or classes, etc.), one of which will throw exceptions to indicate failure, the other of which uses other means. If you go that route, chances are pretty good that one should be a wrapper around the other, or both act as wrappers around a shared set of functions that implement the real guts of the work.
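The "two sets of functions" idea can be sketched with a non-throwing core and a thin throwing wrapper around it. The names try_parse_int and parse_int, and the choice of integer parsing as the task, are assumptions made purely for illustration.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Non-throwing core: reports failure through the return value and
// leaves `out` untouched when parsing fails.
inline bool try_parse_int(const std::string& text, int& out)
{
    errno = 0;
    char* end = nullptr;
    const long value = std::strtol(text.c_str(), &end, 10);
    const bool consumed_all = (end != text.c_str()) && (*end == '\0');
    if (!consumed_all || errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return false;
    out = static_cast<int>(value);
    return true;
}

// Throwing wrapper: same work, but failure becomes an exception for
// callers who consider bad input exceptional.
inline int parse_int(const std::string& text)
{
    int value = 0;
    if (!try_parse_int(text, value))
        throw std::invalid_argument("not an integer: " + text);
    return value;
}
```

The caller picks whichever flavour fits its context, and the parsing logic exists in one place only.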
