C++ – How to use arrays in C++

arrayscc++-faqmultidimensional-arraypointers

C++ inherited arrays from C where they are used virtually everywhere. C++ provides abstractions that are easier to use and less error-prone (std::vector<T> since C++98 and std::array<T, n> since C++11), so the need for arrays does not arise quite as often as it does in C. However, when you read legacy code or interact with a library written in C, you should have a firm grasp on how arrays work.

This FAQ is split into five parts:

If you feel something important is missing in this FAQ, write an answer and link it here as an additional part.

In the following text, "array" means "C array", not the class template std::array. Basic knowledge of the C declarator syntax is assumed. Note that the manual usage of new and delete as demonstrated below is extremely dangerous in the face of exceptions, but that is the topic of another FAQ.

(Note: This is meant to be an entry to Stack Overflow's C++ FAQ. If you want to critique the idea of providing an FAQ in this form, then the posting on meta that started all this would be the place to do that. Answers to that question are monitored in the C++ chatroom, where the FAQ idea started out in the first place, so your answer is very likely to get read by those who came up with the idea.)

Best Answer

Arrays on the type level

An array type is denoted as T[n] where T is the element type and n is a positive size, the number of elements in the array. The array type is a product type of the element type and the size. If one or both of those ingredients differ, you get a distinct type:

#include <type_traits>

static_assert(!std::is_same<int[8], float[8]>::value, "distinct element type");
static_assert(!std::is_same<int[8],   int[9]>::value, "distinct size");

Note that the size is part of the type, that is, array types of different size are incompatible types that have absolutely nothing to do with each other. sizeof(T[n]) is equivalent to n * sizeof(T).

Array-to-pointer decay

The only "connection" between T[n] and T[m] is that both types can implicitly be converted to T*, and the result of this conversion is a pointer to the first element of the array. That is, anywhere a T* is required, you can provide a T[n], and the compiler will silently provide that pointer:

                  +---+---+---+---+---+---+---+---+
the_actual_array: |   |   |   |   |   |   |   |   |   int[8]
                  +---+---+---+---+---+---+---+---+
                    ^
                    |
                    |
                    |
                    |  pointer_to_the_first_element   int*

This conversion is known as "array-to-pointer decay", and it is a major source of confusion. The size of the array is lost in this process, since it is no longer part of the type (T*). Pro: Forgetting the size of an array on the type level allows a pointer to point to the first element of an array of any size. Con: Given a pointer to the first (or any other) element of an array, there is no way to detect how large that array is or where exactly the pointer points to relative to the bounds of the array. Pointers are extremely stupid.

Arrays are not pointers

The compiler will silently generate a pointer to the first element of an array whenever it is deemed useful, that is, whenever an operation would fail on an array but succeed on a pointer. This conversion from array to pointer is trivial, since the resulting pointer value is simply the address of the array. Note that the pointer is not stored as part of the array itself (or anywhere else in memory). An array is not a pointer.

static_assert(!std::is_same<int[8], int*>::value, "an array is not a pointer");

One important context in which an array does not decay into a pointer to its first element is when the & operator is applied to it. In that case, the & operator yields a pointer to the entire array, not just a pointer to its first element. Although in that case the values (the addresses) are the same, a pointer to the first element of an array and a pointer to the entire array are completely distinct types:

static_assert(!std::is_same<int*, int(*)[8]>::value, "distinct element type");

The following ASCII art explains this distinction:

      +-----------------------------------+
      | +---+---+---+---+---+---+---+---+ |
+---> | |   |   |   |   |   |   |   |   | | int[8]
|     | +---+---+---+---+---+---+---+---+ |
|     +---^-------------------------------+
|         |
|         |
|         |
|         |  pointer_to_the_first_element   int*
|
|  pointer_to_the_entire_array              int(*)[8]

Note how the pointer to the first element only points to a single integer (depicted as a small box), whereas the pointer to the entire array points to an array of 8 integers (depicted as a large box).

The same situation arises in classes and is maybe more obvious. A pointer to an object and a pointer to its first data member have the same value (the same address), yet they are completely distinct types.

If you are unfamiliar with the C declarator syntax, the parenthesis in the type int(*)[8] are essential:

int(*)[8] is a pointer to an array of 8 integers.
int*[8] is an array of 8 pointers, each element of type int*.

Accessing elements

C++ provides two syntactic variations to access individual elements of an array. Neither of them is superior to the other, and you should familiarize yourself with both.

Pointer arithmetic

Given a pointer p to the first element of an array, the expression p+i yields a pointer to the i-th element of the array. By dereferencing that pointer afterwards, one can access individual elements:

std::cout << *(x+3) << ", " << *(x+7) << std::endl;

If x denotes an array, then array-to-pointer decay will kick in, because adding an array and an integer is meaningless (there is no plus operation on arrays), but adding a pointer and an integer makes sense:

   +---+---+---+---+---+---+---+---+
x: |   |   |   |   |   |   |   |   |   int[8]
   +---+---+---+---+---+---+---+---+
     ^           ^               ^
     |           |               |
     |           |               |
     |           |               |
x+0  |      x+3  |          x+7  |     int*

(Note that the implicitly generated pointer has no name, so I wrote x+0 in order to identify it.)

If, on the other hand, x denotes a pointer to the first (or any other) element of an array, then array-to-pointer decay is not necessary, because the pointer on which i is going to be added already exists:

   +---+---+---+---+---+---+---+---+
   |   |   |   |   |   |   |   |   |   int[8]
   +---+---+---+---+---+---+---+---+
     ^           ^               ^
     |           |               |
     |           |               |
   +-|-+         |               |
x: | | |    x+3  |          x+7  |     int*
   +---+

Note that in the depicted case, x is a pointer variable (discernible by the small box next to x), but it could just as well be the result of a function returning a pointer (or any other expression of type T*).

Indexing operator

Since the syntax *(x+i) is a bit clumsy, C++ provides the alternative syntax x[i]:

std::cout << x[3] << ", " << x[7] << std::endl;

Due to the fact that addition is commutative, the following code does exactly the same:

std::cout << 3[x] << ", " << 7[x] << std::endl;

The definition of the indexing operator leads to the following interesting equivalence:

&x[i]  ==  &*(x+i)  ==  x+i

However, &x[0] is generally not equivalent to x. The former is a pointer, the latter an array. Only when the context triggers array-to-pointer decay can x and &x[0] be used interchangeably. For example:

T* p = &array[0];  // rewritten as &*(array+0), decay happens due to the addition
T* q = array;      // decay happens due to the assignment

On the first line, the compiler detects an assignment from a pointer to a pointer, which trivially succeeds. On the second line, it detects an assignment from an array to a pointer. Since this is meaningless (but pointer to pointer assignment makes sense), array-to-pointer decay kicks in as usual.

Ranges

An array of type T[n] has n elements, indexed from 0 to n-1; there is no element n. And yet, to support half-open ranges (where the beginning is inclusive and the end is exclusive), C++ allows the computation of a pointer to the (non-existent) n-th element, but it is illegal to dereference that pointer:

   +---+---+---+---+---+---+---+---+....
x: |   |   |   |   |   |   |   |   |   .   int[8]
   +---+---+---+---+---+---+---+---+....
     ^                               ^
     |                               |
     |                               |
     |                               |
x+0  |                          x+8  |     int*

For example, if you want to sort an array, both of the following would work equally well:

std::sort(x + 0, x + n);
std::sort(&x[0], &x[0] + n);

Note that it is illegal to provide &x[n] as the second argument since this is equivalent to &*(x+n), and the sub-expression *(x+n) technically invokes undefined behavior in C++ (but not in C99).

Also note that you could simply provide x as the first argument. That is a little too terse for my taste, and it also makes template argument deduction a bit harder for the compiler, because in that case the first argument is an array but the second argument is a pointer. (Again, array-to-pointer decay kicks in.)

Related Solutions

C++ – a smart pointer and when should I use one

UPDATE

This answer is rather old, and so describes what was 'good' at the time, which was smart pointers provided by the Boost library. Since C++11, the standard library has provided sufficient smart pointers types, and so you should favour the use of std::unique_ptr, std::shared_ptr and std::weak_ptr.

There was also std::auto_ptr. It was very much like a scoped pointer, except that it also had the "special" dangerous ability to be copied — which also unexpectedly transfers ownership.
It was deprecated in C++11 and removed in C++17, so you shouldn't use it.

std::auto_ptr<MyObject> p1 (new MyObject());
std::auto_ptr<MyObject> p2 = p1; // Copy and transfer ownership. 
                                 // p1 gets set to empty!
p2->DoSomething(); // Works.
p1->DoSomething(); // Oh oh. Hopefully raises some NULL pointer exception.

OLD ANSWER

A smart pointer is a class that wraps a 'raw' (or 'bare') C++ pointer, to manage the lifetime of the object being pointed to. There is no single smart pointer type, but all of them try to abstract a raw pointer in a practical way.

Smart pointers should be preferred over raw pointers. If you feel you need to use pointers (first consider if you really do), you would normally want to use a smart pointer as this can alleviate many of the problems with raw pointers, mainly forgetting to delete the object and leaking memory.

With raw pointers, the programmer has to explicitly destroy the object when it is no longer useful.

// Need to create the object to achieve some goal
MyObject* ptr = new MyObject(); 
ptr->DoSomething(); // Use the object in some way
delete ptr; // Destroy the object. Done with it.
// Wait, what if DoSomething() raises an exception...?

A smart pointer by comparison defines a policy as to when the object is destroyed. You still have to create the object, but you no longer have to worry about destroying it.

SomeSmartPtr<MyObject> ptr(new MyObject());
ptr->DoSomething(); // Use the object in some way.

// Destruction of the object happens, depending 
// on the policy the smart pointer class uses.

// Destruction would happen even if DoSomething() 
// raises an exception

The simplest policy in use involves the scope of the smart pointer wrapper object, such as implemented by boost::scoped_ptr or std::unique_ptr.

void f()
{
    {
       std::unique_ptr<MyObject> ptr(new MyObject());
       ptr->DoSomethingUseful();
    } // ptr goes out of scope -- 
      // the MyObject is automatically destroyed.

    // ptr->Oops(); // Compile error: "ptr" not defined
                    // since it is no longer in scope.
}

Note that std::unique_ptr instances cannot be copied. This prevents the pointer from being deleted multiple times (incorrectly). You can, however, pass references to it around to other functions you call.

std::unique_ptrs are useful when you want to tie the lifetime of the object to a particular block of code, or if you embedded it as member data inside another object, the lifetime of that other object. The object exists until the containing block of code is exited, or until the containing object is itself destroyed.

A more complex smart pointer policy involves reference counting the pointer. This does allow the pointer to be copied. When the last "reference" to the object is destroyed, the object is deleted. This policy is implemented by boost::shared_ptr and std::shared_ptr.

void f()
{
    typedef std::shared_ptr<MyObject> MyObjectPtr; // nice short alias
    MyObjectPtr p1; // Empty

    {
        MyObjectPtr p2(new MyObject());
        // There is now one "reference" to the created object
        p1 = p2; // Copy the pointer.
        // There are now two references to the object.
    } // p2 is destroyed, leaving one reference to the object.
} // p1 is destroyed, leaving a reference count of zero. 
  // The object is deleted.

Reference counted pointers are very useful when the lifetime of your object is much more complicated, and is not tied directly to a particular section of code or to another object.

There is one drawback to reference counted pointers — the possibility of creating a dangling reference:

// Create the smart pointer on the heap
MyObjectPtr* pp = new MyObjectPtr(new MyObject())
// Hmm, we forgot to destroy the smart pointer,
// because of that, the object is never destroyed!

Another possibility is creating circular references:

struct Owner {
   std::shared_ptr<Owner> other;
};

std::shared_ptr<Owner> p1 (new Owner());
std::shared_ptr<Owner> p2 (new Owner());
p1->other = p2; // p1 references p2
p2->other = p1; // p2 references p1

// Oops, the reference count of of p1 and p2 never goes to zero!
// The objects are never destroyed!

To work around this problem, both Boost and C++11 have defined a weak_ptr to define a weak (uncounted) reference to a shared_ptr.

Javascript – How to check if an array includes a value in JavaScript

Modern browsers have Array#includes, which does exactly that and is widely supported by everyone except IE:

console.log(['joe', 'jane', 'mary'].includes('jane')); //true

You can also use Array#indexOf, which is less direct, but doesn't require polyfills for outdated browsers.

console.log(['joe', 'jane', 'mary'].indexOf('jane') >= 0); //true

Many frameworks also offer similar methods:

jQuery: $.inArray(value, array, [fromIndex])
Underscore.js: _.contains(array, value) (also aliased as _.include and _.includes)
Dojo Toolkit: dojo.indexOf(array, value, [fromIndex, findLast])
Prototype: array.indexOf(value)
MooTools: array.indexOf(value)
MochiKit: findValue(array, value)
MS Ajax: array.indexOf(value)
Ext: Ext.Array.contains(array, value)
Lodash: _.includes(array, value, [from]) (is _.contains prior 4.0.0)
Ramda: R.includes(value, array)

Notice that some frameworks implement this as a function, while others add the function to the array prototype.