C++ – Reasons to Create Own Array Class Over Using std::array

arraycclassobject-oriented

What specific conditions or requirements should you create your own array over using std::array?

Here is my background:

I'm developing a small simple library that a small group of people will use and hopefully build upon. The current structure allows for both std::vector and std::array using iterators, however, a lot of the people do not like and cannot seem to use std::array and find the notation confusing to work with the library.

I have attempted to make my library work for better for std::array however, it dawned on me that maybe I should provide my own and adding in the functionality needed?

For example: library::array<int, 2> myArray; myArray.fill(1, 100); and I thought it would be a good project in order for me to develop my skills, particularly in templates and other sorting algorithms.

Best Answer

How many dimensions are needed?
- C++ Template programming may require some code duplication for each level of higher dimension.
- Address calculation is the easy part. A simple approach can be used for dimensions up to a dozen.
- For a 3-dimensional example:
  - Let the size of the array be [m, n, p]
  - For each dimension, we calculate a "weight vector" by successively multiplying terms:
    [p*n, p, 1]
  - To access the element at [i, j, k], its linearized element index can be calculated as:
    [i, j, k] dotproduct [p*n, p, 1]
    or ((((i * n) + j) * p) + k)
- For higher dimensions, care must be taken to reduce the time spent in multiplications needed for element address calculation.

What are the typical sizes of the array, to the order of magnitude?
- For small-sized arrays (less than 1000 elements), basically any design and implementation will work, unless proven otherwise by a profiler.
- For medium-sized arrays (as long as it can still fit entirely in the computer's memory or the application's address space), performance will be an increasingly important concern.
- For large arrays (which cannot fit as a contiguous piece of memory), architecture concern will dominate.

Architecture / Memory Layout
- Driving forces:
  - Size / scale
  - Performance
  - Computational platform
  - Interoperability (with other code / libraries)
  - Access semantics
- Available choices:
  - Contiguous memory (the easiest, most straight-forward choice)
  - Non-contiguous memory (example: tiled array)
  - Memory-mapped file
  - Distributed across multiple machines in a network

Does your implementation need to support high performance?
- As a ballpark, anything requiring more than one billion elementary operations per second can be considered high-performance for the purpose of choosing a design.
- Designing for high-performance will impose constraints on:
  - Choice of computational platform
  - Choice of memory layout

Computational platform?
- If high performance is not needed, scalar C++ code will be sufficient.
- Otherwise, a high-performance computational platform will be needed.
- This will impose constraints on:
  - Choice of memory layout
- Computational choices
  - Scalar C++
  - Multi-core computing
  - Manual threading
  - Concurrent programming library (Intel TBB, Microsoft PPL/AMP)
  - Compiler-assisted (OpenMP)
  - Parallel programming languages. Example: Cilk, Brook
  - Open-soure library (OpenCV)
  - SIMD (Single-Instruction Multiple Data) example: Intel SSE2, AVX, ARM NEON, PowerPC AltiVec
  - GPU programming (CUDA, OpenCL)

Does the array need to be accessible by other libraries, without copying?
- If so, your array's memory layout must be compatible with that other library you are using.
- For small array sizes, or for access patterns where writes (modifications) occur infrequently, copying is not an issue.
- For large arrays which are both read and written very frequently, data copying can become an performance issue.

A story where a bad combination of data copying and access semantics can cause an increasingly disproportionate performance issue:
- Consider two modules, each of which has its own copy of a 100 MB array, and which they have to keep synchronized at regular intervals.
- Suppose that one of the modules freely modifies its own copy, without keeping track of which region of the array is being modified.
- When the time comes to perform synchronization, the first module will have to copy the entire 100 MB array over, because it cannot tell which part needs synchronization.
- The user of the first module will complain that "if I am only modifying a few elements of this array, why is the library taking so long to synchronize?"
Lesson: by requiring modifications to go through an API call, access semantics can be used to reduce inefficiency in data copying.

Access semantics.
- Boils down to three questions:
  - Who computes the memory address of elements? given the subscripts (i, j, k)
  - How is the array's lifetime (ownership) managed?
  - Is the array implementation being notified of which part of the array is modified?
Choices:
- Direct access to underlying memory, no API call.
  - The array implementation will not know which part of the array is modified / used.
- Direct access to underlying memory, requiring a API call to lock/unlock an array view.
  - Known as "advisory locks" because it cannot enforce access control.
  - An advisory lock is still useful, because by locking/unlocking the library is notified of current users of the array, as well as the region being modified.
- API method for reading and writing one element at a time
  - Easiest to implement and use
  - Slowest
- API method for copying an array view at a time
  - Copies between a sub-rectangle of your array and a normal C/C++/Java array
  - Good balance between high-performance and safety
  - Allows your array implementation to use a different memory layout

Design of the library interface and syntax
- ...

If you have made it thus far, congratulations, you can start designing the interface and the syntax!

Related Solutions

iOS JSON – Should Specific Classes Be Created for JSON Objects or Use Container?

I would go with specific classes - it will be much easier to write unit tests and maintain code even if you lose some performance. It is not a place where you should prefer performance over clarity.

C++ Type Safety – Understanding Strongly Typed Typedef

These are phantom type parameters, that is, parameters of a parameterised type that are used not for their representation, but to separate different “spaces” of types with the same representation.

And speaking of spaces, that’s a useful application of phantom types:

template<typename Space>
struct Point { double x, y; };

struct WorldSpace;
struct ScreenSpace;

// Conversions between coordinate spaces are explicit.
Point<ScreenSpace> project(Point<WorldSpace> p, const Camera& c) { … }

As you’ve seen, though, there are some difficulties with unit types. One thing you can do is decompose units into a vector of integer exponents on the fundamental components:

template<typename T, int Meters, int Seconds>
struct Unit {
  Unit(const T& value) : value(value) {}
  T value;
};

template<typename T, int MA, int MB, int SA, int SB>
Unit<T, MA - MB, SA - SB>
operator/(const Unit<T, MA, SA>& a, const Unit<T, MB, SB>& b) {
  return a.value / b.value;
}

Unit<double, 0, 0> one(1);
Unit<double, 1, 0> one_meter(1);
Unit<double, 0, 1> one_second(1);

// Unit<double, 1, -1>
auto one_meter_per_second = one_meter / one_second;

Here we’re using phantom values to tag runtime values with compile-time information about the exponents on the units involved. This scales better than making separate structures for velocities, distances, and so on, and might be enough to cover your use case.

Best Answer

If you have made it thus far, congratulations, you can start designing the interface and the syntax!

Related Solutions

iOS JSON – Should Specific Classes Be Created for JSON Objects or Use Container?

C++ Type Safety – Understanding Strongly Typed Typedef

Related Topic