vector from scratch

How to write a C++ standard container

Sven Johannsen
C++ User Group NRW
Düsseldorf 2016-05-18

Disclaimer

  • source code for educational purposes only
  • demo code divided in multiple steps:
    • every step single header / namespace
    • diff step to get the corresponding changes
  • source code located at GitHub: https://github.com/SvenJo/

content

  • why use vector?
  • vector part 1
  • allocators
  • exception safety
  • vector part 2
  • iterators
  • vector part 3
  • missing / known issues
  • questions

why do we need a vector?

What kind of problems does the class vector try to solve?

  • C-style pointer to arrays
  • C-style arrays
  • std::vector
  • std::array

std::vector and std::string are "first-class" containers in the STL.

C-style pointers to arrays

    // raw pointers
    const int i_ptr_size = 10;
    int* i_ptr = new int[i_ptr_size]();

    for (int i = 0; i < i_ptr_size; ++i) {
      i_ptr[i] = 2 + i;
    }    

    delete[] i_ptr;
  • size and pointer
  • manual resource handling (alternative: unique_ptr<>)
  • pointer to object vs pointer to array: same type (e.g. int*), but one needs new/delete and the other new[]/delete[]
  • initialization: limited options
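
The unique_ptr<> alternative mentioned above could look like the following sketch. fill_and_sum is a made-up name for this example; the element values follow the slide (2, 3, 4, ...).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Fills a heap array with 2, 3, 4, ... and returns the sum of its elements.
// fill_and_sum is a hypothetical helper, not part of the talk's source code.
int fill_and_sum(std::size_t n)
{
    // unique_ptr<int[]> calls delete[] automatically when it goes out of scope
    std::unique_ptr<int[]> p(new int[n]());

    for (std::size_t i = 0; i < n; ++i)
        p[i] = 2 + static_cast<int>(i);

    int sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += p[i];
    return sum; // no manual delete[]; the size still has to travel separately
}
```

The resource handling is solved, but size and pointer are still two separate things.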

C-style pointers to arrays (loops)

    for (int i = 0; i < i_ptr_size; ++i) {
      cout << i_ptr[i] << " ";
    }
    cout << endl;

    for (int* pi = i_ptr; pi < i_ptr + i_ptr_size; ++pi) {
      cout << *pi << " ";
    }
    cout << endl;
  • loop with index or pointer style
  • no support for range based for loop

C-style arrays

    // C style arrays
    int i_arr[] = { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 }; // at compile time

    const int i_ptr_size = sizeof(i_arr) / sizeof(int);

    for (int i : i_arr) {
      cout << i << " ";
    }
    cout << endl;
  • static size
  • knows its own size (number of elements)
  • range based for loops

std::vector

    std::vector<int> v = { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };

    for (size_t i = 0; i < v.size(); ++i) {
      cout << v[i] << " ";
    }
    cout << endl;

    for (auto it = v.cbegin(); it != v.cend(); ++it) {
      cout << *it << " ";
    }

    for (int i : v) {
      cout << i << " ";
    }
    cout << endl;
  • initialization like C-style arrays
  • range based for loop
  • knows its own size
  • ...

std::array

    std::array<int, 10> a = { { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 } };

    for (size_t i = 0; i < a.size(); ++i) {
      cout << a[i] << " ";
    }
    cout << endl;

    for (int i : a) {
      cout << i << " ";
    }
    cout << endl;
  • not part of this presentation

class vector (Part 1)

  • C++ standard 23.3.6
  • Sequence container
  • elements are stored contiguously
  • vector satisfies all of the requirements of a container (exceptions: *_front)

the first lines of code

for the class vector

  • memory layout
  • capacity
  • naive memory allocation
  • element access

memory layout

  template<typename T /*, class Allocator = allocator<T> */ >
  class vector
  {
  public:
    // some typedefs
    typedef size_t size_type; // see 23.2 // implementation-defined!

    vector() {}
     // ...

  private:
    T* begin_ = nullptr;
    T* end_ = nullptr;
    T* capacity_ = nullptr;
  };
  • 3 pointers are all we need
  • ignore class allocator for the basics

step0.h

capacity

    // see: 23.3.6.3, capacity
    bool empty() const noexcept
    {
      return begin_ == end_;
    }

    size_type size() const noexcept
    {
      return end_ - begin_;
    }

    size_type capacity() const noexcept
    {
      return capacity_ - begin_;
    }
  |01234567890123456789-----------|
   ^begin_             ^end_      ^capacity_                  

   begin_ <= end_ <= capacity_

step1.h

max_size()

    size_type max_size() const noexcept
    {
      return std::numeric_limits<size_type>::max() / sizeof(T); 
    }
  • non-static(!) member function

step1.h

naive memory handling

    explicit vector(size_type n /*, const Allocator& = Allocator()*/)
    {
      begin_ = new T[n]();
      capacity_ = end_ = begin_ + n;
    }

    ~vector()
    {
      delete[] begin_;
    }

Not the final code!!!

step2.h

a first meaningful test

    vector<int> v(10);
    cout << "empty    : " << v.empty() << endl;
    cout << "size     : " << v.size() << endl;    
    cout << "capacity : " << v.capacity() << endl;

Result:

empty    : false
size     : 10
capacity : 10

step2.h

element access

vector<int> v = {1, 2, 3};

int i = v[0]; // i==1
v[0] = -1;    // v == -1, 2, 3

const auto& rv = v;
int j = rv[1]; // j==2
rv[2] = 100;   // compile error

int k = v.at(2); // k==3
v.at(1) = -2;    // v == -1, -2, 3

int l = v.at(5); // throws out_of_range exception

step3.h
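
The difference between operator[] and at() can be checked against std::vector with a small sketch. at_throws is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if v.at(n) reports the invalid index by throwing out_of_range.
// at_throws is a hypothetical helper for this sketch.
bool at_throws(const std::vector<int>& v, std::size_t n)
{
    try {
        (void)v.at(n);   // checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

operator[] performs no such check; an out-of-range index there is undefined behavior.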

element access: operator[]

  template<typename T>
  class vector
  {
  public:
    // some typedefs
    typedef T& reference;
    typedef const T& const_reference;

    // element access:
    reference operator[](size_type n)
    {
      return *(begin_ + n);
    }
    const_reference operator[](size_type n) const
    {
      return *(begin_ + n);
    }
  };

operator[](size_type n) const returns a const reference, not a copy. Initializing a non-reference variable from that reference (e.g. int j = rv[1];) still creates a copy.

step3.h
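
A minimal sketch of the reference vs copy behaviour, using std::vector. The function name is made up for this example.

```cpp
#include <cassert>
#include <vector>

// Returns true when a const reference to v[0] observes a later change
// while a copy taken at the same time does not.
bool reference_tracks_but_copy_does_not()
{
    std::vector<int> v = {1, 2, 3};
    const std::vector<int>& rv = v;

    const int& ref = rv[0]; // binds to the element, no copy
    int copy = rv[0];       // copies the element

    v[0] = -1;              // modify through the non-const vector

    return ref == -1 && copy == 1;
}
```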

element access: at()

  class out_of_range : public std::logic_error // std::exception
  {
  public:
    out_of_range(const std::string& what_arg) : std::logic_error(what_arg) {}
    out_of_range(const char* what_arg) : std::logic_error(what_arg) {}
  };

  const_reference at(size_type n) const
  {
    if (n >= size()) throw out_of_range("out_of_range: vector");
    return *(begin_ + n);
  }  
  reference at(size_type n)
  {
    if (n >= size()) throw out_of_range("out_of_range: vector");
    return *(begin_ + n);
  }

step3.h

test element access

    vector<int> v(10);
    for (size_t i = 0; i < v.size(); ++i)
      v[i] = 2 + i;

    for (size_t i = 0; i < v.size(); ++i)
      cout << v[i] << " ";
    cout << endl;

    for (size_t i = 0; i < v.size(); ++i)
      cout << v.at(i) << " ";
    cout << endl;

    try {
      v.at(100) = 1234;
    } catch (const std::exception& ex) {
      cout << ex.what() << endl;
    }

Result:

2 3 4 5 6 7 8 9 10 11
2 3 4 5 6 7 8 9 10 11
out_of_range: vector

step3.h

swap

template<typename T>
class vector {
public:
  void swap(vector& v)
  {
    std::swap(v.begin_, begin_);
    std::swap(v.end_, end_);
    std::swap(v.capacity_, capacity_);
  }
};

template <class T>
void swap(vector<T>& x, vector<T>& y)
{
  x.swap(y);
}

vector<int> v1, v2;

v1.swap(v2);
swap(v1,v2);

step3.h

front / back

  reference front()
  {
    return *begin_;
  }
  const_reference front() const
  {
    return *begin_;
  }
  reference back()
  {
    return *(end_ - 1);
  }
  const_reference back() const
  {
    return *(end_ - 1);
  }
vector<int> v(2);

int i = v.front(); 
int j = v.back();

step4.h

data

  // 23.3.6.4, data access
  T* data() noexcept
  {
    return begin_;
  }
  const T* data() const noexcept
  {
    return begin_;
  }
  auto* p = v.data();
  assert(p == &v[0]);
  assert(*p == v.front());

step4.h

clear: failed implementation

    void clear() noexcept
    {
      for(pointer p = begin_; p != end_; ++p) {
        // call destructor for *p ???
      }
      end_ = begin_;      
    }
  |01234567890123456789-----------|
   ^begin_                        ^capacity_
   ^end_

If clear() already calls the destructors, what happens when ~vector() later calls delete[]? (delete[] would run the destructors a second time.)

allocator

  • the new / delete problem
  • the solution: allocators
  • the default allocator

The problem with new and delete

new[] and delete[] each do two things:

  • new[]
    • allocate memory (malloc)
    • call default ctor (most times)
  • delete[]
    • call dtor
    • deallocate memory (free)

A 3-pointer container needs uninitialized memory for the range between size() and capacity(), so it cannot be implemented with new[]/delete[].
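
That new[] and delete[] always bundle both steps can be observed with a counting type. counted and alive_after_array_new are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// counted is a hypothetical helper type that counts live instances.
struct counted {
    static int alive;
    counted()  { ++alive; }
    ~counted() { --alive; }
};
int counted::alive = 0;

// Returns how many objects exist right after `new counted[n]`.
int alive_after_array_new(std::size_t n)
{
    counted* p = new counted[n]; // allocates AND default-constructs n objects
    const int alive = counted::alive;
    delete[] p;                  // destroys n objects AND frees the memory
    return alive;
}
```

There is no way to ask new[] for room for 10 elements while constructing only 3 of them.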

Solution: std::allocator

A class with separate functions for

  • allocate memory
  • deallocate memory
  • construct (call ctor)
  • destroy (call dtor)

The default allocator / std::allocator

  • 20.6.9 General utilities library / Memory / The default allocator

Works as the "interface" for all standard containers (except std::array)

namespace std {
    template <class T, class Allocator = allocator<T> >
    class vector {
        ...
    };
    template <class T, class Allocator = allocator<T> >
    class list {
        ...
    };
    template <class Key, class T, class Compare = less<Key>,
              class Allocator = allocator<pair<const Key, T> > >
    class map {
        ...
    };
}

The interface is defined / wrapped by the class allocator_traits. (Not part of this presentation)
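
Although allocator_traits is outside the scope of the talk, the four steps (allocate, construct, destroy, deallocate) can be sketched through it. This is only an illustration under the assumption of std::allocator; allocator_round_trip is a made-up name.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Walks through allocate / construct / destroy / deallocate for one string.
// allocator_round_trip is a hypothetical helper for this sketch.
std::string allocator_round_trip()
{
    using alloc_t  = std::allocator<std::string>;
    using traits_t = std::allocator_traits<alloc_t>;

    alloc_t alloc;
    std::string* p = traits_t::allocate(alloc, 1); // raw memory, no object yet
    traits_t::construct(alloc, p, "hello");        // call the constructor
    std::string result = *p;
    traits_t::destroy(alloc, p);                   // call the destructor
    traits_t::deallocate(alloc, p, 1);             // release the memory
    return result;
}
```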

The default allocator / std::allocator

header: <memory>

template <class T> 
class allocator 
{
public:
    // some typedefs    
    template <class U> struct rebind { typedef allocator<U> other; };

    allocator() noexcept;
    allocator(const allocator&) noexcept;
    template <class U> allocator(const allocator<U>&) noexcept;
    ~allocator();    
    pointer address(reference x) const noexcept;
    const_pointer address(const_reference x) const noexcept;
    size_type max_size() const noexcept;

    pointer allocate(size_type n, allocator<void>::const_pointer hint = 0);
    void deallocate(pointer p, size_type n);

    template<class U, class... Args> void construct(U* p, Args&&... args);
    template <class U> void destroy(U* p);
};

The default allocator / std::allocator

pointer allocate(size_type n, allocator<void>::const_pointer /*hint*/ = 0)
{
    // raw memory only; std::allocator obtains it from ::operator new
    return static_cast<pointer>(::operator new(n * sizeof(T)));
}
void deallocate(pointer p, size_type /*n*/)
{
    ::operator delete(p);
}
template<class U, class... Args> void construct(U* p, Args&&... args)
{
    ::new((void*)p) U(std::forward<Args>(args)...); // placement new
}
template <class U> void destroy(U* p)
{
    p->~U(); // direct destructor call
}
size_type max_size() const noexcept
{
    return std::numeric_limits<size_type>::max() / sizeof(T);
}
  • operator new/delete
  • placement new and direct destructor call
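
The four building blocks can also be tried without any allocator class. The sketch below counts live objects after each step; tracked, step_counts and separate_steps are hypothetical names.

```cpp
#include <cassert>
#include <new>

struct tracked { // hypothetical helper type that counts live instances
    static int alive;
    int value;
    explicit tracked(int v) : value(v) { ++alive; }
    ~tracked() { --alive; }
};
int tracked::alive = 0;

// Number of live objects observed after each step.
struct step_counts { int after_alloc, after_construct, after_destroy; };

step_counts separate_steps()
{
    step_counts c{};
    void* raw = ::operator new(sizeof(tracked)); // memory only
    c.after_alloc = tracked::alive;

    tracked* p = ::new (raw) tracked(42);        // placement new: constructor only
    c.after_construct = tracked::alive;

    p->~tracked();                               // direct destructor call
    c.after_destroy = tracked::alive;

    ::operator delete(raw);                      // free the memory only
    return c;
}
```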

excursion std::list

Example of a potential internal structure for std::list: node<T>

                      _________
template<class T>     |       |
struct node           |   T   |
{                     |_______|
    T t;              | pprev |
    node* pprev;   <------o   |
    node* pnext;      |_______|
};                    | pnext |
                      |   o------>
                      |_______|

How to allocate and construct node<T> with allocator<T>?

use rebind (C++98/03)

template <class U> 
struct rebind { typedef allocator<U> other; };

The containers use allocators to allocate all internal structures.

  • vector<T> : T[]
  • list<T> : node<T>
  • ...
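
Since C++11 the rebind step can also be written with allocator_traits::rebind_alloc. The following sketch assumes std::allocator and a node<T> with the same shape as in the excursion; make_and_read_node is a made-up name.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

template <class T>
struct node { // same shape as the node<T> in the excursion
    T t;
    node* pprev;
    node* pnext;
};

// Allocates one node<int> through an allocator that was declared for int.
// make_and_read_node is a hypothetical helper for this sketch.
int make_and_read_node(int value)
{
    using alloc_t      = std::allocator<int>;
    using node_alloc_t = std::allocator_traits<alloc_t>::rebind_alloc<node<int>>;
    using node_traits  = std::allocator_traits<node_alloc_t>;

    static_assert(std::is_same<node_alloc_t, std::allocator<node<int>>>::value,
                  "rebinding allocator<int> to node<int> gives allocator<node<int>>");

    alloc_t alloc;
    node_alloc_t node_alloc(alloc); // converting constructor
    node<int>* p = node_traits::allocate(node_alloc, 1);
    node_traits::construct(node_alloc, p, node<int>{value, nullptr, nullptr});
    const int result = p->t;
    node_traits::destroy(node_alloc, p);
    node_traits::deallocate(node_alloc, p, 1);
    return result;
}
```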

std::list: use rebind

template <class T> class allocator
{
public:
    template <class U> allocator(const allocator<U>&) noexcept;
    template <class U> struct rebind { typedef allocator<U> other; };

    void construct(pointer p, const T& val) { new((void*)p)T(val); } // C++98
};
template <class T, class Allocator = allocator<T>> 
class list 
{
    typedef Allocator allocator_type;

    void _alloc(...)
    {           
        // rebind the allocator for T into an allocator for node<T>
        using node_allocator = typename allocator_type::template rebind<node<T>>::other;
        allocator_type alloc = ...;
        node_allocator node_alloc{alloc}; 

        node<T>* pnode = node_alloc.allocate(1);
        node_alloc.construct(pnode, node<T>{ T(), pprev, pnext });
    }
};

allocate with rebind, construct with variadic templates (C++11)

template <class T> class allocator
{
public:
    template <class U> allocator(const allocator<U>&) noexcept;
    template <class U> struct rebind { typedef allocator<U> other; };

    template<class U, class... Args>  // C++11
    void construct(U* p, Args&&... args) { new(p)U(std::forward<Args>(args)...); }
};
template <class T, class Allocator = allocator<T>> 
class list 
{
    typedef Allocator allocator_type;

    void _alloc(...)
    {           
        using node_allocator = typename allocator_type::template rebind<node<T>>::other;
        allocator_type alloc = ...;
        node_allocator node_alloc{alloc}; 

        node<T>* pnode = node_alloc.allocate(1);
        // needs a node<T> constructor taking (T, node*, node*);
        // parenthesized aggregate initialization only works since C++20
        node_alloc.construct(pnode, T(), pprev, pnext); 
    }
};

std::vector: region of memory

  • [begin_, capacity_)

    • memory management
    • allocator<T>::allocate()
    • allocator<T>::deallocate()
  • [begin_, end_)

    • call ctor / dtor
    • allocator<T>::construct()
    • allocator<T>::destroy()
  • [end_, capacity_)

    • uninitialized memory
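
With std::vector the three regions can be observed through reserve(): memory is allocated, but no element is constructed. reserve_keeps_size is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if reserve(n) provides memory without creating elements.
bool reserve_keeps_size(std::size_t n)
{
    std::vector<int> v;
    v.reserve(n); // memory for n elements, none constructed
    const bool empty_but_roomy = v.empty() && v.capacity() >= n;

    v.push_back(1); // constructs one element inside that memory
    return empty_but_roomy && v.size() == 1 && v.capacity() >= n;
}
```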

std::vector

template<typename T, class Allocator = std::allocator<T> >
class vector
{
public:
    typedef Allocator allocator_type;

    vector(const Allocator& alloc) : alloc_(alloc) {}

    allocator_type get_allocator() const noexcept 
    { 
        return alloc_; 
    }

private:
    T* begin_ = nullptr;
    T* end_ = nullptr;
    T* capacity_ = nullptr;
    Allocator alloc_; // not the final code.
};

store the allocator to support stateful allocators (new in C++11)

std::vector ctor / dtor

vector(size_type n, const T& value, const Allocator& allocator = Allocator())
 : vector(allocator) // ensure dtor calls
{
    auto alloc = get_allocator();
    end_ = begin_ = alloc.allocate(n);
    capacity_ = begin_ + n; 

    for (size_type i = 0; i < n; ++i , ++end_)
        alloc.construct(end_, value);
}
~vector() noexcept
{
    auto alloc = get_allocator();
    for (pointer p = begin_; p != end_; ++p)
        alloc.destroy(p);
    if (begin_)
        alloc.deallocate(begin_, capacity()); // deallocate needs the allocated size
}
void clear() noexcept
{
    auto alloc = get_allocator();
    for (pointer p = begin_; p != end_; ++p)
        alloc.destroy(p); // noexcept, ~T() noexcept 
    end_ = begin_;
}

step5.h
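
Why delegating to vector(allocator) "ensures dtor calls": once the target constructor has completed, the object counts as constructed, so an exception in the delegating constructor's body still runs the destructor. A minimal sketch with a hypothetical type resource:

```cpp
#include <cassert>
#include <stdexcept>

// resource is a hypothetical type: the target constructor completes,
// then the body of the delegating constructor throws.
struct resource {
    static int destroyed;
    resource() {}                             // target constructor
    explicit resource(bool fail) : resource() // delegating constructor
    {
        if (fail) throw std::runtime_error("constructor body failed");
    }
    ~resource() { ++destroyed; }
};
int resource::destroyed = 0;

// Returns how often ~resource() ran although construction threw.
int destructor_calls_after_throw()
{
    resource::destroyed = 0;
    try {
        resource r(true);
    } catch (const std::runtime_error&) {
    }
    return resource::destroyed;
}
```

For the vector this means: if constructing an element throws inside the loop, ~vector() destroys the elements built so far and frees the memory.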

exception safety

assumptions
  • delete doesn't throw
  • destructors don't throw
  • swap for primitive types (e.g. pointers) doesn't throw
  • different levels of exception safety for different functions

exception safety guarantees

23.2.1 General container requirements

(10) Unless otherwise specified (see 23.2.4.1, 23.2.5.1, 23.3.3.4, and 23.3.6.5) all container types defined in this Clause meet the following additional requirements:

  • if an exception is thrown by an insert() or emplace() function while inserting a single element, that function has no effects.
  • if an exception is thrown by a push_back() or push_front() function, that function has no effects.
  • no erase(), clear(), pop_back() or pop_front() function throws an exception.
  • no copy constructor or assignment operator of a returned iterator throws an exception.
  • no swap() function throws an exception.
  • no swap() function invalidates any references, pointers, or iterators referring to the elements of the containers being swapped. [ Note: The end() iterator does not refer to any element, so it may be invalidated. — end note ]
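
The last requirement can be checked with std::vector: a pointer to an element stays valid across swap() and afterwards refers to the same element inside the other container. pointers_survive_swap is a made-up name.

```cpp
#include <cassert>
#include <vector>

// Returns true if a pointer taken before swap() still refers to the same element.
bool pointers_survive_swap()
{
    std::vector<int> a = {1, 2, 3};
    std::vector<int> b = {10, 20};

    int* first_of_a = &a[0]; // points at the element 1
    a.swap(b);               // only the internal pointers are exchanged

    // The element did not move; it is now owned by b.
    return *first_of_a == 1 && first_of_a == &b[0]
        && a.size() == 2 && b.size() == 3;
}
```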

no-throw guarantee

From wikipedia:

No-throw guarantee, also known as failure transparency: Operations are guaranteed to succeed and satisfy all requirements even in exceptional situations. If an exception occurs, it will be handled internally and not observed by clients.

  1. no copy constructor or assignment operator of a returned iterator throws an exception.
  2. no erase(), clear(), pop_back() or pop_front() function throws an exception.
  3. no swap() function throws an exception.

(1) may limit possible implementations

(3) ignores allocator requirements (n4567)

void swap(vector&)
noexcept(allocator_traits<Allocator>::propagate_on_container_swap::value ||
allocator_traits<Allocator>::is_always_equal::value);

Strong exception safety

From wikipedia:

Strong exception safety, also known as commit or rollback semantics: Operations can fail, but failed operations are guaranteed to have no side effects, so all data retain their original values.

  1. if an exception is thrown by an insert() or emplace() function while inserting a single element, that function has no effects.
  2. if an exception is thrown by a push_back() or push_front() function, that function has no effects.

Strong exception safety (vector)

  1. push_back(): If an exception is thrown other than by the copy constructor, move constructor, assignment operator, or move assignment operator of T or by any InputIterator operation there are no effects. If an exception is thrown while inserting a single element at the end and T is CopyInsertable or is_nothrow_move_constructible<T>::value is true, there are no effects. Otherwise, if an exception is thrown by the move constructor of a non-CopyInsertable T, the effects are unspecified.
  2. reserve(), shrink_to_fit(): If an exception is thrown other than by the move constructor of a non-CopyInsertable type, there are no effects.
  3. resize(): If an exception is thrown there are no effects.

strong exception safety requirement

for push_back()

struct throw_on_copy{ ... }; 

vector<throw_on_copy> v(1);
const auto id = v[0].id(); // unique id for every instance
const void* addr = &v[0];
throw_on_copy toc;
try {
    v.push_back(toc);
    // success: one more element, memory reallocated
    REQUIRE(v.size() == 2);
    REQUIRE(id == v[0].id());
    REQUIRE(addr != &v[0]);    
}
catch(const std::exception&) {
    // failure: no effects
    REQUIRE(v.size() == 1);
    REQUIRE(id == v[0].id());    
    REQUIRE(addr == &v[0]);
}
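
The requirement can be tried against std::vector without a test framework. This is a sketch: the throw_on_copy below is a guess at what such a type might look like (copying throws once a static flag is set), not the type from the talk's repository.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical element type: copying throws once `armed` is set.
struct throw_on_copy {
    static bool armed;
    int id = 0;
    throw_on_copy() = default;
    throw_on_copy(const throw_on_copy& other) : id(other.id)
    {
        if (armed) throw std::runtime_error("copy failed");
    }
    throw_on_copy& operator=(const throw_on_copy&) = default;
};
bool throw_on_copy::armed = false;

// Returns true if a throwing push_back left the vector untouched.
bool failed_push_back_has_no_effect()
{
    throw_on_copy::armed = false;
    std::vector<throw_on_copy> v(1); // one default-constructed element
    v[0].id = 7;
    const void* addr = &v[0];
    const std::size_t cap = v.capacity();

    throw_on_copy toc;
    throw_on_copy::armed = true;
    bool threw = false;
    try {
        v.push_back(toc); // the copy throws
    } catch (const std::runtime_error&) {
        threw = true;
    }
    throw_on_copy::armed = false;

    return threw && v.size() == 1 && v.capacity() == cap
        && v[0].id == 7 && addr == &v[0];
}
```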

push_back()

How to implement push_back() with strong exception safety guarantee?

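void push_back(const T& x)
{
    if (size() + 1 <= capacity()) { 
        alloc_.construct(end_, x); // uninitialized memory: construct, don't assign
        ++end_;
    } else {
        // pseudocode: alloc(), dealloc() and copy_all_elements() are placeholders
        const size_type n = size();
        const size_type new_capacity = capacity() + capacity() / 2 + 1;
        T* b = nullptr;
        try {            
            b = alloc(new_capacity);
            copy_all_elements(b, begin_, n);
            alloc_.construct(b + n, x);
            std::swap(b, begin_);   // b now refers to the old memory
            end_ = begin_ + n + 1;
            capacity_ = begin_ + new_capacity;
            dealloc(b);             // release the old memory
        } catch(...) {
            dealloc(b);             // release the new memory, *this is unchanged
            throw;
        }
    }    
}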

push_back()

How to implement push_back() with strong exception safety guarantee?

void push_back(const T& x)
{
    if (size() + 1 <= capacity()) { 
        alloc_.construct(end_, x);
        ++end_;
    } else {
        vector v(get_allocator());
        const auto new_capacity = capacity() + capacity() / 2 + 1;
        v.reserve(new_capacity); 
        v.assign(begin(), end()); // doesn't work for non-copyable types
        v.push_back(x); // never enters the else branch: capacity is reserved
        swap(v); // noexcept                
    }    
}

(Ignore move-only elements)

reserve

void reserve(size_type n)
{
    if (capacity() >= n) return;

    auto alloc = get_allocator();
    if (n > alloc.max_size()) 
        throw length_error("vector: requested size is bigger than max_size()");

    vector v(alloc);
    v.end_ = v.begin_ = alloc.allocate(n);
    v.capacity_ = v.begin_ + n;

    for (pointer p = begin_; p != end_; ++p, ++v.end_)
        alloc.construct(v.end_, *p);

    swap(v);
}

step6.h
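
The pieces so far (three pointers, allocator, reserve via a temporary plus swap) fit into one compilable sketch. mini_vector is a hypothetical, stripped-down stand-in for the vector of the talk; it assumes a stateless allocator such as std::allocator and uses allocator_traits instead of calling the allocator members directly.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// mini_vector: hypothetical sketch, assumes a stateless allocator.
template <class T, class Allocator = std::allocator<T>>
class mini_vector {
    using traits = std::allocator_traits<Allocator>;
public:
    mini_vector() = default;
    mini_vector(const mini_vector&) = delete;
    mini_vector& operator=(const mini_vector&) = delete;

    ~mini_vector()
    {
        for (T* p = begin_; p != end_; ++p)
            traits::destroy(alloc_, p);
        if (begin_)
            traits::deallocate(alloc_, begin_, capacity());
    }

    std::size_t size() const noexcept { return static_cast<std::size_t>(end_ - begin_); }
    std::size_t capacity() const noexcept { return static_cast<std::size_t>(capacity_ - begin_); }
    const T& operator[](std::size_t n) const { return begin_[n]; }

    void swap(mini_vector& v) noexcept
    {
        std::swap(begin_, v.begin_);
        std::swap(end_, v.end_);
        std::swap(capacity_, v.capacity_);
    }

    void reserve(std::size_t n)
    {
        if (capacity() >= n) return;

        mini_vector tmp; // owns the new block
        tmp.end_ = tmp.begin_ = traits::allocate(tmp.alloc_, n);
        tmp.capacity_ = tmp.begin_ + n;

        // a throwing copy leaves *this untouched; ~tmp cleans up the copies
        for (T* p = begin_; p != end_; ++p, ++tmp.end_)
            traits::construct(tmp.alloc_, tmp.end_, *p);

        swap(tmp); // commit; ~tmp releases the old block
    }

    // x must not refer to an element of *this (not handled in this sketch)
    void push_back(const T& x)
    {
        if (size() == capacity())
            reserve(capacity() + capacity() / 2 + 1);
        traits::construct(alloc_, end_, x);
        ++end_;
    }

private:
    T* begin_ = nullptr;
    T* end_ = nullptr;
    T* capacity_ = nullptr;
    Allocator alloc_;
};
```

Because tmp tracks its own end_, a throwing copy in reserve() needs no try/catch: the destructor of tmp does the rollback.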

resize

void resize(size_type n)
{
    auto alloc = get_allocator();

    // 1. n <= size(): shrink
    if (n <= size()) {
        while (end_ > begin_ + n) {
            --end_;
            alloc.destroy(end_);
        }
        return;
    }
    // 2. size() < n <= capacity(): expand in place
    if (n <= capacity()) {
        while (end_ < begin_ + n) {
            alloc.construct(end_, T());
            ++end_;
        }
        return;
    }

    // 3. capacity() < n: allocate new memory
    vector v(alloc);
    v.end_ = v.begin_ = alloc.allocate(n);
    v.capacity_ = v.begin_ + n;

    for (pointer p = begin_; p != end_; ++p, ++v.end_)
        alloc.construct(v.end_, *p);
    v.resize(n); // case 2.

    swap(v);
}
  1. n <= size() => shrink
  2. size() < n <= capacity() => expand
  3. capacity() < n => realloc

step6.h

missing exception guarantee for assign and operator=

There are no exception guarantees for

  • assign()
  • operator=()
  • emplace_back()
  • emplace()

exception safety vs efficiency

Strong exception safety (copy and swap) prevents reusing existing memory; reusing the memory is more efficient, but can produce unexpected results after an exception.

vector<int> v = { 1, 2, 3 };
try {
    v = other_vec;    
} catch(...) {}

// (a) exception safe: copy and swap, never reuses existing memory
vector& operator=(const vector& x)
{
    vector v(x);
    swap(v);
    return *this;
}

// (b) reuses existing memory, but an exception leaves *this modified
vector& operator=(const vector& x)
{
    if (&x == this) return *this;

    clear(); // clear elements, but keep existing memory 
    _reserve(x.size()); // allocate new memory (if necessary)   
    _assign(x.begin(), x.end()); // copy elements
    return *this;
}

When existing memory is reused, the check for self-assignment is mandatory:
without it, x.size() == 0 after clear().

missing exception safety guarantee

  vector<int> v(1000000, 42);
  vector<int> v1 = v, v2 = v, v3 = v, v4 = v, v5 = v, a = v;

  try
  {
    while (true) {
      a.resize(v.size() * 10);
      v = a;  v1 = a; v2 = a;
      v3 = a; v4 = a; v5 = a;
    }
  }
  catch (...) {}

  cout << "v.size()  = " << v.size() << endl;  cout << "v1.size() = " << v1.size() << endl;
  cout << "v2.size() = " << v2.size() << endl; cout << "v3.size() = " << v3.size() << endl;
  cout << "v4.size() = " << v4.size() << endl; cout << "v5.size() = " << v5.size() << endl;

Visual Studio 2015, 32bit

v.size()  = 100000000
v1.size() = 100000000
v2.size() = 0
v3.size() = 10000000
v4.size() = 10000000
v5.size() = 10000000

exceptions and move

From reserve() and shrink_to_fit(): If an exception is thrown other than by the move constructor of a non-CopyInsertable type, there are no effects.

Exceptions in move operations will destroy all exception safety guarantees:

void reserve(size_type n)
{
   ...
   vector v;
   v._reserve(n);
   // (A) copy or move all elements from this to v
   swap(v); 
}

exception in line (A):

  • copy: the original container is unchanged until swap().
  • move: it's not possible to roll back the move actions of elements. (If move throws an exception, a reverse move will not work)
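
The talk does not show it, but the standard library offers std::move_if_noexcept for exactly this decision: it yields an rvalue only if the move cannot throw (or the type cannot be copied) and otherwise falls back to a const reference, i.e. to a copy. A sketch with two hypothetical types:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct safe_move { // hypothetical: move cannot throw
    safe_move() = default;
    safe_move(const safe_move&) = default;
    safe_move(safe_move&&) noexcept = default;
};

struct risky_move { // hypothetical: move may throw
    risky_move() = default;
    risky_move(const risky_move&) = default;
    risky_move(risky_move&&) noexcept(false) {}
};

// move_if_noexcept yields an rvalue only when moving cannot throw
static_assert(std::is_same<decltype(std::move_if_noexcept(std::declval<safe_move&>())),
                           safe_move&&>::value, "noexcept move: elements are moved");
static_assert(std::is_same<decltype(std::move_if_noexcept(std::declval<risky_move&>())),
                           const risky_move&>::value, "throwing move: fall back to copy");

constexpr bool safe_type_is_moved =
    std::is_nothrow_move_constructible<safe_move>::value;
constexpr bool risky_type_is_moved =
    std::is_nothrow_move_constructible<risky_move>::value;
```

In line (A) a reserve() could then construct each new element from std::move_if_noexcept(*p) and keep the rollback property for copyable types.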

vector implementations part 2

  • constructor
  • assign
  • operator=

constructors, assign and assignment operators

fill a container with data

vector<int> v1;

vector<int> v2  = v1; // copy ctor
v2 = v1;              // assignment operator
v2.assign(v1.begin(), v1.end());
  • copy or move
  • container:
    • same container
    • different container (e.g. list)
    • same type T
    • different type T

constructor

vector(const Allocator& = Allocator());
vector(size_type n);
vector(size_type n, const T& value, const Allocator& = Allocator());
template <class InputIterator> vector(InputIterator first, InputIterator last, const Allocator& = Allocator());
vector(const vector<T,Allocator>& x);
vector(vector&&);
vector(const vector&, const Allocator&);
vector(vector&&, const Allocator&);
vector(initializer_list<T>, const Allocator& = Allocator());
  1. default (done)
  2. n elements of T{}
  3. n elements of T{value} (done)
  4. from other container via iterator
  5. copy ctor
  6. move ctor
  7. copy ctor with allocator
  8. move ctor with allocator
  9. from initializer list

assignment operator

vector<T,Allocator>& operator=(const vector<T,Allocator>& x);
vector<T,Allocator>& operator=(vector<T,Allocator>&& x);
vector& operator=(initializer_list<T>);
  1. copy
  2. move
  3. from initializer list

assign

template <class InputIterator>
void assign(InputIterator first, InputIterator last);
void assign(size_type n, const T& u);
void assign(initializer_list<T>);
  1. from other container
  2. n elements of T{u}
  3. from initializer list

move ctor / move assignment

vector(vector&& other) noexcept
: alloc_(std::move(other.alloc_))
{
    swap(other);
}

vector& operator=(vector&& other)
{    
    swap(other);
    return *this;      
}

vector(vector&& other, const Allocator& alloc)
: vector(alloc)
{
    std::swap(other.begin_, begin_);
    std::swap(other.end_, end_);
    std::swap(other.capacity_, capacity_);
    // don't swap allocator 
    // 23.2.1 [container.requirements.general] 
    // Table 99 — Allocator-aware container requirements
}

trivial move implementation with pointer swap

(ignoring allocator_traits::propagate_on_container_move_assignment)

step7.h

delegate initializer lists

vector& operator=(initializer_list<T> il)
{
    assign(il.begin(), il.end());    
    return *this;
}

vector(initializer_list<T> il, const Allocator& alloc = Allocator())
  : vector(il.begin(), il.end(), alloc)
{}

void assign(initializer_list<T> il)
{
    assign(il.begin(), il.end());    
}

delegate initializer lists overloads to iterator overloads

step7.h

delegate constructors

vector(const Allocator& = Allocator());
vector(size_type n, const T& value, const Allocator& = Allocator());
template <class InputIterator> vector(InputIterator first, InputIterator last, const Allocator& = Allocator());
vector(const vector&, const Allocator&);

vector(vector&&, const Allocator&); // already implemented 
vector(initializer_list<T>, const Allocator& = Allocator()); // already implemented

vector(size_type n) : vector(n, T{}, Allocator()) {} // (1)
vector(const vector& x) : vector(x, Allocator()) {} 
//vector(vector&& x) : vector(std::move(x), Allocator()) {}

reuse similar constructors

step7.h

assign

vector& operator=(const vector& x)
{
    if (this == &x) return *this; 
    vector v(x);
    swap(v);    
    return *this;
}
template <class InputIterator>
void assign(InputIterator first, InputIterator last)
{
    vector v(first, last);
    swap(v);
}
void assign(size_type n, const T& u)
{
    vector v(n, u);
    swap(v);
}
  • exception safe implementations: not required by the standard
  • delegate assign to constructor - the other way is also possible

step7.h

implement remaining constructors (1)

template<typename T, class Allocator = std::allocator<T> >
class vector
{
public:
    vector(const Allocator& alloc = Allocator()) : alloc_(alloc)
    {        
    }

private:
    T* begin_ = nullptr;
    T* end_ = nullptr;
    T* capacity_ = nullptr;
    Allocator alloc_;
};

default constructor

step7.h

implement remaining constructors (2)

vector(size_type n, const T& value, const Allocator& allocator = Allocator())
    : vector(allocator)
{
    auto alloc = get_allocator();
    if (n > alloc.max_size()) 
        throw length_error("vector: requested size is bigger than max_size()");

    end_ = begin_ = alloc.allocate(n);
    capacity_ = begin_ + n;

    for (size_type i = 0; i < n; ++i , ++end_)
        alloc.construct(end_, value);
}

constructor with number of elements, default value and allocator

step7.h

implement remaining constructors (3)

template <class InputIterator>
vector(InputIterator first, InputIterator last, const Allocator& allocator = Allocator())
: vector(allocator)
{
    auto alloc = get_allocator();
    const auto n = std::distance(first, last);
    if ((size_type)n > alloc.max_size()) 
        throw length_error("vector: requested size is bigger than max_size()");

    end_ = begin_ = alloc.allocate(n);
    capacity_ = begin_ + n;

    for (auto it = first; it != last; ++it, ++end_)
        alloc.construct(end_, *it);
}

constructor with iterator overload

step7.h

iterator workaround for msvc

template<class T>
  class vector {
  public:
    vector(int n, const T& val);
    template <class InputIterator> vector(InputIterator first, InputIterator last);
  };

void foo()
{
    vector<int> vi(1, 2);    
}

Workaround for the Microsoft C++ Compiler:

template <class InputIterator
    , class = typename std::iterator_traits<InputIterator>::value_type>
vector(InputIterator first, InputIterator last);

step7.h
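
The same effect can be sketched without iterator_traits. Note the swapped-in technique: the constraint below uses decltype(*std::declval<InputIterator&>()) instead of the slide's iterator_traits check, so the template only participates for types that can be dereferenced. which_ctor is a hypothetical class that merely records which constructor ran.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// which_ctor is a hypothetical class that only records which constructor ran.
template <class T>
struct which_ctor {
    enum kind { count_and_value, iterator_range };
    kind used;

    which_ctor(std::size_t, const T&) : used(count_and_value) {}

    // Participates only if InputIterator can be dereferenced, so that
    // which_ctor<int>(1, 2) cannot deduce InputIterator = int.
    template <class InputIterator,
              class = decltype(*std::declval<InputIterator&>())>
    which_ctor(InputIterator, InputIterator) : used(iterator_range) {}
};
```

Without the second template parameter, which_ctor<int>(1, 2) would prefer the template, because (int, int) is an exact match while the other overload needs an int to size_t conversion.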

constructors with assign

vector(size_type n, const T& value, const Allocator& allocator = Allocator())
: vector(allocator)
{
    assign(n, value);
}
template <class InputIterator>
vector(InputIterator first, InputIterator last, const Allocator& alloc = Allocator())
: vector(alloc)
{
    assign(first, last);
}
vector(const vector& other, const Allocator& allocator)
: vector(allocator)
{
    _assign(other);
}

alternative implementation: constructors call the assign functions

step7a.h

assign

void assign(size_type n, const T& value)
{
    auto alloc = get_allocator();
    if (n > alloc.max_size()) 
        throw length_error("vector: requested size is bigger than max_size()");

    clear();
    if (n > capacity()) {        
        alloc.deallocate(begin_, capacity());
        end_ = begin_ = capacity_ = nullptr;
        end_ = begin_ = alloc.allocate(n);
        capacity_ = begin_ + n;
    }

    for (size_type i = 0; i < n; ++i, ++end_)
        alloc.construct(end_, value);
}

assign with number of elements, default value

step7a.h

assign

template <class InputIterator>
void assign(InputIterator first, InputIterator last)
{
    auto alloc = get_allocator();
    const size_type n = std::distance(first, last);
    if (n > alloc.max_size()) 
        throw length_error("vector: requested size is bigger than max_size()");

    clear();
    if (n > capacity()) {        
        alloc.deallocate(begin_, capacity());
        end_ = begin_ = capacity_ = nullptr;
        end_ = begin_ = alloc.allocate(n);
        capacity_ = begin_ + n;
    }

    for (auto it = first; it != last; ++it, ++end_)
        alloc.construct(end_, *it);
}

assign for a range

step7a.h

assign

void _assign(const vector& other)
{
    auto alloc = get_allocator();
    const auto n = other.size();
    if (n > alloc.max_size()) 
        throw length_error("vector: requested size is bigger than max_size()");

    clear();
    if (n > capacity()) {        
        alloc.deallocate(begin_, capacity());
        end_ = begin_ = capacity_ = nullptr;
        end_ = begin_ = alloc.allocate(n);
        capacity_ = begin_ + n;
    }

    for (pointer p = other.begin_; p != other.end_; ++p, ++end_)
        alloc.construct(end_, *p);
}

assign function to implement copy constructor and assignment operator

step7a.h

iterators

  • the simplest iterator for std::vector<T>
  • iterator_traits
  • random access iterator
  • begin() / end()

T* as simple iterator

T* is the simplest iterator for std::vector<T>

template<typename T>
class vector
{
public:
    ...
    typedef T* iterator;
    typedef const T*  const_iterator;
    ...
    iterator begin() noexcept { return begin_; }
    const_iterator begin() const noexcept { return begin_; }
    iterator end() noexcept { return end_; }
    const_iterator end() const noexcept { return end_; }
    ...
};

verify with iterator_traits

std::vector needs a random access iterator

typeid(iterator_traits<int*>::value_type) == typeid(int)
typeid(iterator_traits<int*>::pointer) == typeid(int*)
typeid(iterator_traits<int*>::reference) == typeid(int&)
typeid(iterator_traits<int*>::difference_type) == typeid(ptrdiff_t)
typeid(iterator_traits<int*>::iterator_category) == typeid(random_access_iterator_tag)
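
The same checks can be done at compile time with static_assert and std::is_same. is_random_access is a hypothetical helper trait for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <type_traits>

// is_random_access is a hypothetical helper trait built on iterator_traits.
template <class It>
struct is_random_access
    : std::is_same<typename std::iterator_traits<It>::iterator_category,
                   std::random_access_iterator_tag> {};

static_assert(std::is_same<std::iterator_traits<int*>::value_type, int>::value, "");
static_assert(std::is_same<std::iterator_traits<int*>::pointer, int*>::value, "");
static_assert(std::is_same<std::iterator_traits<int*>::reference, int&>::value, "");
static_assert(std::is_same<std::iterator_traits<int*>::difference_type,
                           std::ptrdiff_t>::value, "");

// const T* works as const_iterator: value_type stays int, reference is const
static_assert(std::is_same<std::iterator_traits<const int*>::value_type, int>::value, "");
static_assert(std::is_same<std::iterator_traits<const int*>::reference,
                           const int&>::value, "");
```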

random access iterator

A RandomAccessIterator is a

  • BidirectionalIterator is a
    • ForwardIterator is a
      • InputIterator and a
      • OutputIterator
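
In the library this hierarchy shows up as inheritance between the iterator category tags. A small sketch (satisfies is a hypothetical trait); note that output_iterator_tag is not part of the inheritance chain:

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>

// True if an iterator with category Tag can be used where Required is expected.
// satisfies is a hypothetical helper trait for this sketch.
template <class Tag, class Required>
struct satisfies : std::is_base_of<Required, Tag> {};

static_assert(satisfies<std::random_access_iterator_tag,
                        std::bidirectional_iterator_tag>::value, "");
static_assert(satisfies<std::bidirectional_iterator_tag,
                        std::forward_iterator_tag>::value, "");
static_assert(satisfies<std::forward_iterator_tag,
                        std::input_iterator_tag>::value, "");
```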

random access iterator

    template<typename T>
    class vector_iterator
    {
    public:
      typedef typename CONT_TYPE::difference_type difference_type; //almost always ptrdiff_t
      typedef typename CONT_TYPE::value_type value_type; //almost always T
      typedef typename CONT_TYPE::reference reference; //almost always T& or const T&
      typedef typename CONT_TYPE::value_type* pointer; //almost always T* or const T*
      typedef random_access_iterator_tag iterator_category;

      vector_iterator();
      vector_iterator(implementation specific);

      pointer operator->();
      const_pointer_type operator->() const;
      reference operator*();
      const_reference_type operator*() const;
      vector_iterator& operator++(); //prefix increment: ++it
      vector_iterator operator++(int); //postfix increment: it++
      vector_iterator& operator--(); //prefix decrement: --it
      vector_iterator operator--(int); //postfix decrement: it--

      vector_iterator& operator+=(size_type n);
      vector_iterator& operator-=(size_type n);
      reference operator[](size_type n);
      const_reference_type operator[](size_type n) const;      
    };

    vector_iterator operator+(const vector_iterator& it, size_type n);
    vector_iterator operator+(size_type n, const vector_iterator& it);
    vector_iterator operator-(const vector_iterator& it, size_type n);
    difference_type operator-(const vector_iterator& l, const vector_iterator& r);

    bool operator<(const vector_iterator& l, const vector_iterator& r);
    bool operator>(const vector_iterator& l, const vector_iterator& r);
    bool operator<=(const vector_iterator& l, const vector_iterator& r);
    bool operator>=(const vector_iterator& l, const vector_iterator& r);
    bool operator==(const vector_iterator& l, const vector_iterator& r);
    bool operator!=(const vector_iterator& l, const vector_iterator& r);

    void swap(vector_iterator& l, vector_iterator& r);

begin() and end()

iterator begin() noexcept
{
  return iterator(begin_);
}
iterator end() noexcept
{
  return iterator(end_);
}
vector<int> v = { 1,2,3 };
for (auto x : v) cout << x << " ";
cout <<endl;

and cbegin(), cend(), rbegin(), rend(), crbegin() and crend()
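
The reverse variants do not need a new iterator class: std::reverse_iterator can wrap the existing one. A minimal sketch, assuming plain pointers as iterators; `int_range` is a hypothetical stand-in for the vector and holds only the two pointers.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

struct int_range  // hypothetical stand-in, not the vector from the slides
{
  typedef int* iterator;
  typedef const int* const_iterator;
  typedef std::reverse_iterator<iterator> reverse_iterator;
  typedef std::reverse_iterator<const_iterator> const_reverse_iterator;

  int* begin_;
  int* end_;

  iterator begin() noexcept { return begin_; }
  iterator end() noexcept { return end_; }
  const_iterator cbegin() const noexcept { return begin_; }
  const_iterator cend() const noexcept { return end_; }

  // rbegin() wraps end(), rend() wraps begin()
  reverse_iterator rbegin() noexcept { return reverse_iterator(end_); }
  reverse_iterator rend() noexcept { return reverse_iterator(begin_); }
  const_reverse_iterator crbegin() const noexcept { return const_reverse_iterator(end_); }
  const_reverse_iterator crend() const noexcept { return const_reverse_iterator(begin_); }
};

// Copies the range back to front by walking from rbegin() to rend().
std::vector<int> reversed_copy(int_range r)
{
  return std::vector<int>(r.rbegin(), r.rend());
}
```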

step8.h

random access iterator

    template<typename T> class vector_iterator 
    {
    public:
      typedef T* pointer;
      typedef T& reference;
      typedef random_access_iterator_tag iterator_category;

      vector_iterator() {}
      vector_iterator(T* p) : p_(p) {}

      pointer operator->() { return p_; }
      reference operator*() { return *p_; }

      vector_iterator& operator++() //prefix increment: ++it
      {
        ++p_;
        return *this;
      }
      vector_iterator operator++(int) //postfix increment: it++
      {
        vector_iterator i(*this);
        ++p_;
        return i;
      }
      reference operator[](size_type n) { return *(p_ + n); }

    private:
      pointer p_ = nullptr;

      friend bool operator!=(const vector_iterator& l, const vector_iterator& r) { return l.p_ != r.p_; }
      friend bool operator<(const vector_iterator& l, const vector_iterator& r) { return l.p_ < r.p_; }      
    };

The C++ standard doesn't contain a class vector_iterator. The class vector only contains a typedef to a type that satisfies the requirements of a random access iterator.

vector implementations part 3

  • compare
  • pop_back
  • erase
  • insert
  • emplace

compare operations

template <class T, class Allocator>
bool operator==(const vector<T, Allocator>& x, const vector<T, Allocator>& y);
template <class T, class Allocator>
bool operator< (const vector<T ,Allocator>& x, const vector<T,Allocator>& y);
template <class T, class Allocator>
bool operator!=(const vector<T, Allocator>& x, const vector<T, Allocator>& y);
template <class T, class Allocator>
bool operator> (const vector<T, Allocator>& x, const vector<T, Allocator>& y);
template <class T, class Allocator>
bool operator>=(const vector<T, Allocator>& x, const vector<T, Allocator>& y);
template <class T, class Allocator>
bool operator<=(const vector<T, Allocator>& x, const vector<T, Allocator>& y);

6 non-member operations

step9.h

compare operations

template <class T, class Allocator>
bool operator==(const vector<T, Allocator>& x, const vector<T, Allocator>& y)
{
  // use std algorithms as in standard
  // return distance(x.begin(), x.end()) == distance(y.begin(), y.end())
  //  && std::equal(x.begin(), x.end(), y.begin()); 
  // 'std::_Equal2': Function call with parameters that may be unsafe - this call relies on the caller to check that the passed values are correct. To disable this warning, use -D_SCL_SECURE_NO_WARNINGS. See documentation on how to use Visual C++ 'Checked Iterators'

  return x.size() == y.size() && detail::_equal(x.begin(), x.end(), y.begin(), y.end());
}
template <class T, class Allocator>
bool operator< (const vector<T ,Allocator>& x, const vector<T,Allocator>& y)
{
  return lexicographical_compare(x.begin(), x.end(), y.begin(), y.end());
}

see description for operator == and < in 23.2.1 ([container.requirements.general]) "Table 96 — Container requirements" and "Table 98 — Optional container operations"

using the 4 parameter overload of equal (C++14)
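
Why the four-parameter overload matters: with three iterators the caller has to guarantee that the second range is long enough, with four iterators ranges of different length simply compare unequal. A minimal sketch on std::vector<int>; `same_elements` and `less_than` are hypothetical helper names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Four-iterator std::equal (C++14): safe for ranges of different length.
bool same_elements(const std::vector<int>& x, const std::vector<int>& y)
{
  return std::equal(x.begin(), x.end(), y.begin(), y.end());
}

// operator< for containers is defined via lexicographical_compare.
bool less_than(const std::vector<int>& x, const std::vector<int>& y)
{
  return std::lexicographical_compare(x.begin(), x.end(), y.begin(), y.end());
}
```

A shorter range that is a prefix of a longer one compares less, so {1, 2} < {1, 2, 3}.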

step9.h

compare operations

template <class T, class Allocator>
bool operator!=(const vector<T, Allocator>& x, const vector<T, Allocator>& y)
{
  return !(x == y);
}

template <class T, class Allocator>
bool operator> (const vector<T, Allocator>& x, const vector<T, Allocator>& y)
{
  return y < x;
}

template <class T, class Allocator>
bool operator>=(const vector<T, Allocator>& x, const vector<T, Allocator>& y)
{
  return !(x < y);
}

template <class T, class Allocator>
bool operator<=(const vector<T, Allocator>& x, const vector<T, Allocator>& y)
{
  return !(x > y);
}

implement the remaining operators with the help of == and <
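
The same pattern works for any value type. A minimal sketch with a hypothetical struct `version`: only == and < contain real logic, the other four operators are derived exactly as above.

```cpp
#include <cassert>

struct version  // hypothetical example type
{
  int major_no;
  int minor_no;
};

inline bool operator==(const version& x, const version& y)
{
  return x.major_no == y.major_no && x.minor_no == y.minor_no;
}
inline bool operator<(const version& x, const version& y)
{
  return x.major_no < y.major_no
      || (x.major_no == y.major_no && x.minor_no < y.minor_no);
}

// derived operators: no knowledge about the members required
inline bool operator!=(const version& x, const version& y) { return !(x == y); }
inline bool operator> (const version& x, const version& y) { return y < x; }
inline bool operator>=(const version& x, const version& y) { return !(x < y); }
inline bool operator<=(const version& x, const version& y) { return !(x > y); }
```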

step9.h

pop_back()

void pop_back()
{
  resize(size() - 1);
}

implement pop_back() with the help of resize() and size()

step10.h

erase()

iterator erase(const_iterator position);
iterator erase(const_iterator first, const_iterator last);

erase a single element or a range of elements

step10.h

erase() a single element

iterator erase(const_iterator position)
{
    // store position pointer independent for return value
    const size_type pos = position - iterator(begin_);

    for (pointer p = begin_ + pos + 1; p != end_; ++p)
        std::swap(*(p - 1), *p);
    get_allocator().destroy(--end_);

    return iterator(begin_ + pos);
}
  • const_iterator as input, non-const iterator as return value
  • use swap<T>() for better move support

step10.h

erase() a range of elements

iterator erase(const_iterator first, const_iterator last)
{
    // store position pointer independent for return value
    const size_type pos1 = first - iterator(begin_); 
    const size_type pos2 = last - iterator(begin_);

    pointer p1 = begin_ + pos1;
    pointer p2 = begin_ + pos2;

    for (; p2 != end_; ++p1, ++p2)
        std::swap(*p1, *p2);

    auto alloc = get_allocator();
    for (size_type i = 0; i < size_type(p2 - p1); ++i)
        alloc.destroy(--end_);

    return iterator(begin_ + pos1);
}
  • two loops: one to move the tail forward, one to destroy the leftover elements
  • use swap<T>() for better move support
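
An alternative to the swap loop is the std::move algorithm from <algorithm>, which move-assigns the tail in one call. This is a swapped-in technique, not the code from step10.h. A minimal sketch on top of std::vector; `erase_range_by_moving` is a hypothetical name and works with indices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
typename std::vector<T>::iterator
erase_range_by_moving(std::vector<T>& v, std::size_t first, std::size_t last)
{
  assert(first <= last && last <= v.size());
  const auto f = static_cast<std::ptrdiff_t>(first);
  const auto l = static_cast<std::ptrdiff_t>(last);
  if (first == last)
    return v.begin() + f;  // nothing to erase, avoid self-move

  // step 1: move-assign the tail [last, end) down to position first
  auto new_end = std::move(v.begin() + l, v.end(), v.begin() + f);
  // step 2: destroy the moved-from leftovers at the back
  v.resize(static_cast<std::size_t>(new_end - v.begin()));
  return v.begin() + f;
}
```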

step10.h

erase-remove idiom

vector<std::string> v = { "1", "5", "2", "3", "4", "5" };

v.erase(
    std::remove_if(begin(v), end(v), [](const std::string& s) { return s == "5"; }),
    end(v)
);
assert((v == vector<std::string>{ "1", "2", "3", "4" }));
  • remove / remove_if shifts the elements to keep to the front and returns the new logical end
  • erase destroys the leftover elements between that position and end()
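
Both steps of the idiom, written out separately. A minimal sketch using std::vector and std::remove; `erase_all` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Removes every element equal to value and returns how many were erased.
std::size_t erase_all(std::vector<std::string>& v, const std::string& value)
{
  // step 1: the kept elements move to the front, size() is unchanged
  auto new_end = std::remove(v.begin(), v.end(), value);
  const std::size_t removed = static_cast<std::size_t>(v.end() - new_end);

  // step 2: destroy the leftover tail, only now size() shrinks
  v.erase(new_end, v.end());
  return removed;
}
```

Calling only step 1 leaves the vector with its old size and a tail of elements in a valid but unspecified state.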

step10.h

many insert functions

iterator insert(const_iterator position, const T& x);
iterator insert(const_iterator position, T&& x);

iterator insert(const_iterator position, size_type n, const T& x);
template <class InputIterator> 
iterator insert(const_iterator position, InputIterator first, InputIterator last);
iterator insert(const_iterator position, initializer_list<T> il);
  • 2x single value inserts
  • 3x multi value inserts
    • n elements of value x
    • range of iterators
    • initializer_list

How insert should work

  • work in place? (no reallocation needed)
    • Yes:
      • begin() - insert_pos(): do nothing
      • insert new element at insert_pos with swap
      • insert_pos - end(): shift(1) / shift(n)
    • No:
      • allocate big enough in tmp pointer
      • copy begin() - insert_pos()
      • insert new element at insert_pos with swap
      • insert_pos - end()
      • swap tmp and internal pointer
      • free tmp pointer

Don't create a gap of uninitialized memory between initialized values
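
One way to respect this rule is to append first and reorder afterwards: the new element is constructed directly behind the last one, then rotated into place. This is an alternative technique for illustration, not the code of the following slides. A minimal sketch on top of std::vector; `insert_by_rotating` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
typename std::vector<T>::iterator
insert_by_rotating(std::vector<T>& v, std::size_t pos, const T& x)
{
  assert(pos <= v.size());
  v.push_back(x);  // no gap: constructed right after the last element
  // compute the iterator after push_back, which may have reallocated
  auto where = v.begin() + static_cast<std::ptrdiff_t>(pos);
  // shift [pos, old end) one step to the right, new element lands at pos
  std::rotate(where, v.end() - 1, v.end());
  return where;
}
```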

simple insert

iterator insert_simple(const_iterator position, const T& x)
{
    // store position pointer independent for return value
    const size_type pos = position - iterator(begin_);

    if (size() + 1 <= capacity()) {
        T val = x;
        for (pointer p = begin_ + pos; p != end_; ++p)
            std::swap(val, *p);

        get_allocator().construct(end_, val);
        ++end_;
        return begin_ + pos;
    }
    vector v(*this);
    v.reserve(size() + 1);
    v.insert_simple(v.begin_ + pos, x);
    swap(v);
    return begin_ + pos;
}

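not the best performance: the reallocating path copies all elements in the copy constructor, transfers them again in reserve() and then still shifts them with swap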

insert for a single value

iterator insert(const_iterator position, const T& x)
{
    // store position pointer independent for return value
    const size_type pos = position - iterator(begin_);
    auto alloc = get_allocator();
    const size_type n = size() + 1;

    if (n <= capacity()) {
        T val = x;
        for (pointer p = begin_ + pos; p != end_; ++p)
            std::swap(val, *p);

        alloc.construct(end_, val);
        ++end_;
        return begin_ + pos;
    }
    vector v(alloc);
    // allocate memory
    v.end_ = v.begin_ = alloc_.allocate(n);
    v.capacity_ = v.begin_ + n;

    pointer p = begin_;
    // copy first part
    for (; p != begin_ + pos; ++p, ++v.end_)
        alloc.construct(v.end_, *p);

    // insert value
    alloc.construct(v.end_, x);
    ++v.end_;

    // copy the remaining part
    for (; p != end_; ++p, ++v.end_)
        alloc.construct(v.end_, *p);

    swap(v);
    return begin_ + pos;
}

insert for many values and emplace

iterator insert(const_iterator position, size_type n, const T& x); 
template <class InputIterator> 
iterator insert(const_iterator position, InputIterator first, InputIterator last);
iterator insert(const_iterator position, initializer_list<T> il);

template <class... Args>
iterator emplace(const_iterator position, Args&&... args);

emplace

template <class... Args>
iterator emplace(const_iterator position, Args&&... args)
{
  // store position pointer independent for return value
  const size_type pos = position - iterator(begin_);
  auto alloc = get_allocator();
  const size_type n = size() + 1;

  if (n <= capacity()) {
    T val{ std::forward<Args>(args)... };
    for (pointer p = begin_ + pos; p != end_; ++p)
      std::swap(val, *p);

    alloc.construct(end_, val);
    ++end_;
    return begin_ + pos;
  }
  if (alloc.max_size() < n)
    throw length_error("vector: requested size is bigger than max_size()");

  vector v(alloc);
  // allocate memory
  v.end_ = v.begin_ = alloc.allocate(n);
  v.capacity_ = v.begin_ + n;

  pointer p = begin_;
  // copy first part
  for (; p != begin_ + pos; ++p, ++v.end_)
    alloc.construct(v.end_, *p);

  // insert value
  alloc.construct(v.end_, std::forward<Args>(args)...);
  ++v.end_;

  // copy the remaining part
  for (; p != end_; ++p, ++v.end_)
    alloc.construct(v.end_, *p);

  swap(v);
  return begin_ + pos;
}

construct a value in place

emplace_back()

template <class... Args> void emplace_back(Args&&... args)
{
  auto alloc = get_allocator();
  const auto n = size() + 1;

  if (n <= capacity()) {
    alloc.construct(end_, std::forward<Args>(args)...);
    ++end_;
    return;
  }

  if (alloc.max_size() < n)
    throw length_error("vector: requested size is bigger than max_size()");

  vector v;

  // allocate memory
  const auto new_capacity = n + n / 2 + 1;
  v.end_ = v.begin_ = alloc.allocate(new_capacity);
  v.capacity_ = v.begin_ + new_capacity;

  // copy orig to new memory
  for (pointer p = begin_; p != end_; ++p, ++v.end_) {
    alloc.construct(v.end_, *p);
  }
  // create element in memory of the vector, calls ctor with "args..."
  alloc.construct(v.end_, std::forward<Args>(args)...);
  ++v.end_;

  swap(v);
}

create an object directly in the memory of the vector
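
The core of emplace_back() is the construct call with forwarded arguments. A minimal sketch of just that step, assuming std::allocator_traits (C++11) instead of calling the allocator members directly; `construct_one` and `destroy_one` are hypothetical helpers.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Allocate room for one T and build it there from args.
template <typename T, typename Alloc, typename... Args>
T* construct_one(Alloc& alloc, Args&&... args)
{
  typedef std::allocator_traits<Alloc> traits;
  T* p = traits::allocate(alloc, 1);
  try {
    // calls the ctor of T with "args..." at address p, no temporary T
    traits::construct(alloc, p, std::forward<Args>(args)...);
  } catch (...) {
    traits::deallocate(alloc, p, 1);  // don't leak if the ctor throws
    throw;
  }
  return p;
}

// Counterpart: run the destructor, then release the memory.
template <typename T, typename Alloc>
void destroy_one(Alloc& alloc, T* p)
{
  typedef std::allocator_traits<Alloc> traits;
  traits::destroy(alloc, p);
  traits::deallocate(alloc, p, 1);
}
```

With std::forward an rvalue argument stays an rvalue, so a move-only argument can be passed through to the constructor.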

other topics

  • missing parts
  • bugs / room for improvement

what's missing

  • special handling for
    • move-only elements (e.g. unique_ptr, ofstream, ...)
    • copy-only elements (const member variables)
    • noexcept handling (call in-place modifiers only for noexcept copyable / movable types)
  • fix
    • insert for multiple elements
    • push_back(T&& x)
  • missing sanity checks

implicit conversion

between ptrdiff_t and size_type

    size_type size() const noexcept
    {
      return end_ - begin_;  // returns ptrdiff_t
    }
    size_type capacity() const noexcept
    {
      return capacity_ - begin_;
    }    
    vector(InputIterator first, InputIterator last)
    {
      reserve(std::distance(first, last)); // missing check for negative values
      ...
    }
  • implicit conversion from ptrdiff_t (signed) to size_t (unsigned)
  • reduces the possible number of elements
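
What the missing check for negative values means in practice: a negative distance silently becomes a huge unsigned number. A minimal sketch; `unchecked_count` and `checked_count` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Same effect as the implicit conversion in reserve(std::distance(...)).
inline std::size_t unchecked_count(std::ptrdiff_t distance)
{
  return static_cast<std::size_t>(distance);
}

// Checked variant: reject negative distances before converting.
inline bool checked_count(std::ptrdiff_t distance, std::size_t& out)
{
  if (distance < 0)
    return false;
  out = static_cast<std::size_t>(distance);
  return true;
}
```

A distance of -1 converts to the largest value of size_t, so reserve() would try to allocate (almost) the whole address space.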

max_size()

max_size() returns the wrong value:

Due to the implicit conversion between ptrdiff_t and size_type, the effective max_size should be reduced.

    size_type max_size() const noexcept
    {
      // reduce the number of elements due to the implicit conversion
      //return std::numeric_limits<size_type>::max() / sizeof(T); 
      return std::numeric_limits<ptrdiff_t>::max() / sizeof(T);
    }
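
Both limits side by side. A minimal sketch, assuming a common platform where ptrdiff_t and size_t have the same width; `max_elements_unsigned` and `max_elements_signed` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Limit if only size_type is considered.
template <typename T>
constexpr std::size_t max_elements_unsigned()
{
  return std::numeric_limits<std::size_t>::max() / sizeof(T);
}

// Limit if end_ - begin_ must still fit into ptrdiff_t.
template <typename T>
constexpr std::size_t max_elements_signed()
{
  return static_cast<std::size_t>(std::numeric_limits<std::ptrdiff_t>::max())
       / sizeof(T);
}
```

Under that assumption the signed limit is roughly half of the unsigned one.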

header file pollution

file vector.hpp

#pragma once

#include <cassert>   // assert
#include <utility>   // forward
#include <algorithm> // lexicographical_compare

Users of vector.hpp get cassert and the other headers for free and may forget to include them themselves. Code that relies on such transitive includes is not portable!

Questions