Returning a std::vector

Consider a function that takes a vector as input and returns a new one.

We can imagine:

std::vector<double> f(const std::vector<double>& v_in) {

    std::vector<double> v_out;

    // do stuff to fill v_out based on v_in

    return v_out;

}

This returns v_out by value, so it looks as though when we do:

auto v_new = f(v_old);

the vector must be copied from the function to the caller.

Tip

We can use auto here because the compiler deduces the type of v_new from the return type of f. We could also be explicit and do:

std::vector<double> v_new = f(v_old);

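However, C++ compilers can apply an optimization (usually named return value optimization, a form of copy elision) that eliminates the need for a copy. Even when the copy is not elided, since C++11 the returned local vector is moved rather than copied, which is cheap. As a result, there is little performance penalty in writing a function that returns something large like a vector.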
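One way to convince ourselves that no deep copy happens is to compare the address of the vector's heap buffer inside the function with the address seen by the caller. The sketch below is only an illustration: the names make_doubled, buffer_was_reused, and g_buffer_inside are hypothetical and not part of the original example. It relies on the fact that both copy elision and a vector move hand the existing buffer to the caller.

```cpp
#include <vector>

// Hypothetical bookkeeping for this sketch: remembers where the local
// vector's heap buffer lived while we were still inside the function.
const double* g_buffer_inside = nullptr;

// Hypothetical stand-in for f: returns a new vector with doubled elements.
std::vector<double> make_doubled(const std::vector<double>& v_in) {

    std::vector<double> v_out;

    for (auto e : v_in) {
        v_out.push_back(2.0 * e);
    }

    // record the buffer address just before returning
    g_buffer_inside = v_out.data();

    // the return is either elided or performed as a move, not a deep copy
    return v_out;

}

// Returns true if the caller's vector uses the same buffer that was
// allocated inside make_doubled, i.e. the elements were not copied.
bool buffer_was_reused(const std::vector<double>& v_in) {

    auto v_new = make_doubled(v_in);

    return v_new.data() == g_buffer_inside;

}
```

If buffer_was_reused returns true for a non-empty input, the elements were handed over to the caller rather than duplicated.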

Updating via an argument

An alternative way to have a function fill a vector is to pass the output vector through the argument list as a non-const reference:

void f(const std::vector<double>& v_in, std::vector<double>& v_out) {

    // fill v_out based on v_in

}

Then we can do:

std::vector<double> v_old{};
std::vector<double> v_new{};

f(v_old, v_new);
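One practical reason to choose this style is that the caller owns the output vector, so its storage can be reused across repeated calls. The sketch below illustrates this under the assumption, true of the common standard library implementations, that clear() keeps the allocated capacity. The names fill_doubled and second_call_reused_storage are hypothetical and used only for this illustration.

```cpp
#include <vector>

// Hypothetical out-parameter version of f: overwrites v_out with
// twice the values of v_in.
void fill_doubled(const std::vector<double>& v_in,
                  std::vector<double>& v_out) {

    // remove any stored elements; common implementations keep the
    // allocated capacity, so it can be reused below
    v_out.clear();

    for (auto e : v_in) {
        v_out.push_back(2.0 * e);
    }

}

// Hypothetical check: call twice with same-sized input and report
// whether the second call kept using the buffer from the first.
bool second_call_reused_storage() {

    std::vector<double> v_in{1.0, 2.0, 3.0, 4.0};
    std::vector<double> v_out;

    fill_doubled(v_in, v_out);
    const double* first_buffer = v_out.data();

    fill_doubled(v_in, v_out);

    return v_out.data() == first_buffer;

}
```

This matters mostly when the function is called many times in a loop; for a single call, returning the vector by value is usually just as efficient.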

Returning a reference?

Danger

What about returning a reference? We might think that we could do:

std::vector<double>& f(const std::vector<double>& v_in) {

    std::vector<double> v_out;

    // do stuff to fill v_out based on v_in

    return v_out;

}

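The problem here is that v_out is destroyed at the end of the function f, so the returned reference refers to an object that no longer exists (a dangling reference). The code compiles, typically with a warning, but using the reference is undefined behavior, so we must never return a reference to a local variable.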
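For contrast, returning a reference is fine when the object it refers to outlives the function call. A minimal sketch, assuming we combine it with the out-parameter style from the previous section (the name fill_doubled_ref is hypothetical):

```cpp
#include <vector>

// Returning a reference is safe here because v_out belongs to the
// caller and therefore still exists after the function returns.
std::vector<double>& fill_doubled_ref(const std::vector<double>& v_in,
                                      std::vector<double>& v_out) {

    v_out.clear();

    for (auto e : v_in) {
        v_out.push_back(2.0 * e);
    }

    // refers to the caller's vector, not to a local variable
    return v_out;

}
```

The returned reference is just another name for the vector the caller passed in, so nothing dangles.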

Example

Let’s play with this. Here’s an example that tries several ways to have a function create a new vector whose elements are initialized to be twice those of the input vector:

Listing 58 function_vector.cpp

#include <iostream>
#include <vector>

// take a vector as input and return a new vector
std::vector<double> f1(const std::vector<double>& v_in) {

    std::vector<double> v_out;

    for (auto e : v_in) {
        v_out.push_back(2.0 * e);
    }

    return v_out;

}

// update a vector through the argument list
void f2(const std::vector<double>& v_in,
        std::vector<double>& v_out) {

    // erase any stored contents
    v_out.clear();

    for (auto e : v_in) {
        v_out.push_back(2.0 * e);
    }

}

// attempt to return a reference to our vector
std::vector<double>& f3(const std::vector<double>& v_in) {

    std::vector<double> v_out;

    for (auto e : v_in) {
        v_out.push_back(2.0 * e);
    }

    return v_out;

}

int main() {

    std::vector<double> v_old{0.0, 1.0, 2.0, 3.0, 4.0, 5.0};

    // method 1: new vector is returned

    auto v_new1 = f1(v_old);

    for (auto e : v_new1) {
        std::cout << e << " ";
    }
    std::cout << std::endl;

    // method 2: pass the new vector as an argument
    // and it is updated

    std::vector<double> v_new2{};

    f2(v_old, v_new2);

    for (auto e : v_new2) {
        std::cout << e << " ";
    }
    std::cout << std::endl;

    // method 3: try to get a reference to a vector
    // created in the function.  This will not work.

    auto v_new3 = f3(v_old);

    for (auto e : v_new3) {
        std::cout << e << " ";
    }
    std::cout << std::endl;


}

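We see that the first two methods work, but the last typically results in a segmentation fault. Because using the dangling reference is undefined behavior, the exact outcome can vary between compilers and runs. In fact, the compiler even warned us about this.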

Of the two methods that work, the first one, f1, is more readable, since it is clear from the call what is being returned.

Some codebases adopt the style that functions should not modify their arguments at all, and should pass new data back only through return values.