This article originally appeared on Medium.
The “Do I Know This Already?” Quiz
What is the best way to return multiple values from a C++17 function?
- Using output parameters:
    auto output_1(int &i1) { i1 = 11; return 12; }
- Using a local structure:
    auto struct_2() { struct _ { int i1, i2; }; return _{21, 22}; }
- Using an std::pair:
    auto pair_2() { return std::make_pair(31, 32); }
- Using an std::tuple:
    auto tuple_2() { return std::make_tuple(41, 42); }
Keep reading for the answer.
Use Case: Why Return Multiple Values?
A typical example is std::from_chars(), a C++17 function similar to strtol(). But from_chars() returns three values: a parsed number, an error code, and a pointer to the first invalid character.
The function uses a mix of techniques: the number is returned as an output parameter, but the error code and the pointer are returned as a structure. Why is this so? Let’s take a closer look…
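As a rough illustration of that mixed interface, the sketch below wraps std::from_chars() in a small helper. The names parse_int and parse_int_demo and their parameters are hypothetical, chosen only for this example; std::from_chars() and the ptr and ec members of its result are the standard C++17 ones.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical helper, not part of the article's code.
// Parses a decimal int from the start of `text`.
// Returns true on success; `consumed` gets the number of characters parsed.
bool parse_int(std::string_view text, int &value, std::size_t &consumed) {
    const char *first = text.data();
    const char *last = first + text.size();
    // The number arrives through the output parameter `value`, while the
    // pointer and the error code arrive in the returned structure.
    auto [ptr, ec] = std::from_chars(first, last, value);
    consumed = static_cast<std::size_t>(ptr - first);
    return ec == std::errc{};
}

// Small self-check showing the three results for the input "123abc".
void parse_int_demo() {
    int value = 0;
    std::size_t consumed = 0;
    bool ok = parse_int("123abc", value, consumed);
    assert(ok);             // no error reported
    assert(value == 123);   // the parsed number
    assert(consumed == 3);  // parsing stopped at 'a'
    (void)ok;
}
```

Calling parse_int_demo() from main() exercises all three results at once: the number, the error code, and the position of the first invalid character.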
Analysis
Return Multiple Values Using Output Parameters
Example code:
    #include <cstdio>

    auto output_1(int &i1) {
      i1 = 11;    // Output first parameter
      return 12;  // Return second value
    }

    // Use a volatile pointer so the compiler cannot inline the function
    auto (*volatile output_1_ptr)(int &i1) = output_1;

    int main() {
      int o1, o2;             // Define local variables
      o2 = output_1_ptr(o1);  // Output 1st param and assign the 2nd
      printf("output_1 o1 = %d, o2 = %d\n", o1, o2);
    }
The code compiles to:
    output_1(int&):
      mov [rdi], 11        # Output first param to the address in rdi
      mov eax, 12          # Return second value in eax
      ret
    main:                  # Note: simplified
      lea rdi, [rsp + 4]   # Load address of the 1st param (on stack)
      call [output_1_ptr]  # Call output_1 using a pointer
      mov esi, [rsp + 4]   # Load 1st param from the stack
      mov ecx, eax         # Load 2nd param from eax
      call printf
Compiler Explorer: https://godbolt.org/z/Fan8OH
Pros:
- Classic. Easy to understand.
- Works with any C++ standard, including C (using pointers).
- Supports function overloading.
Cons:
- The address of the first parameter must be loaded prior to the function call.
- The first value is passed through memory on the stack. Slow 🙁
- Under the System V AMD64 ABI, up to 6 addresses can be passed in registers. Any further params must be passed on the stack. Even slower 🙁
To illustrate the last con, here is example code that outputs 7 params:
    // Output more than 6 params
    int output_7(int &i1, int &i2, int &i3, int &i4, int &i5, int &i6,
                 int &i7) {
      i1 = 11;
      i2 = 12;
      i3 = 13;
      i4 = 14;
      i5 = 15;
      i6 = 16;
      i7 = 17;
      return 18;
    }
And the disassembly of output_7():
    output_7(int&, int&, int&, int&, int&, int&, int&):
      mov [rdi], 11       #
      mov [rsi], 12       # Addresses of the first 6 params get passed
      mov [rdx], 13       # via rdi, rsi, rdx, rcx, r8, and r9
      mov [rcx], 14       # according to System V AMD64 ABI
      mov [r8], 15        # (for Linux, macOS, FreeBSD etc)
      mov [r9], 16        #
      mov rax, [rsp + 8]  # But address for the 7th is on the stack,
      mov [rax], 17       # which is slow
      mov eax, 18
      ret
The seventh address is passed via the stack: the caller puts the address on the stack, the function reads it back from the stack, and only then writes the value to that address. That is a few too many memory operations. Slow 🙁
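The overloading advantage listed in the pros above can be sketched as follows. The names get_limit and get_limit_demo and the values are hypothetical; the point is only that the type of an output parameter takes part in overload resolution, which a return type cannot do.

```cpp
#include <cassert>

// Hypothetical overload set: the compiler picks the overload from the
// type of the output parameter.
bool get_limit(int &limit_out) {
    limit_out = 100;
    return true;
}

bool get_limit(double &limit_out) {
    limit_out = 0.5;
    return true;
}

void get_limit_demo() {
    int i = 0;
    double d = 0.0;
    get_limit(i);  // resolves to the int& overload
    get_limit(d);  // resolves to the double& overload
    assert(i == 100);
    assert(d == 0.5);
}
```

Two functions that differed only in their return type would be rejected by the compiler as an invalid redeclaration.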
Return Multiple Values Using a Local Structure
Example code:
    #include <cstdio>

    auto struct_2() {
      struct _ {         // Declare a local structure with 2 integers
        int i1, i2;
      };
      return _{21, 22};  // Return the local structure
    }

    // Use a volatile pointer so the compiler cannot inline the function
    auto (*volatile struct_2_ptr)() = struct_2;

    int main() {
      auto [s1, s2] = struct_2_ptr();  // Structured binding declaration
      printf("struct_2 s1 = %d, s2 = %d\n", s1, s2);
    }
Disassembly:
    struct_2():
      movabs rax, 0x1600000015  # Just return 2 integers in rax
      ret
    main:                       # Note: simplified
      call [struct_2_ptr]       # No need to load output param addresses
      mov rdx, rax              # Just use the values returned in rax
      shr rdx, 32               # High 32 bits of rax
      mov rcx, rax
      mov esi, ecx              # Low 32 bits of rax
      call printf
Compiler Explorer: https://godbolt.org/z/Q7P4q0
Pros:
- Works with any C++ standard, including C, though the structure must be declared outside the function scope.
- Returns up to 128 bits in registers, no stack is used. Fast!
- Does not require addresses of the params, which allows compiler to better optimize the code.
Cons:
- Unpacking the result with auto [s1, s2] requires a C++17 structured binding declaration.
- The function can’t be overloaded, since the return type is not part of the function signature used for overload resolution.
What happens when we try to return more values? According to the System V AMD64 ABI, values up to 128 bits are stored in RAX and RDX. So up to four 32-bit integers will be returned in registers. One byte more and we have to use the stack.
Still, we don’t need to load output param addresses, so it is faster than the output parameters method.
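The 128-bit boundary can be checked at compile time. The sketch below assumes a 32-bit int, as on typical System V AMD64 targets; the names Four, Five, four_values, five_values, and struct_size_demo are hypothetical. The static_assert lines verify only the sizes. Whether a value actually travels in registers is decided by the ABI and the compiler.

```cpp
#include <cassert>
#include <type_traits>

struct Four { int i1, i2, i3, i4; };      // 16 bytes: exactly 128 bits
struct Five { int i1, i2, i3, i4, i5; };  // 20 bytes: more than 128 bits

static_assert(sizeof(int) == 4, "this sketch assumes a 32-bit int");
static_assert(sizeof(Four) == 16, "four ints fill exactly 128 bits");
static_assert(sizeof(Five) > 16, "a fifth int no longer fits");
// Both are trivially copyable aggregates; types with non-trivial copy
// constructors or destructors are returned through memory instead.
static_assert(std::is_trivially_copyable<Four>::value, "plain aggregate");
static_assert(std::is_trivially_copyable<Five>::value, "plain aggregate");

Four four_values() { return {1, 2, 3, 4}; }
Five five_values() { return {1, 2, 3, 4, 5}; }

void struct_size_demo() {
    auto [a, b, c, d] = four_values();  // C++17 structured binding
    assert(a == 1 && b == 2 && c == 3 && d == 4);
    Five f = five_values();
    assert(f.i5 == 5);
}
```

Compiling both functions in Compiler Explorer is a quick way to see which of them the compiler returns through a hidden pointer on a given target.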
Return Multiple Values Using an std::pair
Example:
    #include <cstdio>
    #include <utility>

    auto pair_2() { return std::make_pair(31, 32); }  // Just one line!

    // Use a volatile pointer so the compiler cannot inline the function
    auto (*volatile pair_2_ptr)() = pair_2;

    int main() {
      auto [p1, p2] = pair_2_ptr();  // Structured binding declaration
      printf("pair_2 p1 = %d, p2 = %d\n", p1, p2);
    }
The generated assembly code:
    pair_2():
      movabs rax, 0x200000001f  # Just return 2 integers in rax
      ret
    main:                       # Note: simplified
      call [pair_2_ptr]         # Just call the function
      mov rdx, rax              # Use the values returned in rax
      shr rdx, 32
      mov rcx, rax
      mov esi, ecx
      call printf
Compiler Explorer: https://godbolt.org/z/9iXzSb
Pros:
- Just one line of code!
- No need to declare the local structure.
- Just like with the structures, returns up to 128 bits in registers, no stack is used.
Cons:
- A pair holds only two return values.
- Just like with the structures, the function can’t be overloaded.
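For a slightly more realistic use than two constants, here is a minimal sketch of a function that returns a quotient and a remainder as a pair. The names div_mod and div_mod_demo are hypothetical and used only for illustration. The sketch also shows the pre-C++17 way of reading the result through first and second.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: returns the quotient and the remainder together.
// The caller must make sure that b is not zero.
std::pair<int, int> div_mod(int a, int b) {
    return {a / b, a % b};
}

void div_mod_demo() {
    auto [q, r] = div_mod(17, 5);  // C++17 structured binding
    assert(q == 3);
    assert(r == 2);

    std::pair<int, int> p = div_mod(17, 5);  // works before C++17 too
    assert(p.first == 3);
    assert(p.second == 2);
}
```

The member names first and second say nothing about their meaning, which is one reason a named structure can read better once the values are not an obvious pair.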
Return Multiple Values Using an std::tuple
Example:
    #include <cstdio>
    #include <tuple>

    auto tuple_2() { return std::make_tuple(41, 42); }  // Just one line!

    // Use a volatile pointer so the compiler cannot inline the function
    auto (*volatile tuple_2_ptr)() = tuple_2;

    int main() {
      auto [t1, t2] = tuple_2_ptr();  // Structured binding declaration
      printf("tuple_2 t1 = %d, t2 = %d\n", t1, t2);
    }
The code compiles to:
    tuple_2():
      movabs rax, 0x290000002a  # Good start, but...
      mov [rdi], rax            # Indirect write to an output parameter?
      mov rax, rdi              # Return the address of the parameter
      ret
    main:                       # Note: simplified
      mov rdi, rsp              # Pass stack pointer as a parameter
      call [tuple_2_ptr]        # Call the function
      mov edx, [rsp]            # Get the values from the stack
      mov esi, [rsp + 4]
      call printf
Compiler Explorer: https://godbolt.org/z/hSVV72
Pros:
- The source code is a one-liner, just like with std::pair.
- Unlike std::pair, it is easy to add more values.
Cons:
- Unfortunately, the disassembly is a mixed bag. We need to pass the address of the output tuple to the function (one address per tuple).
- Even for two integers (64 bits), the return values are always on the stack. Slow 🙁
What if we return more values in the tuple? Adding more values does not change the disassembly much: we still pass just one address pointing to the stack, then we put the values at that address (on the stack), and then we load them back from the stack to use for printf().
It’s slower than the pair and the structure, which both return up to 128 bits in the registers. But it’s faster than the output parameters, where we need to pass a few addresses to the function, not just one.
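Adding a third value of a different type is where the tuple is most convenient. The sketch below uses the hypothetical names tuple_3 and tuple_3_demo; it unpacks the result with a structured binding and, as an alternative, with std::tie into variables that already exist.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical function returning three values of different types.
std::tuple<int, double, std::string> tuple_3() {
    return std::make_tuple(41, 4.5, std::string("forty-two"));
}

void tuple_3_demo() {
    auto [i, d, s] = tuple_3();  // C++17 structured binding
    assert(i == 41);
    assert(d == 4.5);
    assert(s == "forty-two");

    int i2 = 0;
    std::string s2;
    // std::tie assigns into existing variables; std::ignore skips a value.
    std::tie(i2, std::ignore, s2) = tuple_3();
    assert(i2 == 41);
    assert(s2 == "forty-two");
}
```

Because std::string is not trivially copyable, this particular tuple would be returned through memory regardless of how the standard library implements std::tuple.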
Key Takeaways
- The fastest methods to return multiple values in C++17 are a local structure and std::pair.
- Prefer std::pair to return two values: it is the most convenient and the fastest method.
- Use output parameters when function overloading is needed. That’s why std::from_chars() uses output parameters and a return structure.
Full source code: https://github.com/berestovskyy/applied-cpp
The Answer to the “Do I Know This Already?” Quiz
std::pair is the most convenient and fastest method to return two values. If we need to return more than two values, a local structure (faster) or std::tuple (more convenient) should be used instead.