Out-of-Order Executor library. Yet another approach to an efficient and scalable tasking API and task scheduling.
- Requirements: Install cmake, make, googletest, google benchmark, and optionally the TBB (recommended) and TaskFlow libraries into your environment.
- Build & run:
  make
- Install:
  make install
With nested parallelism, blocking-style programming is deadlock-prone and has latency problems. OOX provides a semantic way out of these issues.
std::future is not intended for continuation tasks. Even with existing proposals like .then(), the continuation style is limited, while OOX offers:
- Implicitly collapse template recursion of 'futures':
  future x = async(async(async(…)));
- Implicitly unpack 'futures' to a value in arguments:
  async([](int a){}, async(…));
- Implicit value conversion for a 'future' variable, e.g.:
  future<int> x{2};
- Implicitly build dependencies based on arguments: no blocking synchronization in the algorithm
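For comparison, here is a minimal sketch of what the first point looks like with the standard library alone. When a task's result is itself asynchronous, std::async yields a nested std::future<std::future<int>>, and the caller has to unwrap each level by hand. The helper names nested_answer and unwrap_by_hand are hypothetical and exist only for this illustration; the deferred launch policy is chosen just to keep the sketch deterministic.

```cpp
#include <cassert>
#include <future>

// Hypothetical helper: a task whose result is itself a future.
// With plain std::async the nesting shows up in the type.
inline std::future<std::future<int>> nested_answer() {
    return std::async(std::launch::deferred, [] {
        return std::async(std::launch::deferred, [] { return 42; });
    });
}

// Each level of nesting needs its own explicit get().
inline int unwrap_by_hand() {
    std::future<std::future<int>> outer = nested_answer();
    std::future<int> inner = outer.get();  // first explicit wait
    return inner.get();                    // second explicit wait
}
```

Each get() is a potential blocking point, which is the pattern the list above says OOX avoids by collapsing the nesting and building dependencies from the arguments.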
This approach enables beautifully concise recursive parallelism programming like:

    oox::var<int> Fib(int n) { // OOX: High-level continuation style programming
        if(n < 2) return n;
        auto left = oox::run(Fib, n-1);
        return oox::run(std::plus<int>(), std::move(left), Fib(n-2));
    }

In contrast, both the Intel Threading Building Blocks (TBB) and TaskFlow examples block, which makes them slower than OOX, besides requiring more verbose code:

    int Fib(int n) { // TBB: High-level blocking style programming
        if(n < 2) return n;
        int left, right;
        tbb::parallel_invoke( // blocks here
            [&] { left = Fib(n-1); },
            [&] { right = Fib(n-2); }
        );
        return left + right;
    }

and

    int Fib(int n, tf::Subflow& sbf) { // TaskFlow: High-level blocking style programming
        if (n < 2) return n;
        int res1, res2;
        sbf.emplace([&res1, n] (tf::Subflow& sbf) { res1 = Fib(n - 1, sbf); } );
        sbf.emplace([&res2, n] (tf::Subflow& sbf) { res2 = Fib(n - 2, sbf); } );
        sbf.join(); // blocks here
        return res1 + res2;
    }

- oox::var<T>: Basic representation of data in the OOX graph. In concept, a new form of std::future for continuations. It carries both a value and dependency info.
- using oox::node = oox::var<void>: carries solely dependency info.
- oox::var<T> oox::run(T(Func&)(...), Args...): Basic tasking API. It spawns a task when the arguments are ready and returns an oox::var as a promise to provide the result of Func in the future. If some oox::var arguments are not ready yet (i.e. they are "promises" themselves), it creates a continuation task that depends on the completion of the pending oox::var arguments.
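To make the argument-unpacking idea concrete, the following is a toy model built only on the standard library. It is not OOX's implementation: toy::var, toy::unwrap, and toy::run are hypothetical names, the handle is just a std::shared_future, and evaluation is lazy (deferred) rather than scheduled on a thread pool. It only shows how a runner can accept a mix of plain values and pending handles and hand the functor plain values.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <type_traits>
#include <utility>

namespace toy {  // hypothetical namespace, for illustration only

// Stand-in for a value-plus-dependency handle.
template <class T>
using var = std::shared_future<T>;

// Plain arguments pass through unchanged.
template <class T>
const T& unwrap(const T& plain) { return plain; }

// Handles are replaced by the value they carry.
template <class T>
const T& unwrap(const var<T>& pending) { return pending.get(); }

// Returns a handle to the decayed result type of f. The functor sees
// only plain values; pending arguments are resolved when the result
// is first requested.
template <class F, class... Args>
auto run(F f, Args... args) {
    using R = std::decay_t<decltype(f(unwrap(args)...))>;
    return var<R>(std::async(std::launch::deferred,
        [f, args...]() mutable -> R { return f(unwrap(args)...); }).share());
}

}  // namespace toy
```

With this sketch, toy::run(std::plus<int>(), left, 40) accepts left as a handle produced by an earlier toy::run call, and the functor std::plus<int> never needs to know that one of its inputs was pending.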
Pillars:
- Abstract user functor from async dependencies:
  oox::run([](functor args...){}, dependency args...)
- Reuse functor and runner arg types matching for dependency type specification
- Flow, output, and anti-dependencies are deduced
- Makes duplication of arguments finally useful
- Have clear serialization semantics for ease of debugging
Matching rules:
- plain arguments:
  - Follow C++ rules: everything is passed as a decay copy
  - Use std::ref and std::cref for passing by reference (and take responsibility for the lifetime)
- oox::var arguments:
  - Similar to std::ref: usually passed by reference, but it takes care of lifetime and access synchronization
  - oox::var usage should be no different from the usage of plain types
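The rule for plain arguments is the same convention the standard library uses in std::bind and std::thread. A minimal sketch with std::bind (the function name bump_via_copy_and_ref is hypothetical) shows the difference between a decay copy and std::ref:

```cpp
#include <cassert>
#include <functional>

// Returns the final value of a local counter after "bumping" it twice:
// once through a decay copy and once through std::ref.
inline int bump_via_copy_and_ref() {
    int counter = 0;
    auto bump = [](int& c) { ++c; };

    auto on_copy = std::bind(bump, counter);           // stores a decay copy
    auto on_ref  = std::bind(bump, std::ref(counter)); // stores a reference wrapper

    on_copy();  // increments the stored copy, not counter
    on_ref();   // increments counter itself
    return counter;
}
```

Only the call that went through std::ref changes the original variable, and the caller is responsible for keeping counter alive for as long as on_ref may be invoked.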
Stored types:
- oox::run returns an oox::var of the decayed functor return type, copy- or move-initialized.
- oox::var doesn't store references. Use std::reference_wrapper or pointer types instead.
  - Don't compile oox_var if not is_same<T, std::decay<T>::type>
- oox::var is reference-counted, so it is safe to let it go out of scope.
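The is_same condition above can be tried out in isolation. The sketch below uses a hypothetical trait is_storable and a hypothetical wrapper value_box (neither is part of OOX) to show which types pass the check and how a static_assert would reject the others at compile time.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Hypothetical trait: true when T is already its own decayed type,
// i.e. no reference, no top-level const/volatile, no array or function type.
template <class T>
constexpr bool is_storable() {
    return std::is_same<T, typename std::decay<T>::type>::value;
}

// Hypothetical wrapper that refuses to compile for non-decayed types.
template <class T>
struct value_box {
    static_assert(is_storable<T>(), "store a decayed type, not a reference");
    T value;
};
```

Under this check, int, int* and std::reference_wrapper<int> are accepted, while int& and const int are rejected, which matches the advice to use std::reference_wrapper or pointer types when reference semantics are needed.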
Access types of oox::var's stored value:
- Read-Write: Exclusive access: both matching arguments are passed by reference
- Read-Only: Shared access:
  auto f = [](const A&){} with oox::var<A>& a or const oox::var<A>& a while running oox::run(f, a)
- Copy-Only: Copy access: like read-only but does not hold off following read-write access after a copy is done,
  auto f = [](A){} or auto f = [](A&&){} with oox::var<A>& a or const oox::var<A>& a while running oox::run(f, a)
- Final: dispose after use:
  oox::var<A>&& a
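The functor side of these rules can be expressed as a small type trait. The sketch below is an assumption about how such a classification could be written, not OOX's actual code: the names access, access_of, and classify are hypothetical, and the "Final" case is left out because it depends on how the oox::var argument itself is passed rather than on the functor parameter.

```cpp
#include <cassert>

// Hypothetical classification of a functor parameter type.
enum class access { read_write, read_only, copy_only };

// By-value parameter A: the functor works on its own copy.
template <class P>
struct access_of { static constexpr access value = access::copy_only; };

// A&: the functor may modify the stored value, so access is exclusive.
template <class P>
struct access_of<P&> { static constexpr access value = access::read_write; };

// const A&: the functor only reads, so access can be shared.
template <class P>
struct access_of<const P&> { static constexpr access value = access::read_only; };

// A&&: treated like a by-value parameter in this sketch.
template <class P>
struct access_of<P&&> { static constexpr access value = access::copy_only; };

template <class P>
constexpr access classify() { return access_of<P>::value; }
```

For example, classify<const int&>() yields access::read_only, while classify<int&>() yields access::read_write.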