Avoid building a static library for CPU target code #23

@ggeorgakoudis

Description

Offloading with target on the CPU requires creating a self-contained shared library for the CPU code. Numba lowering inserts calls to NRT or helperlib functions, which are not exported by Numba-compiled shared libraries. Additionally, we need to initialize the NRT memory system (NRT_MemSys_Init) to make it usable from the standalone CPU target code.

Currently, we build a static library by pulling sources from the Numba installation, inserting a constructor that initializes the memory system, and linking that static library with the shared library containing the CPU target code. Packaging a wheel pulls those sources from the specific Numba version installed at build time. This works across Numba versions because those APIs and their implementations are fairly stable, but it's not ideal.
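
For context, here is a minimal sketch of what that packaging step might look like. It is not the actual PyOMP build code: the build_static_library helper, the cc/ar toolchain invocation, the include paths, and the exact NRT initialization symbol are all assumptions, and the real source file locations depend on the installed Numba version.

```python
# Hedged sketch: pull NRT/helperlib sources from the installed Numba, add a
# constructor that initializes the NRT memory system, and archive everything
# into a static library. Paths, tool names, and the NRT symbol are assumptions.
import os
import subprocess
import sysconfig
import numba

NUMBA_DIR = os.path.dirname(numba.__file__)

# Constructor baked into the static library; it runs when the CPU target
# code's shared library is loaded, so NRT is initialized without any help
# from the Python side.
INIT_C = r"""
extern void NRT_MemSys_init(void);  /* assumed symbol name; match the installed Numba */
__attribute__((constructor))
static void pyomp_nrt_init(void) { NRT_MemSys_init(); }
"""

def build_static_library(sources, out_path="libpyompnrt.a"):
    with open("pyomp_nrt_init.c", "w") as f:
        f.write(INIT_C)
    objects = []
    for src in sources + ["pyomp_nrt_init.c"]:
        obj = os.path.splitext(os.path.basename(src))[0] + ".o"
        subprocess.check_call([
            "cc", "-fPIC", "-c", src, "-o", obj,
            "-I", NUMBA_DIR,                            # Numba headers (nrt.h, ...)
            "-I", sysconfig.get_paths()["include"],     # Python.h, used by helperlib
        ])
        objects.append(obj)
    subprocess.check_call(["ar", "rcs", out_path] + objects)
    return out_path

# Example use, with a source location that is only a guess:
# build_static_library([os.path.join(NUMBA_DIR, "core", "runtime", "nrt.cpp")])
```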

I can think of the following alternatives:

  1. PyOMP builds the static library at JIT time instead of packaging it, pulling sources from the Numba installation. PyOMP can cache that library to avoid rebuilding it for subsequent CPU target regions (a rough sketch of this follows the list).
  2. Numba provides a static library (or an LLVM bitcode library)¹ that PyOMP links at JIT time.
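
For alternative 1, the build could be keyed on the installed Numba version and cached, so later CPU target regions (and later processes) skip the rebuild. A rough sketch under the same assumptions as above, reusing the hypothetical build_static_library helper and an assumed cache location:

```python
# Hedged sketch of alternative 1: build the static library lazily at JIT time
# and cache it keyed on the Numba version so it is reused by subsequent CPU
# target regions. build_static_library is the hypothetical helper from the
# previous sketch, not an actual PyOMP or Numba API.
import os
import shutil
import numba

CACHE_DIR = os.path.expanduser("~/.cache/pyomp")  # assumed cache location

def get_nrt_static_library(sources):
    cached = os.path.join(CACHE_DIR, f"libpyompnrt-{numba.__version__}.a")
    if not os.path.exists(cached):
        os.makedirs(CACHE_DIR, exist_ok=True)
        shutil.move(build_static_library(sources), cached)
    return cached  # linked into the shared library built for the CPU target region
```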

Footnotes

  1. A bitcode library would require the user to have Python development headers, which is not ideal.
