Commit a8b323e

docs: update migration guide
PR-URL: #1000
Reviewed-by: Athan Reines <kgryte@gmail.com>
Reviewed-by: Lucas Colley
1 parent 94bb5ba commit a8b323e

1 file changed

Lines changed: 68 additions & 65 deletions

spec/draft/migration_guide.md

@@ -2,115 +2,116 @@
22

33
# Migration Guide
44

5-
This page is meant to help migrate your codebase to an Array API compliant
6-
implementation. The guide is divided into two parts and, depending on your
7-
exact use-case, you should look thoroughly into at least one of them.
5+
This page is meant to help migrate your codebase to an array API standard
6+
compliant implementation or become interoperable with compliant
7+
implementations. The guide is divided into three parts.
88

9-
The first part is dedicated for {ref}`array-producers`. If your library
10-
mimics, for example, NumPy's or Dask's functionality, then you can find in
11-
the first part additional instructions and guidance on how to ensure
12-
downstream users can easily pick your solution as an array provider for
13-
their system/algorithm.
9+
The first part gives an overview of the {ref}`ecosystem` libraries that
10+
are helpful in different contexts when working with the array API standard.
1411

15-
The second part delves into details for Array API compatibility for
12+
The second part is dedicated to {ref}`array-producers`. If your library
13+
mimics, for example, NumPy's or PyTorch's functionality, you can find
14+
additional instructions and guidance here on how to ensure downstream users
15+
can easily pick your solution as an array provider for their system/algorithm.
16+
17+
The third part delves into details for array API standard compatibility for
1618
{ref}`array-consumers`. This pertains to any software that performs
1719
multidimensional array manipulation in Python, such as may be found in
1820
scikit-learn, SciPy, or statsmodels. If your software relies on a certain
1921
array producing library, such as NumPy or JAX, then you can use the third
20-
part to learn how to make it library agnostic and interchange array
21-
namespaces with significantly less friction.
22+
part to learn how to make it library agnostic and, as a result, use array
23+
namespaces interchangeably with significantly less friction.
24+
25+
26+
(ecosystem)=
2227

2328
## Ecosystem
2429

25-
Apart from the documented standard, the Array API ecosystem also provides
30+
Apart from the documented standard, the array API ecosystem also provides
2631
a set of tools and packages to help you with the migration process:
2732

2833

2934
(array-api-compat)=
3035

31-
### Array API Compat
36+
### array-api-compat
3237

3338
GitHub: [array-api-compat](https://github.com/data-apis/array-api-compat)
3439

3540
User group: Array Consumers
3641

37-
Although NumPy, Dask, CuPy, and PyTorch support the Array API Standard, there
38-
are still some corner cases where their behavior diverges from the standard.
39-
`array-api-compat` provides a compatibility layer to cover these cases.
40-
This is also accompanied by a few utility functions for easier introspection
41-
into array objects. As an array consumer, you can still rely on the original
42-
API while having access to the standard compatible one.
42+
Although NumPy and CuPy support the array API standard, there are still some
43+
corner cases where their behavior diverges from the standard.
44+
`array-api-compat` provides a compatibility layer to cover an additional subset
45+
of such corner cases for supported libraries. This is also accompanied by a few
46+
utility functions for easier introspection into array objects. As an array
47+
consumer, you can consume standard-compliant namespaces as well as the wrapped
48+
namespaces in `array-api-compat` at the same time.
4349

4450

4551
(array-api-strict)=
4652

47-
### Array API Strict
53+
### array-api-strict
4854

4955
GitHub: [array-api-strict](https://github.com/data-apis/array-api-strict)
5056

51-
User group: Array Consumers, Array Producers (for testing)
57+
User group: Array Consumers
5258

5359
`array-api-strict` is a library that provides a strict and minimal
54-
implementation of the Array API Standard. For array producers, it is designed
55-
to be used as a reference implementation for testing and development purposes.
56-
You can compare your API calls with `array-api-strict` counterparts and
57-
ensure that your library is fully compliant with the standard and can
58-
serve as a reliable reference for other developers in the ecosystem.
59-
For consumers, you can use `array-api-strict` during the development as an
60-
array provider to ensure your code uses APIs compliant with the standard.
60+
implementation of the array API standard. As a consumer, you can use
61+
`array-api-strict` in parametrising tests over the array namespace
62+
to ensure your code uses only APIs that are compliant with the standard.
6163

6264

6365
(array-api-tests)=
6466

65-
### Array API Test
67+
### array-api-tests
6668

6769
GitHub: [array-api-tests](https://github.com/data-apis/array-api-tests)
6870

6971
User group: Array Producers
7072

7173
`array-api-tests` is a collection of tests that can be used to verify the
72-
compliance of your library with the Array API Standard. It includes tests
74+
compliance of your library with the array API standard. It includes tests
7375
for array producers, covering a wide range of functionalities and use cases.
7476
By running these tests, you can ensure that your library adheres to the
7577
standard and can be used with compatible array consumer libraries.
7678

7779

7880
(array-api-extra)=
7981

80-
### Array API Extra
82+
### array-api-extra
8183

8284
GitHub: [array-api-extra](https://github.com/data-apis/array-api-extra)
8385

8486
User group: Array Consumers
8587

8688
`array-api-extra` is a collection of additional utilities and tools that are
87-
missing from the Array API Standard but can be useful for compliant array
88-
consumers. It includes additional array manipulation and statistical functions.
89-
It is already used by SciPy and scikit-learn.
90-
91-
The sections below mention when and how to use them.
89+
not present in the array API standard but can be useful for compliant array
90+
consumers. It includes additional array manipulation and statistical
91+
functions, support for lazy backends, and useful testing utilities. It is
92+
already used by SciPy and scikit-learn.
9293

9394

9495
(array-producers)=
9596

9697
## Array Producers
9798

9899
For array producers, the central task during the development/migration process
99-
is ensuring that the user-facing API adheres to the Array API Standard.
100+
is ensuring that the user-facing API adheres to the array API standard.
100101

101102
The complete API of the standard is documented in the
102103
[API specification](https://data-apis.org/array-api/latest/API_specification/index.html).
103104

104105
There, each function, constant, and object is described with details
105106
on parameters, return values, and special cases.
106107

107-
### Testing against Array API
108+
### Testing against array API
108109

109110
There are two main ways to test your API for compliance: either using
110111
the `array-api-tests` suite or testing your API manually against the
111112
`array-api-strict` reference implementation.
112113

113-
#### Array API Test suite (Recommended)
114+
#### array-api-tests suite (Recommended)
114115

115116
{ref}`array-api-tests` is a test suite which verifies that your API
116117
adheres to the standard. For each function or method, it confirms
@@ -126,10 +127,10 @@ cover only the minimal workflow:
126127
variable to your package import name.
127128
3. Inside the `array-api-tests` directory, run `pytest`. There are
128129
multiple useful options delivered by the test suite. A few worth mentioning:
129-
- `--max-examples=1000` - maximal number of test cases to generate when using
130+
- `--max-examples=1000` - maximum number of test cases to generate when using
130131
hypothesis. This allows you to balance between execution time of the test
131132
suite and thoroughness of the testing. It's advised to use as many examples
132-
as the time buget can fit. Each test case is a random combination of
133+
as the time budget can fit. Each test case is a random combination of
133134
possible inputs: the more cases, the higher the chance of finding an
134135
unsupported edge case.
135136
- With the `--xfails-file` option, you can describe which tests are expected
@@ -144,13 +145,15 @@ cover only the minimal workflow:
144145
option is to skip these for the time being.
145146

146147
We strongly advise you to embed this setup in your CI as well. This will allow
147-
you to continuously monitor Array API coverage, and make sure new changes don't break existing
148-
APIs. As a reference, see [NumPy's Array API Tests CI setup](https://github.com/numpy/numpy/blob/581d10f43b539a189a2d37856e5130464de9e5f6/.github/workflows/linux.yml#L296).
148+
you to continuously monitor array API standard coverage, and make sure new
149+
changes don't break existing APIs. As a reference, see
150+
[NumPy's array-api-tests CI setup](https://github.com/numpy/numpy/blob/581d10f43b539a189a2d37856e5130464de9e5f6/.github/workflows/linux.yml#L296)
151+
and [a Pixi workspace setup](https://github.com/mdhaber/mparray/blob/0ef47e008fef92c605f73907436d4c6617419161/pixi.toml#L119-L179).
149152

150153

151-
#### Array API Strict
154+
#### array-api-strict
152155

153-
A simpler, and more manual, way of testing Array API coverage is to
156+
A simpler, and more manual, way of testing array API standard coverage is to
154157
run your API calls along with the {ref}`array-api-strict` Python implementation.
155158

156159
This way, you can ensure that the outputs coming from your API match the minimal
@@ -163,10 +166,9 @@ cases.
163166

164167
## Array Consumers
165168

166-
For array consumers, the main premise is to keep in mind that your **array
167-
manipulation operations should not lock in for a particular array producing
168-
library**. For instance, if you use NumPy for arrays, then your code could
169-
contain:
169+
For array consumers, the main premise is that your **array manipulation operations
170+
should not be specific to one particular array producing library**. For instance,
171+
if your code is specific to NumPy, it might contain:
170172

171173
```python
172174
import numpy as np
@@ -178,12 +180,12 @@ return np.dot(c, b)
178180
```
179181

180182
The first step should be as simple as assigning the `np` namespace to a dedicated
181-
namespace variable. The convention used in the ecosystem is to name it `xp`. Then,
182-
it is vital to ensure that each method and function call is something that the Array API
183-
supports. For example, `dot` is present in the NumPy's API, but the standard
184-
doesn't support it. For the sake of simplicity, let's assume both `c` and `b`
185-
are `ndim=2`; therefore, we select `tensordot` instead, as both NumPy and the
186-
standard define it:
183+
namespace variable. The convention used in the ecosystem is to name it `xp`.
184+
Then, it is vital to ensure that each method and function call is something that
185+
the array API standard supports. For example, `dot` is present in the NumPy
186+
API, but the standard doesn't support it. For the sake of simplicity, let's
187+
assume both `c` and `b` are `ndim=2`; therefore, we select `tensordot` instead,
188+
as both NumPy and the standard define it:
187189

188190
```python
189191
import numpy as np
@@ -196,18 +198,19 @@ c = xp.mean(a, axis=0)
196198
return xp.tensordot(c, b, axes=1)
197199
```
198200

199-
At this point, replacing one backend with another one should only require providing a different
200-
namespace, such as `xp = torch` (e.g., via an environment variable). This can be useful
201-
if you're writing a script or in your custom software. The other alternatives are:
201+
At this point, replacing one backend with another one should only require
202+
providing a different namespace, such as `xp = torch` (e.g., via an environment
203+
variable). This can be useful if you're writing a script or your own custom
204+
software. The other alternatives are:
202205

203-
- If you are building a library where the backend is determined by input arrays,
204-
and your function accepts array arguments, then a recommended way is to ask
205-
your input arrays for a namespace to use: `xp = arr.__array_namespace__()`.
206-
If the given library doesn't have it, then [`array_api_compat.array_namespace()`](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.array_namespace)
207-
should be used instead:
206+
- If you are building a library where the backend is determined by input
207+
arrays, and your function accepts array arguments, then a recommended way to
208+
fetch the namespace is to use [`array_api_compat.array_namespace()`](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.array_namespace).
209+
In case you don't want to introduce a new package dependency, you can rely
210+
on a plain `xp = arr.__array_namespace__()`:
208211
```python
209212
def func(array1, scalar1, scalar2):
210-
    xp = array1.__array_namespace__()  # or array_namespace(array1)
213+
    xp = array_namespace(array1)  # or array1.__array_namespace__()
211214
    return xp.arange(scalar1, scalar2) @ array1
212215
```
213216
- For a function that accepts scalars and returns arrays, use namespace `xp` as
@@ -227,7 +230,7 @@ offers a set of useful utility functions, such as:
227230
- [array_namespace()](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.array_namespace)
228231
for fetching the namespace based on input arrays.
229232
- [is_array_api_obj()](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.is_array_api_obj)
230-
for inspecting whether a given object is Array API compatible.
233+
for inspecting whether a given object is array API compatible.
231234
- [device()](https://data-apis.org/array-api-compat/helper-functions.html#array_api_compat.device)
232235
for retrieving the device on which an array resides.
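
To make the namespace lookup described above more concrete, here is a minimal, standard-library-only sketch of the `__array_namespace__` protocol that consumer code relies on. `ToyArray`, `toy_xp`, and `get_namespace` are hypothetical names invented for this illustration. This is not how `array_api_compat.array_namespace()` is implemented; it only shows the general idea of asking the input arrays which namespace to use.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a real array library module (illustration only).
toy_xp = SimpleNamespace(name="toy_xp")


class ToyArray:
    """Hypothetical array type exposing the standard's namespace hook."""

    def __init__(self, data):
        self.data = list(data)

    def __array_namespace__(self, /, *, api_version=None):
        # A real library would return its standard-compliant namespace here.
        return toy_xp


def get_namespace(*arrays):
    """Return the one namespace shared by all array arguments (sketch)."""
    if not arrays:
        raise ValueError("at least one array is required")
    xp = None
    for arr in arrays:
        hook = getattr(arr, "__array_namespace__", None)
        if hook is None:
            raise TypeError(
                f"{type(arr).__name__} does not implement __array_namespace__"
            )
        candidate = hook()
        if xp is None:
            xp = candidate
        elif candidate is not xp:
            # Mixing arrays from different libraries is rejected in this sketch.
            raise TypeError("input arrays come from different namespaces")
    return xp


a = ToyArray([1, 2, 3])
b = ToyArray([4, 5, 6])
xp = get_namespace(a, b)
```

In real consumer code, `array_api_compat.array_namespace()` is likely the better choice, since it also covers libraries whose arrays do not expose the hook themselves.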
233236
