Merge tag 'v0.12.0' into custom-objectives
Files changed:
- README.md +8 -14
- docs/examples.md +35 -1
- pysr/__init__.py +1 -0
- pysr/julia_helpers.py +3 -0
- pysr/sklearn_monkeypatch.py +13 -0
- pysr/sr.py +41 -6
- pysr/test/test.py +17 -0
- pysr/version.py +2 -2
README.md (CHANGED)

````diff
@@ -8,7 +8,6 @@ https://user-images.githubusercontent.com/7593028/188328887-1b6cda72-2f41-439e-a
 
 </div>
 
-
 PySR uses evolutionary algorithms to search for symbolic expressions which optimize a particular objective.
 
 <div align="center">
@@ -19,13 +18,11 @@ PySR uses evolutionary algorithms to search for symbolic expressions which optim
 
 </div>
 
-
 (pronounced like *py* as in python, and then *sur* as in surface)
 
 If you find PySR useful, please cite it using the citation information given in [CITATION.md](https://github.com/MilesCranmer/PySR/blob/master/CITATION.md).
 If you've finished a project with PySR, please submit a PR to showcase your work on the [Research Showcase page](https://astroautomata.com/PySR/papers)!
 
-
 <div align="center">
 
 ### Test status
@@ -33,10 +30,9 @@ If you've finished a project with PySR, please submit a PR to showcase your work
 | **Linux** | **Windows** | **macOS (intel)** |
 |---|---|---|
 |[](https://github.com/MilesCranmer/PySR/actions/workflows/CI.yml)|[](https://github.com/MilesCranmer/PySR/actions/workflows/CI_Windows.yml)|[](https://github.com/MilesCranmer/PySR/actions/workflows/CI_mac.yml)|
-| **Docker** | **Conda** | **Coverage** |
+| **Docker** | **Conda** | **Coverage** |
 |[](https://github.com/MilesCranmer/PySR/actions/workflows/CI_docker.yml)|[](https://github.com/MilesCranmer/PySR/actions/workflows/CI_conda_forge.yml)|[](https://coveralls.io/github/MilesCranmer/PySR)|
 
-
 </div>
 
 PySR is built on an extremely optimized pure-Julia backend: [SymbolicRegression.jl](https://github.com/MilesCranmer/SymbolicRegression.jl).
@@ -47,14 +43,13 @@ to find algebraic relations that approximate a dataset.
 
 One can also
 extend these approaches to higher-dimensional
-spaces by using a neural network as proxy, as explained in
+spaces by using a neural network as proxy, as explained in
 [2006.11287](https://arxiv.org/abs/2006.11287), where we apply
 it to N-body problems. Here, one essentially uses
 symbolic regression to convert a neural net
 to an analytic equation. Thus, these tools simultaneously present
 an explicit and powerful way to interpret deep models.
 
-
 *Backstory:*
 
 Previously, we have used
@@ -68,19 +63,18 @@ of this package is to have an open-source symbolic regression tool
 as efficient as eureqa, while also exposing a configurable
 python interface.
 
-
 # Installation
 
 <div align="center">
 
 | pip - **recommended** <br> (works everywhere) | conda <br>(Linux and Intel-based macOS) | docker <br>(if all else fails) |
 |---|---|---|
-| 1. [Install Julia](https://julialang.org/downloads/)<br>2. Then, run: `pip install -U pysr`<br>3. Finally, to install Julia packages:<br>`
+| 1. [Install Julia](https://julialang.org/downloads/)<br>2. Then, run: `pip install -U pysr`<br>3. Finally, to install Julia packages:<br>`python3 -c 'import pysr; pysr.install()'` | `conda install -c conda-forge pysr` | 1. Clone this repo.<br>2. `docker build -t pysr .`<br>Run with:<br>`docker run -it --rm pysr ipython`
 
 </div>
 
 Common issues tend to be related to Python not finding Julia.
-To debug this, try running `
+To debug this, try running `python3 -c 'import os; print(os.environ["PATH"])'`.
 If none of these folders contain your Julia binary, then you need to add Julia's `bin` folder to your `PATH` environment variable.
 
 **Running PySR on macOS with an M1 processor:** you should use the pip version, and make sure to get the Julia binary for ARM/M-series processors.
@@ -136,7 +130,7 @@ model.fit(X, y)
 
 Internally, this launches a Julia process which will do a multithreaded search for equations to fit the dataset.
 
-Equations will be printed during training, and once you are satisfied, you may
+Equations will be printed during training, and once you are satisfied, you may
 quit early by hitting 'q' and then \<enter\>.
 
 After the model has been fit, you can run `model.predict(X)`
@@ -167,9 +161,9 @@ This arrow in the `pick` column indicates which equation is currently selected b
 `model_selection` strategy for prediction.
 (You may change `model_selection` after `.fit(X, y)` as well.)
 
-`model.equations_` is a pandas DataFrame containing all equations, including callable format
+`model.equations_` is a pandas DataFrame containing all equations, including callable format
 (`lambda_format`),
-SymPy format (`sympy_format` - which you can also get with `model.sympy()`), and even JAX and PyTorch format
+SymPy format (`sympy_format` - which you can also get with `model.sympy()`), and even JAX and PyTorch format
 (both of which are differentiable - which you can get with `model.jax()` and `model.pytorch()`).
 
 Note that `PySRRegressor` stores the state of the last search, and will restart from where you left off the next time you call `.fit()`, assuming you have set `warm_start=True`.
@@ -181,7 +175,7 @@ You may load the model from the `pkl` file with:
 
 ```python
 model = PySRRegressor.from_file("hall_of_fame.2022-08-10_100832.281.pkl")
-```
+```
 
 There are several other useful features such as denoising (e.g., `denoising=True`),
 feature selection (e.g., `select_k_features=3`).
````
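As a companion to the README changes above, here is a minimal end-to-end sketch of the workflow the README describes (fit, inspect `model.equations_`, export to SymPy). It assumes only the public `PySRRegressor` API named in the README; the toy dataset is illustrative and not part of the diff.

```python
import numpy as np
from pysr import PySRRegressor

# Illustrative toy data (not from the diff): y = 2.5 * cos(x3) + x0^2 - 0.5
X = np.random.randn(100, 5)
y = 2.5 * np.cos(X[:, 3]) + X[:, 0] ** 2 - 0.5

model = PySRRegressor(
    niterations=40,
    binary_operators=["+", "*"],
    unary_operators=["cos"],
)
model.fit(X, y)            # launches the multithreaded Julia search

print(model.equations_)    # pandas DataFrame of discovered equations
print(model.sympy())       # best equation in SymPy form
y_pred = model.predict(X)  # predict with the currently selected equation
```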
docs/examples.md (CHANGED)

````diff
@@ -284,7 +284,41 @@ You can get the sympy version of the best equation with:
 model.sympy()
 ```
 
-## 8. Additional features
+## 8. Complex numbers
+
+PySR can also search for complex-valued expressions. Simply pass
+data with a complex datatype (e.g., `np.complex128`),
+and PySR will automatically search for complex-valued expressions:
+
+```python
+import numpy as np
+
+X = np.random.randn(100, 1) + 1j * np.random.randn(100, 1)
+y = (1 + 2j) * np.cos(X[:, 0] * (0.5 - 0.2j))
+
+model = PySRRegressor(
+    binary_operators=["+", "-", "*"], unary_operators=["cos"], niterations=100,
+)
+
+model.fit(X, y)
+```
+
+You can see that all of the learned constants are now complex numbers.
+We can get the sympy version of the best equation with:
+
+```python
+model.sympy()
+```
+
+We can also make predictions normally, by passing complex data:
+
+```python
+model.predict(X, -1)
+```
+
+to make predictions with the most accurate expression.
+
+## 9. Additional features
 
 For the many other features available in PySR, please
 read the [Options section](options.md).
````
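As a quick follow-up to the new complex-numbers example, this sketch (assuming the `model`, `X`, and `y` from the snippet above) checks that predictions come back with a complex dtype and stay close to the target; it only uses the `model.predict(X, -1)` call shown in the example.

```python
import numpy as np

# Assumes `model`, `X`, `y` from the complex-valued example above.
y_pred = model.predict(X, -1)  # -1 selects the most accurate expression

print(y_pred.dtype)                      # complex dtype, e.g. complex128
print(np.mean(np.abs(y_pred - y) ** 2))  # mean squared error in the complex plane
```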
pysr/__init__.py (CHANGED)

````diff
@@ -1,3 +1,4 @@
+from . import sklearn_monkeypatch
 from .version import __version__
 from .sr import (
     pysr,
````
pysr/julia_helpers.py (CHANGED)

````diff
@@ -194,6 +194,9 @@ def init_julia(julia_project=None, quiet=False, julia_kwargs=None, return_aux=Fa
         # Static python binary, so we turn off pre-compiled modules.
         julia_kwargs = {**julia_kwargs, "compiled_modules": False}
         Julia(**julia_kwargs)
+        warnings.warn(
+            "Your system's Python library is static (e.g., conda), so precompilation will be turned off. For a dynamic library, try `pyenv`."
+        )
 
     using_compiled_modules = (not "compiled_modules" in julia_kwargs) or julia_kwargs[
         "compiled_modules"
````
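The warning added above fires when PyJulia falls back to `compiled_modules=False` because the Python interpreter links libpython statically (common with conda builds). A standard-library sketch, not part of PySR, for checking your own interpreter ahead of time:

```python
import sysconfig

# A truthy value means Python was built against a shared libpython,
# so PyJulia can keep precompiled modules enabled; 0 or None suggests
# a static build (e.g., some conda Pythons), which triggers the warning above.
print(sysconfig.get_config_var("Py_ENABLE_SHARED"))
```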
pysr/sklearn_monkeypatch.py (ADDED)

````diff
@@ -0,0 +1,13 @@
+# Here, we monkey patch scikit-learn until this
+# issue is fixed: https://github.com/scikit-learn/scikit-learn/issues/25922
+from sklearn.utils import validation
+
+
+def _ensure_no_complex_data(*args, **kwargs):
+    ...
+
+
+try:
+    validation._ensure_no_complex_data = _ensure_no_complex_data
+except AttributeError:
+    ...
````
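For context on why this new module exists: scikit-learn's `check_array` rejects complex input through the private helper `validation._ensure_no_complex_data`, so replacing that helper with a no-op lets complex-valued `X` and `y` reach PySR. A sketch of the before/after behaviour, assuming a scikit-learn version that still has this private helper:

```python
import numpy as np
from sklearn.utils import check_array, validation

X = np.array([[1.0 + 1.0j], [2.0 - 0.5j]])

try:
    check_array(X)  # stock scikit-learn: ValueError("Complex data not supported ...")
except ValueError as err:
    print("unpatched:", err)

# The same no-op patch that `pysr.sklearn_monkeypatch` applies on import:
validation._ensure_no_complex_data = lambda *args, **kwargs: None

check_array(X)  # now passes, so complex arrays can flow through validation
```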
pysr/sr.py (CHANGED)

````diff
@@ -1,5 +1,6 @@
 """Define the PySRRegressor scikit-learn interface."""
 import copy
+from io import StringIO
 import os
 import sys
 import numpy as np
@@ -518,6 +519,8 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
         What precision to use for the data. By default this is `32`
         (float32), but you can select `64` or `16` as well, giving
         you 64 or 16 bits of floating point precision, respectively.
+        If you pass complex data, the corresponding complex precision
+        will be used (i.e., `64` for complex128, `32` for complex64).
         Default is `32`.
     random_state : int, Numpy RandomState instance or None
         Pass an int for reproducible results across multiple function calls.
@@ -1647,7 +1650,13 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
         )
 
         # Convert data to desired precision
-        np_dtype = {16: np.float16, 32: np.float32, 64: np.float64}[self.precision]
+        test_X = np.array(X)
+        is_complex = np.issubdtype(test_X.dtype, np.complexfloating)
+        is_real = not is_complex
+        if is_real:
+            np_dtype = {16: np.float16, 32: np.float32, 64: np.float64}[self.precision]
+        else:
+            np_dtype = {32: np.complex64, 64: np.complex128}[self.precision]
 
         # This converts the data into a Julia array:
         Main.X = np.array(X, dtype=np_dtype).T
@@ -1788,9 +1797,9 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
             warnings.warn(
                 "Note: you are running with 10 features or more. "
                 "Genetic algorithms like used in PySR scale poorly with large numbers of features. "
-                "
-                "
-                "or, alternatively, 
+                "You should run PySR for more `niterations` to ensure it can find "
+                "the correct variables, "
+                "or, alternatively, do a dimensionality reduction beforehand. "
                 "For example, `X = PCA(n_components=6).fit_transform(X)`, "
                 "using scikit-learn's `PCA` class, "
                 "will reduce the number of features to 6 in an interpretable way, "
@@ -2035,6 +2044,7 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
 
     def _read_equation_file(self):
         """Read the hall of fame file created by `SymbolicRegression.jl`."""
+
         try:
             if self.nout_ > 1:
                 all_outputs = []
@@ -2042,7 +2052,11 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
                     cur_filename = str(self.equation_file_) + f".out{i}" + ".bkup"
                     if not os.path.exists(cur_filename):
                         cur_filename = str(self.equation_file_) + f".out{i}"
-                    df = pd.read_csv(cur_filename)
+                    with open(cur_filename, "r") as f:
+                        buf = f.read()
+                    buf = _preprocess_julia_floats(buf)
+                    df = pd.read_csv(StringIO(buf))
+
                     # Rename Complexity column to complexity:
                     df.rename(
                         columns={
@@ -2058,7 +2072,10 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
                 filename = str(self.equation_file_) + ".bkup"
                 if not os.path.exists(filename):
                     filename = str(self.equation_file_)
-                all_outputs = [pd.read_csv(filename)]
+                with open(filename, "r") as f:
+                    buf = f.read()
+                buf = _preprocess_julia_floats(buf)
+                all_outputs = [pd.read_csv(StringIO(buf))]
                 all_outputs[-1].rename(
                     columns={
                         "Complexity": "complexity",
@@ -2067,6 +2084,7 @@ class PySRRegressor(MultiOutputMixin, RegressorMixin, BaseEstimator):
                     },
                     inplace=True,
                 )
+
         except FileNotFoundError:
             raise RuntimeError(
                 "Couldn't find equation file! The equation search likely exited "
@@ -2357,3 +2375,20 @@ def _csv_filename_to_pkl_filename(csv_filename) -> str:
     pkl_basename = base + ".pkl"
 
     return os.path.join(dirname, pkl_basename)
+
+
+_regexp_im = re.compile(r"\b(\d+\.\d+)im\b")
+_regexp_im_sci = re.compile(r"\b(\d+\.\d+)[eEfF]([+-]?\d+)im\b")
+_regexp_sci = re.compile(r"\b(\d+\.\d+)[eEfF]([+-]?\d+)\b")
+
+_apply_regexp_im = lambda x: _regexp_im.sub(r"\1j", x)
+_apply_regexp_im_sci = lambda x: _regexp_im_sci.sub(r"\1e\2j", x)
+_apply_regexp_sci = lambda x: _regexp_sci.sub(r"\1e\2", x)
+
+
+def _preprocess_julia_floats(s: str) -> str:
+    if isinstance(s, str):
+        s = _apply_regexp_im(s)
+        s = _apply_regexp_im_sci(s)
+        s = _apply_regexp_sci(s)
+    return s
````
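The new `_preprocess_julia_floats` helper rewrites Julia-style literals in the backend's CSV output (`3.14f-2` scientific notation, `1.2im` imaginary parts) into Python-style literals before `pd.read_csv` parses them. Below is a standalone re-implementation of the same regexes, for illustration only, with a few sample conversions:

```python
import re

_regexp_im = re.compile(r"\b(\d+\.\d+)im\b")
_regexp_im_sci = re.compile(r"\b(\d+\.\d+)[eEfF]([+-]?\d+)im\b")
_regexp_sci = re.compile(r"\b(\d+\.\d+)[eEfF]([+-]?\d+)\b")


def preprocess_julia_floats(s: str) -> str:
    # Same rewriting as the helper added in the diff: Julia imaginary and
    # Julia/Fortran-style exponents -> literals that pandas/sympy can parse.
    s = _regexp_im.sub(r"\1j", s)
    s = _regexp_im_sci.sub(r"\1e\2j", s)
    s = _regexp_sci.sub(r"\1e\2", s)
    return s


print(preprocess_julia_floats("(2.5 + 1.2im)"))   # -> (2.5 + 1.2j)
print(preprocess_julia_floats("3.14f-2 * x0"))    # -> 3.14e-2 * x0
print(preprocess_julia_floats("1.0e-3im + 0.5"))  # -> 1.0e-3j + 0.5
```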
pysr/test/test.py (CHANGED)

````diff
@@ -194,6 +194,20 @@ class TestPipeline(unittest.TestCase):
         print("Model equations: ", model.sympy()[1])
         print("True equation: x1^2")
 
+    def test_complex_equations_anonymous_stop(self):
+        X = self.rstate.randn(100, 3) + 1j * self.rstate.randn(100, 3)
+        y = (2 + 1j) * np.cos(X[:, 0] * (0.5 - 0.3j))
+        model = PySRRegressor(
+            binary_operators=["+", "-", "*"],
+            unary_operators=["cos"],
+            **self.default_test_kwargs,
+            early_stop_condition="(loss, complexity) -> loss <= 1e-4 && complexity <= 6",
+        )
+        model.fit(X, y)
+        test_y = model.predict(X)
+        self.assertTrue(np.issubdtype(test_y.dtype, np.complexfloating))
+        self.assertLessEqual(np.average(np.abs(test_y - y) ** 2), 1e-4)
+
     def test_empty_operators_single_input_warm_start(self):
         X = self.rstate.randn(100, 1)
         y = X[:, 0] + 3.0
@@ -677,6 +691,9 @@ class TestMiscellaneous(unittest.TestCase):
         check_generator = check_estimator(model, generate_only=True)
         exception_messages = []
         for _, check in check_generator:
+            if check.func.__name__ == "check_complex_data":
+                # We can use complex data, so avoid this check.
+                continue
             try:
                 with warnings.catch_warnings():
                     warnings.simplefilter("ignore")
````
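The new test also exercises `early_stop_condition`, which is passed as a string containing a Julia anonymous function and evaluated by the SymbolicRegression.jl backend. A short sketch of using the same option directly, copied from the test above:

```python
from pysr import PySRRegressor

# Stop the search once any expression reaches loss <= 1e-4 at complexity <= 6.
# The condition is Julia syntax, evaluated on the backend for each candidate.
model = PySRRegressor(
    binary_operators=["+", "-", "*"],
    unary_operators=["cos"],
    early_stop_condition="(loss, complexity) -> loss <= 1e-4 && complexity <= 6",
)
```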
pysr/version.py (CHANGED)

````diff
@@ -1,2 +1,2 @@
-__version__ = "0.
-__symbolic_regression_jl_version__ = "0.
+__version__ = "0.12.1"
+__symbolic_regression_jl_version__ = "0.16.1"
````