feat(bigframes): Add numpy ufunc support to col expressions by TrevorBergeron · Pull Request #16554 · googleapis/google-cloud-python

TrevorBergeron · 2026-04-03T20:34:27Z

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

* Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
* Ensure the tests and linter pass
* Code coverage does not decrease (if any source code was changed)
* Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

gemini-code-assist

Code Review

This pull request introduces support for NumPy universal functions (ufuncs) in BigFrames by implementing the __array_ufunc__ method in the Expression class. It also refactors binary operation logic into a helper function _as_bf_expr and adds unit tests to verify the new functionality. Feedback was provided regarding the use of a non-standard type hint for the method parameter and an issue in the unit tests where non-standard pandas API calls were used to compute expected results.
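As a rough illustration of the mechanism the summary describes, the sketch below shows how NumPy hands a ufunc call to an object that defines `__array_ufunc__`. The `Expr` class and `col` helper are hypothetical stand-ins that build a text description of the operation; they are not the BigFrames `Expression` class or its `_as_bf_expr` helper, whose internals are not shown in this PR page.

```python
import numpy as np


class Expr:
    """Toy symbolic column expression (hypothetical, for illustration only)."""

    def __init__(self, text: str):
        self.text = text

    def __array_ufunc__(self, ufunc: np.ufunc, method: str, *inputs, **kwargs):
        # Handle only plain calls such as np.add(a, b); decline reductions,
        # accumulations, and keyword arguments like out=.
        if method != "__call__" or kwargs:
            return NotImplemented
        # Render each operand: nested expressions by their text, scalars by repr.
        args = [x.text if isinstance(x, Expr) else repr(x) for x in inputs]
        return Expr(f"{ufunc.__name__}({', '.join(args)})")


def col(name: str) -> Expr:
    # Hypothetical helper mirroring the col("...") style used in the tests.
    return Expr(name)


sqrt_expr = np.sqrt(col("float64_col"))
radd_expr = np.add(2.4, col("float64_col"))
cols_expr = np.add(col("float64_col"), col("int64_col"))

print(sqrt_expr.text)  # sqrt(float64_col)
print(radd_expr.text)  # add(2.4, float64_col)
print(cols_expr.text)  # add(float64_col, int64_col)
```

Because NumPy checks every input for an `__array_ufunc__` override, the reflected case (`np.add(2.4, col(...))`) reaches the same method without a separate `__radd__`-style hook.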

gemini-code-assist · 2026-04-03T20:37:00Z

        return strings.StringMethods(self)

+    def __array_ufunc__(
+        self, ufunc: numpy.ufunc, method: __builtins__.str, *inputs, **kwargs


Using __builtins__.str as a type hint is non-standard and potentially fragile. It is recommended to use the built-in str type directly.

Suggested change

-        self, ufunc: numpy.ufunc, method: __builtins__.str, *inputs, **kwargs
+        self, ufunc: numpy.ufunc, method: str, *inputs, **kwargs

References

Standard Python type hinting practices (PEP 484) recommend using built-in types like 'str' directly instead of accessing them through 'builtins'.
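One possible reason a bare `str` is awkward at this spot, offered as an assumption since the PR does not say so: the context line above returns `strings.StringMethods(self)`, which suggests the class exposes an accessor named `str`, and inside a class body that name then shadows the built-in. The sketch below uses a hypothetical `Expression` stand-in to show the shadowing, with the standard `builtins` module as an unambiguous alternative. `__builtins__` itself is documented as a CPython implementation detail and may be either a module or a dict, which is what makes attribute access on it fragile.

```python
import builtins


class Expression:
    """Hypothetical stand-in; not the BigFrames class."""

    @property
    def str(self):
        # Accessor named `str`, analogous to a pandas-style .str namespace.
        return "string-accessor"

    # From this point on, a bare `str` in the class body is the property above.
    shadowed = str

    def describe(self, method: builtins.str) -> builtins.str:
        # `builtins.str` always names the built-in type, regardless of shadowing.
        return f"method={method}"


print(type(vars(Expression)["shadowed"]).__name__)  # property
print(Expression.describe.__annotations__["method"] is str)  # True
```

If the shadowing assumption holds, the reviewer's plain `str` hint would need either a module-level alias or `builtins.str` to resolve to the built-in type.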

gemini-code-assist · 2026-04-03T20:37:00Z

+    pd_kwargs = {
+        "sqrt": np.sqrt(pd.col("float64_col")),  # type: ignore
+        "add_const": np.add(pd.col("float64_col"), 2.4),  # type: ignore
+        "radd_const": np.add(2.4, pd.col("float64_col")),  # type: ignore
+        "add_cols": np.add(pd.col("float64_col"), pd.col("int64_col")),  # type: ignore
+    }


The pd_kwargs dictionary uses pd.col, which is not a standard pandas API. To correctly verify the BigFrames implementation against pandas, the expected results should be computed using standard pandas column access on scalars_pandas_df. Additionally, standard pandas assign does not support BigFrames Expression objects. To ensure dictionary keys remain sorted without manual effort, the dictionary should be programmatically sorted.

Suggested change

-    pd_kwargs = {
-        "sqrt": np.sqrt(pd.col("float64_col")),  # type: ignore
-        "add_const": np.add(pd.col("float64_col"), 2.4),  # type: ignore
-        "radd_const": np.add(2.4, pd.col("float64_col")),  # type: ignore
-        "add_cols": np.add(pd.col("float64_col"), pd.col("int64_col")),  # type: ignore
-    }
+    pd_kwargs = dict(sorted({
+        "sqrt": np.sqrt(scalars_pandas_df["float64_col"]),
+        "add_const": np.add(scalars_pandas_df["float64_col"], 2.4),
+        "radd_const": np.add(2.4, scalars_pandas_df["float64_col"]),
+        "add_cols": np.add(scalars_pandas_df["float64_col"], scalars_pandas_df["int64_col"]),
+    }.items()))

References

To ensure dictionary keys remain sorted without manual effort, programmatically sort the dictionary instead of relying on manual ordering in the code.
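The reviewer's suggested approach can be tried in isolation with a small frame. This is a minimal sketch: the three-row `scalars_pandas_df` below is a hypothetical stand-in for the object of the same name in the test suite, and the column names are taken from the suggestion. It shows that sorting the items fixes the order in which `assign` appends the new columns.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the scalars_pandas_df used in the test.
scalars_pandas_df = pd.DataFrame(
    {"float64_col": [1.0, 4.0, 9.0], "int64_col": [1, 2, 3]}
)

# Compute expected values with plain Series, then sort by key so the
# resulting column order does not depend on how the literal was written.
pd_kwargs = dict(
    sorted(
        {
            "sqrt": np.sqrt(scalars_pandas_df["float64_col"]),
            "add_const": np.add(scalars_pandas_df["float64_col"], 2.4),
            "radd_const": np.add(2.4, scalars_pandas_df["float64_col"]),
            "add_cols": np.add(
                scalars_pandas_df["float64_col"], scalars_pandas_df["int64_col"]
            ),
        }.items()
    )
)

expected = scalars_pandas_df.assign(**pd_kwargs)
print(list(expected.columns))
```

Keys are unique, so `sorted` compares only the names and never the Series values.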

PR created by the Librarian CLI to initialize a release. Merging this PR will auto trigger a release.

Librarian Version: v0.13.0
Language Image: us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/python-librarian-generator@sha256:234b9d1f2ddb057ed7ac6a38db0bf8163d839c65c6cf88ade52530cddebce59e

<details><summary>bigframes: v2.40.0</summary>

## [v2.40.0](bigframes-v2.39.0...bigframes-v2.40.0) (2026-05-13)

### Features

* Add `bigframes.execution_history` API to track BigQuery jobs (#16588) ([fa20a74](fa20a740))

  ```python
  import bigframes.pandas as bpd
  bpd.options.compute.enable_execution_history = True
  df = bpd.read_gbq("my_table")
  # ... perform operations ...
  history = bpd.execution_history
  print(history.jobs)  # Access BigQuery job details for executed queries
  ```

* Implement `ai.similarity` and `ai.embed` for text embeddings and semantic similarity (#16771, #16759) ([d4afa2c](d4afa2c8), [fcb4579](fcb4579b))

  ```python
  import bigframes.pandas as bpd
  # Generate embeddings
  df["embeddings"] = bpd.bigquery.ai.embed(df["text_col"])
  # Compute similarity
  df["similarity"] = bpd.bigquery.ai.similarity(df["embeddings_a"], df["embeddings_b"])
  ```

* Support `hparam_range` and `hparam_candidates` parameters for hyperparameter tuning in model creation (#16640) ([ca47835](ca47835c))
* Update `ai.score`, `ai.classify` and `ai.if_` parameters to match their SQL equivalents (#16919, #16990, #16857) ([9f42fe1](9f42fe14), [e9c52b1](e9c52b12), [f3cb4ad](f3cb4ad0))
* Support unstable sorting in `sort_values` and `sort_index` (#16665) ([bbdeb70](bbdeb70f))
* Support loading Avro and ORC data formats (#16555) ([6d46cba](6d46cba3))
* Add NumPy ufunc support directly on column expressions (#16554) ([2f792ab](2f792abd))

### Bug Fixes

* Fix bugs compiling ambiguous ids and in subqueries (#16617) ([479e44d](479e44dd))
* BigFrames respects bq default region (#16933) ([ef9945a](ef9945a5))
* avoid views when querying BigLake tables from SQL cells (#16562) ([fdd3e0d](fdd3e0de))
* avoid `copy` argument warning in `to_pandas` (#16917) ([fe5245b](fe5245b8))

### Performance Improvements

* Improve write api upload throughput (#16641) ([ef856b0](ef856b04))

### Documentation

* Add docs to the to_csv methods of dataframe and series (#16570) ([a8fccef](a8fccefd))

</details>

feat(bigframes): Add numpy ufunc support to col expressions

48f7979

gemini-code-assist Bot reviewed Apr 3, 2026

TrevorBergeron marked this pull request as ready for review April 6, 2026 20:02

TrevorBergeron requested review from a team as code owners April 6, 2026 20:02

TrevorBergeron requested a review from shuoweil April 6, 2026 20:02

style

0993c21

shuoweil approved these changes Apr 6, 2026

TrevorBergeron merged commit 2f792ab into main Apr 6, 2026
31 checks passed

TrevorBergeron deleted the tbergeron_col_numpy branch April 6, 2026 23:18

shuoweil mentioned this pull request May 13, 2026

chore: release bigframes 2.40.0 #17056

Merged

TrevorBergeron merged 2 commits into main from tbergeron_col_numpy

2 participants
