
feat: add support for hparam_range and hparam_candidates to bigframes.bigquery.create_model #16640

Merged
tswast merged 15 commits into main from b501171054-hparam
May 5, 2026



@tswast tswast commented Apr 13, 2026

Fixes internal issue b/501171054
🦕

google-labs-jules Bot and others added 7 commits April 9, 2026 23:13
This change allows the `options` parameter of `bigframes.bigquery._operations.ml.create_model` to accept BigFrames `Expression` objects. These expressions are compiled to SQL scalar expressions and included in the generated `CREATE MODEL` DDL statement.

- Added `bigframes.core.expression.Expression` type support in the `options` dict.
- Updated `create_model_ddl` to handle compiling expressions using `expression_compiler`.
- Added `test_create_model_expression_option` snapshot test to verify the generated "golden SQL".

Co-authored-by: tswast <247555+tswast@users.noreply.github.com>
This change allows the `options` parameter of `bigframes.bigquery._operations.ml.create_model` to accept BigFrames `Expression` objects. These expressions are compiled to SQL scalar expressions and included in the generated `CREATE MODEL` DDL statement.

- Added `bigframes.core.expression.Expression` type support in the `options` dict.
- Updated `create_model_ddl` to handle compiling expressions using `expression_compiler`.
- Added `test_create_model_expression_option` snapshot test to verify the generated "golden SQL", using an expression that calls a function on a literal value (e.g. 0.1 * 10).

Co-authored-by: tswast <247555+tswast@users.noreply.github.com>
This change allows the `options` parameter of `bigframes.bigquery._operations.ml.create_model` to accept BigFrames `Expression` objects. These expressions are compiled to SQL scalar expressions and included in the generated `CREATE MODEL` DDL statement.

- Added `bigframes.core.expression.Expression` type support in the `options` dict.
- Updated `create_model_ddl` to handle compiling expressions using `expression_compiler`.
- Added `test_create_model_expression_option` snapshot test to verify the generated "golden SQL", using an expression that calls a function on a literal value (e.g. 0.1 * 10).
- Moved test imports to the top level to adhere to PEP 8.

Co-authored-by: tswast <247555+tswast@users.noreply.github.com>
This change allows the `options` parameter of `bigframes.bigquery._operations.ml.create_model` to accept BigFrames `col.Expression` objects. These expressions are compiled to SQL scalar expressions and included in the generated `CREATE MODEL` DDL statement.

- Added `bigframes.core.col.Expression` type support in the `options` dict.
- Updated `create_model_ddl` to handle compiling expressions using `expression_compiler`.
- Added `test_create_model_expression_option` snapshot test to verify the generated "golden SQL", using an expression that calls a function on a literal value (e.g. 0.1 * 10).
- Moved test imports to the top level to adhere to PEP 8 and ran `ruff format`.

Co-authored-by: tswast <247555+tswast@users.noreply.github.com>
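
The idea in these commit messages (option values that are either plain literals or expressions compiled to SQL scalars) can be sketched without BigFrames. This is a minimal illustration and not the library's implementation: `SqlExpr`, `render_option_value`, and `render_options` are hypothetical names, and the sketch assumes an expression has already been compiled to SQL text.

```python
from dataclasses import dataclass
from typing import Mapping, Union


@dataclass(frozen=True)
class SqlExpr:
    """Hypothetical stand-in for an expression already compiled to SQL text."""

    sql: str


OptionValue = Union[str, bool, int, float, SqlExpr]


def render_option_value(value: OptionValue) -> str:
    """Render one option value as a SQL literal or a raw SQL expression."""
    if isinstance(value, SqlExpr):
        # Compiled expressions are emitted as-is, without quoting.
        return value.sql
    if isinstance(value, bool):
        # Checked before numbers because bool is a subclass of int.
        return "TRUE" if value else "FALSE"
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    return repr(value)


def render_options(options: Mapping[str, OptionValue]) -> str:
    """Build an OPTIONS(...) clause from a name-to-value mapping."""
    rendered = ", ".join(
        f"{name} = {render_option_value(value)}" for name, value in options.items()
    )
    return f"OPTIONS({rendered})"


clause = render_options(
    {
        "model_type": "LINEAR_REG",
        "l2_reg": SqlExpr("0.1 * 10"),  # expression on a literal, as in the test
        "early_stop": True,
    }
)
print(clause)
```

The string option is quoted while the expression is passed through verbatim, which is the distinction the `options` change relies on.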
…te-model-expression-options-15193413976404138758
@tswast tswast requested review from a team as code owners April 13, 2026 21:46
@tswast tswast requested review from TrevorBergeron and removed request for a team April 13, 2026 21:46

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces hparam_range and hparam_candidates functions to the bigframes.bigquery module, enabling support for hyperparameter tuning in BigQuery ML model creation. These functions wrap the HPARAM_RANGE and HPARAM_CANDIDATES SQL functions. The review feedback recommends simplifying the type hints for these new functions, noting that in Python type annotations, float already encompasses int.
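
As a rough sketch of what these wrappers produce, the snippet below renders the two SQL calls as plain strings. The helper names `hparam_range_sql` and `hparam_candidates_sql` are hypothetical; the functions in this PR return `bigframes.core.col.Expression` objects rather than strings. The sketch also illustrates the review point about type hints: a parameter annotated `float` accepts `int` arguments under standard Python typing rules.

```python
from typing import Sequence, Union


def hparam_range_sql(min_value: float, max_value: float) -> str:
    """Render a HPARAM_RANGE(min, max) call as SQL text (illustrative only)."""
    return f"HPARAM_RANGE({min_value!r}, {max_value!r})"


def hparam_candidates_sql(candidates: Sequence[Union[float, str]]) -> str:
    """Render a HPARAM_CANDIDATES([...]) call as SQL text (illustrative only)."""
    rendered = ", ".join(
        f"'{c}'" if isinstance(c, str) else repr(c) for c in candidates
    )
    return f"HPARAM_CANDIDATES([{rendered}])"


# Integers type-check against the float annotations, so no int | float union
# is needed in the signature.
range_sql = hparam_range_sql(0, 5)
numeric_sql = hparam_candidates_sql([0.01, 0.1])
string_sql = hparam_candidates_sql(["ADAM", "SGD"])

print(range_sql)
print(numeric_sql)
print(string_sql)
```

In a `CREATE MODEL` statement, text like this would appear as an option value, for example `l1_reg = HPARAM_RANGE(0, 5)`.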

Comment thread packages/bigframes/bigframes/bigquery/_operations/mathematical.py Outdated
tswast and others added 5 commits April 13, 2026 16:48
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…te-model-expression-options-15193413976404138758
…pression-options-15193413976404138758' into b501171054-hparam
Base automatically changed from feat-bigframes-create-model-expression-options-15193413976404138758 to main April 28, 2026 18:15
return bigframes.core.col.Expression(bigframes.core.expression.OpExpression(op, ()))


def hparam_range(min: float, max: float) -> bigframes.core.col.Expression:
Contributor


Should we put these two functions in the ml package instead? We don't have any hyper-parameter tuning opportunities outside the scope of machine learning, right?

Contributor Author


The ML package corresponds to the ML namespace. These functions aren't in the ML namespace in SQL.

@tswast tswast merged commit ca47835 into main May 5, 2026
30 checks passed
@tswast tswast deleted the b501171054-hparam branch May 5, 2026 21:39
shuoweil added a commit that referenced this pull request May 13, 2026
PR created by the Librarian CLI to initialize a release. Merging this PR
will auto trigger a release.

Librarian Version: v0.13.0
Language Image:
us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/python-librarian-generator@sha256:234b9d1f2ddb057ed7ac6a38db0bf8163d839c65c6cf88ade52530cddebce59e
<details><summary>bigframes: v2.40.0</summary>

## [v2.40.0](bigframes-v2.39.0...bigframes-v2.40.0) (2026-05-13)

### Features

* Add `bigframes.execution_history` API to track BigQuery jobs (#16588)
([fa20a74](fa20a740))
  ```python
  import bigframes.pandas as bpd
  bpd.options.compute.enable_execution_history = True
  df = bpd.read_gbq("my_table")
  # ... perform operations ...
  history = bpd.execution_history
  print(history.jobs) # Access BigQuery job details for executed queries
  ```

* Implement `ai.similarity` and `ai.embed` for text embeddings and
semantic similarity (#16771, #16759)
([d4afa2c](d4afa2c8),
[fcb4579](fcb4579b))
  ```python
  import bigframes.pandas as bpd
  # Generate embeddings
  df["embeddings"] = bpd.bigquery.ai.embed(df["text_col"])
  # Compute similarity
  df["similarity"] = bpd.bigquery.ai.similarity(
      df["embeddings_a"], df["embeddings_b"]
  )
  ```

* Support `hparam_range` and `hparam_candidates` parameters for
hyperparameter tuning in model creation (#16640)
([ca47835](ca47835c))
* Update `ai.score`, `ai.classify` and `ai.if_` parameters to match
their SQL equivalents (#16919, #16990, #16857)
([9f42fe1](9f42fe14),
[e9c52b1](e9c52b12),
[f3cb4ad](f3cb4ad0))
* Support unstable sorting in `sort_values` and `sort_index` (#16665)
([bbdeb70](bbdeb70f))
* Support loading Avro and ORC data formats (#16555)
([6d46cba](6d46cba3))
* Add NumPy ufunc support directly on column expressions (#16554)
([2f792ab](2f792abd))

### Bug Fixes

* Fix bugs compiling ambiguous ids and in subqueries (#16617)
([479e44d](479e44dd))

* BigFrames respects bq default region (#16933)
([ef9945a](ef9945a5))

* Avoid views when querying BigLake tables from SQL cells (#16562)
([fdd3e0d](fdd3e0de))

* Avoid `copy` argument warning in `to_pandas` (#16917)
([fe5245b](fe5245b8))

### Performance Improvements

* Improve Write API upload throughput (#16641)
([ef856b0](ef856b04))

### Documentation

* Add docs to the `to_csv` methods of DataFrame and Series (#16570)
([a8fccef](a8fccefd))

</details>

3 participants