Description
Expected Behavior
get_historical_features() with ClickHouse offline store should work when querying from multiple FeatureViews.
Current Behavior
Two SQL compatibility issues in the MULTIPLE_FEATURE_VIEW_POINT_IN_TIME_JOIN template:
- ON TRUE in JOIN → ClickHouse error INVALID_JOIN_ON_EXPRESSION. Affects all queries (single and multi-FV).
- Multiple USING clauses → ClickHouse error "Code: 48. Multiple USING statements are not supported". Affects queries with 2+ FeatureViews.
Steps to reproduce
# Query features from 2+ FeatureViews
retrieval_job = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "feature_view_1:feature_a",
        "feature_view_2:feature_b",
    ],
)
df = retrieval_job.to_df()  # raises DatabaseError
Specifications
- Version: 0.61.0 (also present on master)
- Platform: any
- Subsystem: ClickHouse contrib offline store
Possible Solution
File: sdk/python/feast/infra/offline_stores/contrib/clickhouse_offline_store/clickhouse.py
Fix 1 - ON TRUE → conditional ON/AND (~line 532):
Before:
ON TRUE
{% for entity in featureview.entities %}
AND subquery."{{ entity }}" = entity_dataframe."{{ entity }}"
{% endfor %}
After:
{% for entity in featureview.entities %}
{% if loop.first %}ON{% else %}AND{% endif %} subquery."{{ entity }}" = entity_dataframe."{{ entity }}"
{% endfor %}
Fix 2 - USING → ON in final SELECT (~line 620):
Before:
) AS "{{featureview.name}}" USING ("{{featureview.name}}__entity_row_unique_id")
After:
) AS "{{featureview.name}}" ON "{{featureview.name}}"."{{featureview.name}}__entity_row_unique_id" = entity_dataframe."{{featureview.name}}__entity_row_unique_id"
Both use standard SQL compatible with ClickHouse and PostgreSQL.
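To make the intended output of the two patched fragments easier to check, here is a minimal plain-Python sketch that mimics what they would render. It is not the Feast template code: the helper names render_entity_join and render_final_join, and the entity names driver_id and customer_id, are hypothetical and used only for illustration.

```python
def render_entity_join(entities, left="subquery", right="entity_dataframe"):
    """Mimic the patched Jinja loop: the first condition gets ON, the rest get AND."""
    lines = []
    for i, entity in enumerate(entities):
        keyword = "ON" if i == 0 else "AND"  # same role as `loop.first` in the template
        lines.append(f'{keyword} {left}."{entity}" = {right}."{entity}"')
    return "\n".join(lines)


def render_final_join(featureview_name):
    """Mimic Fix 2: an explicit ON predicate instead of a USING clause."""
    key = f"{featureview_name}__entity_row_unique_id"
    return (
        f') AS "{featureview_name}" '
        f'ON "{featureview_name}"."{key}" = entity_dataframe."{key}"'
    )


# Hypothetical entity and feature view names, for illustration only.
print(render_entity_join(["driver_id", "customer_id"]))
print(render_final_join("feature_view_1"))
```

One thing the sketch makes visible: with an empty entity list the loop emits no ON clause at all, so it may be worth checking how the patched template behaves for a FeatureView without entities.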
I'm attaching the patched file with both fixes applied.
clickhouse.py