Environment: 3-node replicated cluster with embedded ClickHouse Keeper
Summary
SELECT generateSerialID('some_sequence') can block indefinitely when Keeper quorum is lost. The behavior is non-deterministic: sometimes the query returns error code 999 (KEEPER_EXCEPTION / Operation timeout) after roughly operation_timeout_ms, and other times it hangs with no Keeper timeout at all. max_execution_time has no effect on the hang.
Reproduction
1. Set up a 3-node cluster with embedded ClickHouse Keeper.
2. With all 3 Keeper nodes healthy, run SELECT generateSerialID('test'). It works normally.
3. Stop 2 of the 3 Keeper nodes to break quorum.
4. Run SELECT generateSerialID('test') again. The query hangs indefinitely.
Observed behavior
Sometimes: the query hangs with no error and no Keeper timeout; it fails only once send_receive_timeout has elapsed.
Sometimes: the query returns Code: 999. Coordination::Exception: Operation timeout after ~10s.
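Because the two outcomes alternate between runs, it may help to tally them over repeated attempts. The following is a minimal sketch of how a test harness could label each attempt; the function name, the outcome labels, and the `timed_out_client_side` flag are hypothetical, and only the `Code: 999` message format comes from the error quoted above.

```python
import re
from typing import Optional

# Matches the numeric code at the start of messages such as
# "Code: 999. Coordination::Exception: Operation timeout".
_CODE_RE = re.compile(r"Code:\s*(\d+)")


def classify_outcome(error_text: Optional[str], timed_out_client_side: bool) -> str:
    """Label one attempt at running the query (labels are illustrative)."""
    if timed_out_client_side:
        # The caller gave up waiting; the server never produced an answer.
        return "hang"
    if error_text is None:
        # No error and no client-side timeout: the query returned a value.
        return "ok"
    match = _CODE_RE.search(error_text)
    if match and match.group(1) == "999":
        return "keeper_timeout"
    return "other_error"
```

Counting the labels over a few dozen runs with quorum broken would give a rough ratio of hangs to Keeper timeouts.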
Expected behavior
generateSerialID should consistently respect operation_timeout_ms (or a similar setting) and return an error within a bounded time when Keeper quorum is unavailable.
Workaround
Enforce a timeout on the client side of the HTTP connection, so the caller is released even when the query itself hangs.
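A minimal sketch of this workaround, using only the Python standard library against the ClickHouse HTTP interface. The host, port, timeout value, and the `run_query` / `QueryTimeout` names are assumptions for illustration, not part of the original setup. Note that the timeout here is a socket-level limit on each blocking operation, so it bounds the wait only while the server sends nothing back; it also does not cancel the query on the server.

```python
import http.client
import socket


class QueryTimeout(Exception):
    """Raised when the server sends no response within the client-side limit."""


def run_query(query, host="localhost", port=8123, timeout_s=15.0):
    # host, port and timeout_s are illustrative defaults (assumptions).
    conn = http.client.HTTPConnection(host, port, timeout=timeout_s)
    try:
        # The ClickHouse HTTP interface accepts the query text as the POST body.
        conn.request("POST", "/", body=query.encode("utf-8"))
        response = conn.getresponse()
        body = response.read().decode("utf-8")
    except socket.timeout as exc:
        # Connect, send, or receive blocked for longer than timeout_s.
        raise QueryTimeout(f"no response within {timeout_s}s") from exc
    finally:
        conn.close()
    if response.status != 200:
        # Server-side errors (for example a Code: 999 message) arrive in the body.
        raise RuntimeError(f"HTTP {response.status}: {body.strip()}")
    return body
```

A caller would wrap `run_query("SELECT generateSerialID('test')")` in a `try` block and treat `QueryTimeout` as "Keeper unavailable", choosing `timeout_s` somewhat above `operation_timeout_ms` so that a genuine Code: 999 response still has time to arrive.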
Questions
Is there a server-side or query-level setting that can enforce a timeout on the internal Keeper call made by generateSerialID?
Is the inconsistency (sometimes timeout, sometimes hang) a known issue?
ClickHouse version: Altinity Stable 25.3.6.10034