@kafka1991
Contributor

Use a non-blocking stdin check to prevent hanging.
Closes #260 and #152.
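
For readers unfamiliar with the idea, a non-blocking stdin check can be sketched in Python roughly as follows. This is only an illustration, not the code changed in this PR: the helper names `has_pending_input` and `read_if_available` are hypothetical, and the sketch assumes a POSIX system where `select.select` accepts pipe and terminal file descriptors.

```python
import os
import select


def has_pending_input(fd: int) -> bool:
    """Return True if a read on `fd` would return immediately."""
    # A timeout of 0 makes select() poll and return at once instead of waiting.
    readable, _, _ = select.select([fd], [], [], 0)
    return bool(readable)


def read_if_available(fd: int, max_bytes: int = 65536) -> bytes:
    """Read from `fd` only when that cannot block; otherwise return b""."""
    if not has_pending_input(fd):
        # Nothing is waiting (for example, an idle notebook kernel), so skip the read.
        return b""
    # Data or EOF is ready. At EOF os.read() returns b"" without blocking.
    return os.read(fd, max_bytes)
```

Calling `read_if_available(sys.stdin.fileno())` in place of a plain blocking read returns right away when nothing has been written to stdin, which is the behavior a notebook needs.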

Changelog category (leave one):

  • Bug Fix

@kafka1991 changed the title from "fix(jupyter):use non-blocking stdin check to prevent hanging" to "Fix(jupyter):use non-blocking stdin check to prevent hanging" on Sep 16, 2025
@auxten
Member

auxten commented Sep 16, 2025

It would be better to have some tests that insert data from a Jupyter Notebook.

@kafka1991
Contributor Author

kafka1991 commented Oct 2, 2025

@auxten

It's worth noting that the following code doesn't work correctly: unlike clickhouse-local, we can't accept and insert data from standard input.

    def testStdin(self):
        # Reproduces the failure: the INSERT below expects CSV rows on stdin.
        print("start")
        import time
        from chdb import session

        chs = session.Session()
        chs.query("CREATE DATABASE IF NOT EXISTS test ENGINE = Atomic")
        chs.query("USE test")
        chs.query("DROP TABLE IF EXISTS embeddings")
        time.sleep(60)  # pause kept from the original reproduction
        chs.query("""CREATE TABLE embeddings
                     (
                         movieId   UInt32 NOT NULL,
                         embedding Array(Float32) NOT NULL
                     ) ENGINE = MergeTree()
                     ORDER BY movieId""")
        # No inline data follows FORMAT CSV, so the rows would have to come from stdin.
        chs.query("INSERT INTO embeddings FORMAT CSV")
        count = chs.query("SELECT COUNT(*) AS count FROM embeddings")
        print(f"Records: {count}")

Run it from a terminal:

cat tests/movie_embeddings.csv | python tests/test_query_py.py testStdin()

It raises the following exception:

RuntimeError: Code: 108. DB::Exception: Code: 108. DB::Exception: No data to insert. (NO_DATA_TO_INSERT) (version 25.5.2.1). (NO_DATA_TO_INSERT)

This issue already exists on our main branch; it was introduced by commit 6123ace. Inserting data from standard input is mutually exclusive with the non-blocking behavior needed for Jupyter Notebook, so we can confirm that it is not supported.
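
Since reading INSERT data from stdin is not supported, one possible workaround is to read the piped CSV in Python and embed the rows in the query text. The sketch below is an assumption, not something verified in this PR: ClickHouse SQL allows data to follow the FORMAT clause on the next line, but whether `Session.query` accepts such inline data has not been checked here, and the helper name `build_inline_csv_insert` is hypothetical.

```python
def build_inline_csv_insert(table: str, csv_text: str) -> str:
    """Build an INSERT statement whose CSV rows are embedded in the query text."""
    # Drop leading and trailing blank lines so the data starts right after FORMAT CSV.
    rows = csv_text.strip("\n")
    if not rows:
        # Mirror the NO_DATA_TO_INSERT situation early, before any query is sent.
        raise ValueError("no CSV rows to insert")
    # In ClickHouse SQL, inline data begins on the line after the FORMAT clause.
    return f"INSERT INTO {table} FORMAT CSV\n{rows}\n"
```

In the test above this could be used as `chs.query(build_inline_csv_insert("embeddings", sys.stdin.read()))`, again assuming inline data is accepted. The table name is interpolated as-is, so it should only come from trusted code.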

@auxten
Member

auxten commented Nov 3, 2025

This case is rarely used, so let's merge it.

@auxten merged commit a46d851 into main on Nov 3, 2025
6 of 8 checks passed
Development

Successfully merging this pull request may close these issues.

Using session.query("insert xxx") will stuck