panic: encountered unknown DuckDB type({FLOAT[4] { 0 0 0 false 0 0 [] }}). This is likely a bug - please check the duckdbDataType function for missing type mappings
I was trying to import vectors in Arrow format into MyDuck Server following this instruction, but the import failed because MyDuck Server cannot handle the List data type yet.
The table schema:
CREATE TABLE IF NOT EXISTS embedded_documents (
    document TEXT,
    embedding FLOAT[4]
)
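Since the column is declared as `FLOAT[4]`, each vector presumably needs exactly four elements. A minimal client-side check might look like the sketch below; `check_embedding` and `EXPECTED_DIMENSION` are hypothetical names introduced for illustration and are not part of MyDuck Server or the script in this report.

```python
from typing import List, Sequence

# Assumption: 4 matches the FLOAT[4] column in the schema above.
EXPECTED_DIMENSION = 4


def check_embedding(vector: Sequence[float],
                    expected: int = EXPECTED_DIMENSION) -> List[float]:
    """Return the vector as a list of floats, or raise if its length is wrong."""
    if len(vector) != expected:
        raise ValueError(f"expected {expected} elements, got {len(vector)}")
    return [float(x) for x in vector]
```

Running each generated vector through a check like this before the COPY would separate a dimension mismatch on the client from the server-side type-mapping panic reported here.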
Data generation and COPY operation
import io
import random
import string

import numpy as np
import pandas as pd
import pyarrow as pa

# Assumed values, taken from the table schema above
TABLE_NAME = "embedded_documents"
VECTOR_DIMENSION = 4


# Generate random document content
def generate_random_document(length=10):
    return ''.join(random.choice(string.ascii_letters) for _ in range(length))


# Generate a random vector with VECTOR_DIMENSION elements as a Python list
def generate_random_vector(dimension=VECTOR_DIMENSION):
    vector = np.random.rand(dimension).tolist()
    return vector


# Load data using the COPY command with Arrow format
def copy_arrow_data(conn, batch_size, batch_number):
    data = [(generate_random_document(), generate_random_vector()) for _ in range(batch_size)]
    cursor = None
    try:
        # Convert data to a pandas DataFrame
        df = pd.DataFrame(data, columns=['document', 'embedding'])
        # Convert pandas DataFrame to Arrow Table
        table = pa.Table.from_pandas(df)
        # Write Arrow Table to a BytesIO stream
        output_stream = io.BytesIO()
        with pa.ipc.RecordBatchStreamWriter(output_stream, table.schema) as writer:
            writer.write_table(table)
        # Use the COPY command to insert data in Arrow format
        cursor = conn.cursor()
        with cursor.copy(f"COPY {TABLE_NAME} (document, embedding) FROM STDIN (FORMAT arrow)") as copy:
            copy.write(output_stream.getvalue())
        conn.commit()
        print(f"Inserted batch of size {len(data)}")
    except Exception as e:
        print(f"Error inserting data: {e}")
    finally:
        if cursor:
            cursor.close()
    print(f"Inserted batch {batch_number}")