"ValueError: not enough values to unpack (expected 2, got 1)" Output when Executing the Model #31

Open
yagmurrozdemir opened this issue Mar 20, 2024 · 0 comments


yagmurrozdemir commented Mar 20, 2024

I get the output below when I run the model by executing the run.sh file you provided:

```
root@82167638a7b3:/paper_evaporate/evaporate/evaporate# bash run.sh
Data lake
Chunking files: 100%|████████████████████████| 100/100 [00:00<00:00, 516.76it/s]

Data-lake: fda_510ks, Train size: 10

Extracting purpose for submission (1 / 16)
For attribute purpose for submission
-- Starting with 1144 chunks
-- Ending with 100 chunks
-- 93 starting chunks in sample files
-- 10 chunks in sample files
Extracting attribute purpose for submission using LM: 100%|█| 10/10 [00:00<00:00
Generating functions for attribute purpose for submission: 100%|█| 10/10 [00:00<
Extraction fraction: 1.0
Top 10 scripts:
function_10; Score: {'average_f1': 1.0, 'median_f1': 1.0, 'extraction_fraction': 1.0, 'prior_average_f1': 1.0, 'prior_median_f1': 1.0}
function_1; Score: {'average_f1': 0.9000685871056241, 'median_f1': 1.0, 'extraction_fraction': 1.0, 'prior_average_f1': 0.9000685871056241, 'prior_median_f1': 1.0}
function_3; Score: {'average_f1': 0.9000685871056241, 'median_f1': 1.0, 'extraction_fraction': 1.0, 'prior_average_f1': 0.9000685871056241, 'prior_median_f1': 1.0}
function_5; Score: {'average_f1': 0.9000685871056241, 'median_f1': 1.0, 'extraction_fraction': 1.0, 'prior_average_f1': 0.9000685871056241, 'prior_median_f1': 1.0}
function_7; Score: {'average_f1': 0.9000685871056241, 'median_f1': 1.0, 'extraction_fraction': 1.0, 'prior_average_f1': 0.9000685871056241, 'prior_median_f1': 1.0}
function_9; Score: {'average_f1': 0.9000685871056241, 'median_f1': 1.0, 'extraction_fraction': 1.0, 'prior_average_f1': 0.9000685871056241, 'prior_median_f1': 1.0}
function_11; Score: {'average_f1': 0.9000685871056241, 'median_f1': 1.0, 'extraction_fraction': 1.0, 'prior_average_f1': 0.9000685871056241, 'prior_median_f1': 1.0}
function_15; Score: {'average_f1': 0.9000685871056241, 'median_f1': 1.0, 'extraction_fraction': 1.0, 'prior_average_f1': 0.9000685871056241, 'prior_median_f1': 1.0}
function_0; Score: {'average_f1': 0.8753193960511034, 'median_f1': 1.0, 'extraction_fraction': 1.0, 'prior_average_f1': 0.8753193960511034, 'prior_median_f1': 1.0}
function_13; Score: {'average_f1': 0.8753193960511034, 'median_f1': 1.0, 'extraction_fraction': 1.0, 'prior_average_f1': 0.8753193960511034, 'prior_median_f1': 1.0}
Best script overall: function_10; Score: {'average_f1': 1.0, 'median_f1': 1.0, 'extraction_fraction': 1.0, 'prior_average_f1': 1.0, 'prior_median_f1': 1.0}
Apply the scripts to the data lake and save the metadata. Taking the top 10 scripts per field.
Applying function function_10...
Applying function function_1...
Applying function function_3...
Applying function function_5...
Applying function function_7...
Applying function function_9...
Applying function function_11...
Applying function function_15...
Applying function function_0...
Applying function function_13...
Applying key function_10: 100%|███████████| 100/100 [00:00<00:00, 348364.12it/s]
Applying key function_1: 100%|█████████████| 100/100 [00:00<00:00, 21459.73it/s]
Applying key function_3: 100%|█████████████| 100/100 [00:00<00:00, 51577.77it/s]
Applying key function_5: 100%|█████████████| 100/100 [00:00<00:00, 52083.75it/s]
Applying key function_7: 100%|█████████████| 100/100 [00:00<00:00, 21242.36it/s]
Applying key function_9: 100%|█████████████| 100/100 [00:00<00:00, 50321.58it/s]
Applying key function_11: 100%|████████████| 100/100 [00:00<00:00, 64537.68it/s]
Applying key function_15: 100%|████████████| 100/100 [00:00<00:00, 21027.24it/s]
Applying key function_0: 100%|████████████| 100/100 [00:00<00:00, 580125.03it/s]
Applying key function_13: 100%|███████████| 100/100 [00:00<00:00, 788403.01it/s]
100%|████████████████████████████████████| 100/100 [00:00<00:00, 3153612.03it/s]
/usr/local/lib/python3.10/dist-packages/numpy/core/fromnumeric.py:3504: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/usr/local/lib/python3.10/dist-packages/numpy/core/_methods.py:129: RuntimeWarning: invalid value encountered in scalar divide
  ret = ret.dtype.type(ret / rcount)
Average abstains across documents: nan
Average unique votes per document: nan
Shape of test_votes: (0,)
Shape of test_votes: []
Traceback (most recent call last):
  File "/paper_evaporate/evaporate/evaporate/run_profiler.py", line 510, in <module>
    main()
  File "/paper_evaporate/evaporate/evaporate/run_profiler.py", line 506, in main
    run_experiment(profiler_args)
  File "/paper_evaporate/evaporate/evaporate/run_profiler.py", line 416, in run_experiment
    num_toks, success = run_profiler(
  File "/paper_evaporate/evaporate/evaporate/profiler.py", line 685, in run_profiler
    file2metadata, num_toks = combine_extractions(
  File "/paper_evaporate/evaporate/evaporate/profiler.py", line 159, in combine_extractions
    preds, used_deps, missing_files = run_ws(
  File "/paper_evaporate/evaporate/evaporate/./weak_supervision/run_ws.py", line 195, in run_ws
    n_test, m = test_votes.shape
ValueError: not enough values to unpack (expected 2, got 1)
Data lake
Chunking files: 100%|████████████████████████| 100/100 [00:00<00:00, 505.13it/s]

Data-lake: fda_510ks, Train size: 10
Chunks in sample file ./data/evaporate/fda-ai-pmas/510k/K150526.txt: 5
Chunks in sample file ./data/evaporate/fda-ai-pmas/510k/K151046.txt: 16
Chunks in sample file ./data/evaporate/fda-ai-pmas/510k/K180886.txt: 5
Chunks in sample file ./data/evaporate/fda-ai-pmas/510k/K181525.txt: 9
Chunks in sample file ./data/evaporate/fda-ai-pmas/510k/K151265.txt: 11
Chunks in sample file ./data/evaporate/fda-ai-pmas/510k/K171641.txt: 10
Chunks in sample file ./data/evaporate/fda-ai-pmas/510k/K182472.txt: 4
Chunks in sample file ./data/evaporate/fda-ai-pmas/510k/K161714.txt: 8
Chunks in sample file ./data/evaporate/fda-ai-pmas/510k/K162042.txt: 13
Chunks in sample file ./data/evaporate/fda-ai-pmas/510k/K170974.txt: 12
Directly extracting metadata from chunks: 100%|█| 10/10 [00:00<00:00, 50.24it/s]
@k = 16 --- Recall: 0.500, Precision: 0.500, F1: 0.500
@k = 1 --- Recall: 0.000, Precision: 0.000, F1: 0.000
@k = 5 --- Recall: 0.250, Precision: 0.800, F1: 0.381
@k = 10 --- Recall: 0.375, Precision: 0.600, F1: 0.462
@k = 15 --- Recall: 0.438, Precision: 0.467, F1: 0.452
@k = 20 --- Recall: 0.625, Precision: 0.500, F1: 0.556
@k = 25 --- Recall: 0.625, Precision: 0.500, F1: 0.556
@k = 30 --- Recall: 0.625, Precision: 0.500, F1: 0.556
@k = 35 --- Recall: 0.625, Precision: 0.500, F1: 0.556
@k = 40 --- Recall: 0.625, Precision: 0.500, F1: 0.556
@k = 45 --- Recall: 0.625, Precision: 0.500, F1: 0.556
@k = 50 --- Recall: 0.625, Precision: 0.500, F1: 0.556
@k = 100 --- Recall: 0.625, Precision: 0.500, F1: 0.556
@k = 20 --- Recall: 0.625, Precision: 0.500, F1: 0.556

Extracting 510(k) number (1 / 20)
For attribute 510(k) number
-- Starting with 1144 chunks
-- Ending with 174 chunks
-- 93 starting chunks in sample files
-- 18 chunks in sample files
Extracting attribute 510(k) number using LM: 100%|█| 10/10 [00:00<00:00, 283.58i
Generating functions for attribute 510(k) number: 100%|█| 10/10 [00:00<00:00, 12
Extraction fraction: 1.0
Top 10 scripts:
function_34; Score: {'average_f1': 0.9473684210526316, 'median_f1': 0.9473684210526316, 'extraction_fraction': 1.0, 'prior_average_f1': 0.09473684210526316, 'prior_median_f1': 0.0}
function_29; Score: {'average_f1': 0.7777777777777777, 'median_f1': 0.6666666666666666, 'extraction_fraction': 1.0, 'prior_average_f1': 0.2333333333333333, 'prior_median_f1': 0.0}
function_26; Score: {'average_f1': 0.6666666666666666, 'median_f1': 0.6666666666666666, 'extraction_fraction': 1.0, 'prior_average_f1': 0.13333333333333333, 'prior_median_f1': 0.0}
function_31; Score: {'average_f1': 0.6666666666666666, 'median_f1': 0.6666666666666666, 'extraction_fraction': 1.0, 'prior_average_f1': 0.13333333333333333, 'prior_median_f1': 0.0}
function_0; Score: {'average_f1': 0.6415151515151516, 'median_f1': 0.6666666666666666, 'extraction_fraction': 1.0, 'prior_average_f1': 0.6415151515151516, 'prior_median_f1': 0.6666666666666666}
function_28; Score: {'average_f1': 0.6415151515151516, 'median_f1': 0.6666666666666666, 'extraction_fraction': 1.0, 'prior_average_f1': 0.6415151515151516, 'prior_median_f1': 0.6666666666666666}
function_1; Score: {'average_f1': 0.6387205387205388, 'median_f1': 0.6666666666666666, 'extraction_fraction': 1.0, 'prior_average_f1': 0.5748484848484849, 'prior_median_f1': 0.6666666666666666}
function_2; Score: {'average_f1': 0.6387205387205388, 'median_f1': 0.6666666666666666, 'extraction_fraction': 1.0, 'prior_average_f1': 0.5748484848484849, 'prior_median_f1': 0.6666666666666666}
function_3; Score: {'average_f1': 0.6387205387205388, 'median_f1': 0.6666666666666666, 'extraction_fraction': 1.0, 'prior_average_f1': 0.5748484848484849, 'prior_median_f1': 0.6666666666666666}
function_6; Score: {'average_f1': 0.6387205387205388, 'median_f1': 0.6666666666666666, 'extraction_fraction': 1.0, 'prior_average_f1': 0.5748484848484849, 'prior_median_f1': 0.6666666666666666}
Best script overall: function_34; Score: {'average_f1': 0.9473684210526316, 'median_f1': 0.9473684210526316, 'extraction_fraction': 1.0, 'prior_average_f1': 0.09473684210526316, 'prior_median_f1': 0.0}
Apply the scripts to the data lake and save the metadata. Taking the top 10 scripts per field.
Applying function function_34...
Applying function function_29...
Applying function function_26...
Applying function function_31...
Applying function function_0...
Applying function function_28...
Applying function function_1...
Applying function function_2...
Applying function function_3...
Applying function function_6...
Applying key function_34: 100%|███████████| 100/100 [00:00<00:00, 480447.19it/s]
Applying key function_29: 100%|███████████| 100/100 [00:00<00:00, 353949.70it/s]
Applying key function_26: 100%|███████████| 100/100 [00:00<00:00, 465516.54it/s]
Applying key function_31: 100%|███████████| 100/100 [00:00<00:00, 522329.27it/s]
Applying key function_0: 100%|████████████| 100/100 [00:00<00:00, 323634.57it/s]
Applying key function_28: 100%|███████████| 100/100 [00:00<00:00, 309314.45it/s]
Applying key function_1: 100%|████████████| 100/100 [00:00<00:00, 565270.08it/s]
Applying key function_2: 100%|████████████| 100/100 [00:00<00:00, 326404.98it/s]
Applying key function_3: 100%|████████████| 100/100 [00:00<00:00, 423667.07it/s]
Applying key function_6: 100%|████████████| 100/100 [00:00<00:00, 292693.93it/s]
100%|█████████████████████████████████████| 100/100 [00:00<00:00, 961996.33it/s]
/usr/local/lib/python3.10/dist-packages/numpy/core/fromnumeric.py:3504: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/usr/local/lib/python3.10/dist-packages/numpy/core/_methods.py:129: RuntimeWarning: invalid value encountered in scalar divide
  ret = ret.dtype.type(ret / rcount)
Average abstains across documents: nan
Average unique votes per document: nan
Shape of test_votes: (0,)
Shape of test_votes: []
Traceback (most recent call last):
  File "/paper_evaporate/evaporate/evaporate/run_profiler.py", line 510, in <module>
    main()
  File "/paper_evaporate/evaporate/evaporate/run_profiler.py", line 506, in main
    run_experiment(profiler_args)
  File "/paper_evaporate/evaporate/evaporate/run_profiler.py", line 416, in run_experiment
    num_toks, success = run_profiler(
  File "/paper_evaporate/evaporate/evaporate/profiler.py", line 685, in run_profiler
    file2metadata, num_toks = combine_extractions(
  File "/paper_evaporate/evaporate/evaporate/profiler.py", line 159, in combine_extractions
    preds, used_deps, missing_files = run_ws(
  File "/paper_evaporate/evaporate/evaporate/./weak_supervision/run_ws.py", line 195, in run_ws
    n_test, m = test_votes.shape
ValueError: not enough values to unpack (expected 2, got 1)
```
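Looking at the log, the crash happens because run_ws.py unpacks `test_votes.shape` into two values while the array is empty and one-dimensional (`Shape of test_votes: (0,)`). The `RuntimeWarning: Mean of empty slice` messages and the `nan` averages point the same way: no votes reach the weak-supervision step at all. A minimal sketch of the failure mode, in plain NumPy rather than the repo's code:

```python
import numpy as np

# test_votes should be a 2-D (n_test, m) matrix with one column per
# extraction function. If no votes are collected it ends up empty and 1-D.
test_votes = np.array([])          # shape (0,), matching the log output

try:
    n_test, m = test_votes.shape   # needs a 2-tuple, gets a 1-tuple
except ValueError as e:
    print(e)                       # not enough values to unpack (expected 2, got 1)

# An empty but still 2-D matrix would unpack fine, so the inputs themselves
# (the per-function extractions) seem to be missing, not merely misshapen:
n_test, m = np.empty((0, 10)).shape  # OK: n_test == 0, m == 10
```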

Also, since the directory structure the code expects is not clearly documented, I arranged the directories based on my own interpretation and modified the arguments in run.sh as follows:

```
python3 run_profiler.py --data_lake fda_510ks --num_attr_to_cascade 50 --num_top_k_scripts 10 --train_size 10 --combiner_mode ws --use_dynamic_backoff --KEYS "$keys" --data_dir "./data/evaporate/fda-ai-pmas/510k/" --base_data_dir "./data/evaporate/data/fda_510ks" --gold_extractions_file "./data/evaporate/data/fda_510ks/table.json"

python3 run_profiler.py --data_lake fda_510ks --num_attr_to_cascade 50 --num_top_k_scripts 10 --train_size 10 --combiner_mode ws --use_dynamic_backoff --KEYS "$keys" --do_end_to_end --data_dir "./data/evaporate/fda-ai-pmas/510k/" --base_data_dir "./data/evaporate/data/fda_510ks" --gold_extractions_file "./data/evaporate/data/fda_510ks/table.json"
```
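Concretely, the layout I ended up with (reconstructed from the paths in the commands above; the .txt names are the sample files listed in the log) is:

```
./data/evaporate/
├── fda-ai-pmas/
│   └── 510k/
│       ├── K150526.txt
│       ├── K151046.txt
│       └── ...                # raw 510(k) documents (--data_dir)
└── data/
    └── fda_510ks/
        └── table.json         # gold extractions (--gold_extractions_file)
```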
Can you help me identify the problem here? I suspect it is related to the file paths (that is, the directory structure), but I have not been able to find the right layout despite trying every combination I could think of. Could you share the exact directory structure required so that the original run.sh file runs successfully as-is? Alternatively, could you tell me what I should change for my own structure? Thanks in advance.
