
Sanity check of running code as given doesn't work even with corrected pip installs #24

Open
brando90 opened this issue Jul 25, 2023 · 1 comment


@brando90

To reproduce do the following:

Install all dependencies

# -- Create conda env for evaporate
conda create --name evaporate python=3.10
# conda create --name evaporate python=3.8
conda activate evaporate

# -- Evaporate code
cd ~
# cd $AFS
# echo $AFS
git clone [email protected]:brando90/evaporate.git
# ln -s /afs/cs.stanford.edu/u/brando9/evaporate $HOME/evaporate
cd ~/evaporate
pip install -e .

# -- Install missing dependencies not in setup.py
# (note: "sklearn" is a deprecated alias; "scikit-learn" is the package that provides the module)
pip install tqdm openai manifest-ml beautifulsoup4 pandas cvxpy sklearn scikit-learn snorkel snorkel-metal tensorboardX

# -- Weak supervision code
cd ~/evaporate/metal-evap
git submodule init
git submodule update
pip install -e .
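
# -- Optional spot check (an assumption on my part, not from the repo docs): the
# -- metal-evap submodule should expose a package importable as `metal`, the same
# -- top-level name snorkel-metal uses
python -c "import metal; print(metal.__file__)"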

# -- Manifest (install from source so you can modify the set of supported models; otherwise ``setup.py`` installs ``manifest-ml``)
cd ~
# cd $AFS
# echo $AFS
git clone [email protected]:HazyResearch/manifest.git
# ln -s /afs/cs.stanford.edu/u/brando9/manifest $HOME/manifest
cd ~/manifest
pip install -e .

# -- `git lfs install` initializes Git Large File Storage (LFS) on your machine
git lfs install
cd ~/evaporate/
# cd ~/data
git clone https://huggingface.co/datasets/hazyresearch/evaporate
# get data in python
# from datasets import load_dataset
# dataset = load_dataset("hazyresearch/evaporate")
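
Optionally, sanity-check that the key dependencies import before running anything (the module names below are my guesses based on the pip installs above, not a list from the repo):

# -- Quick import check for the dependencies installed above (names are assumptions)
python -c "import tqdm, openai, manifest, bs4, pandas, cvxpy, sklearn, snorkel, tensorboardX; print('imports OK')"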

Run the profiler with your OpenAI API key

# keys="PLACEHOLDER" # INSERT YOUR API KEY(S) HERE
keys=$(cat ~/data/openai_api_key.txt)
# echo $keys

cd ~/evaporate/evaporate
conda activate evaporate
# conda activate maf

# evaporate code closed ie
python ~/evaporate/evaporate/run_profiler.py \
    --data_lake fda_510ks \
    --do_end_to_end False \
    --num_attr_to_cascade 50 \
    --num_top_k_scripts 10 \
    --train_size 10 \
    --combiner_mode ws \
    --use_dynamic_backoff True \
    --KEYS ${keys}

Running run_profiler.py produces the following error:

(evaporate) brando9@ampere1:~/evaporate$ python ~/evaporate/evaporate/run_profiler.py     --data_lake fda_510ks     --do_end_to_end False     --num_attr_to_cascade 50     --num_top_k_scripts 10     --train_size 10     --combiner_mode ws     --use_dynamic_backoff True     --KEYS ${keys}
Traceback (most recent call last):
  File "/lfs/ampere1/0/brando9/evaporate/evaporate/run_profiler.py", line 14, in <module>
    from profiler import run_profiler
  File "/afs/cs.stanford.edu/u/brando9/evaporate/evaporate/profiler.py", line 33, in <module>
    from run_ws import run_ws
ModuleNotFoundError: No module named 'run_ws'

May we get help?


Related issue: #21
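
A workaround we are trying (unverified, and the paths below are guesses): since the traceback means Python cannot find run_ws on its module search path, check whether run_ws.py exists anywhere in the checkout and, if it does, put its directory on PYTHONPATH before re-running the run_profiler.py command above:

# -- Locate run_ws.py in the checkout (it does not appear to sit next to profiler.py)
find ~/evaporate -name "run_ws.py"
# -- If it turns out to live somewhere, e.g. under the metal-evap submodule (a guess), expose it:
export PYTHONPATH="$HOME/evaporate/metal-evap:$PYTHONPATH"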

@brando90
Author

I have my own fork trying to fix this basic import statement, but the code still doesn't work. It would be nice to release a version of the code that works out of the box, at least for reproducing the original results, before we try it on our own custom data.
