[Bug]: gfClient error exit #245

Open
1 task done
anderdnavarro opened this issue Feb 1, 2024 · 11 comments

@anderdnavarro

What happened?

Hi!

I have a list of 2000 sequences that I am trying to align with pxblat. Running only 100-300 sequences works well, but when I try to run all of them I get the following error:

python3: TCP non-blocking connect() to localhost IP  timed-out in select() after 10000 milliseconds - Cancelling!: Operation now in progress
python3: # error: Operation now in progress
Sorry, the BLAT/iPCR server seems to be down.  Please try again later: localhost 5000
: Operation now in progress
gfClient error exit

I tried automatically splitting them into batches, even adding a 25 s sleep between rounds. I also tried both Context and General modes (stopping the server after each round or leaving it running), but I always get the same error. I could run the script manually several times with different sequences (instead of generating the batches inside the script), but then I couldn't automate it.

This is the code I was using:

from pxblat import Client, Server

client = Client(
    host="localhost",
    port=5000,
    seq_dir="/databases/hg38",
    min_score=20,
    min_identity=90,
)

with Server("localhost", 5000, "/databases/hg38/hg38.2bit", can_stop=True, step_size=5, tile_size=10) as server:
    # prepare_blat_sequences() and file come from the rest of the script (not shown)
    sequences: list = prepare_blat_sequences(file)
    server.wait_ready()  # wait for the server before sending queries
    results = client.query(sequences[0:2000])
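
For reference, the batching described above might look roughly like the sketch below. The helper names `chunked` and `query_in_batches`, the batch size, and the pause are illustrative assumptions and not part of pxblat. The sketch also assumes that `client.query` accepts a list of sequences and returns a list of results, as the call above suggests. Note that batching alone did not avoid the error in this report, so this shows the approach rather than a fix.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def query_in_batches(
    query_fn: Callable[[List[str]], list],
    sequences: Sequence[str],
    batch_size: int = 200,
    pause_s: float = 0.0,
) -> list:
    """Call `query_fn` once per batch and concatenate the returned lists.

    `query_fn` is any callable taking a list of sequences; passing
    `client.query` here is an assumption about how pxblat is used.
    """
    results: list = []
    for batch in chunked(sequences, batch_size):
        results.extend(query_fn(batch))
        if pause_s > 0:
            time.sleep(pause_s)  # optional pause between rounds
    return results
```

Inside the `with Server(...)` block this could be called as `query_in_batches(client.query, sequences, batch_size=200, pause_s=25)`.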

I don't know whether there is an extra Server or Client option that I am missing.

Thank you very much!
Ander

Version

python-3.10.12
pxblat-1.1.10
biopython-1.83

What platform are you working on?

No response

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@anderdnavarro anderdnavarro added the bug Something isn't working label Feb 1, 2024
@cauliyang
Collaborator

Good find! I will dive into the issue and try to resolve it soon.

@cauliyang cauliyang self-assigned this Feb 2, 2024
@anderdnavarro
Author

I ran the same 2000 sequences with the previous version I had installed (pxblat 0.3.6) and the error said: *** buffer overflow detected ***: terminated

In case it is useful for you!
Thanks!

@cauliyang
Collaborator

Thanks for the information. I guess this is related to the earlier issue #66. Could you please run ulimit -n 2048 to raise the maximum number of open files/connections? Let's see whether that fixes the issue.
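
Because ulimit -n only affects the shell it is run in and the processes started from that shell, it can help to confirm the limit from inside the Python process itself. The sketch below uses the standard-library `resource` module (Unix only); the function names are hypothetical, and whether the open-file limit is actually the cause here is an assumption.

```python
import resource
from typing import Tuple


def open_file_limits() -> Tuple[int, int]:
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target: int) -> int:
    """Raise the soft open-file limit towards `target`, capped at the hard limit.

    Returns the soft limit in effect afterwards; never lowers the limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already at least as high as requested
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target
```

Calling `print(open_file_limits())` at the top of the script shows whether the value set with ulimit actually reached the process that runs the queries.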

@anderdnavarro
Author

I tried both ulimit -n 2048 and ulimit -n 4096, but neither worked. Same error.

@cauliyang
Collaborator

cauliyang commented Feb 3, 2024

Thanks for the update, I will dive into the issue. Could you please install the latest version and test again?

@cauliyang
Collaborator

Hi @anderdnavarro, thanks for testing the tool! I have tested the latest version, '1.1.19', and the bug should be fixed. Before testing, let's make sure the port is closed, using pxblat server stop localhost port or the API https://pxblat.readthedocs.io/en/latest/api/stop_server.html, since the port may not have been closed properly if earlier runs hit errors.
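
One way to check whether something is still listening on the port before starting a new server is a plain TCP connection attempt. This is a minimal sketch using only the standard-library `socket` module; the function name `port_is_open` is hypothetical and the check is independent of pxblat.

```python
import socket


def port_is_open(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout_s)
        # connect_ex returns 0 on success and an errno value otherwise
        return sock.connect_ex((host, port)) == 0
```

For example, `port_is_open("localhost", 5000)` returning True before the script starts its own server would suggest that a previous server is still bound to that port.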

@anderdnavarro
Author

Hi @cauliyang , perfect! As soon as you release the new version I will try it.

Thank you again for the quick solution!

@anderdnavarro
Author

Hi @cauliyang, I tried the new 1.1.20 version, but the problem is still there. I also used ulimit -n 2048 and restarted the port before testing with the command you provided.

This is the output using the VSCode terminal:

python3: TCP non-blocking connect() to localhost IP  timed-out in select() after 10000 milliseconds - Cancelling!: Operation now in progress
python3: # error: Operation now in progress
Sorry, the BLAT/iPCR server seems to be down.  Please try again later: localhost 6000
: Operation now in progress
gfClient error exit

And this is the output using a regular terminal. I am pasting it too because it is a bit different:

getaddrinfo() failed: Device or resource busy
python3: Host localhost not found --> System error
: Device or resource busy
python3: # error: Device or resource busy
Sorry, the BLAT/iPCR server seems to be down.  Please try again later: localhost 6000
: Device or resource busy
gfClient error exit

I tried both because in VSCode I get an error about 70% of the time, even with only 10 sequences, and I think it is related to its port forwarding feature. When I run pxblat server stop localhost port to restart the port, the app detects it (I am using a Linux server but working from an M1 Mac).

@cauliyang
Collaborator

cauliyang commented Feb 7, 2024

Hi @anderdnavarro, thanks for sharing the info. Could you please share the latest code you are using? I will try to reproduce the issue and resolve it soon.

@anderdnavarro
Author

Sure! This is the command I'm using:

python blat2.py -i sequences.txt -g /databases/hg38

This is the script:

import os
import click
from pxblat import Server, Client

@click.command(name="Blat")
@click.option("-i", "--input",
              type=click.Path(exists=True, file_okay=True),
              metavar="FILE",
              required = True,
              help="List of sequences (one per row)")
@click.option("-g", "--genomeDir", "genomeDir",
              type=click.Path(exists=True, file_okay=False),
              metavar="DIR",
              required = False,
              default = '/databases/hg38',
              help="Directory containing the genome files required for Blat (.2bit)")
def Blat(input, genomeDir):

    """
    pxBlat command to run many sequences at the same time
    """

    # File with sequences
    with open(input, 'r') as f:
        sequences:list = f.readlines()
    sequences = [line.rstrip('\n') for line in sequences]

    # 2bit file
    all_files:list = os.listdir(genomeDir)
    g2bit: str = [file for file in all_files if file.endswith(".2bit")][0]

    # Blat options
    ## Client
    client = Client(
        host="localhost",
        port=6000,
        seq_dir=genomeDir,
        min_score=20,
        min_identity=90
    )

    ## Server
    with Server("localhost", 6000, os.path.join(genomeDir, g2bit), can_stop=True, step_size=5, tile_size=10) as server: #BLAT WEB options
        server.wait_ready()  
        results = client.query(sequences[:10])
    
    print(results)

if __name__ == '__main__':
    Blat()

And the complete list of sequences can be found at the following link: https://drive.google.com/file/d/14oumMtx4NnMH95VXFqBHTUx5q3Qrhsai/view?usp=sharing

Let me know if you need anything else!

@cauliyang
Collaborator

@anderdnavarro, sounds great! Thanks for sharing.
