Getting an error in liana multi df_to_lr #84

Closed
ndrubins opened this issue Jan 27, 2024 · 15 comments
Labels: bug, help wanted

Comments

@ndrubins

ndrubins commented Jan 27, 2024

Hi,

I'm trying to follow the "Differential Expression Analysis for CCC & Downstream Signalling Networks" tutorial (https://liana-py.readthedocs.io/en/latest/notebooks/targeted.html). I've created my own dea_df, and not all genes in my dataset appear in the dea_df for each cell type (some get dropped due to missing samples, etc.).

The var and X slots in the dataset are sorted and so are the genes in dea_df (per each cell type), but when I run the df_to_lr command:

lr_res = li.multi.df_to_lr(adata,
                           dea_df=de_df,
                           resource_name='consensus',
                           expr_prop=0.1, # calculated for adata as passed - used to filter interactions
                           groupby=groupby,
                           stat_keys=['stat', 'pvalue', 'padj'],
                           use_raw=False,
                           complex_col='stat', # NOTE: we use the Wald Stat to deal with complexes
                           verbose=True,
                           return_all_lrs=False,
                           )

I get this error:

Using resource `consensus`.
Using `.X`!
96 features of mat are empty, they will be removed.
/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/liana/method/_pipe_utils/_pre.py:148: ImplicitModificationWarning: Trying to modify attribute `.obs` of view, initializing view as actual.
Features in adata and dea_df are mismatched.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/liana/multi/df_to_lr.py", line 148, in df_to_lr
    assert_covered(np.union1d(np.unique(resource["ligand"]),
  File "/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/liana/method/_pipe_utils/_pre.py", line 60, in assert_covered
    raise ValueError(msg + f" [{x_missing}] missing from {superset_name}")
ValueError: Please check if appropriate organism/ID type was provided! Allowed proportion (0.98) of missing resource elements exceeded (1.00). Too few features from the resource were found in the data. [A1BG, A2M, AANAT, ABCA1, ACE, ACKR1, ACKR2, ACKR3, ACKR4, ACTR2, ACVR1, ACVR1B, ACVR1C, ACVR2A, ACVR2B, ACVRL1, ADA, ADAM10, ADAM11, ADAM12, ADAM15, ADAM17, ADAM2, ADAM22, ADAM23, ADAM28, ADAM29, ADAM7, ADAM9, ADAMTS3, ADCY1, ADCY7, ADCY8, ADCY9, ADCYAP1, ADCYAP1R1, ADGRA2, ADGRB1, ADGRE2, ADGRE5, ADGRG1, ADGRG3, ADGRG5, ADGRL1, ADGRL4, ADGRV1, ADIPOQ, ADIPOR1, ADIPOR2, ADM, ADM2, ADO, ADORA1, ADORA2A, ADORA2B, ADORA3, ADRA2A, ADRA2B, ADRB1, ADRB2, ADRB3, AFDN, AGER, AGR2, AGRN, AGRP, AGT, AGTR1, AGTR2, AGTRAP, AHSG, AIMP1, ALB, ALCAM, ALK, ALKAL1, ALKAL2, ALOX5, AMBN, AMELX, AMELY, AMFR, AMH, AMHR2, ANG, ANGPT1, ANGPT2, ANGPT4, ANGPTL1, ANGPTL2, ANGPTL3, ANGPTL4, ANGPTL7, ANOS1, ANTXR1, ANXA1, ANXA2, APCDD1, APELA, APLN, APLNR, APLP1, APLP2, APOA1, APOA2, APOA4, APOB, APOC1, APOC2, APOC3, APOC4, APOD, APOE, APOO, APP, AQP1, AQP5, AQP6, AR, AREG, ARF1, ARF6, ARPC5, ART1, ARTN, ASGR1, ASGR2, ASIP, ATP1A3, ATP5F1B, ATP6AP2, ATRN, AVP, AVPR1A, AVPR1B, AVPR2, AXL, AZGP1, B2M, BAG6, BAMBI, BCAM, BCAN, BDKRB1, BDKRB2, BDNF, BEX3, BGN, BMP1, BMP10, BMP15, BMP2, BMP3, BMP4, BMP5, BMP6, BMP7, BMP8A, BMP8B, BMPR1A, BMPR1B, BMPR2, BOC, BPI, BRS3, BSG, BST1, BST2, BTC, BTLA, BTN1A1, BTN3A2, C1QA, C1QB, C1QBP, C1QTNF1, C1QTNF5, C3, C3AR1, C4A, C4B, C4BPA, C5, C5AR1, C5AR2, CACNA1C, CADM1, CADM3, CALCA, CALCB, CALCR, CALCRL, CALM1, CALM3, CALML3, CALR, CAMP, CAP1, CATSPER1, CAV1, CCBE1, CCK, CCKAR, CCKBR, CCL1, CCL11, CCL13, CCL14, CCL15, CCL16, CCL17, CCL18, CCL19, CCL2, CCL20, CCL21, CCL22, CCL23, CCL24, CCL25, CCL26, CCL27, CCL28, CCL3, CCL3L1, CCL4, CCL4L1, CCL5, CCL7, CCL8, CCN1, CCN2, CCN3, CCN4, CCN6, CCR1, CCR10, CCR2, CCR3, CCR4, CCR5, CCR6, CCR7, CCR8, CCR9, CCRL2, CD14, CD151, CD163, CD177, CD180, CD19, CD1A, CD1B, CD1C, CD1D, CD2, CD200, CD200R1, CD200R1L, CD209, 
CD22, CD226, CD24, CD244, CD247, CD248, CD27, CD274, CD28, CD300LB, CD300LF, CD320, CD33, CD34, CD36, CD38, CD3D, CD4, CD40, CD40LG, CD44, CD46, CD47, CD48, CD5, CD52, CD53, CD55, CD58, CD59, CD5L, CD6, CD63, CD68, CD69, CD7, CD70, CD72, CD74, CD79A, CD80, CD81, CD82, CD86, CD8A, CD8B, CD8B2, CD9, CD93, CD96, CD99, CD99L2, CDH1, CDH10, CDH11, CDH2, CDH5, CDH7, CDON, CEACAM1, CEACAM16, CEACAM19, CEACAM5, CEACAM6, CEACAM8, CEL, CELSR1, CELSR2, CELSR3, CER1, CFC1, CFC1B, CFH, CFP, CFTR, CGA, CGB8, CGN, CHAD, CHL1, CHRM1, CHRM3, CHRNA10, CHRNA3, CHRNA4, CHRNA7, CHRNA9, CHRNB2, CHRNB4, CIRBP, CKLF, CLCF1, CLDN4, CLEC10A, CLEC11A, CLEC12A, CLEC14A, CLEC1B, CLEC2A, CLEC2B, CLEC2D, CLEC3A, CLEC4A, CLEC4G, CLEC4M, CMKLR1, CMKLR2, CNGA2, CNMD, CNR1, CNR2, CNTF, CNTFR, CNTN1, CNTN2, CNTN3, CNTN4, CNTN5, CNTN6, CNTNAP1, COL10A1, COL11A1, COL11A2, COL12A1, COL13A1, COL14A1, COL15A1, COL16A1, COL17A1, COL18A1, COL19A1, COL1A1, COL1A2, COL20A1, COL21A1, COL22A1, COL24A1, COL26A1, COL27A1, COL28A1, COL2A1, COL3A1, COL4A1, COL4A2, COL4A3, COL4A4, COL4A5, COL4A6, COL5A1, COL5A2, COL5A3, COL6A1, COL6A2, COL6A3, COL6A5, COL6A6, COL7A1, COL8A1, COL8A2, COL9A1, COL9A2, COL9A3, COLEC12, COLQ, COMP, COPA, CORT, CP, CPAMD8, CR1, CR2, CRH, CRHR1, CRHR2, CRISP2, CRISP3, CRLF1, CRLF2, CRLF3, CRP, CRTAM, CSF1, CSF1R, CSF2, CSF2RA, CSF2RB, CSF3, CSF3R, CSH1, CSH2, CSHL1, CSPG4, CTF1, CTHRC1, CTLA4, CUBN, CX3CL1, CX3CR1, CXADR, CXCL1, CXCL10, CXCL11, CXCL12, CXCL13, CXCL14, CXCL16, CXCL2, CXCL3, CXCL5, CXCL6, CXCL8, CXCL9, CXCR1, CXCR2, CXCR3, CXCR4, CXCR5, CXCR6, CYTL1, DAG1, DBP, DCBLD2, DCC, DCHS1, DCLK3, DCN, DDR1, DDR2, DEFB1, DEFB103A, DEFB106B, DEFB4B, DHH, DIP2A, DKK1, DKK2, DKK3, DKK4, DLK1, DLK2, DLL1, DLL3, DLL4, DMP1, DNAJB11, DPP4, DRAXIN, DRD2, DRD4, DSC1, DSC2, DSC3, DSCAM, DSG1, DSG2, DSG3, DSG4, DSPP, DYSF, EBI3, ECM1, EDA, EDA2R, EDAR, EDIL3, EDN1, EDN2, EDN3, EDNRA, EDNRB, EFEMP1, EFEMP2, EFNA1, EFNA2, EFNA3, EFNA4, EFNA5, EFNB1, EFNB2, EFNB3, EGF, EGFR, ENAM, ENG, ENHO, ENO1, 
ENPEP, ENTPD1, EPGN, EPHA1, EPHA10, EPHA2, EPHA3, EPHA4, EPHA5, EPHA6, EPHA7, EPHA8, EPHB1, EPHB2, EPHB3, EPHB4, EPHB6, EPO, EPOR, ERAP1, ERBB2, ERBB3, ERBB4, EREG, ERFE, ESAM, ESR1, ETV5, F10, F11, F11R, F12, F13A1, F2, F2R, F2RL1, F2RL2, F2RL3, F3, F7, F8, F9, FABP5, FADD, FAM3B, FAM3C, FAM3D, FAP, FARP2, FAS, FASLG, FAT4, FBLN1, FBLN2, FBN1, FCAMR, FCER1A, FCER2, FCGR1A, FCGR2A, FCGR2B, FCGR3B, FCGRT, FCN2, FCRL1, FCRL4, FFAR2, FGA, FGB, FGF1, FGF10, FGF11, FGF12, FGF13, FGF14, FGF16, FGF17, FGF18, FGF19, FGF2, FGF20, FGF21, FGF22, FGF23, FGF3, FGF4, FGF5, FGF6, FGF7, FGF8, FGF9, FGFR1, FGFR2, FGFR3, FGFR4, FGFRL1, FGG, FGL1, FLOT1, FLRT3, FLT1, FLT3, FLT3LG, FLT4, FN1, FNDC5, FPR1, FPR2, FPR3, FRAS1, FREM1, FREM2, FRS3, FSHB, FSHR, FST, FSTL1, FSTL5, FXYD6, FZD1, FZD10, FZD2, FZD3, FZD4, FZD5, FZD6, FZD7, FZD8, FZD9, GABBR2, GAD1, GAL, GALP, GALR1, GALR2, GALR3, GAS1, GAS6, GAST, GC, GCG, GCGR, GDF1, GDF10, GDF11, GDF15, GDF2, GDF3, GDF5, GDF6, GDF7, GDF9, GDNF, GFRA1, GFRA2, GFRA3, GFRA4, GFRAL, GH1, GH2, GHR, GHRH, GHRHR, GHRL, GHSR, GIP, GIPR, GJB2, GLG1, GLP1R, GLP2R, GLRA2, GNAI2, GNAS, GNB3, GNRH1, GNRH2, GNRHR, GP1BA, GP1BB, GP5, GP6, GP9, GPC1, GPC2, GPC3, GPC4, GPC5, GPHA2, GPHB5, GPI, GPIHBP1, GPNMB, GPR101, GPR135, GPR151, GPR152, GPR171, GPR182, GPR19, GPR20, GPR25, GPR37, GPR37L1, GPR39, GPR42, GPR75, GPR83, GPR84, GPRC5D, GPRC6A, GREM1, GREM2, GRIN2A, GRIN2B, GRIN2C, GRIN2D, GRM1, GRM3, GRM4, GRM5, GRM7, GRN, GRP, GRPR, GSTM2, GSTO1, GSTP1, GUCA2A, GUCA2B, GUCY2C, GUCY2D, GZMA, GZMB, HAPLN1, HAS2, HAVCR1, HAVCR2, HBEGF, HCRT, HCRTR1, HCRTR2, HCST, HDC, HEBP1, HFE, HGF, HHIP, HHLA2, HJV, HLA-A, HLA-B, HLA-C, HLA-DMA, HLA-DMB, HLA-DOB, HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQA2, HLA-DQB1, HLA-DQB2, HLA-DRA, HLA-DRB1, HLA-DRB3, HLA-DRB4, HLA-DRB5, HLA-E, HLA-F, HLA-G, HMGB1, HMMR, HP, HPX, HRAS, HRG, HRH1, HRH2, HRH3, HRH4, HSPA1A, HSPA4, HSPA8, HSPG2, HTR1A, HTR1B, HTR1D, HTR1E, HTR1F, HTR2A, HTR2B, HTR2C, HTR4, HTR5A, HTR6, HTR7, IAPP, IBSP, ICAM1, 
ICAM2, ICAM3, ICAM4, ICAM5, ICOS, ICOSLG, IFITM1, IFNA1, IFNA10, IFNA14, IFNA16, IFNA17, IFNA2, IFNA21, IFNA4, IFNA5, IFNA6, IFNA7, IFNA8, IFNAR1, IFNAR2, IFNB1, IFNE, IFNG, IFNGR1, IFNGR2, IFNK, IFNL1, IFNL2, IFNL3, IFNL4, IFNLR1, IFNW1, IGDCC3, IGDCC4, IGF1, IGF1R, IGF2, IGF2R, IGFBP4, IGFBP7, IGFBPL1, IGFL1, IGFL2, IGFL3, IGFLR1, IGHG1, IGLC1, IGSF1, IGSF10, IGSF11, IHH, IL10, IL10RA, IL10RB, IL11, IL11RA, IL12A, IL12B, IL12RB1, IL12RB2, IL13, IL13RA1, IL13RA2, IL15, IL15RA, IL16, IL17A, IL17B, IL17C, IL17F, IL17RA, IL17RB, IL17RC, IL17RE, IL18, IL18BP, IL18R1, IL18RAP, IL19, IL1A, IL1B, IL1F10, IL1R1, IL1R2, IL1RAP, IL1RAPL1, IL1RAPL2, IL1RL1, IL1RL2, IL1RN, IL2, IL20, IL20RA, IL20RB, IL21, IL21R, IL22, IL22RA1, IL22RA2, IL23A, IL23R, IL24, IL25, IL26, IL27, IL27RA, IL2RA, IL2RB, IL2RG, IL3, IL31, IL31RA, IL32, IL33, IL34, IL36A, IL36B, IL36G, IL36RN, IL37, IL3RA, IL4, IL4R, IL5, IL5RA, IL6, IL6R, IL6ST, IL7, IL7R, IL9, IL9R, ILDR1, IMPG2, INHA, INHBA, INHBB, INHBC, INHBE, INS, INSL3, INSL5, INSR, IRAK4, ISLR2, ITGA1, ITGA10, ITGA11, ITGA2, ITGA2B, ITGA3, ITGA4, ITGA5, ITGA6, ITGA7, ITGA8, ITGA9, ITGAD, ITGAE, ITGAL, ITGAM, ITGAV, ITGAX, ITGB1, ITGB2, ITGB3, ITGB3BP, ITGB4, ITGB5, ITGB6, ITGB7, ITGB8, ITIH2, IZUMO1, IZUMO1R, JAG1, JAG2, JAM2, JAM3, JAML, JMJD6, KCNA3, KCND1, KCND2, KCNJ10, KCNJ15, KCNJ4, KCNN4, KCNQ1, KCNQ3, KCNQ5, KDR, KEL, KIDINS220, KIR2DL1, KIR2DL2, KIR2DL3, KIR2DL4, KIR2DS1, KIR2DS4, KIR3DL1, KIR3DL2, KIR3DL3, KIR3DS1, KISS1, KISS1R, KIT, KITLG, KL, KLB, KLK3, KLRB1, KLRC1, KLRC2, KLRC3, KLRD1, KLRF1, KLRF2, KLRG1, KLRG2, KLRK1, KMT2E, KNG1, KREMEN1, KREMEN2, L1CAM, LACRT, LAG3, LAIR1, LAMA1, LAMA2, LAMA3, LAMA4, LAMA5, LAMB1, LAMB2, LAMB3, LAMC1, LAMC2, LAMC3, LAMP1, LAMP2, LCK, LCN1, LCN2, LDLR, LEAP2, LEFTY1, LEFTY2, LEP, LEPR, LGALS1, LGALS3, LGALS3BP, LGALS8, LGALS9, LGI1, LGI2, LGI3, LGI4, LGR4, LGR5, LGR6, LHB, LHCGR, LIF, LIFR, LILRA1, LILRA3, LILRA4, LILRB1, LILRB2, LILRB3, LILRB4, LIN7C, LINGO1, LIPC, LIPH, LMAN1, LMBR1L, LPA, 
LPAR1, LPAR2, LPAR3, LPAR4, LPL, LPP, LRIG1, LRIG2, LRP1, LRP10, LRP11, LRP1B, LRP2, LRP4, LRP5, LRP6, LRP8, LRPAP1, LRRC4, LRRC4B, LRRC4C, LRRN3, LRRTM2, LSR, LTA, LTB, LTBP1, LTBP3, LTBR, LTF, LTK, LUM, LVRN, LY86, LY96, LYPD3, LYVE1, MADCAM1, MAG, MAGED1, MAML2, MARCO, MAS1, MATN1, MBL2, MC1R, MC2R, MC3R, MC4R, MC5R, MCAM, MCFD2, MCHR1, MCHR2, MDK, MEGF10, MELTF, MEPE, MERTK, MET, MFAP2, MFAP3L, MFAP5, MFGE8, MFNG, MFRP, MGRN1, MIA, MICA, MICB, MIF, MILR1, MIP, MLN, MLNR, MMP1, MMP12, MMP13, MMP2, MMP24, MMP7, MMP9, MMRN2, MOG, MPIG6B, MPL, MPZ, MPZL1, MRAP, MRC1, MRC2, MRGPRX1, MRGPRX2, MS4A4A, MSMP, MST1, MST1R, MSTN, MTNR1A, MTNR1B, MTTP, MUC1, MUC5AC, MUC6, MUSK, MXRA5, MYL9, MYOC, NAMPT, NCAM1, NCAM2, NCAN, NCL, NCR1, NCR2, NCR3, NCR3LG1, NCSTN, NDP, NECTIN1, NECTIN2, NECTIN3, NECTIN4, NEGR1, NELL2, NEO1, NETO2, NFASC, NGF, NGFR, NID1, NLGN1, NLGN2, NLGN3, NLGN4X, NMB, NMBR, NMS, NMU, NMUR1, NMUR2, NODAL, NOTCH1, NOTCH2, NOTCH3, NOTCH4, NPB, NPBWR1, NPBWR2, NPFF, NPFFR1, NPFFR2, NPNT, NPPA, NPPB, NPPC, NPR1, NPR2, NPR3, NPS, NPSR1, NPTN, NPTX1, NPTX2, NPTXR, NPVF, NPW, NPY, NPY1R, NPY2R, NPY4R, NPY5R, NPY6R, NR0B2, NRCAM, NRG1, NRG2, NRG3, NRG4, NRP1, NRP2, NRSN1, NRTN, NRXN1, NRXN2, NRXN3, NT5E, NTF3, NTF4, NTN1, NTN3, NTN4, NTNG1, NTNG2, NTRK1, NTRK2, NTRK3, NTS, NTSR1, NTSR2, NUCB2, NXPH1, NXPH2, NXPH3, OBP2A, OCLN, OGFR, OLFM2, OLR1, OMG, OPRD1, OPRK1, OPRL1, OPRM1, OR1G1, ORAI2, ORM1, OSM, OSMR, OSTN, OXT, OXTR, P2RX7, P2RY12, P2RY14, P2RY6, P4HB, PAM, PARD3, PCNA, PCSK1N, PCSK9, PDAP1, PDCD1, PDCD1LG2, PDCD2, PDE1A, PDE1B, PDE1C, PDGFA, PDGFB, PDGFC, PDGFD, PDGFRA, PDGFRB, PDPN, PDX1, PDYN, PECAM1, PENK, PF4, PF4V1, PGF, PGLYRP1, PHEX, PI16, PI3, PIGA, PIGF, PIGR, PILRA, PILRB, PIP, PITPNM3, PKM, PLA2G10, PLA2G2A, PLA2R1, PLAT, PLAU, PLAUR, PLD1, PLD2, PLG, PLGRKT, PLSCR1, PLSCR4, PLTP, PLXDC1, PLXDC2, PLXNA1, PLXNA2, PLXNA3, PLXNA4, PLXNB1, PLXNB2, PLXNB3, PLXNC1, PLXND1, PMCH, PNOC, PODXL, PODXL2, POMC, POSTN, PPBP, PPY, PRG4, PRL, PRLH, PRLHR, 
PRLR, PRND, PROC, PROCR, PROK1, PROK2, PROKR1, PROKR2, PROS1, PRSS1, PRSS2, PRSS3, PRTG, PSAP, PSEN1, PSG4, PSG5, PSG6, PSG7, PSG9, PSPN, PTCH1, PTCH2, PTDSS1, PTGDR, PTGDR2, PTGER2, PTGER3, PTGER4, PTGIR, PTGS2, PTH, PTH1R, PTH2, PTH2R, PTHLH, PTK7, PTN, PTPN11, PTPN6, PTPRA, PTPRB, PTPRC, PTPRD, PTPRF, PTPRG, PTPRJ, PTPRK, PTPRM, PTPRR, PTPRS, PTPRU, PTPRZ1, PVR, PYY, QDPR, QRFP, QRFPR, RACK1, RAET1E, RAET1G, RAET1L, RAMP1, RAMP2, RAMP3, RARRES1, RARRES2, RBP3, RBP4, RECK, RELN, REN, RET, RETN, RGMA, RGMB, RHAG, RHBDF2, RHBDL2, RIMS1, RIMS2, RIPK1, RLN1, RLN2, RLN3, RNASE2, RNF43, ROBO1, ROBO2, ROBO3, ROBO4, ROR1, ROR2, RPS19, RPSA, RSPO1, RSPO2, RSPO3, RSPO4, RTN4, RTN4R, RTN4RL1, RTN4RL2, RXFP1, RXFP2, RXFP3, RXFP4, RXRA, RYK, RYR1, RYR2, S100A1, S100A10, S100A12, S100A4, S100A8, S100A9, S100B, S1PR1, S1PR2, S1PR3, S1PR4, S1PR5, SAA1, SCARA5, SCARB1, SCARF1, SCEL, SCGB1A1, SCGB3A1, SCGB3A2, SCN10A, SCN2B, SCN4A, SCN5A, SCN8A, SCT, SCTR, SCUBE2, SDC1, SDC2, SDC3, SDC4, SDK2, SECTM1, SELE, SELL, SELP, SELPLG, SEMA3A, SEMA3B, SEMA3C, SEMA3D, SEMA3E, SEMA3F, SEMA3G, SEMA4A, SEMA4B, SEMA4C, SEMA4D, SEMA4F, SEMA4G, SEMA5A, SEMA5B, SEMA6A, SEMA6B, SEMA6D, SEMA7A, SERPINA1, SERPINA7, SERPINC1, SERPINE1, SERPINE2, SERPINF1, SERPING1, SERTAD1, SFRP1, SFRP2, SFTPA1, SFTPA2, SFTPD, SHANK1, SHANK2, SHBG, SHH, SIGIRR, SIGLEC1, SIGLEC10, SIGLEC5, SIGLEC6, SIGLEC7, SIGLEC8, SIGLEC9, SIRPA, SIRPB2, SIRPG, SLAMF9, SLC16A1, SLC16A2, SLC16A7, SLC17A7, SLC18A2, SLC18A3, SLC1A5, SLC2A2, SLC37A1, SLC40A1, SLC4A11, SLC6A8, SLIT1, SLIT2, SLIT3, SLITRK1, SLITRK2, SLITRK3, SLITRK5, SLITRK6, SLPI, SLURP1, SLURP2, SMAP1, SMO, SNCA, SNX14, SOCS2, SORBS1, SORCS2, SORCS3, SORL1, SORT1, SOST, SOSTDC1, SPARC, SPINK1, SPINT1, SPN, SPON1, SPON2, SPP1, SPTAN1, SPTBN2, SPX, SST, SSTR1, SSTR2, SSTR3, SSTR4, SSTR5, ST14, ST6GAL1, STAB1, STAB2, STRA6, STX1A, STX3, STX4, TAC1, TAC3, TAC4, TACR1, TACR2, TACR3, TAFA4, TARM1, TBXA2R, TCN1, TCN2, TCTN1, TDGF1, TECTA, TECTB, TEK, TF, TFF1, TFF2, TFF3, TFPI, 
TFR2, TFRC, TG, TGFA, TGFB1, TGFB2, TGFB3, TGFBR1, TGFBR2, TGFBR3, TGM2, TGS1, THBD, THBS1, THBS2, THBS3, THBS4, THPO, THY1, TIE1, TIGIT, TIMD4, TIMP1, TIMP2, TIMP3, TLN1, TLR1, TLR2, TLR4, TLR5, TLR6, TLR7, TLR9, TMED5, TMEM219, TMEM67, TMIGD2, TMIGD3, TNC, TNF, TNFRSF10A, TNFRSF10B, TNFRSF10C, TNFRSF10D, TNFRSF11A, TNFRSF11B, TNFRSF12A, TNFRSF13B, TNFRSF13C, TNFRSF14, TNFRSF17, TNFRSF18, TNFRSF19, TNFRSF1A, TNFRSF1B, TNFRSF21, TNFRSF25, TNFRSF4, TNFRSF6B, TNFRSF8, TNFRSF9, TNFSF10, TNFSF11, TNFSF12, TNFSF13, TNFSF13B, TNFSF14, TNFSF15, TNFSF18, TNFSF4, TNFSF8, TNFSF9, TNN, TNR, TNXB, TOR2A, TPH1, TPO, TPSAB1, TPSB2, TRADD, TRAF2, TRAF3, TREM1, TREM2, TREML2, TREML4, TRH, TRHR, TRPC3, TRPC5, TRPM2, TRPM3, TRPV1, TRPV6, TSHB, TSHR, TSLP, TSPAN1, TSPAN10, TSPAN12, TSPAN14, TSPAN15, TSPAN17, TSPAN5, TTR, TXLNA, TYRO3, TYROBP, UCN, UCN2, UCN3, ULBP1, ULBP2, ULBP3, UNC5A, UNC5B, UNC5C, UNC5D, UTS2, UTS2B, UTS2R, VANGL2, VASN, VASP, VCAM1, VCAN, VCL, VEGFA, VEGFB, VEGFC, VEGFD, VGF, VIM, VIP, VIPR1, VIPR2, VLDLR, VSIG10L, VSIR, VSTM1, VTN, VWF, WFIKKN2, WIF1, WNT1, WNT10A, WNT10B, WNT11, WNT16, WNT2, WNT2B, WNT3, WNT3A, WNT4, WNT5A, WNT5B, WNT6, WNT7A, WNT7B, WNT8A, WNT8B, WNT9A, WNT9B, XCL1, XCL2, XCR1, YBX1, ZG16B, ZNRF3, ZP3] missing from var_names

My question is whether I'm getting this error because my 13,062-gene dataset is missing too many resource genes. If so, which argument controls the minimum accepted overlap proportion? (The error message suggests the default is 0.98, but which argument changes it?)
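To see how far a dataset is from that threshold, the missing proportion can be computed by hand. The sketch below is not liana's implementation: `missing_proportion` is a hypothetical helper, and the toy gene lists stand in for the resource's ligand/receptor symbols and for `adata.var_names`.

```python
import numpy as np

def missing_proportion(resource_genes, var_names):
    """Return the fraction of unique resource genes absent from var_names,
    together with the absent genes themselves (sorted)."""
    resource_genes = np.unique(np.asarray(resource_genes, dtype=str))
    missing = np.setdiff1d(resource_genes, np.asarray(var_names, dtype=str))
    return len(missing) / len(resource_genes), missing

# Toy stand-ins (assumptions) for the resource symbols and adata.var_names
resource_genes = ["LGALS9", "PTPRC", "MET", "CD44"]
var_names = ["CD44", "MET", "ZYX"]

prop, missing = missing_proportion(resource_genes, var_names)
print(f"{prop:.2f} of resource genes missing: {missing.tolist()}")
```

A value of 1.00, as reported in the error above, means that no resource symbol was matched at all.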

Thanks

ndrubins added the bug and help wanted labels on Jan 27, 2024
@dbdimitrov
Collaborator

Hi @ndrubins, is your dataset with human gene symbols?

If yes then this might indeed be a bug, and in that case I might need a small subset of your data to debug.

Daniel

@ndrubins
Author

Hi Daniel,

Thanks a lot for the quick response.

My data are originally from mouse, but prior to running liana multi df_to_lr I converted them to human gene symbols.

Here's the code I'm running liana multi df_to_lr with:

import numpy as np
import pandas as pd
import scanpy as sc
import liana as li
import decoupler as dc
import omnipath as op
import pyreadr

#read data with human gene symbols
adata = sc.read('/home/rnd/adata.h5ad')
#set adata.X (the count matrix) as adata.layers['counts']
adata.layers['counts'] = adata.X

sample_key = 'sample'
groupby = 'cell_type'
condition_key = 'age'

adata = adata[adata.obs[condition_key]=='old'].copy()
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)

>>> adata
AnnData object with n_obs × n_vars = 57455 × 13062
    obs: 'sample', 'age', 'cell_type', 'nCount_RNA', 'nFeature_RNA', 'cell_abbr'
    var: 'gene_id', 'symbol', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
    uns: 'log1p'
    layers: 'counts'
>>> adata.var
       gene_id  symbol  highly_variable     means  dispersions  dispersions_norm      mean       std
SAMD3    SAMD3   SAMD3             True  0.117891     1.808869          1.142210  0.046311  0.271780
BIRC3    BIRC3   BIRC3            False  0.780688     1.638153          0.087827  0.450895  0.709419
NAPA      NAPA    NAPA            False  0.641963     1.116005         -0.553606  0.411139  0.612375
BRD4      BRD4    BRD4            False  0.798999     1.220429         -0.574386  0.527698  0.680623
BRK1      BRK1    BRK1            False  0.916493     1.070140         -0.812639  0.640491  0.715506
...        ...     ...              ...       ...          ...               ...       ...       ...
ZXDC      ZXDC    ZXDC            False  0.186980     1.157657         -0.207838  0.100781  0.334271
ZYG11B  ZYG11B  ZYG11B            False  0.238547     1.172030         -0.178040  0.131239  0.377451
ZYX        ZYX     ZYX            False  1.420814     2.341516          0.269752  0.840340  0.998730
ZZEF1    ZZEF1   ZZEF1            False  0.385198     1.207996         -0.382832  0.222435  0.479948
ZZZ3      ZZZ3    ZZZ3            False  0.310629     1.236055         -0.045309  0.174271  0.430775

[13062 rows x 8 columns]

# read the precomputed age (old vs. young) differential expression data.frame (created in R and saved as an RDS file)
de_df = pyreadr.read_r('/home/rnd/de.df.RDS')
de_df = de_df[None]
de_df = de_df.reset_index().rename(columns={'level_0': groupby}).set_index('index')

>>> de_df
        cell_type cell_abbr    baseMean  log2FoldChange      stat    pvalue      padj
index
A4GALT          0     AT1.1    0.333333            -inf  1.282898  0.161630  0.329362
AAAS            1     AT1.1    0.333333            -inf  1.282898  0.161630  0.329362
AACS            2     AT1.1    1.000000            -inf  0.335624  0.717376  0.857067
AAGAB           3     AT1.1    1.000000        0.000000  1.775477  0.027807  0.100003
AAK1            4     AT1.1    1.666667            -inf -0.151318  0.869784  0.942332
...           ...       ...         ...             ...       ...       ...       ...
ZXDC       762907      vein   24.750000        0.634205  0.148733  0.563260  0.900345
ZYG11B     762908      vein   37.250000        0.530760  0.005796  0.973200  0.992934
ZYX        762909      vein  124.250000        0.726780  0.249147  0.161248  0.603470
ZZEF1      762910      vein   52.750000        0.537943  0.007226  0.968682  0.991453
ZZZ3       762911      vein   38.750000        0.397549 -0.160620  0.421224  0.823931

#run liana multi.df_to_lr
lr_res = li.multi.df_to_lr(adata,
                           dea_df=de_df,
                           resource_name='consensus',
                           expr_prop=0.1, # calculated for adata as passed - used to filter interactions
                           groupby=groupby,
                           stat_keys=['stat', 'pvalue', 'padj'],
                           use_raw=False,
                           complex_col='stat', # NOTE: we use the Wald Stat to deal with complexes
                           verbose=True,
                           return_all_lrs=False,
                           )

Which produces the error in my first post.

Happy to send you a subset of the data - let me know how and what.

Thanks!

@dbdimitrov
Collaborator

Hi @ndrubins,

Yeah, your input and data format look correct. You can send a link to e.g. google drive or figshare with a subset to daniel.dimitrov (аt) uni-heidelberg.de.

I'll do my best to address the issue ASAP.

Daniel

@dbdimitrov
Collaborator

Hi @ndrubins,

Thanks for sharing the data. I found a couple of issues.

  1. I believe you should pass groupby='cell_abbr', because the 'cell_type' column in your dea_df contains only running index values (one per row of the dataframe) rather than cell type labels.
  2. The .rds object seems to be corrupted (I couldn't load it in R 4.2), and there is an issue with the 'cell_abbr' column. You could just export a CSV instead; it would make your life easier :)

Other than that your data worked on liana v1.0.4 (latest on pip).
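A quick way to catch the first problem before calling df_to_lr is to compare the group labels in the dea_df with those in adata.obs. This is a minimal sketch using pandas only; `compare_group_labels` is a hypothetical helper, and the toy `obs` table (including its 'cell_type' values) is made up for illustration.

```python
import pandas as pd

def compare_group_labels(dea_df, obs, groupby):
    """Compare the labels found in dea_df[groupby] with those in obs[groupby]."""
    dea_labels = set(dea_df[groupby].astype(str))
    obs_labels = set(obs[groupby].astype(str))
    return {
        "shared": sorted(dea_labels & obs_labels),
        "only_in_dea_df": sorted(dea_labels - obs_labels),
        "only_in_obs": sorted(obs_labels - dea_labels),
    }

# Toy stand-in for adata.obs (assumption: values are illustrative)
obs = pd.DataFrame({
    "cell_type": ["AT1", "AT1", "vein"],
    "cell_abbr": ["AT1.1", "AT1.1", "vein"],
})

# Toy dea_df in which 'cell_type' holds running index values, as in the thread
dea_df = pd.DataFrame(
    {
        "cell_type": [0, 1, 2],
        "cell_abbr": ["AT1.1", "AT1.1", "vein"],
        "stat": [1.28, 0.34, 0.15],
    },
    index=["A4GALT", "AACS", "ZXDC"],
)

bad = compare_group_labels(dea_df, obs, "cell_type")
good = compare_group_labels(dea_df, obs, "cell_abbr")
print(bad["shared"])   # no labels in common
print(good["shared"])  # labels match
```

If "shared" comes back empty for the column passed as groupby, the dea_df and the AnnData object do not describe the same groups.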

Hope this helps.
Daniel

@ndrubins
Author

ndrubins commented Feb 1, 2024

Thanks a lot.

I'm still having one issue with the dea_df.

What I'm reading in (from the RDS file) is:

>>> de_df
         index cell_abbr    baseMean  log2FoldChange      stat    pvalue      padj
0       A4GALT     AT1.1    0.333333            -inf  1.282898  0.161630  0.329362
1         AAAS     AT1.1    0.333333            -inf  1.282898  0.161630  0.329362
2         AACS     AT1.1    1.000000            -inf  0.335624  0.717376  0.857067
3        AAGAB     AT1.1    1.000000        0.000000  1.775477  0.027807  0.100003
4         AAK1     AT1.1    1.666667            -inf -0.151318  0.869784  0.942332
...        ...       ...         ...             ...       ...       ...       ...
762907    ZXDC      vein   24.750000        0.634205  0.148733  0.563260  0.900345
762908  ZYG11B      vein   37.250000        0.530760  0.005796  0.973200  0.992934
762909     ZYX      vein  124.250000        0.726780  0.249147  0.161248  0.603470
762910   ZZEF1      vein   52.750000        0.537943  0.007226  0.968682  0.991453
762911    ZZZ3      vein   38.750000        0.397549 -0.160620  0.421224  0.823931

How should I index it so that it's correctly formatted for liana's multi.df_to_lr?

Thanks a lot

@dbdimitrov
Collaborator

Hi @ndrubins,

It's better to export it as a CSV from R to ensure there are no buggy columns (in the file you shared, cell_abbr was bugged). Other than that, the gene symbols need to be the index of the dataframe, and it should work.
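As a rough illustration of that advice, the gene symbols can be made the index either while reading the CSV or afterwards. The sketch below uses an in-memory CSV so it runs on its own; the column name `gene` and the three rows are assumptions, not the actual exported file.

```python
import io
import pandas as pd

# Stand-in for a CSV exported from R (assumption: symbols are in a 'gene' column)
csv_text = """gene,cell_abbr,stat,pvalue,padj
A4GALT,AT1.1,1.282898,0.161630,0.329362
AACS,AT1.1,0.335624,0.717376,0.857067
ZZZ3,vein,-0.160620,0.421224,0.823931
"""

# Option 1: set the index while reading
dea_df = pd.read_csv(io.StringIO(csv_text), index_col="gene")

# Option 2: the frame is already loaded with the symbols in an 'index' column
# and a default integer index, as in the printout above
de_df = pd.read_csv(io.StringIO(csv_text)).rename(columns={"gene": "index"})
dea_df_2 = de_df.set_index("index")

print(dea_df.index.tolist())
print(dea_df_2.head())
```

Either way, the resulting dataframe has one row per gene and group, with the gene symbol as the row label.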

@ndrubins
Author

ndrubins commented Feb 2, 2024

Thanks a lot.

ndrubins closed this as completed on Feb 2, 2024
@ndrubins
Author

ndrubins commented Mar 20, 2024

Hi @dbdimitrov,

Sorry to bother you again. I'm trying to run multi.df_to_lr again (similar to how I did before) but this time I'm providing the function a custom pandas dataframe with [ligand, receptor] columns for the resource argument and it's giving this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/liana/multi/df_to_lr.py", line 148, in df_to_lr
    assert_covered(np.union1d(np.unique(resource["ligand"]),
  File "/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/liana/method/_pipe_utils/_pre.py", line 60, in assert_covered
    raise ValueError(msg + f" [{x_missing}] missing from {superset_name}")
ValueError: Please check if appropriate organism/ID type was provided! Allowed proportion (0.98) of missing resource elements exceeded (1.00). Too few features from the resource were found in the data

I have a feeling it's an indexing issue, because I get the same error if I use this OmniPath consensus ligand/receptor dataframe:

>>> omnipath_consensus_df
      ligand receptor
0     LGALS9    PTPRC
1     LGALS9      MET
2     LGALS9     CD44
3     LGALS9     LRP1
4     LGALS9     CD47
...      ...      ...
4820    BMP2    ACTR2
4821   BMP15    ACTR2
4822    CSF1    CSF3R
4823   IL36G   IFNAR1
4824   IL36G   IFNAR2

[4825 rows x 2 columns]

>>> type(omnipath_consensus_df)
<class 'pandas.core.frame.DataFrame'>

I obtained this omnipath_consensus_df with this R code:

library(dplyr)
liana.consensus.df <- liana::select_resource("Consensus")[[1]] %>% liana:::decomplexify() %>%
  dplyr::select(source_genesymbol,target_genesymbol) %>% 
  dplyr::rename(ligand=source_genesymbol,receptor=target_genesymbol)

Any idea?

Thanks

@dbdimitrov
Collaborator

Hi @ndrubins, it looks correct from what you're showing. I also tried passing an external resource to the function, and it worked for me. Can you also show what the dea_df looks like? And adata.var?
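When comparing the three tables, a small summary of how many resource symbols appear in each can narrow down where the mismatch is. This is only a sketch with pandas: `symbol_overlap` is a hypothetical helper, and the toy inputs stand in for the custom resource, `adata.var_names`, and the dea_df index.

```python
import pandas as pd

def symbol_overlap(resource, var_names, dea_index):
    """Count how many unique resource symbols occur in var_names and in the dea_df index."""
    symbols = pd.Index(
        pd.concat([resource["ligand"], resource["receptor"]]).astype(str).unique()
    )
    in_var = symbols.isin(pd.Index(var_names).astype(str))
    in_dea = symbols.isin(pd.Index(dea_index).astype(str))
    return {
        "n_resource": len(symbols),
        "in_var_names": int(in_var.sum()),
        "in_dea_df": int(in_dea.sum()),
        "in_both": int((in_var & in_dea).sum()),
    }

# Toy stand-ins (assumptions) for the resource, adata.var_names and dea_df.index
resource = pd.DataFrame({
    "ligand": ["LGALS9", "LGALS9", "BMP2"],
    "receptor": ["PTPRC", "CD44", "ACTR2"],
})
var_names = ["CD44", "PTPRC", "ZYX"]
dea_index = ["CD44", "ZYX", "ZYX"]  # one row per gene and group, so repeats are expected

summary = symbol_overlap(resource, var_names, dea_index)
print(summary)
```

A zero in any of the counts shows which of the tables fails to share symbols with the resource.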

Thanks

dbdimitrov reopened this on Mar 20, 2024
@ndrubins
Author

ndrubins commented Mar 20, 2024

Of course.

>>> adata
AnnData object with n_obs × n_vars = 57455 × 13167
    obs: 'sample', 'age', 'nCount_RNA', 'nFeature_RNA', 'cell_abbr'
    var: 'gene_id', 'symbol', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
    uns: 'log1p'
    layers: 'counts'
>>> adata.var
       gene_id  symbol  highly_variable     means  dispersions  dispersions_norm      mean       std
SAMD14  SAMD14  SAMD14            False  0.017770     1.333416          0.156533  0.008652  0.102023
BIRC6    BIRC6   BIRC6            False  0.783255     1.259732         -0.512081  0.512168  0.676008
NAP1L3  NAP1L3  NAP1L3            False  0.017006     1.121743         -0.282292  0.008741  0.099460
BRD8      BRD8    BRD8            False  0.375344     1.246635         -0.311100  0.210921  0.478002
BRMS1L  BRMS1L  BRMS1L            False  0.154792     1.207570         -0.104361  0.081493  0.304332
...        ...     ...              ...       ...          ...               ...       ...       ...
ZXDC      ZXDC    ZXDC            False  0.186980     1.157657         -0.207838  0.100781  0.334271
ZYG11B  ZYG11B  ZYG11B            False  0.238547     1.172030         -0.178040  0.131239  0.377451
ZYX        ZYX     ZYX            False  1.420814     2.341516          0.269752  0.840340  0.998730
ZZEF1    ZZEF1   ZZEF1            False  0.385198     1.207996         -0.382832  0.222435  0.479948
ZZZ3      ZZZ3    ZZZ3            False  0.310629     1.236055         -0.045309  0.174271  0.430775

[13167 rows x 8 columns]
>>> dea_df
                                   cell_abbr  baseMean  log2FoldChange      stat    pvalue      padj
index
AAK1                               B.cell  1.138393       -0.360604 -0.464941  0.063978  0.397473
AAMP                               B.cell  1.125492        0.126192  0.296059  0.053633  0.382694
AATF                               B.cell  1.090547       -0.093669 -0.074989  0.597805  0.788567
ABCA1                              B.cell  1.246906       -0.433987 -0.575483  0.050428  0.377990
ABCB7                              B.cell  1.145998       -0.243887 -0.241885  0.187322  0.550212
...                                      ...       ...             ...       ...       ...       ...
ZXDC    macrophage  1.136842       -1.504077 -0.135634  0.560524  0.814248
ZYG11B  macrophage  1.200000       -1.398717 -0.054283  0.820323  0.938793
ZYX     macrophage  1.821522       -1.436526 -0.112890  0.367168  0.691145
ZZEF1   macrophage  1.425249       -1.225175  0.196901  0.233626  0.565311
ZZZ3    macrophage  1.290043       -1.477586 -0.162519  0.365358  0.689764

[280869 rows x 6 columns]
>>> omnipath_consensus_df
      ligand receptor
0     LGALS9    PTPRC
1     LGALS9      MET
2     LGALS9     CD44
3     LGALS9     LRP1
4     LGALS9     CD47
...      ...      ...
4820    BMP2    ACTR2
4821   BMP15    ACTR2
4822    CSF1    CSF3R
4823   IL36G   IFNAR1
4824   IL36G   IFNAR2

[4825 rows x 2 columns]

And this is the complete output I'm getting when running it with multi.df_to_lr:

Converting `cell_abbr` to categorical!
Using provided `resource`.
Using `.X`!
/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/anndata/_core/anndata.py:522: FutureWarning: The dtype argument is deprecated and will be removed in late 2024.
/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/anndata/_core/anndata.py:1906: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/anndata/_core/anndata.py:1906: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
98 features of mat are empty, they will be removed.
/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/liana/method/_pipe_utils/_pre.py:148: ImplicitModificationWarning: Trying to modify attribute `.obs` of view, initializing view as actual.
/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/anndata/_core/anndata.py:1906: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/anndata/_core/anndata.py:1906: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/liana/method/_pipe_utils/_pre.py:151: FutureWarning: The default of observed=False is deprecated and will be changed to True in a future version of pandas. Pass observed=False to retain current behavior or observed=True to adopt the future default and silence this warning.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/liana/multi/df_to_lr.py", line 148, in df_to_lr
    assert_covered(np.union1d(np.unique(resource["ligand"]),
  File "/home/rnd/miniconda/envs/py3_9/lib/python3.9/site-packages/liana/method/_pipe_utils/_pre.py", line 60, in assert_covered
    raise ValueError(msg + f" [{x_missing}] missing from {superset_name}")
ValueError: Please check if appropriate organism/ID type was provided! Allowed proportion (0.98) of missing resource elements exceeded (1.00). Too few features from the resource were found in the data. [A1BG, A2M, AANAT, ABCA1, ACE, ACKR1, ACKR2, ACKR3, ACKR4, ACTR2, ACVR1, ACVR1B, ACVR1C, ACVR2A, ACVR2B, ACVRL1, ADA, ADAM10, ADAM11, ADAM12, ADAM15, ADAM17, ADAM2, ADAM22, ADAM23, ADAM28, ADAM29, ADAM7, ADAM9, ADAMTS3, ADCY1, ADCY7, ADCY8, ADCY9, ADCYAP1, ADCYAP1R1, ADGRA2, ADGRB1, ADGRE2, ADGRE5, ADGRG1, ADGRG3, ADGRG5, ADGRL1, ADGRL4, ADGRV1, ADIPOQ, ADIPOR1, ADIPOR2, ADM, ADM2, ADO, ADORA1, ADORA2A, ADORA2B, ADORA3, ADRA2A, ADRA2B, ADRB1, ADRB2, ADRB3, AFDN, AGER, AGR2, AGRN, AGRP, AGT, AGTR1, AGTR2, AGTRAP, AHSG, AIMP1, ALB, ALCAM, ALK, ALKAL1, ALKAL2, ALOX5, AMBN, AMELX, AMELY, AMFR, AMH, AMHR2, ANG, ANGPT1, ANGPT2, ANGPT4, ANGPTL1, ANGPTL2, ANGPTL3, ANGPTL4, ANGPTL7, ANOS1, ANTXR1, ANXA1, ANXA2, APCDD1, APELA, APLN, APLNR, APLP1, APLP2, APOA1, APOA2, APOA4, APOB, APOC1, APOC2, APOC3, APOC4, APOD, APOE, APOO, APP, AQP1, AQP5, AQP6, AR, AREG, ARF1, ARF6, ARPC5, ART1, ARTN, ASGR1, ASGR2, ASIP, ATP1A3, ATP5F1B, ATP6AP2, ATRN, AVP, AVPR1A, AVPR1B, AVPR2, AXL, AZGP1, B2M, BAG6, BAMBI, BCAM, BCAN, BDKRB1, BDKRB2, BDNF, BEX3, BGN, BMP1, BMP10, BMP15, BMP2, BMP3, BMP4, BMP5, BMP6, BMP7, BMP8A, BMP8B, BMPR1A, BMPR1B, BMPR2, BOC, BPI, BRS3, BSG, BST1, BST2, BTC, BTLA, BTN1A1, BTN3A2, BTN3A3, C1QA, C1QB, C1QBP, C1QTNF1, C1QTNF5, C3, C3AR1, C4A, C4B, C4BPA, C5, C5AR1, C5AR2, CACNA1C, CADM1, CADM3, CALCA, CALCB, CALCR, CALCRL, CALM1, CALM2, CALM3, CALML3, CALR, CAMP, CANX, CAP1, CATSPER1, CAV1, CCBE1, CCK, CCKAR, CCKBR, CCL1, CCL11, CCL13, CCL14, CCL15, CCL16, CCL17, CCL18, CCL19, CCL2, CCL20, CCL21, CCL22, CCL23, CCL24, CCL25, CCL26, CCL27, CCL28, CCL3, CCL3L1, CCL4, CCL4L1, CCL5, CCL7, CCL8, CCN1, CCN2, CCN3, CCN4, CCN6, CCR1, CCR10, CCR2, CCR3, CCR4, CCR5, CCR6, CCR7, CCR8, CCR9, CCRL2, CD14, CD151, CD163, CD177, CD180, CD19, CD1A, CD1B, CD1C, CD1D, CD2, CD200, CD200R1, 
CD200R1L, CD209, CD22, CD226, CD24, CD244, CD247, CD248, CD27, CD274, CD28, CD300LB, CD300LF, CD320, CD33, CD34, CD36, CD38, CD3D, CD3G, CD4, CD40, CD40LG, CD44, CD46, CD47, CD48, CD5, CD52, CD53, CD55, CD58, CD59, CD5L, CD6, CD63, CD68, CD69, CD7, CD70, CD72, CD74, CD79A, CD80, CD81, CD82, CD86, CD8A, CD8B, CD8B2, CD9, CD93, CD96, CD99, CD99L2, CDH1, CDH10, CDH11, CDH2, CDH5, CDH7, CDON, CEACAM1, CEACAM16, CEACAM19, CEACAM5, CEACAM6, CEACAM8, CEL, CELSR1, CELSR2, CELSR3, CER1, CFC1, CFC1B, CFH, CFP, CFTR, CGA, CGB8, CGN, CHAD, CHL1, CHRM1, CHRM3, CHRNA10, CHRNA3, CHRNA4, CHRNA7, CHRNA9, CHRNB2, CHRNB4, CIRBP, CKLF, CLCF1, CLDN4, CLEC10A, CLEC11A, CLEC12A, CLEC14A, CLEC1B, CLEC2A, CLEC2B, CLEC2D, CLEC3A, CLEC4A, CLEC4G, CLEC4M, CMKLR1, CMKLR2, CNGA2, CNMD, CNR1, CNR2, CNTF, CNTFR, CNTN1, CNTN2, CNTN3, CNTN4, CNTN5, CNTN6, CNTNAP1, COL10A1, COL11A1, COL11A2, COL12A1, COL13A1, COL14A1, COL15A1, COL16A1, COL17A1, COL18A1, COL19A1, COL1A1, COL1A2, COL20A1, COL21A1, COL22A1, COL24A1, COL26A1, COL27A1, COL28A1, COL2A1, COL3A1, COL4A1, COL4A2, COL4A3, COL4A4, COL4A5, COL4A6, COL5A1, COL5A2, COL5A3, COL6A1, COL6A2, COL6A3, COL6A5, COL6A6, COL7A1, COL8A1, COL8A2, COL9A1, COL9A2, COL9A3, COLEC12, COLQ, COMP, COPA, CORT, CP, CPAMD8, CR1, CR2, CRH, CRHR1, CRHR2, CRISP2, CRISP3, CRLF1, CRLF2, CRLF3, CRP, CRTAM, CSF1, CSF1R, CSF2, CSF2RA, CSF2RB, CSF3, CSF3R, CSH1, CSH2, CSHL1, CSPG4, CTF1, CTHRC1, CTLA4, CUBN, CX3CL1, CX3CR1, CXADR, CXCL1, CXCL10, CXCL11, CXCL12, CXCL13, CXCL14, CXCL16, CXCL17, CXCL2, CXCL3, CXCL5, CXCL6, CXCL8, CXCL9, CXCR1, CXCR2, CXCR3, CXCR4, CXCR5, CXCR6, CYTL1, DAG1, DBP, DCBLD2, DCC, DCHS1, DCLK3, DCN, DDR1, DDR2, DEFB1, DEFB103A, DEFB106B, DEFB4B, DHH, DIP2A, DKK1, DKK2, DKK3, DKK4, DLK1, DLK2, DLL1, DLL3, DLL4, DMP1, DNAJB11, DPP4, DRAXIN, DRD2, DRD4, DSC1, DSC2, DSC3, DSCAM, DSG1, DSG2, DSG3, DSG4, DSPP, DYSF, EBI3, ECM1, EDA, EDA2R, EDAR, EDIL3, EDN1, EDN2, EDN3, EDNRA, EDNRB, EFEMP1, EFEMP2, EFNA1, EFNA2, EFNA3, EFNA4, EFNA5, EFNB1, EFNB2, EFNB3, 
EGF, EGFR, ENAM, ENG, ENHO, ENO1, ENPEP, ENTPD1, EPGN, EPHA1, EPHA10, EPHA2, EPHA3, EPHA4, EPHA5, EPHA6, EPHA7, EPHA8, EPHB1, EPHB2, EPHB3, EPHB4, EPHB6, EPO, EPOR, ERAP1, ERBB2, ERBB3, ERBB4, EREG, ERFE, ESAM, ESR1, ETV5, F10, F11, F11R, F12, F13A1, F2, F2R, F2RL1, F2RL2, F2RL3, F3, F7, F8, F9, FABP5, FADD, FAM3B, FAM3C, FAM3D, FAP, FARP2, FAS, FASLG, FAT4, FBLN1, FBLN2, FBN1, FCAMR, FCER1A, FCER2, FCGR1A, FCGR2A, FCGR2B, FCGR3B, FCGRT, FCN2, FCRL1, FCRL4, FFAR2, FGA, FGB, FGF1, FGF10, FGF11, FGF12, FGF13, FGF14, FGF16, FGF17, FGF18, FGF19, FGF2, FGF20, FGF21, FGF22, FGF23, FGF3, FGF4, FGF5, FGF6, FGF7, FGF8, FGF9, FGFR1, FGFR2, FGFR3, FGFR4, FGFRL1, FGG, FGL1, FLOT1, FLRT3, FLT1, FLT3, FLT3LG, FLT4, FN1, FNDC5, FPR1, FPR2, FPR3, FRAS1, FREM1, FREM2, FRS3, FSHB, FSHR, FST, FSTL1, FSTL5, FXYD6, FZD1, FZD10, FZD2, FZD3, FZD4, FZD5, FZD6, FZD7, FZD8, FZD9, GABBR2, GAD1, GAL, GALP, GALR1, GALR2, GALR3, GAS1, GAS6, GAST, GC, GCG, GCGR, GDF1, GDF10, GDF11, GDF15, GDF2, GDF3, GDF5, GDF6, GDF7, GDF9, GDNF, GFRA1, GFRA2, GFRA3, GFRA4, GFRAL, GH1, GH2, GHR, GHRH, GHRHR, GHRL, GHSR, GIP, GIPR, GJB2, GLG1, GLP1R, GLP2R, GLRA2, GNAI2, GNAS, GNB3, GNRH1, GNRH2, GNRHR, GP1BA, GP1BB, GP5, GP6, GP9, GPC1, GPC2, GPC3, GPC4, GPC5, GPHA2, GPHB5, GPI, GPIHBP1, GPNMB, GPR101, GPR135, GPR151, GPR152, GPR171, GPR182, GPR19, GPR20, GPR25, GPR35, GPR37, GPR37L1, GPR39, GPR42, GPR75, GPR83, GPR84, GPRC5D, GPRC6A, GREM1, GREM2, GRIN2A, GRIN2B, GRIN2C, GRIN2D, GRM1, GRM3, GRM4, GRM5, GRM7, GRN, GRP, GRPR, GSTM2, GSTO1, GSTP1, GUCA2A, GUCA2B, GUCY2C, GUCY2D, GZMA, GZMB, HAPLN1, HAS2, HAVCR1, HAVCR2, HBEGF, HCRT, HCRTR1, HCRTR2, HCST, HDC, HEBP1, HFE, HGF, HHIP, HHLA2, HJV, HLA-A, HLA-B, HLA-C, HLA-DMA, HLA-DMB, HLA-DOB, HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQA2, HLA-DQB1, HLA-DQB2, HLA-DRA, HLA-DRB1, HLA-DRB3, HLA-DRB4, HLA-DRB5, HLA-E, HLA-F, HLA-G, HMGB1, HMMR, HP, HPX, HRAS, HRG, HRH1, HRH2, HRH3, HRH4, HSP90AA1, HSP90B1, HSPA1A, HSPA4, HSPA8, HSPG2, HTR1A, HTR1B, HTR1D, HTR1E, HTR1F, HTR2A, 
HTR2B, HTR2C, HTR4, HTR5A, HTR6, HTR7, IAPP, IBSP, ICAM1, ICAM2, ICAM3, ICAM4, ICAM5, ICOS, ICOSLG, IFITM1, IFNA1, IFNA10, IFNA14, IFNA16, IFNA17, IFNA2, IFNA21, IFNA4, IFNA5, IFNA6, IFNA7, IFNA8, IFNAR1, IFNAR2, IFNB1, IFNE, IFNG, IFNGR1, IFNGR2, IFNK, IFNL1, IFNL2, IFNL3, IFNL4, IFNLR1, IFNW1, IGDCC3, IGDCC4, IGF1, IGF1R, IGF2, IGF2R, IGFBP4, IGFBP7, IGFBPL1, IGFL1, IGFL2, IGFL3, IGFLR1, IGHG1, IGLC1, IGSF1, IGSF10, IGSF11, IHH, IL10, IL10RA, IL10RB, IL11, IL11RA, IL12A, IL12B, IL12RB1, IL12RB2, IL13, IL13RA1, IL13RA2, IL15, IL15RA, IL16, IL17A, IL17B, IL17C, IL17F, IL17RA, IL17RB, IL17RC, IL17RE, IL18, IL18BP, IL18R1, IL18RAP, IL19, IL1A, IL1B, IL1F10, IL1R1, IL1R2, IL1RAP, IL1RAPL1, IL1RAPL2, IL1RL1, IL1RL2, IL1RN, IL2, IL20, IL20RA, IL20RB, IL21, IL21R, IL22, IL22RA1, IL22RA2, IL23A, IL23R, IL24, IL25, IL26, IL27, IL27RA, IL2RA, IL2RB, IL2RG, IL3, IL31, IL31RA, IL32, IL33, IL34, IL36A, IL36B, IL36G, IL36RN, IL37, IL3RA, IL4, IL4R, IL5, IL5RA, IL6, IL6R, IL6ST, IL7, IL7R, IL9, IL9R, ILDR1, IMPG2, INHA, INHBA, INHBB, INHBC, INHBE, INS, INSL3, INSL5, INSR, IRAK4, ISLR2, ITGA1, ITGA10, ITGA11, ITGA2, ITGA2B, ITGA3, ITGA4, ITGA5, ITGA6, ITGA7, ITGA8, ITGA9, ITGAD, ITGAE, ITGAL, ITGAM, ITGAV, ITGAX, ITGB1, ITGB2, ITGB3, ITGB3BP, ITGB4, ITGB5, ITGB6, ITGB7, ITGB8, ITIH2, IZUMO1, IZUMO1R, JAG1, JAG2, JAM2, JAM3, JAML, JMJD6, KCNA3, KCND1, KCND2, KCNJ10, KCNJ15, KCNJ4, KCNN4, KCNQ1, KCNQ3, KCNQ5, KDR, KEL, KIDINS220, KIR2DL1, KIR2DL2, KIR2DL3, KIR2DL4, KIR2DS1, KIR2DS4, KIR3DL1, KIR3DL2, KIR3DL3, KIR3DS1, KISS1, KISS1R, KIT, KITLG, KL, KLB, KLK3, KLRB1, KLRC1, KLRC2, KLRC3, KLRD1, KLRF1, KLRF2, KLRG1, KLRG2, KLRK1, KMT2E, KNG1, KREMEN1, KREMEN2, L1CAM, LACRT, LAG3, LAIR1, LAMA1, LAMA2, LAMA3, LAMA4, LAMA5, LAMB1, LAMB2, LAMB3, LAMC1, LAMC2, LAMC3, LAMP1, LAMP2, LCK, LCN1, LCN2, LDLR, LEAP2, LEFTY1, LEFTY2, LEP, LEPR, LGALS1, LGALS3, LGALS3BP, LGALS8, LGALS9, LGI1, LGI2, LGI3, LGI4, LGR4, LGR5, LGR6, LHB, LHCGR, LIF, LIFR, LILRA1, LILRA3, LILRA4, LILRB1, LILRB2, LILRB3, 
LILRB4, LIN7C, LINGO1, LIPC, LIPH, LMAN1, LMBR1L, LPA, LPAR1, LPAR2, LPAR3, LPAR4, LPL, LPP, LRIG1, LRIG2, LRP1, LRP10, LRP11, LRP1B, LRP2, LRP4, LRP5, LRP6, LRP8, LRPAP1, LRRC4, LRRC4B, LRRC4C, LRRN3, LRRTM2, LSR, LTA, LTB, LTBP1, LTBP3, LTBR, LTF, LTK, LUM, LVRN, LY86, LY96, LYPD3, LYVE1, LYZ, MADCAM1, MAG, MAGED1, MAML2, MARCO, MAS1, MATN1, MBL2, MC1R, MC2R, MC3R, MC4R, MC5R, MCAM, MCFD2, MCHR1, MCHR2, MDK, MEGF10, MELTF, MEPE, MERTK, MET, MFAP2, MFAP3L, MFAP5, MFGE8, MFNG, MFRP, MGRN1, MIA, MICA, MICB, MIF, MILR1, MIP, MLN, MLNR, MMP1, MMP12, MMP13, MMP2, MMP24, MMP7, MMP9, MMRN2, MOG, MPIG6B, MPL, MPZ, MPZL1, MRAP, MRC1, MRC2, MRGPRX1, MRGPRX2, MS4A4A, MSMP, MST1, MST1R, MSTN, MTMR4, MTNR1A, MTNR1B, MTTP, MUC1, MUC2, MUC5AC, MUC6, MUSK, MXRA5, MYL9, MYLK, MYLK2, MYOC, NAMPT, NCAM1, NCAM2, NCAN, NCL, NCR1, NCR2, NCR3, NCR3LG1, NCSTN, NDP, NECTIN1, NECTIN2, NECTIN3, NECTIN4, NEGR1, NELL2, NEO1, NETO2, NFASC, NGF, NGFR, NID1, NLGN1, NLGN2, NLGN3, NLGN4X, NMB, NMBR, NMS, NMU, NMUR1, NMUR2, NODAL, NOTCH1, NOTCH2, NOTCH3, NOTCH4, NPB, NPBWR1, NPBWR2, NPFF, NPFFR1, NPFFR2, NPNT, NPPA, NPPB, NPPC, NPR1, NPR2, NPR3, NPS, NPSR1, NPTN, NPTX1, NPTX2, NPTXR, NPVF, NPW, NPY, NPY1R, NPY2R, NPY4R, NPY5R, NPY6R, NR0B2, NRCAM, NRG1, NRG2, NRG3, NRG4, NRP1, NRP2, NRSN1, NRTN, NRXN1, NRXN2, NRXN3, NT5E, NTF3, NTF4, NTN1, NTN3, NTN4, NTNG1, NTNG2, NTRK1, NTRK2, NTRK3, NTS, NTSR1, NTSR2, NUCB2, NXPH1, NXPH2, NXPH3, OBP2A, OCLN, OGFR, OLFM2, OLR1, OMG, OPRD1, OPRK1, OPRL1, OPRM1, OR1G1, ORAI2, ORM1, OSM, OSMR, OSTN, OXT, OXTR, P2RX7, P2RY12, P2RY14, P2RY6, P4HB, PAM, PARD3, PCNA, PCSK1N, PCSK9, PDAP1, PDCD1, PDCD1LG2, PDCD2, PDE1A, PDE1B, PDE1C, PDGFA, PDGFB, PDGFC, PDGFD, PDGFRA, PDGFRB, PDPN, PDX1, PDYN, PECAM1, PENK, PF4, PF4V1, PGF, PGLYRP1, PHEX, PI16, PI3, PIGA, PIGF, PIGR, PILRA, PILRB, PIP, PITPNM3, PKM, PLA2G10, PLA2G2A, PLA2R1, PLAT, PLAU, PLAUR, PLD1, PLD2, PLG, PLGRKT, PLPP6, PLSCR1, PLSCR4, PLTP, PLXDC1, PLXDC2, PLXNA1, PLXNA2, PLXNA3, PLXNA4, PLXNB1, PLXNB2, PLXNB3, 
PLXNC1, PLXND1, PMCH, PNOC, PODXL, PODXL2, POMC, POSTN, PPBP, PPY, PRG4, PRL, PRLH, PRLHR, PRLR, PRND, PROC, PROCR, PROK1, PROK2, PROKR1, PROKR2, PROS1, PRSS1, PRSS2, PRSS3, PRTG, PSAP, PSEN1, PSG4, PSG5, PSG6, PSG7, PSG9, PSPN, PTCH1, PTCH2, PTDSS1, PTGDR, PTGDR2, PTGER2, PTGER3, PTGER4, PTGIR, PTGS2, PTH, PTH1R, PTH2, PTH2R, PTHLH, PTK7, PTMA, PTN, PTPN11, PTPN6, PTPRA, PTPRB, PTPRC, PTPRD, PTPRF, PTPRG, PTPRJ, PTPRK, PTPRM, PTPRR, PTPRS, PTPRU, PTPRZ1, PVR, PYY, QDPR, QRFP, QRFPR, RACK1, RAET1E, RAET1G, RAET1L, RAMP1, RAMP2, RAMP3, RARRES1, RARRES2, RBP3, RBP4, RECK, RELN, REN, RET, RETN, RGMA, RGMB, RHAG, RHBDF2, RHBDL2, RIMS1, RIMS2, RIPK1, RLN1, RLN2, RLN3, RNASE2, RNF43, ROBO1, ROBO2, ROBO3, ROBO4, ROR1, ROR2, RPS19, RPSA, RSPO1, RSPO2, RSPO3, RSPO4, RTN4, RTN4R, RTN4RL1, RTN4RL2, RXFP1, RXFP2, RXFP3, RXFP4, RXRA, RYK, RYR1, RYR2, S100A1, S100A10, S100A12, S100A4, S100A8, S100A9, S100B, S1PR1, S1PR2, S1PR3, S1PR4, S1PR5, SAA1, SCARA5, SCARB1, SCARF1, SCEL, SCGB1A1, SCGB3A1, SCGB3A2, SCN10A, SCN2B, SCN4A, SCN5A, SCN8A, SCT, SCTR, SCUBE2, SDC1, SDC2, SDC3, SDC4, SDK2, SECTM1, SELE, SELL, SELP, SELPLG, SEMA3A, SEMA3B, SEMA3C, SEMA3D, SEMA3E, SEMA3F, SEMA3G, SEMA4A, SEMA4B, SEMA4C, SEMA4D, SEMA4F, SEMA4G, SEMA5A, SEMA5B, SEMA6A, SEMA6B, SEMA6D, SEMA7A, SERPINA1, SERPINA7, SERPINC1, SERPINE1, SERPINE2, SERPINF1, SERPING1, SERTAD1, SFRP1, SFRP2, SFTPA1, SFTPA2, SFTPD, SHANK1, SHANK2, SHBG, SHH, SIGIRR, SIGLEC1, SIGLEC10, SIGLEC5, SIGLEC6, SIGLEC7, SIGLEC8, SIGLEC9, SIRPA, SIRPB2, SIRPG, SLAMF9, SLC16A1, SLC16A2, SLC16A7, SLC17A7, SLC18A2, SLC18A3, SLC1A5, SLC2A2, SLC37A1, SLC40A1, SLC4A11, SLC6A8, SLIT1, SLIT2, SLIT3, SLITRK1, SLITRK2, SLITRK3, SLITRK5, SLITRK6, SLPI, SLURP1, SLURP2, SMAD3, SMAP1, SMO, SNCA, SNX14, SOCS2, SORBS1, SORCS2, SORCS3, SORL1, SORT1, SOST, SOSTDC1, SPARC, SPINK1, SPINT1, SPN, SPON1, SPON2, SPP1, SPTAN1, SPTBN2, SPX, SST, SSTR1, SSTR2, SSTR3, SSTR4, SSTR5, ST14, ST6GAL1, STAB1, STAB2, STRA6, STX1A, STX3, STX4, TAC1, TAC3, TAC4, TACR1, 
TACR2, TACR3, TAFA4, TARM1, TBXA2R, TCN1, TCN2, TCTN1, TDGF1, TECTA, TECTB, TEK, TF, TFF1, TFF2, TFF3, TFPI, TFR2, TFRC, TG, TGFA, TGFB1, TGFB2, TGFB3, TGFBR1, TGFBR2, TGFBR3, TGM2, TGS1, THBD, THBS1, THBS2, THBS3, THBS4, THPO, THY1, TIE1, TIGIT, TIMD4, TIMP1, TIMP2, TIMP3, TLN1, TLR1, TLR2, TLR4, TLR5, TLR6, TLR7, TLR9, TMED5, TMEM219, TMEM67, TMIGD2, TMIGD3, TNC, TNF, TNFRSF10A, TNFRSF10B, TNFRSF10C, TNFRSF10D, TNFRSF11A, TNFRSF11B, TNFRSF12A, TNFRSF13B, TNFRSF13C, TNFRSF14, TNFRSF17, TNFRSF18, TNFRSF19, TNFRSF1A, TNFRSF1B, TNFRSF21, TNFRSF25, TNFRSF4, TNFRSF6B, TNFRSF8, TNFRSF9, TNFSF10, TNFSF11, TNFSF12, TNFSF13, TNFSF13B, TNFSF14, TNFSF15, TNFSF18, TNFSF4, TNFSF8, TNFSF9, TNN, TNR, TNXB, TOR2A, TPH1, TPO, TPSAB1, TPSB2, TRADD, TRAF2, TRAF3, TREM1, TREM2, TREML2, TREML4, TRH, TRHR, TRPC3, TRPC5, TRPM2, TRPM3, TRPV1, TRPV6, TSHB, TSHR, TSLP, TSPAN1, TSPAN10, TSPAN12, TSPAN14, TSPAN15, TSPAN17, TSPAN5, TTR, TXLNA, TYRO3, TYROBP, UCN, UCN2, UCN3, ULBP1, ULBP2, ULBP3, UNC5A, UNC5B, UNC5C, UNC5D, UTS2, UTS2B, UTS2R, VANGL2, VASN, VASP, VCAM1, VCAN, VCL, VEGFA, VEGFB, VEGFC, VEGFD, VGF, VIM, VIP, VIPR1, VIPR2, VLDLR, VSIG10L, VSIR, VSTM1, VTN, VWF, WFIKKN2, WIF1, WNT1, WNT10A, WNT10B, WNT11, WNT16, WNT2, WNT2B, WNT3, WNT3A, WNT4, WNT5A, WNT5B, WNT6, WNT7A, WNT7B, WNT8A, WNT8B, WNT9A, WNT9B, XCL1, XCL2, XCR1, YBX1, ZG16B, ZNRF3, ZP3] missing from var_names

Thanks

@dbdimitrov
Collaborator

Hey @ndrubins, overall it looks okay. Only dea_df seems a bit strange; can you run the following:

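import numpy as np
# `resource` is the ligand-receptor resource as a DataFrame with `ligand` and `receptor` columns
entities = np.union1d(resource['ligand'], resource['receptor'])

# Genes shared between the resource and the AnnData object
np.intersect1d(entities, adata.var.index)

# Genes shared between the resource and the DE table
np.intersect1d(entities, dea_df.index)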

Something is not matching, and it's likely a formatting issue; let's see where :) If both of these look okay, then you could share a subset and I can debug :)

@ndrubins
Author

Thanks a lot for the quick response.

The intersections seem fine to me:

>>> omnipath_consensus_df
      ligand receptor
0     LGALS9    PTPRC
1     LGALS9      MET
2     LGALS9     CD44
3     LGALS9     LRP1
4     LGALS9     CD47
...      ...      ...
4820    BMP2    ACTR2
4821   BMP15    ACTR2
4822    CSF1    CSF3R
4823   IL36G   IFNAR1
4824   IL36G   IFNAR2

[4825 rows x 2 columns]
>>> entities = np.union1d(omnipath_consensus_df['ligand'], omnipath_consensus_df['receptor'])
>>> len(entities)
1893
>>> len(adata.var.index)
13167
>>> a = np.intersect1d(entities, adata.var.index)
>>> a
array(['ABCA1', 'ACE', 'ACKR1', ..., 'XCR1', 'YBX1', 'ZNRF3'],
      dtype=object)
>>> len(a)
1170
>>> b = np.intersect1d(entities, dea_df.index)
>>> b
array(['ABCA1', 'ACE', 'ACKR2', ..., 'XCR1', 'YBX1', 'ZNRF3'],
      dtype=object)
>>> len(b)
1026
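
For reference, the counts above can be turned into a missing proportion and compared with the 0.98 limit quoted in the error message. This is only a rough sketch: it assumes the proportion is simply the share of resource entities not found, which may not be exactly how LIANA computes it, and `missing_fraction` is a hypothetical helper, not part of the library.

```python
def missing_fraction(n_found: int, n_total: int) -> float:
    """Share of resource entities that were not found (assumed definition)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 1.0 - n_found / n_total


# Counts reported in the session above
n_entities = 1893   # unique ligands and receptors in the resource
n_in_adata = 1170   # found in adata.var.index
n_in_dea_df = 1026  # found in dea_df.index

print(round(missing_fraction(n_in_adata, n_entities), 2))
print(round(missing_fraction(n_in_dea_df, n_entities), 2))
```

Under that assumption both values are far below 0.98, so the gene names themselves do not appear to explain a reported proportion of 1.00.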

Just sent you an email with a gdrive link to the data.

Thanks!

@dbdimitrov
Collaborator

Hi @ndrubins,

It seems like the issue is that you have no overlapping cell types between dea_df and adata.

I will make sure to add a check for this in the next LIANA+ update. :)

import numpy as np
np.intersect1d(dea_df[groupby].unique(), adata.obs[groupby].unique())
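
A slightly more defensive version of the same check could look like the sketch below. The toy tables and the `shared_groups` helper are hypothetical stand-ins (the thread does not say how the labels differed), and `obs` plays the role of `adata.obs`.

```python
import pandas as pd


def shared_groups(dea_df, obs, groupby):
    """Return the sorted group labels present in both tables."""
    # Cast to str so categorical and plain-string labels compare equal
    de_groups = set(dea_df[groupby].astype(str))
    obs_groups = set(obs[groupby].astype(str))
    return sorted(de_groups & obs_groups)


# Toy data: the labels differ only in formatting (space vs. underscore)
dea_df = pd.DataFrame({"cell_type": ["B cells", "T cells"], "stat": [1.2, -0.4]})
obs = pd.DataFrame({"cell_type": pd.Categorical(["B_cells", "T_cells", "B_cells"])})

common = shared_groups(dea_df, obs, "cell_type")
if not common:
    print("No shared groups; compare the label spelling in both tables.")
```

An empty result here means no group in the DE table can be matched to any group of cells, which is the situation described above.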

Hope this helps.
Daniel

@ndrubins
Author

So sorry for having bothered you with my own bugs and thanks a lot for helping.

@dbdimitrov
Collaborator

No worries 😉
