wgcca

Python Implementation of Weighted Generalized Canonical Correlation Analysis as described in "Learning Multiview Embeddings of Twitter Users". Benton A, Arora R, and Dredze M. ACL 2016.

Tested with

Python 2.7
scipy 0.17.0
numpy 1.10.4

Test suite:

python src/wgccaTest.py

Sample call to learn a 5-dimensional WGCCA model (the first two views are weighted twice as much as the last two):

python src/wgcca.py --input resources/sample_wgcca_input.tsv.gz --output wgcca_embeddings.npz --model wgcca_model.pickle --k 5 --kept_views 0 1 2 3 --weights 1.0 1.0 0.5 0.5 --reg 1.e-8 1.e-8 1.e-8 1.e-8

The input format can be inferred from: resources/sample_wgcca_input.tsv
WGCCA model saved to: wgcca_model.pickle
WGCCA embeddings saved to: wgcca_embeddings.npz
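
To check what the sample call wrote, the embeddings archive can be opened with numpy. This is a minimal sketch: `summarize_npz` is a hypothetical helper, not part of this repository, and it makes no assumption about the array names stored in the file. It only lists whatever names and shapes are present.

```python
import numpy as np


def summarize_npz(path):
    """Return {array_name: shape} for every array stored in an .npz archive.

    Hypothetical helper for inspection only; it is not part of wgcca.
    """
    with np.load(path) as archive:
        return {name: archive[name].shape for name in archive.files}


# Example (assumes the sample call above has already been run):
#   shapes = summarize_npz("wgcca_embeddings.npz")
#   for name in sorted(shapes):
#       print(name, shapes[name])
```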

WeightedGCCA methods

_compute: look at this if you want to know how embeddings are computed
learn: entrypoint for learning WeightedGCCA model from training set
apply: entrypoint for extracting embeddings from new data
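
For intuition about what a weighted GCCA solver computes, here is a small dense sketch of the weighted MAX-VAR objective: find a shared embedding G with orthonormal columns and per-view maps U_i that minimize the weighted sum of ||G - X_i U_i||^2. It is an illustration under simplifying assumptions, not the code in `_compute`: the function name `weighted_gcca` and its arguments are hypothetical, every view is assumed to be fully observed, and it forms an n x n matrix, so it only suits small n.

```python
import numpy as np


def weighted_gcca(views, weights, regs, k):
    """Toy dense weighted GCCA (illustrative only, not the repository code).

    views:   list of (n x d_i) arrays, one per view, rows aligned by example
    weights: one non-negative weight per view
    regs:    one ridge regularizer per view
    k:       embedding dimension
    Returns (G, U): shared embedding (n x k) and a list of (d_i x k) maps.
    """
    n = views[0].shape[0]
    M = np.zeros((n, n))
    inv_covs = []
    for X, w, r in zip(views, weights, regs):
        # Regularized inverse of the view's scatter matrix: (X^T X + r I)^-1
        C_inv = np.linalg.inv(X.T.dot(X) + r * np.eye(X.shape[1]))
        inv_covs.append(C_inv)
        # Weighted, regularized projection onto the column space of X
        M += w * X.dot(C_inv).dot(X.T)
    # Symmetrize against round-off, then take the top-k eigenvectors.
    M = 0.5 * (M + M.T)
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    G = eigvecs[:, ::-1][:, :k]
    # Ridge-regression map from each view onto the shared embedding
    U = [C_inv.dot(X.T).dot(G) for X, C_inv in zip(views, inv_covs)]
    return G, U
```

With two random views, e.g. `weighted_gcca([X1, X2], [1.0, 0.5], [1e-8, 1e-8], k=3)`, the returned G has orthonormal columns, and `X_i.dot(U[i])` gives view i's approximation of the shared embedding.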

The input views used in "Learning Multiview Embeddings of Twitter Users" can be found at http://www.cs.jhu.edu/~mdredze/datasets/multiview_embeddings/ -- in the same format as resources/sample_wgcca_input.tsv.

If you use this code please cite:

Adrian Benton, Raman Arora, and Mark Dredze. Learning Multiview Embeddings of Twitter Users. Association for Computational Linguistics (ACL), 2016.

Please contact adrian dot author1_surname at gmail dot com if you have any questions/suggestions/concerns/comments.
