-
Notifications
You must be signed in to change notification settings - Fork 506
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Not in simplex -- two sets of largely different sizes #93
Comments
hi, spherEng1 and PencilEnergy should sum up to the same value (the total mass in both distributions should be the same). Is it the case ? |
Yes, I've normed them both so that they both sum to 1 |
I tried to run your code, but there are many errors in your script. Even with by changing 2i -> 2 * i and nside2 -> nside. I get:
Please make sure your script runs so I can reproduce the error. |
Whoops, sorry about that. Some asterisks got lost
|
Thank you @caricesarotti
Are you sure this script above gives you a not in simplex error ? |
I could not get Also note that you mentioned using Python 3.6 but your output is Python 2.7 (Your script still works with both python 2.7 and 3.6 with the latest numpy, pot and astropy). import numpy as np
from numpy import linalg as LA
import astropy_healpix
import astropy_healpix.healpy as hp
import ot
from ot.lp import emd_c, check_result, emd2
import sys
# DEFINED FOR MASSLESS PARTICLES
def eng(px,py,pz):
return np.sqrt(px**2+py**2+pz**2)
def pT(px, py):
return np.sqrt(px**2+py**2)
def listNorm(a):
return a[:]/a[:].sum()
def _cdist_sphere(X,Y):
theta_d=np.array([[np.arccos(np.around(np.dot(X[i],Y[j]), decimals=5)/np.around((LA.norm(Y[j])*LA.norm(X[i])),decimals=5)) for j in range(len(Y))] for i in range(len(X))])
return theta_d
def emd_Calc(ev0,ev1,M,maxIter=1000000):
# NORMALIZE IF NOT NORMALIZED
ev0norm = ev0[:]/ev0[:].sum()
ev1norm = ev1[:]/ev1[:].sum()
cost, log = emd2(ev0norm, ev1norm, M, numItermax=100000000,log=True)
#print(log['result_code'])
if cost==0:
print(log['warning'])
return cost
# Generating the sphere tiling:
spherEventScan=[]
for i in range(5):
nside=2**i
numPixScan=12*(nside**2)
spherPointScan=[]
sphereEngScan=[]
spherSampleScan=[]
spherPtScan=[]
for j in range(numPixScan):
vecOnSpher = hp.pix2vec(nside,j)
point = np.array([vecOnSpher[0], vecOnSpher[1], vecOnSpher[2]])
engVal = eng(vecOnSpher[0], vecOnSpher[1], vecOnSpher[2])
spherPointScan.append(point)
sphereEngScan.append(engVal)
spherPtScan.append(pT(point[0],point[1]))
spherEventScan.append([np.around(np.array(spherPointScan), decimals=6),np.around(np.array(sphereEngScan), decimals=6),listNorm(np.array(spherPtScan))])
# Comparing a densely tiled sphere to a two element set
pencilPoint = np.array([[1,0,0],[-1,0,0]])
numPencil=2
pencilEnergy = np.full(numPencil, np.float(1./numPencil))
spherPoints1 = spherEventScan[4][0]
spherEng1 = spherEventScan[4][1]
M = _cdist_sphere(spherPoints1, pencilPoint)
emdval = emd_Calc(spherEng1, pencilEnergy, M)
print("emdval = %f" % emdval)
versions = [sys.version, np.__version__, ot.__version__,
astropy_healpix.__version__]
names = ["Python", "Numpy", "POT", "Astropy"]
for name, version in zip(names, versions):
print(name, version) with python 2.7
with python 3.6
|
I don't know what is broken on your end. You can create a small env, this bit works just fine:
your pot_93.py file: import numpy as np
from numpy import linalg as LA
import astropy_healpix
import astropy_healpix.healpy as hp
import ot
from ot.lp import emd_c, check_result, emd2
import sys
# DEFINED FOR MASSLESS PARTICLES
def eng(px,py,pz):
return np.sqrt(px**2+py**2+pz**2)
def pT(px, py):
return np.sqrt(px**2+py**2)
def listNorm(a):
return a[:]/a[:].sum()
def _cdist_sphere(X,Y):
theta_d=np.array([[np.arccos(np.around(np.dot(X[i],Y[j]), decimals=5)/np.around((LA.norm(Y[j])*LA.norm(X[i])),decimals=5)) for j in range(len(Y))] for i in range(len(X))])
return theta_d
def emd_Calc(ev0,ev1,M,maxIter=1000000):
# NORMALIZE IF NOT NORMALIZED
ev0norm = ev0[:]/ev0[:].sum()
ev1norm = ev1[:]/ev1[:].sum()
cost, log = emd2(ev0norm, ev1norm, M, numItermax=100000000,log=True)
#print(log['result_code'])
if cost==0:
print(log['warning'])
return cost
# Generating the sphere tiling:
spherEventScan=[]
for i in range(5):
nside=2**i
numPixScan=12*(nside**2)
spherPointScan=[]
sphereEngScan=[]
spherSampleScan=[]
spherPtScan=[]
for j in range(numPixScan):
vecOnSpher = hp.pix2vec(nside,j)
point = np.array([vecOnSpher[0], vecOnSpher[1], vecOnSpher[2]])
engVal = eng(vecOnSpher[0], vecOnSpher[1], vecOnSpher[2])
spherPointScan.append(point)
sphereEngScan.append(engVal)
spherPtScan.append(pT(point[0],point[1]))
spherEventScan.append([np.around(np.array(spherPointScan), decimals=6),np.around(np.array(sphereEngScan), decimals=6),listNorm(np.array(spherPtScan))])
# Comparing a densely tiled sphere to a two element set
pencilPoint = np.array([[1,0,0],[-1,0,0]])
numPencil=2
pencilEnergy = np.full(numPencil, np.float(1./numPencil))
spherPoints1 = spherEventScan[4][0]
spherEng1 = spherEventScan[4][1]
M = _cdist_sphere(spherPoints1, pencilPoint)
emdval = emd_Calc(spherEng1, pencilEnergy, M)
print("emdval = %f" % emdval)
versions = [sys.version, np.__version__, ot.__version__,
astropy_healpix.__version__]
names = ["Python", "Numpy", "POT", "Astropy"]
for name, version in zip(names, versions):
print(name, version) |
Ok. Could you add just before the emd2 call, inside print(ev0norm.min(), ev0norm.sum())
print(ev1norm.min(), ev1norm.sum())
cost, log = emd2(ev0norm, ev1norm, M, numItermax=100000000,log=True) and report the output here |
I have the same output, 0.003 so it's not a matter of numerical precision. I'm running out of options without a way to reproduce the error ... |
When I consider the versions of cython and scipy, I get
I think I have found the issue though. I talked to some collaborators who have used POT for similar things (https://arxiv.org/abs/1902.02346) and they said they countered the same problem. The issue is with the Mac OS. There is something strange about running the POT on the Darwin platform. I am resolving the problem by switching to a linux machine |
Ok, |
yes I agree with @hichamjanati, if this is a bug that affect several person, we should hunt it down or at least leave this issue open. Switching to linux is not a good answer... |
I've encountered what appears to be the same problem. Like caricesarotti, I only see the problem on a mac, not on linux. The minimal example I've found is this: import numpy as np
import ot
ot.emd2([], [], np.ones([292, 2])) On linux, that outputs 1.0000000000000004, as expected. On my mac, it outputs
It seems to be related to the shape of the input, but I haven't figured out how. Some other examples: using OS and package versions where I do see the problem: >>> import platform; print(platform.platform())
macOS-10.14.6-x86_64-i386-64bit
>>> import sys; print("Python", sys.version)
Python 3.8.3 | packaged by conda-forge | (default, Jun 1 2020, 17:21:09)
[Clang 9.0.1 ]
>>> import numpy; print("NumPy", numpy.__version__)
NumPy 1.18.5
>>> import scipy; print("SciPy", scipy.__version__)
SciPy 1.5.0
>>> import ot; print("POT", ot.__version__)
POT 0.7.0 OS and package versions where I do not see the problem: >>> import platform; print(platform.platform())
Linux-4.4.0-165-generic-x86_64-with-glibc2.10
>>> import sys; print("Python", sys.version)
Python 3.8.3 | packaged by conda-forge | (default, Jun 1 2020, 17:43:00)
[GCC 7.5.0]
>>> import numpy; print("NumPy", numpy.__version__)
NumPy 1.19.0
>>> import scipy; print("SciPy", scipy.__version__)
SciPy 1.5.1
>>> import ot; print("POT", ot.__version__)
POT 0.7.0 I'm planning to follow caricesarotti's suggestion and switch to linux. |
Having the same problem and I'm pretty much stuck on Mac. I can reproduce aforr's minimal example (also same behaviour for the other shapes he tried).
I appreciate all the efforts for a resolution :) |
Yes, I can reproduce the issue. A very quick fix-up: |
Thanks for the work you are putting into this! Unfortunately, both of the quick-fix suggestions don't work for me. They still produce the warning and return 0.0 ... |
Damn I was too quick in my answer. This is indeed a very weird bug... Looking into it, I will keep you posted |
Ok I noticed a small problem with an EPSILON constant in the C code. Could you please install the branch 'precision_concern' and test it on your problem ? The modifications I performed did actually solved the issue for the previous problem (with the [292,2] array). |
That solves both the toy test case and my actual use case and there is no need to do any casting (it works with and without casting)! Thanks so much for looking into this and being so super responsive! :) |
Great ! Thanks for the quick feedback. I will do a PR and close the issue. |
I am trying to calculate the EMD of two sets. When one set has a few hundred entries and the other has only 2, the EMD calculation fails and returns Problem Infeasible.
Steps to reproduce the behavior:
** SEE BELOW COMMENT FOR FIXED SCRIPT **
Expected behavior
Should return EMD around 1, instead says that the sets spherEng1 and pencilEnergy are not in the simplex
Screenshots
Here is comparing the EMDs calculated for less densely tiled to most densely tiled (number of particles = number of segments) with the two element set
Desktop (please complete the following information):
The text was updated successfully, but these errors were encountered: