Incorrect Scale Parameter in Distribution #370

Open
jamesdeeel opened this issue Jan 14, 2025 · 0 comments


If the scale parameter is not provided, it is set to phi here: https://github.com/dswah/pyGAM/blob/v0.9.1/pygam/pygam.py#L1038-L1041. The issue is that, broadly speaking, phi represents the variance, whereas scale should be the standard deviation.

Here is some code that replicates the issue:

import numpy as np
from pygam import LinearGAM

def compute_scale_two_ways():
    # Parameters for the linear function
    A = 0.0
    B = 2.0
    N = 10_000
    sigma = 3.0

    # Generate random x values
    X = np.random.rand(N) * 100

    # Generate y values with random noise
    noise = np.random.normal(loc=0, scale=sigma, size=N)
    y = A * X + B + noise

    gam = LinearGAM()
    gam.fit(X, y)
    Y_out = gam.predict(X)
    r = y - Y_out
    scale = np.sqrt(np.mean(r**2))

    return scale, gam.distribution.scale

scales_manual, scales_gam = [], []
for _ in range(100):
    scale, gam_scale = compute_scale_two_ways()
    scales_manual.append(scale)
    scales_gam.append(gam_scale)

print(f"Manual Estimate of Scale: {np.mean(scales_manual):.3f} +/- {np.std(scales_manual):.3f}")
print(f"GAM Estimate of Scale: {np.mean(scales_gam):.3f} +/- {np.std(scales_gam):.3f}")

On my machine this returns:

Manual Estimate of Scale: 2.996 +/- 0.019
GAM Estimate of Scale: 8.990 +/- 0.112

This shows the issue: the standard deviation of the noise term is 3, but the GAM's scale comes out near 9, which is the variance (3² = 9). An incorrect scale means that the log-likelihood is computed incorrectly, and possibly other quantities too.
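
To illustrate how much the log-likelihood can shift, here is a minimal standard-library sketch. It does not call pyGAM: `gaussian_loglik` is a hypothetical helper written for this example, and the residuals are simulated with sigma = 3 to mirror the script above. It compares the Gaussian log-likelihood evaluated with the square root of the estimated dispersion against the one evaluated with the dispersion itself passed as if it were a standard deviation.

```python
import math
import random


def gaussian_loglik(residuals, sigma):
    """Total Gaussian log-likelihood of zero-mean residuals with std `sigma`.

    Hypothetical helper for illustration; not part of pyGAM.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return -0.5 * n * math.log(2 * math.pi * sigma**2) - rss / (2 * sigma**2)


# Simulated residuals with a true standard deviation of 3 (assumption,
# chosen to match the noise level in the reproduction script).
rng = random.Random(0)
sigma_true = 3.0
residuals = [rng.gauss(0.0, sigma_true) for _ in range(10_000)]

# Dispersion estimate (a variance) and the standard deviation derived from it.
phi_hat = sum(r * r for r in residuals) / len(residuals)
std_hat = math.sqrt(phi_hat)

# Log-likelihood with the standard deviation vs. with the variance
# mistakenly used as the standard deviation.
ll_correct = gaussian_loglik(residuals, std_hat)
ll_wrong = gaussian_loglik(residuals, phi_hat)

print(f"phi_hat (variance): {phi_hat:.3f}")
print(f"std_hat (sqrt phi): {std_hat:.3f}")
print(f"log-likelihood with scale = sqrt(phi): {ll_correct:.1f}")
print(f"log-likelihood with scale = phi:       {ll_wrong:.1f}")
```

Since `std_hat` is the maximum-likelihood estimate for zero-mean residuals, any other value of sigma, including `phi_hat`, gives a strictly lower log-likelihood here. Whether pyGAM's other statistics are affected in the same way would need to be checked against its source.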

I believe this may be the root cause of #163.
