This repository was archived by the owner on Jul 20, 2023. It is now read-only.

How to deal with outliers in time series? #22

Open
wj431364 opened this issue Jul 20, 2023 · 1 comment

wj431364 commented Jul 20, 2023

Hi, I am using the very nice rEDM package (pyEDM) in my project. However, I find that the CCM function is very sensitive to outliers in the time series, mainly because of the Pearson correlation used in the function.
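
The sensitivity of the Pearson correlation to a single extreme value can be illustrated outside of pyEDM. The following is a minimal NumPy sketch, not pyEDM code: `observed` and `predicted` are synthetic stand-ins for an observation/prediction pair, and the Spearman rank correlation is shown only as a point of comparison (it is an assumption of this sketch, not something CCM is stated to offer).

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient of two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks.
    Uses a plain argsort ranking, so it assumes there are no ties."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return pearson(rank_a, rank_b)

# Synthetic data (hypothetical): a prediction that tracks the observation closely
rng = np.random.default_rng(0)
observed = rng.random(200)
predicted = observed + rng.normal(0.0, 0.05, 200)

r_clean = pearson(observed, predicted)
s_clean = spearman(observed, predicted)

# Corrupt a single observation with an extreme value
observed_out = observed.copy()
observed_out[-1] = 15.0

r_out = pearson(observed_out, predicted)
s_out = spearman(observed_out, predicted)

print(f"Pearson : clean={r_clean:.2f}, with outlier={r_out:.2f}")
print(f"Spearman: clean={s_clean:.2f}, with outlier={s_out:.2f}")
```

In this sketch one outlier inflates the variance of `observed_out` so much that the Pearson coefficient collapses, while the rank-based coefficient barely moves because the outlier only changes one rank.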

In an extreme case, the causality result drops from 0.7 to 0.1 when only a single data point is added. This is somewhat counter-intuitive given the definition of causality.
I would like to know whether there is any method I can use to deal with such outliers.

Many thanks!

Here is an example that reproduces the issue (sorry, I am using Python):

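import numpy as np
import pandas as pd
from pyEDM import CCM

# x is uniform noise; y depends on x through sin(x) plus additive noise
x = np.random.random(200)
y = np.sin(x) + np.random.random(200) * 0.5

# Case 1: the last value of x stays within the normal range
x[-1] = 0
# pyEDM expects the first column to hold the time index
data = pd.DataFrame({'time': np.arange(1, len(x) + 1), 'x1': x, 'x2': y})

# findE is a helper that is not shown here; its result is unused because E is fixed at 3 below
# E = max(findE(data['x1'].values, data['x2'].values))

CCM(
    dataFrame=data,
    columns='x1',
    target='x2',
    libSizes='50 160 20',
    sample=100,
    E=3,
    showPlot=True
)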


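# Case 2: replace the last value of x with a single outlier
x[-1] = 15
# pyEDM expects the first column to hold the time index
data = pd.DataFrame({'time': np.arange(1, len(x) + 1), 'x1': x, 'x2': y})

# findE is a helper that is not shown here; its result is unused because E is fixed at 3 below
# E = max(findE(data['x1'].values, data['x2'].values))

CCM(
    dataFrame=data,
    columns='x1',
    target='x2',
    libSizes='50 160 20',
    sample=100,
    E=3,
    showPlot=True
)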

The results are:
[image]
[image]
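
One generic way to handle a point like `x[-1] = 15` before running CCM is to flag it with a robust statistic. The sketch below uses the median absolute deviation (MAD); this is a common preprocessing heuristic and not a pyEDM feature. The function name `flag_outliers_mad`, the threshold of 5, and the choice to replace flagged points with the median are all assumptions made for illustration.

```python
import numpy as np

def flag_outliers_mad(values, threshold=5.0):
    """Return a boolean mask marking points far from the median,
    measured in units of the scaled median absolute deviation (MAD)."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        # Degenerate case: no spread to measure against, flag nothing
        return np.zeros(values.shape, dtype=bool)
    # 1.4826 scales the MAD to match the standard deviation for normal data
    robust_z = np.abs(values - median) / (1.4826 * mad)
    return robust_z > threshold

# Synthetic series mirroring the reproduction: uniform noise plus one outlier
rng = np.random.default_rng(0)
x = rng.random(200)
x[-1] = 15.0

mask = flag_outliers_mad(x)

# Replace flagged points with the median of the remaining points
x_clean = x.copy()
x_clean[mask] = np.median(x[~mask])

print("flagged indices:", np.flatnonzero(mask))
```

Whether to replace, interpolate, or drop a flagged point depends on the data. In a time-delay embedding every such choice alters the embedded vectors that contain that point, so the cleaned series should be checked rather than trusted blindly.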

Owner

ha0ye commented Jul 20, 2023

I recommend you post this over in @SugiharaLab/rEDM as that is the active version of the software package.
