Singular Spectrum Analysis decomposition method #126
Comments
You can use the attached file.
Hi, the most important step is to reshape the data to the right size. pyts expects a format similar to scikit-learn's: (n_samples, n_timestamps), so (4, 21500) for your data. With this adjustment the SSA is computed correctly. There is another important point with your data: because of the daily temporal resolution, a larger window size is advisable. Since this can quickly lead to high memory consumption, you should combine it with simple subsampling (e.g. keeping only every fourth day). You can then decompose the temperature time series into a trend, a periodic component, and a residual using the groups parameter. If the computation raises a memory error, do not use the whole time series, use fewer days per year, or reduce window_size. Here is the code:
This produces the following result: I hope this helps a little. If not, it at least shows another nice application of SSA. Best,
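As a rough sketch of the preparation steps described in this comment (not the commenter's original code), the snippet below transposes a (n_timesteps, n_series) array into the (n_samples, n_timestamps) layout and keeps every fourth day. The helper names `prepare_for_ssa` and `decompose`, the synthetic data, and the window size and grouping in the commented call are all assumptions made for illustration; only `SingularSpectrumAnalysis(window_size=..., groups=...)` is taken from the issue itself.

```python
import numpy as np


def prepare_for_ssa(data, step=4):
    """Transpose (n_timesteps, n_series) data and keep every `step`-th day."""
    X = np.asarray(data, dtype=float).T  # -> (n_samples, n_timestamps)
    return X[:, ::step]                  # simple subsampling along time


def decompose(X, window_size, groups):
    """Run pyts SSA on data already in (n_samples, n_timestamps) layout."""
    # Imported lazily so the data preparation above works without pyts.
    from pyts.decomposition import SingularSpectrumAnalysis

    ssa = SingularSpectrumAnalysis(window_size=window_size, groups=groups)
    # The result is indexed as X_ssa[sample, group] in the issue's code.
    return ssa.fit_transform(X)


# Hypothetical stand-in for the real data: 21500 daily values of 4 variables
# with a yearly cycle plus noise.
rng = np.random.default_rng(0)
days = np.arange(21500)
seasonal = 10 + 8 * np.sin(2 * np.pi * days / 365.25)
data = seasonal[:, None] + rng.normal(size=(21500, 4))

X = prepare_for_ssa(data, step=4)
print(X.shape)  # (4, 5375)

# Illustrative call; window size and grouping are assumptions to tune:
# X_ssa = decompose(X, window_size=20, groups=[[0], [1, 2], list(range(3, 20))])
```

If memory is still a problem, a larger `step` or a smaller `window_size` reduces the size of the intermediate arrays, in line with the advice above.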
Hi, I'm very sorry for the delayed response; I completely missed the notification for your issue. Lucas gave you a complete working example that should help you get started. On the … First, you can use the … Another functionality is the automatic computation of the groups using … Here is the link to the latest version of singular spectrum analysis: https://pyts.readthedocs.io/en/latest/generated/pyts.decomposition.SingularSpectrumAnalysis.html#pyts.decomposition.SingularSpectrumAnalysis. To use it, you need to install the latest version of the package:
I hope that this is still helpful to you, and I would be happy to answer any of your questions. Best,
Great results above. I was able to run the example with the data given, but when I replaced it with my own data set on dam operations I ran into several issues, such as "If 'window_size' is an integer, it must be greater than or equal to 2 and lower than or equal to n_timestamps (got 400)." Could you please help me fix this? My file is attached. Thanks
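One possible cause of this message, assuming the table has one row per date and one column per variable, is that the array is passed with the time axis first, so the number of columns is read as n_timestamps. The sketch below mirrors the constraint quoted in the error message; `to_pyts_layout`, `check_window_size`, and the random table are hypothetical names and data, not part of pyts.

```python
import numpy as np


def to_pyts_layout(table):
    """Turn a (n_dates, n_variables) table into (n_samples, n_timestamps)."""
    arr = np.asarray(table, dtype=float)
    if arr.ndim == 1:
        arr = arr.reshape(-1, 1)  # a single variable becomes one column
    return arr.T


def check_window_size(X, window_size):
    """Constraint from the error message for an integer window_size."""
    n_timestamps = X.shape[1]
    return 2 <= window_size <= n_timestamps


# Hypothetical data set: 1000 dates, 7 variables.
table = np.random.default_rng(0).normal(size=(1000, 7))

print(check_window_size(table, 400))  # False: only 7 "timestamps" per row
X = to_pyts_layout(table)
print(X.shape, check_window_size(X, 400))  # (7, 1000) True
```

If the series itself has fewer than 400 values, the remaining option is a smaller window size.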
Good evening. How are you doing, sir? I hope all is well. I would need your assistance, please, to forecast using SSA. Thanks, Triumph
Bonjour Johann, greetings. I am so grateful for this beautiful response, and I am so happy. God bless you. It looks like you worked on the data set, because when I tried the implementation it gave me this error: "'groups' must be either None, an integer or array-like." If you can share the data set, I will appreciate it. groups="auto" does not seem to work for me, but when I use groups=5 or any number less than the window size, I get results. Attached is the attempt with groups=5. These are the columns of interest for now: Date; Vévap.; Virrig.; Vdéversé; Ventrant; Camontdeb(m); H pluie (mm) and Indice Evapo (mm). I still have some data sets for which I would like to use SSA for statistical analysis. I would love to work with you on my data analysis. I am working on my master's research on the Bagre dam operation and the impact of rainfall extremes and geological features on the spillway and embankment of the dam downstream. With this data set on the dam operation, I want to use SSA to extract the trend, periodicity, and noise in the data for projection and to help direct my modeling. I will discuss these results with my supervisor and come up with the next line of action. Can we arrange a Zoom or Google Meet call so I can discuss the data set with you in more detail? Seriously, you did a great job here and I am so happy for this assistance.
Description
Hello, I am new to Python. I am trying to use the SSA decomposition method for rainfall prediction with a dataset of 21500 rows and 5 columns, shape (21500, 5).
I used the source code below, but I do not know how to adapt it to my dataset. I get an error when I change the values of window_size, n_samples, and n_timestamps. Can anyone help me?
How can I use the main steps of SSA, including embedding, SVD, and reconstruction?
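For reference, the steps named in the question can be sketched in plain NumPy for a single series. This is a minimal textbook-style illustration (embedding, SVD, grouping, diagonal averaging), not the pyts implementation; the function name `ssa_decompose` and the example series are assumptions.

```python
import numpy as np


def ssa_decompose(x, window_size, groups):
    """Basic SSA of one series; returns an array of shape (n_groups, n)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = n - window_size + 1  # number of lagged windows

    # 1. Embedding: trajectory matrix of shape (window_size, k),
    #    whose column j is x[j : j + window_size].
    traj = np.lib.stride_tricks.sliding_window_view(x, window_size).T

    # 2. SVD of the trajectory matrix.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)

    components = np.empty((len(groups), n))
    for g, idx in enumerate(groups):
        idx = np.asarray(idx)
        # 3. Grouping: sum of the selected rank-one matrices.
        mat = (u[:, idx] * s[idx]) @ vt[idx, :]
        # 4. Reconstruction: average each anti-diagonal of the matrix.
        flipped = mat[::-1]
        components[g] = [
            flipped.diagonal(d).mean() for d in range(-window_size + 1, k)
        ]
    return components


# Hypothetical series: linear trend plus an oscillation.
t = np.arange(50)
x = 0.05 * t + np.sin(t / 3.0)
comps = ssa_decompose(x, window_size=10, groups=[[0, 1], [2, 3], range(4, 10)])
print(comps.shape)  # (3, 50)
```

When the groups together cover every singular-value index, the components sum back to the original series, which is a convenient sanity check.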
Steps/Code to Reproduce
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pyts.decomposition import SingularSpectrumAnalysis

# Parameters
n_samples, n_timestamps = 100, 48
# User's data (loaded here but not used below)
df = pd.read_csv('C:/Users/PC2/Desktop/passenger.csv', index_col=0)

# Toy dataset
rng = np.random.RandomState(41)
X = rng.randn(n_samples, n_timestamps)

# We decompose the time series into three subseries
window_size = 15
groups = [np.arange(i, i + 5) for i in range(0, 11, 5)]

# Singular Spectrum Analysis
ssa = SingularSpectrumAnalysis(window_size=window_size, groups=groups)
X_ssa = ssa.fit_transform(X)

# Show the results for the first time series and its subseries
plt.figure(figsize=(16, 6))

ax1 = plt.subplot(121)
ax1.plot(X[0], 'o-', label='Original')
ax1.legend(loc='best', fontsize=14)

ax2 = plt.subplot(122)
for i in range(len(groups)):
    ax2.plot(X_ssa[0, i], 'o--', label='SSA {0}'.format(i + 1))
ax2.legend(loc='best', fontsize=14)

plt.suptitle('Singular Spectrum Analysis', fontsize=20)

plt.tight_layout()
plt.subplots_adjust(top=0.88)
plt.show()