IntroductoryAtotiTutorial/0.7.0 #365

Open
wants to merge 10 commits into
base: main

Conversation

jackschd
Contributor

No description provided.

@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@review-notebook-app

View / edit / reply to this conversation on ReviewNB

HuifangYeo commented on 2022-09-21T03:03:48Z
----------------------------------------------------------------

We will be exploring how to use atoti. <= add a full stop after "atoti".


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:48Z
----------------------------------------------------------------

atoti is the free community edition, whereas Atoti+ is the enterprise version.

Both can be productionized, but atoti is subject to a EULA that limits the number of users who can use the application.

And of course, atoti is not meant for commercial use.


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:49Z
----------------------------------------------------------------

Bothe of these are optional. < wrong spelling for Both


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:50Z
----------------------------------------------------------------

datatbase.. <= wrong spelling.

Would we say atoti is a "database"? Normally I just say "an atoti table".


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:50Z
----------------------------------------------------------------

Do we use the "largest" store, or rather the store with the most granular level of data, as the base store?


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:51Z
----------------------------------------------------------------

For auto mode, maybe we could mention that the measures are only created for the base store. We will have to create measures for the referenced store if required.


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:52Z
----------------------------------------------------------------

Visualization is broken


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:52Z
----------------------------------------------------------------

You will notice that there is a Jupyter notebook plugin allowing the creation of pivot tables. => maybe something along the lines of "an atoti JupyterLab extension that allows users to create visualizations interactively"?

Because it's not limited to pivot tables, although it defaults to a pivot table.


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:53Z
----------------------------------------------------------------

We can see that atoti has created the following hierarchies, based on the non-numeric data loaded => and for the key columns as well.


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:54Z
----------------------------------------------------------------

This is more like aliasing the hierarchies and levels, right?

We could also just delete them directly without the alias:

del cube.hierarchies["OrderId"]
del cube.hierarchies["CustomerId"]

And since it's an alias, maybe you want to shorten it further, e.g. h = cube.hierarchies
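
The aliasing point can be illustrated in plain Python. This is only a sketch: a dict stands in for cube.hierarchies (an assumption made so the snippet runs without an atoti session), and the names are illustrative. Assigning the mapping to a shorter name does not copy it, so deleting through the alias or through the original name has the same effect.

```python
# Hypothetical stand-in for cube.hierarchies: a plain dict, not an atoti object.
cube_hierarchies = {
    "OrderId": ["OrderId"],
    "CustomerId": ["CustomerId"],
    "ProductName": ["ProductName"],
}

# Assignment binds a second name to the same object; nothing is copied.
h = cube_hierarchies

del h["OrderId"]                    # delete through the alias
del cube_hierarchies["CustomerId"]  # delete directly, without the alias

remaining = sorted(cube_hierarchies)
print(remaining)
```

Because both names refer to one object, the alias only needs to be created once per notebook.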


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:55Z
----------------------------------------------------------------

Line #1.    hierarchies = cube.hierarchies

You have already aliased the hierarchies in cell 25, so it's not necessary to alias them again.


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:55Z
----------------------------------------------------------------

Line #1.    hierarchies = cube.hierarchies

You only need to alias it once; after that you can use hierarchies each time without re-assigning it.


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:56Z
----------------------------------------------------------------

Do you mean to show the atoti editor (the JupyterLab extension for atoti)?


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:57Z
----------------------------------------------------------------

Is this intended to be empty? Maybe we could mention that the default visualization type is a pivot table, but that it can be changed using the atoti editor in the left navigation bar.


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:58Z
----------------------------------------------------------------

Line #1.    hierarchies = cube.hierarchies

Again, you don't have to re-assign the alias.

Remove hierarchies = cube.hierarchies


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:58Z
----------------------------------------------------------------

Line #1.    hierarchies = cube.hierarchies

Remove this line


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:03:59Z
----------------------------------------------------------------

Line #1.    hierarchies = cube.hierarchies

Remove this line


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:04:00Z
----------------------------------------------------------------

Line #1.    hierarchies = cube.hierarchies

Remove this line


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:04:00Z
----------------------------------------------------------------

Line #1.    hierarchies = cube.hierarchies

Remove this line


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:04:01Z
----------------------------------------------------------------

Might want to link to the documentation for create_date_hierarchy > https://docs.atoti.io/latest/lib/atoti/atoti.cube.html#atoti.Cube.create_date_hierarchy


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:04:02Z
----------------------------------------------------------------

    levels={
        "Year": "y",  # this defines the level and the part of the column's LocalDate object to take as a value
        "Quarter": "QQQ",
        "Month": "MMM",
    },

Just be careful here:

y > returns an Integer type,

MMM > returns a String type,

QQQ > returns a String type too.

Normally I'll try to be consistent and use yyyy instead. I ran into some weird behavior previously when I had a mix of integers with strings.


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:04:03Z
----------------------------------------------------------------

We know that the Orders table contains a "ShipperName" field, and we know the measure called "QuantitySold.SUM", so in order to determine how much it costs to ship a particular order we need to compare the "QuantitySold.SUM" measure value with the "Interval".

=> This sentence is very long. Can we break it down?

We know that the Orders table contains a "ShipperName" field, and we know the measure called "QuantitySold.SUM". In order to determine how much it costs to ship a particular order, we need to compare the "QuantitySold.SUM" measure value with the "Interval".


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:04:04Z
----------------------------------------------------------------

Is this intended to be empty too?


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:04:04Z
----------------------------------------------------------------

We could also use the function atoti.agg.sum_product > https://docs.atoti.io/latest/lib/atoti/atoti.agg.sum_product.html


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:04:05Z
----------------------------------------------------------------

But this value only makes sense at a ProductName level, so the aggregation is done at this level by specifying the "scope" of this new measure.

This is known as dynamic aggregation.

=> This is something that I find can be hard for people to follow.

I try to explain it with a query on a few levels in order, e.g. product category, product name, color.

The levels in the query that lie within or below the mentioned scope (in this case, the product name and color) will follow the formula (measures["QuantitySold.SUM"] * measures["SellingPricePerUnit.SUM"]).

Because product category in this query is above the mentioned scope, its value is an aggregation (sum in this case) of the level below (i.e. the ProductName level).

So depending on how you structure the query, the computation changes on the fly.
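
A rough way to picture this is the plain-Python sketch below. It does not call atoti: the rows, the column order and the helper names (formula, amount_with_scope) are all made up for illustration, and levels below the scope such as color are left out to keep it short. The formula is evaluated per product, and anything above the product level is the sum of those per-product results.

```python
from collections import defaultdict

# Hypothetical fact rows: (category, product, quantity_sold, selling_price_per_unit)
ROWS = [
    ("Shoes", "Sneaker", 2, 50.0),
    ("Shoes", "Boot", 1, 80.0),
    ("Shirts", "Tee", 3, 10.0),
]


def formula(rows):
    """QuantitySold.SUM * SellingPricePerUnit.SUM over the given rows."""
    return sum(r[2] for r in rows) * sum(r[3] for r in rows)


def amount_with_scope(rows):
    """Apply the formula per product (the scope), then sum the results."""
    by_product = defaultdict(list)
    for row in rows:
        by_product[row[1]].append(row)
    return sum(formula(product_rows) for product_rows in by_product.values())


shoes = [r for r in ROWS if r[0] == "Shoes"]
sneaker = [r for r in ROWS if r[1] == "Sneaker"]

print(amount_with_scope(sneaker))  # product level: the formula itself -> 100.0
print(amount_with_scope(shoes))    # category level: 100.0 + 80.0 -> 180.0
print(formula(shoes))              # no scope: (2 + 1) * (50 + 80) -> 390.0
```

The last line shows why the scope matters: multiplying the category-level sums directly gives a number that has no business meaning.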


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:04:06Z
----------------------------------------------------------------

Could we perhaps mention that the function has a default offset value of 1, which means it's looking at the next member?

In this case, because the dates are arranged from earliest to latest, the value returned on 2019-01-15 will be the value on 2019-01-16. So in fact, shouldn't it be nextSales instead of previousSales? Unless you set offset=-1 to look at the previous member, or you sort the dates in descending order.
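
The offset semantics described here can be mimicked in plain Python. The shift helper below is a hypothetical stand-in written for this comment, not atoti's function; returning None when there is no member at the requested offset is an assumption made for the sketch.

```python
def shift(values_by_member, offset=1):
    """Return, for each member, the value found `offset` members later.

    Members are assumed to be in hierarchy order already (earliest date first).
    """
    members = list(values_by_member)
    shifted = {}
    for index, member in enumerate(members):
        target = index + offset
        if 0 <= target < len(members):
            shifted[member] = values_by_member[members[target]]
        else:
            shifted[member] = None  # assumption: no value outside the range
    return shifted


sales = {"2019-01-14": 10, "2019-01-15": 20, "2019-01-16": 30}

next_sales = shift(sales)                 # offset=1 looks at the next member
previous_sales = shift(sales, offset=-1)  # offset=-1 looks at the previous one

print(next_sales["2019-01-15"])      # 30, the value on 2019-01-16
print(previous_sales["2019-01-15"])  # 10, the value on 2019-01-14
```

With dates sorted ascending, a measure named previousSales would need offset=-1 under these semantics.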


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:04:06Z
----------------------------------------------------------------

I'm thinking we should use atoti.agg.single_value instead of atoti.agg.sum to get the vector value:

measures["AverageHistoricalPriceOfLargest20"] = tt.array.mean(
    tt.array.n_greatest(tt.agg.single_value(historicalPrices["HistoricalPrice"]), n=20)
)

https://github.com/atoti/atoti/discussions/694


@review-notebook-app

HuifangYeo commented on 2022-09-21T03:04:07Z
----------------------------------------------------------------

I'm not sure if tt.agg.sum(historicalPrices["HistoricalPrice"]) is the same as

tt.agg.sum(
  tt.agg.single_value(historicalPrices["HistoricalPrice"]),
  scope=tt.OriginScope(levels["ProductName"])
)

Also, if we want to take the lowest, should it be q=0.05 instead?

https://en.wikipedia.org/wiki/Value_at_risk
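
To make the q question concrete, here is a small plain-Python sketch. It is not atoti code: the quantile helper uses simple linear interpolation, and the vector is an invented set of 21 evenly spaced values. It only shows which end of a sorted vector each q points at, and does not settle which value the notebook should use.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a list, with 0 <= q <= 1."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Hypothetical vector: 21 evenly spaced values from -10 to 10.
vector = [float(v) for v in range(-10, 11)]

low_tail = quantile(vector, 0.05)   # sits near the smallest values
high_tail = quantile(vector, 0.95)  # sits near the largest values

print(low_tail, high_tail)
```

On this vector, q=0.05 lands near -9 and q=0.95 near 9, so picking out the lowest values corresponds to the small q.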


Contributor

@HuifangYeo HuifangYeo left a comment

I'm approving it ahead. Skipping the README update for this notebook, as the restructured gallery should be launched soon.

@printhellohetal
Member

We can close this PR, right?

3 participants