Skip to content

Conversation

@csthomas
Copy link
Contributor

@csthomas csthomas commented Nov 8, 2016

Pull Request for Issue #10567.

Related PRs #11184 and #8563

Summary of Changes

  • Big performance improvement in JTable reorder method.
  • Add a new public method to JDatabaseQuery selectRowNumber() with support for mysql, postgresql>=8.4, sqlsrv and sqlite3.
  • Do not load table rows to php memory. Use multi-table UPDATE syntax to reorder table in one sql query.
  • PostgreSQL:
    • Before patch each call for (string) $query; returned different result.
  • SQLSRV:
    • Fix multi table update to generate correct query.
    • In SELECT query move HAVING before ORDER BY
  • SQLite:
    • add support for multi table update query by INSERT OR REPLACE INTO
    • add sqlite function ROW_NUMBER(init=null, partitionBy=null) after establish a connection.
  • Add unit tests for selectRowNumber and multi table update queries.
  • Fix a few other unit test files that generated error after applied my changes.

Testing Instructions

Just as in a competitive PR #11184.

In short:

  • Create a new featured article in a category with a lots of articles. Patched version should be faster but you may not see it if you have less than 1000 articles in the category.
  • Check that your new article is on the top. [Set ordering ascending in list order]

Note:

  • Content items start ordering from 0.
  • Featured items start ordering from 1.

The test to be more enjoyable, you can add below code after $query->execute()
https://github.com/joomla/joomla-cms/pull/12839/files#diff-8254e2c441a41d3fa26f0f83fbbb3043R1390

JFactory::getApplication()->enqueueMessage(
'<b>reorder()</b>:<br/>'.$this->_db->replacePrefix($query).'<br/>Affected:'.$this->_db->getAffectedRows(), 'message');

Documentation Changes Required

Probably.

Introduce a new public method JDatabaseQuery::selectRowNumber().
Main database drivers (mysql, postgresql, sqlsrv and sqlite) can use Update with innerJoin.
SQLite driver has a new sqlite function ROW_NUMBER(init=null, partitionBy=null)

@csthomas
Copy link
Contributor Author

csthomas commented Nov 8, 2016

@andrepereiradasilva
Copy link
Contributor

so, for what i understand this will work on all database drivers, except SQLLite
i can only test in mysql, but is this ready for testing?

$k = $this->_tbl_key;
$subquery = $this->_db->getQuery(true)
->from($this->_tbl)
->windowRowNumber('ordering', 'new_ordering');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not related to the PR itself, but shouldn't $this->getColumnAlias('ordering') be used to allow more flexibility when naming the table columns?
See #12464

Copy link
Contributor Author

@csthomas csthomas Nov 17, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will take a look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with that but should I use that before it will be merged?

@csthomas
Copy link
Contributor Author

There is missing a few unit tests for windowRowNumber but you can start testing.

@andrepereiradasilva
Copy link
Contributor

ok will try to test today or tomorrow, but as said i can only test mysql.

@csthomas
Copy link
Contributor Author

I have found one issue with windowRowNumber on mysql database.
It can be used only once per query.
This is current limitation. I will try to find some solution for that.

@ggppdk
Copy link
Contributor

ggppdk commented Nov 21, 2016

I have found one issue with windowRowNumber on mysql database.
It can be used only once per query.
This is current limitation. I will try to find some solution for that.

JTable::reorder() does not need to call it more than once, right ?
but i understand that you want to extend it for more usage cases ?

Example of some query that will be created if it is called twice

SELECT *

	FROM
	(
		SELECT lang.lang_code
		 ,(SELECT @a := @a + 1 FROM (SELECT @a := 0) AS row_init) as num1
		 ,(SELECT @a := @a + 1 FROM (SELECT @a := 0) AS row_init) as num2
		FROM #__languages AS lang
		ORDER_BY _window_order_
	)
	AS window
	
ORDER BY _order_
  1. about making the variable name @A to be unique
    you can add a static counter variable to the method, and use @A1 , @a2, @A3, etc

  2. about window_order how are you going to handle multiple values for it ?

maybe it is best to throw an exception if someone tries to call it multiple times (throw because it is not implemented / supported thus prevent wrong usage of it, and no B/C issue, since the method is new)

  • JTable::reorder() does not need to call it more than once, (right ?)
  • in the future someone may implement multiple calls on it

@csthomas
Copy link
Contributor Author

csthomas commented Nov 22, 2016

maybe it is best to throw an exception if someone tries to call it multiple times (throw because it is not implemented / supported thus prevent wrong usage of it, and no B/C issue, since the method is new)

I tried to implement some code but I'm back to the same conclusion. I will add throw Exception.
Mysql query would require a very complicated query like SELECT * FROM ( SELECT ... ( SELECT ... ...`.

@ggppdk
Copy link
Contributor

ggppdk commented Dec 18, 2016

It is nice to see some PRs in this repository that try to squeeze some extra performance out some code
that can have some visible benefits

but this PR does not try to squeeze a little extra performance out of something
It tries to addresses a real performance issue with JTable reorder

that causes unacceptable performance and race conditions in large websites, when saving articles and when reordering articles and large menus

Also this PR involved some hard work and considerable time by @csthomas to add code and test all DBs, (even support MSSQL that currently noone uses), when no-one else would spend time to do a proper job (for all DBs) for years
as the reorder performance issue exists for years

and this PR achieves a 50x-100x speedup of JTable reorder in large websites (and is much more complete work for MySQL too, unlike previous PRs)

even my similar purpose PR #11184 , that involved quite some time by me, i will be very happy to close after / if this PR gets enough testing to be merged

  • about a couple of API methods added by this PR , they are currently only used in JTable reorder(), so i argue some people to have less worries of breaking something in different places
  • this PR (besides author) was tested by me and @andrepereiradasilva on MySQL and (besides author) by me on PostgreSQL

Author has also tested on MSSQL, and i almost did too, but i run into some problems installing and configuring SQL server, that delayed me, but i will find time to do it in the short term future to test MSQL too

I hope more people, can put this PR in their "testing / contributing list" e.g. @alikon

@csthomas, can you check and fix the conflicts ?

@csthomas
Copy link
Contributor Author

csthomas commented Dec 18, 2016

Fixed but the last issue is still PostgreSQL version from
https://github.com/joomla/joomla-cms/blob/staging/libraries/joomla/database/driver/postgresql.php#L57

It has to be at least 8.4.0.
If this can be done by other PR then this one can be merged:)

@mbabker
Copy link
Contributor

mbabker commented Dec 18, 2016

There isn't an acceptable way to raise the minimum supported database engine version and block upgrades on an unsupported environment in any current stable release (somewhere there's a PR to add this support because it will be a requirement for 4.0 but it is definitely not in 3.6.5), so from that aspect you're out of luck trying to raise the minimum.

From a practical perspective, you might get lucky though. There are no PostgreSQL 8.3 installs in the stats server and the oldest reported install uses 8.4.2.

@csthomas
Copy link
Contributor Author

csthomas commented Jan 5, 2017

Is it worth more to work on this?
PLT folks, can I bump postgreSQL version to 8.4.2?

If not then I will close that PR.

@ggppdk
Copy link
Contributor

ggppdk commented Jan 6, 2017

PLT folks, can I bump postgreSQL version to 8.4.2?

If there are a couple of persons using a previous postgreSQL version, then they can upgrade,
but i do not think there is even one , lol

There has been a lot of testing of many PR that were doing almost nothing,
why this important PR does not have any testers ?

The car is runs ok, but it is burning oil, and we care more for polishing it ?
First get it to mechanic to fix the problem (=test this PR) and then care to polish it

Can this PR get a high priority label (i guess critical would be for a security issue)

@csthomas
Copy link
Contributor Author

csthomas commented Jan 6, 2017

I have updated PR.
I took a decision to rebase it.
Later it would be easier to merge PR with upcoming new changes.

PR need to be tested again.
I tested on mysql, postgresql and sql server.

@ggppdk
Copy link
Contributor

ggppdk commented Jan 6, 2017

I ll re-test during week

@nibra
Copy link
Member

nibra commented Jan 7, 2017

can I bump postgreSQL version to 8.4.2?
If not then I will close that PR.

Not for 3.x. However, it might be a candidate for Joomla!4.0, so the PR should be made against 4.0-dev instead of staging.

@ggppdk
Copy link
Contributor

ggppdk commented Jan 7, 2017

Not for 3.x. However, it might be a candidate for Joomla!4.0, so the PR should be made against 4.0-dev instead of staging.

@nibra

Stopping this PR for 1 or 2 non-existing postgreSQL user of 8.3.NN is acceptable ?

while saving an article in a category with 1,000 articles
and having it make 1000 update queries and cost 10 seconds

or if you have 5,000 articles,
and require 50 seconds to 100 seconds is acceptable ?

Race conditions in DB updating when re-ordering large menu is acceptable ? (this PR minimizes them)

People having to hack the JTable reorder (using PR(s) proposed here or not proposed here) to make it usable in large websites is acceptable ?
[EDIT] i know 3rd party extensions can extend and override JTable::reorder() if they want, i mean Joomla article manager

There were other changes made in J3.4 J3.5 J3.6 that broke unintentionally several sites

and this change will not even break any 8.3.NN postgreSQL sites because they don't exist,
and even if a few exist, the change added effects saving and re-ordering, a user can still have time to upgrade

or if it is possible (because of the way upgrade is done)
add the minimum DB version check not only to installation as it is now,
by to the Joomla upgrade too, and user knows that a DB upgrade is needed

@csthomas
Copy link
Contributor Author

csthomas commented Jan 7, 2017

Is there an option to check what version of postgreSQL is running and use a workaround for that?
Example:
http://www.postgresonline.com/journal/archives/79-Simulating-Row-Number-in-PostgreSQL-Pre-8.4.html

@Bakual
Copy link
Contributor

Bakual commented Jan 7, 2017

I don't think we can check the db version for the update. We only have checks in place for the PHP version. We added those since we had that B/C break due to a serious security issue where we raised the PHP minimum to 5.3.10.

Stopping this PR for 1 or 2 non-existing postgreSQL user of 8.3.NN is acceptable ?

Either back that statement with facts or don't use such a statement when you want to break peoples sites.
Fact is, even with the stats plugin we don't know for sure how many PostgreSQL installations there are. Certainly not many, but most likely more than the 1-2 non-existing you say.

Semantic versioning is very clear about this question. You can't do it in 3.x. You can do it for 4.0.

@nibra
Copy link
Member

nibra commented Jan 7, 2017

We use Semantic Versioning, so BC break within the 3.x series is not acceptable. To get this improvement into 3,x, it will need a version switch providing the improvement for 8.4+, and keeping the old behaviour for pgsql < 8.4. There is no other option.

@rdeutz
Copy link
Contributor

rdeutz commented Jan 7, 2017

would be great to have it for 3.x but rules are rules and there is not other way as to make this change against the 4.0 branch

@mbabker
Copy link
Contributor

mbabker commented Jan 7, 2017

FWIW, the general argument outside the Joomla community is that semantic versioning only applies to the exposed API; it does NOT apply to the minimum supported version of a dependency (otherwise one could argue our updates of third party libraries are backward compatibility breaks as we in many cases also make changes which take advantage of the new version's features or have bumped a minimum dependency to ensure packages which weren't PHP 7 compatible weren't installed or to block packages with security vulnerabilities). While raising a minimum for more visible things like PHP and database versions is usually more painful than raising the minimum for the Registry or PHPMailer packages, the act of raising a minimum in and of itself is generally not a B/C break.

There is also a practical argument to be made here, given by this screenshot:

screen shot 2017-01-07 at 9 54 56 am

There are ZERO reported PostgreSQL 8.3 users in our statistics database. This may not be a 100% accurate statement due to the optional nature of sending the data, but this argument is now hitting a point where there's a practical side (nobody is using it, at least per the data the project has been collecting for a year now) and a procedural side.

As pointed out, until #12355 is in production (so 3.7) there isn't a way to check and block the upgrade if it were decided to allow it. So that does make it harder to just say yes.

@alikon
Copy link
Contributor

alikon commented Jan 7, 2017

imho we are still missing the main point:
is insane to update/reorder all records when a new article is created
whatever clever/complicated query we can write
we are try to circumvent the real problem

@mbabker
Copy link
Contributor

mbabker commented Jan 7, 2017

Without changing the way the data is stored and processed (which would be a more painful B/C break to deal with), what can you do about that?

@ggppdk
Copy link
Contributor

ggppdk commented Jan 7, 2017

@alikon

JTable::reorder should never use 1 SQL query per updated row, regardless of where it is called

  • now about new article saving, author of this PR has made a solution for this case too, that is B/C but uses large numbers,

but after this PR is accepted, updating a category with 10,000 records will be have JTable::reorder() cost of e.g. 1 second instead of 120seconds - 200 seconds,

finally code added by this PR , i suspect will be useful in other places too !

@csthomas
Copy link
Contributor Author

csthomas commented Jan 7, 2017

Thanks for all comments. This PR won't go to 3.7.

I want to create another PR with less functionality (without partition) that won't require newer postgreSQL.

Example:

-- Added on create connection
CREATE TEMP SEQUENCE rn;
-- On queries
SELECT *, nextval('rn') - 1 AS rn FROM (SELECT id, title FROM #__content LIMIT 10 OFFSET 2) AS a, (SELECT setval('rn', 1)) AS x;

@csthomas
Copy link
Contributor Author

csthomas commented Jan 8, 2017

I close this PR. Please test a new version at #13505

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants