Skip to content

Conversation

@jardelnovaes
Copy link

Fix set/getSchema for DataSources - The Driver ignores the setSchema method.

Hello!

I noticed that the Driver doesn't use the schema defined when I'm using DataSource. The getSchema method just execute an

"SELECT SCHEMA_NAME()"

and ignores the values passed in the setSchema method, it´s not compliance with the get/set schema idea of DataSource.

I'm was using a Spring Boot Application and setting multiples datasources in the application.yml file. In this case the user has provileges in the schema, but I can't set the default schema in the Sql Server.

My idea is, if I set the schema programmatically (e.g. application.yml) the Driver should use the value passed in the setSchema. If schema is null then the Driver can use the "SELECT SCHEMA_NAME()"

Usually a DataSource has the follow properties: url, driver-class-name, username and password. Sometime it has the schema property as well.

@msftclas
Copy link

msftclas commented Nov 28, 2018

CLA assistant check
All CLA requirements met.

@jardelnovaes
Copy link
Author

jardelnovaes commented Nov 28, 2018

Example of Spring Boot DataSource Properties extracted from https://docs.spring.io/spring-boot/docs/current/reference/html/common-application-properties.html

spring.datasource.driver-class-name= # Fully qualified name of the JDBC driver. Auto-detected based on the URL by default.
spring.datasource.password= # Login password of the database.
spring.datasource.schema= # Schema (DDL) script resource references.
spring.datasource.url= # JDBC URL of the database.
spring.datasource.username= # Login username of the database.

The driver returns "SELECT SCHEMA_NAME()" and not the value of spring.datasource.schema.

@codecov-io
Copy link

codecov-io commented Nov 28, 2018

Codecov Report

Merging #890 into dev will increase coverage by 0.36%.
The diff coverage is 85.71%.

Impacted file tree graph

@@             Coverage Diff              @@
##                dev     #890      +/-   ##
============================================
+ Coverage     48.58%   48.95%   +0.36%     
- Complexity     2803     2832      +29     
============================================
  Files           118      118              
  Lines         27868    27873       +5     
  Branches       4640     4641       +1     
============================================
+ Hits          13541    13644     +103     
+ Misses        12131    12118      -13     
+ Partials       2196     2111      -85
Flag Coverage Δ Complexity Δ
#JDBC42 48.49% <85.71%> (+0.36%) 2790 <0> (+29) ⬆️
#JDBC43 48.56% <85.71%> (ø) 2803 <0> (+1) ⬆️
Impacted Files Coverage Δ Complexity Δ
...om/microsoft/sqlserver/jdbc/SQLServerResource.java 100% <ø> (ø) 4 <0> (ø) ⬇️
.../microsoft/sqlserver/jdbc/SQLServerConnection.java 48.72% <85.71%> (+0.63%) 350 <0> (+15) ⬆️
...ncurrentlinkedhashmap/ConcurrentLinkedHashMap.java 38.14% <0%> (-0.65%) 42% <0%> (-1%)
...om/microsoft/sqlserver/jdbc/SQLServerBulkCopy.java 52.46% <0%> (-0.13%) 263% <0%> (-1%)
...rc/main/java/com/microsoft/sqlserver/jdbc/DDC.java 44.2% <0%> (ø) 107% <0%> (+1%) ⬆️
...n/java/com/microsoft/sqlserver/jdbc/DataTypes.java 79.48% <0%> (+0.19%) 5% <0%> (+1%) ⬆️
...m/microsoft/sqlserver/jdbc/SQLServerStatement.java 59.86% <0%> (+0.28%) 141% <0%> (+6%) ⬆️
...rc/main/java/com/microsoft/sqlserver/jdbc/dtv.java 62.57% <0%> (+0.29%) 0% <0%> (ø) ⬇️
...c/main/java/com/microsoft/sqlserver/jdbc/Util.java 61.63% <0%> (+0.64%) 92% <0%> (+3%) ⬆️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1fe2515...0e16caf. Read the comment docs.

@jardelnovaes
Copy link
Author

About the continuous-integration/appveyor/pr — AppVeyor build failed, the log says:

[...]
mvn test -B -Pbuild42
[...]

[INFO] 
[ERROR] Failures: 
[ERROR]   TVPAllTypesTest.testTVPStoredProcedureResultSet:104->testTVPStoredProcedureResultSet:132
[INFO] 
ERROR] Tests run: 533, Failures: 1, Errors: 0, Skipped: 17
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  05:40 min
[INFO] Finished at: 2018-11-28T01:22:12Z
[INFO] ------------------------------------------------------------------------

I executed the TVPAllTypesTest again successfully, I think it's something about the server environment.
It's working fine with -Pbuild43 as well. Below the log execution of build42.


[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running com.microsoft.sqlserver.jdbc.tvp.TVPAllTypesTest
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.403 s - in com.microsoft.sqlserver.jdbc.tvp.TVPAllTypesTest
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO]
[INFO] --- jacoco-maven-plugin:0.8.2:report (report) @ mssql-jdbc ---
[INFO] Loading execution data file E:\Projetos\Contributions\mssql-jdbc\target\jacoco.exec
[INFO] Analyzed bundle 'Microsoft JDBC Driver for SQL Server' with 325 classes
[WARNING] Classes in bundle 'Microsoft JDBC Driver for SQL Server' do no match with execution data. For report generation the same class files must be used as at runtime.
[WARNING] Execution data for class com/microsoft/sqlserver/jdbc/SQLServerConnection43 does not match.
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:10 min
[INFO] Finished at: 2018-11-28T03:10:39-02:00
[INFO] Final Memory: 33M/117M
[INFO] ------------------------------------------------------------------------

How can we execute the continuous-integration/appveyor/pr — AppVeyor again?

@jardelnovaes
Copy link
Author

Hello @kant and @mfriesen

The failed test is successfully running locally, how can we execute the continuous-integration/appveyor/pr — AppVeyor again?

@cheenamalhotra
Copy link
Member

@jardelnovaes Thanks for your contribution. I've restarted Appveyor build for you. We are aware of the failure and have recently fixed that random issue. Apologies for any inconvenience.

@jardelnovaes
Copy link
Author

thank you @cheenamalhotra!

@cheenamalhotra
Copy link
Member

Hi @jardelnovaes

I took a quick glance to the PR, and looks like setting/getting Schema here is only applicable to client side. The server schema could be something else, and the setSchema() does not interact with SQL Server to switch the database in use. In which case, getSchema() would not query server but return locally set property which could be invalid.

@jardelnovaes
Copy link
Author

This is exactly what the Driver should do, I'll explain:

PS: You can read the summary at the end or this detailed explanation.

Explanation:
setSchema is not the catalog or database name, when I change it I shouldn't switch the database in use. If I was using MySQL I should do it using the use databaseName command because in MySQL the schema and the catalog are the same thing.

It's not how the SQL Server works, SQL Server has database (or catalog), owner (or schema) (we can talk about linked server and so on, but I'll simplify)
A qualified name could be myDB.mySchema.myTable or myDB.dbo.myTable. I can remove the owner if I'd like to use the user default schema like this myDB..myTable and I can remove the database if I'd already use use myDB to switch the database (not the schema!) or I've set a database name in the connection string.

Oracle has the command alter session set current_schema=<not default schema> it changes the schema of the client session/connection, SQL Server doesn't have this option (The driver ought support the oldest Sql Server Version with Microsoft support).

Sql Server has just the option to permanently change the user default (e.g. ALTER USER [userName] WITH DEFAULT_SCHEMA = mySchema;)
I may have permission to read objects in many schemas, but I could have no permission to change the user settings, any way we should not change the user default schema at the server just because I set the schema in the client connection options.

If we did it all connection at that server will use that schema and not just that session/client connection, that is what this functionality set/get Schema provides (change just the session/connection).

When we have a different schema (not the user default) providers like Hibernate will produces queries with this format <schema>.<object> rather than just the object name, so the user will be able to read information from a different schema if he has permissions.

When we'd like to use the user default schema we don't need to set the schema property in the connection.

If the schema name is wrong or the user doesn't have permissions SQL Server will provide a message like object doesn't exists or user doesn't have permission on it

There's no problem to return SELECT SCHEMA_NAME() if schema was not set, it'll return the "server default" once the connection doesn't ask for a not default schema.

Summary:
setSchema should not change the user settings at the server.
Sql Server doesn't provide a command to change the current schema just for the session/connection.
Providing a different schema will produces a "qualified name" like schema.object and will work without change any server configuration.
There's no problem to return SELECT SCHEMA_NAME() if schema was not set.

@cheenamalhotra
Copy link
Member

cheenamalhotra commented Nov 29, 2018

Thanks for explanation!

Yes I agree SQL Server has no such functionality for schemas and I do understand how this API setSchema() is functional for other databases as you explained above too.

But if we try to make this API work just for client interaction purpose, I'm now sure how is this useful as it would also cause getSchema() to never return current schema in use by connection (based on logged in user), unless user sets it back to Null - which looks little tricky and would confuse the API behavior and not respect JDBC Specifications which do not force drivers to make this API work if underlying database cannot support it.

JDBC Specifications:
https://docs.oracle.com/javase/8/docs/api/java/sql/Connection.html#setSchema-java.lang.String-
Also according to Specs, getSchema() "Retrieves this Connection object's current schema name." not the one set with setSchema().

@jardelnovaes
Copy link
Author

jardelnovaes commented Nov 30, 2018

@cheenamalhotra, ufortunately I have to disagree with you in many of your comments, let's go!

is functional for other databases as you explained above too.
It's functional for Sql Server as well, just the driver doesn't repect what the connection asked for (to use other schema and not the default).

if underlying database cannot support it.
Does not Sql Server support it? Even the versions out of support have this functionality and it's used by many applications, not necessarily Java applications (at least not with this driver)

it would also cause getSchema() to never return current schema in use by connection (based on logged in user), unless user sets it back to Null
It's not how I've written the code! If you never set the schema, getSchema() returns the default schema set at the server by SELECT SCHEMA_NAME() which one is the schema used by the connection, just when you explicit set a schema it'll return what you set, that now will be the schema used by the connection. You'll always receive a schema by getSchema() and it'll be always the schema used by the connection.
And the For maximum portability, setSchema should be called before a Statement is created or prepared. API recomendation will be covered as well, actually in this case better than when you not set a schema.

not respect JDBC Specifications
Right now (and a long time ago) the SQL Server supports Schema, but the driver doesn't respect JDBC Specifications because doesn't respect the setSchema() and forces the user default schema even when the connection has asked for other one.

"Retrieves this Connection object's current schema name." not the one set with setSchema()."
It's a really weird comment, let's go!
Differently from C# that has properties Java use get/set pattern to represents a property. So you should define the property value using set<propertyName> and read using get<propertyName>, so always set<propertyName> is the way to change the value (of course you can have a default value, but when you change by the set<propertyName> you must keep this value).
If it's a readOnly property you'll have just the get<propertyName>, so it's not the case of schema.
If you don't implement the setSchema you shouldn't implement the getSchema, they're "married" creating the same idea of properties in C#.

//C# example
public class Connection {
	private String schema;

	public Schema {  //get and set are inside the property Schema;
		get {
			return yourLogic();
		}
		set {
			youtOtherLogic();
			schema	= value;
		}
	}
	
	public void test() {
		Connection connection =  new Connection();
		connection.Schema = "mySchema";
		Console.WriteLine(connection.Schema);
	}
}
//Java example
public class Connection {
	private String schema;

        // here C# is better because keep get and set are inside the property Schema, Java doesn't
	public getSchema(){
		return yourLogic();
	}

	public void setSchema(final String value) {
		youtOtherLogic();
		schema	= value;
	}
	
	public void test() {
		Connection connection =  new Connection();
		connection.setSchema("mySchema");
		System.out.println(connection.getSchema());
	}	
}

EntityFramework has already the schema support!
https://docs.microsoft.com/en-us/dotnet/api/system.data.entity.dbmodelbuilder.hasdefaultschema?view=entity-framework-6.2.0#System_Data_Entity_DbModelBuilder_HasDefaultSchema_System_String_
Configures the default database schema name. This default database schema name is used for database objects that do not have an explicitly configured schema name.
do not have an explicitly configured schema name -> It's because you can define a schema for each Entity (Table) by the POJOs classes.

modelBuilder.HasDefaultSchema("MyDefaultDbSchema");

Summary:
The Sql Server supports schema, but the jdbc driver doesn't.
The driver "supports" the getSchema() and not the setSchema(), it seems a man without a leg.
If the API define set and get as public methods you should implement both or none of them.
EntityFramework supports the schema.

@jardelnovaes
Copy link
Author

see https://github.com/aspnet/EntityFramework6/blob/master/src/EntityFramework/ModelConfiguration/Configuration/ModelConfiguration.cs

The EntityFramework6 works with the same idea of my PR. I did know about that when I've code, but it shows that the concept I applied is not wrong.

It'll set the schema of setSchema for all Entities in the context, it'll produces command with the format <mySchema>.<myObject>.

Why using .NET the driver should works as expected and using Java shouldn't?

es.Schema = DefaultSchema ?? EdmModelExtensions.DefaultSchema
If has a different schema use it or else use the default from EdmModelExtensions.

// <summary>
// Gets or sets the default schema name.
// </summary>
public string DefaultSchema { get; set; }
	
[...]

	
private void ConfigureDefaultSchema(DbDatabaseMapping databaseMapping)
{
	DebugCheck.NotNull(databaseMapping);

	databaseMapping.Database.GetEntitySets()
				   .Where(es => string.IsNullOrWhiteSpace(es.Schema))
				   .Each(es => es.Schema = DefaultSchema ?? EdmModelExtensions.DefaultSchema);

	databaseMapping.Database.Functions
				   .Where(f => string.IsNullOrWhiteSpace(f.Schema))
				   .Each(f => f.Schema = DefaultSchema ?? EdmModelExtensions.DefaultSchema);
}

@cheenamalhotra
Copy link
Member

@jardelnovaes

EntityFramework is an extension to ADO.NET (Microsoft.Data.SqlClient) Driver, and not the ADO.NET driver itself. By taking quick glance through, it clearly doesn't look like setting DefaultSchema on ModelConfiguration is same thing as setting Schema on SqlConnection in SqlClient driver. And Microsoft.Data.SqlClient does not expose SetSchema() API on active connection as you can see on MS Docs here. And the reason for this is only because SQL Server does not support setting Schema on active connection.

I understand what setters and getters are meant to do, and what I meant was as per JDBC Specs, setSchema() and getSchema() are meant to contact underlying database, just like any other connection property that modifies driver connection with server configuration and not just self interaction. In short, getSchema() is supposed to respond by querying database always rather than any other local value set by any other medium that is different than what database is aware of. And this is exactly how SqlConnection.GetSchema() behaves too, it returns DataTable fetched from SQL Server for the current connection that holds schema names active on current connection. Because we are bound to JDBC Specs, we cannot return anything else but String and hence the driver returns Default Schema of the caller (i.e. logged in user).

And as we cannot modify SQL Server's schema in use by the connection, it is controlled by caller (logged in user), setSchema() is rightly unsupported on SQLServerConnection.

@jardelnovaes
Copy link
Author

Hi @cheenamalhotra

About your comment:
it clearly doesn't look like setting DefaultSchema on ModelConfiguration is same thing as setting Schema on SqlConnection in SqlClient driver.

And couldn't be. Different from JDBC the .NET SqlConnection has a readonly property called getSchema which doesn't have the same ideia of the getSchema of JDBC. The Dataset returned by this getMethod seems to be something reading the INFORMATION_SCHEMA which has informations about columns, constraints, etc...

There're the same word for two differents proposals. As you checked the JDBC concept of getSchema is just the name of the schema in that Client/Server connection and not informations like INFORMATION_SCHEMA.

SQL Server does not support setting Schema on active connection.
It supports, not as the same way that other Databases such as Oracle or PostgreSQL, but it supports. Remember The INFORMATION_SCHEMA it's not just the name of schema.
Oracle can provider access to a different schema by the alter session set current_schema or by the pattern schema.object. Sql Server cannot provider by a server command as the alter session, but can provider by the pattern which is my contribution implementation.

The EntityFramework (after its 6th version) provides a solution for the schema support problem wich is similar to my contribution. The EF6 solution will provide the use of pattern schema.object.

I agree that the best support is a server command to change de current schema (not the default schema), but unfortunately we don't have this option and a solution like the EF6+ it's better than nothing.

Other day I was wondering: Why does community still relies on SQL Server drivers different and not on the Official Microsoft Driver?
Response:

  • Suppport to the funcionalities
  • Performance
  • Number of Bugs
  • Support (when you have problems)

Along this other drivers we have the Jtds Driver, one of most popular, and used by many companies.

This contribution is a private contribution and I don't like to expose my company name, but this company has many products, has bussiness with Microsoft to develop solutions together, has about 700 employees in 5 countries and are one of the companies that stil relies on other JBDC Drivers.

And I thought, how to change it, We should relies more on Microsoft (that is the SQL Server vendor) than relies on other ones.

So I saw a performance decrease in the PR #883 that was reverted by the PR #891 after my contributions, nice feeling I started helping to improve the Performance requirement.

I've been noticed some codes to work around the getSchema problem, especially when the code uses the JBDC DataSource concept.

So I decide to provide a solution for this issue, that fortunately I discovered that is similar to the EF6+ solution, and I openned the PR #890.

I disagree with you I sure that there're no reason to not provide this support, on the other hand I can understand you position and I repect it.

I'm not the owner of the driver and I'm not have power to decide anything about it, I'm just a contributor that has a dream to make the driver more reliable every day.

@peterbae
Copy link
Contributor

peterbae commented Dec 4, 2018

Hi @jardelnovaes,

Thanks for your contribution. However, I don't think we will merge this PR because this is an incorrect behavioral change. With your changes, if the user (with default schema dbo) were to call setSchema("dbo2"), from then on getSchema will return dbo2 but the user's connection is still using dbo, which will confuse the users. If we were to ship this, other customers in the future will come asking us about this inconsistent behavior.

I think you're putting a lot of emphasis on getSchema and setSchema having to both be functional since we provide implementations for them, but this is just to satisfy the JDBC API requirements - we could be getting rid of setSchema if we weren't required to have it since it's not doing anything right now. However, this shouldn't force the driver into implementing a wrong behavioral change.

@jardelnovaes
Copy link
Author

jardelnovaes commented Dec 5, 2018

@peterbae

Have you ever get this code and tried it?
As I've already said if you set the schema the getSchema will be correct 'cause the commands will be effectivelly running on dbo2 and not on dbo.
I've already said that it would be better if the SQL Server provided a server command, but this solution it's better than nothing.
As well the EF team already understood this situation and provided a solution similar to this PR.

The dbo that you mentioned it's the user default schema and not the current schema (dbo2) in this case.

This solution was already tested and worked fine with Spring Boot and other solution that uses DataSource concept.

I've already said to not confuse the .Net getSchema (information_schema data, not just the schema name).

It's possible to provide this solution, it's similar to EF6+ solution.

Finally I've already said that the decision is yours.

You can not merge it 'cause you don't want to or 'cause you think differently from EF team and will wait a solution from SQL Server team. But you absolutely cannot say that is a wrong behavior (at least if you know well about jdbc and schema)

@peterbae
Copy link
Contributor

peterbae commented Dec 5, 2018

Hi @jardelnovaes,

Yes, I have tested your code. I don't know how you tested your code, but my previous statement of:

With your changes, if the user (with default schema dbo) were to call setSchema("dbo2"), from then on getSchema will return dbo2 but the user's connection is still using dbo

still stands. Executing a query off of a statement object doesn't need to execute getSchema - therefore, creating and setting a private variable aside that only gets used when getSchema is called won't be used to execute queries.

As I already say if you set the schema the getSchema will be correct 'cause the commands will be effectivelly running on dbo2 and not on dbo.

Are you calling setSchema and then calling getSchema, and then injecting the output from getSchema directly in your query to the server? In this case, you just stored a local string (the string for schema) inside the Connection object. This shouldn't be necessary since you don't need to use a Connection object to set a local variable for yourself.

@jardelnovaes
Copy link
Author

@peterbae

What ORM did you use in your test?
I'm not creating a new ORM provider!

What situation does this PR create a bug or problem?

What test failed?

No, I don't use the getSchema to create the query 'cause I don't want to recreate the wheels!

Who do this are the ORM's providers like Hibernate. Using DataSource with Spring Data for example.

Now a days we don't loose time doing thinks that someone already did.

We create the queries manually in strict cases not for all application.

I won't ask for hibernate team to change it 'cause it already works for Oracle, PostgreSQL and others.

You can reject this PR if you don't want it. But you should understand how it will work,it seems you aren't.

@jardelnovaes
Copy link
Author

Sorry I don't want to loose more time in this discussion.
I'm closing this, in the future you can provide the solution by yourselves.

@peterbae
Copy link
Contributor

peterbae commented Dec 5, 2018

Hi @jardelnovaes,

If you wanted to use a different schema, you need to create a connection with the user that has access to the schema in SQL Server, instead of using setSchema. I hope this clarifies your question.

We value your contribution, and please let us know if you have any other suggestions for our driver in the future.

@jardelnovaes
Copy link
Author

jardelnovaes commented Dec 5, 2018

@peterbae

create a connection with the user that has access to the schema in SQL Server, instead of using setSchema.

In this case (use the user default schema) nobody needs the schema property, but how do I access a not default schema that the user has access but is not his default schema?

Already it seems you aren't understand about the subject.

Using EF I have a solution for this situation, but not for this JDBC Driver. As well for Oracle, PostgreSQL.

By the way I already closed this PR, I won't loose more time with this.

@jardelnovaes jardelnovaes reopened this Dec 5, 2018
@cheenamalhotra
Copy link
Member

cheenamalhotra commented Dec 5, 2018

Hi @jardelnovaes

Apologies for inconvenience.

We do understand the issue with ORM Frameworks when it comes to binding configurations with driver APIs. But we have to also consider abiding by JDBC specifications in comparison to other drivers, that can change default schema using setSchema() v/s our driver that cannot make this call to database. This change would also allow end users to specify any arbitrary schema (that user may set unknowingly), and the driver has no validation to that, which may lead to unexpected behavior for framework driven applications.

Thus, we cannot add this change and also stay compliant to JDBC Specs, in comparison with other JDBC drivers. I hope you understand where we are coming from.

On the contrary, we are Open Source driver, and you are free to keep this change in your branch and build driver for your application purpose. You may choose to update your branch when any features are added in our dev branch.

From our side, we are trying to approach SQL Server team to raise this concern of missing feature as not only you, there would be other customers in need for this support, and we understand that. It's just that we cannot add a temporary patch that may be workable for one customer, but may cause issues for other users due to lack of validation and server interaction in these APIs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants