Open questions about the ISA JSON Model #4
Comments
Request to unify the way relationships/associations are implemented. Almost all schemas use a uniform format for associations, for example:
{
"$defs": {
"Study": {
"properties": {
"observationVariables": {
"description": "The list of Observation Variables being used in this study. \n\nThis list is intended to be the wishlist of variables to collect in this study. It may or may not match the set of variables used in the collected observation records. ",
"items": {
"$ref": "ObservationVariable.json#/$defs/ObservationVariable"
},
"referencedAttribute": "studies",
"relationshipType": "many-to-many",
"type": "array"
}
}
},
"$id": "https://brapi.org/Specification/BrAPI-Schema/BrAPI-Core/Study.json",
"$schema": "http://json-schema.org/draft/2020-12/schema"
}
As you can see, the association is itself a property, so the relationships can be converted automatically without any problem.
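To illustrate why the uniform format converts so easily, here is a minimal sketch of an extractor that reads associations straight off the top-level properties. The function name `extract_associations` and the shape of the returned records are assumptions made for this example; they are not part of BrAPI or Zendro.

```python
import json


def extract_associations(schema: dict) -> list[dict]:
    """Collect every top-level property that carries a 'relationshipType' annotation."""
    associations = []
    for model_name, model in schema.get("$defs", {}).items():
        for prop_name, prop in model.get("properties", {}).items():
            if "relationshipType" not in prop:
                continue  # plain attribute, not an association
            # To-many associations keep the target under "items", to-one ones directly.
            ref = prop.get("items", {}).get("$ref") or prop.get("$ref")
            associations.append({
                "model": model_name,
                "name": prop_name,
                "type": prop["relationshipType"],
                "target": ref.rsplit("/", 1)[-1] if ref else None,
                "reverse": prop.get("referencedAttribute"),
            })
    return associations


# Condensed version of the Study schema quoted above.
study_schema = json.loads("""
{
  "$defs": {
    "Study": {
      "properties": {
        "observationVariables": {
          "items": {"$ref": "ObservationVariable.json#/$defs/ObservationVariable"},
          "referencedAttribute": "studies",
          "relationshipType": "many-to-many",
          "type": "array"
        }
      }
    }
  }
}
""")

print(extract_associations(study_schema))
```

Because the annotation sits on the property itself, a single loop over `properties` is enough; no recursion into inline objects is needed.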
All three of these associations are defined differently from the others:
{
"$defs": {
"PedigreeNode": {
"properties": {
"parents": {
"description": "A list of parent germplasm references in the pedigree tree for this germplasm. These represent edges in the tree, connecting to other nodes.\n<br/> Typically, this array should only have one parent (clonal or self) or two parents (cross). In some special cases, there may be more parents, usually when the exact parent is not known. \n<br/> If the parameter 'includeParents' is set to false, then this array should be empty, null, or not present in the response.",
"items": {
"properties": {
"parentGermplasm": {
"$ref": "Germplasm.json#/$defs/Germplasm",
"description": "The ID which uniquely identifies a parent germplasm",
"referencedAttribute": "progenyPedigreeNodes",
"relationshipType": "many-to-one"
},
"parentType": {
"description": "The type of parent used during crossing. Accepted values for this field are 'MALE', 'FEMALE', 'SELF', 'POPULATION', and 'CLONAL'. \n\nIn a pedigree record, the 'parentType' describes each parent of a particular germplasm. \n\nIn a progeny record, the 'parentType' is used to describe how this germplasm was crossed to generate a particular progeny. \nFor example, given a record for germplasm A, having a progeny B and C. The 'parentType' field for progeny B item refers \nto the 'parentType' of A toward B. The 'parentType' field for progeny C item refers to the 'parentType' of A toward C.\nIn this way, A could be a male parent to B, but a female parent to C. ",
"enum": [
"MALE",
"FEMALE",
"SELF",
"POPULATION",
"CLONAL"
],
"type": "string"
}
},
"required": [
"germplasmDbId",
"parentType"
],
"type": "object"
},
"type": [
"null",
"array"
]
}
}
},
"$id": "https://brapi.org/Specification/BrAPI-Schema/BrAPI-Germplasm/PedigreeNode.json",
"$schema": "http://json-schema.org/draft/2020-12/schema"
}
As you can see, the association here is defined as a nested property of another property. Is there a possibility to unify the format of the associations and define each of them as an individual property?
The It would look something like this:
@LzLang Will this work for Zendro? Will it be able to pick up the reference to |
Notes on how to resolve the above issue(s)
"Nested relationships"
Any one-to-one or one-to-many relation to objects that do not have a separate data model definition we dub "nested". It would be helpful to discontinue the use of such nested relationships, give those objects separate JSON data model definitions instead, and then define the relationships the same way as in all other cases.
List of explicitly defined foreign keys
Some data models state foreign keys explicitly; these should be excluded from the "standard" data model definition. @LzLang will provide us with a list of these keys so that they can be removed from the JSON model definitions. Note that currently, in the context of automated data warehouse generation with Zendro, we automatically create foreign keys for each association.
"Compound foreign keys"
Zendro only supports single foreign keys. Of course, we could have one for the mother germplasm id and another one for the father. This would be a solution wherever we know how many associations we have to the same data model.
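As a rough aid for finding the "nested" relationships mentioned in these notes, the following sketch lists every association that is declared inside an inline object instead of on a top-level property. The helper name `find_nested_associations` and the dotted path format are assumptions for illustration only.

```python
def find_nested_associations(schema: dict) -> list[str]:
    """Return dotted paths of associations declared inside inline objects."""
    nested = []
    for model_name, model in schema.get("$defs", {}).items():
        for prop_name, prop in model.get("properties", {}).items():
            # The inline object is either the array's "items" or the property itself.
            inline = prop.get("items", prop)
            for child_name, child in inline.get("properties", {}).items():
                if "relationshipType" in child:
                    nested.append(f"{model_name}.{prop_name}.{child_name}")
    return nested


# Condensed version of the PedigreeNode schema quoted earlier in the thread.
pedigree_schema = {
    "$defs": {
        "PedigreeNode": {
            "properties": {
                "parents": {
                    "items": {
                        "properties": {
                            "parentGermplasm": {
                                "$ref": "Germplasm.json#/$defs/Germplasm",
                                "referencedAttribute": "progenyPedigreeNodes",
                                "relationshipType": "many-to-one",
                            },
                            "parentType": {"type": "string"},
                        },
                        "type": "object",
                    },
                    "type": ["null", "array"],
                }
            }
        }
    }
}

print(find_nested_associations(pedigree_schema))
```

Each reported path is a candidate for extraction into its own data model definition, as proposed above.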
Possible Solution for nested properties
Hello @BrapiCoordinatorSelby, we worked on the nested properties issue and tried to separate those into their own models. Cross.json now (condensed to the changed attributes):
{
"$defs": {
"Cross": {
"properties": {
"crossAttributes": {
"referencedAttribute": "cross",
"relationshipType": "one-to-many",
"items": {
"$ref": "CrossAttribute.json#/$defs/CrossAttribute",
"description": "Set of custom attributes associated with a cross"
},
"type": [
"null",
"array"
]
},
"externalReferences": {
"referencedAttribute": "cross",
"relationshipType": "one-to-many",
"items": {
"$ref": "CrossExternalReferences.json#/$defs/CrossExternalReferences",
"description": "An array of external reference ids. These are references to this piece of data in an external system. Could be a simple string or a URI."
},
"type": [
"null",
"array"
]
},
"parent1": {
"$ref": "Germplasm.json#/$defs/Germplasm",
"description": "the unique identifier for a germplasm",
"referencedAttribute": "parent1Childs",
"relationshipType": "many-to-one"
},
"parent2": {
"$ref": "Germplasm.json#/$defs/Germplasm",
"description": "the unique identifier for a germplasm",
"referencedAttribute": "parent2Childs",
"relationshipType": "many-to-one"
},
"pollinationEvents": {
"referencedAttribute": "cross",
"relationshipType": "one-to-many",
"items": {
"$ref": "CrossPollinationEvent.json#/$defs/CrossPollinationEvent",
"description": "The list of pollination events that occurred for this cross"
},
"type": [
"null",
"array"
]
}
},
"required": [
"crossDbId"
],
"title": "Cross",
"type": "object"
}
},
"$id": "https://brapi.org/Specification/BrAPI-Schema/BrAPI-Germplasm/Cross.json",
"$schema": "http://json-schema.org/draft/2020-12/schema"
}
We created the following models:
CrossAttribute:
{
"$defs": {
"CrossAttribute": {
"properties": {
"cross_attribute_ID": {
"description": "the unique identifier for a cross attribute",
"type": "string"
},
"crossAttributeName": {
"description": "the human readable name of a cross attribute",
"type": [
"null",
"string"
]
},
"crossAttributeValue": {
"description": "the value of a cross attribute",
"type": [
"null",
"string"
]
},
"cross": {
"$ref": "Cross.json#/$defs/Cross",
"description": "The unique identifier for a Cross",
"referencedAttribute": "crossAttributes",
"relationshipType": "many-to-one"
}
},
"required": [
"cross_attribute_ID"
],
"title": "CrossAttribute",
"type": "object"
}
},
"$id": "https://brapi.org/Specification/BrAPI-Schema/BrAPI-Germplasm/CrossAttribute.json",
"$schema": "http://json-schema.org/draft/2020-12/schema"
}
CrossExternalReferences:
{
"$defs": {
"CrossExternalReferences": {
"properties": {
"reference_ID": {
"description": "The external reference ID. Could be a simple string or a URI.",
"type": [
"null",
"string"
]
},
"referenceSource": {
"description": "An identifier for the source system or database of this reference",
"type": [
"null",
"string"
]
},
"cross": {
"$ref": "Cross.json#/$defs/Cross",
"description": "The unique identifier for a Cross",
"referencedAttribute": "externalReferences",
"relationshipType": "many-to-one"
}
},
"required": [
"reference_ID"
],
"title": "CrossExternalReferences",
"type": "object"
}
},
"$id": "https://brapi.org/Specification/BrAPI-Schema/BrAPI-Germplasm/CrossExternalReferences.json",
"$schema": "http://json-schema.org/draft/2020-12/schema"
}
CrossPollinationEvent:
{
"$defs": {
"CrossPollinationEvent": {
"properties": {
"pollination_ID": {
"description": "The unique identifier for this pollination event",
"type": [
"null",
"string"
]
},
"pollinationSuccessful": {
"description": "True if the pollination was successful",
"type": [
"null",
"boolean"
]
},
"pollinationTimeStamp": {
"description": "The timestamp when the pollination took place",
"format": "date-time",
"type": [
"null",
"string"
]
},
"cross": {
"$ref": "Cross.json#/$defs/Cross",
"description": "The unique identifier for a Cross",
"referencedAttribute": "pollinationEvents",
"relationshipType": "many-to-one"
}
},
"required": [
"pollination_ID"
],
"title": "CrossPollinationEvent",
"type": "object"
}
},
"$id": "https://brapi.org/Specification/BrAPI-Schema/BrAPI-Germplasm/CrossPollinationEvent.json",
"$schema": "http://json-schema.org/draft/2020-12/schema"
}
In the original Cross model, there were two special nested properties, "parent1" and "parent2".
"parent1": {
"$ref": "Germplasm.json#/$defs/Germplasm",
"description": "the unique identifier for a germplasm",
"referencedAttribute": "parent1Childs",
"relationshipType": "many-to-one"
},
"parent2": {
"$ref": "Germplasm.json#/$defs/Germplasm",
"description": "the unique identifier for a germplasm",
"referencedAttribute": "parent2Childs",
"relationshipType": "many-to-one"
},
Germplasm.json:
"parent1Childs": {
"title": "parent1Childs",
"description": "Childs of the germplasm",
"referencedAttribute": "parent1",
"relationshipType": "one-to-many",
"items": {
"$ref": "Cross.json#/$defs/Cross",
"description": "Crosses"
},
"type": [
"null",
"array"
]
},
"parent2Childs": {
"title": "parent2Childs",
"description": "Childs of the germplasm",
"referencedAttribute": "parent2",
"relationshipType": "one-to-many",
"items": {
"$ref": "Cross.json#/$defs/Cross",
"description": "Crosses"
},
"type": [
"null",
"array"
]
}
Way of standardizing primary and foreign keys
Currently primary and foreign keys are defined the same way, e.g. from Cross:
{
"$defs": {
"Cross": {
"properties": {
"crossDbId": {
"description": "the unique identifier for a cross",
"type": "string"
},
"parent1": {
"properties": {
"germplasmDbId": {
"description": "the unique identifier for a germplasm",
"type": [
"null",
"string"
]
}
},
"type": [
"null",
"object"
]
}
},
"required": [
"crossDbId"
],
"title": "Cross",
"type": "object"
}
},
"$id": "https://brapi.org/Specification/BrAPI-Schema/BrAPI-Germplasm/Cross.json",
"$schema": "http://json-schema.org/draft/2020-12/schema"
}
The primary key
And for foreign keys we used a similar pattern, for example:
"listOwnerPerson": {
"$ref": "Person.json#/$defs/Person",
"description": "The unique identifier for a List Owner. (usually a user or person)",
"referencedAttribute": "lists",
"relationshipType": "many-to-one"
},
So basically one person can have multiple lists. In Zendro we would define the relationship like this:
"listOwnerPerson": {
"type": "many_to_one",
"implementation": "foreignkeys",
"reverseAssociation": "lists",
"target": "Person",
"targetKey": "lists_ids",
"sourceKey": "list_owner_person_id",
"keysIn": "List",
"targetStorageType": "sql"
}
So our foreign keys are named after the attribute and use the suffix id or ids, depending on whether it is an array or not.
Standardizing a way of defining associations
Currently BrAPI uses two different ways to define associations.
"observationUnits": {
"title": "observationUnits",
"description": "observationUnits",
"referencedAttribute": "cross",
"relationshipType": "one-to-many",
"items": {
"$ref": "ObservationUnit.json#/$defs/ObservationUnit",
"description": "ObservationUnit"
},
"type": [
"null",
"array"
]
}
On the other hand:
"crossingProject": {
"$ref": "CrossingProject.json#/$defs/CrossingProject",
"description": "the unique identifier for a crossing project",
"referencedAttribute": "crosses",
"relationshipType": "many-to-one"
},
We don't see a benefit in nesting the reference and giving it a separate description.
"observationUnits": {
"title": "observationUnits",
"description": "observationUnits",
"referencedAttribute": "cross",
"relationshipType": "one-to-many",
"$ref": "ObservationUnit.json#/$defs/ObservationUnit",
"type": [
"null",
"array"
]
}
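The foreign key naming convention described in this comment (attribute name in snake case plus id or ids) and the Zendro definition of listOwnerPerson quoted with it can be sketched as a small conversion. This is only an illustration that reproduces that one quoted example: the helper names are hypothetical, the fixed values "foreignkeys" and "sql" are copied from the example, and nothing here is Zendro's actual code generator.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert camelCase to snake_case, e.g. listOwnerPerson -> list_owner_person."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def foreign_key_name(attribute: str, is_array: bool) -> str:
    """Name the key after the attribute; arrays get 'ids', single values 'id'."""
    return to_snake_case(attribute) + ("_ids" if is_array else "_id")


def many_to_one_association(source_model: str, attribute: str, prop: dict) -> dict:
    """Build a Zendro-style definition from a BrAPI many-to-one property."""
    reverse = prop["referencedAttribute"]
    return {
        "type": prop["relationshipType"].replace("-", "_"),
        "implementation": "foreignkeys",  # assumption: copied from the example
        "reverseAssociation": reverse,
        "target": prop["$ref"].rsplit("/", 1)[-1],
        "targetKey": foreign_key_name(reverse, is_array=True),
        "sourceKey": foreign_key_name(attribute, is_array=False),
        "keysIn": source_model,
        "targetStorageType": "sql",  # assumption: copied from the example
    }


list_owner_person = {
    "$ref": "Person.json#/$defs/Person",
    "referencedAttribute": "lists",
    "relationshipType": "many-to-one",
}

print(many_to_one_association("List", "listOwnerPerson", list_owner_person))
```

Other relationship types would need their own handling, for example deciding which side holds the keys.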
Open Questions
When parsing and reading through the ISA JSON Model a few questions arose. They are listed here.
How to treat properties of type object
In some cases BrApi JSON data models have properties of type object. We can model them in Zendro in a number of ways.
one-to-many association.
Probably this should be decided on a case-by-case level?
Example: additionalInfo and additionalProperties, e.g. in Person.json.
Structure of additionalInfo
The definition taken from Person.json says:
So, according to this specification, a person can have additional info. But what is the structure of this object?
The object additionalInfo can have a number of additionalProperties that are of type string?
Is additionalProperties a collection of key and value pairs that can store any information? In that case, we cannot provide a schema, but must use serialized JSON.
Reply from meeting with the BrApi group
additionalInfo should be the only case where we see non-formatted data. In the BrApi test server we serialize and store this object as JSON.
How to model externalReferences?
The array of external references is found in the Person model:
There are several questions about this specification:
referenceId and referenceSource are marked as required, but in their type specification null is allowed.
Response from the BrApi development team
ExternalReference referenceId can be e.g. a DOI URL
Page & Holmes 2012, Inferring Phylogenies (actually this example does not exist)
Another example would be the field-book-App: field-book
Validation
Zendro has the capability to use any validation function on provided data. The Zendro framework can validate both data formats (syntactically) and data values (semantically). However, if the database has to be queried, we should consider whether this might be a performance bottleneck.
Example taken from Sample.json:
Questions:
If we can recognize validations and the respective function to carry out on data values, we can implement them in Zendro, and generate them automatically.
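As a sketch of what such automatic generation could look like, the snippet below derives a purely syntactic validator from the JSON Schema keywords `enum`, `format` and `type`, which appear in the schemas quoted in this thread. The function `build_validator` is hypothetical and is not a Zendro API; the date-time check uses Python's `datetime.fromisoformat`, which only approximates the JSON Schema `date-time` format.

```python
from datetime import datetime


def build_validator(prop_schema: dict):
    """Return a function that checks one value against enum/format/null rules."""
    allowed = prop_schema.get("enum")
    fmt = prop_schema.get("format")
    types = prop_schema.get("type", [])
    nullable = "null" in (types if isinstance(types, list) else [types])

    def validate(value) -> bool:
        if value is None:
            return nullable  # null is valid only if the type list allows it
        if allowed is not None and value not in allowed:
            return False
        if fmt == "date-time":
            try:
                # Approximation: accept ISO 8601 strings, treating "Z" as UTC.
                datetime.fromisoformat(value.replace("Z", "+00:00"))
            except (AttributeError, ValueError):
                return False
        return True

    return validate


parent_type_ok = build_validator({
    "enum": ["MALE", "FEMALE", "SELF", "POPULATION", "CLONAL"],
    "type": "string",
})
timestamp_ok = build_validator({"format": "date-time", "type": ["null", "string"]})

print(parent_type_ok("MALE"), parent_type_ok("UNKNOWN"))
print(timestamp_ok("2024-05-01T10:00:00Z"), timestamp_ok("not a date"))
```

Checks of this kind need no database query, so they avoid the performance concern raised above; semantic checks that look up other records would have to be written separately.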
Relationships / Associations
For some associations we see the foreign keys implemented in the JSON Specs, e.g. in Sample.json:
Here, we can conclude from the name of the foreign key and its existence:
many-to-[one|many], i.e. many samples belong to probably many (not one) germplasm)
Germplasm).
However, a formal specification of all relationships would be extremely helpful and resolve open questions.
To be excluded properties
In some data models foreign keys are stated. Also, to spare the user from sending another request to the RESTful API, some of the properties of the associated (relationship) models are stored, too. See this example taken from Sample.json:
Using GraphQL these properties are not required. GraphQL specifically allows fetching, within a single HTTP request, all data the user wants, including properties of related (associated) data models. Furthermore, given that we at some point have a formal description of relationships between data models, foreign keys would ideally no longer be listed among data model definitions.
Is there a way we can recognize these "to be excluded" properties and not include them in the final GraphQL data model definitions? An easy quick-and-dirty solution would be a simple exclusion list?
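A minimal sketch of the exclusion-list idea might look as follows. The entries in `EXCLUSIONS`, the property names in `sample_properties` and the function `strip_excluded` are all illustrative assumptions; they are not taken from the actual Sample.json.

```python
# Hypothetical exclusion list: model name -> property names to drop.
# The entries are illustrative, not taken from the actual Sample.json.
EXCLUSIONS = {"Sample": {"germplasmDbId", "plateName"}}


def strip_excluded(model_name: str, properties: dict, exclusions: dict) -> dict:
    """Return a copy of the properties without the excluded names."""
    skip = exclusions.get(model_name, set())
    return {name: spec for name, spec in properties.items() if name not in skip}


# Illustrative property set for a Sample-like model.
sample_properties = {
    "sampleDbId": {"type": "string"},
    "sampleName": {"type": ["null", "string"]},
    "germplasmDbId": {"type": ["null", "string"]},
    "plateName": {"type": ["null", "string"]},
}

kept = strip_excluded("Sample", sample_properties, EXCLUSIONS)
print(sorted(kept))  # ['sampleDbId', 'sampleName']
```

The list has to be maintained by hand, which is why it is only a stopgap until relationships are formally specified.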
Response from the GraphQL development group
DbId is either a primary or a foreign key.