To kick off the implementation of classification layers in an object-oriented way, the first step is to define a base ClassificationLayer class. This class will act as the foundation for all specific classification types (e.g., SegmentationClassificationLayer, SlopeClassificationLayer, FusionClassificationLayer). The purpose of the base class is to encapsulate the common functionality shared by all classification types while allowing flexibility for subclasses to implement their specific logic.
Code
In a new Python file classification.py in the xDEM source code, write the ClassificationLayer abstract class:
Define Common Attributes: The base class should define the attributes that are common across all classification layers, such as:
dem DEM on which the classification will be applied.
name name for the classification layer.
req_stats list of required statistics to compute (optional, all statistics in geoutils.Raster computed by default).
req_stats_classes list of the classes on which the statistics will be applied (optional, all classes by default).
class_names dict connecting the class names to the class indexes (set to None).
classification result of the classification, a geoutils.Mask object (set to None).
stats_dict required statistics for required classes in a dict (set to None).
Abstract Method: Since different classification layers will have their own classification logic (e.g., segmentation masks vs. slope ranges), the base class should declare abstract methods that the subclasses are required to implement, such as apply_classification(). This method will be used to compute the classification attribute, which will be a geoutils.Mask object, in which each band will represent one class mask.
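A minimal sketch of what the base class described so far could look like, using only the standard library. The attribute names come from the list above; the loose typing of dem and classification (Any instead of xdem.DEM and geoutils.Mask), the reading of None as "all statistics" / "all classes", and the toy ConstantClassificationLayer subclass are assumptions made for illustration, not a proposed final design.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class ClassificationLayer(ABC):
    """Sketch of the base class; attribute names follow the issue text."""

    def __init__(
        self,
        dem: Any,  # expected to be an xdem.DEM; typed loosely to keep the sketch standalone
        name: str,
        req_stats: Optional[list] = None,  # assumption: None means "all statistics"
        req_stats_classes: Optional[list] = None,  # assumption: None means "all classes"
    ) -> None:
        self.dem = dem
        self.name = name
        self.req_stats = req_stats
        self.req_stats_classes = req_stats_classes
        # Filled in later by apply_classification() and get_stats().
        self.class_names: Optional[dict] = None
        self.classification: Any = None
        self.stats_dict: Optional[dict] = None

    @abstractmethod
    def apply_classification(self) -> None:
        """Compute self.classification and self.class_names."""


class ConstantClassificationLayer(ClassificationLayer):
    """Hypothetical toy subclass: puts every pixel in a single class."""

    def apply_classification(self) -> None:
        self.class_names = {"everything": 0}
        # Nested lists stand in for a geoutils.Mask here.
        self.classification = [[True] * len(row) for row in self.dem]
```

Because apply_classification() is abstract, instantiating ClassificationLayer directly raises a TypeError, while a subclass that implements it can be created and used normally.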
Statistics computation: A common feature across all classification layers is the ability to compute statistics on the classified pixels (e.g., mean, standard deviation). The base class should provide a get_stats() method that takes the req_stats and req_stats_classes attributes into account. For each required class, this method should use DEM.set_mask() to apply the classification mask and DEM.get_stats() to compute the required statistics. The result, stored under the stats attribute, should be a nested dict in which the first level represents the classes and the second level the statistics.
Saving results: Another common feature is the ability to save the results. The save() method should take an output_dir as input, and save:
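To illustrate the nested class/statistic layout, here is a rough sketch that uses plain NumPy arrays as stand-ins: a 2-D array for the DEM and a boolean array of shape (n_classes, rows, cols) for the classification. The function name class_stats, the four statistics offered, and the NaN returned for empty classes are all assumptions; in xDEM the masking and the statistics would go through DEM.set_mask() and DEM.get_stats() as described above.

```python
import numpy as np


def class_stats(dem, class_masks, class_names, req_stats=None, req_stats_classes=None):
    """Return {class name: {statistic name: value}} for the requested classes."""
    # Hypothetical subset of statistics; the real list would come from geoutils.
    available = {"mean": np.mean, "std": np.std, "min": np.min, "max": np.max}
    stat_names = req_stats if req_stats is not None else list(available)
    classes = req_stats_classes if req_stats_classes is not None else list(class_names)

    stats = {}
    for cls in classes:
        # class_names maps a class name to the index of its mask.
        values = dem[class_masks[class_names[cls]]]
        if values.size == 0:
            # No pixel in this class: report NaN instead of failing.
            stats[cls] = {s: float("nan") for s in stat_names}
        else:
            stats[cls] = {s: float(available[s](values)) for s in stat_names}
    return stats


# Made-up 2 x 2 elevation grid with two classes.
dem = np.array([[1.0, 2.0], [3.0, 5.0]])
masks = np.array(
    [
        [[True, True], [False, False]],  # class index 0
        [[False, False], [True, True]],  # class index 1
    ]
)
names = {"flat": 0, "steep": 1}
result = class_stats(dem, masks, names, req_stats=["mean", "max"])
```

With these values, result["flat"] holds the mean and maximum of the first row and result["steep"] those of the second row.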
The classification object with the Mask.save() method, under name.tif;
The class_names attribute, a dict that maps the name of each class to its index, under name_classes.json;
The stats attribute, under name_stats.json OR name_stats.csv.
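A possible sketch of the JSON part of save(), using only the standard library. The helper name save_tables and the example values are hypothetical; writing the classification itself is left out because it depends on the Mask.save() call mentioned above, and the CSV alternative for the statistics is not covered.

```python
import json
import tempfile
from pathlib import Path


def save_tables(output_dir, name, class_names, stats):
    """Write <name>_classes.json and <name>_stats.json into output_dir."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    classes_path = out / f"{name}_classes.json"
    stats_path = out / f"{name}_stats.json"
    classes_path.write_text(json.dumps(class_names, indent=2))
    stats_path.write_text(json.dumps(stats, indent=2))
    # The classification mask would be written here as <name>.tif.
    return classes_path, stats_path


# Usage with made-up values, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    classes_path, stats_path = save_tables(
        tmp, "slope", {"flat": 0, "steep": 1}, {"flat": {"mean": 1.5}}
    )
    written_names = sorted(p.name for p in Path(tmp).iterdir())
    reloaded_stats = json.loads(stats_path.read_text())
```

Prefixing every file with the layer name keeps the outputs of several layers apart when they share one output_dir.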
Documentation
We need to start a documentation page on this subject.
Great to have this overview, including #693 to #696! 🙂
I'm commenting for all 5 issues below.
Only a couple conceptual remarks at this stage:
Storing the classif output: I would be in favor of relying on pd.DataFrame objects instead of dictionaries to report bins. They are made for this, by supporting interval indexing (e.g., continuous with open/closed support such as [1, 2[, [2, 5[, for binning, or discrete for segmentation such as [1], [5]), and by being able to natively combine several bins through multiple indexing (https://pandas.pydata.org/docs/user_guide/advanced.html; can also use named columns to simplify). That would support both types of binning mentioned (discrete=segmentation and continuous=slope), and through multi-indexing there might not be a need for a specific "fusion" type? (if I understood the objectives there correctly).
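For illustration, a small sketch of the kind of table this suggests: a DataFrame whose rows are indexed by a slope interval and a discrete land-cover code at the same time. The bin edges, class codes, level names and statistic values are made up for the example.

```python
import pandas as pd

# Continuous bins, closed on the left: [0, 10), [10, 30), [30, 90).
slope_bins = pd.IntervalIndex.from_breaks([0, 10, 30, 90], closed="left")
# Discrete classes, e.g. from a segmentation mask.
landcover = [1, 5]

# One row per (slope bin, land-cover class) combination.
index = pd.MultiIndex.from_product(
    [slope_bins, landcover], names=["slope", "landcover"]
)
stats = pd.DataFrame(
    {"mean": [0.1, 0.2, 0.5, 0.7, 1.5, 2.0], "count": [10, 4, 8, 6, 3, 9]},
    index=index,
)

# All slope bins for land-cover class 5.
class5 = stats.xs(5, level="landcover")
# Mean for the bin [10, 30) within that class.
mean_mid = class5.loc[pd.Interval(10, 30, closed="left"), "mean"]
# Position of the bin that contains a slope of 15 degrees.
bin_of_15 = slope_bins.get_loc(15)
```

The combination of two classifications appears here as an extra index level rather than as a separate type of layer.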
Performing the classif: Note that we have multiple-variable classification (= N-D binning) in xDEM already (code here: https://github.com/GlacioHack/xdem/blob/main/xdem/spatialstats.py#L77; example of 2-D application with a figure here: https://xdem.readthedocs.io/en/stable/uncertainty.html#heteroscedasticity). We chose to rely on scipy.stats.binned_statistic_dd at the time, but we could also switch to the pandas groupby(), which is now more modular than 5 years ago and available for rasters through Xarray as well (so soon in GeoUtils/xDEM through the accessor). With those functionalities already performing the binning, we might not need the ClassificationLayer classes? (I'm not sure I grasped all the objectives of the classes! See my final note below 😉)
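As a rough illustration of the groupby route, the sketch below bins a handful of made-up elevation differences by slope interval and land-cover class with plain pandas. It is not xDEM's nd_binning; the column names, bin edges and values are assumptions, and a real raster would first be flattened to one row per pixel.

```python
import numpy as np
import pandas as pd

# Made-up per-pixel values.
df = pd.DataFrame(
    {
        "dh": np.array([0.25, 0.75, 1.0, 1.5, 3.0, 5.0]),
        "slope": np.array([2.0, 8.0, 15.0, 25.0, 40.0, 60.0]),
        "landcover": np.array([1, 1, 1, 5, 5, 5]),
    }
)

# Continuous variable: cut into left-closed intervals.
df["slope_bin"] = pd.cut(df["slope"], bins=[0, 10, 30, 90], right=False)

# 2-D binning: one group per observed (slope bin, land-cover) pair.
binned = df.groupby(["slope_bin", "landcover"], observed=True)["dh"].agg(
    ["mean", "count"]
)
```

The result is a DataFrame with a two-level index, one level per binning variable, and one column per statistic.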
Implement in GeoUtils directly: As classification is also very useful for any raster, as for get_stats(), we could have the binning functionality directly in GeoUtils, for example Raster.nd_binning() or Raster.groupby() (to match the Xarray accessor to come) returning a pd.DataFrame. We had actually planned to move xDEM's nd_binning() (https://github.com/GlacioHack/xdem/blob/main/xdem/spatialstats.py#L77) to geoutils/stats/, see details here: Re-structure spatialstats.py #378.
To understand the implementation better, I think what I'm missing is an explanation of the needs and their link to the class structure 😄 : Do we need to save specific spatial metadata/rasters from the bins that we can't with nd_binning or groupby()? Or other?