Purpose
OMAFRA is mandated to maintain and update the provincial soil maps. PDSM provides tools that make sampling and mapping more efficient and more accurate. Making these maps available through a digital interface gives anytime access, which simplifies decision making on the farm.
Reach
Provincial soil maps are used in many decision-making processes, including land use, land evaluation, farming practices, and beneficial management practices. PDSM can be used to build models that predict soil properties (e.g., texture, organic matter content) and soil classes (e.g., soil drainage class, soil type). The models being developed for PDSM in Ontario are specific to our provincial data; however, the algorithms themselves are used in many jurisdictions nationally and internationally, and in many other disciplines. The soil maps produced by PDSM are used to determine agricultural capability (Canada Land Inventory Agricultural Capability rating). Agricultural capability is one of the aspects other organizations consider in land evaluation.
Potential impacts
When the maps are used for precision agriculture, the resulting decisions can affect crop yields for multiple growing seasons and the health of nearby water bodies and water tables for multiple years. When the maps are used to inform land assessment, the outcomes can affect assessment scores and decisions.
Internal use policies
These algorithms are open source and available for use by anyone. Government staff have no requirements or guidance on when, how, or whether they should be used.
Technical description
OMAFRA currently uses Cubist and Random Forest models for making soil predictions.
Cubist is a rule-based model that builds a decision tree from the input data. A linear regression model is then fitted for each terminal node of the tree. Random Forest is also tree-based but differs from Cubist in that it is an ensemble model: the algorithm builds multiple decision trees and returns the mean of their predictions (for soil classes, the class predicted by the most trees).
Because both are machine learning algorithms, the resulting models depend on the input data: building a model from different data produces a different model.
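The two ideas described above can be illustrated with a minimal sketch in Python. This is not the Cubist or Random Forest software used by OMAFRA, and it omits most of what those packages do (for example, Cubist's rule simplification and Random Forest's random selection of predictors at each split). It assumes a single predictor and a small synthetic dataset invented for illustration; all function and variable names are hypothetical. It shows a one-split tree with a linear regression in each terminal node, an ensemble of simple trees whose predictions are averaged, and how the fitted model changes when only part of the data is used.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # only one distinct x value: fall back to a constant
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def sse(vals):
    """Sum of squared deviations from the mean."""
    m = mean(vals)
    return sum((v - m) ** 2 for v in vals)

def best_split(xs, ys):
    """Threshold on x that minimises the squared error of the two node means."""
    best_t, best_err = None, None
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        err = sse(left) + sse(right)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t  # None when every x is identical

def fit_model_tree(xs, ys):
    """One split, then a linear regression in each terminal node.
    Assumes at least two distinct x values."""
    t = best_split(xs, ys)
    left = [(x, y) for x, y in zip(xs, ys) if x < t]
    right = [(x, y) for x, y in zip(xs, ys) if x >= t]
    return t, fit_line(*zip(*left)), fit_line(*zip(*right))

def predict_model_tree(model, x):
    t, (a_lo, b_lo), (a_hi, b_hi) = model
    return a_lo + b_lo * x if x < t else a_hi + b_hi * x

def fit_stump(xs, ys):
    """One split with a constant (mean) prediction in each terminal node."""
    t = best_split(xs, ys)
    if t is None:
        return None, mean(ys), mean(ys)
    return (t,
            mean(y for x, y in zip(xs, ys) if x < t),
            mean(y for x, y in zip(xs, ys) if x >= t))

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Ensemble: each tree is fitted to a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Mean prediction across all trees."""
    return mean([lo if t is None or x < t else hi for t, lo, hi in forest])

# Synthetic data (made up): the response follows two different linear trends.
xs = [float(i) for i in range(10)]
ys = [1 + 2 * x if x < 5 else 30 - x for x in xs]

full_model = fit_model_tree(xs, ys)                # split found at x = 5
sparse_model = fit_model_tree(xs[::2], ys[::2])    # split found at x = 6
forest = fit_forest(xs, ys)

print(predict_model_tree(full_model, 5.5))    # 24.5
print(predict_model_tree(sparse_model, 5.5))  # 12.0
print(predict_forest(forest, 5.5))            # average over 25 trees
```

With every second sample removed, the split moves from x = 5 to x = 6, so the two model trees give very different predictions at x = 5.5 even though the underlying relationship is the same. This is the data dependence described above, in miniature.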