Motivated users are entirely capable of contributing models, provided
they are comfortable using git
pull requests.
models
repository to your local enviroment.publications
directory that
ParametricModel
or
ModelSet
) with associated metadata and prediction
function.allometric/models
via
GitHub.After the completion of step 4, your pull request will be tested and reviewed for completion, and will become part of the main package if it passes review.
Users that are not familiar with git
can simply email me
their model files at bfrank70@gmail.com. All other users are encouraged to
submit models via pull requests
allometric::Publication
is a class that represents a
given scientific article, report, or other technical document that
contains allometric models.
Citations are established using RefManageR::BibEntry
which is an R representation of a BibTex citation. For example, the
BibEntry for the following publication
H. Temesgen, V. J. Monleon, and D. W. Hann. “Analysis and comparison of nonlinear tree height prediction strategies for Douglas-fir forests”. In: Canadian Journal of Forest Research 38.3 (2008), pp. 553-565.
is
paper_citation <- RefManageR::BibEntry(
bibtype = "article",
key = "temesgen_2008",
title = "Analysis and comparison of nonlinear tree height prediction
strategies for Douglas-fir forests",
author = "Temesgen, Hailemariam and Monleon, Vicente J. and Hann, David W.",
journal = "Canadian Journal of Forest Research",
year = 2008,
volume = 38,
number = 3,
pages = "553-565",
year = 2008
)
many different types of publications are available, called “entry
types”, and documentation for required fields are given here. The function that
creates a Publication
requires two arguments, the citation
itself and an optional list of descriptors. The role of descriptors will
become clear as we progress. For now, just know that
descriptors
is a named list of attributes that apply to
all models in the publication.
For example, all models in this publication are for Douglas-fir, and
were developed using data in the United States and Oregon. With this in
mind, we establish the Publication
object.
temesgen_2008 <- Publication(
citation = paper_citation,
descriptors = list(
country = "US",
region = "US-OR",
taxa = Taxa(
Taxon(
family = "Pinaceae",
genus = "Pseudotsuga",
species = "menziesii"
)
)
)
)
temesgen_2008
now represents a publication with citation
information and metadata that describe aspects of the models inside. One
problem, there are no models inside the publication yet, so we must add
them.
Models can be added one at a time. Other convenient ways of adding models are described in the next section, but let us start with just one model.
The publication we are using implements several fixed effects
models that predict the heights of trees using various covariates.
One of the functions the authors use is \(h =
1.37 + b_0 * (1 - \exp(b_1 * d)^{b_2})\) where \(h\) represents height and \(d\) represents the diameter outside bark.
Note that the height is a function of the covariate \(d\) and a finite set of parameters, \(b_0\), \(b_1\) and \(b_2\). This is a perfect candidate for the
FixedEffectsModel
class.
temesgen_2008_model_one <- FixedEffectsModel(
response = list(
hst = units::as_units("m")
),
covariates = list(
dsob = units::as_units("cm")
),
parameters = list(
beta_0 = 51.9954,
beta_1 = -0.0208,
beta_2 = 1.0182
),
predict_fn = function(dsob) {
1.37 + beta_0 * (1 - exp(beta_1 * dsob)^beta_2)
}
)
We can see that four arguments are used, response
,
covariates
, parameters
and
predict_fn
.
response
- a named list containing one element, which
is the name of the response variable. In this case hst
indicates the model is the height of the
stem using the total height. The value
of this element is established by providing the units using the
units
package. In this case, the units are defined as
meters.covariates
- a named list containing all covariates
usied in the model. In this case, only one covariate is used called
dsob
which stands for the diameter of the
stem, outside bark at
breast height. Again, the units are defined, this time
as centimeters.parameters
- a named list containing the parameters and
their estimates.predict_fn
- this is the prediction function. Note that
the body of the function is written using the parameters and covariates,
but the only argument provided to the function is dsob
,
i.e., the one covariate the model uses. This is for the author’s
convenience, and parameters are populated into the function at a later
phase, so they do not need to be defined as arguments to
predict_fn
.Finally, we add the model to the publication using
add_model
temesgen_2008 <- add_model(temesgen_2008, temesgen_2008_model_one)
and print the summary for the publication
temesgen_2008
## Temesgen, Monleon, and Hann (2008)
## --- hst: 1 model set
## --------- 1 model
We can see that the publciation now contains one model set for the
variable hst
and that set contains one model.
A frequent pattern in allometric modeling papers are “model sets”, these are sets of allometric models that have the same response, covariate and functional structure, but are fit for several species, age classes, etc. For example, Brackett (1977) fit 24 cubic volume functions for different species disaggregated by geographic region and species.
ModelSet
is a class in allometric
that
efficiently deals with these types of model structures. For this example
we use FixedEffectsSet
which represents a set of fixed
effects models. Model sets allow the author to import a table (e.g., a
.csv
) of model descriptions to quickly define large sets of
ParametricModel
s.
For this task, consider the Brackett (1977) publication. We begin
again defining the Publication
.
bracket_1977_citation <- RefManageR::BibEntry(
bibtype = "techreport",
key = "brackett_1977",
title = "Notes on tarif tree volume computation",
author = "Brackett, Michael",
year = 1977,
institution = "State of Washington, Dept. of Natural Resources"
)
brackett_1977 <- Publication(
citation = bracket_1977_citation,
descriptors = list(
country = "US",
region = "US-WA"
)
)
Here we can see the citation information and the shared descriptors.
Note that in this case, descriptors
only contains
country
and region
. This is because each
individual model in brackett_1977
is not uniform with
respect to species
as it was in temsegen_2008
,
so we will have to define those descriptors at the model-level.
Next, we load a data frame of model descriptions and examine it
model_specifications <- tibble::as_tibble(load_parameter_frame("vsa_brackett_1977"))
head(model_specifications)
We can see that model_specifications
contains a number
of individual models which are specified by column names referred to as
descriptors
. In this case, descriptors
includes family
, genus
, species
,
geographic_region
and age_class
. The
combination of these values uniquely identifies a model within this
publication. family
, genus
and
species
are part of a standard set of descriptors called “search
descriptors”, which get propagated to the searching functions
described in the README. geographic_region
and
age_class
are not search descriptors, but still help to
clarify the purpose of the model within the publication.
Next, we instantiate the FixedEffectsSet
, which looks
very similar to FixedEffectsModel
used in the previous
section. The arguments are the same, but the implication is that
response unit
, covariate units
and
prediction_fn
all apply uniformly to the models in the
set.
brackett_1977_model_set <- FixedEffectsSet(
response = list(
vsa = units::as_units("ft3")
),
covariates = list(
dsob = units::as_units("in"),
hst = units::as_units("ft")
),
parameter_names = c("a", "b", "c"),
predict_fn = function(dsob, hst) {
10^a * dsob^b * hst^c
},
model_specifications = model_specifications %>% aggregate_taxa()
)
Finally, we add the model set to the publication using
add_set
and print a summary to verify that the model was
added.
brackett_1977 <- add_set(brackett_1977, brackett_1977_model_set)
brackett_1977
This summary shows the publication name, followed by a list of all
model sets defined for the publication. We can see that there is 1 model
set for vsa
(volume of the stem including top and stump)
that contains 24 models.
Once a publication is added to the models
to the
publications
directory the models contained within can now
be installed by the user using install_models()
. In fact,
this is exactly how the user initially gains access to the
allometric_models
table. If you are ever interested in
viewing the source code for a model, just find the matching
.R
file in publications
, which can be viewed
locally or on GitHub. The files used for this example are located at publications/temesgen_2008.R
and publications/brackett_1977.R
,
respectively.