* Bump Turing to 0.41
* Update initial_params stuff for 0.41
* Update more initial_params stuff
* istrans -> is_transformed
* Minor text fixes
* Remove SamplingContext
* Fix loss of concreteness in MCMCChains
* Fix flaky test in linear regression doc
* Try StableRNG
* Make GMM tutorial a bit less demanding
* Fix typo
developers/compiler/design-overview/index.qmd (2 additions & 0 deletions)
@@ -36,6 +36,7 @@ The following are the main jobs of the `@model` macro:
## The model

+<!-- Very outdated
A `model::Model` is a callable struct that one can sample from by calling
```{julia}
@@ -49,6 +50,7 @@ and `context` is a sampling context that can, e.g., modify how the log probabili
Sampling resets the log joint probability of `varinfo` and increases the evaluation counter of `sampler`. If `context` is a `LikelihoodContext`,
only the log likelihood of `D` will be accumulated, whereas with `PriorContext` only the log prior probability of `P` is. With the `DefaultContext` the log joint probability of both `P` and `D` is accumulated.
+-->

The `Model` struct contains the four internal fields `f`, `args`, `defaults`, and `context`.
When `model::Model` is called, then the internal function `model.f` is called as `model.f(rng, varinfo, sampler, context, model.args...)`
@@ -104,7 +104,7 @@ Note that `vi_linked[vn_x]` can also be used as shorthand for `getindex(vi_linked, vn_x)`.
We can see (for this linked varinfo) that there are _two_ differences between these outputs:

1. _The internal representation has been transformed using the bijector (in this case, the log function)._
-   This means that the `istrans()` flag which we used above doesn't modify the model representation: it only tells us whether the internal representation has been transformed or not.
+   This means that the `is_transformed()` flag which we used above doesn't modify the model representation: it only tells us whether the internal representation has been transformed or not.

2. _The internal representation is a vector, whereas the model representation is a scalar._
   This is because in DynamicPPL, _all_ internal values are vectorised (i.e. converted into some vector), regardless of distribution. On the other hand, since the model specifies a univariate distribution, the model representation is a scalar.
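For point 1 above, here is a small editorial sketch (not part of this diff) of the transformation in question, assuming Bijectors.jl's `bijector` and `inverse`:

```julia
using Bijectors, Distributions

dist = LogNormal()
b = bijector(dist)     # for a positive-support distribution such as LogNormal, this is the log function
x = 2.0                # model representation (constrained scalar)
y = b(x)               # internal / linked representation (unconstrained); DynamicPPL additionally stores it vectorised
x ≈ inverse(b)(y)      # the inverse bijector recovers the model representation
```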
@@ -131,7 +131,7 @@ Before that, though, we'll take a quick high-level look at how the HMC sampler i
While DynamicPPL provides the _functionality_ for transforming variables, the transformation itself happens at an even higher level, i.e. in the sampler itself.
The HMC sampler in Turing.jl is in [this file](https://github.com/TuringLang/Turing.jl/blob/5b24cebe773922e0f3d5c4cb7f53162eb758b04d/src/mcmc/hmc.jl).
In the first step of sampling, it calls `link` on the sampler.
-This transformation is preserved throughout the sampling process, meaning that `istrans()` always returns true.
+This transformation is preserved throughout the sampling process, meaning that `is_transformed()` always returns true.

We can observe this by inserting print statements into the model.
Here, `__varinfo__` is the internal symbol for the `VarInfo` object used in model evaluation:
(Here, the check on `if x isa AbstractFloat` prevents the printing from occurring during computation of the derivative.)
-You can see that during the three sampling steps, `istrans` is always kept as `true`.
+You can see that during the three sampling steps, `is_transformed` is always kept as `true`.
::: {.callout-note}
-The first two model evaluations where `istrans` is `false` occur prior to the actual sampling.
+The first two model evaluations where `is_transformed` is `false` occur prior to the actual sampling.
One occurs when the model is checked for correctness (using [`DynamicPPL.check_model_and_trace`](https://github.com/TuringLang/DynamicPPL.jl/blob/ba490bf362653e1aaefe298364fe3379b60660d3/src/debug_utils.jl#L582-L612)).
The second occurs because the model is evaluated once to generate a set of initial parameters inside [DynamicPPL's implementation of `AbstractMCMC.step`](https://github.com/TuringLang/DynamicPPL.jl/blob/ba490bf362653e1aaefe298364fe3379b60660d3/src/sampler.jl#L98-L117).
Both of these steps occur with all samplers in Turing.jl, so are not specific to the HMC example shown here.
@@ -169,7 +169,7 @@ The biggest prerequisite for this to work correctly is that the potential energy
This is exactly the same as how we had to make sure to define `logq(y)` correctly in the toy HMC example above.

Within Turing.jl, this is correctly handled because a statement like `x ~ LogNormal()` in the model definition above is translated into `assume(LogNormal(), @varname(x), __varinfo__)`, defined [here](https://github.com/TuringLang/DynamicPPL.jl/blob/ba490bf362653e1aaefe298364fe3379b60660d3/src/context_implementations.jl#L225-L229).
-If you follow the trail of function calls, you can verify that the `assume` function does indeed check for the presence of the `istrans` flag and adds the Jacobian term accordingly.
+If you follow the trail of function calls, you can verify that the `assume` function does indeed check for the presence of the `is_transformed` flag and adds the Jacobian term accordingly.
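As an editorial aside (not part of this diff), the Jacobian term can be written out explicitly for the LogNormal case used here: if `x ~ LogNormal()` has density $p(x)$ and the linked value is $y = \log x$, then

$$
q(y) = p(e^{y}) \left|\frac{\mathrm{d}}{\mathrm{d}y} e^{y}\right| = p(e^{y})\, e^{y},
\qquad
\log q(y) = \log p(e^{y}) + y,
$$

and the extra $+\,y$ is exactly the log-Jacobian term that `assume` adds when the variable is linked.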
## A deeper dive into DynamicPPL's internal machinery
The purpose of having all of this machinery is to allow other parts of DynamicPPL, such as the tilde pipeline, to handle transformed variables correctly.
-The following diagram shows how `assume` first checks whether the variable is transformed (using `istrans`), and then applies the appropriate transformation function.
+The following diagram shows how `assume` first checks whether the variable is transformed (using `is_transformed`), and then applies the appropriate transformation function.

<!-- 'wrappingWidth' setting required because of https://github.com/mermaid-js/mermaid-cli/issues/112#issuecomment-2352670995 -->
```{mermaid}
@@ -246,7 +246,7 @@ graph TD
A["x ~ LogNormal()"]:::boxStyle
B["vn = <span style='color:#3B6EA8 !important;'>@varname</span>(x)<br>dist = LogNormal()<br>x, vi = ..."]:::boxStyle

Here, `with_logabsdet_jacobian` is defined [in the ChangesOfVariables.jl package](https://juliamath.github.io/ChangesOfVariables.jl/stable/api/#ChangesOfVariables.with_logabsdet_jacobian), and returns both the effect of the transformation `f` as well as the log Jacobian term.
-Because we chose `f` appropriately, we find here that `x` is always the model representation; furthermore, if the variable was _not_ linked (i.e. `istrans` was false), the log Jacobian term will be zero.
+Because we chose `f` appropriately, we find here that `x` is always the model representation; furthermore, if the variable was _not_ linked (i.e. `is_transformed` was false), the log Jacobian term will be zero.
However, if it was linked, then the Jacobian term would be appropriately included, making sure that sampling proceeds correctly.
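As an editorial sketch (not part of this diff), and assuming that the inverse bijector returned by Bijectors.jl supports the `with_logabsdet_jacobian` API described above, the LogNormal case behaves roughly as follows:

```julia
using Bijectors, Distributions
using ChangesOfVariables: with_logabsdet_jacobian

dist = LogNormal()
f = inverse(bijector(dist))                # maps the internal (log-space) value back to model space
y = 0.5                                    # a linked / internal value
x, logjac = with_logabsdet_jacobian(f, y)  # x should equal exp(0.5), i.e. the model representation
logjac ≈ y                                 # log |d exp(y)/dy| = y, the log Jacobian term
```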
@assert bayes_train_loss < bayes_test_loss "Bayesian training loss ($bayes_train_loss) >= Bayesian test loss ($bayes_test_loss)"
@assert ols_train_loss < ols_test_loss "OLS training loss ($ols_train_loss) >= OLS test loss ($ols_test_loss)"
-@assert isapprox(bayes_train_loss, ols_train_loss; rtol=0.01) "Difference between Bayesian training loss ($bayes_train_loss) and OLS training loss ($ols_train_loss) unexpectedly large!"
-@assert isapprox(bayes_test_loss, ols_test_loss; rtol=0.05) "Difference between Bayesian test loss ($bayes_test_loss) and OLS test loss ($ols_test_loss) unexpectedly large!"
+@assert bayes_train_loss > ols_train_loss "Bayesian training loss ($bayes_train_loss) <= OLS training loss ($ols_train_loss)"
+@assert bayes_test_loss < ols_test_loss "Bayesian test loss ($bayes_test_loss) >= OLS test loss ($ols_test_loss)"
end
```

-As we can see above, OLS and our Bayesian model fit our training and test data set about the same.
+We can see from this that both linear regression techniques perform fairly similarly.
+The Bayesian linear regression approach performs worse on the training set, but better on the test set.
+This indicates that the Bayesian approach is more able to generalise to unseen data, i.e., it is not overfitting the training data as much.
tutorials/bayesian-poisson-regression/index.qmd (4 additions & 1 deletion)
@@ -182,7 +182,10 @@ We use the Gelman, Rubin, and Brooks Diagnostic to check whether our chains have converged.
We expect the chains to have converged. This is because we have taken a sufficient number of iterations (1500) for the NUTS sampler. However, in case the test fails, we will have to take a larger number of iterations, resulting in longer computation time.

```{julia}
-gelmandiag(chain)
+# Because some of the sampler statistics are `missing`, we need to extract only
+# the parameters and then concretize the array so that `gelmandiag` can be computed.
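# (Editorial sketch — the added code itself is not shown in this extract. Assuming
# MCMCChains' `names(chain, :parameters)`, the `Chains(values, names)` constructor,
# and the exported `gelmandiag` are in scope, something along these lines would do it:)
params_only = chain[:, names(chain, :parameters), :]   # drop the sampler statistics containing `missing`
vals = map(identity, Array(params_only.value))         # narrow Union{Missing, Float64} down to Float64
gelmandiag(Chains(vals, names(params_only)))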
usage/mode-estimation/index.qmd (9 additions & 1 deletion)
@@ -37,7 +37,7 @@ data = [1.5, 2.0]
model = gdemo(data)
```

-Finding the maximum aposteriori or maximum likelihood parameters is as simple as
+Finding the maximum a posteriori or maximum likelihood parameters is as simple as

```{julia}
# Generate a MLE estimate.
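# (Editorial note — the rest of this code block is not shown in this extract.
# Assuming Turing's exported mode-estimation interface, the call would look
# something like:)
# mle_estimate = maximum_likelihood(model)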
@@ -80,6 +80,14 @@ maximum_likelihood(
When providing values to arguments like `initial_params`, the parameters are typically specified in the order in which they appear in the code of the model, so in this case first `s²` then `m`. More precisely it's the order returned by `Turing.Inference.getparams(model, DynamicPPL.VarInfo(model))`.

+::: {.callout-note}
+## Initialisation strategies and consistency with MCMC sampling
+
+Since Turing v0.41, for MCMC sampling, the `initial_params` argument must be a `DynamicPPL.AbstractInitStrategy` as described in [the sampling options page]({{< meta usage-sampling-options >}}#specifying-initial-parameters).
+The optimisation interface has not yet been updated to use this; thus, initial parameters are still specified as Vectors.
+We expect that this will be changed in the near future.
+:::
+
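As an editorial illustration (not part of this diff), supplying initial parameters to the optimisation interface in that order might look like this, assuming the `initial_params` keyword accepts a plain Vector as described above:

```julia
# Hypothetical initial values: s² first, then m, matching the order in the model.
maximum_likelihood(model; initial_params=[0.5, 0.0])
```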
We can also do constrained optimisation, by providing either intervals within which the parameters must stay, or constraint functions that they need to respect. For instance, here's how one can find the MLE with the constraint that the variance must be less than 0.01 and the mean must be between -1 and 1:
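A hypothetical sketch of such a call (the page's actual example is not shown in this extract), assuming the mode-estimation interface accepts box constraints via `lb`/`ub` in the same parameter order as above:

```julia
# Box constraints: 0 ≤ s² ≤ 0.01 and -1 ≤ m ≤ 1.
maximum_likelihood(model; lb=[0.0, -1.0], ub=[0.01, 1.0])
```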