Emulator Options
For any smooth function with low dynamic range the default gaussian process will provide all the accuracy that is necesarry for most purposes. However, it is not hard to imagine certain situations where taking some extra care before invoking the GP can lead to more accuracy. Here, some of the extra options that are available for the emulation are highlighted. The example data will be the Schecter function example:
import swiftemulator as se
from swiftemulator.emulators import gaussian_process
import numpy as np
def log_schecter_function(log_M, log_M_star, alpha):
M = 10 ** log_M
M_star = 10 ** log_M_star
return np.log10( (1 / M_star) * (M / M_star) ** alpha * np.exp(- M / M_star ))
model_specification = se.ModelSpecification(
number_of_parameters=2,
parameter_names=["log_M_star","alpha"],
parameter_limits=[[11.,12.],[-1.,-3.]],
parameter_printable_names=["Mass at knee","Low mass slope"],
)
log_M_star = np.random.uniform(11., 12., 100)
alpha = np.random.uniform(-1., -3., 100)
modelparameters = {}
for unique_identifier in range(100):
modelparameters[unique_identifier] = {"log_M_star": log_M_star[unique_identifier],
"alpha": alpha[unique_identifier]}
model_parameters = se.ModelParameters(model_parameters=modelparameters)
modelvalues = {}
for unique_identifier in range(100):
independent = np.linspace(10,12,10)
dependent = log_schecter_function(independent,
log_M_star[unique_identifier],
alpha[unique_identifier])
dependent_error = 0.02 * dependent
modelvalues[unique_identifier] = {"independent": independent,
"dependent": dependent,
"dependent_error": dependent_error}
model_values = se.ModelValues(model_values=modelvalues)
schecter_emulator = gaussian_process.GaussianProcessEmulator()
schecter_emulator.fit_model(model_specification=model_specification,
model_parameters=model_parameters,
model_values=model_values)
Mean models
The most basic addition to a GP is to exchange
the constant (or zero) mean for a more complete
model. For the SWIFT-Emulator these can be found
under swiftemulator.mean_models(). All
currently implement models come in the form of
different order polynomials. Much like the GP
you have to define your model first. The model can
then be passed to the GP which will fit the
coefficients and use that as a mean model.
The GP will then be used to model the residuals
between the polynomial fit and the data.
from swiftemulator.mean_models.polynomial import PolynomialMeanModel
polynomial_model = PolynomialMeanModel(degree=1)
schecter_emulator = gaussian_process.GaussianProcessEmulator(mean_model=polynomial_model)
schecter_emulator.fit_model(model_specification=model_specification,
model_parameters=model_parameters,
model_values=model_values)
The polynomial model fits a polynomial surface to all parameters. This includes not just the polynomial coefficients for each parameter but also the linear combinations up to the degree of the model. Be carefull picking a degree that is very large, as it can quickly lead to over-fitting.
Emulating Bin-by-Bin
The scaling relations obtained from simulations are often binned relations. For the all-purpose emulator we use the independent as an additional parameter which the emulator uses for prediction, but there are situation where you would prefer modeling the response at each independent bin seperately, instead of modeling it all at once.
When emulating bin-to-bin the main difference
is that your GP now comes from
swiftemulator.emulators.gaussian_process_bins().
Each bin will have a unique emulator, that is
trained on all data available for that bin.
A bin is created for each unique value of
the independent found in the ModelValues
container. It is extremely important that
each bin of the original data-set has exactly
the same value for the independent. However,
the individual models do not need the same
sample of bins. If some models are missing
values for some of the bins, this is not
a problem.
Using the binned emulator is as simple as
from swiftemulator.emulators import gaussian_process_bins
schecter_emulator_binned = gaussian_process_bins.GaussianProcessEmulatorBins()
schecter_emulator_binned.fit_model(model_specification=model_specification,
model_parameters=model_parameters,
model_values=model_values)
Which has the same predicion functionality
as the standard gaussian_process.
Note that there is also a binned version
of the cross checks,
swiftemulator.sensitivity.cross_check_bins(),
which acts the same as the normal cross_check
but instead uses the binned emulator, making
it easy to compare the two methods.
1D Emulation
Sometimes the emulation problem is better solved as
In this case we only have the model parameters.
the emulator won’t be a function of an additional x
parameter stored in the model values. In this case the
use can use swiftemulator.emulators.gaussian_process_one_dim().
This method has similar functionality as the other
emulator types. It will still need a ModelValues
container. Here is an example of how such a container
should look like:
modelvalues = {}
for unique_identifier in range(100):
dependent = func(a_arr[unique_identifier], b_arr[unique_identifier])
dependent_error = 0.02 * dependent
modelvalues[unique_identifier] = {"independent": [None],
"dependent": [dependent],
"dependent_error": [dependent_error]}
In order to make use of the general emulator containers, it is still required to provide the values as list. In this case the lists will only contain a single value. The independent value will not be read. When your data is in the correct format the emulator can be trained like all the other methods.
from swiftemulator.emulators import gaussian_process_one_dim
schecter_emulator_one_dim = gaussian_process_one_dim.GaussianProcessEmulator1D()
schecter_emulator_one_dim.fit_model(model_specification=model_specification,
model_parameters=model_parameters,
model_values=model_values)
The only other thing of note is that while predict_values retains the same functionality, you are no longer required to specify any independent values. The prediction is now based purely of the given values of the model parameters.