graphdot.model.gaussian_process package

class graphdot.model.gaussian_process.GaussianProcessRegressor(kernel, alpha=1e-08, beta=1e-08, optimizer=None, normalize_y=False, regularization='+', kernel_options={})[source]

Bases: graphdot.model.gaussian_process.base.GaussianProcessRegressorBase

Gaussian process regression (GPR).

Parameters:
  • kernel (kernel instance) – The covariance function of the GP.
  • alpha (float > 0) – Value added to the diagonal of the kernel matrix during fitting. Larger values correspond to an increased noise level in the observations. A practical use of this parameter is to prevent potential numerical instabilities during fitting and to ensure that the kernel matrix stays positive definite in the presence of duplicate entries and/or round-off error.
  • beta (float > 0) – Cutoff value on the singular values for the spectral pseudoinverse computation, which serves as a backup mechanism for inverting the kernel matrix in case it is singular.
  • optimizer (one of (str, True, None, callable)) – A string or callable that represents one of the optimizers usable in the scipy.optimize.minimize method. If None, no hyperparameter optimization will be carried out in fitting. If True, the optimizer will default to L-BFGS-B.
  • normalize_y (boolean) – Whether to normalize the target values y so that the mean and variance become 0 and 1, respectively. Recommended for cases where zero-mean, unit-variance kernels are used. The normalization will be reversed when the GP predictions are returned.
  • regularization ('+' or 'additive' or '*' or 'multiplicative') – Determines the method of regularization. If ‘+’ or ‘additive’, alpha is added to the diagonal of the kernel matrix. If ‘*’ or ‘multiplicative’, each diagonal element is multiplied by a factor of 1 + alpha.
  • kernel_options (dict, optional) – A dictionary of additional options to be passed along when applying the kernel to data.
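A minimal, illustrative usage sketch follows. The ToyRBF kernel below is hypothetical and not part of GraphDot: it merely spells out the interface the regressor is assumed to expect (__call__ with eval_gradient, diag, log-scale theta, bounds, clone_with_theta). In practice one of GraphDot's own kernels, e.g. a marginalized graph kernel, would be used instead:

    import numpy as np
    from graphdot.model.gaussian_process import GaussianProcessRegressor

    class ToyRBF:
        """Hypothetical RBF kernel over feature vectors, sketching the
        kernel interface the regressor is assumed to expect."""

        def __init__(self, length_scale=1.0):
            self.length_scale = length_scale

        def __call__(self, X, Y=None, eval_gradient=False):
            X = np.asarray(X, dtype=float)
            Y = X if Y is None else np.asarray(Y, dtype=float)
            d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
            K = np.exp(-0.5 * d2 / self.length_scale ** 2)
            if eval_gradient:
                # Gradient w.r.t. the log-scale hyperparameter, stacked
                # along the last axis (assumed sklearn-like convention).
                return K, (K * d2 / self.length_scale ** 2)[:, :, None]
            return K

        def diag(self, X):
            return np.ones(len(X))

        @property
        def theta(self):
            # Log-scale hyperparameters, as assumed throughout this page.
            return np.log([self.length_scale])

        @theta.setter
        def theta(self, t):
            self.length_scale = np.exp(t[0])

        @property
        def bounds(self):
            return np.log([[1e-2, 1e2]])

        def clone_with_theta(self, theta):
            clone = ToyRBF()
            clone.theta = theta
            return clone

    # Additive diagonal regularization ('+', the default) with alpha=1e-8.
    gpr = GaussianProcessRegressor(kernel=ToyRBF(), alpha=1e-8, optimizer=True)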
fit(X, y, loss='likelihood', tol=1e-05, repeat=1, theta_jitter=1.0, verbose=False)[source]

Train a GPR model. If the optimizer argument was set while initializing the GPR object, the hyperparameters of the kernel will be optimized using the specified loss function.

Parameters:
  • X (list of objects or feature vectors.) – Input values of the training data.
  • y (1D array) – Output/target values of the training data.
  • loss ('likelihood' or 'loocv') – The loss function to be minimized during training: either ‘likelihood’ (negative log-likelihood) or ‘loocv’ (mean-square leave-one-out cross-validation error).
  • tol (float) – Tolerance for termination.
  • repeat (int) – Repeat the hyperparameter optimization by the specified number of times and return the best result.
  • theta_jitter (float) – Standard deviation of the random noise added to the initial logscale hyperparameters across repeated optimization runs.
  • verbose (bool) – Whether or not to print out the optimization progress and outcome.
Returns:

self – returns an instance of self.

Return type:

GaussianProcessRegressor
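For example, a training run with repeated restarts might look like the sketch below, continuing with the hypothetical ToyRBF kernel from above on synthetic data:

    X = np.random.rand(50, 2)          # synthetic training inputs
    y = np.sin(3 * X).sum(axis=1)      # synthetic training targets
    # Three restarts; each perturbs the initial log-scale hyperparameters
    # with Gaussian noise of standard deviation 0.5 and the best run wins.
    gpr = gpr.fit(X, y, loss='likelihood', repeat=3, theta_jitter=0.5,
                  verbose=True)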

fit_loocv(X, y, **options)[source]

Alias of fit(X, y, loss='loocv', **options).

log_marginal_likelihood(theta=None, X=None, y=None, eval_gradient=False, clone_kernel=True, verbose=False)[source]

Returns the log-marginal likelihood of a given set of log-scale hyperparameters.

Parameters:
  • theta (array-like) – Kernel hyperparameters for which the log-marginal likelihood is to be evaluated. If None, the current hyperparameters will be used.
  • X (list of objects or feature vectors.) – Input values of the training data. If None, self.X will be used.
  • y (1D array) – Output/target values of the training data. If None, self.y will be used.
  • eval_gradient (boolean) – If True, the gradient of the log-marginal likelihood with respect to the kernel hyperparameters at position theta will be returned alongside.
  • clone_kernel (boolean) – If True, the kernel is copied so that probing with theta does not alter the trained kernel. If False, the kernel hyperparameters will be modified in-place.
  • verbose (boolean) – If True, the log-likelihood value and its components will be printed to the screen.
Returns:

  • log_likelihood (float) – Log-marginal likelihood of theta for training data.
  • log_likelihood_gradient (1D array) – Gradient of the log-marginal likelihood with respect to the kernel hyperparameters at position theta. Only returned when eval_gradient is True.
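Continuing the sketch above, the likelihood and its gradient can be probed at the trained hyperparameters (theta=None) without touching the model:

    lml, grad = gpr.log_marginal_likelihood(eval_gradient=True,
                                            clone_kernel=True)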

predict(Z, return_std=False, return_cov=False)[source]

Predict using the trained GPR model.

Parameters:
  • Z (list of objects or feature vectors.) – Input values of the unknown data.
  • return_std (boolean) – If True, the standard deviations of the predictions at the query points are returned along with the mean.
  • return_cov (boolean) – If True, the covariance of the predictions at the query points is returned along with the mean.
Returns:

  • ymean (1D array) – Mean of the predictive distribution at query points.
  • std (1D array) – Standard deviation of the predictive distribution at query points.
  • cov (2D matrix) – Covariance of the predictive distribution at query points.
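For example, continuing the sketch above with synthetic query points:

    Z = np.random.rand(10, 2)
    ymean, ystd = gpr.predict(Z, return_std=True)   # mean and standard deviation
    ymean, ycov = gpr.predict(Z, return_cov=True)   # mean and covariance matrix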

predict_loocv(Z, z, return_std=False)[source]

Compute the leave-one-out cross validation prediction of the given data.

Parameters:
  • Z (list of objects or feature vectors.) – Input values of the data for which leave-one-out predictions are computed.
  • z (1D array) – Target values corresponding to Z.
  • return_std (boolean) – If True, the standard deviations of the predictions at the query points are returned along with the mean.
Returns:

  • ymean (1D array) – Leave-one-out mean of the predictive distribution at query points.
  • std (1D array) – Leave-one-out standard deviation of the predictive distribution at query points.
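Continuing the sketch above, the LOOCV predictions give a quick generalization estimate without retraining one model per sample (the RMSE computation is illustrative):

    ymean_loo, ystd_loo = gpr.predict_loocv(X, y, return_std=True)
    rmse = np.sqrt(np.mean((ymean_loo - y) ** 2))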

squared_loocv_error(theta=None, X=None, y=None, eval_gradient=False, clone_kernel=True, verbose=False)[source]

Returns the squared LOOCV error of a given set of log-scale hyperparameters.

Parameters:
  • theta (array-like) – Kernel hyperparameters for which the squared LOOCV error is to be evaluated. If None, the current hyperparameters will be used.
  • X (list of objects or feature vectors.) – Input values of the training data. If None, self.X will be used.
  • y (1D array) – Output/target values of the training data. If None, self.y will be used.
  • eval_gradient (boolean) – If True, the gradient of the squared LOOCV error with respect to the kernel hyperparameters at position theta will be returned alongside.
  • clone_kernel (boolean) – If True, the kernel is copied so that probing with theta does not alter the trained kernel. If False, the kernel hyperparameters will be modified in-place.
  • verbose (boolean) – If True, the squared LOOCV error and its components will be printed to the screen.
Returns:

  • squared_error (float) – Squared LOOCV error of theta for training data.
  • squared_error_gradient (1D array) – Gradient of the squared LOOCV error with respect to the kernel hyperparameters at position theta. Only returned when eval_gradient is True.
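Continuing the sketch above, this loss can be evaluated directly, either at the current hyperparameters or at an explicit log-scale value (one entry for the hypothetical ToyRBF):

    e0, g0 = gpr.squared_loocv_error(eval_gradient=True)
    e1 = gpr.squared_loocv_error(theta=np.log([0.5]))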

class graphdot.model.gaussian_process.LowRankApproximateGPR(kernel, alpha=1e-07, beta=1e-07, optimizer=None, normalize_y=False, regularization='+', kernel_options={})[source]

Bases: graphdot.model.gaussian_process.base.GaussianProcessRegressorBase

Accelerated Gaussian process regression (GPR) using the Nyström low-rank approximation.

Parameters:
  • kernel (kernel instance) – The covariance function of the GP.
  • alpha (float > 0) – Value added to the diagonal of the core matrix during fitting. Larger values correspond to an increased noise level in the observations. A practical use of this parameter is to prevent potential numerical instabilities during fitting and to ensure that the core matrix stays positive definite in the presence of duplicate entries and/or round-off error.
  • beta (float > 0) – Cutoff value on the singular values for the spectral pseudoinverse of the low-rank kernel matrix.
  • optimizer (one of (str, True, None, callable)) – A string or callable that represents one of the optimizers usable in the scipy.optimize.minimize method. If None, no hyperparameter optimization will be carried out in fitting. If True, the optimizer will default to L-BFGS-B.
  • normalize_y (boolean) – Whether to normalize the target values y so that the mean and variance become 0 and 1, respectively. Recommended for cases where zero-mean, unit-variance kernels are used. The normalization will be reversed when the GP predictions are returned.
  • regularization ('+' or 'additive' or '*' or 'multiplicative') – Determines the method of regularization. If ‘+’ or ‘additive’, alpha is added to the diagonal of the kernel matrix. If ‘*’ or ‘multiplicative’, each diagonal element is multiplied by a factor of 1 + alpha.
  • kernel_options (dict, optional) – A dictionary of additional options to be passed along when applying the kernel to data.
C

The core sample set for constructing the subspace for low-rank approximation.

fit(C, X, y, loss='likelihood', tol=1e-05, repeat=1, theta_jitter=1.0, verbose=False)[source]

Train a low-rank approximate GPR model. If the optimizer argument was set while initializing the GPR object, the hyperparameters of the kernel will be optimized using the specified loss function.

Parameters:
  • C (list of objects or feature vectors.) – The core set that defines the subspace of low-rank approximation.
  • X (list of objects or feature vectors.) – Input values of the training data.
  • y (1D array) – Output/target values of the training data.
  • loss ('likelihood' or 'loocv') – The loss function to be minimized during training: either ‘likelihood’ (negative log-likelihood) or ‘loocv’ (mean-square leave-one-out cross-validation error).
  • tol (float) – Tolerance for termination.
  • repeat (int) – Repeat the hyperparameter optimization by the specified number of times and return the best result.
  • theta_jitter (float) – Standard deviation of the random noise added to the initial logscale hyperparameters across repeated optimization runs.
  • verbose (bool) – Whether or not to print out the optimization progress and outcome.
Returns:

self – returns an instance of self.

Return type:

LowRankApproximateGPR
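A sketch of a Nyström fit follows, again with the hypothetical ToyRBF kernel from above; drawing the core set uniformly at random is one simple strategy, not a GraphDot requirement:

    from graphdot.model.gaussian_process import LowRankApproximateGPR

    X = np.random.rand(2000, 2)
    y = np.sin(3 * X).sum(axis=1)
    # A random subsample serves as the core set C spanning the subspace.
    rng = np.random.default_rng(0)
    C = X[rng.choice(len(X), size=100, replace=False)]

    nystrom = LowRankApproximateGPR(kernel=ToyRBF(), alpha=1e-7,
                                    optimizer=True)
    nystrom.fit(C, X, y, repeat=2, verbose=True)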

log_marginal_likelihood(theta=None, C=None, X=None, y=None, eval_gradient=False, clone_kernel=True, verbose=False)[source]

Returns the log-marginal likelihood of a given set of log-scale hyperparameters.

Parameters:
  • theta (array-like) – Kernel hyperparameters for which the log-marginal likelihood is to be evaluated. If None, the current hyperparameters will be used.
  • C (list of objects or feature vectors.) – The core set that defines the subspace of low-rank approximation. If None, self.C will be used.
  • X (list of objects or feature vectors.) – Input values of the training data. If None, self.X will be used.
  • y (1D array) – Output/target values of the training data. If None, self.y will be used.
  • eval_gradient (boolean) – If True, the gradient of the log-marginal likelihood with respect to the kernel hyperparameters at position theta will be returned alongside.
  • clone_kernel (boolean) – If True, the kernel is copied so that probing with theta does not alter the trained kernel. If False, the kernel hyperparameters will be modified in-place.
  • verbose (boolean) – If True, the log-likelihood value and its components will be printed to the screen.
Returns:

  • log_likelihood (float) – Log-marginal likelihood of theta for training data.
  • log_likelihood_gradient (1D array) – Gradient of the log-marginal likelihood with respect to the kernel hyperparameters at position theta. Only returned when eval_gradient is True.

predict(Z, return_std=False, return_cov=False)[source]

Predict using the trained GPR model.

Parameters:
  • Z (list of objects or feature vectors.) – Input values of the unknown data.
  • return_std (boolean) – If True, the standard deviations of the predictions at the query points are returned along with the mean.
  • return_cov (boolean) – If True, the covariance of the predictions at the query points is returned along with the mean.
Returns:

  • ymean (1D array) – Mean of the predictive distribution at query points.
  • std (1D array) – Standard deviation of the predictive distribution at query points.
  • cov (2D matrix) – Covariance of the predictive distribution at query points.

predict_loocv(Z, z, return_std=False, method='auto')[source]

Compute the leave-one-out cross validation prediction of the given data.

Parameters:
  • Z (list of objects or feature vectors.) – Input values of the data for which leave-one-out predictions are computed.
  • z (1D array) – Target values corresponding to Z.
  • return_std (boolean) – If True, the standard deviations of the predictions at the query points are returned along with the mean.
  • method ('auto' or 'ridge-like' or 'gpr-like') – Selects the algorithm used for fast evaluation of the leave-one-out cross validation without explicitly training one model per sample. ‘ridge-like’ tends to be more stable with a smaller core size (that is not rank-deficient), while ‘gpr-like’ tends to be more stable with a larger core size. By default, the option is ‘auto’ and the function will choose a method based on an analysis of the eigenspectrum of the dataset.
Returns:

  • ymean (1D array) – Leave-one-out mean of the predictive distribution at query points.
  • std (1D array) – Leave-one-out standard deviation of the predictive distribution at query points.
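Continuing the sketch above, with method='auto' the algorithm is picked from the eigenspectrum:

    ymean_loo, ystd_loo = nystrom.predict_loocv(X, y, return_std=True,
                                                method='auto')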

class graphdot.model.gaussian_process.GPROutlierDetector(kernel, sigma_bounds=(0.0001, inf), beta=1e-08, optimizer=True, normalize_y=False, kernel_options={})[source]

Bases: graphdot.model.gaussian_process.base.GaussianProcessRegressorBase

Gaussian process regression (GPR) with noise/outlier detection via maximum likelihood estimation.

Parameters:
  • kernel (kernel instance) – The covariance function of the GP.
  • sigma_bounds (a tuple of two floats) – Lower and upper bounds on the per-sample noise values added to the diagonal of the kernel matrix during fitting; each diagonal element is optimized individually during training. Larger values correspond to an increased noise level in the observations. A practical use of this parameter is to prevent potential numerical instabilities during fitting and to ensure that the kernel matrix stays positive definite in the presence of duplicate entries and/or round-off error.
  • beta (float > 0) – Cutoff value on the singular values for the spectral pseudoinverse computation, which serves as a backup mechanism for inverting the kernel matrix in case it is singular.
  • optimizer (one of (str, True, None, callable)) – A string or callable that represents one of the optimizers usable in the scipy.optimize.minimize method. If None, no hyperparameter optimization will be carried out in fitting. If True, the optimizer will default to L-BFGS-B.
  • normalize_y (boolean) – Whether to normalize the target values y so that the mean and variance become 0 and 1, respectively. Recommended for cases where zero-mean, unit-variance kernels are used. The normalization will be reversed when the GP predictions are returned.
  • kernel_options (dict, optional) – A dictionary of additional options to be passed along when applying the kernel to data.
fit(X, y, w, udist=None, tol=0.0001, repeat=1, theta_jitter=1.0, verbose=False)[source]

Train a GPR model with noise/outlier detection. If the optimizer argument was set while initializing the object, the hyperparameters of the kernel will be optimized by maximum likelihood estimation, jointly with the per-sample noise terms.

Parameters:
  • X (list of objects or feature vectors.) – Input values of the training data.
  • y (1D array) – Output/target values of the training data.
  • w (float) – The strength of the L1 penalty on the noise terms.
  • udist (callable) – A random number generator for the initial guesses of the uncertainties. A lognormal distribution will be used by default if the argument is None.
  • tol (float) – Tolerance for termination.
  • repeat (int) – Repeat the hyperparameter optimization by the specified number of times and return the best result.
  • theta_jitter (float) – Standard deviation of the random noise added to the initial logscale hyperparameters across repeated optimization runs.
  • verbose (bool) – Whether or not to print out the optimization progress and outcome.
Returns:

self – returns an instance of self.

Return type:

GPROutlierDetector
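A sketch of fitting the detector on synthetic data with a few injected outliers (hypothetical ToyRBF kernel as above; the value of w is illustrative):

    from graphdot.model.gaussian_process import GPROutlierDetector

    X = np.random.rand(200, 2)
    y = np.sin(3 * X).sum(axis=1)
    y[::25] += 2.0                     # inject artificial outliers
    detector = GPROutlierDetector(kernel=ToyRBF(), sigma_bounds=(1e-4, 1.0))
    # Larger w penalizes the per-sample noise terms harder, so fewer
    # samples end up flagged with a large uncertainty.
    detector.fit(X, y, w=0.1, repeat=2, verbose=True)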

log_marginal_likelihood(theta_ext, X=None, y=None, eval_gradient=False, clone_kernel=True, verbose=False)[source]

Returns the log-marginal likelihood of a given set of log-scale hyperparameters.

Parameters:
  • theta_ext (array-like) – Kernel hyperparameters and per-sample noise prior for which the log-marginal likelihood is to be evaluated. If None, the current hyperparameters will be used.
  • X (list of objects or feature vectors.) – Input values of the training data. If None, self.X will be used.
  • y (1D array) – Output/target values of the training data. If None, self.y will be used.
  • eval_gradient (boolean) – If True, the gradient of the log-marginal likelihood with respect to the kernel hyperparameters at position theta will be returned alongside.
  • clone_kernel (boolean) – If True, the kernel is copied so that probing with theta does not alter the trained kernel. If False, the kernel hyperparameters will be modified in-place.
  • verbose (boolean) – If True, the log-likelihood value and its components will be printed to the screen.
Returns:

  • log_likelihood (float) – Log-marginal likelihood of theta for training data.
  • log_likelihood_gradient (1D array) – Gradient of the log-marginal likelihood with respect to the kernel hyperparameters at position theta. Only returned when eval_gradient is True.

predict(Z, return_std=False, return_cov=False)[source]

Predict using the trained GPR model.

Parameters:
  • Z (list of objects or feature vectors.) – Input values of the unknown data.
  • return_std (boolean) – If True, the standard deviations of the predictions at the query points are returned along with the mean.
  • return_cov (boolean) – If True, the covariance of the predictions at the query points is returned along with the mean.
Returns:

  • ymean (1D array) – Mean of the predictive distribution at query points.
  • std (1D array) – Standard deviation of the predictive distribution at query points.
  • cov (2D matrix) – Covariance of the predictive distribution at query points.

y_uncertainty

The learned uncertainty magnitude of each training sample.
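Continuing the sketch above, one illustrative way to flag outliers is to threshold the learned per-sample uncertainties (the 3x-median cutoff is arbitrary):

    u = detector.y_uncertainty
    outliers = np.flatnonzero(u > 3 * np.median(u))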