RMS Link

Optimal Composite Length for Estimating Block Grades

A Companion Notebook to the RMS Blog

Clayton V. Deutsch, Rose A. Aduko, and Jinpyo Kim

March 17, 2024


Outline¶

The following notebook demonstrates how varying composite scales interact with block estimation error. It is shown that composite length at 50 to 100% of the block size minimizes estimation error, applying the following steps:

  1. Generate Reference Truth Realizations
    • Simulate realizations of point-scale variability on a $(1m)^3$ grid, where $L=200$ realizations provide adequate numeric stability
    • Each realizations will be 'drilled' in Step (2) to generate realizations of data sets input to estimation
    • Block average the realizations to the estimation $(10m)^3$ block-scale to provide a 'truth' basis for estimation error calculations in Step (3)
  2. Generate Datasets
    • Sample the reference point-scale realizations at simulated drill hole locations
    • Composite from the 1m point-scale to the remaining $C=20$ composite lengths of 2m, 3m,....,20m
    • This process results in $C \cdot L = 4000$ data sets
  3. Variography per Composite Length
    • A variogram model that is representative of each composite length must be parameterized for input to estimation
    • Experimental variograms are calculated over data sets from Step (2), yielding $C=20$ experimental variograms
    • Variogram models are auto-fit to each experimental variogram and visually checked
  4. Esimation and Evaluation
    • Using the data sets from Step (2) and variogram models from Step (3), perform block kriging using $C \cdot L = 4000$ data sets
    • Calculate the error of each block estimate by comparing against the associated block-scale reference from Step (1)
    • Average the error over all $L$ reference realizations for each of the $C$ composite lengths, before plotting the expected error of each composite length

The presented workflow may be extended to represent other deposits. Inline annotations with each step will note practical adjustments that may be necessary for this extension.


Import required packages:

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from copy import deepcopy
import rmsp

Activate license and set a couple parameters:

In [2]:
rmsp.activate()
log_progress = False  # Set to true for progressbars on intensive tasks
rmsp.GlobalParams['core.log_progress'] = log_progress
rmsp.GlobalParams['core.enable_beta'] = True
License cleq45ynq00hkgomxsxs35xet checked out by ryan@resmodsol.com to clesnrxfv01veo43fmzfih8ct expires in 47 minutes. RMSP is up to date!

Generate Reference Truth Realizations¶

Define the simulation grid. 2D in the Easting-Elevation plane is used here for speed and ease of demonstration:

In [3]:
griddef = rmsp.GridDef(0.5, 0.0, 0.5, 0.0, 500, 1, 100,
                       1.0, 1.0, 1.0, 'full')
griddef.to_table()
Out[3]:
Easting Northing Elevation
minimum (m) 0.0 -0.5 0.0
maximum (m) 500.0 0.5 100.0
size (m) 1.0 1.0 1.0
number 500 1 100

Define the variogram model for input to simulation. Similar to above, an isotropic variogram is used here to simplify subsequent steps (e.g., variogram calculation/fitting at composite scales):

In [4]:
vario = rmsp.VarioModel.from_params(
    num_struct=1, nugget=0.2,
    shapes='spherical', var_contribs=0.8,
    angles=[90, 0, 0] , ranges=[50, 1, 50])
_ = vario.plot(figsize=(5, 3))

Unconditional simulation of a Gaussian deviate. The number of realizations num_reals may be adjusted down for speed and memory considerations, but may impact numeric stability of the results.

Unconditional simulation of a Gaussian deviate. The number of realizations num_reals may be adjusted down for speed and memory considerations, but may impact numeric stability of the results:

In [5]:
num_reals = 200
cache = rmsp.Simulator().uncond_simulate(
    griddef, vario,
    reals=num_reals,
    cache='simcache/',
    cache_var='ns:var',
    seed=515151
)

Transform the realizations to a lognormal distribution for additional realism. In practice, consider back-transforming using the normal score transform (and/or other relevant transforms):

In [6]:
mu = np.log(1 / np.sqrt(2))
sigma = np.sqrt(np.log(2))

var = rmsp.Variable('var').set_variable_params(cmap='Spectral_r', clim=(0.1, 10), cmap_log=True)
for ireal, real in cache.iter_realizations():
    real[var.col] = np.exp(mu + sigma * real['ns:var'])
    cache.set_real(ireal, real[var.col])

Visualize some realizations:

In [7]:
scene = dict(xlim=griddef.extents('x'), ylim=griddef.extents('z'), orient='xz')
fig, axes = rmsp.ImageGrid(2, 2, figsize=(10, 10), cbar_mode='single')
for ireal, real in cache.iter_realizations(griddef=griddef, reals=4):
    _ = real.sectionplot(var.col, title=f'Realization {ireal}', ax=axes[ireal], **scene)

Block average to the block-scale used for estimation, providing the reference truths at the estimation scale for final evaluation of estimation error.

In [8]:
block_size = 10.0
griddef_block = griddef.change_cell_size(block_size, 1, block_size)
avg = rmsp.GridAverage(griddef, griddef_block).build()
cache_block = avg.apply_sim(cache, var.col, 'simcache_block/', var.col)

The same realizations are visualized for comparison to the point-scale above:

In [9]:
fig, axes = rmsp.ImageGrid(2, 2, figsize=(10, 10), cbar_mode='single')
for ireal, real in cache_block.iter_realizations(griddef=griddef_block, reals=4):
    _ = real.sectionplot(var.col, title=f'Realization {ireal}', ax=axes[ireal], **scene)

Generate Datasets¶

We will be sampling the high resolution model realizations with simulated drilling, before compositing at various lengths, up to twice the planned estimation block size. These composite lengths are defined:

In [10]:
lengths = np.arange(griddef.zsize, block_size * 2 + 1)
lengths
Out[10]:
array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11., 12., 13.,
       14., 15., 16., 17., 18., 19., 20.])

Sample the point-scalerealizations with 30m space drilling, storing the resulting datasets in a dictionary where the first key is the composite length and the second key is the realization index.

In [11]:
drill_spacing = (30, 1)
sampler = rmsp.ModelSampler(spacings=[drill_spacing], azm=90,
                            comp_len=1.0, locations=griddef,
                            num_reals=num_reals, offset_per_real=False)
sampler.sample(griddef, var.col, cache)
comps = {lengths[0]: {ireal: sampler.get_data(drill_spacing, ireal)
                      for ireal in range(num_reals)}} 

Composite to increasing lengths in parallel, adding resulting datasets to the comps dictionary:

In [12]:
def composite(length):
    """Intended to be called in parallel so that a copy of sampler instance is used"""
    sampler.composite(length, progressbar=False)
    return [sampler.get_data(drill_spacing, ireal) for ireal in range(num_reals)]

results = rmsp.parallel_runner(composite, lengths[1:])
for length, result in zip(lengths[1:], results):
    comps[length] = {ireal: r for ireal, r in enumerate(result)}
del result

Inspect results for the first realization:

In [13]:
fig, axes = rmsp.ImageGrid(5, 2, figsize=(10, 10), cbar_mode='single')
for length, ax in zip(lengths, axes):
    comps[length][0].sectionplot(var.col, **scene, ax=ax,
                                 title=f'Composite Length = {length}m', s=length * 1.1)

Variography per Composite Length¶

The variogram has adjusted from the input model to the reference simulation, due to the lognormal transformation (or back-transforms in practice) and compositing. The variogram could be calculated/modeled for each individual realization and composite length, but we will simplify the process here by calculating an experimental variogram on the first num_rep number of realizations, treated as representative of the overall set.

Accumulate the first num_rep datasets, offsetting them in the y-plane so they aren't mixed in the experimental variogram calculuation:

In [14]:
num_rep = 20
comps_merge = {}
for length in lengths:
    comps_merge[length] = []
    for ireal, comp in comps[length].items():
        if ireal > num_rep:
            break
        comp = comp.copy()
        comp[comp.x] += 1.0e8 * ireal
        comps_merge[length].append(comp)
    comps_merge[length] = pd.concat(comps_merge[length]).reset_index()

In this case, we're working with an isotropic variable, so we'll simply calculate downhole. Multiple search directions and an anisotropic fit may be added in practice.

In [15]:
variomodels = {}
expvarios = {}
zlim = griddef.extents()[1] - griddef.extents()[0]
for length, comp in rmsp.log_progress(comps_merge.items(), progressbar=log_progress):
    search = rmsp.ExpVarioSearch(0, 90, rmsp.Lags(length, length * 0.5, int(60 / length)))
    expvarios[length] = rmsp.ExpVario().calculate(comp, var.col, search, progressbar=False)
    variomodels[length] = rmsp.VarioModel.fit_experimental(
        expvarios[length], num_struct=2, shapes="spherical", ranges12_bounds=1, ranges13_bounds=1)

Visualize the variograms and fit models. Fine tuning or manual enforcement of the nugget effect could be performed in practice.

In [16]:
fig, axes = rmsp.ImageGrid(5, 3, figsize=(10, 10), aspect=False)
for length, ax in zip(lengths, axes):
    expvarios[length].plot(ax=ax, title=f'Composite Length = {length}m')
    variomodels[length].plot_draw(ax)

Estimation and Evaluation¶

Estimate across all realizations at each composite scale:

In [17]:
estimates = {}
for length, comp in rmsp.log_progress(comps.items(), progressbar=log_progress):
    estimates[length] = []
    # Search and number of block discretizations are set per composite length
    search = rmsp.Search.from_vario_buffer(
        variomodels[length], buffer_by=5, min_comps=12, max_comps=24, max_comps_per_dhid=5)
    num_discret = min(max(int(block_size / length), 1), 5)
    for ireal, comp in comps[length].items():
        kriger = rmsp.KrigeEstimator(search, variomodels[length], num_discret=[num_discret, 1, 5])
        estimates[length].append(kriger.estimate(griddef_block, data=comp, var=var.col, progressbar=False))
        # Ensure all blocks are estimated
        assert estimates[length][-1]['estimate'].isna().sum() == 0

Attach the associated reference truths at the block scale to the estimates before calculating overall estimation RMSE by composite length:

In [18]:
rmses = []
for length, estimate in estimates.items():
    for ireal, real in cache_block.iter_realizations():
        estimate[ireal]['truth'] = real[var.col].values
    estimate = pd.concat(estimate).reset_index(drop=True)
    rmses.append(np.sqrt(np.mean((estimate['estimate'] - estimate['truth'])**2)))

Plot calculated RMSE against composite length:

In [19]:
fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(lengths, rmses, 'o-')
rmsp.format_plot(ax, 'Composite Length (m)', 'RMSE', f'Composite Length vs. RMSE ({block_size}m Block)', grid=True)
_ = ax.set_xticks(lengths, labels=[str(int(l)) for l in lengths])