{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "39d65390-f4d0-49e0-bfc5-f20c9d376a9d",
   "metadata": {},
   "source": [
    "<div style=\"text-align: center;padding-top:10px;padding-bottom:10px\">\n",
    "    <a href=\"https://resourcemodelingsolutions.com/\" target=\"_blank\">\n",
    "        <img src=\"https://geostatisticslessons.com/images/icons/resource_modeling_solutions_logo.png\" alt=\"RMS Link\" style=\"width:300px;\">\n",
    "    </a>\n",
    "</div>\n",
    "\n",
    "<div style=\"text-align: center\">\n",
    "    <h1 style=\"color:#2B334F;padding-bottom:0px\">Optimal Composite Length for Estimating Block Grades </h1>\n",
    "    <h2 style=\"color:#2B334F;padding-bottom:20px\">A Companion Notebook to the RMS Blog </h2>    \n",
    "</div>\n",
    "\n",
    "<p class=\"author\" style=\"padding: 0;\"><center><b>Clayton V. Deutsch, Rose A. Aduko, and Jinpyo Kim</b></center></p>\n",
    "<p class=\"date\" style=\"padding: 0;\"><center><b>March 17, 2024</b></center></p>\n",
    "\n",
    "---\n",
    "\n",
    "# Outline\n",
    "\n",
    "The following notebook demonstrates how varying composite scales interact with block estimation error. It is shown that composite length at 50 to 100% of the block size minimizes estimation error, applying the following steps:\n",
    "\n",
    "1. **Generate Reference Truth Realizations**\n",
    "    - Simulate realizations of point-scale variability on a $(1m)^3$ grid, where $L=200$ realizations provide adequate numeric stability\n",
    "    - Each realizations will be 'drilled' in Step (2) to generate realizations of data sets input to estimation\n",
    "    - Block average the realizations to the estimation $(10m)^3$ block-scale to provide a 'truth' basis for estimation error calculations in Step (3)\n",
    "2. **Generate Datasets**\n",
    "    - Sample the reference point-scale realizations at simulated drill hole locations\n",
    "    - Composite from the 1m point-scale to the remaining $C=20$ composite lengths of 2m, 3m,....,20m\n",
    "    - This process results in $C \\cdot L = 4000$ data sets\n",
    "3. **Variography per Composite Length**\n",
    "    - A variogram model that is representative of each composite length must be parameterized for input to estimation\n",
    "    - Experimental variograms are calculated over data sets from Step (2), yielding $C=20$ experimental variograms\n",
    "    - Variogram models are auto-fit to each experimental variogram and visually checked\n",
    "5. **Esimation and Evaluation**\n",
    "    - Using the data sets from Step (2) and variogram models from Step (3), perform block kriging using $C \\cdot L = 4000$ data sets\n",
    "    - Calculate the error of each block estimate by comparing against the associated block-scale reference from Step (1)\n",
    "    - Average the error over all $L$ reference realizations for each of the $C$ composite lengths, before plotting the expected error of each composite length\n",
    "    \n",
    "The presented workflow may be extended to represent other deposits. Inline annotations with each step will note practical adjustments that may be necessary for this extension.\n",
    "\n",
    "---\n",
    "\n",
    "Import required packages:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fb3f3594-1c41-4aa6-92c2-7e518b325038",
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from copy import deepcopy\n",
    "import rmsp"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a94cfd7c-f541-4b46-aa6c-4383a42632e6",
   "metadata": {},
   "source": [
    "Activate license and set a couple parameters:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e6ffdeed-b62b-498e-90df-dcbb762e5ff4",
   "metadata": {},
   "outputs": [],
   "source": [
    "rmsp.activate()\n",
    "log_progress = False  # Set to true for progressbars on intensive tasks\n",
    "rmsp.GlobalParams['core.log_progress'] = log_progress\n",
    "rmsp.GlobalParams['core.enable_beta'] = True"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "693d30d3-270f-4312-8946-b0cb0223bc8f",
   "metadata": {
    "tags": []
   },
   "source": [
    "---\n",
    "## Generate Reference Truth Realizations\n",
    "\n",
    "Define the simulation grid. 2D in the Easting-Elevation plane is used here for speed and ease of demonstration:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "132efee3-fc35-4422-8f5d-c457fa46b39a",
   "metadata": {},
   "outputs": [],
   "source": [
    "griddef = rmsp.GridDef(0.5, 0.0, 0.5, 0.0, 500, 1, 100,\n",
    "                       1.0, 1.0, 1.0, 'full')\n",
    "griddef.to_table()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "99bac393-afc5-459b-a276-55850ed5cb60",
   "metadata": {},
   "source": [
    "Define the variogram model for input to simulation. Similar to above, an isotropic variogram is used here to simplify subsequent steps (e.g., variogram calculation/fitting at composite scales):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2d297af4-c000-4162-b747-6a29daf36d74",
   "metadata": {},
   "outputs": [],
   "source": [
    "vario = rmsp.VarioModel.from_params(\n",
    "    num_struct=1, nugget=0.2,\n",
    "    shapes='spherical', var_contribs=0.8,\n",
    "    angles=[90, 0, 0] , ranges=[50, 1, 50])\n",
    "_ = vario.plot(figsize=(5, 3))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c30f4aa-4332-40be-bf6c-e52243b552ad",
   "metadata": {},
   "source": [
    "Unconditional simulation of a Gaussian deviate. The number of realizations `num_reals` may be adjusted down for speed and memory considerations, but may impact numeric stability of the results."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "15cc0fd0-8cf5-461c-bfc7-c434a5645b4a",
   "metadata": {},
   "source": [
    "Unconditional simulation of a Gaussian deviate. The number of realizations `num_reals` may be adjusted down for speed and memory considerations, but may impact numeric stability of the results:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b1883214-57c4-4aa2-8d64-aa8e3046e30c",
   "metadata": {},
   "outputs": [],
   "source": [
    "num_reals = 200\n",
    "cache = rmsp.Simulator().uncond_simulate(\n",
    "    griddef, vario,\n",
    "    reals=num_reals,\n",
    "    cache='simcache/',\n",
    "    cache_var='ns:var',\n",
    "    seed=515151\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "884f961b-8131-4c02-9360-34be798e55c2",
   "metadata": {},
   "source": [
    "Transform the realizations to a lognormal distribution for additional realism. In practice, consider back-transforming using the normal score transform (and/or other relevant transforms):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1fc6ffc6-2314-4698-b895-088b6a9d805b",
   "metadata": {},
   "outputs": [],
   "source": [
    "mu = np.log(1 / np.sqrt(2))\n",
    "sigma = np.sqrt(np.log(2))\n",
    "\n",
    "var = rmsp.Variable('var').set_variable_params(cmap='Spectral_r', clim=(0.1, 10), cmap_log=True)\n",
    "for ireal, real in cache.iter_realizations():\n",
    "    real[var.col] = np.exp(mu + sigma * real['ns:var'])\n",
    "    cache.set_real(ireal, real[var.col])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2ed6893a-f560-4055-a7b5-a56397782409",
   "metadata": {},
   "source": [
    "Visualize some realizations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4db01ddd-8d32-433a-ac15-fcb4fdeba45a",
   "metadata": {},
   "outputs": [],
   "source": [
    "scene = dict(xlim=griddef.extents('x'), ylim=griddef.extents('z'), orient='xz')\n",
    "fig, axes = rmsp.ImageGrid(2, 2, figsize=(10, 10), cbar_mode='single')\n",
    "for ireal, real in cache.iter_realizations(griddef=griddef, reals=4):\n",
    "    _ = real.sectionplot(var.col, title=f'Realization {ireal}', ax=axes[ireal], **scene)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fed0e67d-1d09-476b-a46a-0f7026e12450",
   "metadata": {},
   "source": [
    "Block average to the block-scale used for estimation, providing the reference truths at the estimation scale for final evaluation of estimation error. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5ced652e-aa26-423e-8991-1445092cbd3e",
   "metadata": {},
   "outputs": [],
   "source": [
    "block_size = 10.0\n",
    "griddef_block = griddef.change_cell_size(block_size, 1, block_size)\n",
    "avg = rmsp.GridAverage(griddef, griddef_block).build()\n",
    "cache_block = avg.apply_sim(cache, var.col, 'simcache_block/', var.col)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c5ebae6-1bb1-4145-9e23-5edfaa2cb267",
   "metadata": {},
   "source": [
    "The same realizations are visualized for comparison to the point-scale above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6848467f-3983-4499-82c9-0be01782db89",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, axes = rmsp.ImageGrid(2, 2, figsize=(10, 10), cbar_mode='single')\n",
    "for ireal, real in cache_block.iter_realizations(griddef=griddef_block, reals=4):\n",
    "    _ = real.sectionplot(var.col, title=f'Realization {ireal}', ax=axes[ireal], **scene)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "88fff4cc-e0ba-46bd-b99e-31cc57bf57e8",
   "metadata": {
    "tags": []
   },
   "source": [
    "---\n",
    "## Generate Datasets\n",
    "\n",
    "We will be sampling the high resolution model realizations with simulated drilling, before compositing at various lengths, up to twice the planned estimation block size. These composite lengths are defined:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3b9067b6-6c86-44ef-a430-75a91bfe6217",
   "metadata": {},
   "outputs": [],
   "source": [
    "lengths = np.arange(griddef.zsize, block_size * 2 + 1)\n",
    "lengths"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13bcb347-94db-42c7-b17c-ba6a04b1a893",
   "metadata": {},
   "source": [
    "Sample the point-scalerealizations with 30m space drilling, storing the resulting datasets in a dictionary where the first key is the composite length and the second key is the realization index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ad4dd8d8-9b9b-4029-88d8-5484d9ceea07",
   "metadata": {},
   "outputs": [],
   "source": [
    "drill_spacing = (30, 1)\n",
    "sampler = rmsp.ModelSampler(spacings=[drill_spacing], azm=90,\n",
    "                            comp_len=1.0, locations=griddef,\n",
    "                            num_reals=num_reals, offset_per_real=False)\n",
    "sampler.sample(griddef, var.col, cache)\n",
    "comps = {lengths[0]: {ireal: sampler.get_data(drill_spacing, ireal)\n",
    "                      for ireal in range(num_reals)}} "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19cbe9fb-95cf-4a56-9210-cfa88206d487",
   "metadata": {},
   "source": [
    "Composite to increasing lengths in parallel, adding resulting datasets to the `comps` dictionary:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0582a8ae-fd5b-4739-85f3-318f626481bc",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "def composite(length):\n",
    "    \"\"\"Intended to be called in parallel so that a copy of sampler instance is used\"\"\"\n",
    "    sampler.composite(length, progressbar=False)\n",
    "    return [sampler.get_data(drill_spacing, ireal) for ireal in range(num_reals)]\n",
    "\n",
    "results = rmsp.parallel_runner(composite, lengths[1:])\n",
    "for length, result in zip(lengths[1:], results):\n",
    "    comps[length] = {ireal: r for ireal, r in enumerate(result)}\n",
    "del result"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "74bf1957-2717-4133-b30c-25297695d877",
   "metadata": {},
   "source": [
    "Inspect results for the first realization:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a8d386d4-6d5b-40d5-b734-01ba1340e48c",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, axes = rmsp.ImageGrid(5, 2, figsize=(10, 10), cbar_mode='single')\n",
    "for length, ax in zip(lengths, axes):\n",
    "    comps[length][0].sectionplot(var.col, **scene, ax=ax,\n",
    "                                 title=f'Composite Length = {length}m', s=length * 1.1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41302601-8456-442c-bef8-b536f82a2bbd",
   "metadata": {},
   "source": [
    "---\n",
    "## Variography per Composite Length\n",
    "\n",
    "The variogram has adjusted from the input model to the reference simulation, due to the lognormal transformation (or back-transforms in practice) and compositing. The variogram could be calculated/modeled for each individual realization and composite length, but we will simplify the process here by calculating an experimental variogram on the first `num_rep` number of realizations, treated as representative of the overall set.\n",
    "\n",
    "Accumulate the first `num_rep` datasets, offsetting them in the y-plane so they aren't mixed in the experimental variogram calculuation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "68021f29-9914-4950-b4ef-be60cd62cad0",
   "metadata": {},
   "outputs": [],
   "source": [
    "num_rep = 20\n",
    "comps_merge = {}\n",
    "for length in lengths:\n",
    "    comps_merge[length] = []\n",
    "    for ireal, comp in comps[length].items():\n",
    "        if ireal > num_rep:\n",
    "            break\n",
    "        comp = comp.copy()\n",
    "        comp[comp.x] += 1.0e8 * ireal\n",
    "        comps_merge[length].append(comp)\n",
    "    comps_merge[length] = pd.concat(comps_merge[length]).reset_index()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eaa75ebd-dd9f-467c-943f-b5d8f1dca090",
   "metadata": {},
   "source": [
    "In this case, we're working with an isotropic variable, so we'll simply calculate downhole. Multiple search directions and an anisotropic fit may be added in practice."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8ad53132-6744-4dff-ba6e-d9760b373c03",
   "metadata": {},
   "outputs": [],
   "source": [
    "variomodels = {}\n",
    "expvarios = {}\n",
    "zlim = griddef.extents()[1] - griddef.extents()[0]\n",
    "for length, comp in rmsp.log_progress(comps_merge.items(), progressbar=log_progress):\n",
    "    search = rmsp.ExpVarioSearch(0, 90, rmsp.Lags(length, length * 0.5, int(60 / length)))\n",
    "    expvarios[length] = rmsp.ExpVario().calculate(comp, var.col, search, progressbar=False)\n",
    "    variomodels[length] = rmsp.VarioModel.fit_experimental(\n",
    "        expvarios[length], num_struct=2, shapes=\"spherical\", ranges12_bounds=1, ranges13_bounds=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2f4cc094-f662-4364-b390-345cfebbc859",
   "metadata": {},
   "source": [
    "Visualize the variograms and fit models. Fine tuning or manual enforcement of the nugget effect could be performed in practice."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b7c98201-3559-4a8d-8edf-b9cdc2d8241c",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, axes = rmsp.ImageGrid(5, 3, figsize=(10, 10), aspect=False)\n",
    "for length, ax in zip(lengths, axes):\n",
    "    expvarios[length].plot(ax=ax, title=f'Composite Length = {length}m')\n",
    "    variomodels[length].plot_draw(ax)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8f55eaf6-fd7e-423f-a62e-9eea03327804",
   "metadata": {},
   "source": [
    "---\n",
    "## Estimation and Evaluation\n",
    "\n",
    "Estimate across all realizations at each composite scale:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dff1c72a-dfc2-4e3c-a1ba-fe24cba29dc6",
   "metadata": {
    "scrolled": true,
    "tags": []
   },
   "outputs": [],
   "source": [
    "estimates = {}\n",
    "for length, comp in rmsp.log_progress(comps.items(), progressbar=log_progress):\n",
    "    estimates[length] = []\n",
    "    # Search and number of block discretizations are set per composite length\n",
    "    search = rmsp.Search.from_vario_buffer(\n",
    "        variomodels[length], buffer_by=5, min_comps=12, max_comps=24, max_comps_per_dhid=5)\n",
    "    num_discret = min(max(int(block_size / length), 1), 5)\n",
    "    for ireal, comp in comps[length].items():\n",
    "        kriger = rmsp.KrigeEstimator(search, variomodels[length], num_discret=[num_discret, 1, 5])\n",
    "        estimates[length].append(kriger.estimate(griddef_block, data=comp, var=var.col, progressbar=False))\n",
    "        # Ensure all blocks are estimated\n",
    "        assert estimates[length][-1]['estimate'].isna().sum() == 0"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f207d531-4e5b-4805-8ffd-85e13e584d12",
   "metadata": {},
   "source": [
    "Attach the associated reference truths at the block scale to the estimates before calculating overall estimation RMSE by composite length:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "43d53abd-d406-49ec-9860-8987d33421f9",
   "metadata": {},
   "outputs": [],
   "source": [
    "rmses = []\n",
    "for length, estimate in estimates.items():\n",
    "    for ireal, real in cache_block.iter_realizations():\n",
    "        estimate[ireal]['truth'] = real[var.col].values\n",
    "    estimate = pd.concat(estimate).reset_index(drop=True)\n",
    "    rmses.append(np.sqrt(np.mean((estimate['estimate'] - estimate['truth'])**2)))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a13109b1-d31b-4935-8a11-2babe0d8b911",
   "metadata": {},
   "source": [
    "Plot calculated RMSE against composite length:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d0124b55-dde6-41ee-bb2c-65f0bb3e3211",
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots(figsize=(8, 4))\n",
    "ax.plot(lengths, rmses, 'o-')\n",
    "rmsp.format_plot(ax, 'Composite Length (m)', 'RMSE', f'Composite Length vs. RMSE ({block_size}m Block)', grid=True)\n",
    "_ = ax.set_xticks(lengths, labels=[str(int(l)) for l in lengths])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}