Technical Overview & Methodology
This document is a technical reference for the Brutus powder indexing software. It explains the main algorithms, search parameters, and methodology, and is intended for users familiar with powder X-ray diffraction. As the name suggests, this is a brute-force method — the search is exhaustive rather than heuristic, and is not necessarily fast.
Core Goal
The aim of ab initio powder indexing is to determine the unit-cell parameters ($a, b, c, \alpha, \beta, \gamma$) from a list of observed diffraction peak positions ($2\theta$). Brutus performs this task using a symmetry-specific, exhaustive search algorithm.
The central assumption is that a small subset of the most intense, low-angle reflections corresponds to simple crystal planes with low-integer Miller indices $(hkl)$. For a given crystal system, the program chooses exactly as many observed peaks as there are unknown lattice parameters, and solves the resulting system of linear equations.
Q-Space Formulation
All peak positions are first converted from $2\theta$ to Q-space, where $Q = 1/d^2$. The general quadratic relationship between $Q$, the Miller indices, and the reciprocal cell parameters ($A, B, C, D, E, F$) is:

$$Q_{hkl} = A h^2 + B k^2 + C l^2 + D\,kl + E\,hl + F\,hk$$

where $A = a^{*2}$, $B = b^{*2}$, $C = c^{*2}$, and the cross-terms involve the reciprocal angles. Working in Q-space linearises the problem: for a given assignment of Miller indices to observed peaks, the unknown parameters $\{A, B, C, \ldots\}$ appear as the solution to a system of linear equations.
Brutus solves for these reciprocal parameters (or the subset relevant to the current crystal system), then converts them back to real-space cell parameters. Each candidate cell is then refined with a weighted least-squares fit and scored against the full observed peak list via figures of merit.
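As a minimal illustration of the $2\theta \to Q$ conversion described above (the function names are illustrative, not Brutus's actual API):

```javascript
// Hypothetical helpers (not Brutus's actual function names): map a peak
// position in degrees 2θ to Q = 1/d² via Bragg's law, λ = 2 d sin θ.
function twoThetaToQ(twoThetaDeg, lambda) {
  const theta = (twoThetaDeg / 2) * Math.PI / 180; // θ in radians
  const d = lambda / (2 * Math.sin(theta));        // d-spacing in Å
  return 1 / (d * d);                              // Q in Å⁻²
}

// Inverse mapping, e.g. for drawing calculated tick marks on the chart.
function qToTwoTheta(Q, lambda) {
  const d = 1 / Math.sqrt(Q);
  return 2 * Math.asin(lambda / (2 * d)) * 180 / Math.PI;
}
```

The two functions are exact inverses of each other, which is the property the indexing pipeline relies on when converting between observed angles and Q-space.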
System-Specific Parameterisation
The quadratic form simplifies for each crystal system, reducing the number of free parameters. The design matrix column vector for the least-squares system is:
| System | Free parameters ($k$) | Design row (`getLSDesignRow(hkl)`) |
|---|---|---|
| Cubic | 1 | $[h^2+k^2+l^2]$ |
| Tetragonal | 2 | $[h^2+k^2,\; l^2]$ |
| Hexagonal | 2 | $[\tfrac{4}{3}(h^2+hk+k^2),\; l^2]$ |
| Orthorhombic | 3 | $[h^2,\; k^2,\; l^2]$ |
| Monoclinic | 4 | $[h^2,\; k^2,\; l^2,\; hl]$ |
| Triclinic | 6 | $[h^2,\; k^2,\; l^2,\; kl,\; hl,\; hk]$ |
The solved parameters relate to reciprocal lattice constants. For example, in orthorhombic, the three fitted parameters are $A=1/a^2$, $B=1/b^2$, $C=1/c^2$, giving $a=1/\sqrt{A}$ etc. For monoclinic, the four parameters are $A=1/(a^2\sin^2\beta)$, $B=1/b^2$, $C=1/(c^2\sin^2\beta)$, $D=-2\cos\beta/(ac\sin^2\beta)$, from which $\beta$ is extracted as $\cos\beta = -D/(2\sqrt{AC})$.
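The monoclinic back-conversion just described can be sketched as follows (the function name is illustrative):

```javascript
// Sketch of recovering the monoclinic cell from the four fitted reciprocal
// parameters A, B, C, D defined above (illustrative name, not Brutus code).
function monoclinicCellFromFit(A, B, C, D) {
  const cosBeta = -D / (2 * Math.sqrt(A * C)); // cos β = -D / (2√(AC))
  const sinBeta = Math.sqrt(1 - cosBeta * cosBeta);
  return {
    a: 1 / (Math.sqrt(A) * sinBeta),           // from A = 1/(a² sin²β)
    b: 1 / Math.sqrt(B),                       // from B = 1/b²
    c: 1 / (Math.sqrt(C) * sinBeta),           // from C = 1/(c² sin²β)
    beta: Math.acos(cosBeta) * 180 / Math.PI,  // β in degrees
  };
}
```

Building $A, B, C, D$ from a known cell and passing them through this function round-trips the cell parameters, which is a useful sanity check on the sign convention of $D$.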
Architecture
Brutus runs entirely in the browser, requiring no installation. The program is split across several files:
- `brutus.html` — the main application: UI, chart rendering (Chart.js), data file parsing, peak detection, space-group analysis, and PDF report generation.
- `worker-logic.js` — shared logic loaded both by the main thread (for CPU searches and post-processing) and as a Web Worker (for background CPU indexing). Contains all crystallographic mathematics: HKL generation, Q-calculation, least-squares solvers, figure-of-merit calculations, and the Niggli reduction.
- `webgpu-engine.js` — the GPU-accelerated brute-force search for orthorhombic, monoclinic, and triclinic systems.
Quick Start Guide
Use the following workflow for a typical single-phase powder pattern.
- Load the data file. Click `Select Data File`. Supported formats include `.xy`, `.xrdml`, `.ras`, `.xra`, and plain two-column text data.
- Detect peaks. On the Peaks tab, adjust the `Min peak (%)`, `Radius (pts)`, and `Points` sliders until the automatically detected peaks match the visual pattern.
- Curate the peak list. Carefully review all peaks:
  - Edit $2\theta$ positions for accuracy.
  - Delete spurious peaks (noise, impurities, Kα2 if stripping is off).
  - Add important missing peaks, e.g. low-angle reflections, using `Ctrl + Click` on the chart.
- Set parameters. On the Parameters tab:
  - Select the correct X-ray `Radiation Preset` (e.g. Cu Kα).
  - Choose whether to enable `Strip K-alpha2`. When enabled, the pure Kα1 wavelength is used; when disabled, the average Kα wavelength is used. The default is OFF (average Kα wavelength).
  - Set a chemically reasonable `Max Volume (Å³)` to limit the search space.
  - Set `2θ Error (°)` according to your data quality (e.g. ≈0.02° for synchrotron, ≈0.05° for a typical lab diffractometer).
  - Leave `Refine Zero-Point Error` enabled unless you have a specific reason to turn it off.
  - Select the crystal systems to search. Enabling Orthorhombic, Monoclinic, or Triclinic activates GPU-accelerated searches.
- (Optional) Tune GPU parameters. If searching Orthorhombic, Monoclinic, or Triclinic, a dedicated parameter box appears:
  - `HKL Basis Size`: number of simple HKLs used for the search (defaults: Ortho 300, Mono 100, Tri 40).
  - `Peaks to Combine`: number of observed peaks used to form combinations (defaults: Ortho 7, Mono 7, Tri 9).
  - Leave these at their defaults unless you have trouble finding a solution.
- Start indexing. Click `Start Indexing`. Progress is shown on the main progress bar.
- Inspect solutions. On the Solutions tab:
  - Sort solutions by M(20).
  - Click a row to display calculated (blue) and observed (red) tick marks on the chart.
  - A plausible solution will show excellent alignment and reasonable space-group suggestions.
The User Interface
The application window is divided into a Controls Panel (left) and a Results Area (right), separated by a draggable divider that allows you to resize both panels.
Controls Panel
The controls are organized into three main tabs.
1. Peaks Tab
- Peak finding sliders:
  - `Radius` – rolling-ball background subtraction radius (in data points). Larger values handle broader, slowly varying backgrounds.
  - `Points` – Savitzky–Golay smoothing window width (in data points). Larger values reduce noise but risk merging or shifting close peaks.
  - `Min peak (%)` – peak detection threshold on a logarithmic scale. Peaks below this fraction of the strongest peak are ignored. Reduce to detect weak reflections; increase to suppress noise.
- 2θ range sliders: Restrict the angular window for peak finding, for example to exclude noisy low-angle or high-angle regions.
- Peak table: Lists all detected peaks. The `2θ Obs (°)` column is directly editable for fine corrections. Suspected Kα2 peaks (when stripping is off) are flagged with a yellow background and a ⚠ symbol. Right-clicking a yellow row lets you manually confirm it as a genuine peak.
2. Parameters Tab
- Radiation Preset: Select the X-ray source (e.g. Cu Kα, Co Kα, Mo Kα, synchrotron, etc.). This sets the internal Ka1 and Ka2 wavelength values used for Kα2 stripping and d-spacing calculations.
- Ka1 Wavelength (Å): Displays the wavelength used for all Q-space calculations. It is updated automatically when a preset is chosen and depends on the `Strip K-alpha2` setting:
  - When stripping is ON: displays the pure Kα1 wavelength (e.g. 1.54056 Å for Cu).
  - When stripping is OFF: displays the intensity-weighted average wavelength (e.g. 1.54184 Å for Cu Kα), which accounts for the unstripped doublet.
  - When a Custom preset is selected, this field becomes editable.
- Strip K-alpha2: Applies the Rachinger (1948) Kα2 stripping correction to remove the Kα2 component before peak analysis. This single-pass, deterministic method sweeps from low to high 2θ, using the already-corrected lower-2θ intensity to subtract the Kα2 ghost. It yields highly accurate peak positions without ringing artifacts or negative dips. The algorithm uses the Kα2/Kα1 intensity ratio (typically 0.5) and the wavelength ratio to compute the angular shift of each ghost. The default is OFF (the average Kα wavelength is used).
- Max Volume (ų): Upper bound on the allowed unit-cell volume. This is one of the most powerful constraints on the search. Setting it appropriately (e.g. based on the number of formula units and an estimated molecular volume) dramatically reduces spurious solutions and speeds up the search.
- Impurity Peaks: The number of unindexed peaks tolerated among the first 20 when computing M(20). If your sample contains a known minor impurity phase, set this to 1 or 2 to avoid penalising the correct solution.
- 2θ Error (°): The matching tolerance between observed and calculated peak positions during indexing. This value is converted to a Q-space tolerance using an angle-dependent formula (see Refinement). Typical values: ≈0.01–0.02° for synchrotron data, ≈0.03–0.06° for laboratory diffractometers.
- Refine Zero-Point Error: When enabled, a zero-point error parameter ($\Delta 2\theta_{zero}$) is simultaneously refined alongside the lattice parameters during the least-squares step. This corrects for instrumental misalignment. It is enabled by default and strongly recommended. See the Refinement section for the full two-round strategy.
- Crystal systems to search: Checkboxes for Cubic, Tetragonal, and Hexagonal (CPU), and for Orthorhombic, Monoclinic, and Triclinic (GPU). CPU systems run in a Web Worker (background thread) and are fast; GPU systems offload to the WebGPU API for massively parallel evaluation.
GPU Search Parameters
When Orthorhombic, Monoclinic, or Triclinic is selected,
a dedicated panel appears with four parameters controlling the scope, speed, and memory usage of the GPU search.
- HKL Basis Size ($N_{hkl}$): Defines the size of the pool of theoretical Miller index triplets (HKLs) used to construct trial matrices. Brutus generates a list of simple HKLs (e.g. 100, 011, 112), sorted by $h^2+k^2+l^2$, and takes the first $N_{hkl}$ of them. Increasing this allows finding cells with larger unit-cell edges or complex centring, but increases the number of HKL combinations steeply (roughly as $N_{hkl}^{\,k}$). Defaults: 300 (Ortho), 100 (Mono), 40 (Tri).
- Peaks to Combine ($N_{peaks}$): Controls how many observed peaks (starting from the lowest-angle) are considered when generating the solving tuple. The solver needs exactly $k$ peaks ($k=3$ for Ortho, 4 for Mono, 6 for Tri). Setting $N_{peaks} > k$ (e.g. 7 for Monoclinic) causes the program to test every possible subset of $k$ peaks from those $N_{peaks}$ peaks. This provides robustness against impurities: if one low-angle peak is from an impurity, at least one combination will skip it. Defaults: 7 (Ortho), 7 (Mono), 9 (Tri).
- FoM Tolerance: The "Fail-Fast" threshold for the GPU's internal pre-filter. After solving a trial cell on the GPU, an average absolute deviation of $Q$ is computed over the first $\max(N_{peaks}, 10)$ observed peaks. If this deviation exceeds `FoM Tolerance` × (user 2θ Error converted to ΔQ), the cell is discarded before being sent to the CPU. A value of 1.0 means the trial must match within your stated error; a larger value (e.g. 3) is more permissive and passes more candidates (including valid ones with large zero-shifts) to CPU refinement. Default: 3.
- Buffer / Candidate (kCells): The maximum number of GPU candidates (in thousands) buffered before CPU refinement. If the GPU finds more valid candidates than this limit, the search stops early ("Buffer Full"). All buffered candidates are then passed to the CPU for full weighted least-squares refinement, M(20) calculation, and Niggli reduction. A smaller buffer reduces CPU time after the GPU phase; a larger buffer avoids early termination. Default: 50 (i.e. 50,000 candidate cells).
3. Solutions Tab
- Solutions table (The Ledger): Shows valid solutions found during your session, including crystal system, unit-cell parameters (a, b, c, α, β, γ), volume, and M(20). Solutions are ranked by M(20) in descending order.
- Cumulative Search Behaviour: The table acts as a cumulative record across searches. If you run a search for Orthorhombic cells and then a second search for Monoclinic cells, previous results are retained. New solutions are added to the existing list and the entire table is re-ranked by M(20). This allows direct comparison across crystal systems. The list is only cleared when you load a new data file.
- Context Menu (Right-Click): Right-clicking a solution row opens a menu with two options:
  - Erase Solution: Permanently removes that solution from the ledger.
  - Generate Report: Creates a PDF report for that solution only.
Results Area
- Chart: Shows the experimental diffraction pattern, observed peaks (red ticks), and calculated peaks for the selected solution (blue ticks). A good solution exhibits visually convincing overlap across the full angular range.
- Chart interaction:
  - Zoom: mouse wheel over the chart area zooms the 2θ axis; scrolling over the Y-axis zooms intensity.
  - Pan: click and drag.
  - Reset zoom: right-click on the chart.
  - Add peak manually: `Ctrl + Click` on the chart.
Peak Finding in Detail
Accurate peak positions are the single most important input for successful indexing. Brutus uses a multi-step procedure to detect peaks from raw intensity data.
Algorithm Steps
- Kα2 stripping (optional): If `Strip K-alpha2` is enabled, the Rachinger (1948) correction is applied to the raw intensities. Even if the peak base shape is not perfectly preserved, this is unimportant here: only peak positions are used for indexing, and those remain highly accurate.
- Background subtraction: A rolling-ball style algorithm estimates and removes the background. The `Radius` slider controls the ball radius in data points; a larger radius smooths over broader background humps.
- Data smoothing: A Savitzky–Golay filter is applied to the background-subtracted signal to reduce noise while preserving peak shapes. The `Points` slider sets the window width.
- Initial peak detection: Local maxima above the `Min peak (%)` threshold are identified.
- Position refinement: For each detected peak, a five-point least-squares quadratic fit (Savitzky–Golay coefficients) is carried out around the local maximum to obtain a sub-step-precision position. If the peak is too close to the data edge, the algorithm falls back to a three-point fit.
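The position-refinement step can be sketched as follows. The coefficients below are the closed-form least-squares solution for a parabola over five equally spaced samples (an illustrative sketch; the exact coefficients in the code may differ):

```javascript
// Refine a peak position to sub-step precision with a five-point LS parabola.
// y: five intensity samples centred on the local maximum (assumes a genuine
// maximum, i.e. curvature c2 < 0); step: the 2θ spacing between data points.
function refinePeak(twoThetaMax, step, y) {
  // fit y ≈ c0 + c1·x + c2·x² over x = -2..2 (closed-form normal equations)
  const c1 = (-2 * y[0] - y[1] + y[3] + 2 * y[4]) / 10;      // Σxy / Σx²
  const sumY = y[0] + y[1] + y[2] + y[3] + y[4];
  const sumX2Y = 4 * y[0] + y[1] + y[3] + 4 * y[4];
  const c2 = (sumX2Y - 2 * sumY) / 14;                       // curvature
  // the parabola's vertex, x* = -c1/(2c2), is the refined offset in steps
  return twoThetaMax - step * c1 / (2 * c2);
}
```

Feeding it samples from an exact parabola recovers the vertex exactly, which is why this step gives positions far better than the raw data spacing.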
Suspected Kα2 Peaks & Wavelength Treatment
If your data has not been stripped of Kα2, Brutus automatically scans the peak list to flag suspected Kα2 ghosts:
- Yellow Rows: Suspected Kα2 peaks are highlighted in yellow in the table and marked with ⚠. These are excluded from the combinatorial indexing search and from the M(20) Figure of Merit calculation, and count only as "soft" violations during space-group analysis.
- Kα1 Wavelength for Parent Peaks: The parent peak of each Kα2 ghost is identified (marked with a `*`). Brutus recalculates its d-spacing using the precise Kα1 wavelength instead of the average, improving indexing accuracy for that peak.
- Manual Override: Right-click a yellow row to mark it as a confirmed real peak if it was flagged incorrectly (e.g. due to a near-coincidence with another phase).
Practical Recommendations
- Start from the default slider values and inspect the result visually. Only the peaks shown in the table will be used in the calculations.
- If weak but real peaks are missed, reduce `Min peak (%)`. If noise is detected as peaks, increase it.
- For broad, slowly varying backgrounds, increase `Radius`.
- For noisy data, increase `Points` (smoothing), but avoid over-smoothing, which can merge or shift peaks.
- Always manually curate the final peak list. Remove artifacts and known impurity peaks. If you are not stripping Kα2, delete Kα2 shoulders explicitly or confirm that the program has tagged them correctly.
  Important: At high angles, the Kα1/Kα2 doublet may be fully resolved. If you accidentally include one line from Kα1 and the adjacent one from Kα2 as two separate peaks (treating them as originating from the same wavelength), your peak list will be inconsistent and indexing will likely fail.
- Kα2 stripping can simplify the pattern but may introduce small artifacts. If indexing fails with stripping ON, try turning it OFF and manually cleaning the peak list.
Indexing Algorithm and Search Strategy
Brutus uses an exhaustive, symmetry-specific trial-and-refine indexing algorithm. For each crystal system, the program generates trial solutions from combinations of low-angle observed peaks and low-index Miller indices, solves the corresponding linear system in reciprocal space, rejects unphysical cells, and then performs a full weighted least-squares refinement on surviving candidates.
Linear System Formulation
The search is formulated as a system of linear equations based on the quadratic form. Given a trial assignment of Miller indices $\{(hkl)_i\}$ to observed peaks $\{Q_{obs,i}\}$:

$$Q_{obs,i} = \sum_{j=1}^{k} H_{j,i}\, P_j$$

where $H_{j,i}$ is the $j$-th element of the design row for $(hkl)_i$ (see the table in the Overview), and $P_j$ are the unknown reciprocal-lattice parameters. Stacking $k$ such equations gives a $k \times k$ square system that is solved exactly (or, in the over-determined case, by weighted least squares).
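A toy instance of this $k \times k$ solve for the tetragonal case ($k=2$), using Cramer's rule (names are illustrative, not Brutus code):

```javascript
// Given two observed Q values and a trial HKL assignment, recover
// A = 1/a² and C = 1/c² from the 2×2 tetragonal system by Cramer's rule.
function solveTetragonal(hkl1, Q1, hkl2, Q2) {
  const [h1, k1, l1] = hkl1, [h2, k2, l2] = hkl2;
  const m11 = h1 * h1 + k1 * k1, m12 = l1 * l1; // design rows [h²+k², l²]
  const m21 = h2 * h2 + k2 * k2, m22 = l2 * l2;
  const det = m11 * m22 - m12 * m21;
  if (Math.abs(det) < 1e-12) return null;       // singular assignment, skip
  return {
    A: (Q1 * m22 - Q2 * m12) / det,
    C: (m11 * Q2 - m21 * Q1) / det,
  };
}
```

Assigning (110) and (101) to two synthetic peaks of a known cell recovers $A$ and $C$ exactly; a wrong assignment simply yields a cell that fails the later figure-of-merit checks.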
CPU Searches (High Symmetry)
High-symmetry systems have fewer degrees of freedom ($k \leq 2$) and are solved on the CPU in a background Web Worker. Recent updates have significantly expanded the search scope to ensure large unit cells are not missed.
- Cubic (1 parameter, $k=1$): Solves a 1×1 system by iterating through the first 12 observed peaks (`peak_depth = 12`). For each observed peak $Q_{obs,i}$, the program iterates over all HKLs up to index 40 ($h_\text{max}=40$), giving approximately $12 \times 150\,000 \approx 1.8$ million trials. The trial parameter is simply $a = \sqrt{(h^2+k^2+l^2)/Q_{obs,i}}$. This allows detection of cubic cells up to very large $a$ values.
- Tetragonal & Hexagonal (2 parameters, $k=2$): Solves a 2×2 system using pairs of peaks $(i, j)$ from the first 12 observed peaks. For each pair, all HKL combinations $(hkl_1, hkl_2)$ up to index 12 are tried ($h_\text{max}=12$). The 2×2 linear system is solved analytically (Cramer's rule). Total trials: $\binom{12}{2} \times |HKLs|^2 \approx 66 \times 1.4\,\text{M} \approx 92$ million combinations.
Typical runs complete in 1–3 seconds. However, dense patterns (e.g. apatite-type structures) can take substantially longer — sometimes a minute or more — because a large fraction of candidate cells survive the initial filter and must be individually refined. If indexing appears stuck on a Hexagonal or Tetragonal file, it is almost certainly still making progress; the trials-per-second counter in the status bar confirms this.
GPU-Accelerated Searches (Low Symmetry)
For Orthorhombic, Monoclinic, and Triclinic systems, the combinatorial explosion makes CPU searching impractical. Brutus offloads these tasks to the WebGPU API, allowing billions of trial cells to be evaluated in minutes.
The Combinatorial Search Space
The total number of cells tested depends on two user-configurable parameters.
- $C(N_{peaks}, k)$ — the number of ways to choose $k$ peaks from the pool of $N_{peaks}$ low-angle observed peaks.
- $C(N_{hkl}, k)$ — the number of ways to choose $k$ HKL triplets from the $N_{hkl}$ basis.
- $k!$ — the number of permutations (assignments of the chosen HKLs to the chosen peaks). Ortho: $3!=6$; Mono: $4!=24$; Tri: $6!=720$.
System Specifics
| System | $k$ | Default $N_{hkl}$ | Default $N_{peaks}$ | Estimated trials (defaults) |
|---|---|---|---|---|
| Orthorhombic | 3 | 300 | 7 | $C(7,3) \times C(300,3) \times 6 \approx 936\;\text{Million}$ |
| Monoclinic | 4 | 100 | 7 | $C(7,4) \times C(100,4) \times 24 \approx 3.3\;\text{Billion}$ |
| Triclinic | 6 | 40 | 9 | $C(9,6) \times C(40,6) \times 720 \approx 232\;\text{Billion}$ |
These numbers can grow very rapidly. For example, increasing the Monoclinic $N_{hkl}$ from 100 to 150 multiplies the HKL combinations by roughly $(150/100)^4 \approx 5\times$, and increasing $N_{peaks}$ from 7 to 9 multiplies the peak combinations by $C(9,4)/C(7,4) = 126/35 \approx 3.6\times$.
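The trial counts above are easy to reproduce. A small sketch of the arithmetic (`comb` and `totalTrials` are illustrative names, not Brutus code):

```javascript
// Binomial coefficient C(n, k); intermediate values stay exact in doubles
// for the sizes used here (all below 2^53).
function comb(n, k) {
  let r = 1;
  for (let i = 1; i <= k; i++) r = r * (n - k + i) / i;
  return Math.round(r);
}

// trials = C(N_peaks, k) · C(N_hkl, k) · k!  (peak subsets × HKL subsets
// × assignments of HKLs to peaks)
function totalTrials(nPeaks, nHkl, k) {
  let fact = 1;
  for (let i = 2; i <= k; i++) fact *= i;
  return comb(nPeaks, k) * comb(nHkl, k) * fact;
}
```

With the defaults this reproduces the table: `totalTrials(7, 300, 3)` gives the ≈936 million orthorhombic trials and `totalTrials(7, 100, 4)` the ≈3.3 billion monoclinic trials.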
GPU Memory Management & Safe Batching
Testing hundreds of billions of cells generates massive amounts of data. To prevent crashing your browser or freezing your computer, Brutus uses a Safe Batching architecture:
- Hardware Limit (VRAM): The program detects your GPU's available memory. On mobile devices or laptops with integrated graphics, the batch size is automatically reduced to fit within the available VRAM.
- Time Limit (TDR Prevention): On Windows, if a GPU computation takes longer than ~2 seconds, the OS assumes the driver has frozen and resets it (`DXGI_ERROR_DEVICE_HUNG`). Brutus enforces a strict "Speed Limit" per batch (e.g. max 100,000 trials per GPU dispatch for Triclinic) to keep the GPU responsive.
What this means for you: You do not need to configure memory settings. If you see the progress bar moving in many small steps, the safety system is working correctly.
If the search is too slow: Adjusting internal GPU memory or chunk parameters is not useful. Instead, curate the peak list, lower the `HKL Basis Size`, or reduce `Peaks to Combine`.
Two-Stage GPU Filtering
Most candidate cells are invalid and are discarded on the GPU to save CPU time. Two filters are applied:
- Geometric Filter: Cells are rejected immediately if they are singular (near-zero determinant), have unphysical edge lengths ($< 2\;\text{Å}$ or $> 50\;\text{Å}$), have invalid angles ($< 20°$ or $> 160°$ for monoclinic/triclinic), or exceed the user's `Max Volume`.
- Figure-of-Merit Pre-filter: For surviving cells, the GPU calculates a fast internal score: the average absolute deviation $\langle|\Delta Q|\rangle$ over the first $\max(N_{peaks}, 10)$ observed peaks. If this exceeds `FoM Tolerance` $\times\ \Delta Q_{user}$ (where $\Delta Q_{user}$ is the Q tolerance derived from the user's 2θ Error), the cell is discarded. Only cells passing this filter are stored in the GPU buffer for CPU refinement.
Least-Squares Refinement
Every candidate cell passing the GPU (or CPU) filters is subjected to a full weighted least-squares
refinement before being scored and reported. This section details the weighting scheme, zero-point
correction, and error propagation implemented in worker-logic.js.
Q-Space Tolerance
The matching tolerance used during peak assignment is not a fixed ΔQ but a peak-angle-dependent quantity derived from the user's stated $\sigma_{2\theta}$. From $Q = (4/\lambda^2)\sin^2\theta$, differentiating with respect to $2\theta$:

$$\sigma_Q = \frac{2\sin(2\theta)}{\lambda^2}\,\sigma_{2\theta} \qquad (\sigma_{2\theta}\ \text{in radians})$$

This means the Q tolerance is largest at intermediate angles (where $\sin 2\theta$ is largest) and converges to zero at $2\theta = 0°$ and $2\theta = 180°$. The implementation adds a small constant ($10^{-9}$) to prevent division by zero.
Least-Squares Weighting Scheme
For a measurement with constant $2\theta$ uncertainty $\sigma_{2\theta}$, the uncertainty in $Q_{obs}$ is:

$$\sigma_{Q,i} = \frac{2\sin(2\theta_i)}{\lambda^2}\,\sigma_{2\theta}$$

The statistically optimal weight for each peak in the least-squares system is therefore:

$$w_i = \frac{1}{\sigma_{Q,i}^2} \propto \frac{1}{\sin^2(2\theta_i)}$$

This scheme gives high weight to low-angle peaks (which have small $\sigma_Q$) and low weight to high-angle peaks (which have large $\sigma_Q$). This is the opposite of using $w_i = Q_{obs,i}$ (which would over-weight high-angle peaks), and it is particularly important for the zero-point correction: the historical un-weighted approach significantly biased both the cell parameters and the zero shift. A floor on $\sin^2(2\theta)$ at $\sin^2(178°)$ prevents the weights from blowing up for hypothetical peaks near $0°$ or $180°$.
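The tolerance and weighting just described can be sketched as follows (assumed helper names; $\sigma_{2\theta}$ in degrees, λ in Å, including the small $10^{-9}$ guard mentioned above):

```javascript
// Angle-dependent Q tolerance: σ_Q = (2/λ²)·sin(2θ)·σ₂θ, with σ₂θ
// converted from degrees to radians, plus a tiny guard against σ_Q = 0.
function qSigma(twoThetaDeg, sigma2ThetaDeg, lambda) {
  const twoTheta = twoThetaDeg * Math.PI / 180;
  const sigma = sigma2ThetaDeg * Math.PI / 180;
  return (2 / (lambda * lambda)) * Math.abs(Math.sin(twoTheta)) * sigma + 1e-9;
}

// Statistical weight w = 1/σ_Q² for the weighted least-squares system.
function lsWeight(twoThetaDeg, sigma2ThetaDeg, lambda) {
  const s = qSigma(twoThetaDeg, sigma2ThetaDeg, lambda);
  return 1 / (s * s);
}
```

A finite-difference check of $Q = (4/\lambda^2)\sin^2\theta$ confirms the derivative, and a 20° peak indeed receives a larger weight than one at 90°, where $\sin 2\theta$ peaks.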
The least-squares normal equations solved are therefore:

$$(M^T W M)\,\mathbf{x} = M^T W\,\mathbf{Q}_{obs}$$

where $M$ is the design matrix (rows = indexed peaks, columns = reciprocal-lattice parameters), $W = \text{diag}(w_1, \ldots, w_n)$, and $\mathbf{x}$ is the vector of reciprocal parameters. This system is solved numerically via LU decomposition with partial pivoting (implemented in `luDecomposition` / `luSolve`).
Zero-Point Error Refinement (Two-Round Strategy)
When Refine Zero-Point Error is enabled, a zero-point parameter $\Delta_{zero}$ (in degrees) is added as an extra column to the design matrix. The physical model is that the true $2\theta$ of a peak is $2\theta_{true} = 2\theta_{obs} - \Delta_{zero}$, which shifts the observed Q as:

$$Q_{true} = \frac{4}{\lambda^2}\,\sin^2\!\left(\frac{2\theta_{obs} - \Delta_{zero}}{2}\right)$$

Linearising about $\Delta_{zero} = 0$ gives the zero-error design-matrix column element:

$$\frac{\partial Q}{\partial \Delta_{zero}} = -\frac{2\sin(2\theta_{obs})}{\lambda^2}\cdot\frac{\pi}{180}$$

where the $\pi/180$ factor keeps $\Delta_{zero}$ in degrees in the fit.
A two-round strategy is used to handle non-trivial zero shifts robustly:
- Round 1: Peak assignment is done assuming $\Delta_{zero}=0$. The full weighted LS system (cell + zero) is solved to obtain a first estimate $\Delta_{zero}^{(1)}$.
- Round 2: Peak assignment is repeated, this time correcting the observed $Q$ values by $\Delta_{zero}^{(1)}$. The LS system is solved again on the new peak set. This dramatically improves robustness when the zero shift is large enough to cause mis-assignments in Round 1.
The same two-round strategy is applied to both GPU candidates (monoclinic/triclinic) and CPU candidates (cubic/tetragonal/hexagonal/orthorhombic).
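The zero-error column element implied by this model ($2\theta_{true} = 2\theta_{obs} - \Delta_{zero}$, with $\Delta_{zero}$ carried in degrees) can be sketched as (illustrative name, not Brutus code):

```javascript
// ∂Q/∂Δzero for Q = (4/λ²)·sin²((2θ_obs − Δzero)/2), evaluated at
// Δzero = 0; the π/180 factor converts the degree-valued parameter.
function zeroColumnElement(twoThetaDeg, lambda) {
  const twoTheta = twoThetaDeg * Math.PI / 180;
  return -(2 / (lambda * lambda)) * Math.sin(twoTheta) * (Math.PI / 180);
}
```

A central finite difference of the shifted-Q model reproduces this derivative, which is a quick way to verify the sign convention before wiring the column into the design matrix.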
Minimum Peaks Required for Acceptance
A solution is accepted only if a minimum number of observed peaks can be indexed. These thresholds are hard-coded and slightly exceed the number of free parameters to ensure the fit is over-determined:
| System | Free parameters | Min indexed peaks |
|---|---|---|
| Cubic | 1 | 4 |
| Tetragonal / Hexagonal | 2 | 5 |
| Orthorhombic | 3 | 6 |
| Monoclinic | 4 | 7 |
| Triclinic | 6 | 7 |
Standard Deviation Propagation
The covariance matrix $V = \sigma^2 (M^T W M)^{-1}$ (with $\sigma^2 = SSR/\nu$, $\nu$ = degrees of freedom) is propagated to standard deviations of the cell parameters using analytical derivatives:
- Cubic: $\sigma_a = \tfrac{1}{2} a^3 \sqrt{V_{AA}}$.
- Tetragonal / Hexagonal: $\sigma_a = \tfrac{1}{2} a^3 \sqrt{V_{AA}}$, $\sigma_c = \tfrac{1}{2} c^3 \sqrt{V_{CC}}$.
- Orthorhombic: Diagonal terms $\sigma_a, \sigma_b, \sigma_c$ analogously.
- Monoclinic: Analytical partial derivatives of $\beta$ w.r.t. the fitted parameters $A, C, D$, including off-diagonal covariance terms $V_{AC}, V_{AD}, V_{CD}$.
- Triclinic: A full numerical Jacobian $\partial(\text{cell})/\partial(\text{fit params})$ via central finite differences (adaptive step size $\approx 10^{-5}$ relative), propagated as $\sigma_{x_i}^2 = \sum_{j,k} J_{ij} V_{jk} J_{ik}$.
- Zero correction: $\sigma_{zero} = \sqrt{|V_{zz}|} \times (180/\pi)$, appended to the errors regardless of system.
Computational Limits & The Scale of Search
Because Brutus performs a brute-force combinatorial search, the number of trial cells can become astronomical. While modern GPUs are incredibly fast, there are hard limits defined by both hardware (WebGPU specifications) and practical time constraints.
The Hierarchy of Limits
The "Total Trials" displayed in the application is:

$$\text{Total Trials} = \underbrace{C(N_{peaks}, k)}_{\text{Peak Combos}} \times \underbrace{C(N_{hkl}, k)}_{\text{HKL Combos}} \times k!$$
The HKL Combos count is the critical hardware constraint. It maps directly to
the number of GPU threads dispatched. Since WebGPU uses 32-bit unsigned integers (u32) for
thread indexing, this value must remain below $2^{32}-1 \approx 4.29 \times 10^9$.
Brutus automatically calculates this limit based on your HKL Basis Size and prevents the search
from starting if you exceed it.
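The per-system "walls" quoted in the breakdown below can be reproduced with a short sketch (`comb` and `maxBasisSize` are illustrative names; the floating-point binomials stay exact below $2^{53}$):

```javascript
// Binomial coefficient C(n, k), exact in doubles for the sizes used here.
function comb(n, k) {
  let r = 1;
  for (let i = 1; i <= k; i++) r = r * (n - k + i) / i;
  return Math.round(r);
}

// Largest N_hkl whose C(N_hkl, k) still fits in a u32 thread index.
function maxBasisSize(k, limit = 2 ** 32 - 1) {
  let n = k;
  while (comb(n + 1, k) <= limit) n++; // C(n, k) grows monotonically in n
  return n;
}
```

Running it confirms the limits quoted for each system: 123 for triclinic, 568 for monoclinic, and 2,954 for orthorhombic.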
System-Specific Breakdown
1. Triclinic ($k=6$)
- HKL Limit: The 4.29 Billion wall is reached at $N_{hkl} = 123$. ($C(123,6) \approx 4.24 \times 10^9$).
- Permutations: $6! = 720$.
- Peak Multiplier: Using 20 peaks: $C(20,6) = 38,760$.
- Maximum Possible Trials: $$ 4.24 \times 10^9 \times 38,760 \times 720 \approx 1.18 \times 10^{17} \quad (118 \text{ Quadrillion}) $$
2. Monoclinic ($k=4$)
- HKL Limit: The wall is reached at $N_{hkl} = 568$. ($C(568,4) \approx 4.29 \times 10^9$).
- Permutations: $4! = 24$.
- Peak Multiplier: Using 20 peaks: $C(20,4) = 4,845$.
- Maximum Possible Trials: $$ 4.29 \times 10^9 \times 4,845 \times 24 \approx 5 \times 10^{14} \quad (500 \text{ Trillion}) $$
3. Orthorhombic ($k=3$)
- HKL Limit: The wall is reached at $N_{hkl} = 2,954$. ($C(2954,3) \approx 4.29 \times 10^9$).
- Permutations: $3! = 6$.
- Peak Multiplier: Using 20 peaks: $C(20,3) = 1,140$.
- Maximum Possible Trials: $$ 4.29 \times 10^9 \times 1,140 \times 6 \approx 2.9 \times 10^{13} \quad (29 \text{ Trillion}) $$
Practical Implications
Is the "Trials" number real?
Yes. It represents the total number of mathematical systems the GPU would solve. However,
most invalid cells are rejected after checking only the first 2 or 3 peaks ("Fail-Fast" optimisation),
meaning the GPU rarely performs the full calculation for every trial. The effective throughput
is substantially higher than the raw theoretical number implies.
Can you actually run 120 Quadrillion trials?
Technically yes — the Safe Batching system prevents your computer from crashing. However, even at
200 Million trials/sec, 120 Quadrillion trials would take ~19 years to finish.
Practical Limit: For a search to finish in a "coffee break" time frame (5–10 minutes), keep the Total Trials count under 100–500 Billion. For modern desktop GPUs (e.g. Nvidia RTX series), speeds of 500 M – 2 B cells/sec are typical for Orthorhombic; Triclinic is slower due to the larger $k$ and the more complex 6×6 linear solve.
Tuning the Search
If you obtain no solutions, or too many, consider the following adjustments.
If no solutions (or only poor ones) are found
- Re-examine the peak list. This is the dominant failure mode. Check that the first 10–15 peaks belong to a single phase, are free of impurities, and have accurate $2\theta$ positions.
- Relax `2θ Error`. If this value is too strict for your data resolution, valid solutions may be discarded.
- Increase `Max Volume`. The true cell may be larger than initially expected.
- Increase the GPU parameters. Try increasing `HKL Basis Size` (e.g. from 80 to 120 for Tri) or `Peaks to Combine` (e.g. from 7 to 9 for Mono). This significantly increases search time but covers more combinations.
- Increase `FoM Tolerance`. If there is a significant zero-shift, valid trial cells may have large pre-refinement errors and be discarded. Raise `FoM Tolerance` to 5–10 and increase the Buffer size accordingly.
If you get too many solutions (GPU buffer fills)
The GPU buffer is limited (default 50,000). If it fills up, the search stops early ("Buffer Full"). In that case:
- Tighten the FoM Tolerance (e.g. from 3 to 1.5). This discards more candidates on the GPU, so fewer reach the CPU buffer.
- Lower `Max Volume` if you can constrain the cell size further.
- Increase the Buffer size (e.g. from 50 to 200 kCells) if you suspect the correct solution is present but ranked below the buffer cut-off. Note that larger buffers mean more CPU work during post-processing.
- Tighten `2θ Error` if your data quality warrants it; this also tightens the Q-tolerance used in matching.
Evaluating Solutions
The indexing search usually produces several candidate cells. Brutus keeps at most the best 50, ranked by M(20), after the search. Selecting the correct one requires interpreting figures of merit and checking the refined fit visually.
de Wolff Figure of Merit: M(20) / M(N)
The primary ranking indicator is the de Wolff Figure of Merit, M(N), calculated from the first $N$ observed reflections (typically $N = 20$, or the full peak list if fewer than 20 peaks are available). It quantifies both positional accuracy and completeness:

$$M_N = \frac{Q_N}{2\,\langle|\Delta Q|\rangle\, N_{calc}}$$

The implementation uses:
- $Q_N = 1/d_N^2$ — the Q-value of the $N$-th observed line.
- $\langle|\Delta Q|\rangle$ — the mean absolute deviation between observed and calculated Q, averaged over all successfully indexed lines within $Q_N$.
- $N_{calc}$ — the number of all calculated (allowed) reflections with $Q \leq Q_N \times 1.05$ (the factor 1.05 provides a small buffer). This penalises cells that generate many unobserved calculated lines.
A high M(N) therefore requires both a small average positional error (large 1/$\langle|\Delta Q|\rangle$) and high completeness (small $N_{calc}$ relative to $N$). Both M(20) (from the first 20 peaks) and M(all) (from the full peak list) are computed and stored. The M(20) value is used for ranking; M(all) appears in the PDF report.
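As a toy restatement of the definition (assuming the three ingredients above have already been computed):

```javascript
// de Wolff figure of merit: M_N = Q_N / (2 · ⟨|ΔQ|⟩ · N_calc).
// qN: Q of the N-th observed line; meanAbsDeltaQ: average |Q_obs − Q_calc|
// over the indexed lines; nCalc: calculated lines up to Q_N (×1.05 buffer).
function deWolffM(qN, meanAbsDeltaQ, nCalc) {
  return qN / (2 * meanAbsDeltaQ * nCalc);
}
```

For example, a cell whose 20th line sits at $Q = 0.2\,\text{Å}^{-2}$, with an average $|\Delta Q|$ of $2\times10^{-4}$ and 25 calculated lines in that range, scores M(20) = 20, i.e. "very likely correct" in the table below.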
| M(20) value | Interpretation |
|---|---|
| > 20 | Very likely correct. A cell with both high M(20) and plausible chemistry is almost certainly the true solution. |
| > 10 | Likely correct, provided the cell volume is chemically plausible. |
| 5–10 | Plausible; requires further inspection. Check all peaks visually. |
| < 5 | Probably spurious; treat with great caution. |
A minimum M(20) of 2.0 is required for a solution to be retained at all. Solutions below this threshold are silently discarded.
F(N) Figure of Merit
As a complementary metric, Brutus computes the F(N) figure of merit (Smith & Snyder, 1979). While M(N) is expressed in Q-space, F(N) uses $2\theta$ directly:
- $N$ — number of observed lines used.
- $\langle|\Delta(2\theta)|\rangle$ — mean absolute difference between observed and calculated $2\theta$ for those lines.
- $N_{calc}$ — number of theoretical reflections (observed or not) up to the $2\theta$ of the $N$-th observed line.
A high F(20) indicates a precise fit with low average angular error. A solution with both high M(20) and high F(20) is highly reliable. Both F(20) and F(all) are reported in the PDF.
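The Smith–Snyder definition is $F_N = N \,/\, (\langle|\Delta(2\theta)|\rangle\, N_{calc})$. A short illustrative sketch (names are ours, not the Brutus implementation):

```python
def f_of_n(two_theta_obs, two_theta_calc, n_calc):
    """Smith-Snyder figure of merit F(N) = N / (<|d(2theta)|> * N_calc)."""
    n = len(two_theta_obs)
    mean_abs_dt = sum(abs(o - c) for o, c in zip(two_theta_obs, two_theta_calc)) / n
    return n / (mean_abs_dt * n_calc)

# 20 lines with a constant 0.01 deg error and 25 possible reflections:
obs = [5.0 + i for i in range(20)]
calc = [t + 0.01 for t in obs]
print(round(f_of_n(obs, calc, 25), 1))   # -> 80.0
```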
Least-Squares Refinement and Standard Deviations
For each promising candidate, Brutus performs the two-round weighted least-squares refinement described in the Refinement section. The resulting cell parameters are reported with standard deviations propagated from the covariance matrix.
When Refine Zero-Point Error is enabled, the reported zero correction ($\Delta_{zero}$)
represents a systematic misalignment of the diffractometer. A large value (e.g. >0.05° for a lab
instrument) may indicate a calibration issue and should be investigated independently.
Duplicate Suppression
To avoid reporting multiple equivalent cells, Brutus generates a canonical "key" for each refined cell: the system name plus the three axis lengths rounded to two decimal places, and (for monoclinic/triclinic) the angles. If a new solution matches an existing key, it replaces the existing entry only if its M(20) is higher.
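The keying scheme described above can be sketched as follows (a toy illustration; the exact key format and names in Brutus may differ):

```python
def cell_key(system, a, b, c, alpha=None, beta=None, gamma=None):
    """Canonical key: system name plus axes rounded to two decimal places;
    angles are included only for monoclinic/triclinic cells."""
    parts = [system, f"{a:.2f}", f"{b:.2f}", f"{c:.2f}"]
    if system in ("monoclinic", "triclinic"):
        parts += [f"{x:.2f}" for x in (alpha, beta, gamma) if x is not None]
    return "_".join(parts)

best = {}  # key -> (m20, cell)

def submit(system, cell, m20):
    key = cell_key(system, *cell)
    # Replace an existing entry only if the new M(20) is higher.
    if key not in best or m20 > best[key][0]:
        best[key] = (m20, cell)

submit("orthorhombic", (10.983, 12.852, 15.740), 31.2)
submit("orthorhombic", (10.984, 12.851, 15.741), 28.7)  # same key after rounding
print(len(best))   # -> 1 (the M(20) = 31.2 entry survives)
```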
Space Group Analysis
After a high-quality unit cell is obtained, Brutus can suggest likely space groups based on systematic absences. This serves as input for subsequent structure solution or Rietveld refinement.
Note: For some space groups that have different origin choices but identical extinction rules (e.g. Pmmn), both settings appear in the list. This is intentional — they are not distinguishable by powder diffraction alone, and the reminder is useful for subsequent Rietveld refinement.
Method
- Generate unique reflections. Using the refined cell, Brutus computes a complete list of theoretical reflections up to the maximum observed $2\theta$, applying Friedel's law to keep only symmetry-unique $(hkl)$ indices (e.g. $l > 0$, or $l=0,\,k > 0$, or $l=k=0,\,h > 0$ for triclinic).
- Index observed peaks. All peaks in the curated list are matched against the theoretical pattern using the Q-space tolerance.
- Build a high-confidence subset. To avoid ambiguity from overlapping reflections, Brutus retains only peaks for which no other theoretical $(hkl)$ lies within the 2θ Error window. This "unambiguous" subset is used exclusively for extinction analysis.
- Determine centering and extinctions. The unambiguous $(hkl)$ set is compared against the extinction rules (centering conditions and glide/screw symmetry conditions) for all candidate space groups. Brutus classifies violations as:
- Hard violations: an unambiguous, real (non-Kα2) reflection breaks the rule.
- Soft violations: only Kα2-suspect peaks break the rule. Space groups with only soft violations are retained in the list with a warning.
- The reflection class (h00, 0k0, hk0, hkl, etc.) is used to apply the correct subset of conditions for that zone.
- Rank space groups. Based on the crystal system and detected centering, the internal space-group database is filtered. Each space group is assigned a (hard) violation count and ranked accordingly. The conventional name of the space group (consistent with the orientation found by the program) is displayed.
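The centering part of the extinction check uses standard reflection conditions. A minimal sketch of counting hard and soft violations (illustrative only; the rule table and function names are ours, and glide/screw conditions are omitted):

```python
# Standard centering conditions: a reflection (h,k,l) is allowed if the rule holds.
CENTERING_RULES = {
    "P": lambda h, k, l: True,
    "I": lambda h, k, l: (h + k + l) % 2 == 0,
    "F": lambda h, k, l: h % 2 == k % 2 == l % 2,   # h,k,l all even or all odd
    "C": lambda h, k, l: (h + k) % 2 == 0,
}

def count_violations(hkls, centering, ka2_suspect=()):
    """Count hard/soft violations of a centering rule over the unambiguous hkl set.
    'Soft' means only Kalpha2-suspect peaks break the rule (sketch of the scheme
    described in the text; ka2_suspect holds indices of suspect peaks)."""
    rule = CENTERING_RULES[centering]
    hard = soft = 0
    for i, (h, k, l) in enumerate(hkls):
        if not rule(h, k, l):
            if i in ka2_suspect:
                soft += 1
            else:
                hard += 1
    return hard, soft

# All observed lines satisfy h+k+l even, so I-centering has zero hard violations,
# while F-centering is broken by the mixed-parity indices:
obs = [(1, 1, 0), (2, 0, 0), (2, 1, 1), (2, 2, 0), (3, 1, 0)]
print(count_violations(obs, "I"))   # -> (0, 0)
print(count_violations(obs, "F"))   # -> (3, 0)
```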
Interpreting the Output
- 0 hard violations: Ideal case. No unambiguous reflection breaks the extinction rules; these are the strongest candidates.
- 1–2 hard violations: Still plausible; minor violations can result from weak forbidden lines visible above background, or small systematic errors in peak positions.
- Ambiguous peaks (in italics in the PDF report): Peaks excluded from extinction analysis due to potential overlap are printed in italics to distinguish them from the high-confidence subset.
- Soft-violation warnings: Space groups that are only challenged by Kα2-suspect reflections are flagged. They may still be the correct space group; strip Kα2 and re-run to resolve the ambiguity.
Advanced Topics: Enhanced Search and Sieving
Beyond the core indexing routine, Brutus applies several "fishing" strategies and reduction steps
to improve robustness and simplify the set of final solutions. These are applied in
findTransformedSolutions() after the main search completes.
1. Niggli Cell Reduction & Symmetry Upgrade
For each solution, Brutus computes the Niggli reduced cell — the standardised, most compact primitive cell for the lattice. The conventional (possibly centred) cell is first reduced to primitive using the detected centering (I, F, C, etc.), and then the Niggli algorithm is applied.
After reduction, the code calls getSymmetry() with a slightly loose tolerance (0.05 Å / 0.05°)
to detect pseudo-symmetry: a cell found as monoclinic might, after Niggli reduction, reveal
all angles equal to 90° and two equal axes — i.e. it is actually tetragonal. If the detected
symmetry is higher than the search symmetry (and the user has selected that higher-symmetry system),
a new trial cell is constructed with the correct symmetry and re-refined and re-scored.
Niggli cells are also used for:
- Database searching: Different conventional cells describing the same lattice reduce to the same Niggli cell, making it a canonical key for identification.
- Standardised reporting: The Niggli cell is included in the PDF report for each major candidate, providing an unambiguous description for cross-referencing.
2. Matrix-Based Cell Transformations
A set of seven crystallographic transformation matrices $P$ is applied to each candidate cell. Each transforms the real-space metric tensor $G \to P^T G P$, generating a new lattice description. The matrices include:
- Face-centred (F) → primitive: $P = \tfrac{1}{2}\bigl[(0,1,1),(1,0,1),(1,1,0)\bigr]$
- Body-centred (I) → primitive: $P = \tfrac{1}{2}\bigl[(-1,1,1),(1,-1,1),(1,1,-1)\bigr]$
- C-centred → primitive: $P = \tfrac{1}{2}\bigl[(1,1,0),(-1,1,0),(0,0,2)\bigr]$
- Axis halvings (a/2, b/2, c/2) and one C→P transformation.
The resulting cell is classified by symmetry and, if the system is among those selected by the user, sent to refinement and scoring.
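The metric-tensor transformation can be verified with a small sketch: applying the F→P matrix to a cubic F cell with $a = 4$ Å should yield the primitive rhombohedral cell with $a' = a/\sqrt{2}$ and $\alpha' = 60°$ (pure-Python illustration under those assumptions, not the Brutus code):

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def metric_tensor(a, b, c, al, be, ga):
    """Real-space metric tensor G from cell parameters (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (al, be, ga))
    return [[a * a,      a * b * cg, a * c * cb],
            [a * b * cg, b * b,      b * c * ca],
            [a * c * cb, b * c * ca, c * c     ]]

# Face-centred -> primitive transformation matrix (from the list above).
P = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]

G = metric_tensor(4.0, 4.0, 4.0, 90, 90, 90)   # cubic F, a = 4 Angstrom
Gp = matmul(transpose(P), matmul(G, P))        # G' = P^T G P

a_p = math.sqrt(Gp[0][0])
alpha_p = math.degrees(math.acos(Gp[1][2] / (math.sqrt(Gp[1][1]) * math.sqrt(Gp[2][2]))))
print(round(a_p, 4), round(alpha_p, 1))        # -> 2.8284 60.0
```

Note that $\det P = 1/4$, so the primitive cell has a quarter of the F-cell volume, consistent with the four lattice points of face centering.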
3. HKL Divisor Analysis (Sub-cell Detection)
The list of indexed $(hkl)$ values for each solution is inspected for common divisors. The GCDs of all $|h|$ values, all $|k|$ values, and all $|l|$ values are computed separately. If any GCD exceeds 1 (e.g. all $h$ values are even), the program tests a shrunken cell with the corresponding axis divided by that GCD. This detects cases where the indexing algorithm found a super-cell of the true cell.
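The divisor test reduces to per-axis GCDs over the indexed indices. A sketch (function name is ours):

```python
from math import gcd
from functools import reduce

def axis_divisors(hkls):
    """GCD of |h|, |k|, |l| across all indexed reflections; a GCD > 1 on an
    axis suggests the trial cell is a super-cell along that axis."""
    gh = reduce(gcd, (abs(h) for h, k, l in hkls))
    gk = reduce(gcd, (abs(k) for h, k, l in hkls))
    gl = reduce(gcd, (abs(l) for h, k, l in hkls))
    return gh, gk, gl

# All h even -> the a axis of the trial cell is twice the true one:
hkls = [(2, 1, 0), (4, 0, 1), (2, 2, 2), (0, 1, 3)]
print(axis_divisors(hkls))   # -> (2, 1, 1)
```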
4. Orthorhombic–Hexagonal Relationship
A hexagonal lattice can sometimes be described as C-centred orthorhombic with $b/a \approx \sqrt{3}$ (or permutations). All orthorhombic solutions are checked against this condition (within 3% tolerance), and potential hexagonal equivalents are generated and evaluated.
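A sketch of the $b/a \approx \sqrt{3}$ test (the 3% tolerance follows the text; the function name is ours, and only the axis-ratio check is shown):

```python
import math

def hexagonal_candidate(a, b, tol=0.03):
    """True if an orthorhombic axis pair matches the C-centred description of a
    hexagonal lattice (b/a ~ sqrt(3)) within a relative tolerance. Both axis
    orders are tried, mirroring the 'or permutations' remark in the text."""
    for x, y in ((a, b), (b, a)):
        if abs(y / x - math.sqrt(3)) / math.sqrt(3) <= tol:
            return True
    return False

print(hexagonal_candidate(5.00, 8.68))   # -> True  (8.68/5.00 = 1.736 ~ sqrt(3))
print(hexagonal_candidate(5.00, 7.00))   # -> False
```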
5. Swap Fishing for Ambiguity
For each high-ranking solution, Brutus re-examines the first few low-angle peaks. Pairs of peaks that are very close in angle (the most common source of mis-assignment in the initial brute-force step) are identified, and a new hypothesis is created by swapping their HKL labels. The swapped cell is fully refined and rescored. If it yields a higher M(20), it is retained.
6. Final De-Duplication (Sieving)
After all transformations and re-refinements, Brutus applies a final de-duplication step. If two solutions have volumes within 1% of each other:
- The one with higher symmetry is preferred (e.g. orthorhombic beats monoclinic).
- If their symmetry is identical, the one with higher M(20) is kept.
- Solutions with M(20) values within a small tolerance (ΔM(20) < 0.05) are considered equivalent in quality and the first one found is retained.
The final retained set contains at most 50 solutions, ranked by M(20).
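The sieving rules can be sketched as follows (a simplified illustration; the solution records and their discovery ordering are our assumptions, and the refinement context is omitted):

```python
def sieve(solutions, vol_tol=0.01, m20_tol=0.05):
    """Final de-duplication sketch. Each solution: (volume, symmetry_rank, m20),
    where a larger symmetry_rank means higher symmetry (e.g. orthorhombic >
    monoclinic). Input is assumed to be in discovery order, so near-ties in
    M(20) keep the first one found."""
    kept = []
    for sol in solutions:
        vol, sym, m20 = sol
        for i, (kv, ks, km) in enumerate(kept):
            if abs(vol - kv) / kv <= vol_tol:           # same volume within 1%
                if sym > ks or (sym == ks and m20 > km + m20_tol):
                    kept[i] = sol                        # new solution wins
                break                                    # duplicate handled
        else:
            kept.append(sol)                             # genuinely new cell
    return kept

sols = [(500.0, 2, 15.0),    # monoclinic
        (501.0, 3, 12.0),    # orthorhombic, same volume -> replaces it
        (502.0, 3, 12.02)]   # same symmetry, dM(20) < 0.05 -> first is kept
print(sieve(sols))           # -> [(501.0, 3, 12.0)]
```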
Troubleshooting & FAQ
Error: "WebGPU not supported in this browser"
This error usually indicates a software or configuration issue rather than a lack of hardware capability. Even integrated graphics (e.g. Intel UHD) support WebGPU if configured correctly.
- Unsupported Browser: WebGPU is currently best supported on Google Chrome (v113+) and Microsoft Edge. Firefox has experimental support. Safari supports a subset of WebGPU.
- Outdated Drivers: Old graphics drivers are frequently blocklisted by browsers to prevent instability. Update your GPU drivers (Intel, NVIDIA, or AMD) to the latest version via the manufacturer's website or Windows Update.
- Hardware Acceleration Disabled: Check your browser settings (usually under Settings > System) and ensure "Use graphics acceleration when available" is turned ON. Restart the browser afterwards.
- Linux: On Linux, WebGPU is often disabled by default. You may need to manually enable flags in chrome://flags. Search for "Vulkan" and "WebGPU" and set them to Enabled.
To verify: Type chrome://gpu in your address bar and look for "WebGPU".
If it says "Software only" or "Disabled", the GPU search will not function.
Why were no solutions found?
- Poor peak list: This is the most common cause. Ensure the list is accurate, impurity peaks are removed, and positions are well refined. A minimum of 15–20 clean peaks is recommended. For low-symmetry systems, the first 10 peaks (largest d-spacings) must be especially reliable.
- Incorrect parameters: Check the wavelength, the selected Radiation Preset, and the Max Volume.
- GPU parameters too restrictive: The true solution might require HKLs or peak combinations beyond the default HKL Basis Size or Peaks to Combine. Try increasing these.
- Large zero-point error: Brutus corrects moderate misalignments, but very large zero errors (e.g. >0.2°) may cause all trial cells to fail the pre-filter. Increase FoM Tolerance and try again.
- Sample is a mixture: Successful indexing requires peaks from a single crystalline phase.
Why is M(20) low although the fit looks good?
- Sub-multiple or super-multiple cell: The found cell may be a multiple or sub-cell of the true one. It indexes only a subset of peaks and is penalised for being too sparse or too dense in calculated reflections. Look for a related cell with higher M(20) in the solutions list.
- High error / low resolution: The 2θ Error setting may be too strict for broad peaks. Try slightly increasing it to widen the Q-tolerance.
- Spurious solution: Random solutions can fit a few peaks by coincidence. Always check whether all major observed peaks are reproduced by the calculated pattern. If significant peaks are missed, the solution is incorrect regardless of M(20).
The progress bar is stuck / search seems frozen
- For Tetragonal and Hexagonal searches with dense patterns, a large fraction of trial cells pass the initial filter and require full refinement. The search is still running. Watch the trials-per-second counter in the status bar.
- For GPU searches, the batch system may be processing a very large number of candidates in the CPU post-processing phase (Niggli reduction, M(20) calculation). This can take several minutes if the GPU buffer is large (e.g. 200 kCells).
- If the browser becomes unresponsive for more than ~30 seconds with no progress counter movement, the GPU driver may have reset. Close and reopen the tab and try with a smaller HKL Basis Size.
Test Files
The GitHub repository contains several example datasets:
- Monoclinic_test_1.xy — Synchrotron XRD, monoclinic, monochromatic λ = 0.79764 Å. Lattice parameters: a = 19.877 Å, b = 8.196 Å, c = 11.243 Å, β = 106.08°.
- C61Br2_079764.XY — Dibromo-methano fullerene measured at ESRF, λ = 0.79764 Å. Contains an impurity peak at ~16.24° 2θ. Expected solution: cubic I with a ≈ 18.92 Å.
- SPDDRR1_sample2_0692.xy — Round-robin data measured at Daresbury, λ = 0.692 Å. Expected cell: orthorhombic with a = 10.983 Å, b = 12.852 Å, c = 15.740 Å.
- SPDDRR1_sample2_Cu.xy — Same sample as above, measured with a Cu Kα laboratory source.
- SPDDRR1_zhu1_Cu.xy — Round-robin sample 1, measured with Cu Kα; likely monoclinic with a = 7.672 Å, b = 9.624 Å, c = 7.076 Å, β = 106.24°.
- PbSO4.xra and FAP.xra — Laboratory Cu Kα samples representative of orthorhombic and hexagonal structures (datafiles taken from GSAS-2 tutorials).
- P-1_sim_5_6_7_86_91_96.txt — Simulated powder pattern for a P$\bar{1}$ lattice, λ = 1.7 Å, a = 5.0 Å, b = 6.0 Å, c = 7.0 Å, α = 86.0°, β = 91.0°, γ = 96.0° (V = 208.29 ų).
Found a Bug?
If you suspect you have found a bug (e.g. the interface freezes unexpectedly or valid data crashes the GPU search), please report it.
Note: You will need a free GitHub account. The form will ask for your browser version and GPU model to help reproduce the issue.
References
Brutus was developed by Nita Dragoe at Université Paris-Saclay (2024–2025) as a successor to the earlier
program Powder (1999–2000). If you use Brutus in your work, please cite:
https://doi.org/10.13140/rg.2.2.18182.84806
For further background on the methodology, the following references are recommended:
- M(N) figure of merit: de Wolff, P. M. (1968). "A Simplified Criterion for the Reliability of a Powder Pattern Indexing." Journal of Applied Crystallography 1, 108–113.
- F(N) figure of merit: Smith, G. S. & Snyder, R. L. (1979). "F(N): A Criterion for Rating Powder Diffraction Patterns and Evaluating the Reliability of Powder-Pattern Indexing." Journal of Applied Crystallography 12, 60–65.
- General powder diffraction text: Klug, H. P. & Alexander, L. E. (1974). X-Ray Diffraction Procedures for Polycrystalline and Amorphous Materials, 2nd ed. New York: Wiley-Interscience.
- Kα2 stripping algorithm: Rachinger, W. A. (1948). "A Correction for the α1 α2 Doublet in the Measurement of Widths of X-ray Diffraction Lines." Journal of Scientific Instruments 25, 254.
- Savitzky–Golay smoothing: Savitzky, A. & Golay, M. J. E. (1964). "Smoothing and Differentiation of Data by Simplified Least Squares Procedures." Analytical Chemistry 36(8), 1627–1639.
- Alternative indexing approaches:
  - Ito, T. (1949). "A General Powder X-ray Photography." Nature 164, 755–756.
  - Werner, P.-E., Eriksson, L. & Westdahl, M. (1985). "TREOR, a Semi-exhaustive Trial-and-Error Powder Indexing Program for All Symmetries." Journal of Applied Crystallography 18, 367–370.
  - Visser, J. W. (1969). "A Fully Automatic Program for Finding the Unit Cell from Powder Data." Journal of Applied Crystallography 2, 89–95.
  - Le Bail, A. (2004). "Monte Carlo Indexing with McMaille." Powder Diffraction 19(3), 249–254.
  - Boultif, A. & Louër, D. (2004). "Powder Pattern Indexing with the Dichotomy Method." Journal of Applied Crystallography 37, 724–731.
- Previous software: Dragoe, N. (2001). "PowderV2: A Suite of Applications for Powder X-Ray Diffraction Calculations." Journal of Applied Crystallography 34, 535.