Physics Engine
The reactor simulation in reactor_sim.py is the core of the environment. It implements a first-principles chemical engineering model of an ICI Low-Pressure methanol synthesis reactor with 349 lines of physics code.
Simulation Pipeline
Every call to env.step(action) executes this pipeline:
1. Rate Limiting → 2. Recycle Loop → 3. Purge Gas → 4. Partial Pressures → 5. SRK Fugacity → 6. Kinetic Model → 7. RK4 Energy Balance → 8. Multi-Bed Quench → 9. Catalyst Deactivation → 10. Condensation → 11. Byproducts → 12. Economics → 13. Noise
1. Rate Limiting
Agent actions are rate-limited to prevent unrealistic instantaneous changes:
- Feed rates: max ±20%/step
- Cooling water: max ±30%/step
- Compressor: max ±15%/step
- Valve positions: max ±25%/step
This models the physical reality that valves have finite travel time and pumps have ramp-up delays.
2. Recycle Loop
Single-pass conversion in methanol synthesis is only ~5%. The unconverted gas is recycled:
- Fresh feed mixed with recycled gas at ratio
recycle_ratio(default 3.5) - Total flow to reactor = fresh feed × (1 + recycle_ratio)
- Inert gases (N₂, CH₄, Ar) accumulate in the loop over time
3. Purge Gas Model
Inerts must be periodically purged to prevent buildup:
purge_valve_positioncontrols the fraction of recycle gas vented- Purge reduces inert fraction but wastes valuable H₂ + CO
- Inert fraction equilibrium depends on feed purity, purge rate, and conversion
4. Partial Pressures
Species partial pressures calculated using Dalton's law:
$$P_i = y_i \cdot P_{\text{total}}$$
where $y_i$ is the mole fraction of species $i$ in the reactor gas mixture (H₂, CO, CO₂, CH₃OH, H₂O, inerts).
5. SRK Fugacity Corrections
The Soave-Redlich-Kwong cubic equation of state corrects for non-ideal gas behavior at the high pressures (50–100 bar) in the reactor:
$$P = \frac{RT}{V-b} - \frac{a \alpha}{V(V+b)}$$
Fugacity coefficients $\phi_i$ are calculated for each species:
| Species | Tc (K) | Pc (bar) | ω | Typical φ at 250°C, 80 bar |
|---|---|---|---|---|
| H₂ | 33.2 | 13.0 | -0.216 | ~1.04 |
| CO | 132.9 | 35.0 | 0.048 | ~1.01 |
| CO₂ | 304.2 | 73.8 | 0.225 | ~0.92 |
| CH₃OH | 512.6 | 80.9 | 0.565 | ~0.72 |
| H₂O | 647.1 | 220.6 | 0.344 | ~0.85 |
These corrections matter — without them, predicted methanol yield is ~15% too high because ideal gas law overestimates reactant availability at high pressure.
6. Kinetic Models (5 selectable)
The environment supports 5 published kinetic models, selectable via reactor_config.json:
LHHW (Default)
Langmuir-Hinshelwood-Hougen-Watson model with adsorption terms:
$$r_1 = \frac{k_1 \cdot P_{CO} \cdot P_{H_2}^2 - P_{CH_3OH}/K_{eq}}{(1 + K_{CO} \cdot P_{CO} + K_{H_2} \cdot P_{H_2}^{0.5})^2}$$
- Pre-exponential: $k_0 = 5.0 \times 10^6$
- Activation energy: $E_a = 80{,}000$ J/mol
- Adsorption constants: $K_{CO} = 2.0$, $K_{H_2} = 0.5$
Graaf 1988
Most experimentally validated model. 3-reaction kinetics from Chem. Eng. Sci. 43(12):
- Separate rate constants for R1 (CO hydrogenation), R2 (CO₂ hydrogenation), R3 (rWGS)
- Includes equilibrium constants for all 3 reactions
- CO and CO₂ adsorption terms from independent measurements
VBF 1996 (Vanden Bussche-Froment)
Optimized for CO₂-rich feeds (green methanol from carbon capture):
- Models only the CO₂ hydrogenation pathway
- Better accuracy when CO₂/CO ratio > 0.5
- Water inhibition term included
Seyfert/BASF
Industrial BASF model with CO₂ inhibition factor:
- CO₂ inhibition: $f = 1/(1 + 0.5 \cdot P_{CO_2}/P_{CO})$
- Simpler than LHHW but calibrated on BASF plant data
- Best for traditional natural gas-based plants
Nestler 2021
Latest model validated against Carbon2Chem demo plant:
- COR-dependent correction factor (Carbon Oxide Ratio)
- Penalty above COR = 0.5 (high CO₂ content)
- Reference: Voss et al. (2022) Chem. Ing. Tech. 94(10)
7. RK4 Energy Balance
The energy balance ODE is integrated using 4th-order Runge-Kutta with 4 sub-steps per timestep:
$$\frac{dT}{dt} = \frac{Q_{\text{rxn}} - Q_{\text{cool}} - Q_{\text{loss}}}{m \cdot c_p}$$
where:
- $Q_{\text{rxn}} = \sum_i r_i \cdot (-\Delta H_i)$ — exothermic heat generation from 3 reactions
- $Q_{\text{cool}} = U \cdot A \cdot (T - T_{\text{coolant}})$ — shell-side heat removal
- $Q_{\text{loss}}$ — heat losses to environment
- $m \cdot c_p$ — thermal mass of reactor + catalyst
The enthalpy of each reaction varies with temperature via Kirchhoff's equation:
$$\Delta H(T) = \Delta H_{298} + \int_{298}^{T} \Delta c_p \, dT$$
8. Multi-Bed Quench (ICI 4-Bed)
The reactor is divided into 4 adiabatic catalyst beds:
| Bed | Catalyst Activity | Typical Exit T | Purpose |
|---|---|---|---|
| 1 | 100% (fresh) | ~265°C | Highest reaction rate, most heat generation |
| 2 | ~95% | ~260°C | Moderate conversion |
| 3 | ~88% | ~255°C | Approaching equilibrium |
| 4 | ~80% | ~250°C | Near-equilibrium, low marginal conversion |
Between each bed, cold fresh syngas (5% of total feed) is injected to quench the temperature by ~15°C. This creates the characteristic sawtooth temperature profile.
9. Catalyst Deactivation
Three-zone model based on Spencer (1999):
| Zone | Temperature | Rate | Mechanism | Reversible? |
|---|---|---|---|---|
| Normal | < 260°C | Very slow | Thermal aging | No (but negligible) |
| Above-optimal | 260–280°C | Moderate | Cu crystallite growth | No |
| Sintering | > 280°C | Fast | Cu particle sintering | No — irreversible |
At 300°C → emergency shutdown. Catalyst cannot be recovered. The agent must learn this hard boundary.
10. Condensation
Crude methanol is condensed from the reactor outlet gas:
- Flash separation at ~30 bar
- 96% methanol recovery
- Remaining 4% stays in gas phase → recycled
- Condensate: ~80% methanol + ~20% water
11. Byproducts
Side reactions produce small amounts of:
- Dimethyl ether (DME): 2CH₃OH → CH₃OCH₃ + H₂O
- Methyl formate: CO + CH₃OH → HCOOCH₃
Selectivity model: methanol selectivity = 99.5–99.8% under normal conditions, drops below 99% at high temperature.
12. Economics
Per-step profit calculation:
$$\text{Profit} = \text{Revenue} - \text{Feed Cost} - \text{Energy Cost} - \text{Cooling Cost}$$
- Revenue = methanol produced × regional spot price
- Feed cost = (H₂ + CO consumed) × gas price
- Energy = compressor power × electricity price
- Cooling = cooling water flow × water cost
Prices are configurable via 10 regional bundles in reactor_config.json.
13. Process Noise
Added per step to simulate real sensor behavior:
| Measurement | Noise | Distribution |
|---|---|---|
| Temperature | ±1°C | Gaussian |
| Reaction rate | ±5% | Multiplicative Gaussian |
| Pressure | ±0.3 bar | Gaussian |
| Cooling water temp | Brownian drift | Random walk |
ReactorState
The internal state object used by the simulation (not directly exposed to agents):
@dataclass
class ReactorState:
temperature: float = 250.0 # °C
pressure: float = 50.0 # bar
feed_rate_h2: float = 4.0 # mol/s
feed_rate_co: float = 2.0 # mol/s
cooling_water_flow: float = 50.0 # L/min
cooling_water_temp: float = 25.0 # °C
catalyst_health: float = 1.0 # 0-1
methanol_produced: float = 0.0 # kg cumulative
reaction_rate: float = 0.0 # mol/s
profit_this_step: float = 0.0 # $
cumulative_profit: float = 0.0 # $
time_step: int = 0
emergency_shutdown: bool = False
# ... + 15 more fields
3 Simultaneous Reactions
| # | Reaction | ΔH₂₉₈ | Equilibrium |
|---|---|---|---|
| R1 | CO + 2H₂ ⇌ CH₃OH | −90.5 kJ/mol | Favored at low T, high P |
| R2 | CO₂ + 3H₂ ⇌ CH₃OH + H₂O | −49.5 kJ/mol | Favored at low T, high P |
| R3 | CO₂ + H₂ ⇌ CO + H₂O | +41.2 kJ/mol | Reverse water-gas shift |
R1 is the primary methanol-producing reaction. R2 also produces methanol but consumes more hydrogen (3 mol vs 2 mol). R3 is endothermic and produces CO from CO₂ — it's a side reaction that can be beneficial (converts CO₂ to the more reactive CO) or harmful (consumes hydrogen).
The equilibrium constant for R1:
$$\ln K_{eq} = \frac{3066}{T} - 10.592$$
At 250°C (523K): $K_{eq} \approx 0.0057$ — small, meaning thermodynamics limits conversion. This is why recycle loops are essential.