Benchmarking autonomy in scientific experiments: a hierarchical taxonomy for autonomous large-scale facilities

Le Houx, James ORCID: https://orcid.org/0000-0002-1576-0673 (2026) Benchmarking autonomy in scientific experiments: a hierarchical taxonomy for autonomous large-scale facilities. [Working Paper] (doi:10.48550/arXiv.2601.06978)

PDF (Open Access Preprint)
52284 LE HOUX_Benchmarking_Autonomy_In_Scientific_Experiments_(OA PREPRINT)_2026.pdf - Published Version
Available under License Creative Commons Attribution.
File size: 272kB

Official URL: https://doi.org/10.48550/arXiv.2601.06978

Abstract

The transition from automated data collection to fully autonomous discovery requires a shared vocabulary to benchmark progress. While the automotive industry relies on the SAE J3016 standard, current taxonomies for autonomous science presuppose an owner-operator model that is incompatible with the operational rigidities of Large-Scale User Facilities. Here, we propose the Benchmarking Autonomy in Scientific Experiments (BASE) Scale, a 6-level taxonomy (Levels 0–5) specifically adapted for these unique constraints. Unlike owner-operator models, User Facilities require zero-shot deployment where agents must operate immediately without extensive training periods. We define the specific technical requirements for each tier, identifying the Inference Barrier (Level 3) as the critical latency threshold where decisions shift from scalar feedback to semantic digital twins. Fundamentally, this level extends the decision manifold from spatial exploration to temporal gating, enabling the agent to synchronise acquisition with the onset of transient physical events. By establishing these operational definitions, the BASE Scale provides facility directors, funding bodies, and beamline scientists with a standardised metric to assess risk, define liability, and quantify the intelligence of experimental workflows.

Item Type:	Working Paper
Uncontrolled Keywords:	autonomous experimentation, large-scale user facilities, AI for science, taxonomy, Operational Design Domains (ODD), sim-to-real transfer
Subjects:	Q Science > Q Science (General); Q Science > QA Mathematics > QA75 Electronic computers. Computer science; T Technology > T Technology (General)
Faculty / School / Research Centre / Research Group:	Faculty of Engineering & Science; Faculty of Engineering & Science > School of Computing & Mathematical Sciences (CMS)
Last Modified:	14 Jan 2026 16:46
URI:	https://gala.gre.ac.uk/id/eprint/52284
