Our most recent paper, titled Design and performance of AI agents interfacing with an atomic layer deposition tool, has just come out in Review of Scientific Instruments. It is part of my informally titled series AI beyond the hype, where we are trying to make sense of the capabilities and limitations of AI in the context of materials synthesis.

In this specific work, we show how to integrate an AI agent based on LLMs with an experimental materials synthesis tool, in this case one of our trusty ALD reactors. This work was long in the making, and it would not have been possible without our indefatigable postdocs, who kept the reactor alive through the testing and upgrades.

The paper also includes some preliminary results evaluating agents based on state-of-the-art (as of summer/fall of 2025) LLMs in materials synthesis tasks. In these challenges, a query like “grow 20 cycles of Pd” should yield a specific instruction to run a process in the tool. While the paper focused on ALD, the methodology for testing an AI agent’s ability to do useful things with experimental tools is general.
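To make the idea of a challenge concrete, here is a minimal sketch of how such a test could be scored. It is not the code from the paper: the instruction format (action, material, cycles), the names Challenge, score, and toy_agent, and the exact-match pass criterion are all assumptions made for illustration, and toy_agent is a trivial stand-in for an LLM-based agent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Challenge:
    """One test case: a natural-language query and the instruction we expect."""
    query: str
    expected: dict  # hypothetical instruction format, not the paper's schema


def score(agent: Callable[[str], dict], challenges: list[Challenge]) -> float:
    """Return the fraction of challenges where the agent's output matches exactly."""
    if not challenges:
        return 0.0
    passed = sum(agent(c.query) == c.expected for c in challenges)
    return passed / len(challenges)


def toy_agent(query: str) -> dict:
    """Hypothetical stand-in for an LLM agent; only parses 'grow N cycles of X'."""
    words = query.split()
    return {"action": "run_process", "material": words[-1], "cycles": int(words[1])}


challenges = [
    Challenge("grow 20 cycles of Pd",
              {"action": "run_process", "material": "Pd", "cycles": 20}),
    Challenge("grow 5 cycles of Al2O3",
              {"action": "run_process", "material": "Al2O3", "cycles": 5}),
]

result = score(toy_agent, challenges)
print(f"pass rate: {result:.2f}")
```

In a real evaluation the stand-in would be replaced by a call to the agent under test, and exact matching would likely give way to whatever criterion fits the tool, for example checking that the requested process actually exists on the reactor.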

I am happy to say that no reactor was harmed during the making of this paper. However, our results show that there is still a gap in the capabilities of the AI models, particularly for processes that are not well represented in the scientific literature. I hope that these results spur the development of better models and encourage others to evaluate how well AI performs in their own research settings. After all, we are scientists, so quantifying how well or how poorly methods work is part of what we should do.

I am in the process of getting some of the relevant components online in an open-source repository, hopefully this June.