<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2026-05-29T03:25:22+00:00</updated><id>/feed.xml</id><title type="html">Random Walk</title><subtitle>On stuff</subtitle><author><name>Angel Yanguas-Gil</name></author><entry><title type="html">How to integrate LLMs with an atomic layer deposition tool</title><link href="/research/2026/05/28/updates.html" rel="alternate" type="text/html" title="How to integrate LLMs with an atomic layer deposition tool" /><published>2026-05-28T00:00:00+00:00</published><updated>2026-05-28T00:00:00+00:00</updated><id>/research/2026/05/28/updates</id><content type="html" xml:base="/research/2026/05/28/updates.html"><![CDATA[<p>Our most recent paper, 
titled <a href="https://doi.org/10.1063/5.0318770">Design and performance of AI agents interfacing with an atomic layer deposition tool</a>, has just come out in
Review of Scientific Instruments.
This is part of my informally titled
series <em>AI beyond the hype</em>, where we are trying to make sense of the capabilities and limitations of AI in the context of materials synthesis.</p>

<p>In this specific work, we show how to integrate an AI agent based on LLMs with an experimental materials synthesis tool, in this case one
of our trusty ALD reactors.
This is a work that was long in the making, and that would not have been possible without our indefatigable postdocs working to keep the reactor alive through the testing and upgrades.</p>

<p>The paper also includes some preliminary results evaluating agents based on state-of-the-art (as of summer/fall of 2025) LLMs in materials synthesis tasks. In these challenges,
a query like “grow 20 cycles of Pd” should yield a specific instruction to run a process in the tool. While the paper focused on ALD,
the methodology for testing an AI agent’s ability to do useful things with experimental tools is general.</p>
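
<p>As a rough illustration of what scoring such a challenge could look like, here is a minimal sketch. It assumes the agent replies with a JSON instruction; the schema (<code>action</code>, <code>material</code>, <code>cycles</code>) and the function name are hypothetical and are not taken from the paper or from the actual tool interface.</p>

```python
import json

# Hypothetical instruction schema: the real tool interface is not described here.
EXPECTED = {"action": "run_process", "material": "Pd", "cycles": 20}


def score_response(raw_response: str, expected: dict) -> bool:
    """Return True only if the agent's JSON output matches the expected instruction."""
    try:
        instruction = json.loads(raw_response)
    except json.JSONDecodeError:
        # Free text or malformed JSON counts as a failed challenge.
        return False
    if not isinstance(instruction, dict):
        return False
    # Compare only the keys the challenge cares about.
    return all(instruction.get(key) == value for key, value in expected.items())


good = '{"action": "run_process", "material": "Pd", "cycles": 20}'
bad = '{"action": "run_process", "material": "Pd", "cycles": 200}'
print(score_response(good, EXPECTED), score_response(bad, EXPECTED))
```

<p>An exact match on a small set of keys is the simplest possible criterion. A real evaluation would probably also need to check units, process names, and safety limits.</p>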

<p>I am happy to say that no reactor was harmed during the making of this paper. However, our results show that there is still a gap in capabilities of the AI models, particularly for processes that are not well represented in the scientific literature. I hope that these results spur the development of better models and encourage others to evaluate how well AI performs in their own research settings. After all, we are scientists, so quantifying how well or poorly methods work is part of what we should do.</p>

<p>I am in the process of getting some of the relevant components online in an open source repository, hopefully this June.</p>]]></content><author><name>Angel Yanguas-Gil</name></author><category term="research" /><summary type="html"><![CDATA[Our most recent paper, titled Design and performance of AI agents interfacing with an atomic layer deposition tool has just come out in Review of Scientific Instruments. This is part of my informally titled series AI beyond the hype, where we are trying to make sense of the capabilities and limitations of AI in the context of materials synthesis.]]></summary></entry><entry><title type="html">Thoughts about science and AI: my own mental model</title><link href="/research/2026/04/05/genesis.html" rel="alternate" type="text/html" title="Thoughts about science and AI: my own mental model" /><published>2026-04-05T00:00:00+00:00</published><updated>2026-04-05T00:00:00+00:00</updated><id>/research/2026/04/05/genesis</id><content type="html" xml:base="/research/2026/04/05/genesis.html"><![CDATA[<p>Over the past few years we have seen an increase in the amount of research activity happening at the
intersection of AI and materials science. Research
on scientific
applications of AI has been going on for many years.
The difference is that now this research is going mainstream, with
multiple sessions in conferences such as the MRS spring
meeting and AI papers appearing in traditional physics,
chemistry, and materials science journals.</p>

<p>In order to make sense of
what’s going on, I try to place AI methods in a space
defined by three different axes:</p>

<ol>
  <li>
    <p><strong>Technology readiness level</strong>: basically, is this something that is AI research (low TRLs) or is it ready to be used in real life (TRL 9)?</p>
  </li>
  <li>
    <p>How <strong>specific</strong> is the model to my research field? Is this something like email, useful but general-purpose, or is this something that has been developed for a specific problem?</p>
  </li>
  <li>
    <p>What is the <strong>visibility</strong> of this AI method? And by that I mean, is it something that is likely to be cited (like a specific model), disclosed (like a tool) or acknowledged (like a facility)?</p>
  </li>
</ol>

<p><img src="/assets/images/ai3d.png" alt="How I break down AI methods in three different axes" style="display:block; margin-left:auto; margin-right:auto; width:50%;" /></p>

<p>Based on this you can put AI works in this 3D space.
At low TRLs sit
the vast majority of papers and projects involving AI for materials science.
These works, which focus on developing or demonstrating new methods, are essentially <em>AI research</em>.
The point is not to be immediately useful to the researcher:
they are essentially methods papers on a materials science topic where the publication itself is
the main outcome.</p>

<p>In this region, whether AI lives up to its hype is largely immaterial to the work itself, affecting primarily the shelf life of many of these publications. One may dispute whether investing a lot
of resources in it is warranted, but
the situation is no different than, say, any other idea going through the research hype curve. Right now, though, communities
are growing in this space, as we can see in conferences, journals, and funding priorities. The range of specificity
and visibility in the research of these papers is really broad.</p>

<p>At the other end of the spectrum, at high TRLs, sits what I call <em>AI for research</em>. These are the methods
that are or will be actively used by researchers who are not necessarily the ones who developed them. Reusability
is the key here. Some of these methods will be general (like using generative AI to accelerate the
development of scientific codes), but most will
be very specialized. The range of visibility is 
a bit skewed towards higher visibility with respect
to more exploratory work. AI methods that broadly
support the work of the researcher are not likely to
be considered AI for research, much like email is
not considered a research tool. AI for research methods tend to be
skewed towards higher specificity for the same reason.</p>

<p>The corner with high TRL, domain-specific AI is
the most intriguing to me, because I think we do not
know yet how impactful AI will turn out to be, how the research enterprise will shape and be shaped
by AI. In fact, we as domain scientists have barely
started to quantify its value and potential gains.
This region is where the aspirational goals of AI meet the reality of research.</p>

<h2 id="what-ifs">What ifs?</h2>

<p>Once you start placing hypothetical tools in this space
you can start asking questions about the potential
impact of AI in our research.</p>

<p>For instance, as a researcher, how should I communicate useful, general-purpose, but low-visibility ways of using AI? If I use an LLM to summarize papers and
classify my sputtering papers, is that something that
should be published in a physics or materials science journal simply because it involves materials? Or should
that become part of how things are done in a group
or a lab?</p>

<p>Here is another example: is the application of general-purpose models to a specific
scientific problem a research goal in itself or is it just a means
to achieve something else, much like an electron microscope? I assume that at some point in history,
using an SEM to view something was noteworthy in itself
until progressively it became commonplace and the novelty
wore off. It seems that right now
we are still in the “I used AI for X” phase, but at some
point we will get over it.</p>

<p>Another way of framing this question is: what areas of this space should materials
science methods papers cover? Is the application of a general model to query a database of materials properties a materials science or an AI paper? Should the scope be
limited to specific, high-visibility parts of the space?</p>

<p>Finally, how do we evaluate the performance or impact of some of the high-visibility, high TRL models being developed right now? If we take a specific AI model to extract
information from X-ray data, how can we quantify how
good it is? Does this type of comparison belong in
a high impact journal?</p>]]></content><author><name>Angel Yanguas-Gil</name></author><category term="research" /><summary type="html"><![CDATA[Over the past few years we have seen an increase in the amount of research activity happening at the intersection of AI and materials science. Research on scientific applications of AI has been going on for many years. The difference is that now this research is going mainstream, with multiple sessions in conferences such as the MRS spring meeting and AI papers appearing in traditional physics, chemistry, and materials science journals.]]></summary></entry><entry><title type="html">Evaluating agents based on reasoning models for process optimization</title><link href="/research/2026/02/14/updates.html" rel="alternate" type="text/html" title="Evaluating agents based on reasoning models for process optimization" /><published>2026-02-14T00:00:00+00:00</published><updated>2026-02-14T00:00:00+00:00</updated><id>/research/2026/02/14/updates</id><content type="html" xml:base="/research/2026/02/14/updates.html"><![CDATA[<p>With agentic AI being essentially everywhere, there is a need to understand how good these agents really are
for research purposes, and in particular for materials synthesis.</p>

<p>In this <a href="https://arxiv.org/abs/2601.09980">preprint on arxiv</a> I explored how agents based on reasoning models
perform in ALD process optimization tasks. The results are both impressive and somewhat mixed. On one hand,
reasoning models were able to understand and execute elementary atomic layer deposition process optimization
tasks. On the other, they sometimes either fail or struggle to find the optimal conditions, which makes them
borderline usable for real-world applications.</p>

<p>For this work I didn’t use a finetuned model, but a commercially available model. The motivation is that these
models are widely used, and therefore there is a lot of value in understanding how they perform. The other
more prosaic reason is that there are limitations to the type of models that we can access. This left out
some good open weight models such as DeepSeek-R1.</p>

<p>For the test, I designed a few ideal and non-ideal self-limited processes that are representative of the type
of ALD processes a researcher is likely to encounter in the wild. For each of these processes, the task
was very simple: find the optimal dose time for the precursor and co-reactant that leads to a saturated
growth per cycle with a process time that is as low as possible. This is very much an idealized version
of the type of optimization that is carried out over and over again.</p>
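
<p>To make the task concrete, here is a minimal sketch of the optimization for a single dose. It assumes a toy ideal process in which the growth per cycle saturates exponentially with dose time. The model, the parameter values, and the function names are illustrative assumptions, not the processes or the code used in the preprint.</p>

```python
import math


def growth_per_cycle(dose_time, gpc_sat=1.0, tau=0.2):
    """Toy ideal self-limited process: growth saturates exponentially with dose time."""
    return gpc_sat * (1.0 - math.exp(-dose_time / tau))


def shortest_saturating_dose(gpc, threshold=0.99, t_max=10.0, tol=1e-4):
    """Bisect for the shortest dose time whose growth reaches threshold * saturated value."""
    target = threshold * gpc(t_max)  # treat the value at t_max as saturated
    low, high = 0.0, t_max
    while high - low > tol:
        mid = 0.5 * (low + high)
        if gpc(mid) >= target:
            high = mid  # already saturated enough: try a shorter dose
        else:
            low = mid  # not saturated yet: need a longer dose
    return high


best = shortest_saturating_dose(growth_per_cycle)
print(f"shortest saturating dose time: {best:.3f} s")
```

<p>Bisection only works here because the toy curve increases monotonically with dose time. A non-ideal process, or the coupled problem with both precursor and co-reactant doses, would need a more careful search, and an agent has to get there through a limited number of noisy experiments rather than free function calls.</p>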

<p>The work is currently under review. As soon as the paper is published I will release the code required to
run these challenges. Hopefully someone can take on the challenge of designing performant agents based on
open weights models that can be run locally.</p>]]></content><author><name>Angel Yanguas-Gil</name></author><category term="research" /><summary type="html"><![CDATA[With agentic AI being essentially everywhere there is the need of understanding how good they really are for research purposes, and in particular for materials synthesis.]]></summary></entry><entry><title type="html">gris: a lightweight parser of reference files in RIS format</title><link href="/python/2026/01/14/ris.html" rel="alternate" type="text/html" title="gris: a lightweight parser of reference files in RIS format" /><published>2026-01-14T00:00:00+00:00</published><updated>2026-01-14T00:00:00+00:00</updated><id>/python/2026/01/14/ris</id><content type="html" xml:base="/python/2026/01/14/ris.html"><![CDATA[<p>I took advantage of the break to go back to a couple of side projects. In particular, one of the goals
I had was updating <a href="https://gris.readthedocs.io/en/latest/index.html">gris</a>, a lightweight Python package
that parses reference files in RIS format that I put together a while back.</p>

<h2 id="a-detour-on-ris-data-files">A detour on RIS data files</h2>

<p>Not sure what reference manager people use these days but, if you use something like Zotero or have
downloaded any reference file from Springer Nature, you most likely have come across a RIS file.</p>

<p>The RIS format is old, so old that I had to use the Wayback Machine to find an original specification.
Now I keep a copy in the gris documentation for reference.</p>

<p>It is also a format that publishers tend to play with.
Many years back, WebOfScience used a RIS format with different tag formats when you downloaded references as
other. Recently, AIP has messed with the format by adding headers that were not part of the
<a href="https://gris.readthedocs.io/en/latest/specification.html">original specification</a>.</p>

<h2 id="gris-and-its-updates">gris and its updates</h2>

<p>The new version of gris no longer breaks when reading files with headers that should not be there.
It now has proper documentation and a bit of testing.</p>

<p>The main purpose of gris is to parse these old files into Python objects that can then be manipulated
or converted into more universal formats, such as JSON. While it is completely unrelated to my daily
work, as it happens it has made its way to <a href="https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0189137">some of my papers</a> as well.</p>
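
<p>For readers who have never looked inside a RIS file, here is a minimal standard-library sketch of the idea. It does not use or reproduce the gris API: the <code>parse_ris</code> function and the sample record are made up for illustration, and the sketch drops multi-line values that a real parser would need to handle.</p>

```python
import json
import re

# RIS lines look like "TY  - JOUR": a two-character tag, two spaces, a dash, a space.
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def parse_ris(text):
    """Parse RIS text into a list of dicts mapping each tag to a list of values."""
    records, current = [], None
    for line in text.splitlines():
        match = TAG_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines and stray headers that are not tagged lines
        tag, value = match.groups()
        if tag == "TY":
            current = {"TY": [value]}  # TY opens a new record
        elif current is None:
            continue  # tagged line outside any record
        elif tag == "ER":
            records.append(current)  # ER closes the record
            current = None
        else:
            current.setdefault(tag, []).append(value)
    return records


sample = """Provider: Example Publisher
TY  - JOUR
AU  - Doe, Jane
AU  - Roe, Richard
TI  - A made-up title
PY  - 2020
ER  - 
"""

records = parse_ris(sample)
print(json.dumps(records, indent=2))
```

<p>Storing every tag as a list keeps repeated tags such as <code>AU</code> in order, which makes the conversion to JSON straightforward.</p>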

<h2 id="to-do-list">To do list</h2>

<p>There are a few items on my to-do list:</p>

<ul>
  <li>I need to expand the collection of input files used in tests.</li>
  <li>I need to document the JSON output and demonstrate that the RIS -&gt; JSON -&gt; RIS roundtrip works.</li>
  <li>I need to document the code.</li>
  <li>I need to decide whether I want to add to gris a proper semantic layer.</li>
</ul>]]></content><author><name>Angel Yanguas-Gil</name></author><category term="python" /><summary type="html"><![CDATA[I took advantage of the break to go back to a couple of side projects. In particular, one of the goals I had was updating gris, a lightweight Python package that parses reference files in RIS format that I put together a while back.]]></summary></entry><entry><title type="html">Side project: properunits version 0.1.0 is out</title><link href="/python/2025/12/13/properunits.html" rel="alternate" type="text/html" title="Side project: properunits version 0.1.0 is out" /><published>2025-12-13T00:00:00+00:00</published><updated>2025-12-13T00:00:00+00:00</updated><id>/python/2025/12/13/properunits</id><content type="html" xml:base="/python/2025/12/13/properunits.html"><![CDATA[<p>Everybody has to deal with physical units. Pressure is one of my favorite examples. In addition to
the usual assortment of bar, torr, atm, and the humble Pascal, here in the US we have to deal
with things like PSI, which is pounds per square inch, or inches of mercury (inHg).</p>

<p>This is an issue when trying to write code dealing with physical systems. A quick perusal of Python packages
shows a zoo of possible solutions designed to help you do operations with physical magnitudes. To add
to the confusion, here is a new stable release of my side project: <a href="https://pypi.org/project/properunits/">properunits</a>.</p>

<h2 id="intro-to-properunits">Intro to properunits</h2>

<p>The core idea of properunits is that dealing with physical magnitudes should simply be an IO problem. A
user (me) wants to work with a specific unit and my code wants to work with its own unit. Once you
do that transformation, you should be good to go because you have transformed your 5 gallons into a
number using the proper units your code can work with. In my case, this means working with SI+ units,
which to me means SI units plus maybe one or two exceptions.</p>

<p>So this is how it works:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">properunits</span> <span class="kn">import</span> <span class="n">Temperature</span>

<span class="n">my_temp</span> <span class="o">=</span> <span class="n">Temperature</span><span class="p">(</span><span class="mi">20</span><span class="p">,</span> <span class="s">'C'</span><span class="p">)</span>
<span class="n">the_proper_temperature</span> <span class="o">=</span> <span class="n">my_temp</span><span class="p">.</span><span class="n">x</span>
</code></pre></div></div>
<p>What properunits does is provide a consistent interface to transform an input with user-defined units into
a proper value (in the example K) that my code can deal with.</p>

<p>You can find out which unit is considered a “proper unit” by querying:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">my_temp</span><span class="p">.</span><span class="n">unit</span>
</code></pre></div></div>
<p>which would return the string <code class="language-plaintext highlighter-rouge">'K'</code>.
Or, if you need to remind yourself which unit you used, you can retrieve the original value through:</p>

<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">temp</span><span class="p">,</span> <span class="n">unit</span> <span class="o">=</span> <span class="n">my_temp</span><span class="p">.</span><span class="n">value</span>
</code></pre></div></div>

<p>You can check the list of physical units available via:</p>
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">my_temp</span><span class="p">.</span><span class="n">list_units</span><span class="p">()</span>
</code></pre></div></div>

<p>The documentation for properunits can be found <a href="https://properunits.readthedocs.io/en/latest/index.html">here</a>.</p>
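
<p>To show the “units as an IO problem” idea on the pressure example from the introduction, here is a minimal sketch. <code>ToyPressure</code> is a hypothetical stand-in written for this post: it mirrors the <code>x</code>, <code>unit</code>, and <code>value</code> attributes shown above, but it is not the properunits implementation, and I am not assuming anything about how the package handles pressure.</p>

```python
# Conversion factors to pascal (Pa), the SI unit of pressure.
TO_PASCAL = {
    "Pa": 1.0,
    "bar": 1.0e5,
    "atm": 101325.0,
    "torr": 101325.0 / 760.0,
    "psi": 6894.757293168,
    "inHg": 3386.389,
}


class ToyPressure:
    """Hypothetical stand-in that mirrors the x / unit / value attributes shown above."""

    unit = "Pa"  # the "proper unit" the rest of the code works with

    def __init__(self, magnitude, unit):
        if unit not in TO_PASCAL:
            raise ValueError(f"unknown pressure unit: {unit!r}")
        self.value = (magnitude, unit)  # what the user typed
        self.x = magnitude * TO_PASCAL[unit]  # what the code computes with

    @staticmethod
    def list_units():
        """Return the names of the units this toy class accepts."""
        return sorted(TO_PASCAL)


base_pressure = ToyPressure(1.0, "torr")
print(base_pressure.x, base_pressure.unit, base_pressure.value)
```

<p>The conversion happens once, at the boundary. Everything downstream only ever sees a plain number in pascals.</p>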

<h2 id="what-properunits-does-not-do">What properunits does not do</h2>

<p>Basically anything else. I wanted to have something that would allow me to connect the experimental side of
my research with the models and simulations I create. I just needed to solve the IO problem, not to be able
to work with arbitrary units. I may entertain in the future the idea of being able to use other units as
the “proper unit”, but I must admit it is fairly low on my priority list.
What I may do is expand the range of units to include specific use cases, like nautical miles and the like.</p>]]></content><author><name>Angel Yanguas-Gil</name></author><category term="python" /><summary type="html"><![CDATA[Everybody has to deal with physical units. Pressure is one of my favorite examples. In addition to the usual assortment of bar, torr, atm, and the humble Pascal, here in the US we have to deal with things like PSI, which is pounds per square inches, or inches of mercury (inHg).]]></summary></entry><entry><title type="html">November 2025 updates</title><link href="/updates/2025/11/30/Nov25.html" rel="alternate" type="text/html" title="November 2025 updates" /><published>2025-11-30T00:00:00+00:00</published><updated>2025-11-30T00:00:00+00:00</updated><id>/updates/2025/11/30/Nov25</id><content type="html" xml:base="/updates/2025/11/30/Nov25.html"><![CDATA[<p>After the recommended media silence during the shutdown it is time for some recent updates:</p>

<h2 id="webinar-for-physics-of-plasmas">Webinar for physics of plasmas</h2>

<p>After <a href="https://pubs.aip.org/aip/pop/article-abstract/11/12/5497/261027/Collisional-radiative-model-of-an-argon">more than 20 years</a>, this year I published <a href="https://pubs.aip.org/aip/pop/article/32/7/073507/3352389/Surrogate-models-to-optimize-plasma-assisted">a new paper</a> in
the journal Physics of Plasmas where I explored how we can leverage machine learning
to optimize processes in plasma enhanced chemical vapor deposition. Two relevant things about
this paper:</p>

<ul>
  <li>It is published with an open access license so anyone can access it (in fact, the whole Physics of Plasmas is open
access for 2025!).</li>
  <li>It was one of the featured articles by the journal, and I was asked to give a webinar this fall about the paper,
where I provided an introduction to plasma enhanced
atomic layer deposition and our machine learning work. The link to the webinar is <a href="https://mediacentral.princeton.edu/id/1_xfv654ci">here</a>.</li>
</ul>

<h2 id="our-review-on-nature-reviews-methods-primer-was-published">Our review in Nature Reviews Methods Primers was published</h2>

<p>This is a work that was a few years in the making. I had the chance to collaborate with some of the most talented researchers
in my field to write an introduction to atomic layer deposition. The link to the paper is <a href="https://www.nature.com/articles/s43586-025-00435-6">here</a>. It is, alas, behind a paywall.</p>

<h2 id="chairing-a-session-on-quantum-computing-at-sc2025">Chairing a session on quantum computing at SC2025</h2>

<p><a href="https://sc25.supercomputing.org">SC2025</a> is a conference focused on high performance computing. However, it has a post-Moore computing component, and as a member
of the program committee for 2025, I was asked to chair <a href="https://sc25.conference-program.com/presenter/?uid=872914">one of the sessions in quantum computing</a>.</p>

<h2 id="others">Others</h2>

<ul>
  <li>Honored to be a member of a PhD committee. Anurag did an outstanding job defending his thesis.</li>
  <li>One of the projects I am leading got renewed this year, so more fun on ALD + microelectronics.</li>
  <li>Lots of changes in DOE, still processing them.</li>
  <li>Going through a lot of proposal reviews right now.</li>
</ul>

<p>Overall, a busy couple of months.</p>]]></content><author><name>Angel Yanguas-Gil</name></author><category term="updates" /><summary type="html"><![CDATA[After the recommended media silence during the shutdown it is time for some recent updates:]]></summary></entry><entry><title type="html">My website in Spanish - anglyan.net</title><link href="/updates/2025/11/29/intro.html" rel="alternate" type="text/html" title="My website in Spanish - anglyan.net" /><published>2025-11-29T00:00:00+00:00</published><updated>2025-11-29T00:00:00+00:00</updated><id>/updates/2025/11/29/intro</id><content type="html" xml:base="/updates/2025/11/29/intro.html"><![CDATA[<p>Initially I had decided to start a blog in Spanish here. However,
in the end I decided to create a separate site, given that my goals
are different.</p>

<p>My two sites in Spanish are:</p>

<ul>
  <li>
    <p><a href="https://anglyan.net">anglyan.net</a></p>
  </li>
  <li>
    <p><a href="https://cienciaypython.github.io/book/">cienciaypython</a>: tutorials using Jupyter Book
and MyST.</p>
  </li>
</ul>]]></content><author><name>Angel Yanguas-Gil</name></author><category term="updates" /><summary type="html"><![CDATA[Initially I had decided to start a blog in Spanish here. However, in the end I decided to create a separate site, given that my goals are different.]]></summary></entry></feed>