Thoughts about science and AI: my own mental model

Over the past few years, research activity at the intersection of AI and materials science has grown. Research on scientific applications of AI has been going on for many years. The difference is that this research is now going mainstream, with multiple sessions at conferences such as the MRS Spring Meeting and AI papers appearing in traditional physics, chemistry, and materials science journals.

In order to make sense of what’s going on, I try to place AI methods in a space defined by three different axes:

Technology readiness level: basically, is this something that is AI research (low TRLs) or is it ready to be used in real life (TRL 9)?
How specific is the model to my research field? Is it something like email, useful but general-purpose, or is it something that has been developed for a specific problem?
What is the visibility of this AI method? By that I mean: is it something that is likely to be cited (like a specific model), disclosed (like a tool), or acknowledged (like a facility)?

How I break down AI methods along three different axes

Based on this, you can place AI works in this 3D space. At low TRLs sit the vast majority of papers and projects involving AI for materials science. These works, which focus on developing or demonstrating new methods, are essentially AI research. The point is not to be immediately useful to the researcher: they are methods papers on a materials science topic, where the publication itself is the main outcome.

In this region, whether AI lives up to its hype is largely immaterial to the work itself, affecting primarily the shelf life of many of these publications. One may dispute whether investing a lot of resources in it is warranted, but the situation is no different from that of any other idea going through the research hype curve. Right now, though, communities are growing in this space, as we can see in conferences, journals, and funding priorities. These papers span a really broad range of specificity and visibility.

At the other end of the spectrum, at high TRLs, sits what I call AI for research. These are the methods that are or will be actively used by researchers who are not necessarily the ones who developed them. Reusability is the key here. Some of these methods will be general (like using generative AI to accelerate the development of scientific codes), but most will be very specialized. Compared with more exploratory work, these methods are somewhat skewed towards higher visibility. They are also skewed towards higher specificity: AI methods that broadly support the work of the researcher are not likely to be considered AI for research, much like email is not considered a research tool.

The corner of high-TRL, domain-specific AI is the most intriguing to me, because I think we do not yet know how impactful AI will turn out to be, or how the research enterprise will shape and be shaped by AI. In fact, we as domain scientists have barely started to quantify its value and potential gains. This region is where the aspirational goals of AI meet the reality of research.
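To make the three axes a bit more concrete, here is a minimal Python sketch of how one might record an AI method as a point in this space and label the region it falls in. Everything in it is an assumption made for illustration: the class and function names, the 0-to-1 specificity scale, the TRL cutoffs, and the example entries with their values are all hypothetical. The only things taken from the text are the three axes and the cited/disclosed/acknowledged distinction.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    """How a method tends to surface in a publication."""
    ACKNOWLEDGED = 1  # like a facility
    DISCLOSED = 2     # like a tool
    CITED = 3         # like a specific model


@dataclass(frozen=True)
class AIMethod:
    name: str
    trl: int            # 1 (early AI research) to 9 (ready for real-life use)
    specificity: float  # assumed scale: 0.0 = general-purpose, 1.0 = one specific problem
    visibility: Visibility

    def __post_init__(self):
        # Validate the ranges assumed in this sketch.
        if not 1 <= self.trl <= 9:
            raise ValueError("trl must be between 1 and 9")
        if not 0.0 <= self.specificity <= 1.0:
            raise ValueError("specificity must be between 0 and 1")


# Hypothetical cutoffs; the text does not define where "low" and "high" begin.
LOW_TRL_MAX = 3
HIGH_TRL_MIN = 7
HIGH_SPECIFICITY_MIN = 0.5


def region(method: AIMethod) -> str:
    """Label the part of the space a method falls in, using the cutoffs above."""
    if method.trl <= LOW_TRL_MAX:
        return "AI research"
    if method.trl < HIGH_TRL_MIN:
        return "in transition"
    if method.specificity >= HIGH_SPECIFICITY_MIN:
        return "domain-specific AI for research"
    return "general-purpose tool"


# Illustrative entries only; the numbers are made up.
demo = AIMethod("new-method demonstration", 2, 0.8, Visibility.CITED)
summarizer = AIMethod("LLM paper summarizer", 8, 0.1, Visibility.ACKNOWLEDGED)
xray_model = AIMethod("X-ray data extraction model", 8, 0.9, Visibility.CITED)

for m in (demo, summarizer, xray_model):
    print(f"{m.name}: {region(m)} ({m.visibility.name.lower()})")
```

The hard cutoffs are the weakest part of this sketch. In practice the boundaries are fuzzy, and a general method at high TRL may still count as AI for research.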

What ifs?

Once you start placing hypothetical tools in this space you can start asking questions about the potential impact of AI in our research.

For instance, as a researcher, how should I communicate useful, general-purpose, but low-visibility ways of using AI? If I use an LLM to summarize papers and classify my sputtering papers, is that something that should be published in a physics or materials science journal simply because it involves materials? Or should that become part of how things are done in a group or a lab?

Here is another example: is the application of general-purpose models to a specific scientific problem a research goal in itself, or is it just a means to achieve something else, much like an electron microscope? I assume that at some point in history, using an SEM to view something was noteworthy in itself, until it progressively became commonplace and the novelty wore off. It seems that right now we are still in the “I used AI for X” phase, but at some point we will get past it.

Another way of framing this question is: what areas of this space should materials science methods papers cover? Is the application of a general model to query a database of materials properties a materials science or an AI paper? Should the scope be limited to specific, high-visibility parts of the space?

Finally, how do we evaluate the performance or impact of some of the high-visibility, high-TRL models being developed right now? If we take a specific AI model to extract information from X-ray data, how can we quantify how good it is? Does this type of comparison belong in a high-impact journal?