AI for Science and Science for AI

Artificial Intelligence techniques have progressed immensely over the past few years, and many fields of science and engineering can now make use of them. We are very interested in exploring how AI can help our research, and scientific research in general, but we are equally interested in how physics insights can lead to even more powerful AI algorithms. On this page, you can learn more about this part of our research.

[Image: optical deep learning]

Applications of AI techniques in science and engineering

AI techniques have enabled remarkable applications in computer vision, game playing, question answering, and beyond. AI can and will have a similarly dramatic impact on the development of science and engineering. To achieve this, existing AI techniques (developed for applications other than science) can sometimes be used as they are, but more often they need to be modified to suit science applications, sometimes a little, sometimes a lot. Occasionally, entirely new AI techniques need to be invented for particular science applications. For some examples of our work on this, please see:

Interpretable AI

While today's AI algorithms work remarkably well, most of them operate as a "black box": they provide (often) correct answers, but it is nearly impossible to understand how they work or how they reach those answers. For many applications, it would be highly desirable to have AI algorithms that not only provide correct answers but also make it easy to understand how those answers were obtained. Such algorithms would give us more confidence in their outputs, and more transparency about when they can be fully trusted and when not. Moreover, in science, such algorithms could also provide physical insights and help in developing new scientific theories.

For some of our work on interpretable AI, see:

New Hardware for AI

Almost all computing today is done using electrons. However, a few known algorithms can be implemented in a superior way using light (photons). For example, matrix multiplication can be performed with light essentially instantaneously (at the speed of light) and, in theory, with zero energy consumption. Since deep learning algorithms rely so heavily on matrix multiplication, there is a strong case for implementing some of them with photonic (instead of electronic) hardware. Such systems (Nature Photonics 2017) could, for certain applications, be substantially faster, consume far less energy, and have much lower latency.
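To see why offloading matrix multiplication is so attractive, here is a minimal numpy sketch counting the arithmetic in a single fully connected layer. The layer sizes are arbitrary choices for illustration, not figures from the paper:

```python
import numpy as np

# Toy fully connected layer: y = W x + b. The matrix-vector product
# dominates the arithmetic; this is the operation a photonic processor
# could in principle perform passively. (Illustrative sketch only.)
rng = np.random.default_rng(0)
n_in, n_out = 1024, 1024
W = rng.standard_normal((n_out, n_in))
x = rng.standard_normal(n_in)
b = rng.standard_normal(n_out)

matmul_macs = n_out * n_in   # multiply-accumulates in W @ x
bias_adds = n_out            # element-wise additions for + b
y = W @ x + b

print(f"MACs in matmul: {matmul_macs}")   # 1048576
print(f"adds for bias:  {bias_adds}")     # 1024
print(f"matmul share:   {matmul_macs / (matmul_macs + bias_adds):.4f}")
```

Over 99.9% of the operations in this layer are part of the matrix product, which is why accelerating that one primitive pays off so heavily.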

For our additional work on photonic hardware, see:

New Algorithms for AI

Our group's strong background in general physics and mathematical techniques also enables us to use these techniques to understand and analyze various AI algorithms, or to construct novel AI algorithms with improved performance. On this topic, we often collaborate with colleagues from computer science departments and/or industry. Some examples of our work are described below.

Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the exploding/vanishing gradient problem, as well as to enable ANNs to learn long-term correlations in the data. We present (ICML 2017, with Yann LeCun's group) an Efficient Unitary Neural Network (EUNN) architecture that parametrizes the entire space of unitary matrices in a complete and computationally efficient way, thereby eliminating the need for time-consuming unitary subspace projections.
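One simple way to parametrize unitary matrices is as a product of planar (Givens) rotations. The numpy sketch below illustrates this idea and why it matters for gradients; it is a toy in the same spirit as EUNN, not the EUNN implementation itself, which uses a specific efficient arrangement of such rotations:

```python
import numpy as np

def givens(n, i, j, theta):
    """Real Givens rotation acting on coordinates i and j of R^n."""
    G = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i], G[j, j] = c, c
    G[i, j], G[j, i] = -s, s
    return G

# Compose an orthogonal (real unitary) matrix from planar rotations,
# one learnable angle per rotation.
n = 4
rng = np.random.default_rng(1)
U = np.eye(n)
for i in range(n):
    for j in range(i + 1, n):
        U = givens(n, i, j, rng.uniform(0, 2 * np.pi)) @ U

# Unitarity: U^T U = I, so U preserves norms -- repeated application
# cannot blow up or shrink a signal (or a backpropagated gradient).
assert np.allclose(U.T @ U, np.eye(n))
x = rng.standard_normal(n)
print(f"{np.linalg.norm(U @ U @ U @ x) / np.linalg.norm(x):.6f}")  # 1.000000
```

Because each factor is exactly orthogonal by construction, the product stays on the unitary manifold automatically, with no projection step needed after a gradient update of the angles.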

In another project, we present (Neural Computation 2019, with Yoshua Bengio's group) a novel recurrent neural network (RNN) based model that combines the remembering ability of unitary RNNs with the ability of gated RNNs to effectively forget redundant/irrelevant information in its memory. We achieve this by extending unitary RNNs with a gating mechanism.
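The following numpy sketch shows schematically how a gate can be combined with a norm-preserving recurrence. It is an illustrative single step in the spirit of gated unitary RNNs; the names, dimensions, and exact update rule are assumptions, not the equations of the Neural Computation 2019 model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Schematic single step of a gated unitary RNN (illustrative only).
rng = np.random.default_rng(2)
n_h, n_x = 6, 3
# Orthogonal recurrence matrix: a real-valued stand-in for "unitary".
U, _ = np.linalg.qr(rng.standard_normal((n_h, n_h)))
Wx = rng.standard_normal((n_h, n_x))
Wg = rng.standard_normal((n_h, n_x))
bg = np.zeros(n_h)

def step(h, x):
    candidate = np.tanh(U @ h + Wx @ x)   # norm-friendly unitary update
    g = sigmoid(Wg @ x + bg)              # gate: 1 = overwrite, 0 = keep
    return g * candidate + (1.0 - g) * h  # gated blend: can "forget"

h = np.zeros(n_h)
for _ in range(5):
    h = step(h, rng.standard_normal(n_x))
print(h.shape)  # (6,)
```

The gate interpolates between keeping the old memory and writing the new candidate, which is how the model can discard redundant information while the unitary recurrence keeps what is retained from decaying or exploding.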

The concepts of unitary evolution matrices and associative memory have boosted the field of Recurrent Neural Networks (RNNs) to state-of-the-art performance in a variety of sequential tasks. However, RNNs still have a limited capacity to manipulate long-term memory. To bypass this weakness, the most successful applications of RNNs use external techniques such as attention mechanisms. In yet another project (TACL 2019), we propose a novel RNN model that unifies these state-of-the-art approaches: the Rotational Unit of Memory (RUM). The core of RUM is its rotational operation, which is, naturally, a unitary matrix; this gives the architecture the power to learn long-term dependencies by overcoming the vanishing and exploding gradient problem. Moreover, the rotational unit also serves as an associative memory. RUM turned out to be particularly well suited for Natural Language Processing (NLP), so we applied it to the task of summarizing scientific articles. In further work (AAAI 2021, with Preslav Nakov's group), we also explored transformers for the same summarization task.
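A rotational operation of this kind can be built explicitly: given two vectors, construct the orthogonal matrix that rotates the direction of one onto the direction of the other, acting only in the plane they span. The numpy sketch below illustrates the geometry; it is our own minimal construction, and the paper's exact parametrization may differ:

```python
import numpy as np

def rotation(a, b):
    """Orthogonal matrix rotating the direction of a onto the direction
    of b, acting only in the plane spanned by a and b (identity on the
    orthogonal complement). Illustrative sketch of a RUM-style rotation."""
    u = a / np.linalg.norm(a)
    v = b - (u @ b) * u                 # component of b orthogonal to a
    v = v / np.linalg.norm(v)
    cos_t = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    sin_t = np.sqrt(max(0.0, 1.0 - cos_t**2))
    return (np.eye(len(a))
            + (cos_t - 1.0) * (np.outer(u, u) + np.outer(v, v))
            + sin_t * (np.outer(v, u) - np.outer(u, v)))

rng = np.random.default_rng(3)
a, b = rng.standard_normal(5), rng.standard_normal(5)
R = rotation(a, b)
assert np.allclose(R.T @ R, np.eye(5))  # unitary (orthogonal), so
                                        # gradients neither vanish nor explode
Ra = R @ a
print(np.allclose(Ra / np.linalg.norm(Ra), b / np.linalg.norm(b)))  # True
```

Because the rotation is determined by a pair of vectors, it can encode an association between them, which is the sense in which such a unit doubles as associative memory.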

In a more recent project (ICLR 2022, with Pulkit Agrawal's group), we showed that pre-training that encourages non-trivial equivariance to some transformations, while maintaining invariance to others, can be used to improve the semantic quality of learned representations.
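The invariance/equivariance distinction is easy to state concretely. The toy numpy example below (our own illustration, not the pre-training method itself) uses a representation that is invariant to sign flips of the input but equivariant to permutations of its coordinates:

```python
import numpy as np

# Toy illustration of invariance vs. equivariance. The representation
# f(x) = |x| ignores sign flips but commutes with coordinate permutations.
def f(x):
    return np.abs(x)  # element-wise magnitude

rng = np.random.default_rng(4)
x = rng.standard_normal(8)
perm = rng.permutation(8)

# Invariance: transforming the input leaves the representation unchanged.
assert np.allclose(f(-x), f(x))
# Equivariance: permuting the input permutes the representation identically.
assert np.allclose(f(x[perm]), f(x)[perm])
print("invariance and equivariance checks passed")
```

In this vocabulary, the ICLR 2022 result is that choosing which transformations the representation is made invariant to, versus equivariant to, is itself a lever for improving what the representation captures semantically.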

For our additional work on AI algorithms, see: