Make the Machine learn the Science


Data-Intensive Discovery Accelerated by Computational Techniques for Science (DIDACTS)

How do we realize the full impact of machine learning in the physical sciences? That is the core question of the DIDACTS project. The specific challenge is that the physical sciences are at a tipping point, yet current machine learning methods do not adequately address their needs.

Research information

Quad Chart description of project

(Click here for a PDF of the above Quad Chart)

As basic sciences, including seismology, meteorology, and materials science, become data intensive, we are reaching a tipping point where identifying and utilizing meaningful signals in noisy data is increasingly difficult and requires innovative, effective computational methods. A particularly challenging problem in physics is the identification of Dark Matter (and the properties of other elementary particles). Eighty-five percent of the matter in our Universe is something that we do not understand: Dark Matter. It binds the whole Universe together; without it, galaxies would not form and life would not exist. Yet we have no experimental knowledge of its properties.

Astroparticle physics experiments, characterized by massive amounts of extremely noisy spatiotemporal sensor data, provide an ideal development ground for computational methods that will be broadly applicable in the experimental sciences. Our earlier studies have demonstrated that current neural network-based methods are not a good fit for encoding prior physical knowledge or for faithfully representing physical constraints in these experiments. We propose going back to foundational probabilistic modeling and inverse problems, into which physical constraints and prior knowledge can be readily incorporated. This proposal brings together a well-qualified interdisciplinary team, comprising an astrophysicist working on particle physics detectors, a machine learning researcher with extensive interdisciplinary collaborative experience in the sciences, and an engineer with extensive expertise in signal processing, statistics, and inverse problems. The team is thus ideally positioned to make significant contributions advancing both data science and its impact on data-driven scientific disciplines.


Most of our non-paper content can be found in our Zenodo community, including anything below that isn’t directly citable. Publications and code will be linked here once available. Repositories will be hosted within our GitHub organization.


Highlighted talks at workshops

Other deliverables


NSF_poster (Click here for a PDF of the above poster)

Project leads

Please feel free to reach out to any of our PIs to discuss possible collaboration.



A core part of our work is engaging with communities within the physical sciences that face similar challenges through a series of workshops each year. So far, we have run the following workshops:



This work is supported by the National Science Foundation as part of its Harnessing the Data Revolution Big Idea (Solicitation 19-543) through awards 1940074, 1940209, and 1940080.