Research

In my research I develop and analyze machine learning algorithms to address pressing social and environmental problems. Sometimes this entails developing or analyzing new statistical ML techniques; sometimes it entails carefully applying, adapting, and evaluating ML methods in a specific context of use; most often it entails a blend of the two. Some specific project areas and application contexts:

  • Tailored machine learning for remotely sensed data (e.g. using satellite imagery for environmental monitoring).
  • Characterizing and formalizing notions of representativeness in training data (e.g. numerical representation, meaning how many data points come from each source or group, as distinct from which components of an individual or environment a collection of data actually reflects), and how these notions of representation affect our ability to train fair and useful machine learning systems.
  • Understanding and addressing key challenges in geospatial machine learning (e.g. spatial error structures make it hard to evaluate geospatial ML models, and can introduce concerns of bias or unfairness in downstream use).

For a full list of papers, please see my Google Scholar page.

Selected Projects

Spotlight presentation! "Mission Critical -- Satellite Data is a Distinct Modality in Machine Learning" (w/ Hannah Kerner, Konstantin Klemmer, and Caleb Robinson) will appear at ICML 2024.

New! "Application-Driven Innovation in Machine Learning" (w/ David Rolnick, Alan Aspuru-Guzik, Sara Beery, Bistra Dilkina, Priya L. Donti, Marzyeh Ghassemi, Hannah Kerner, Claire Monteleoni, Milind Tambe, and Adam White) will appear at ICML 2024.

"Geographic location encoding with spherical harmonics and sinusoidal representation networks" (w/ Marc Rußwurm, Konstantin Klemmer, Robin Zbinden, and Devis Tuia) appeared at ICLR 2024.

"Fairness and representation in satellite-based poverty maps: Evidence of urban-rural disparities and their impacts on downstream policy" (w/ Emily Aiken and Joshua Blumenstock) appeared at IJCAI 2023.

How to join

I am currently recruiting postdocs and PhD students to join my lab at CU. I am looking for people who are interested in

  • statistical and/or geospatial ML methodology, and
  • context-driven research for real-world problems

and who want to work in and foster a lab environment which is

  • doing innovative, sometimes interdisciplinary research
  • collaborative, supportive, and communication-forward.

Interested in a PhD, postdoc, or other involvement with the lab? Please first read my guide to getting involved with the lab, which includes instructions for the best way to contact me. Please note that I will not be able to respond to all emails.

Courses

Current Topics in Computer Science: Geospatial and Statistical Machine Learning

  • Fall 2024 Tues/Thurs 12:30-1:45pm, CU Boulder (CSCI 7000)
  • Course description and learning objectives
  • Resources

PhD application resources

  • CU Boulder's graduate admissions homepage
  • CU Boulder's tips for creating an impactful application and opportunities for an application fee waiver.

MOSAIKS: Generalizable and accessible machine learning with global satellite imagery

  • The MOSAIKS API (data interface) is now available: go to siml.berkeley.edu to make an account and access precomputed global features (derived from 2019 satellite imagery).
  • See the MOSAIKS project page for more resources and updates.