Satellogic Inc. announced the release of a large open dataset of high-resolution imagery, curated from the company's archive, to support the training of foundation models. The dataset contains around 3 million Satellogic images of unique locations from around the world, or 6 million images when location revisits are included. Each image is 384 by 384 pixels, for a total of roughly 900 gigapixels spanning different land-use types, objects, geographies, and seasons.
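
The stated total can be sanity-checked with a short calculation. The sketch below assumes the 900-gigapixel figure refers to the roughly 6 million images that include revisits, which is an inference from the numbers above rather than something the announcement states; the variable names are illustrative only.

```python
# Rough check of the dataset size quoted in the announcement.
# Assumption: the total counts all ~6 million images (revisits included).
IMAGE_SIDE_PX = 384
NUM_IMAGES = 6_000_000

pixels_per_image = IMAGE_SIDE_PX * IMAGE_SIDE_PX   # 384 * 384 = 147,456
total_pixels = pixels_per_image * NUM_IMAGES
total_gigapixels = total_pixels / 1e9

print(f"{total_gigapixels:.0f} gigapixels")  # prints "885 gigapixels"
```

Under that assumption the result is about 885 gigapixels, consistent with the rounded figure of 900; counting only the 3 million unique locations would give roughly half that.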

The full dataset can be accessed on Hugging Face. It is released under a Creative Commons CC-BY 4.0 license, which allows commercial use of the data with attribution. A paper presenting the dataset will be published along with the release of a baseline foundation model built on the dataset: a masked autoencoder, a type of scalable self-supervised learner for computer vision.

The paper describes how the dataset was built, the model architecture, and the experimental setup. This work is the result of Satellogic's collaboration with an exceptional team of researchers led by Alexandre Lacoste at ServiceNow, under Yoshua Bengio's guidance.