
Search and massive download

In this example, we will search for items using pygeodes, filter them using geopandas dataframes, and use a download queue to download these items and monitor the progress of the downloads.

Imports

Let’s start by importing pygeodes

from pygeodes import Geodes, Config

Configuration

We configure pygeodes using a config file located in our current working directory

conf = Config.from_file("config.json")
geodes = Geodes(conf=conf)
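If you don’t already have a config.json, you can create one first. The exact fields expected by Config depend on your pygeodes version; the "api_key" key below is only a hypothetical placeholder, so check the pygeodes documentation for the real field names.

import json

# Write a minimal config file in the current working directory.
# The "api_key" field name is a guess used purely for illustration.
with open("config.json", "w") as f:
    json.dump({"api_key": "YOUR-GEODES-API-KEY"}, f, indent=2)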

Searching products

We search for products in the T31TCK tile whose acquisition date is after 2023-01-01

from pygeodes.utils.datetime_utils import complete_datetime_from_str

query = {
    "grid:code": {"eq": "T31TCK"},
    "end_datetime": {"gte": complete_datetime_from_str("2023-01-01")},
}
items, dataframe = geodes.search_items(query=query)
Found 539 items matching your query, returning 80 as get_all parameter is set to False
80 item(s) found for query : {'grid:code': {'eq': 'T31TCK'}, 'end_datetime': {'gte': '2023-01-01T00:00:00.000000Z'}}
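As the message above notes, only the first 80 of the 539 matching items are returned because the get_all parameter defaults to False. If you need every matching item (at the cost of more requests), you can set it explicitly; this is a sketch based on that message:

# Retrieve all matching items instead of only the first 80
items, dataframe = geodes.search_items(query=query, get_all=True)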

Exploring results

We get back a list of items and a dataframe; we can work with the dataframe, for instance:

dataframe

Adding columns

We want to filter on cloud cover, so we need to add the corresponding column to the dataframe.

items[0].list_available_keys()
{'area', 'continent_code', 'dataset', 'datetime', 'end_datetime', 'endpoint_description', 'endpoint_url', 'eo:cloud_cover', 'grid:code', 'hydrology.rivers', 'id', 'identifier', 'instrument', 'keywords', 'latest', 'platform', 'political.continents', 'processing:level', 'product:timeliness', 'product:type', 'proj:bbox', 'references', 's2:datatake_id', 'sar:instrument_mode', 'sat:absolute_orbit', 'sat:orbit_state', 'sat:relative_orbit', 'sci:doi', 'start_datetime', 'version'}
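Since list_available_keys() returns a plain Python set, we can check that the key we want is present before formatting the dataframe:

# Make sure the cloud cover key is available on our items
assert "eo:cloud_cover" in items[0].list_available_keys()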

We find we can use eo:cloud_cover, so we add it to the dataframe:

from pygeodes.utils.formatting import format_items

dataframe_new = format_items(dataframe, {"eo:cloud_cover"})
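format_items returns a new dataframe with the requested column added. A quick check with plain pandas (nothing pygeodes-specific) confirms it is there:

# The eo:cloud_cover column should now be part of the dataframe
print("eo:cloud_cover" in dataframe_new.columns)
dataframe_new["eo:cloud_cover"].head()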

Filtering our results

Now that the cloud cover is in our dataframe, we can filter on it.

dataframe_filtered = dataframe_new[dataframe_new["eo:cloud_cover"] < 30]
dataframe_filtered
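Since this is still a regular (geo)pandas dataframe, any other pandas operation works too; for example, sorting the filtered items from clearest to cloudiest:

# Sort by ascending cloud cover (standard pandas, nothing pygeodes-specific)
dataframe_filtered.sort_values("eo:cloud_cover").head()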

Plotting

We can plot our results on a map:

dataframe_filtered.explore()
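explore() renders an interactive map (it relies on folium under the hood). If folium is not available in your environment, geopandas’ static plot() method is a possible fallback, assuming matplotlib is installed:

# Static alternative to the interactive map
dataframe_filtered.plot()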

Downloading our items

We can download our results using the Profile system.

from pygeodes.utils.profile import DownloadQueue, Profile

We reset our Profile to make sure we only track the downloads from the queue.

Profile.reset()
items = dataframe_filtered["item"].values
queue = DownloadQueue(items)

In a separate cell, we run our queue:

queue.run()