
Quickstart

Now that your installation is complete, let’s start with a small example by creating our first Geodes object.

from pygeodes import Geodes

geodes = Geodes()

Searching for collections

We can start by searching for existing collections, for example with the search term sentinel:

collections, dataframe = geodes.search_collections("sentinel")
/work/scratch/data/fournih/test_env/lib/python3.11/site-packages/urllib3/connectionpool.py:1099: InsecureRequestWarning: Unverified HTTPS request is being made to host 'geodes-portal.cnes.fr'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
Indexing: 100%|██████████| 111/111 [00:00<00:00, 2032.17it/s]

Let’s see what we found. As a result, we get a collections object, which is a list of Collection objects.

collections
[<Collection id=MUSCATE_Snow_SENTINEL2_L2B-SNOW>, <Collection id=MUSCATE_SENTINEL2_SENTINEL2_L2A>, <Collection id=MUSCATE_SENTINEL2_SENTINEL2_L3A>, <Collection id=MUSCATE_WaterQual_SENTINEL2_L2B-WATER>, <Collection id=PEPS_S2_L2A>, <Collection id=PEPS_S2_L1C>, <Collection id=PEPS_S1_L2>, <Collection id=PEPS_S1_L1>, <Collection id=MUSCATE_Snow_MULTISAT_L3B-SNOW>, <Collection id=TAKE5_SPOT4_L1C>, <Collection id=MUSCATE_LANDSAT_LANDSAT8_L2A>, <Collection id=MUSCATE_OSO_RASTER_L3B-OSO>, <Collection id=TAKE5_SPOT4_L2A>, <Collection id=MUSCATE_Snow_LANDSAT8_L2B-SNOW>, <Collection id=PEPS_S3_L1>, <Collection id=TAKE5_SPOT5_L1C>, <Collection id=TAKE5_SPOT5_L2A>, <Collection id=MUSCATE_OSO_VECTOR_L3B-OSO>]

and a dataframe object, which is a geopandas.GeoDataFrame.

dataframe

The dataframe lets you quickly see what you found, with only a few columns (here description and title). If you are more comfortable working with raw objects, you can work with the Collection objects directly.

Let’s see how many elements are in these collections.

for collection in collections:
    print(
        f"collection {collection.title} has {collection.summaries.other.get('total_items')} elements"
    )
collection MUSCATE Snow SENTINEL2 L2B has 91508 elements
collection MUSCATE SENTINEL2 L2A has 1019637 elements
collection MUSCATE SENTINEL2 L3A has 56011 elements
collection MUSCATE WaterQual SENTINEL2 L2B has 138 elements
collection PEPS Sentinel-2 L2A tiles has 0 elements
collection PEPS Sentinel-2 L1C tiles has 34163668 elements
collection PEPS Sentinel-1 Level2 has 1317955 elements
collection PEPS Sentinel-1 Level1 has 5759206 elements
collection MUSCATE L3B Snow has 667 elements
collection TAKE5 SPOT4 LEVEL1C has 902 elements
collection MUSCATE LANDSAT8 L2A has 27896 elements
collection MUSCATE OSO RASTER has 8 elements
collection TAKE5 SPOT4 LEVEL2A has 814 elements
collection MUSCATE Snow LANDSAT8 L2B has 8708 elements
collection GDH Sentinel-3 L1 STM Level-1 products has 46864 elements
collection TAKE5 SPOT5 LEVEL1C has 3740 elements
collection TAKE5 SPOT5 LEVEL2A has 2953 elements
collection MUSCATE OSO VECTOR has 768 elements

We might want to add columns to our dataframe. To find out which columns are available, call list_available_keys() on a Collection object:

collections[0].list_available_keys()
{'assets.snow.description', 'assets.snow.href', 'assets.snow.roles', 'assets.snow.title', 'assets.snow.type', 'assets.wms_capabilities.description', 'assets.wms_capabilities.href', 'assets.wms_capabilities.roles', 'assets.wms_capabilities.title', 'assets.wms_capabilities.type', 'description', 'extent.spatial.bbox', 'extent.temporal.interval', 'id', 'keywords', 'license', 'links', 'providers', 'stac_extensions', 'stac_version', 'summaries.access_url', 'summaries.constellation', 'summaries.contact_email', 'summaries.contact_name', 'summaries.dataset', 'summaries.format', 'summaries.geometry_type', 'summaries.gsd', 'summaries.instruments', 'summaries.item_type', 'summaries.latest', 'summaries.platform', 'summaries.processing:level', 'summaries.temporal_resolution', 'summaries.theme', 'summaries.total_items', 'summaries.variables', 'summaries.version', 'title', 'type'}
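
The dotted keys appear to follow the nesting of the collection metadata: extent.spatial.bbox reads as the bbox entry inside spatial inside extent. The sketch below is an illustration only, not pygeodes code: flatten_keys and get_by_path are hypothetical helpers, and the sample dictionary is made up. It shows how such dotted keys can be derived from, and resolved against, a nested dictionary.

```python
def flatten_keys(mapping, prefix=""):
    """Return the set of dotted key paths found in a nested dict."""
    keys = set()
    for name, value in mapping.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict) and value:
            # Descend into non-empty nested dicts, extending the dotted path.
            keys |= flatten_keys(value, path)
        else:
            keys.add(path)
    return keys


def get_by_path(mapping, dotted_key):
    """Resolve a dotted key such as 'summaries.total_items' in a nested dict."""
    node = mapping
    for part in dotted_key.split("."):
        node = node[part]
    return node


# Made-up sample metadata (hypothetical values, not a real Geodes collection).
sample = {
    "id": "EXAMPLE_COLLECTION",
    "title": "Example collection",
    "extent": {"spatial": {"bbox": [[-180, -90, 180, 90]]}},
    "summaries": {"total_items": 42, "processing:level": ["L2B"]},
}

print(sorted(flatten_keys(sample)))
# ['extent.spatial.bbox', 'id', 'summaries.processing:level', 'summaries.total_items', 'title']
print(get_by_path(sample, "summaries.total_items"))
# 42
```

Note that only the dots separate nesting levels; a colon, as in processing:level, is part of the key name itself.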

Let’s add summaries.total_items to the dataframe:

from pygeodes.utils.formatting import format_collections

new_dataframe = format_collections(
    dataframe, columns_to_add={"summaries.total_items"}
)
new_dataframe

To build a new dataframe with your custom columns, call format_collections on a list of collections:

new_dataframe = format_collections(
    collections,
    columns_to_add={"summaries.constellation", "summaries.instruments"},
)
new_dataframe

Note: the title and description columns are always included by default.

Searching for items

As with collections, we can search for items. To find out which arguments you can use in a query, call get_requestable_args:

from pygeodes.utils.query import get_requestable_args

print(get_requestable_args())
{'version': 'v8.0', 'attributes': ['dataset (STRING)', 'datetime (DATE_ISO8601)', 'links (STRING)', 'product_validity (BOOLEAN)', 'sci:doi (STRING_ARRAY)', 'no_geometry (BOOLEAN)', 'start_datetime (DATE_ISO8601)', 'end_datetime (DATE_ISO8601)', 'processing:datetime (DATE_ISO8601)', 'processing:lineage (STRING)', 'processing_context (STRING)', 'processing_correction (STRING)', 'processing:version (STRING)', 'bbox (STRING)', 'nb_cols (STRING)', 'nb_rows (STRING)', 'instrument (STRING)', 'platform (STRING)', 'sar:instrument_mode (STRING)', 'processing:level (STRING)', 'sar:polarizations (STRING)', 'sat:orbit_cycle (INTEGER)', 'mission_take_id (INTEGER)', 's2:datatake_id (STRING)', 'sat:relative_orbit (INTEGER)', 'sat:absolute_orbit (LONG)', 'sat:orbit_state (STRING)', 'product:type (STRING)', 'parameter (STRING)', 'pparameter (STRING)', 'product (STRING)', 'temporal_resolution (STRING)', 'classification (STRING)', 'swath (STRING)', 'bands (STRING)', 'grid:code (STRING)', 'eo:cloud_cover (DOUBLE)', 'water_cover (DOUBLE)', 'saturated_defective_pixel (DOUBLE)', 'nodata_pixel (DOUBLE)', 'nb_col_interpolation_error (DOUBLE)', 'ground_useful_pixel (DOUBLE)', 'min_useful_pixel (DOUBLE)', 'sensor_angle (DOUBLE)', 'sensor_pitch (DOUBLE)', 'sensor_roll (DOUBLE)', 'continent_code (STRING_ARRAY)', 'area (DOUBLE)', 'sar:beam_ids (STRING)', 'subTile (STRING)', 'keywords (STRING_ARRAY)', 'political (JSON)']}
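
Each entry in the attributes list pairs a name with a type in parentheses. As a minimal sketch (parse_attributes and unknown_query_keys are hypothetical helpers, not part of pygeodes, and the list is a short excerpt copied from the output above), the entries can be parsed into a name-to-type mapping and used to check query keys before sending a request:

```python
import re

# A short excerpt of the "attributes" list printed above.
attributes = [
    "datetime (DATE_ISO8601)",
    "sat:absolute_orbit (LONG)",
    "sat:orbit_state (STRING)",
    "eo:cloud_cover (DOUBLE)",
    "keywords (STRING_ARRAY)",
]

# Matches "<name> (<TYPE>)", e.g. "eo:cloud_cover (DOUBLE)".
ATTRIBUTE_PATTERN = re.compile(r"^(?P<name>\S+) \((?P<type>[A-Z0-9_]+)\)$")


def parse_attributes(entries):
    """Map each attribute name to its declared type."""
    parsed = {}
    for entry in entries:
        match = ATTRIBUTE_PATTERN.match(entry)
        if match is None:
            raise ValueError(f"unexpected attribute format: {entry!r}")
        parsed[match["name"]] = match["type"]
    return parsed


def unknown_query_keys(query, known):
    """Return the query keys that are not among the known attribute names."""
    return sorted(key for key in query if key not in known)


types = parse_attributes(attributes)
print(types["sat:absolute_orbit"])
# LONG
print(unknown_query_keys({"sat:absolute_orbit": {"eq": 30972}}, types))
# []
print(unknown_query_keys({"orbit": {"eq": 30972}}, types))
# ['orbit']
```

In practice the full list returned by get_requestable_args() would replace the excerpt.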

We see we can use sat:absolute_orbit. Let’s search, for example, for items whose absolute orbit number is 30972:

query = {"sat:absolute_orbit": {"eq": 30972}}
items, dataframe = geodes.search_items(query=query, collections=['PEPS_S2_L1C'])
/work/scratch/data/fournih/test_env/lib/python3.11/site-packages/urllib3/connectionpool.py:1099: InsecureRequestWarning: Unverified HTTPS request is being made to host 'geodes-portal.cnes.fr'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
Found 760 items matching your query, returning 80 as get_all parameter is set to False
80 item(s) found for query : {'sat:absolute_orbit': {'eq': 30972}}

Again, we get an items object and a dataframe object.

items[:10]  # show only the first 10 items; printing all 80 would be too verbose
[Item (URN:FEATURE:DATA:gdh:d939b8a4-7f6b-38ba-8925-415a7133a447:V1), Item (URN:FEATURE:DATA:gdh:8c836d61-26f9-341b-b44d-7e7fa7a74e9b:V1), Item (URN:FEATURE:DATA:gdh:ea6f4fb7-716a-3d5c-a560-b652e53906a4:V1), Item (URN:FEATURE:DATA:gdh:8728c52b-c8cc-3237-8cd6-e215836fc505:V1), Item (URN:FEATURE:DATA:gdh:60d83cde-631a-3d28-b858-251fdb4aae7c:V1), Item (URN:FEATURE:DATA:gdh:a6d9b9e0-250e-3cad-a77d-5a48608ce653:V1), Item (URN:FEATURE:DATA:gdh:72989caf-a884-3018-a7ff-c464c835b21b:V1), Item (URN:FEATURE:DATA:gdh:a449437f-8a51-3a1a-82e8-ea35b046b792:V1), Item (URN:FEATURE:DATA:gdh:0967199f-0eb1-3189-a19a-fe443e4ad1c9:V1), Item (URN:FEATURE:DATA:gdh:8550a63f-c516-3411-a5de-10fe155c94b5:V1)]
dataframe

Let’s have a look around our items.

One thing we might want to do is filter them by cloud cover, say between 39 and 40. This column does not appear in the dataframe yet. To find out which columns are available, call list_available_keys() on an Item object.

items[0].list_available_keys()
{'area', 'continent_code', 'dataset', 'datetime', 'end_datetime', 'endpoint_description', 'endpoint_url', 'eo:cloud_cover', 'grid:code', 'id', 'identifier', 'instrument', 'keywords', 'latest', 'physical', 'platform', 'political.continents', 'processing:level', 'processing:version', 'product:timeliness', 'product:type', 'proj:bbox', 'references', 's2:datatake_id', 'sar:instrument_mode', 'sat:absolute_orbit', 'sat:orbit_state', 'sat:relative_orbit', 'sci:doi', 'start_datetime', 'version'}

We see we can use eo:cloud_cover. We can add it using format_items.

from pygeodes.utils.formatting import format_items

new_dataframe = format_items(
    dataframe, columns_to_add={"eo:cloud_cover"}
)
new_dataframe

We’ve got our new dataframe. Let’s filter:

filtered = new_dataframe[
    (new_dataframe["eo:cloud_cover"] <= 40)
    & (new_dataframe["eo:cloud_cover"] >= 39)
]
filtered

Let’s plot these items:

m = filtered.explore()
m

To include all available columns in your dataframe, pass the full set of keys:

full_dataframe = format_items(
    items, columns_to_add=items[0].list_available_keys()
)
full_dataframe

Providing an api key

The next parts involve requests that require an API key. You can register one using the following method. We will also set a default download directory for later use. We use a config.json file structured as follows:

{"api_key": "MyApiKey", "download_dir": "/tmp"}

We then load this file and attach the configuration to our Geodes object:

from pygeodes import Config

conf = Config.from_file("config.json")
geodes.set_conf(conf)

Other ways to configure pygeodes are described in configuration.
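
The config.json file can also be generated from Python instead of being written by hand. The following is a minimal sketch using only the standard library: write_config is a hypothetical helper, and GEODES_API_KEY is a hypothetical environment variable name chosen for illustration, not something pygeodes is known to read. Only the api_key and download_dir keys come from the example above.

```python
import json
import os
import tempfile
from pathlib import Path


def write_config(path, api_key, download_dir):
    """Write a config file with the two keys shown above and return its path."""
    config = {"api_key": api_key, "download_dir": str(download_dir)}
    path = Path(path)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


# GEODES_API_KEY is a hypothetical variable name; fall back to the placeholder key.
api_key = os.environ.get("GEODES_API_KEY", "MyApiKey")

# A temporary directory keeps this demonstration from touching real files.
with tempfile.TemporaryDirectory() as workdir:
    config_path = write_config(Path(workdir) / "config.json", api_key, "/tmp")
    loaded = json.loads(config_path.read_text(encoding="utf-8"))

print(sorted(loaded))
# ['api_key', 'download_dir']
```

Written to a permanent location, such a file could then be passed to Config.from_file as shown above. Reading the key from the environment keeps it out of source control.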

Quicklook

Now we can have a look at our items.

for item in filtered["item"]:
    print(f"Quicklook of {item}")
    item.show_quicklook()
Quicklook of Item (URN:FEATURE:DATA:gdh:e4c321cd-3e90-3a97-a357-5d64ca3e8e49:V1)
/work/scratch/data/fournih/test_env/lib/python3.11/site-packages/urllib3/connectionpool.py:1099: InsecureRequestWarning: Unverified HTTPS request is being made to host 'geodes-portal.cnes.fr'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(
<IPython.core.display.Image object>

Downloading items

We may now want to download these items for further use:

for item in filtered["item"]:
    item.download_archive()

Since we set /tmp as the default download directory, the downloads are stored in that folder.