Ground Truth#

  • Last Modified: 17-03-2022

  • Authors: Satyarth Praveen, J. Emmanuel Johnson, Nadia Ahmed, Gonzalo Mateo-García


Brief Description#

In this notebook, we will explore how to visualize the different products that we use in the WorldFloods dataset. In addition we will show how to create the ground truth that we used for training the models (the 3-class ground truth used in Mateo-García et al. 2021 and the new binary ground truth for multi-output binary classification).

This tutorial follows the previous one about ingesting floodmaps which shows how to pre-process and download floodmaps and how to download Sentinel-2 images using the Google Earth Engine. In this notebook we use these products (floodmaps, S2 images and JRC Permanent Water Layer also downlinked from the GEE).

To create the ground truth necessary in the WorldFloods data, we need access to water layers, clouds. So let’s show how we do this!


!pip install git+https://github.com/spaceml-org/ml4floods#egg=ml4floods
Collecting ml4floods
  Cloning https://github.com/spaceml-org/ml4floods to /tmp/pip-install-1ocx6bq9/ml4floods_047e23a821dc4d3f90c1cd136315a242
  Running command git clone --filter=blob:none --quiet https://github.com/spaceml-org/ml4floods /tmp/pip-install-1ocx6bq9/ml4floods_047e23a821dc4d3f90c1cd136315a242
  Resolved https://github.com/spaceml-org/ml4floods to commit a2f4f6df8b79cc641572ba6fa5287b9a68c67505
  Preparing metadata (setup.py) ... ?25ldone
?25hRequirement already satisfied: torch in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (1.10.2)
Requirement already satisfied: torchvision in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (0.11.3)
Requirement already satisfied: pytorch-lightning in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (1.5.9)
Requirement already satisfied: numpy in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (1.22.1)
Requirement already satisfied: matplotlib in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (3.5.1)
Requirement already satisfied: seaborn in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (0.11.2)
Requirement already satisfied: scipy in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (1.7.3)
Requirement already satisfied: pandas in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (1.4.0)
Requirement already satisfied: rasterio in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (1.2.10)
Requirement already satisfied: geopandas in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (0.10.2)
Requirement already satisfied: fastprogress in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (1.0.0)
Requirement already satisfied: tqdm in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (4.62.3)
Requirement already satisfied: black in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (22.1.0)
Requirement already satisfied: isort in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (5.10.1)
Requirement already satisfied: mypy in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (0.931)
Requirement already satisfied: opencv-python in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (4.5.5.62)
Requirement already satisfied: albumentations in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (1.1.0)
Requirement already satisfied: google-cloud-storage in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (2.1.0)
Requirement already satisfied: pyprojroot in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (0.2.0)
Requirement already satisfied: fsspec in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (2022.1.0)
Requirement already satisfied: gcsfs in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (2022.1.0)
Requirement already satisfied: requests_html in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (0.10.0)
Requirement already satisfied: earthengine-api in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (0.1.296)
Requirement already satisfied: mercantile in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from ml4floods) (1.2.1)
Requirement already satisfied: qudida>=0.0.4 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from albumentations->ml4floods) (0.0.4)
Requirement already satisfied: opencv-python-headless>=4.1.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from albumentations->ml4floods) (4.5.5.62)
Requirement already satisfied: scikit-image>=0.16.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from albumentations->ml4floods) (0.19.1)
Requirement already satisfied: PyYAML in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from albumentations->ml4floods) (6.0)
Requirement already satisfied: tomli>=1.1.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from black->ml4floods) (2.0.0)
Requirement already satisfied: click>=8.0.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from black->ml4floods) (8.0.3)
Requirement already satisfied: mypy-extensions>=0.4.3 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from black->ml4floods) (0.4.3)
Requirement already satisfied: typing-extensions>=3.10.0.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from black->ml4floods) (4.0.1)
Requirement already satisfied: platformdirs>=2 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from black->ml4floods) (2.4.1)
Requirement already satisfied: pathspec>=0.9.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from black->ml4floods) (0.9.0)
Requirement already satisfied: future in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from earthengine-api->ml4floods) (0.18.2)
Requirement already satisfied: httplib2<1dev,>=0.9.2 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from earthengine-api->ml4floods) (0.20.2)
Requirement already satisfied: google-auth-httplib2>=0.0.3 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from earthengine-api->ml4floods) (0.1.0)
Requirement already satisfied: google-auth>=1.4.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from earthengine-api->ml4floods) (2.5.0)
Requirement already satisfied: google-api-python-client<2,>=1.12.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from earthengine-api->ml4floods) (1.12.10)
Requirement already satisfied: six in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from earthengine-api->ml4floods) (1.16.0)
Requirement already satisfied: httplib2shim in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from earthengine-api->ml4floods) (0.0.3)
Requirement already satisfied: google-auth-oauthlib in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from gcsfs->ml4floods) (0.4.6)
Requirement already satisfied: aiohttp<4 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from gcsfs->ml4floods) (3.8.1)
Requirement already satisfied: requests in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from gcsfs->ml4floods) (2.27.1)
Requirement already satisfied: decorator>4.1.2 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from gcsfs->ml4floods) (5.1.1)
Requirement already satisfied: pyproj>=2.2.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from geopandas->ml4floods) (3.3.0)
Requirement already satisfied: shapely>=1.6 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from geopandas->ml4floods) (1.8.0)
Requirement already satisfied: fiona>=1.8 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from geopandas->ml4floods) (1.8.20)
Requirement already satisfied: python-dateutil>=2.8.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pandas->ml4floods) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pandas->ml4floods) (2021.3)
Requirement already satisfied: google-cloud-core<3.0dev,>=1.6.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from google-cloud-storage->ml4floods) (2.2.2)
Requirement already satisfied: google-api-core<3.0dev,>=1.29.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from google-cloud-storage->ml4floods) (2.4.0)
Requirement already satisfied: google-resumable-media>=1.3.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from google-cloud-storage->ml4floods) (2.1.0)
Requirement already satisfied: protobuf in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from google-cloud-storage->ml4floods) (3.19.4)
Requirement already satisfied: cycler>=0.10 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from matplotlib->ml4floods) (0.11.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from matplotlib->ml4floods) (1.3.2)
Requirement already satisfied: fonttools>=4.22.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from matplotlib->ml4floods) (4.29.0)
Requirement already satisfied: pyparsing>=2.2.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from matplotlib->ml4floods) (3.0.7)
Requirement already satisfied: pillow>=6.2.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from matplotlib->ml4floods) (8.4.0)
Requirement already satisfied: packaging>=20.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from matplotlib->ml4floods) (21.3)
Requirement already satisfied: setuptools==59.5.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pytorch-lightning->ml4floods) (59.5.0)
Requirement already satisfied: pyDeprecate==0.3.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pytorch-lightning->ml4floods) (0.3.1)
Requirement already satisfied: tensorboard>=2.2.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pytorch-lightning->ml4floods) (2.8.0)
Requirement already satisfied: torchmetrics>=0.4.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pytorch-lightning->ml4floods) (0.7.0)
Requirement already satisfied: snuggs>=1.4.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from rasterio->ml4floods) (1.4.7)
Requirement already satisfied: certifi in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from rasterio->ml4floods) (2021.10.8)
Requirement already satisfied: attrs in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from rasterio->ml4floods) (21.4.0)
Requirement already satisfied: affine in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from rasterio->ml4floods) (2.3.0)
Requirement already satisfied: click-plugins in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from rasterio->ml4floods) (1.1.1)
Requirement already satisfied: cligj>=0.5 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from rasterio->ml4floods) (0.7.2)
Requirement already satisfied: parse in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests_html->ml4floods) (1.19.0)
Requirement already satisfied: w3lib in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests_html->ml4floods) (1.22.0)
Requirement already satisfied: pyquery in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests_html->ml4floods) (1.4.3)
Requirement already satisfied: pyppeteer>=0.0.14 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests_html->ml4floods) (1.0.2)
Requirement already satisfied: fake-useragent in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests_html->ml4floods) (0.1.11)
Requirement already satisfied: bs4 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests_html->ml4floods) (0.0.1)
Requirement already satisfied: multidict<7.0,>=4.5 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from aiohttp<4->gcsfs->ml4floods) (6.0.2)
Requirement already satisfied: aiosignal>=1.1.2 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from aiohttp<4->gcsfs->ml4floods) (1.2.0)
Requirement already satisfied: yarl<2.0,>=1.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from aiohttp<4->gcsfs->ml4floods) (1.7.2)
Requirement already satisfied: frozenlist>=1.1.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from aiohttp<4->gcsfs->ml4floods) (1.3.0)
Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from aiohttp<4->gcsfs->ml4floods) (4.0.2)
Requirement already satisfied: charset-normalizer<3.0,>=2.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from aiohttp<4->gcsfs->ml4floods) (2.0.10)
Requirement already satisfied: munch in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from fiona>=1.8->geopandas->ml4floods) (2.5.0)
Requirement already satisfied: googleapis-common-protos<2.0dev,>=1.52.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from google-api-core<3.0dev,>=1.29.0->google-cloud-storage->ml4floods) (1.54.0)
Requirement already satisfied: uritemplate<4dev,>=3.0.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from google-api-python-client<2,>=1.12.1->earthengine-api->ml4floods) (3.0.1)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from google-auth>=1.4.1->earthengine-api->ml4floods) (5.0.0)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from google-auth>=1.4.1->earthengine-api->ml4floods) (0.2.8)
Requirement already satisfied: rsa<5,>=3.1.4 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from google-auth>=1.4.1->earthengine-api->ml4floods) (4.8)
Requirement already satisfied: google-crc32c<2.0dev,>=1.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from google-resumable-media>=1.3.0->google-cloud-storage->ml4floods) (1.3.0)
Requirement already satisfied: importlib-metadata>=1.4 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pyppeteer>=0.0.14->requests_html->ml4floods) (4.10.1)
Requirement already satisfied: pyee<9.0.0,>=8.1.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pyppeteer>=0.0.14->requests_html->ml4floods) (8.2.2)
Requirement already satisfied: urllib3<2.0.0,>=1.25.8 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pyppeteer>=0.0.14->requests_html->ml4floods) (1.26.8)
Requirement already satisfied: appdirs<2.0.0,>=1.4.3 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pyppeteer>=0.0.14->requests_html->ml4floods) (1.4.4)
Requirement already satisfied: websockets<11.0,>=10.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pyppeteer>=0.0.14->requests_html->ml4floods) (10.1)
Requirement already satisfied: scikit-learn>=0.19.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from qudida>=0.0.4->albumentations->ml4floods) (1.0.2)
Requirement already satisfied: idna<4,>=2.5 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests->gcsfs->ml4floods) (3.3)
Requirement already satisfied: imageio>=2.4.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from scikit-image>=0.16.1->albumentations->ml4floods) (2.14.1)
Requirement already satisfied: tifffile>=2019.7.26 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from scikit-image>=0.16.1->albumentations->ml4floods) (2021.11.2)
Requirement already satisfied: PyWavelets>=1.1.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from scikit-image>=0.16.1->albumentations->ml4floods) (1.2.0)
Requirement already satisfied: networkx>=2.2 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from scikit-image>=0.16.1->albumentations->ml4floods) (2.6.3)
Requirement already satisfied: wheel>=0.26 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from tensorboard>=2.2.0->pytorch-lightning->ml4floods) (0.37.1)
Requirement already satisfied: grpcio>=1.24.3 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from tensorboard>=2.2.0->pytorch-lightning->ml4floods) (1.43.0)
Requirement already satisfied: markdown>=2.6.8 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from tensorboard>=2.2.0->pytorch-lightning->ml4floods) (3.3.6)
Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from tensorboard>=2.2.0->pytorch-lightning->ml4floods) (0.6.1)
Requirement already satisfied: werkzeug>=0.11.15 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from tensorboard>=2.2.0->pytorch-lightning->ml4floods) (2.0.2)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from tensorboard>=2.2.0->pytorch-lightning->ml4floods) (1.8.1)
Requirement already satisfied: absl-py>=0.4 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from tensorboard>=2.2.0->pytorch-lightning->ml4floods) (1.0.0)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from google-auth-oauthlib->gcsfs->ml4floods) (1.3.1)
Requirement already satisfied: beautifulsoup4 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from bs4->requests_html->ml4floods) (4.10.0)
Requirement already satisfied: lxml>=2.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pyquery->requests_html->ml4floods) (4.7.1)
Requirement already satisfied: cssselect>0.7.9 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pyquery->requests_html->ml4floods) (1.1.0)
Requirement already satisfied: zipp>=0.5 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from importlib-metadata>=1.4->pyppeteer>=0.0.14->requests_html->ml4floods) (3.7.0)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from pyasn1-modules>=0.2.1->google-auth>=1.4.1->earthengine-api->ml4floods) (0.4.8)
Requirement already satisfied: oauthlib>=3.0.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib->gcsfs->ml4floods) (3.2.0)
Requirement already satisfied: joblib>=0.11 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from scikit-learn>=0.19.1->qudida>=0.0.4->albumentations->ml4floods) (1.1.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from scikit-learn>=0.19.1->qudida>=0.0.4->albumentations->ml4floods) (3.0.0)
Requirement already satisfied: soupsieve>1.2 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from beautifulsoup4->bs4->requests_html->ml4floods) (2.3.1)
Building wheels for collected packages: ml4floods
  Building wheel for ml4floods (setup.py) ... ?25ldone
?25h  Created wheel for ml4floods: filename=ml4floods-0.0.1-py3-none-any.whl size=140086 sha256=1be823c64a228fb0acf7e481c084b4a16d994c1aecfa9e94750c0d11eb3f8700
  Stored in directory: /tmp/pip-ephem-wheel-cache-rvlnv79q/wheels/73/e4/d0/32f1ac519c0c417c56095146587858a3e35491dbbef7d45186
Successfully built ml4floods
Installing collected packages: ml4floods
  Attempting uninstall: ml4floods
    Found existing installation: ml4floods 0.0.1
    Uninstalling ml4floods-0.0.1:
      Successfully uninstalled ml4floods-0.0.1
Successfully installed ml4floods-0.0.1
Requirement already satisfied: gdown in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (4.4.0)
Requirement already satisfied: six in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from gdown) (1.16.0)
Requirement already satisfied: beautifulsoup4 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from gdown) (4.10.0)
Requirement already satisfied: filelock in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from gdown) (3.6.0)
Requirement already satisfied: tqdm in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from gdown) (4.62.3)
Requirement already satisfied: requests[socks] in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from gdown) (2.27.1)
Requirement already satisfied: soupsieve>1.2 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from beautifulsoup4->gdown) (2.3.1)
Requirement already satisfied: certifi>=2017.4.17 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests[socks]->gdown) (2021.10.8)
Requirement already satisfied: charset-normalizer~=2.0.0 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests[socks]->gdown) (2.0.10)
Requirement already satisfied: idna<4,>=2.5 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests[socks]->gdown) (3.3)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests[socks]->gdown) (1.26.8)
Requirement already satisfied: PySocks!=1.5.7,>=1.5.6 in /home/gonzalo/miniconda3/envs/ml4floods/lib/python3.9/site-packages (from requests[socks]->gdown) (1.7.1)

Step 0: Access the data for this tutorial#

In order to run this tutorial we will use some products from the forthcoming WorldFloods Extra Dataset. We have copied some of those in our public Google Drive folder. To access the WorldFloods database see the download WorldFloods documentation.

Step 0a: mount the Public folder if you are in Google Colab#

If you’re running this tutorial in Google Colab you need to ‘add a shortcut to your Google Drive’ from the public Google Drive folder and mount that directory with the following code:

import os
try:
    from google.colab import drive
    drive.mount('/content/drive')
    assert os.path.exists('/content/drive/My Drive/Public WorldFloods Dataset'), "Add a shortcut to the publice Google Drive folder: https://drive.google.com/drive/u/0/folders/1dqFYWetX614r49kuVE3CbZwVO6qHvRVH"
    google_colab = True
    path_to_dataset_folder = '/content/drive/My Drive/Public WorldFloods Dataset'
except ImportError as e:
    print(e)
    print("Setting google colab to false, it will need to install the gdown package!")
    path_to_dataset_folder = "."
    google_colab = False
No module named 'google.colab'
Setting google colab to false, it will need to install the gdown package!

If you’re not in Google Colab you need to install the gdown package to access the products. pip install gdown.

Original Image#

Let’s load a demo Sentinel-2 (S2) image which will be the basis for all of our demonstrations.

import gdown
import os

# S2_IMG_PATH = "gs://ml4cc_data_lake/0_DEV/1_Staging/WorldFloods/EMSR501/AOI01/S2/2021-02-15.tif"

S2_IMG_PATH = os.path.join(path_to_dataset_folder, "products_sample_tutorial","S2", "2021-02-15.tif")
if not os.path.exists(S2_IMG_PATH): 
    import gdown
    os.makedirs(os.path.dirname(S2_IMG_PATH), exist_ok=True)
    gdown.download(id="1wPxBASA5KaEiKPyU78ryzSHjEuZaLYEL", output=S2_IMG_PATH)
Downloading...
From: https://drive.google.com/uc?id=1wPxBASA5KaEiKPyU78ryzSHjEuZaLYEL
To: /home/gonzalo/git/ml4floods/jupyterbook/content/prep/products_sample_tutorial/S2/2021-02-15.tif
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 560M/560M [00:08<00:00, 67.8MB/s]

Demo Plot#

We have an easy plot function to visualize the S2 image.

from ml4floods.visualization import plot_utils
from rasterio import plot as rasterioplt
import matplotlib.pyplot as plt
import rasterio
import numpy as np

# initialize figure
fig, ax = plt.subplots(figsize=(16,16))

plot_utils.plot_rgb_image(S2_IMG_PATH, ax=ax, size_read=800)

ax.set(
    title="Demo Sentinel-2 Image",
)
[Text(0.5, 1.0, 'Demo Sentinel-2 Image')]
../../_images/gt_masks_generation_8_1.png

This is mainly for demonstration purposes. For more advanced plotting utilities, please consult a specialized library.

So now we have our image, let’s see what tools we have available to do masks.

Cloud Masks#

Last Channel Cloud Probabilities#

Fortunately for some of the newer S2 images obtained from GEE, we can simply use the final channel.

image, transform = plot_utils.get_image_transform(S2_IMG_PATH, bands=[14], 
                                                  size_read=800)



# initialize figure
fig, ax = plt.subplots(figsize=(16,16))

# plot image
rasterioplt.show(image[0],transform=transform, vmin=0,vmax=100, cmap="Reds")

ax.set(
    title="Cloud Probabilities",
)
../../_images/gt_masks_generation_12_0.png
[Text(0.5, 1.0, 'Cloud Probabilities')]

Model-Based#

In other cases where this is not available, we can also use models on the original image. One method that we use within this library is called s2cloudless, found here. This is a model from the sentinel-hub which has a very convenient API for our uses. See this blog for more details.

Below, we showcase how we can return cloudmasks in a single function within this library.

Warnings: This will take a while to run. Please skip over the section to resume the tutorial.

# %%time
# from src.data.cloud_masks import compute_s2cloudless_probs

# # calculate cloud probabilities using cloudless
# s2_cloud_probs = compute_s2cloudless_probs(S2_IMG_PATH)

Figure#

# fig, ax = plt.subplots(figsize=(16,16))

# with rasterio.open(S2_IMG_PATH, "r") as s2_rst:
#     transform = s2_rst.transform

# rasterioplt.show(s2_cloud_probs, ax=ax, transform=transform, cmap="Reds")

# ax.set(
#     title="S2CloudLess Cloud Probabilities",
# )
# plt.show()

Note: This is actually the same library that is used within the GEE platform! See this tutorial for more information.

Water Masks#

Permanent Water#

So firstly, we have the permanent water. These are streams, rivers and lakes that already exist. This serves as a basis for where the flood events could occur.

# permanent water
# JRC_IMG_PATH = "gs://ml4cc_data_lake/0_DEV/1_Staging/WorldFloods/EMSR501/AOI01/PERMANENTWATERJRC/2021.tif"

JRC_IMG_PATH = os.path.join(path_to_dataset_folder, "products_sample_tutorial","PERMANENTWATERJRC", "2021.tif")
if not os.path.exists(JRC_IMG_PATH):
    os.makedirs(os.path.dirname(JRC_IMG_PATH), exist_ok=True)
    gdown.download(id="1N_NF9xdO4Qse2Bp76t0sZHrOO24fd-xt", output=JRC_IMG_PATH)
Downloading...
From: https://drive.google.com/uc?id=1N_NF9xdO4Qse2Bp76t0sZHrOO24fd-xt
To: /home/gonzalo/git/ml4floods/jupyterbook/content/prep/products_sample_tutorial/PERMANENTWATERJRC/2021.tif
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 314k/314k [00:00<00:00, 4.19MB/s]
permanent_water_image, transform = plot_utils.get_image_transform(JRC_IMG_PATH, bands=[0], 
                                                                  size_read=800)

# initialize figure
fig, ax = plt.subplots(figsize=(16,16))

rasterioplt.show(permanent_water_image[0],transform=transform, vmin=0,vmax=3, interpolation='nearest')

ax.set(
    title="Permanent Water",
)
plt.show()
../../_images/gt_masks_generation_21_0.png

Flood Events#

We need the flood events. These come from the Copernicus EMS data source where we have already preprocessed them to extract the necessary information for floods. These come in the form of geojson files which keep all of the polygons indicating the specific coordinates where there was a drought. We can easily read it with the geopandas package.

# FLOODMAP_PATH = "gs://ml4cc_data_lake/0_DEV/1_Staging/WorldFloods/EMSR501/AOI01/floodmap/2021-02-15.geojson"
FLOODMAP_PATH = os.path.join(path_to_dataset_folder, "products_sample_tutorial", "floodmap", "2021-02-15.geojson")

if not os.path.exists(FLOODMAP_PATH):
    os.makedirs(os.path.dirname(FLOODMAP_PATH), exist_ok=True)
    gdown.download(id="1GS1_9fXwrYdgu7o-0RU4I8ixch4pVpQC", output=FLOODMAP_PATH)
    
Downloading...
From: https://drive.google.com/uc?id=1GS1_9fXwrYdgu7o-0RU4I8ixch4pVpQC
To: /home/gonzalo/git/ml4floods/jupyterbook/content/prep/products_sample_tutorial/floodmap/2021-02-15.geojson
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11.9M/11.9M [00:00<00:00, 35.4MB/s]
import geopandas as gpd

floodmap_df = gpd.read_file(FLOODMAP_PATH)
floodmap_df[floodmap_df["w_class"] == "Flooded area"]
w_class source geometry
0 Flooded area flood MULTIPOLYGON (((19.48718 42.09541, 19.48706 42...
1 Flooded area flood POLYGON ((19.48966 42.09551, 19.48966 42.09540...
2 Flooded area flood POLYGON ((19.48887 42.09549, 19.48887 42.09547...
3 Flooded area flood POLYGON ((19.48967 42.09486, 19.48967 42.09481...
4 Flooded area flood POLYGON ((19.48731 42.09523, 19.48731 42.09518...
... ... ... ...
2046 Flooded area flood POLYGON ((19.43283 41.90527, 19.43283 41.90518...
2047 Flooded area flood POLYGON ((19.43063 41.90832, 19.43064 41.90823...
2048 Flooded area flood POLYGON ((19.43015 41.90893, 19.43015 41.90884...
2049 Flooded area flood POLYGON ((19.43065 41.90295, 19.43064 41.90304...
2050 Flooded area flood POLYGON ((19.42929 41.91042, 19.42929 41.91033...

2051 rows × 3 columns

floodmap_df["w_class"].unique()
array(['Flooded area', 'BH080-Lake', 'BH130-Reservoir', 'BH140-River',
       'BH141-River Bank', 'BA040-Open Water', 'BH141-Stream',
       'BA010-Coastline', 'area_of_interest'], dtype=object)

That’s it! We can also plot this just to get an overview of the polygons available. Remember the last polygon is the area of interest, so plot all except that one.

# initialize figure
fig, ax = plt.subplots(figsize=(16,16))

floodmap_df.plot("w_class", ax=ax, facecolor="None", label="Flood Maps", linewidth=0.8,
                 legend=True)

ax.set(
    title="Floodmaps"
)
plt.show()
../../_images/gt_masks_generation_28_0.png

Plot - FloodMaps On Permanent Water#

We can also plot the floodmaps on the permanent water to get an overview of how much these flood maps overlap with the already placed permanent water areas.

# initialize figure

fig, ax = plt.subplots(figsize=(16,16))

# plot the permanent water
rasterioplt.show(permanent_water_image[0],transform=transform, vmin=0, vmax=3, interpolation='nearest', ax=ax)

# plot the floodmap (convert to same CRS the floodmap!)
with rasterio.open(JRC_IMG_PATH) as src:
    floodmap_df_utm = floodmap_df.to_crs(src.crs)

floodmap_df_utm.plot("source", ax=ax, facecolor="None", label="Flood Maps", linewidth=0.8, 
                     legend=True)
<AxesSubplot:>
../../_images/gt_masks_generation_30_1.png

So we can see that it’s very clear that there is some overlap between the permanent water and the flood maps, but there are also many areas that extend beyond the standard region.

Figure - FloodMaps On S2 Image#

We can actually overlay the floodmaps onto the S2 Image. Because all of the geocoordinates are consistent, this allows us to visualize all of the pieces together.

# initialize figure
fig, ax = plt.subplots(figsize=(16,16))

# plot S2 Image
plot_utils.plot_rgb_image(S2_IMG_PATH, ax=ax, size_read=800, label="S2 Image")

# plot the floodmap
floodmap_df_utm.plot("source", ax=ax, facecolor="None", label="Flood Maps", linewidth=0.8, 
                     legend=True)

plt.legend()
plt.show()
../../_images/gt_masks_generation_33_0.png

Ground Truth#

So now we’ve seen all of the bits and pieces needed to get the data required to create the ground truth: 1) S2 Image, 2) Cloud Masks, 3) FloodMaps, 4) Permanent Water. We can now create some ground truth images which can be used for our machine learning problem.

3-Class Problem#

This first ground truth type is a 3-class problem: Land, Water, Cloud. We basically fuse these four resources together to create the ground truth. So a simple logistic regression model would be able to use this. However, it would probably do poorly due to the complexity of the data.

It’s very simple to generate ground truth with the functions we created.

from ml4floods.data import create_gt
%%time

# Might need to run again if the code breaks.

metadata_floodmap = {}

gt, gt_meta = create_gt.generate_land_water_cloud_gt(                      
    S2_IMG_PATH,
    floodmap_df,
    metadata_floodmap,
    keep_streams=True,
    permanent_water_image_path=JRC_IMG_PATH
)
CPU times: user 13.4 s, sys: 8.25 s, total: 21.7 s
Wall time: 24.3 s
gt_meta
{'gtversion': 'v1',
 'encoding_values': {0: 'invalid', 1: 'land', 2: 'water', 3: 'cloud'},
 'shape': [2996, 4597],
 's2_image_path': '2021-02-15.tif',
 'permanent_water_image_path': '2021.tif',
 'cloudprob_image_path': 'None',
 'method clouds': 's2cloudless',
 'pixels invalid S2': 6472757,
 'pixels clouds S2': 250244,
 'pixels water S2': 1559933,
 'pixels land S2': 5489678,
 'pixels flood water S2': 561790,
 'pixels hydro water S2': 212124,
 'pixels permanent water S2': 786019,
 'bounds': BoundingBox(left=353000.0, bottom=4632790.0, right=398970.0, top=4662750.0),
 'crs': CRS.from_epsg(32634),
 'transform': Affine(10.0, 0.0, 353000.0,
        0.0, -10.0, 4662750.0)}

Show 3-class ground truth#

fig, ax = plt.subplots(figsize=(16,16))

plot_utils.plot_gt_v1(gt[np.newaxis], transform=gt_meta["transform"])
../../_images/gt_masks_generation_40_0.png
<AxesSubplot:>

2-Class Problem#

In this ground truth, we generate 2-band ground truth which consists of a binary classification problem for

from ml4floods.data import create_gt
%%time

# Might need to run again if the code breaks.

metadata_floodmap = {}

gt_binary, gt_meta_binary = create_gt.generate_water_cloud_binary_gt(                        
    S2_IMG_PATH,
    floodmap_df,
    metadata_floodmap=metadata_floodmap,
    keep_streams=True,
    permanent_water_image_path=JRC_IMG_PATH,
)
CPU times: user 13.7 s, sys: 2.64 s, total: 16.3 s
Wall time: 16.3 s
gt_meta_binary
{'gtversion': 'v2',
 'encoding_values': [{0: 'invalid', 1: 'clear', 2: 'cloud'},
  {0: 'invalid', 1: 'land', 2: 'water'}],
 'shape': [2996, 4597],
 's2_image_path': '2021-02-15.tif',
 'permanent_water_image_path': '2021.tif',
 'cloudprob_image_path': 'None',
 'method clouds': 's2cloudless',
 'pixels invalid S2': 6472855,
 'pixels clouds S2': 250244,
 'pixels water S2': 1564991,
 'pixels land S2': 5725159,
 'pixels flood water S2': 562505,
 'pixels hydro water S2': 219225,
 'pixels permanent water S2': 787354,
 'bounds': BoundingBox(left=353000.0, bottom=4632790.0, right=398970.0, top=4662750.0),
 'crs': CRS.from_epsg(32634),
 'transform': Affine(10.0, 0.0, 353000.0,
        0.0, -10.0, 4662750.0)}
COLORS = np.array(
    [[0, 0, 0], # invalid
    [173, 216, 230], # no cloud
    [255, 255, 255]], # cloud
    dtype=np.float32) / 255

labels = ["Invalid", "No Cloud", "Cloud",]
cmap_cat, norm_cat, patches = plot_utils.get_cmap_norm_colors(COLORS, labels)
fig, ax = plt.subplots(figsize=(16,16))

im_plt = rasterioplt.show(gt_binary[0],transform=gt_meta_binary["transform"], 
                          interpolation='nearest',ax=ax,
                          cmap=cmap_cat, norm=norm_cat)

ax.legend(
    handles=patches,
    loc='upper right',
    fontsize=15
)
ax.set_axis_off()
plt.show()
../../_images/gt_masks_generation_46_0.png
COLORS = np.array(
    [[0, 0, 0], # invalid
    [139, 64, 0], # land
    [0, 0, 139]], # water
    dtype=np.float32) / 255

labels = ["Invalid", "Land", "Flood water"]
cmap_cat, norm_cat, patches = plot_utils.get_cmap_norm_colors(COLORS, labels)
fig, ax = plt.subplots(figsize=(16,16))

im_plt = rasterioplt.show(gt_binary[1],transform=gt_meta_binary["transform"], 
                          interpolation='nearest',ax=ax,
                          cmap=cmap_cat, norm=norm_cat)

ax.legend(
    handles=patches,
    loc='upper right',
    fontsize=15
)
<matplotlib.legend.Legend at 0x7faf1f253e20>
../../_images/gt_masks_generation_48_1.png