%matplotlib inline

import numpy as np
from matplotlib import pyplot as plt
from check_answer import check_answer

Satellite images are normally stored in formats such as GeoTIFF or NetCDF but here, to avoid importing extra libraries, we have saved a Modis satellite image locally using a numpy specific format .npz. We can load this file into a numpy array by doing:

arr_modis = np.load("data/modis_cube.npz")["nadir"]

arr_modis.shape

This file contains a composite image taken by the MODIS Terra and Acqua satellites over an 8-day period. The image correspond to the area of southwest France and north of Spain around mid July 2018. Each pixel has a resolution of 500 meters and there are 7 spectral bands: [red, near infra-red (nir), blue, green, short-wave infra-red (swir) 1, swir2, swir3]

Let's plot one band (0 -> red) to have quick look at the region:

plt.imshow(arr_modis[:,:,0])

The MODIS data represents reflectance values (reflected/incident proportion) for each spectral band. Reflectance is a measure that indicates the proportion of the energy at each part of the spectrum that is reflected by the Earth at each spectral band. The values are therefore in the range [0-1]. 0 represent pixel with no reflection (all energy is being absorbed) and 1 represents all energy is reflected.

These values need to be stored as floating point numbers but to save space is a common practice to use 2-bytes integers (int16 vs float32) and then rescale dividing by a large number.

Before we can interpret these images we need to convert the stored values into reflectance values [0-1.0]. To do this conversion for Modis data we use a conversion factor of 10,000. (In Python you can use scientific notation in which 1e4 == 1*10^4 == 10,000)

To scale the values into reflectance values we can perform this operation which applies the division along the whole array element-wise (remember broadcasting):

refl_modis = arr_modis / 1e4

print("Min:", arr_modis.min(), "Max:", arr_modis.max())
print("Data type:", arr_modis.dtype)
print("-----------------")
print("Min:", refl_modis.min(), "Max:", refl_modis.max())
print("Data type:", refl_modis.dtype)

As we can see, numpy automatically converts the values in the array from integers into floats to performs the division.

Now we are going to create a true colour composite, creating an image by selecting the red, green and blue components of the multispectral image.

Exercise 3.1: This list contains the names of the Modis bands in order as stored in our 7 bands cube:

[red, near infra-red (nir), blue, green, short-wave infra-red (swir) 1, swir2, swir3]

What are the indices corresponding to the red, green and blue values in the previous array? Remember Python starts counting at 0.

# Substitute the ? symbols by the correct values
answ = [?,?,?]

check_answer("3.1", answ)

So far we have seen two different ways of selecting (indexing) values in a numpy array. (1) we can slice/dice an array using the : notation to specify the start and end index along axes. (2) we can use arrays of boolean values to select certain positions.

There is a third way of selecting values in a numpy array in which specific specific positions using a list of numbers. For example in the following array:

arr = np.array([7,5,3,2,4,6])

arr[[4,0,1]]

This mode of selecting specific positions of an array is called "fancy indexing". As oppossed to boolean selection fancy indexing does not require the list of numbers to have the same size as the applied array. Also we can change the order of the values in the resulting array compared to the initial array.

Exercise 3.2: Use fancy indexing to select the red, green and blue (in this order) channels from the modis reflectance array. Use the result of the previous exercise.

answ = refl_modis[:,:,?]

check_answer("3.2", answ.shape)

The proportion of light reflected by the Earth's surface is normally low. We can have a look at the RGB values in the reflectance array to find out the maximum and minimum values.

rgb_modis = refl_modis[:,:,[0,3,2]]

rgb_modis.min(), rgb_modis.max()

Only ~4% of the incident light gets reflected in this image. So we are going to rescale this image and make its dynamic range span the [0,1] interval.

rgb_modis = rgb_modis / rgb_modis.max()

rgb_modis.min(), rgb_modis.max()
rgb_modis.shape

Now lets plot the RGB image using matplotlib's imshow function.

Note: The imshow function can display 3-dimensional arrays as RGB images using the following conventions: (1) The axis containing the channels needs to be the last. (2) If data is stored as uint8 its values need to span the whole [0-255] range. (3) If data is stored as float, its values need to span the [0,1] range.

plt.imshow(rgb_modis)

The previous image is a bit dark and it's hard to see the details in it. To improve its contrast we are going to expand the dynamic range of the lower values in the image by multiplying the array by 3. By doing this, we will lose, or clip, that larger values in the image but will get an improved representation of the lower values to visually assess the image.

plt.imshow(rgb_modis*3)

The previous image is called natural colour because it maps the RGB channels of the image to the red, green and blue. Chainging the order of the bands we can map bands of the satellite to the RGB channels differently, also including other bands outside the visible. The images resulting from alternative mapping between RGB and spectral channels are commonly known as 'false coulour'.

Exercise 3.3: Similarly to the previous image, can you generate a false colour image mapping MODIS near-infrared band into the images' red channel?

answ = refl_modis[:,:,[1,3,2]]

plt.imshow(answ*30) # *30 is the previous normalisation and scaling, all in one step (approx.)

check_answer("3.3", answ[-1,-1,:])

Mapping the near infrared (nir) channel to the red colour we produces an image showing bright red in regions with more vegetation. Different combinations of reflectance bands are used to look at different properties of the land surface.

Exercise 3.4: Try different mappings between MODIS' spectral bands and the RGB channels in the following cell.

answ = refl_modis[:,:,[?,?,?]]

plt.imshow(answ*30)

These images using colour composites offer useful visual representations of remote sensing data. Another common way to represent and analyse these images is through the use of normalised indices. These indices are normally computed using two or more spectral bands into a normalised [-1,1] range. There are multiple well-known normalising algorithms used to detect fire scars, water but probably the most famous one is the Normalised Difference Vegetation Index (NDVI) which shows live vegetation.

NDVI=(NIR-Red)/(NIR+Red)

This operation can be performed in numpy using a very similar notation using arrays:

ndvi_modis = (refl_modis[:,:,1]-refl_modis[:,:,0]) / (refl_modis[:,:,1]+refl_modis[:,:,0])

"Shape:", ndvi_modis.shape, "Max:", np.nanmax(ndvi_modis), "Min:", np.nanmin(ndvi_modis)

As opposed to the previous false/true colour images, normalised indices contain just one band. To represent the values in the array we'll need to use a predefined colour palette to map the values into different colours.

We can pass the imshow function a parameter to specify the color map to use for our NDVI index. We use summer_r, which goes from yellow to green, to represent vegetation in the image. Here you can find a list of the colour maps available.

Note: imshow can create images either from 2-d arrays and 3-d arrays with dimension 3 in the last axis. In the case of 2-d arrays we’d need to specify the colormap we want to use (or display with the default one). Also 2-d images don’t need to be scaled or normalised into a given range, the colour palette in applied dynamically using the min and max values in the array. For images with three channels the colouring is driven using the proportion of each colour specified in the channels. 3-d arrays need to be normalised [0-255] for uint8 arrays or [0-1] for float arrays.

#To make vegetation look green in the image we apply a colour palette that goes from yellow to green.
plt.imshow(ndvi_modis, cmap='summer_r')

We can adjust how the values are mapped into the colours in the palette by providing the maximum and minimum values to the imshow function. This is useful to plot certain ranges and enhance where vegetation is in an image. For example:

plt.imshow(ndvi_modis, vmin=.2, vmax=.8, cmap='summer_r')

NDWI=(Green-NIR)/(Green+NIR)

Exercise 3.5: Can you calculate the NDWI array for the previous image?

# Modis spectral bands:
# [red, near infra-red (nir), blue, green, short-wave infra-red (swir) 1, swir2, swir3]

answ = ?

plt.imshow(answ, cmap='winter_r')

check_answer("3.5", answ[-1,-1])

Exercise 3.6: Can you use slice indexing [:, :] to select and region and plot a zoomed-in area of the previous image?

plt.imshow(answ[?:?,?:?], cmap='winter_r')

MODIS generates multi-spectral images. When the number of spectral bands is large, we call those images hyperspectral. The following dataset contains hyperspectral images collected from an airplane in February 2014 as it passed over the Australian National University.

High-resolution hyper-spectral data tends to make very large files. To make processing in this tutorial easier, we have packed a numpy array with the just the [red, nir, green, blue] bands. This array is 3-dimensional with the third dimension corresponding to the spectral bands, just like the MODIS data. Let's start by loading the data:

Note: This array also needs to be converted first to reflectance values dividing it by 1e4.

arr_anu = np.load("data/anu_cube.npz")["array"]
refl_anu = arr_anu/1e4

plt.imshow(refl_anu[:,:,1])

"Shape:", refl_anu.shape

Exercise 3.7: Can you calculate the NDVI index for this image following similar steps to the previous MODIS example?

Note: We save this variable with name ndvi_anu as it will be used later on.

answ = ?
ndvi_anu = answ

plt.imshow(answ, cmap='summer_r')

check_answer("3.7", answ[0,0])

Similar to NDVI, there is another index called Green Colour Coordinate (GCC), which is an alternative method to NDVI for looking at how green vegetation is. This one does not use the near-infrared band, which is serves as a convenient alternative for sensors that only generate red, green and blue.

GCC = Green / (Red+Green+Blue)

gcc_anu = refl_anu[:,:,2]/(refl_anu[:,:,0]+refl_anu[:,:,2]+refl_anu[:,:,3])

plt.imshow(gcc_anu, cmap='summer_r')

Let's do some classification generating a series of boolean arrays for different ranges of these vegetation indices. We define the following three categories.

  • Class 1: NDVI equal or greater than 0.5 and GCC greater than 0.4 indicates green, vegetated surfaces.
  • Class 2: NDVI less than 0.5 and GCC greater than 0.4 indicates surfaces that look green but are not vegetated.
  • Class 3: NDVI more than 0.5 and GCC less or equal than 0.4 indicates vegetated surfaces that don't look green.
    Note: Remember that * corresponds to the ’AND’ logical operation for boolean arrays.
c1 = np.logical_and(ndvi_anu>=0.5, gcc_anu>0.4)
c2 = np.logical_and(ndvi_anu<0.5, gcc_anu>0.4)
c3 = np.logical_and(ndvi_anu>=0.5, gcc_anu<=0.4)

plt.imshow(c1)

imshow can also create binary representations of boolean arrays.

Now we want to stack these three categorical or boolean arrays into a 3-dimensional array to be able to plot a RGB composite image. Numpy's dstack function stacks 2-dimensional arrays along the third axis, referred as 'depth' so the prefix 'd' in the name.

class_comp = np.dstack((c1,c2,c3))

class_comp.shape, class_comp.dtype

Now, to be able to generate the RGB image from this image we need to convert it into either uint8 with range [0-255] or into float32 with range [0-1].

Exercise 3.8: Can you convert the type of the class_comp array into float32?

class_comp = class_comp.?

answ = class_comp
check_answer("3.8", answ.dtype)

Now we can use imshow to plot the RGB representation of these three classes. Remeber C1 is mapped to red, C2, into green and C3 to blue.

plt.imshow(class_comp)

You will see this leaves a fourth class in the image, of pixels that fall in none of the three classes. They show up in black. There are two surfaces that show up 'green' and one area that shows mixed black and blue bits. What are they?

This simple classification uses index thresholds, which is a crude but sometimes useful way of classifying images. In this case, it helps to finding unvegetated green surfaces, such as tennis courts and artificial grass. It is also clear that greenish water confounds the classification, however.

Exercise 3.9: Can you try to compute a similar classification image using the previous Modis image?