The current trend in “new space” is launching tens or even hundreds of satellites with identical instruments and specifications to provide the world with an abundance of data. While more satellites are highly encouraged and definitely useful, we should also learn to make better use of what we already have.
Luckily, it’s not an either-or situation, since data fusion becomes more powerful with additional satellites while the problem of cloud cover becomes slightly less severe. The relationship is not linear: no amount of additional optical satellites can overcome the ~60–70% average global cloud cover; they can only tilt the bad odds slightly in one’s favor.
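The diminishing returns of adding satellites can be made concrete with a back-of-the-envelope calculation. A small sketch, assuming (unrealistically) that each satellite pass is clouded independently with a fixed probability; real cloud cover is spatially and temporally correlated, so actual odds are worse:

```python
# Back-of-the-envelope odds of seeing the ground at least once.
# Assumes each pass is clouded independently with probability p_cloud,
# which is a simplification: real cloud cover is correlated across days.

def p_at_least_one_clear(p_cloud: float, n_passes: int) -> float:
    """Probability that at least one of n_passes is cloud-free."""
    return 1.0 - p_cloud ** n_passes

# With ~65% average cloud cover, more passes help, but with diminishing returns:
for n in (1, 2, 4, 8):
    print(f"{n} passes -> {p_at_least_one_clear(0.65, n):.3f}")
```

Doubling the fleet never doubles the chance of a clear view, which is why the extra satellites only "tilt the odds" rather than solve the problem.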
Optical Data Fusion:
The easiest data to merge is by far imagery from two identical optical satellites in the same constellation, since they will have identical or very similar spectral bands, revisit times, and spatial resolution, which should come as no surprise. Imagine having two images that are clouded in different places and merging them into one cloud-free image. Well, it’s never that easy. The situation we encounter more often is imagery with thin clouds, so the job becomes extraction and merging. These examples are also the easiest to understand and, to some degree, to confirm visually for validity and accuracy. My eyes can’t measure the actual error between the two images, but they can quickly spot differences in intra-field variance and color, not unlike the game “Spot the Difference”, where you need to find the differences between two images. Human pattern recognition is remarkably strong, even when an uneven veil of mist and clouds makes it all more difficult. That’s an easy statement to write, so here are a couple of examples to prove my point (or disprove it, you decide!).
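The idealized per-pixel merge described above can be sketched in a few lines. This is my own minimal illustration, assuming the two images are co-registered and that binary cloud masks come from a separate cloud-detection step; as noted, real scenes with thin, semi-transparent clouds are much harder than this:

```python
import numpy as np

def merge_pair(img_a, img_b, cloud_a, cloud_b):
    """Per-pixel merge of two co-registered images.

    cloud_a / cloud_b are boolean masks (True = clouded). Pixels clear in
    both images are averaged; where only one is clear, it wins; where both
    are clouded, the result stays NaN.
    """
    out = np.full(img_a.shape, np.nan)
    both = ~cloud_a & ~cloud_b
    out[both] = (img_a[both] + img_b[both]) / 2.0
    only_a = ~cloud_a & cloud_b
    only_b = cloud_a & ~cloud_b
    out[only_a] = img_a[only_a]
    out[only_b] = img_b[only_b]
    return out

# Two tiny single-band images, clouded in different places:
img_a = np.array([[1.0, 2.0], [3.0, 4.0]])
img_b = np.array([[5.0, 6.0], [7.0, 8.0]])
cloud_a = np.array([[True, False], [False, True]])
cloud_b = np.array([[False, False], [True, True]])
merged = merge_pair(img_a, img_b, cloud_a, cloud_b)
```

The NaN pixels left over (clouded in both acquisitions) are precisely the gaps that motivate fusing more dates and more sensors.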
Anyhow, that’s not meant as evidence, since we measure error, statistical differences, and accuracy in other ways (I’ll get to that soon, I promise). But here is another example, this time with cumulus clouds and cloud shadows:
And another one (if you want to see more comparison images):
Just to clarify, this is data fusion with two satellites (Sentinel-2A and Sentinel-2B), and since they never capture images on the same day, the actual fusion happens across days on a time-series. We usually work with 15, 30, or 60 days, depending on the severity of the problem. Thanks to this time-series setup, our artificial intelligence model learns on its own what clouds are and how dense they are, including the undefined, abstract veil of mist. It would be impossible for me to define how “cloudy” a clouded pixel is by measuring any of its values or by relying on a single image. By understanding how the ten spectral bands behave individually and collectively, the model learns to remove haze, cirrus clouds, cloud shadows, planes, and the like. This is the frequency and order of the two Sentinel-2 satellites that we combine in Denmark:
As we get closer to the equator, fewer Sentinel-2 images are available due to the satellites’ sun-synchronous orbit, and at the equator we are down to roughly one image a week. There are many limitations to sticking to a few satellites from the same constellation (in this case two), so we work with as many satellites as possible. Furthermore, most users treat the two Sentinel-2 satellites as if they were identical, but in reality there are small differences in their imagery. So, to get back to my point: why work with one satellite when data fusion can combine as many satellites as you want into one useful image? And what about other constellations?
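To make the time-series framing concrete, here is a hypothetical sketch of how acquisition dates could be collected into the trailing 15-, 30-, or 60-day windows mentioned earlier. The function name and layout are my own illustration, not the actual pipeline:

```python
from datetime import date, timedelta

def window_acquisitions(acquisitions, target, window_days=30):
    """Acquisition dates within `window_days` before `target`, newest first."""
    start = target - timedelta(days=window_days)
    return sorted((d for d in acquisitions if start <= d <= target), reverse=True)

# Hypothetical Sentinel-2 pass dates over a Danish tile in March 2022:
passes = [date(2022, 3, d) for d in (2, 5, 7, 10, 12, 15, 17, 20)]
selected = window_acquisitions(passes, date(2022, 3, 20), window_days=15)
```

Closer to the equator, the same window simply contains fewer entries, which is one reason the window length has to adapt to the severity of the problem.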
Optical Data Fusion with Multiple Constellations:
The next best case is two optical satellite constellations with similar spectral bands and revisit times, as with Sentinel-2 and Landsat 8/9. Admittedly, the Landsat constellation has worse spatial resolution, a slower revisit time, and a slightly different composition of spectral bands than Sentinel-2. On the other hand, its orbit is offset from Sentinel-2’s, which provides data on days that previously had no optical coverage, and it introduces thermal measurements, something Sentinel-2 does not have. So while these two constellations are harder to merge, in practice we see a strong synergy precisely because of the differences in specifications and instruments.
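Before pixels from the two constellations can be fused, they must share a common grid. A deliberately simplified sketch: resampling a 30 m Landsat patch onto a 10 m grid with nearest-neighbour upsampling. A production pipeline would instead reproject using proper geolocation and also correct for the spectral-band differences noted above:

```python
import numpy as np

def upsample_nearest(band, factor=3):
    """Repeat each pixel `factor` times along both axes (30 m -> 10 m is factor 3)."""
    return np.asarray(band).repeat(factor, axis=0).repeat(factor, axis=1)

# A 2x2 patch of a 30 m Landsat band becomes a 6x6 patch on the 10 m grid:
landsat_patch = np.array([[0.1, 0.2], [0.3, 0.4]])
s2_grid_patch = upsample_nearest(landsat_patch, factor=3)
```

Nearest-neighbour keeps the original radiometry untouched, which is usually preferable to interpolation when the values feed a learned model rather than a human viewer.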
When the satellites are too different to use as if they were one, we have found that most people simply neglect one of the constellations for simplicity. Through data fusion, we can help these people get easier access to useful data. The differences in orbit and sensing time constantly create different input-data patterns for our artificial intelligence. Here we have Landsat and Sentinel-2 passing by on the same day while capturing different cloud cover, or, more precisely, a lack of it in the Landsat image.
In these cases, there is roughly a one-hour difference in sensing time between the two constellations, which means minimal change on the ground but possible change in cloud cover, as the clouds are in constant movement. At other times, we have useful Landsat data from the day before that happened to catch better timing. One thing is certain: weather patterns produce uncertainty and ever-changing data availability, even with a high revisit frequency. And even though March brought a lot of clear skies (at least in Denmark and Germany), we still see the need for data fusion to combine it all into consistent imagery and insights.
The weather is not always like that, so when no optical data is available, our artificial intelligence has to rely on SAR data, such as that from the Sentinel-1 satellite (C-band SAR).
SAR and Optical Data Fusion:
This is where the fun starts, where human intuition breaks down, and where the results become a bit more abstract. It’s almost impossible for us to prove the exact improvement gained by adding SAR data to the estimation; however, removing the SAR data significantly changes the prediction. The data from Sentinel-1A (a sad goodbye to Sentinel-1B) is very sparse, and it’s not easy to make good cloud-free biomass predictions from SAR data alone; reconstructing multi-spectral data from SAR alone is impossible. We use SAR data from Sentinel-1 to adjust our optical estimates, as can be seen in the example below. In the third-to-last image (a day with only SAR data), our cloudless imagery becomes more saturated with vegetation, and the day after, with partly clouded Sentinel-2 imagery, the same growth pattern continues.
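One plausible way to picture the inputs behind this SAR fallback is a per-date stack of optical bands, SAR backscatter, and a validity flag that tells the model when optical data is missing. This layout is purely my own illustration, not the actual model architecture; the adjustment of optical estimates from SAR is something the model learns, not something hand-coded like this:

```python
import numpy as np

N_OPTICAL_BANDS = 10  # the ten Sentinel-2 bands mentioned in the text

def build_input(optical, sar):
    """Stack optical bands, SAR backscatter (e.g. VV/VH), and a validity flag.

    optical: (N_OPTICAL_BANDS, H, W) array, or None when no optical data exists.
    sar:     (2, H, W) array of backscatter.
    """
    h, w = sar.shape[1:]
    if optical is None:
        optical = np.full((N_OPTICAL_BANDS, h, w), np.nan)
    valid = np.isfinite(optical[:1]).astype(float)  # 1 where optical is usable
    return np.concatenate([np.nan_to_num(optical), sar, valid])

sar = np.zeros((2, 4, 4))
x_sar_only = build_input(None, sar)              # a day with only SAR data
x_fused = build_input(np.ones((10, 4, 4)), sar)  # a day with both sources
```

The validity flag is the key design choice: it lets a single model consume both kinds of days instead of needing a separate SAR-only model.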
Somewhat the same can be seen in these new cloud-free estimations from Denmark in April 2022, showing vegetation growth without optical imagery.
Over long periods of dense cloud cover, the effect of SAR data on our predictions becomes clear. It is when neither Landsat nor Sentinel-2 provides any useful optical data that SAR data is required and utilized. And it’s not only in winter that we see close to a whole month without optical data from Sentinel-2. What’s particularly interesting is spotting changes within a field.
While the estimation is not perfect, it is fairly close to the actual situation, considering it’s based on 20 days without optical data. We even see a bare-soil patch at the beginning of the time-series that almost manages to catch up with the rest of the field, all within 20 days. Here is another case, a clouded rapeseed field in May and June 2020:
At the worst point, there were 20 days without cloud-free imagery, and the field is hard to recognize by then. Usually, in spring, summer, and autumn, it’s the combination of optical and radar imagery that makes our cloudless optical data fusion work. If we look at larger tiles, there are almost always clouds and cloud shadows present somewhere. Here is a shortened time-series where I have selected the best available cloud-free Sentinel-2 imagery between the 26th of June and the 31st of July, along with our corresponding tiles for those dates:
In this context, some parts of the image will have two or more cloud-free views, while other places are not so lucky. So both optical and radar data are needed for these larger cloud-free estimations, no matter what. Certain spots in the image have even been unlucky enough to be clouded every time the Sentinel-2 satellites pass by, and there our estimations are made mostly, but never exclusively, from SAR data. This is key, because SAR data cannot stand on its own to reliably reproduce the spectral bands of Sentinel-2.
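That uneven luck across a tile can be quantified by counting, per pixel, how many cloud-free observations a stack of cloud masks provides; pixels with zero clear views are exactly the spots where the estimate must lean on SAR. A small sketch:

```python
import numpy as np

def clear_counts(cloud_masks):
    """Per-pixel count of cloud-free observations in a (T, H, W) boolean stack."""
    return (~np.asarray(cloud_masks)).sum(axis=0)

# Two passes over a tiny 2x2 area (True = clouded):
masks = np.array([
    [[True, False], [True,  True]],
    [[True, False], [False, True]],
])
counts = clear_counts(masks)  # zero-count pixels must lean on SAR
```

In practice, a map of these counts over a whole tile makes the "unlucky spots" visible at a glance and shows where the fused product is most dependent on radar.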