Preprocessing Satellite Imagery for a Machine Learning Project

Steven(Liang) Chen
May 10, 2020


My background is in software engineering. I am writing this post to share my recent experience with satellite imagery and how to use it in a machine learning project.

Section 1: Why Satellite Imagery

Satellite imagery contains information that is useful for data-related projects. A large amount of it is already available, and, more importantly, much of it is free (not all of it: some satellite imagery is commercial).

Some classic uses are crop yield prediction and object detection. Our team won the Melbourne Datathon 2019 by using machine learning to predict sugarcane yield. If you are interested in our solution, you can read this post written by my friend Stan. I also recommend this example written by ArcGIS: analyze wildfire with Sentinel-2 imagery.

There is a gap between raw imagery and data that is ready for machine learning, which is why I wanted to write this blog.

Section 2: How to access satellite imagery

There are many ways to download satellite imagery today. Here I will use Landsat-8 as an example.

choose dataset

After choosing the dataset you want, you can also add additional criteria, e.g. cloud coverage.

Additional Criteria

You can download the imagery directly from the USGS website, or you can use third-party libraries or applications. If you do not know how to use the USGS website, there are plenty of tutorials available.

Today I will show you how to use QGIS to process Landsat-8 imagery. QGIS is a free, open-source GIS application, and it has more than 500 plugins, which make it powerful and easy to use.

The Semi-Automatic Classification Plugin (SCP) is the plugin I used in this project. If you want to learn more about it, you can watch the YouTube channel of Luca Congedo, the plugin's author.

SCP Download Page

If you do not want to download the imagery to your local machine, you can also analyze satellite imagery online using platforms like Google Earth Engine (GEE). I will share my experience with GEE in another post; we used GEE for a crop yield project last year.

Section 3: Understand the imagery

Let’s continue with the Landsat-8 imagery. After you download the imagery from USGS, you need to understand the filename and the file structure. To understand the bands, you may need some remote sensing knowledge.

The image above helps explain the filename.

raw data

The attached screenshot shows the raw data downloaded from USGS. As you can see, the filename LC08_L1TP_098082_20200125_20200128_01_T1_B1.TIF encodes location information, band information, and so on. There is also metadata in text format, which we will use later. The format may differ if you choose a different satellite; this is the Landsat-8 format.
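To make the filename structure concrete, here is a minimal sketch that splits a Landsat-8 Level-1 band filename into its parts. It assumes the Collection 1 naming convention (sensor and satellite, processing level, WRS path and row, acquisition date, processing date, collection number, collection category, band). The function name `parse_landsat_name` and the dictionary keys are my own choices for illustration, not part of any library.

```python
import re
from datetime import datetime

# Assumed layout: LXSS_LLLL_PPPRRR_YYYYMMDD_yyyymmdd_CC_TX_<band>.TIF
LANDSAT_NAME = re.compile(
    r"L(?P<sensor>[A-Z])(?P<satellite>\d{2})_"      # e.g. C + 08 (OLI/TIRS, Landsat 8)
    r"(?P<level>[A-Z0-9]{4})_"                      # processing level, e.g. L1TP
    r"(?P<path>\d{3})(?P<row>\d{3})_"               # WRS-2 path and row
    r"(?P<acquired>\d{8})_(?P<processed>\d{8})_"    # acquisition / processing dates
    r"(?P<collection>\d{2})_(?P<category>[A-Z0-9]{2})_"
    r"(?P<band>[A-Z0-9]+)\.TIF"                     # e.g. B1, B10, BQA
)


def parse_landsat_name(filename):
    """Return the filename components as a dict (hypothetical helper)."""
    match = LANDSAT_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"Not a recognised Landsat-8 band filename: {filename}")
    parts = match.groupdict()
    parts["satellite"] = int(parts["satellite"])
    parts["path"] = int(parts["path"])
    parts["row"] = int(parts["row"])
    for key in ("acquired", "processed"):
        parts[key] = datetime.strptime(parts[key], "%Y%m%d").date()
    return parts


info = parse_landsat_name("LC08_L1TP_098082_20200125_20200128_01_T1_B1.TIF")
print(info["path"], info["row"], info["acquired"], info["band"])
```

For the file in the screenshot, this gives WRS path 98, row 82, an acquisition date of 25 January 2020, and band B1.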

Section 4: Process the imagery

In this section, I will show how I use SCP, a QGIS plugin, to do atmospheric correction. Unzip the file you downloaded from USGS, then choose the MTL file and the image folder in SCP, and you will see the following.

SCP
before atmospheric correction
after atmospheric correction
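If you are curious how the MTL file is used, the sketch below reads it and converts raw pixel values (digital numbers) to top-of-atmosphere (TOA) reflectance with the standard USGS rescaling formula. Please note this is only the rescaling step and does not reproduce the correction that SCP applies inside QGIS. The helper names `parse_mtl` and `toa_reflectance` are hypothetical, and the `SUN_ELEVATION` in the sample text is a made-up value chosen to keep the arithmetic easy to follow.

```python
import math

# Illustrative MTL excerpt. The key names follow the Landsat-8 Level-1 MTL
# format; SUN_ELEVATION here is an invented value, not from a real scene.
SAMPLE_MTL = """
GROUP = L1_METADATA_FILE
  GROUP = IMAGE_ATTRIBUTES
    CLOUD_COVER = 1.25
    SUN_ELEVATION = 30.0
  END_GROUP = IMAGE_ATTRIBUTES
  GROUP = RADIOMETRIC_RESCALING
    REFLECTANCE_MULT_BAND_4 = 2.0000E-05
    REFLECTANCE_ADD_BAND_4 = -0.100000
  END_GROUP = RADIOMETRIC_RESCALING
END_GROUP = L1_METADATA_FILE
END
"""


def parse_mtl(text):
    """Collect every 'KEY = value' pair of an MTL file into a flat dict."""
    meta = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip()
        # Skip lines without '=' and the GROUP / END_GROUP markers.
        if not sep or key in ("GROUP", "END_GROUP"):
            continue
        meta[key] = value.strip().strip('"')
    return meta


def toa_reflectance(dn, band, meta):
    """TOA reflectance: (M * DN + A) / sin(sun elevation)."""
    mult = float(meta[f"REFLECTANCE_MULT_BAND_{band}"])
    add = float(meta[f"REFLECTANCE_ADD_BAND_{band}"])
    sun_elevation = math.radians(float(meta["SUN_ELEVATION"]))
    return (mult * dn + add) / math.sin(sun_elevation)


meta = parse_mtl(SAMPLE_MTL)
print(toa_reflectance(10000, band=4, meta=meta))
```

In a real workflow you would read the MTL text from the `_MTL.txt` file in the unzipped folder and apply the same formula to every pixel of a band, for example with a raster library of your choice.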

At this stage, you can use any band combination you want, depending on your purpose. Here are some commonly used band combinations.

Band Combinations

4–3–2 Band Combination example:

4–3–2 True color
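As a small illustration of what a band combination means in practice, the sketch below maps a few commonly cited Landsat-8 combinations to the band files you would load into the red, green, and blue channels. The dictionary keys and the helper `composite_files` are hypothetical names I made up for this example, and the list of combinations is not exhaustive.

```python
# Each tuple is (red channel, green channel, blue channel) as Landsat-8 band numbers.
BAND_COMBINATIONS = {
    "natural_color": (4, 3, 2),      # red, green, blue
    "color_infrared": (5, 4, 3),     # near infrared, red, green
    "false_color_urban": (7, 6, 4),  # SWIR 2, SWIR 1, red
    "agriculture": (6, 5, 2),        # SWIR 1, near infrared, blue
}


def composite_files(scene_id, combination):
    """Return the band filenames, in R-G-B order, for a named combination."""
    bands = BAND_COMBINATIONS[combination]
    return [f"{scene_id}_B{band}.TIF" for band in bands]


scene = "LC08_L1TP_098082_20200125_20200128_01_T1"
for name in composite_files(scene, "natural_color"):
    print(name)
```

For the 4–3–2 case this lists the B4, B3, and B2 files in that order, which is the order you would assign them to the red, green, and blue channels.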

If you need a smaller or larger area, you may need to either clip or stitch the images in QGIS. Please bear in mind that QGIS is not the only way to prepare your remote sensing data.
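To show the idea behind clipping outside QGIS, here is a minimal sketch in plain Python. It converts a bounding box in map coordinates into a pixel window and slices a 2D grid with it. It assumes a north-up raster with square pixels (30 m for the Landsat-8 multispectral bands); the function names, the origin coordinates, and the tiny grid are all invented for illustration. A real workflow would use a raster library instead of nested lists.

```python
import math


def bbox_to_window(origin_x, origin_y, pixel_size, min_x, min_y, max_x, max_y):
    """Convert a map-coordinate bounding box to (row_off, col_off, height, width).

    origin_x / origin_y is the top-left corner of the raster. Rows grow
    downward, so the row offset is measured from the top (max_y) edge.
    """
    col_off = math.floor((min_x - origin_x) / pixel_size)
    row_off = math.floor((origin_y - max_y) / pixel_size)
    col_end = math.ceil((max_x - origin_x) / pixel_size)
    row_end = math.ceil((origin_y - min_y) / pixel_size)
    return row_off, col_off, row_end - row_off, col_end - col_off


def clip(grid, window):
    """Slice a 2D list of pixel values with a (row_off, col_off, height, width) window."""
    row_off, col_off, height, width = window
    return [row[col_off:col_off + width] for row in grid[row_off:row_off + height]]


# A made-up 4 x 6 raster whose values encode (row, column) as row * 10 + column.
grid = [[r * 10 + c for c in range(6)] for r in range(4)]

window = bbox_to_window(
    origin_x=300000, origin_y=7000000, pixel_size=30,
    min_x=300060, min_y=6999910, max_x=300150, max_y=6999970,
)
print(window)
print(clip(grid, window))
```

Stitching is the reverse idea: you place several such grids into one larger grid according to their origins.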

Section 5: Useful resources

Here are some materials that I found useful while writing this blog.

Complete Google Earth Engine for Remote Sensing & GIS: https://www.udemy.com/course/complete-google-earth-engine-for-remote-sensing-gis/

Satellite Remote Sensing Data Bootcamp With Opensource Tools: https://www.udemy.com/course/satellite-remote-sensing-data-bootcamp-with-opensource-tools/

QGIS For GIS Professionals: https://www.udemy.com/course/qgis-for-gis-professionals/

Making Sense of Satellite Data: An Open Source Workflow: https://medium.com/@robsimmon/making-sense-of-satellite-data-an-open-source-workflow-accessing-data-8f7f3c30f151

I am a beginner in remote sensing. Let me know if you find any issues in this post; I will keep updating it.

I will continue to use the preprocessed data in the following posts (while I am working on this ML project).
