Supervised Classification in ArcMap

Common vegetation indices

Green Index: Float(Green) / Float(Red + Green + Blue)

Normalized Difference Vegetation Index “NDVI”: Float(NIR - Red) / Float(NIR + Red)

  • NDVI requires information from the near-infrared range of the electromagnetic spectrum
  • Requires a special sensor that captures that portion of the electromagnetic spectrum (Landsat, RedEdge, etc.)
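As a quick illustration of the two formulas above, here is a minimal per-pixel sketch in plain Python. The function names, the zero-denominator handling, and the sample pixel values are my own assumptions for illustration; in ArcMap you would normally build these expressions in the Raster Calculator instead. The float conversion mirrors the Float() calls in the formulas, which keep the division from being carried out on integers.

```python
def green_index(red, green, blue):
    """Green Index = Green / (Red + Green + Blue), computed in floating point."""
    total = float(red) + float(green) + float(blue)
    if total == 0:
        return 0.0  # assumption: treat all-zero (no-data) pixels as 0
    return float(green) / total


def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red)."""
    total = float(nir) + float(red)
    if total == 0:
        return 0.0  # assumption: treat all-zero (no-data) pixels as 0
    return (float(nir) - float(red)) / total


# Hypothetical 8-bit pixel values, invented for illustration only.
pixels = [
    {"name": "dense vegetation", "red": 30, "green": 90, "blue": 30, "nir": 200},
    {"name": "bare soil", "red": 120, "green": 100, "blue": 80, "nir": 140},
    {"name": "water", "red": 20, "green": 30, "blue": 50, "nir": 10},
]

for p in pixels:
    gi = green_index(p["red"], p["green"], p["blue"])
    vi = ndvi(p["nir"], p["red"])
    print(f"{p['name']}: Green Index = {gi:.3f}, NDVI = {vi:.3f}")
```

With these made-up values the vegetation pixel has an NDVI of about 0.74, while the water pixel is negative because it reflects less near-infrared than red.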

Other Vegetation Indices: Link1, Link2, and The Granddaddy Of All VI Lists

Unsupervised versus supervised classification

The unsupervised classification routine identifies groups of pixels that exhibit similar spectral response. It is up to the user to assign a meaning to each group after the image has been classified.

In supervised classification, an image is partitioned into classes based on reference or training samples supplied by the user. Classes in the resulting classified image are already assigned a meaning based on the training samples.
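To make the unsupervised case concrete, here is a minimal sketch that groups pixels purely by spectral similarity. It uses a simple one-dimensional k-means on NDVI values, which is a stand-in I chose for illustration and not the clustering routine ArcMap itself uses. The function name, the evenly spaced starting centers, and the NDVI values are all assumptions.

```python
def kmeans_1d(values, k, iterations=20):
    """Group values into k clusters by similarity; returns (cluster ids, centers)."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the initial centers evenly across the data range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break  # converged: centers stopped moving
        centers = new_centers
    return labels, centers


# Hypothetical NDVI values for eight pixels, invented for illustration only.
ndvi_values = [-0.30, -0.25, 0.06, 0.10, 0.08, 0.70, 0.75, 0.72]
labels, centers = kmeans_1d(ndvi_values, k=3)
print(labels)

# The cluster ids carry no meaning until the user assigns one after the fact.
meaning = {0: "water", 1: "bare soil", 2: "vegetation"}
print([meaning[lab] for lab in labels])
```

The routine only returns numbered groups. The `meaning` dictionary is the step the user has to supply afterwards, which is exactly the extra work that supervised classification moves to the front of the process.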

Advantages of unsupervised classification

  • No prior knowledge of the study area is required
  • Unique spectral classes are produced
  • Relatively fast and easy to perform

Disadvantages of unsupervised classification

  • Resulting classes do not necessarily represent features on the ground
  • User must spend time after the classification to assign meaning to the classes
  • Does not consider spatial relationships in the data (houses are next to roads, hardwoods next to water, etc)
  • Spectral properties may vary across the image (vegetation at the top of a hill versus the same vegetation at the bottom of the hill)

Advantages of supervised classification

  • The process groups pixels into classes that represent features you identify in your training samples
  • Training areas are reusable (lakes, fields, forests, etc)
  • We are assuming that the spectral response of a particular feature in an image will be relatively consistent throughout.

Disadvantages of supervised classification

  • Training samples may be too general and might mask unique spectral characteristics of the landscape (training samples represent “forest” while the spectral data may differentiate between conifer and deciduous)
  • User must spend time before the classification to develop homogeneous training samples
  • Results easily confounded by sloppy training samples
  • Individual samples must be homogeneous while the group of samples representing a target class must be exhaustive (must sample all shades of pine, water, or whatever classes you desire in your output image)
  • We are assuming that the spectral response of a particular feature in an image will be relatively consistent throughout.

Supervised classification

ESRI’s help describes the goal of image classification as assigning each cell in the study area to a known class or cluster. The result is a map that partitions the study area into classes that correspond either to training samples (as in supervised classification) or to naturally occurring groupings (as in unsupervised classification).

For the demo today, I will follow ESRI’s supervised classification example in their online help section titled ‘Performing supervised classification’.

WORKFLOW:

  1. Load the Image Classification toolbar
  2. Add multispectral image (RGB or NIR aerial, satellite image, vegetation indices, and others) and set the classification image target
  3. Create, evaluate, refine training samples
  4. Create signature file
  5. Execute Maximum Likelihood classification
  6. Evaluate classification results
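Steps 3 through 5 of the workflow can be sketched in plain Python to show what the classifier is doing with the training samples. This is a simplified two-band Gaussian maximum likelihood classifier with equal prior probabilities, not ArcMap's implementation. The function names, the dictionary standing in for a signature file, and the (Red, NIR) training values are all assumptions made for illustration.

```python
import math


def train_signature(samples):
    """Summarize training pixels [(band1, band2), ...] as a mean vector and 2x2 covariance."""
    n = len(samples)
    m1 = sum(s[0] for s in samples) / n
    m2 = sum(s[1] for s in samples) / n
    c11 = sum((s[0] - m1) ** 2 for s in samples) / (n - 1)
    c22 = sum((s[1] - m2) ** 2 for s in samples) / (n - 1)
    c12 = sum((s[0] - m1) * (s[1] - m2) for s in samples) / (n - 1)
    return {"mean": (m1, m2), "cov": (c11, c12, c22)}


def log_likelihood(pixel, sig):
    """Gaussian log-likelihood of a pixel under one class (constant term dropped)."""
    m1, m2 = sig["mean"]
    c11, c12, c22 = sig["cov"]
    det = c11 * c22 - c12 * c12
    d1, d2 = pixel[0] - m1, pixel[1] - m2
    # Squared Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (c22 * d1 * d1 - 2 * c12 * d1 * d2 + c11 * d2 * d2) / det
    return -0.5 * math.log(det) - 0.5 * maha


def classify(pixel, signatures):
    """Assign the pixel to the class with the highest likelihood (equal priors assumed)."""
    return max(signatures, key=lambda name: log_likelihood(pixel, signatures[name]))


# Hypothetical (Red, NIR) training samples, invented for illustration only.
training = {
    "forest": [(30, 200), (35, 190), (28, 210), (32, 205)],
    "water": [(20, 10), (22, 8), (18, 12), (25, 11)],
}
signatures = {name: train_signature(s) for name, s in training.items()}

for pixel in [(31, 198), (21, 11)]:
    print(pixel, "->", classify(pixel, signatures))
```

The covariance term is why homogeneous training samples matter: a sloppy sample inflates the spread of its class, and that class then claims pixels that belong elsewhere.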

Evaluate classification results

The US Department of the Interior, ITAP, produced a PowerPoint presentation describing the Overall, User’s, and Producer’s accuracy and the Kappa Coefficient. Their document is linked HERE. The information relevant to our discussion starts on slide 30.

The highlights are…

Overall Accuracy: the percentage of reference pixels that were classified correctly

Producer’s Accuracy: the probability that a reference pixel is correctly classified; “…how well a certain area can be classified.” It is the complement of ‘omission error’, which counts pixels that belong to the TRUTH class but were not classified into it.

User’s Accuracy: the probability that a pixel classified on the map actually represents that category on the ground. It is the complement of ‘commission error’, which counts pixels that belong to another class but are labeled as belonging to the TRUTH class.

Kappa Coefficient: a measure of agreement (or accuracy) between the remote sensing-derived classification map and the reference data, taking chance agreement into consideration; can range from -1 to 1, although values at or below 0 mean the agreement is no better than chance; values of about 0.7 or higher are generally considered satisfactory
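The four statistics above can all be computed from a confusion matrix. Here is a minimal sketch, assuming a matrix whose rows are the classified (map) classes and whose columns are the reference (TRUTH) classes; check which orientation your own table uses before reading off user's versus producer's accuracy. The function name and the counts are invented for illustration, and the sketch assumes every class has at least one classified and one reference point.

```python
def accuracy_report(matrix, classes):
    """matrix[i][j] = points classified as classes[i] whose reference class is classes[j]."""
    k = len(classes)
    n = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(k))
    row_totals = [sum(matrix[i]) for i in range(k)]  # totals per classified class
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # per reference class

    overall = correct / n
    users = {classes[i]: matrix[i][i] / row_totals[i] for i in range(k)}
    producers = {classes[j]: matrix[j][j] / col_totals[j] for j in range(k)}

    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (overall - expected) / (1 - expected)
    return overall, users, producers, kappa


# Hypothetical counts for 100 assessment points, invented for illustration only.
classes = ["forest", "water", "field"]
matrix = [
    [45, 2, 8],   # classified as forest
    [1, 28, 1],   # classified as water
    [4, 0, 11],   # classified as field
]

overall, users, producers, kappa = accuracy_report(matrix, classes)
print(f"Overall accuracy: {overall:.2f}")
print(f"Kappa: {kappa:.2f}")
for name in classes:
    print(f"{name}: user's = {users[name]:.2f}, producer's = {producers[name]:.2f}")
```

In this made-up example the overall accuracy is 0.84 and Kappa is about 0.74, yet the producer's accuracy for field is only 0.55. A respectable overall number can hide a class that is being missed almost half the time.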

The process in ArcMap is:

  1. Use the Create Accuracy Assessment Points tool to generate a set of random points and assign them a class based on reference data (or what you know to be the TRUTH)
  2. Next, run the Update Accuracy Assessment Points tool to assign the “classified as” information to the output of step 1
  3. Then, run the Compute Confusion Matrix tool to generate the classification statistics
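The logic behind these three steps can be sketched in plain Python. This is only an illustration of what the tools do conceptually, not the ArcMap tools themselves or their parameters. The function names, the tiny 4 x 4 grids, and the row/column orientation (rows are classified classes, columns are reference classes) are all assumptions.

```python
import random


def sample_points(n_rows, n_cols, count, seed=0):
    """Step 1 (sketch): pick random cell locations without replacement."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    return rng.sample(cells, count)


def confusion_matrix(points, reference, classified, classes):
    """Steps 2 and 3 (sketch): look up both labels at each point and tabulate them.

    Rows are the classified classes, columns are the reference (TRUTH) classes.
    """
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for r, c in points:
        matrix[index[classified[r][c]]][index[reference[r][c]]] += 1
    return matrix


# Hypothetical 4 x 4 rasters, invented for illustration only.
classes = ["forest", "water", "field"]
reference = [
    ["forest", "forest", "water", "water"],
    ["forest", "forest", "water", "water"],
    ["field", "field", "forest", "forest"],
    ["field", "field", "forest", "forest"],
]
classified = [
    ["forest", "forest", "water", "water"],
    ["forest", "field", "water", "water"],
    ["field", "forest", "forest", "forest"],
    ["field", "field", "forest", "forest"],
]

points = sample_points(4, 4, count=10, seed=1)
for row_name, row in zip(classes, confusion_matrix(points, reference, classified, classes)):
    print(f"classified as {row_name}: {row}")
```

Points on the diagonal are those where the classified label matches the reference label. Everything off the diagonal is either an omission or a commission error, depending on whether you read down a column or across a row.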