Progress in computer vision algorithms is strongly driven by datasets and benchmarks. Benchmarks are often run as annual competitions in which the latest algorithms are compared against each other. One of the most influential ones to date is the ImageNet challenge, in which participants compete in any of three categories:

Classification:
Given an image of an object, figure out what that object is.

Localization:
Find where the object is in the image and draw a bounding box around it.

Detection:
Find all of the objects in the image, draw a bounding box around each object, and classify each of them.
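
One way to make the difference between these three tasks concrete is to look at the shape of their outputs. The sketch below is illustrative only: the `Detection` class, the `(x_min, y_min, x_max, y_max)` box convention, and the example labels and coordinates are assumptions made here, not part of the ImageNet challenge definition.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    """One detected object: a class label plus its bounding box (hypothetical type)."""
    label: str
    box: Box


# Classification: a single label for the whole image.
classification_output: str = "dog"

# Localization: a single box saying where that object is.
localization_output: Box = (40, 30, 200, 180)

# Detection: a variable-length list of labelled boxes, one per object found.
detection_output: List[Detection] = [
    Detection("dog", (40, 30, 200, 180)),
    Detection("ball", (210, 150, 250, 190)),
]


def box_area(box: Box) -> int:
    """Area of a box in pixels; degenerate boxes have area 0."""
    x_min, y_min, x_max, y_max = box
    return max(0, x_max - x_min) * max(0, y_max - y_min)


print(classification_output)
print(box_area(localization_output))
print([d.label for d in detection_output])
```

The point of the sketch is that classification produces one value per image, localization adds a position, and detection produces a list whose length depends on the image content.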

The localization and detection tasks above describe objects only by bounding boxes. It may be important to find the shape of the object as well, which the following segmentation tasks address:

Instance Segmentation:
Find where each object is in an image and, within its bounding box, classify each pixel as belonging to the object or not.
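
To illustrate how a per-pixel mask carries more shape information than a bounding box, the following sketch uses a toy instance mask stored as nested lists. The mask values, the helper names `mask_to_box` and `pixel_count`, and the inclusive `(x_min, y_min, x_max, y_max)` convention are assumptions chosen for this example.

```python
from typing import List, Tuple

# Toy instance mask: 0 = no object, 1 and 2 = two separate objects.
instance_mask: List[List[int]] = [
    [0, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 2, 2],
    [0, 1, 0, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]


def mask_to_box(mask: List[List[int]], instance_id: int) -> Tuple[int, int, int, int]:
    """Tightest box (x_min, y_min, x_max, y_max), inclusive, around one instance."""
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value == instance_id
    ]
    if not coords:
        raise ValueError(f"instance {instance_id} not present in mask")
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def pixel_count(mask: List[List[int]], instance_id: int) -> int:
    """Number of pixels that belong to the given instance."""
    return sum(value == instance_id for row in mask for value in row)


for instance_id in (1, 2):
    x_min, y_min, x_max, y_max = mask_to_box(instance_mask, instance_id)
    box_pixels = (x_max - x_min + 1) * (y_max - y_min + 1)
    print(instance_id, pixel_count(instance_mask, instance_id), box_pixels)
```

In this toy mask the cross-shaped object 1 covers 5 of the 9 pixels in its box, while the rectangular object 2 fills its box completely, so only the mask distinguishes the two shapes.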

Semantic Segmentation:
Given an image, assign each pixel to a class. These classes can include regions such as sky or background, and separate objects of the same class are not distinguished from one another.

Panoptic Segmentation:
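A combination of instance and semantic segmentation: every pixel receives a class label, and pixels that belong to individual objects also receive an instance identity.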
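
As a minimal sketch of how the two kinds of segmentation can be combined, the code below pairs a toy semantic mask with a toy instance mask so that each pixel carries a `(class_id, instance_id)` tuple. The class IDs (0 = background, 1 = sky, 2 = car), the use of instance ID 0 for "no individual object", and the function name `to_panoptic` are assumptions for illustration, not a standard encoding.

```python
from typing import List, Set, Tuple

# Assumed class IDs: 0 = background, 1 = sky, 2 = car.
semantic_mask: List[List[int]] = [
    [1, 1, 1, 1, 1],
    [0, 2, 2, 0, 2],
    [0, 2, 2, 0, 2],
]

# Assumed instance IDs: 0 = not an individual object, 1 and 2 = two cars.
instance_mask: List[List[int]] = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 1, 1, 0, 2],
]


def to_panoptic(
    semantic: List[List[int]], instance: List[List[int]]
) -> List[List[Tuple[int, int]]]:
    """Pair each pixel's class ID with its instance ID."""
    return [
        [(class_id, inst_id) for class_id, inst_id in zip(sem_row, inst_row)]
        for sem_row, inst_row in zip(semantic, instance)
    ]


panoptic = to_panoptic(semantic_mask, instance_mask)

# Each distinct (class_id, instance_id) pair is one segment of the image.
segments: Set[Tuple[int, int]] = {pixel for row in panoptic for pixel in row}
print(sorted(segments))
```

Here the semantic mask alone cannot tell the two cars apart, and the instance mask alone says nothing about sky or background; the paired representation keeps both pieces of information for every pixel.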