What Is Data Annotation?
Data annotation is the process of labeling each data point in a dataset with the actual output that a supervised machine learning model needs to predict. In practice, the user takes each available data point and manually assigns it a category or label, which the model then uses as the target to learn from.
For example, suppose the user wants to build a machine learning model that takes an image of a cat or a dog and predicts which animal the image shows. The user might feed the model 1,000 images, 500 of cats and 500 of dogs. For this to work, each image must be labeled as either a cat or a dog before it is fed to the model. Assigning each image to one of these two categories before running it through the model is what data annotation is.
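The cat-and-dog example can be sketched in a few lines of Python. This is only an illustration: the file names, the `annotate` helper, and the record layout are hypothetical, and in a real project the labels would come from a person looking at each image rather than from the file name.

```python
from collections import Counter

# Hypothetical file names standing in for the 1,000 images in the example.
cat_files = [f"cat_{i:03d}.jpg" for i in range(500)]
dog_files = [f"dog_{i:03d}.jpg" for i in range(500)]


def annotate(filenames, label):
    """Pair each file name with the label an annotator assigned to it."""
    return [{"image": name, "label": label} for name in filenames]


# The annotated dataset: every image carries exactly one of the two labels.
annotations = annotate(cat_files, "cat") + annotate(dog_files, "dog")

# Count how many images received each label to confirm the 500/500 split.
label_counts = Counter(record["label"] for record in annotations)
print(label_counts)
```

Each record pairs an input (the image) with the output the model should predict (the label), which is the form a supervised model expects its training data to take.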