Which graphical classes should be annotated on a cadastral map to automate its processing?
I think you are conflating several separate things, and calling it an “annotation ontology” probably adds to the confusion.
I agree that computer-vision (CV) algorithms need a defined set of classes to identify. I am not convinced, however, that extracting the geometries and identifying or classifying the different types should happen in one process. There is a good case for extracting the geometries in a single process and then classifying them in a second step with a CNN (Convolutional Neural Network). That keeps the CV step very narrow in its goals and leaves the CNN broad enough to cover all the specifics. Given enough consistent training data, a CNN with many more classes should be quite feasible, especially if the classes pertain to graphical properties of a scanned cadastral map, for example small circles on parcel boundaries that indicate a woody hedgerow with trees. You could then even employ a Recurrent Neural Network (RNN) to exploit some of the sequential (streaming) information in the maps, for example to determine what kind of entity was delineated in the geometry-finding CV step. Simply put: a CV-detected parcel boundary with an RNN-detected row of circles next to it is a woody parcel boundary.
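To make the two-step idea concrete, here is a minimal sketch of the last rule only. It assumes the geometry step has already produced boundary segments and that some detector has produced circle centres. A plain distance rule stands in for the RNN, so this is not a neural network, and the names `classify_boundary`, `max_offset` and `min_circles` as well as the coordinates are my own illustrative assumptions.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def classify_boundary(segment, circle_centres, max_offset=5.0, min_circles=3):
    """Label a boundary as woody when enough circles lie close to it.

    max_offset and min_circles are hypothetical thresholds (map units and count).
    """
    a, b = segment
    nearby = [c for c in circle_centres
              if point_segment_distance(c, a, b) <= max_offset]
    if len(nearby) >= min_circles:
        return "woody_parcel_boundary"
    return "parcel_boundary"

# Step 1 output (geometries) and detector output (circles), both made up.
boundaries = [((0, 0), (100, 0)), ((0, 50), (100, 50))]
circles = [(10, 3), (30, 2), (50, 4), (70, 3), (20, 80)]

labels = [classify_boundary(seg, circles) for seg in boundaries]
print(labels)  # ['woody_parcel_boundary', 'parcel_boundary']
```

The point of the split is that the geometry extraction never needs to know what a hedgerow is; only the second step combines a boundary with the symbols found near it.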
I understand that using CV to determine the types requires translating the described types, at pixel level, into hard classes, as few as possible. I did that five years ago with simple processing (based on some ImageMagick scripting) that distinguished only five colours: white for background, black for boundaries, green for text, red for buildings and blue for water. I regard that as the poor man’s take on AI: some ImageMagick scripting, without genuine neural networks and without any CV more refined than the pixel-array masks that the CIA already used in the eighties to digitise fingerprint recognition.
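For illustration, the five-colour reduction can be sketched in a few lines of plain Python. This is not my original ImageMagick script; it assigns every pixel to the nearest of five reference colours. The exact RGB values in `PALETTE` and the tiny `scan` are assumptions made up for the example.

```python
# Reference colours for the five hard classes (RGB values are assumed).
PALETTE = {
    "background": (255, 255, 255),  # white
    "boundary": (0, 0, 0),          # black
    "text": (0, 128, 0),            # green
    "building": (255, 0, 0),        # red
    "water": (0, 0, 255),           # blue
}

def nearest_class(pixel):
    """Return the class whose reference colour is closest in RGB space."""
    def squared_distance(name):
        return sum((p - q) ** 2 for p, q in zip(pixel, PALETTE[name]))
    return min(PALETTE, key=squared_distance)

def classify_image(pixels):
    """Classify a 2-D grid of RGB tuples into the five classes."""
    return [[nearest_class(px) for px in row] for row in pixels]

# A 2x2 "scan" with slightly off colours, as a real scan would have.
scan = [
    [(250, 248, 240), (20, 18, 25)],
    [(200, 40, 50), (60, 90, 210)],
]
result = classify_image(scan)
print(result)  # [['background', 'boundary'], ['building', 'water']]
```

Every pixel is forced into exactly one class, which is precisely why such a scheme needs as few hard classes as possible.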
I don’t think the term “annotation ontology” is appropriate. What you lay out is more of a taxonomy than a real ontology. Although you allow entities to be combined, such a limited set will never fully capture the breadth of entities on those maps, nor the subclasses you could specify where those features are marked differently on the maps. That kind of clarity about mutual dependencies between types is perhaps key to arriving at a global model that fits this task. Sheds are a hands-on example. Normally, sheds are drawn just like houses, and in general you can’t really distinguish between them. In some cases in the Netherlands, however, hay-sheds (grange à foin) are coloured an unsaturated golden-grey/yellow. So in some cases you can distinguish them from normal sheds by their colour, and they particularly often have a pentagonal shape. They should then be seen as a subtype of building. You could leave this division out completely, but distinguishing them is perhaps necessary for CV purposes because of their deviating colour.
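A minimal sketch of what I mean by subtypes, assuming a simple parent table: a fine label such as the hay-shed can be used where the map allows it, and collapsed to its nearest ancestor where it does not. The class names and the helpers `ancestors`, `is_a` and `coarsen` are illustrative assumptions, not part of your proposal.

```python
# Child -> parent; None marks a top-level class. The names are assumed.
PARENT = {
    "building": None,
    "house": "building",
    "shed": "building",
    "hay_shed": "shed",
    "water": None,
}

def ancestors(cls):
    """List the parents of cls from nearest to most general."""
    chain = []
    parent = PARENT[cls]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain

def is_a(cls, candidate):
    """True if cls equals candidate or is a (transitive) subtype of it."""
    return cls == candidate or candidate in ancestors(cls)

def coarsen(cls, allowed):
    """Map a label to the nearest class in `allowed`, walking up the tree."""
    for name in [cls] + ancestors(cls):
        if name in allowed:
            return name
    raise ValueError(f"no allowed ancestor for {cls!r}")

print(is_a("hay_shed", "building"))               # True
print(coarsen("hay_shed", {"building", "water"}))  # building
print(coarsen("hay_shed", {"shed", "building"}))   # shed
```

On a map series where hay-sheds carry their own colour, the allowed set can include the subtype; elsewhere the same annotations still reduce cleanly to "building".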
On the content itself, I do see a problem with the Road-network class. In my experience, roads are most often not clear at all. There are cases where the roads are indeed coloured some brownish-yellow, but in most cases they aren’t coloured in at all and can’t be distinguished by anything but their shape, and even then it is extremely hard, especially since the road network overlaps with the non-built environment: there are tree-lanes (allées), and public roads on private property, which therefore carry a parcel number.
There are also a lot of edge cases. One very common occurrence is a pier: a raised wooden causeway (jetée). This could be water, built or road-network. The combination water + road does not suffice, as it could also denote a dredged-out transect of a river, a “vaargeul” (fairway), for example for an “overzetpont” (ferry / traversier across a river), which is drawn on the map as a marked piece of the river.
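The ambiguity can be checked mechanically. The sketch below assumes each real-world entity is annotated as a set of combined base classes and reports every combination that maps to more than one entity. The entity names and their component sets are my own illustrative reading of the examples above, not a fixed scheme.

```python
from collections import defaultdict

# Entity -> combination of base classes used to annotate it (assumed).
ENTITY_COMPONENTS = {
    "pier": frozenset({"water", "road"}),
    "ferry_fairway": frozenset({"water", "road"}),
    "tree_lane": frozenset({"road", "non_built"}),
}

def ambiguous_combinations(entity_components):
    """Return the combinations that are shared by more than one entity."""
    by_combo = defaultdict(list)
    for entity, components in entity_components.items():
        by_combo[components].append(entity)
    return {combo: sorted(names)
            for combo, names in by_combo.items() if len(names) > 1}

clashes = ambiguous_combinations(ENTITY_COMPONENTS)
for combo, names in clashes.items():
    print(sorted(combo), "->", names)  # ['road', 'water'] -> ['ferry_fairway', 'pier']
```

Any combination that shows up here needs either an extra base class or an explicit subtype to be told apart.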
Also, I think that instead of roads, “dijken” (dike / dyke / digue / rempart) should be mentioned as a separate category. They are often visualised through shading and are not always apparent from the register (OAT), so classifying them with CV is crucial, I think. That works both ways: in a CV context it is important to know that those lines are shading, and conversely, those lines on the map are the only place where that entity definition is visible or deducible.