Annotating Photo Data
Why can't ReefCloud auto-analyse photos without my input?
ReefCloud needs you to classify the features under points in a subset of your photos before it can analyse the rest of your photo data.
Benthic imagery can be used to address a wide variety of scientific questions, so there's no "one size fits all" for monitoring. For example, some users will be interested in knowing how much living hard coral there is compared to macroalgae, while others might be interested in how coral species assemblages are changing and require more detailed taxonomic information from their photos. It's important that ReefCloud can be a flexible tool, but this means it needs to understand what you want to know about your photos, and this requires a little training.
ReefCloud assigns a new machine to each project, which means your machine can be trained to your exact specifications. The "Classify Images" tool allows you to start showing your machine how you would identify the organisms under each point. As soon as you provide an identification (linking the point to a label), the machine starts to learn and tries to determine what the label for every point should be. At first, with just a few examples, the machine will misclassify some points. The more examples you give your machine, the more skilled it will become at correctly identifying benthic components, until it can work through your dataset for you. It's important to show your machine a wide variety of different benthic components to help it learn - for example, if you only classify pictures of sand, it will only be able to identify sand.
How many points do I need to annotate?
This depends on the complexity of your Label Set, the number of photos, and your scientific question. Generally, we find the machine produces excellent results after 30% of the points are annotated by you. You can check the performance of the machine for your project using the Reporting tab.
The machine will perform much better if it has seen examples from across your dataset. For example, if you have multiple transects or sites, you should annotate images from all of them, rather than just from one or two.
What should I do if an annotation point falls on my transect tape/tripod/quadrat?
If the item appears in just one or two images, you can use the "Disable Images" button under the Image Review or Classification tabs to remove the images from the analysis.
If the item occurs in more than a few photos, it's important to train your machine to recognise the item. To do this, add a category (such as "unknown", "NA", "tape" or "other") to your Label Set for the item (or items). If you do not want the item to appear in your final analyses, you'll need to later remove all the instances from the final downloaded dataset and readjust the % cover for the images where they occurred so the total cover for each image adds up to 100%.
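The readjustment step above can be sketched in a few lines of Python. This is a minimal illustration, not ReefCloud's export format: the image names and labels are made up, and you would substitute the columns from your own downloaded dataset.

```python
from collections import Counter, defaultdict

# Hypothetical downloaded points as (image, label) pairs; the names
# here are illustrative only, not ReefCloud's actual export schema.
points = [
    ("img1", "coral"), ("img1", "coral"), ("img1", "sand"),
    ("img1", "tape"), ("img1", "algae"),
    ("img2", "coral"), ("img2", "sand"), ("img2", "sand"),
    ("img2", "tape"), ("img2", "tape"),
]

# Drop points that fell on the transect tape, then recompute % cover
# per image so the remaining labels again sum to 100%.
counts = defaultdict(Counter)
for image, label in points:
    if label != "tape":
        counts[image][label] += 1

percent_cover = {
    image: {label: 100 * n / sum(c.values()) for label, n in c.items()}
    for image, c in counts.items()
}
print(percent_cover)  # img1 -> coral 50%, sand 25%, algae 25%
```

The key point is that the denominator is the number of *kept* points per image, not the original number of points, so cover still totals 100% after removal.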
What is ReefCloud doing behind the scenes?
ReefCloud crops your image into small squares centred on each point, called "patches". Patches are roughly 1/8th of the image height, or 256 by 256 pixels. You can view the patches by clicking on the "Show Points" icon at the bottom of each image under the "Classify Images" tab. The algorithm views all 65,536 pixels within that patch, extracting as much information as it can based on the colours, contrasts and neighbours of each pixel, and compressing all that information into a list of 128 numbers called a "feature vector". When you assign a label to a point, that feature vector is associated with a certain class, defined by you during training. The classifier then labels the remaining, non-human-annotated points by inference over these numbers. This is faster than traditional machine learning approaches where imagery is compared directly, but has the same accuracy. The difference in "inferencing" time between these two methodologies is huge: minutes for numbers versus hours to days for pixels. Even though it takes a little additional time to create the feature vector for each point you annotate, the benefit is that ReefCloud can easily re-inference your whole dataset whenever you want.
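To make the idea concrete, here is a deliberately simplified sketch of classifying via feature vectors. The 128-number vectors stand in for the embeddings the network extracts from each 256x256 patch, and the nearest-neighbour rule is a stand-in for the real classifier; none of this is ReefCloud's actual code.

```python
import math

def distance(a, b):
    # Euclidean distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Labelled training examples: (feature_vector, label) pairs built from
# your human annotations. Values here are toy stand-ins.
training = [
    ([0.1] * 128, "sand"),
    ([0.9] * 128, "coral"),
]

def classify(vector):
    # Comparing two 128-number vectors is far cheaper than comparing
    # two 256x256 patches pixel by pixel (65,536 values each).
    return min(training, key=lambda ex: distance(ex[0], vector))[1]

print(classify([0.85] * 128))  # nearest labelled example is "coral"
```

The speed difference the text describes comes from exactly this: once every patch is reduced to 128 numbers, re-inferencing the whole dataset only involves arithmetic on short vectors, not on raw imagery.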
This means that the way you decide on the label is quite important. If you choose exactly what is under a very small point, and do this enough times, the machine will quickly discover a relationship between the label you assign and the part of the feature vector that relates to the pixels under the point. It will then learn to annotate the remainder of your points by paying special attention to that very small central area of the patch (in the context of its surroundings). Whatever you choose to do, the most important thing is that all the annotators agree to do it very consistently across your project: because you get a fresh new machine to train each time you start a new project, it will learn the way you teach it.
What are "Transfer Learning Points"?
In addition to the points defined in your project ("human annotatable points"), ReefCloud examines additional points in each photo ("transfer learning points") up to a total of 50 points per image.
Under the “Dashboard” tab, you’ll see there is a value given for the number of human annotatable points and a second value for the number of transfer learning points. The number of human annotation points is defined during the initial project set up and remains fixed. The machine then always adds additional “transfer learning points” to each image to bring the total up to 50 points per image (i.e. if you have 5 points per image it will add 45, if you have 20 points per image it will add 30). The machine annotatable points cannot be seen under the "Classify Images" tab, but the information will be available in the downloadable data.
When you download your data you can easily remove those additional transfer learning points from your dataset (by filtering the dataset to remove the points that are “machine classified” only, excluding them from your analysis). However, a benefit of using ReefCloud is that your trained machine is assessing 50 points per image, which improves the accuracy and reliability of your dataset.
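That filtering step might look something like the sketch below. The column names (e.g. "classified_by") are illustrative stand-ins for however the real export flags machine-only points; check your downloaded file for the actual field names.

```python
# Hypothetical rows from a downloaded ReefCloud export; field names
# are assumptions for illustration, not the real export schema.
rows = [
    {"image": "img1", "point": 1, "label": "coral", "classified_by": "human"},
    {"image": "img1", "point": 2, "label": "sand",  "classified_by": "human"},
    {"image": "img1", "point": 3, "label": "sand",  "classified_by": "machine"},
    {"image": "img1", "point": 4, "label": "coral", "classified_by": "machine"},
]

# Keep only the points you annotated yourself, dropping the extra
# machine-classified transfer learning points from the analysis.
human_points = [r for r in rows if r["classified_by"] == "human"]
print(len(human_points))  # 2 of the 4 rows remain
```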
Can I change the number and distribution of points on my photos?
Unfortunately, once you've decided on the number and distribution of points in your project set up, it's not possible to adjust this inside your project. This is because of the way your AI model is created: it's developed around the points, and changing them disrupts the model. If it's important to change the number of points, it's best to create a new project from scratch.
How does ReefCloud differ from CoralNet?
The machine learning technology that ReefCloud offers matches the analytical capabilities of CoralNet, and is based around the same principles. Using feature vectors as a proxy (as opposed to comparing patches directly) minimises re-inferencing time, so the model runs a lot faster. ReefCloud also has several additional features (e.g., the ReefCloud Public Dashboard) that give monitoring teams the option to contribute their summary data to statistical models whose outputs provide an improved understanding of global and regional trends in coral reef health.
If you already use CoralNet or CPCe to analyse reef photo data, only you and your team can decide if investing time switching to a new platform is right for you. If you have previous CoralNet (or CPCe) exports, ReefCloud allows you to upload previously analysed photos along with your annotations, so you can pre-train the ReefCloud machine without having to invest time in re-annotating your data in ReefCloud.