# Train, Validation and Test Sets

### Before You Begin

Let's establish a common understanding of a few terms in the machine learning world. The next few sections are for anyone who needs to know the difference between the various dataset splits used while training machine learning models. For a deeper understanding, refer to a great article [here](https://machinelearningmastery.com/difference-test-validation-datasets/).

A few key terms to get familiar with in the context of Jasper:

{% hint style="success" %}
**Training Dataset**: The sample of data used to train an algorithm to build the model.
{% endhint %}

{% hint style="success" %}
**Validation Dataset**: The sample of data used to tell how well the model performs its prediction or classification on data that it has not seen so far. Results from the validation set indicate whether the model needs to be trained on more data.
{% endhint %}

{% hint style="success" %}
**Test Dataset**: The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset. The test set is well curated: it should contain sampled data that spans the various cases the model would face when used in the real world.
{% endhint %}

:man\_mage: In short, the validation set is used to fine-tune the model, whereas the test set is used to measure the performance of your final model.

## Data Split Ratio

Before building any model, once the data is prepared, it is split into **Train** and **Test** sets. After this, from the Train set, randomly choose X% to be the actual **Train** set and the remaining (100-X)% to be the **Validation** set, where X is a fixed number (say 80).

![](https://2421228421-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M8iPmNJ0p6UOoY4DlPx%2F-MA5eKGYeh6kVvQGAnGr%2F-MA5ezGnQTUk_cOlSJ07%2Fimage.png?alt=media\&token=a6b48544-b7fd-4c9f-8367-a695c46e168f)
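The two-step split described above can be sketched in plain Python. This is a minimal illustration, not Jasper's actual implementation: the function name `split_dataset`, the 20% test fraction, and the fixed seed are assumptions made for the example; only the 80% train share (X) comes from the text.

```python
import random


def split_dataset(samples, test_fraction=0.2, train_percent=80, seed=42):
    """Split samples into (train, validation, test) lists.

    test_fraction and seed are illustrative assumptions;
    train_percent is the X from the text above.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 0 and 100")

    # Shuffle a copy so the caller's data is left untouched.
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)

    # Step 1: hold out the test set.
    n_test = round(len(shuffled) * test_fraction)
    test = shuffled[:n_test]
    remaining = shuffled[n_test:]

    # Step 2: split what is left into train (X%) and validation (100-X%).
    n_train = round(len(remaining) * train_percent / 100)
    train = remaining[:n_train]
    validation = remaining[n_train:]
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))  # 64 16 20
```

With 100 samples, 20 are held out for testing, and the remaining 80 are divided 80/20 into 64 training and 16 validation samples. The three sets never overlap, which is what keeps the validation and test results meaningful.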

**Why is the Validation Set important?**

Model performance is computed by running the model against the validation set. The **validation set** is used for tuning the parameters of a model. See more about performance metrics in Jasper in [Build Your Model](https://jasper-docs.psionix.dev/training-your-data#accuracy-scores).
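As a toy illustration of tuning against the validation set, the sketch below picks a decision threshold by validation accuracy and only then scores it on the test set. Everything here is hypothetical (the data, the candidate thresholds, and the helper names `accuracy` and `pick_threshold`); it does not reflect how Jasper computes its scores.

```python
def accuracy(threshold, samples):
    """Fraction of (value, label) pairs classified correctly by `value >= threshold`."""
    correct = sum((value >= threshold) == label for value, label in samples)
    return correct / len(samples)


def pick_threshold(candidates, validation):
    """Return the candidate threshold with the highest validation accuracy."""
    return max(candidates, key=lambda t: accuracy(t, validation))


# Hypothetical data: each sample is (value, true_label).
validation = [(0.1, False), (0.4, False), (0.6, True), (0.9, True)]
test = [(0.2, False), (0.45, False), (0.55, True), (0.8, True)]

# Tune on the validation set only.
best = pick_threshold([0.3, 0.5, 0.7], validation)

# Report the final performance on the untouched test set.
print(best, accuracy(best, test))  # 0.5 1.0
```

The test set plays no part in choosing the threshold, so its score remains an unbiased estimate of how the tuned model behaves on unseen data.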

