It's important to hand-label your data

09 Dec, 2023

One thing I do a lot when training a neural network to perform some task is test myself on that task. To this end, I often end up building a Torch DataLoader that presents me, the human, with samples to label.

I've found that this is a highly effective way to get to know your dataset. You may find issues with the labels, or with how you are posing the problem to the network. And you will certainly gain an appreciation of the exact task you are training the neural net to perform, which is not always exactly what you expect.

So I encourage anyone training neural networks to pay close attention to the samples in the dataset, and to try labeling some yourself. Building a simple UI can be a fun afternoon project, and you will learn a lot about your problem. You might even dream up the next model that solves it.
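A labeling UI does not have to be elaborate. As a rough sketch (not the tool I actually use), a terminal loop might look like the following. The function name `label_by_hand` and the `ask` parameter are made up for illustration, and the code assumes your data can be iterated as `(sample, label)` pairs whose samples print sensibly as text.

```python
from typing import Any, Callable, Iterable, List, Tuple


def label_by_hand(
    samples: Iterable[Tuple[Any, str]],
    ask: Callable[[str], str] = input,
) -> Tuple[float, List[Tuple[Any, str, str]]]:
    """Show each sample to a human, record their guess, and score it.

    `samples` is any iterable of (sample, true_label) pairs.
    `ask` is a hypothetical hook: it receives a prompt and returns the
    human's answer. It defaults to the built-in input().

    Returns the human's accuracy and a list of
    (sample, true_label, guess) tuples for every disagreement.
    """
    correct = 0
    total = 0
    disagreements = []
    for sample, true_label in samples:
        # Present the sample and collect the human's label.
        guess = ask(f"{sample}\nYour label: ").strip()
        total += 1
        if guess == true_label:
            correct += 1
        else:
            # Disagreements are the interesting part: either the human
            # is wrong, the dataset label is wrong, or the task is
            # posed ambiguously.
            disagreements.append((sample, true_label, guess))
    accuracy = correct / total if total else 0.0
    return accuracy, disagreements
```

Calling `label_by_hand(pairs)` with the default `ask` prompts in the terminal. The returned disagreements are worth reading one by one, since each is either your mistake, a bad label, or a sign that the task is not quite what you thought.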