## Download the data from Kaggle

Go to Kaggle and create an API token (the `kaggle.json` file) so that you can download the data. [This guide](https://www.kaggle.com/general/74235) explains how.

The cells below prompt you to upload the `kaggle.json` file you downloaded from Kaggle, then set up the credentials and download the data.

In [1]:
! pip install -q kaggle

import os
from google.colab import files

if not os.path.exists("/root/.kaggle/kaggle.json"):
  files.upload()
  ! mkdir -p ~/.kaggle
  ! mv kaggle.json ~/.kaggle/ 
  ! chmod 600 ~/.kaggle/kaggle.json

Saving kaggle.json to kaggle.json
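
If the Kaggle CLI later complains about credentials, it can help to inspect the uploaded file directly. The sketch below is not part of the original notebook: it assumes the usual `kaggle.json` layout with `username` and `key` fields, and `check_kaggle_credentials` is a hypothetical helper name.

```python
import json
import os
import stat


def check_kaggle_credentials(path):
    """Return a list of problems found in a kaggle.json file (empty if none)."""
    if not os.path.isfile(path):
        return [f"{path} does not exist"]
    try:
        with open(path, "r") as f:
            creds = json.load(f)
    except json.JSONDecodeError:
        return [f"{path} is not valid JSON"]
    if not isinstance(creds, dict):
        return [f"{path} does not contain a JSON object"]

    problems = []
    # Assumption: the token file stores exactly these two fields
    for field in ("username", "key"):
        if not creds.get(field):
            problems.append(f"missing or empty field: {field}")
    # The cell above runs chmod 600, so no group/other permission bits should be set
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(f"permissions {oct(mode)} are too open; expected 0o600")
    return problems


print(check_kaggle_credentials(os.path.expanduser("~/.kaggle/kaggle.json")))
```

An empty list means the file looks usable; it does not prove that the token itself is still valid on Kaggle's side.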


Check that the credentials work by listing some datasets:

In [2]:
! kaggle datasets list

ref                                                                 title                                                size  lastUpdated          downloadCount  
------------------------------------------------------------------  --------------------------------------------------  -----  -------------------  -------------  
anikannal/solar-power-generation-data                               Solar Power Generation Data                           2MB  2020-08-18 15:52:03            830  
ruchi798/bookcrossing-dataset                                       Book-Crossing: User review ratings                   25MB  2020-08-11 10:40:25            213  
nehaprabhavalkar/av-healthcare-analytics-ii                         AV : Healthcare Analytics II                          7MB  2020-08-29 03:40:10            350  
Cornell-University/arxiv                                            arXiv Dataset                                       880MB  2020-08-27 23:07:17           2298  
imoore/60k-stack

Download the Food-101 dataset:


In [3]:
if not os.path.exists('/content/food-101'):
  !kaggle datasets download -d dansbecker/food-101
  !unzip -o food-101.zip
  !rm food-101.zip

Streaming output truncated to the last 5000 lines.
  inflating: food-101/food-101/images/tacos/1033196.jpg  
  inflating: food-101/food-101/images/tacos/1036030.jpg  
  inflating: food-101/food-101/images/tacos/1042175.jpg  
  inflating: food-101/food-101/images/tacos/1044043.jpg  
  inflating: food-101/food-101/images/tacos/1058697.jpg  
  inflating: food-101/food-101/images/tacos/1059239.jpg  
  inflating: food-101/food-101/images/tacos/1059326.jpg  
  inflating: food-101/food-101/images/tacos/1066762.jpg  
  inflating: food-101/food-101/images/tacos/1070967.jpg  
  inflating: food-101/food-101/images/tacos/1073468.jpg  
  inflating: food-101/food-101/images/tacos/1075296.jpg  
  inflating: food-101/food-101/images/tacos/1085243.jpg  
  inflating: food-101/food-101/images/tacos/108529.jpg  
  inflating: food-101/food-101/images/tacos/1086014.jpg  
  inflating: food-101/food-101/images/tacos/108945.jpg  
  inflating: food-101/food-101/images/tacos/1089575.jpg  
  inflati

## Install packages
We have to install some extra packages:

In [4]:
!pip install albumentations==0.4.6 pytorch-lightning==0.9.0

Collecting albumentations==0.4.6
  Downloading https://files.pythonhosted.org/packages/92/33/1c459c2c9a4028ec75527eff88bc4e2d256555189f42af4baf4d7bd89233/albumentations-0.4.6.tar.gz (117kB)
     |████████████████████████████████| 122kB 6.8MB/s
Collecting pytorch-lightning==0.9.0
  Downloading https://files.pythonhosted.org/packages/ed/af/2f10c8ee22d7a05fe8c9be58ad5c55b71ab4dd895b44f0156bfd5535a708/pytorch_lightning-0.9.0-py3-none-any.whl (408kB)
     |████████████████████████████████| 409kB 13.0MB/s
Collecting imgaug>=0.4.0
  Downloading https://files.pythonhosted.org/packages/66/b1/af3142c4a85cba6da9f4ebb5ff4e21e2616309552caca5e8acefe9840622/imgaug-0.4.0-py2.py3-none-any.whl (948kB)
     |████████████████████████████████| 952kB 17.7MB/s
Collecting future>=0.17.1
  Downloading https://files.pythonhosted.org/packages/45/0b/38b06fd9b92dc2b68d58b75f900e97884c45bedd2ff83203d933cf5851c9/future-0.18.2.tar.gz (829kB)
     |█████████████████████████

## Dataset Preparation

We are going to use 21 of the dataset's classes. Run the code below to copy their images into `train` and `test` folders, following the split listed in `meta/train.txt` and `meta/test.txt`:

In [5]:
import os.path as osp
from shutil import copyfile
from tqdm.notebook import tqdm

classes = [
    "apple_pie",
    "bruschetta",
    "caesar_salad",
    "steak",
    "spring_rolls",
    "spaghetti_carbonara",
    "frozen_yogurt",
    "falafel",
    "mussels",
    "ramen",
    "onion_rings",
    "oysters",
    "risotto",
    "waffles",
    "cup_cakes",
    "grilled_cheese_sandwich",
    "fried_calamari",
    "huevos_rancheros",
    "croque_madame",
    "bread_pudding",
    "dumplings",
]
data_root = '/content/food-101/food-101'
assert osp.isdir(data_root)
assert "images" in os.listdir(data_root)
assert "meta" in os.listdir(data_root)
os.makedirs(osp.join(data_root, "train"), exist_ok=True)
os.makedirs(osp.join(data_root, "test"), exist_ok=True)
for cls_name in classes:
    os.makedirs(osp.join(data_root, "train", cls_name), exist_ok=True)
    os.makedirs(osp.join(data_root, "test", cls_name), exist_ok=True)
with open(osp.join(data_root, "meta", "train.txt"), "r") as file:
    for image in tqdm(file):
        image = image.rstrip()
        if image.split("/")[0] in classes:
            copyfile(
                osp.join(data_root, "images", image + ".jpg"),
                osp.join(data_root, "train", image + ".jpg"),
            )
with open(osp.join(data_root, "meta", "test.txt"), "r") as file:
    for image in tqdm(file):
        image = image.rstrip()
        if image.split("/")[0] in classes:
            copyfile(
                osp.join(data_root, "images", image + ".jpg"),
                osp.join(data_root, "test", image + ".jpg"),
            )
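
A quick sanity check after the split is to count the copied images per class. The sketch below assumes the `train/<class>/*.jpg` and `test/<class>/*.jpg` layout created by the cell above; `count_images` is a hypothetical helper and not part of the original notebook. The standard Food-101 split has 750 training and 250 test images per class, so those are the counts to expect.

```python
import os


def count_images(split_dir):
    """Map each class sub-directory of split_dir to its number of .jpg files."""
    counts = {}
    for cls_name in sorted(os.listdir(split_dir)):
        cls_dir = os.path.join(split_dir, cls_name)
        if os.path.isdir(cls_dir):
            counts[cls_name] = sum(
                1 for name in os.listdir(cls_dir) if name.lower().endswith(".jpg")
            )
    return counts


data_root = "/content/food-101/food-101"  # same path as in the cell above
for split in ("train", "test"):
    split_dir = os.path.join(data_root, split)
    if os.path.isdir(split_dir):  # skipped silently when the data is not there
        counts = count_images(split_dir)
        print(split, len(counts), "classes,", sum(counts.values()), "images")
```

A class with a count of zero usually means its name in `classes` does not match a folder name under `images`.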



## Apex Installation
PyTorch 1.6.0 ships its own AMP (`torch.cuda.amp`) for mixed-precision training. However, we are going to use Apex. Run the cell below to install it:

In [6]:
!git clone https://github.com/NVIDIA/apex
%cd apex
!pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
%cd ..

Cloning into 'apex'...
remote: Enumerating objects: 7431, done.
remote: Total 7431 (delta 0), reused 0 (delta 0), pack-reused 7431
Receiving objects: 100% (7431/7431), 13.91 MiB | 13.73 MiB/s, done.
Resolving deltas: 100% (5023/5023), done.
/content/apex
  cmdoptions.check_install_build_global(options)
Created temporary directory: /tmp/pip-ephem-wheel-cache-i2m8ug50
Created temporary directory: /tmp/pip-req-tracker-5zyf_ul2
Created requirements tracker '/tmp/pip-req-tracker-5zyf_ul2'
Created temporary directory: /tmp/pip-install-hid3au93
Processing /content/apex
  Created temporary directory: /tmp/pip-req-build-usombg2r
  Added file:///content/apex to build tracker '/tmp/pip-req-tracker-5zyf_ul2'
    Running setup.py (path:/tmp/pip-req-build-usombg2r/setup.py) egg_info for package from file:///content/apex
    Running command python setup.py egg_info


    torch.__version__  = 1.6.0+cu101


    running egg_info
    creating /tmp/pip-req-build-usombg2r/pip-egg-info/apex.egg-info
 

## Model Definition

### Augmentations
We use the Albumentations library for augmentations:


In [7]:
import albumentations as A
import numpy as np
from albumentations.pytorch import ToTensorV2

def get_training_augmentation():
    augmentations_train = A.Compose(
        [
            A.RandomResizedCrop(224, 224, scale=(0.8, 1.0)),
            A.HorizontalFlip(),
            A.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
            ToTensorV2(),
        ],
    )
    return lambda img: augmentations_train(image=np.array(img))


def get_test_augmentation():
    augmentations_val = A.Compose(
        [
            A.SmallestMaxSize(256),
            A.CenterCrop(224, 224),
            A.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
            ToTensorV2(),
        ],
    )
    return lambda img: augmentations_val(image=np.array(img))
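
To make the `A.Normalize` step concrete, here is the per-pixel arithmetic written out in plain Python. This is a sketch under the assumption that `A.Normalize` is used with its default `max_pixel_value=255.0`, in which case each channel becomes `(value / 255 - mean) / std`; `normalize_pixel` is a hypothetical helper, not an Albumentations function.

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)  # same values as in the cell above
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD, max_pixel_value=255.0):
    """Normalize one 8-bit RGB pixel channel by channel."""
    return tuple(
        (value / max_pixel_value - m) / s for value, m, s in zip(rgb, mean, std)
    )


# A black and a white pixel mark the two ends of the normalized range
print(normalize_pixel((0, 0, 0)))
print(normalize_pixel((255, 255, 255)))
```

The mean and standard deviation are the usual ImageNet statistics, which matches the ImageNet-pretrained ResNet weights loaded later in the notebook.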


### Extra Losses
Some of our tricks require the definition of custom loss functions:
 

In [8]:
# MIT License
# Copyright (c) 2018 Haitong Li


import torch
import torch.nn as nn
import torch.nn.functional as F


# Based on https://github.com/peterliht/knowledge-distillation-pytorch/blob/master/model/net.py
class KnowledgeDistillationLoss(nn.Module):
    def __init__(self, alpha, T, criterion):
        super().__init__()
        self.criterion = criterion
        self.KLDivLoss = nn.KLDivLoss(reduction="batchmean")
        self.alpha = alpha
        self.T = T

    def forward(self, input, target, teacher_target):
        loss = self.KLDivLoss(
            F.log_softmax(input / self.T, dim=1),
            F.softmax(teacher_target / self.T, dim=1),
        ) * (self.alpha * self.T * self.T) + self.criterion(input, target) * (
            1.0 - self.alpha
        )
        return loss


class MixUpAugmentationLoss(nn.Module):
    def __init__(self, criterion):
        super().__init__()
        self.criterion = criterion

    def forward(self, input, target, *args):
        # Validation step
        if isinstance(target, torch.Tensor):
            return self.criterion(input, target, *args)
        target_a, target_b, lmbd = target
        return lmbd * self.criterion(input, target_a, *args) + (
            1 - lmbd
        ) * self.criterion(input, target_b, *args)


# Based on https://github.com/pytorch/pytorch/issues/7455
class LabelSmoothingLoss(nn.Module):
    def __init__(self, n_classes, smoothing=0.0, dim=-1):
        super(LabelSmoothingLoss, self).__init__()
        self.confidence = 1.0 - smoothing
        self.smoothing = smoothing
        self.cls = n_classes
        self.dim = dim

    def forward(self, output, target, *args):
        output = output.log_softmax(dim=self.dim)
        with torch.no_grad():
            # Create a matrix of shape batch_size x n_classes
            true_dist = torch.zeros_like(output)
            # Initialize all elements with epsilon / (N - 1)
            true_dist.fill_(self.smoothing / (self.cls - 1))
            # Fill correct class for each sample in the batch with 1 - epsilon
            true_dist.scatter_(1, target.data.unsqueeze(1), self.confidence)
        return torch.mean(torch.sum(-true_dist * output, dim=self.dim))
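
The arithmetic behind `LabelSmoothingLoss` and `MixUpAugmentationLoss` can be followed on a single sample without PyTorch. The sketch below is an illustration only: the function names and the example logits are made up, and it mirrors the formulas in the classes above rather than replacing them.

```python
import math


def smoothed_target(n_classes, true_class, smoothing):
    """Target distribution: 1 - smoothing on the true class, the rest spread evenly."""
    off_value = smoothing / (n_classes - 1)
    target = [off_value] * n_classes
    target[true_class] = 1.0 - smoothing
    return target


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]


def smoothed_cross_entropy(logits, true_class, smoothing):
    """Cross-entropy between the smoothed target and the softmax of the logits."""
    target = smoothed_target(len(logits), true_class, smoothing)
    return -sum(t * lp for t, lp in zip(target, log_softmax(logits)))


def mixup_loss(logits, class_a, class_b, lmbd, smoothing):
    """Blend the losses for the two mixed labels with weight lmbd."""
    loss_a = smoothed_cross_entropy(logits, class_a, smoothing)
    loss_b = smoothed_cross_entropy(logits, class_b, smoothing)
    return lmbd * loss_a + (1.0 - lmbd) * loss_b


logits = [2.0, 0.5, -1.0]  # made-up logits for a 3-class example
print(smoothed_target(3, 0, 0.2))
print(smoothed_cross_entropy(logits, 0, 0.2))
print(mixup_loss(logits, 0, 1, 0.7, 0.2))
```

With `smoothing=0.0` the smoothed loss reduces to ordinary cross-entropy, and with `lmbd=1.0` the mixup loss reduces to the loss on the first label alone.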


### Teacher model weights
If you are going to run the experiment with Knowledge Distillation (KD), you need weights for a teacher model. We trained a ResNet-50 and used it as the teacher. Run the cell below to download our weights. You can skip this step if you don't want to use KD, or if you have your own pretrained teacher weights.

**Note:** Training with KD takes longer, because every batch also passes through the teacher model.
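
The temperature in `KnowledgeDistillationLoss` divides the logits before the softmax, which flattens the teacher's distribution so that the student also sees how the teacher ranks the wrong classes. The sketch below shows this effect in plain Python; the logits are made-up values and `softmax_with_temperature` is a hypothetical helper.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


teacher_logits = [8.0, 2.0, -1.0]  # made-up values for illustration
for temperature in (1, 5, 20):
    probs = softmax_with_temperature(teacher_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Higher temperatures move the probabilities closer to uniform while keeping their order. Softened targets also shrink the gradients, which is why the loss above rescales the KL term by `T * T`.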

In [9]:
!wget https://www.dropbox.com/s/za5eeyhhy6pmpd2/bag_of_tricks_resnet50_teacher.ckpt?dl=0 -O ./teacher.ckpt

--2020-09-03 09:48:23--  https://www.dropbox.com/s/za5eeyhhy6pmpd2/bag_of_tricks_resnet50_teacher.ckpt?dl=0
Resolving www.dropbox.com (www.dropbox.com)... 162.125.67.1, 2620:100:6023:1::a27d:4301
Connecting to www.dropbox.com (www.dropbox.com)|162.125.67.1|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /s/raw/za5eeyhhy6pmpd2/bag_of_tricks_resnet50_teacher.ckpt [following]
--2020-09-03 09:48:23--  https://www.dropbox.com/s/raw/za5eeyhhy6pmpd2/bag_of_tricks_resnet50_teacher.ckpt
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://uc0d44dd339ec3fdf4bc3ebc2ae6.dl.dropboxusercontent.com/cd/0/inline/A-qH_9fKzInWFiDcmlucWEbPJfv9R08WwhXgBkruuEGkZ8d_1sbBY-AkoWc4qhkjRCZtrPcIKtgAsVkcLGWcGGGS-qDCI4AoJlMWr5qzFWDQcxgZR9CIjJrtYJkoKIcojFk/file# [following]
--2020-09-03 09:48:24--  https://uc0d44dd339ec3fdf4bc3ebc2ae6.dl.dropboxusercontent.com/cd/0/inline/A-qH_9fKzInWFiDcmlucWEbPJfv9R08WwhXgB

### Model
In the PyTorch Lightning framework we define a class that inherits from `LightningModule` and override some of its methods:

In [10]:
import warnings
from typing import Dict

import pytorch_lightning as pl
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder


class LitFood101(pl.LightningModule):
    def __init__(self, model, config):
        super().__init__()
        self.model = model
        self.config = config
        # We need to specify the number of classes here to avoid a RuntimeError
        # See https://github.com/PyTorchLightning/pytorch-lightning/issues/3006
        # However, this triggers another warning, which is handled in the step methods
        self.metric = pl.metrics.Accuracy(num_classes=self.config.num_classes)
        dim_feats = self.model.fc.in_features  # =2048
        nb_classes = self.config.num_classes
        self.model.fc = nn.Linear(dim_feats, nb_classes)

    def forward(self, x):
        return self.model(x)

    def setup(self, stage):
        if self.config.use_smoothing:
            self.criterion = LabelSmoothingLoss(
                self.config.num_classes, self.config.smoothing,
            )
        else:
            self.criterion = nn.CrossEntropyLoss()

        if self.config.use_mixup:
            self.criterion = MixUpAugmentationLoss(self.criterion)

    def on_epoch_start(self):
        self.previous_batch = [None, None]

    def training_step(self, batch, *args):
        x, y = batch[0]["image"], batch[1]
        if self.config.use_mixup:
            mixup_x, *mixup_y = self.mixup_batch(x, y, *self.previous_batch)
            logits = self(mixup_x)
            loss = self.criterion(logits, mixup_y)
        else:
            logits = self(x)
            loss = self.criterion(logits, y)
        # Ignore the warning about a mismatch between the number of predicted classes
        # and the number of classes the Accuracy metric was initialized with
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            accuracy = self.metric(logits.argmax(dim=-1), y)
        tensorboard_logs = {"train_loss": loss, "train_acc": accuracy}
        self.previous_batch = [x, y]

        return {"loss": loss, "progress_bar": tensorboard_logs, "log": tensorboard_logs}

    def validation_step(self, batch, *args):
        x, y = batch[0]["image"], batch[1]
        logits = self(x)
        val_loss = self.criterion(logits, y)
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            val_accuracy = self.metric(logits.argmax(dim=-1), y)
        return {"val_loss": val_loss, "val_acc": val_accuracy}

    def test_step(self, batch, *args):
        x, y = batch[0]["image"], batch[1]
        logits = self(x)
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            test_accuracy = self.metric(logits.argmax(dim=-1), y)
        return {"test_acc": test_accuracy}

    def validation_epoch_end(self, outputs):
        avg_loss = torch.stack([x["val_loss"] for x in outputs]).mean()
        avg_accuracy = torch.stack([x["val_acc"] for x in outputs]).mean()
        tensorboard_logs = {"val_loss": avg_loss, "val_acc": avg_accuracy}
        return {
            "avg_val_loss": avg_loss,
            "avg_val_acc": avg_accuracy,
            "log": tensorboard_logs,
        }

    def test_epoch_end(self, outputs):
        avg_accuracy = torch.stack([x["test_acc"] for x in outputs]).mean()
        return {"avg_test_acc": avg_accuracy.item()}

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.model.parameters(), lr=self.config.lr)
        if self.config.use_cosine_scheduler:
            scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
                optimizer, T_max=self.config.max_epochs, eta_min=0.0,
            )
        else:
            scheduler = torch.optim.lr_scheduler.MultiStepLR(
                optimizer, milestones=self.config.milestones,
            )
        return [optimizer], [scheduler]

    def train_dataloader(self):
        train_dataset = ImageFolder(
            os.path.join('/content/food-101/food-101', "train"),
            transform=get_training_augmentation(),
        )

        return DataLoader(
            train_dataset,
            batch_size=self.config.batch_size,
            shuffle=True,
            num_workers=self.config.workers,
            pin_memory=True,
        )

    def val_dataloader(self):
        val_dataset = ImageFolder(
            os.path.join('/content/food-101/food-101', "test"),
            transform=get_test_augmentation(),
        )
        return DataLoader(
            val_dataset,
            batch_size=32,
            shuffle=False,
            num_workers=self.config.workers,
            pin_memory=True,
        )

    def test_dataloader(self):
        return self.val_dataloader()

    def optimizer_step(self, epoch, batch_idx, optimizer, *args, **kwargs):
        # Learning Rate warm-up
        if self.config.warmup != -1 and epoch < self.config.warmup:
            lr = self.config.lr * (epoch + 1) / self.config.warmup
            for pg in optimizer.param_groups:
                pg["lr"] = lr

        self.logger.log_metrics({"lr": optimizer.param_groups[0]["lr"]}, step=epoch)
        optimizer.step()
        optimizer.zero_grad()

    def mixup_batch(self, x, y, x_previous, y_previous):
        lmbd = (
            np.random.beta(self.config.mixup_alpha, self.config.mixup_alpha)
            if self.config.mixup_alpha > 0
            else 1
        )
        if x_previous is None:
            x_previous = torch.empty_like(x).copy_(x)
            y_previous = torch.empty_like(y).copy_(y)
        batch_size = x.size(0)
        index = torch.randperm(batch_size)
        # If current batch size != previous batch size, we take only a part of the previous batch
        x_previous = x_previous[:batch_size, ...]
        y_previous = y_previous[:batch_size, ...]
        x_mixed = lmbd * x + (1 - lmbd) * x_previous[index, ...]
        y_a, y_b = y, y_previous[index]
        return x_mixed, y_a, y_b, lmbd


class LitFood101KD(LitFood101):
    def __init__(self, model, teacher, config):
        super().__init__(model, config)
        self.teacher = teacher
        dim_feats = self.teacher.fc.in_features  # =2048
        nb_classes = self.config.num_classes
        self.teacher.fc = nn.Linear(dim_feats, nb_classes)
        teacher_checkpoint = torch.load("./teacher.ckpt")
        self.teacher.load_state_dict(teacher_checkpoint["state_dict"])

    def setup(self, stage):
        criterion = (
            LabelSmoothingLoss(self.config.num_classes, self.config.smoothing)
            if self.config.use_smoothing
            else nn.CrossEntropyLoss()
        )
        self.criterion = KnowledgeDistillationLoss(
            self.config.distill_alpha, self.config.distill_temperature, 
            criterion=criterion,
        )
        if self.config.use_mixup:
            self.criterion = MixUpAugmentationLoss(self.criterion)
        self.teacher.eval()

    def training_step(self, batch, *args):
        x, y = batch[0]["image"], batch[1]
        with torch.no_grad():
            teacher_output = self.teacher(x)

        if self.config.use_mixup:
            mixup_x, *mixup_y = self.mixup_batch(x, y, *self.previous_batch)
            logits = self(mixup_x)
            loss = self.criterion(logits, mixup_y, teacher_output)
        else:
            logits = self(x)
            loss = self.criterion(logits, y, teacher_output)

        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            accuracy = self.metric(logits.argmax(dim=-1), y)
        tensorboard_logs = {"train_loss": loss, "train_acc": accuracy}

        return {"loss": loss, "progress_bar": tensorboard_logs, "log": tensorboard_logs}

    def validation_step(self, batch, *args):
        x, y = batch[0]["image"], batch[1]
        logits = self(x)
        with torch.no_grad():
            teacher_output = self.teacher(x)
        val_loss = self.criterion(logits, y, teacher_output)
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            val_accuracy = self.metric(logits.argmax(dim=-1), y)
        return {"val_loss": val_loss, "val_acc": val_accuracy}

    def test_step(self, batch, *args):
        x, y = batch[0]["image"], batch[1]
        logits = self(x)
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            test_accuracy = self.metric(logits.argmax(dim=-1), y)
        return {"test_acc": test_accuracy}
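
The learning-rate logic is split between `optimizer_step` (linear warm-up) and `configure_optimizers` (cosine annealing). The sketch below tabulates an idealized version of the combined schedule in plain Python, using the default values from the training section (`lr=1e-4`, `warmup=6`, `max_epochs=40`). It assumes the closed-form cosine formula; the values logged during a real run may differ, because the manual warm-up overwrites the learning rate while the scheduler keeps stepping. All function names are hypothetical.

```python
import math


def warmup_lr(base_lr, epoch, warmup):
    """Linear warm-up, as in optimizer_step: base_lr * (epoch + 1) / warmup."""
    return base_lr * (epoch + 1) / warmup


def cosine_lr(base_lr, epoch, max_epochs, eta_min=0.0):
    """Closed-form cosine annealing from base_lr down to eta_min."""
    cosine = (1 + math.cos(math.pi * epoch / max_epochs)) / 2
    return eta_min + (base_lr - eta_min) * cosine


def idealized_lr(base_lr, epoch, warmup, max_epochs):
    """Warm-up for the first `warmup` epochs, cosine annealing afterwards."""
    if warmup != -1 and epoch < warmup:
        return warmup_lr(base_lr, epoch, warmup)
    return cosine_lr(base_lr, epoch, max_epochs)


for epoch in (0, 2, 5, 6, 20, 39):
    print(epoch, idealized_lr(1e-4, epoch, warmup=6, max_epochs=40))
```

The warm-up reaches the full learning rate in its last epoch (`epoch == warmup - 1`), after which the cosine curve takes over.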



## Training
In the cell below you can set up the training process. Change the field values of the `Config` dataclass to turn tricks on or off and to adjust hyperparameters.

In [11]:
from dataclasses import dataclass

@dataclass
class Config:
  workers: int = 4 # Number of data loading workers
  use_smoothing: bool = True # Use label smoothing trick
  smoothing: float = 0.2 # Coefficient for label smoothing (from 0.0 (no smoothing) to 1.0)
  use_mixup: bool = True # Use mixup augmentation during training
  mixup_alpha: float = 0.2 # Alpha value for mixup augmentation
  use_cosine_scheduler: bool = True # Use Cosine LR Scheduler instead of MultiStep
  batch_size: int = 32 # Mini-batch size
  lr: float = 1e-4 # Initial learning rate
  milestones: tuple = (15, 30) # Milestones for dropping the LR
  warmup: int = 6 # Number of epochs to warm up the LR. -1 to turn off
  max_epochs: int = 40 # Max number of epochs
  amp_level: str = 'O1' # Apex optimization level
  num_classes: int = 21 # Number of classes in the dataset
  use_knowledge_distillation: bool = True # Use knowledge distillation from resnet-50
  distill_alpha: float = 0.5 # Distillation strength
  distill_temperature: int = 20 # Temperature hyper-parameter to make the outputs smoother for KD
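
Because the settings live in a dataclass, variants for an ablation can be created without editing the defaults, either through keyword arguments or with `dataclasses.replace`. The sketch below uses a hypothetical `DemoConfig` holding only a few of the fields above so that it runs on its own; the same pattern should apply to `Config`.

```python
from dataclasses import asdict, dataclass, replace


@dataclass
class DemoConfig:
    # Hypothetical subset of the Config fields above, for illustration only
    use_smoothing: bool = True
    use_mixup: bool = True
    use_knowledge_distillation: bool = True
    lr: float = 1e-4


# Baseline run: all tricks switched off via keyword arguments
baseline = DemoConfig(
    use_smoothing=False, use_mixup=False, use_knowledge_distillation=False
)
# Copy of the defaults with a single trick disabled
no_kd = replace(DemoConfig(), use_knowledge_distillation=False)

print(asdict(baseline))
print(asdict(no_kd))
```

`replace` returns a new object, so the default configuration stays untouched between experiments.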


Load TensorBoard:

In [12]:
%load_ext tensorboard
%tensorboard --logdir lightning_logs/


Run the code below to start the training:

In [None]:
from pytorch_lightning import (
    Trainer,
    seed_everything,
)
from pytorch_lightning.callbacks import ModelCheckpoint
from torchvision.models import resnet18, resnet50

seed_everything(42)

config = Config()

checkpoint_callback = ModelCheckpoint(monitor="avg_val_acc", mode="max")
trainer = Trainer(
    gpus=1,
    amp_level=config.amp_level,
    amp_backend='apex',
    precision=16 if config.amp_level != 'O0' else 32,
    deterministic=True,
    benchmark=False,
    checkpoint_callback=checkpoint_callback,
    max_epochs=config.max_epochs
)

# create model
model = resnet18(pretrained=True)
if config.use_knowledge_distillation:
    teacher_model = resnet50(pretrained=False)
    model = LitFood101KD(model, teacher_model, config)
else:
    model = LitFood101(model, config)

trainer.fit(model)



GPU available: True, used: True
TPU available: False, using: 0 TPU cores
CUDA_VISIBLE_DEVICES: [0]
Using APEX 16bit precision.
Downloading: "https://download.pytorch.org/models/resnet18-5c106cde.pth" to /root/.cache/torch/hub/checkpoints/resnet18-5c106cde.pth







  | Name      | Type                  | Params
----------------------------------------------------
0 | model     | ResNet                | 11 M  
1 | metric    | Accuracy              | 0     
2 | teacher   | ResNet                | 23 M  
3 | criterion | MixUpAugmentationLoss | 0     


Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic


Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 65536.0
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 65536.0
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 131072.0
Gradient overflow.  Skipping step, loss scaler 0 reducing loss scale to 131072.0

## Testing
Run the cell below to evaluate the trained model on the test set.

In [None]:
trainer.test()