# Experiment Configuration with Hydra

In this lesson, we will configure experiments for our PyTorch Lightning
training workflow. This will allow us to test various combinations of
hyperparameter settings and assess relative model performance.

To set up the Jupyter environment, set up Hatch as described in the README, then run:
```
hatch -e nb shell
exit
jupyter lab
```

Let's start by importing the required packages we will use.

Notice that the data module and custom model module we created in the prior
lessons are now abstract enough that we can import them directly as Python
modules!

In [None]:
import os
from pathlib import Path
from typing import List

import hydra
import wandb
from omegaconf import OmegaConf
from lightning.pytorch.callbacks import Callback
from lightning.pytorch.loggers import CSVLogger, WandbLogger

from ml_pipeline.datasets.datamodule import BurnScarsDataModule
from ml_pipeline.model.lightningmodule import BurnScarsSegmentationModel

Now we will configure [Hydra](https://hydra.cc/docs/intro/), an experiment
configuration tool. Hydra leverages configuration files for groups of
related, tunable hyperparameters, allowing you to easily organize, version
control, and automate multiple experiment configurations.

In [None]:
with hydra.initialize(config_path="../../config", version_base="1.3.0"):
    cfg = hydra.compose(
        config_name="config",
        overrides=["seed=0", "author=devseed", "name=test-exp-nb-1"],
        return_hydra_config=True,
    )
    print(OmegaConf.to_yaml(cfg))
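
To make the `overrides` argument concrete, here is a plain-Python sketch of how `key=value` strings can be merged into a nested configuration. It is an illustration only, not Hydra's parser, which supports a much richer override grammar. The base values and the `trainer.max_epochs` key are hypothetical; the real defaults live in the project's config files.

```python
import copy
from typing import Any, Dict, List


def parse_value(raw: str) -> Any:
    """Convert an override value to bool, int, or float when possible."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw  # fall back to the original string


def apply_overrides(base: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Return a copy of `base` with `dotted.key=value` overrides applied."""
    cfg = copy.deepcopy(base)  # leave the base config untouched
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = cfg
        for part in parents:
            node = node.setdefault(part, {})  # walk or create nested sections
        node[leaf] = parse_value(raw)
    return cfg


# Hypothetical base config, used only for this illustration.
base_cfg = {
    "seed": 42,
    "author": "unknown",
    "name": "default",
    "trainer": {"max_epochs": 10},
}
cfg_demo = apply_overrides(
    base_cfg,
    ["seed=0", "author=devseed", "name=test-exp-nb-1", "trainer.max_epochs=2"],
)
print(cfg_demo)
```

The dotted form (`trainer.max_epochs=2`) shows how a single override can reach into a nested group without editing the config file itself.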

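We will also configure our
[Weights and Biases (W&B)](https://wandb.ai/) logger alongside a CSV logger.
W&B is an ML experiment tracking and visualization platform used for examining
graphs of metrics and comparing inputs, intermediates, and outputs. The logger
below runs in offline mode; set `offline=False` to log online, which requires
an authenticated W&B account.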

In [None]:
# store experiment logs inside logs/
cwd = os.getcwd()
(Path(cwd) / "logs").mkdir(exist_ok=True)

# loggers
csv_logger = CSVLogger(save_dir="logs", name=cfg.name)
wandb_logger = WandbLogger(
    name=cfg.name,
    save_dir="logs",
    offline=True,  # set to False if logging online
    project="ml-pipeline",
    entity="nasa-impact",
    log_model=False,
)

Next, we build the list of PyTorch Lightning callbacks from the configuration.
Each entry under `callbacks` that defines a `_target_` key is instantiated by
Hydra.

In [None]:
# callbacks
callbacks: List[Callback] = []
if "callbacks" in cfg:
    for _, cb_conf in cfg.callbacks.items():
        if "_target_" in cb_conf:
            callbacks.append(hydra.utils.instantiate(cb_conf))
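
The `_target_` convention used in the callback loop above can be sketched in a few lines of standard-library Python. This is a simplified stand-in for illustration, not Hydra's implementation: `instantiate_sketch` is a hypothetical helper, and `datetime.timedelta` stands in for a Lightning callback class so the example runs without extra dependencies.

```python
import importlib
from typing import Any, Mapping


def instantiate_sketch(conf: Mapping[str, Any], **extra_kwargs: Any) -> Any:
    """Import the object named by `_target_` and call it with the other keys."""
    module_path, _, attr_name = conf["_target_"].rpartition(".")
    target = getattr(importlib.import_module(module_path), attr_name)
    kwargs = {k: v for k, v in conf.items() if k != "_target_"}
    kwargs.update(extra_kwargs)  # call-site keyword arguments win over config values
    return target(**kwargs)


# Hypothetical callbacks section of a config.
callbacks_conf = {
    "timeout": {"_target_": "datetime.timedelta", "hours": 1, "minutes": 30},
    "notes": {"description": "no _target_ key, so this entry is skipped"},
}

built = [
    instantiate_sketch(cb_conf)
    for cb_conf in callbacks_conf.values()
    if "_target_" in cb_conf
]
print(built)
```

Keeping the class path in the config means a callback can be swapped or removed by editing configuration rather than notebook code.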

Now let's put everything together. The data module, model, and trainer are
each built from the corresponding section of the composed configuration, which
Hydra loaded from the `config_path` specified earlier.

In [None]:
datamodule = BurnScarsDataModule(**cfg.datamodule)  # datamodule
model = BurnScarsSegmentationModel(**cfg.model)  # model
trainer = hydra.utils.instantiate(
    config=cfg.trainer,
    callbacks=callbacks,
    logger=[csv_logger, wandb_logger],
)  # trainer
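
Because `**cfg.datamodule` and `**cfg.model` unpack config sections straight into constructor arguments, a misspelled or stale config key surfaces as a `TypeError` at construction time. A minimal sketch of a pre-flight check is shown below; `unexpected_keys` and `ToyDataModule` are hypothetical names introduced for illustration and are not part of the project.

```python
import inspect
from typing import Any, Mapping, Set


def unexpected_keys(cls: type, conf: Mapping[str, Any]) -> Set[str]:
    """Return config keys that `cls.__init__` would reject as keyword arguments."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter accepts any key, so nothing can be unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return set()
    accepted = {
        name
        for name, p in params.items()
        if name != "self"
        and p.kind
        in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    return set(conf) - accepted


class ToyDataModule:
    """Hypothetical stand-in; not the real BurnScarsDataModule."""

    def __init__(self, batch_size: int = 4, num_workers: int = 0):
        self.batch_size = batch_size
        self.num_workers = num_workers


toy_conf = {"batch_size": 8, "num_worker": 2}  # note the misspelled key
print(unexpected_keys(ToyDataModule, toy_conf))
```

In the notebook, the same check could be run as `unexpected_keys(BurnScarsDataModule, cfg.datamodule)` before constructing the object.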

Let's train the model with this experiment configuration.

In [None]:
# fit
trainer.fit(model=model, datamodule=datamodule)
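
Besides W&B, the `CSVLogger` configured earlier writes metrics to disk. The sketch below locates and reads the most recent run, assuming Lightning's default layout of `<save_dir>/<name>/version_<N>/metrics.csv`. The helper names and the `train_loss` column are hypothetical, and the demo writes fake files to a temporary directory so it runs on its own.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List, Optional


def latest_metrics_file(save_dir: str, name: str) -> Optional[Path]:
    """Return metrics.csv from the highest-numbered version_<N> directory."""
    run_dir = Path(save_dir) / name
    versions = [
        p
        for p in run_dir.glob("version_*")
        if p.is_dir() and p.name.split("_")[-1].isdigit()
    ]
    if not versions:
        return None
    # Compare numerically so that version_10 sorts after version_2.
    newest = max(versions, key=lambda p: int(p.name.split("_")[-1]))
    metrics = newest / "metrics.csv"
    return metrics if metrics.exists() else None


def read_metrics(path: Path) -> List[Dict[str, str]]:
    """Load every row of a metrics file as a dict of column name to string."""
    with path.open(newline="") as f:
        return list(csv.DictReader(f))


# Demo with fabricated files in a temporary directory (not real training output).
with tempfile.TemporaryDirectory() as tmp:
    for version, loss in [(0, "0.9"), (10, "0.4"), (2, "0.7")]:
        version_dir = Path(tmp) / "test-exp-nb-1" / f"version_{version}"
        version_dir.mkdir(parents=True)
        (version_dir / "metrics.csv").write_text(
            "epoch,step,train_loss\n0,1," + loss + "\n"
        )
    found = latest_metrics_file(tmp, "test-exp-nb-1")
    rows = read_metrics(found)
    print(found.parent.name, rows)
```

In this notebook, the equivalent call would be `latest_metrics_file("logs", cfg.name)` once training has produced at least one run.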

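Finally, we can evaluate the trained model on the test set. The call below is commented out; uncomment it to run the evaluation. With online logging enabled, the logged results are viewable in the W&B dashboard.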

In [None]:
# test
# trainer.test(model=model, datamodule=datamodule)