Introducing LightningCLI V2

The Lightning 1.5 release introduces CLI V2 with support for subcommands; shorthand notation; and registries for callbacks, optimizers, learning rate schedulers, LightningModules, and LightningDataModules.

PyTorch Lightning team
5 min read · Nov 16, 2021

PyTorch Lightning v1.5 marks a major leap in reliability, supporting the increasingly complex demands of the leading AI organizations and prestigious research labs that rely on Lightning to develop and deploy AI at scale.

As a result of our growth, our ambition has never been greater: to make PyTorch Lightning the simplest, most flexible framework for taking any kind of deep learning research to production.

To this end, user quality of life is at the core of our work, and today we are happy to introduce LightningCLI V2 as part of the Lightning v1.5 release.

You can find more information about the LightningCLI in our docs!

LightningCLI, No Boilerplate For Reproducible AI

Running non-trivial experiments often requires configuring many trainer and model arguments, such as learning rates, batch sizes, number of epochs, data paths, data splits, and number of GPUs. Because most experiments are launched from the command line, these arguments need to be exposed in the training script.

Implementing command-line tools with libraries such as argparse, from Python’s standard library, to manage hundreds of possible trainer, data, and model configurations is a huge source of boilerplate.
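As a rough illustration of that boilerplate, here is a minimal sketch using argparse. The LitClassifier class and every argument name are hypothetical stand-ins, not code from Lightning. Note how each model default is written twice: once in the constructor and once in the parser.

```python
import argparse


class LitClassifier:
    """Hypothetical stand-in for a model class (not a real LightningModule)."""

    def __init__(self, learning_rate: float = 1e-3, hidden_dim: int = 128):
        self.learning_rate = learning_rate
        self.hidden_dim = hidden_dim


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Train a classifier")
    # The first two defaults repeat what the constructor signature already
    # states and must be kept in sync with it by hand.
    parser.add_argument("--learning_rate", type=float, default=1e-3,
                        help="Step size used by the optimizer")
    parser.add_argument("--hidden_dim", type=int, default=128,
                        help="Width of the hidden layer")
    parser.add_argument("--batch_size", type=int, default=32,
                        help="Samples per training batch")
    parser.add_argument("--max_epochs", type=int, default=10,
                        help="Number of passes over the data")
    parser.add_argument("--data_path", type=str, default="./data",
                        help="Directory containing the dataset")
    return parser


def main(argv):
    args = build_parser().parse_args(argv)
    # Parsed values are then forwarded by hand to each component.
    model = LitClassifier(learning_rate=args.learning_rate,
                          hidden_dim=args.hidden_dim)
    return args, model
```

Every new constructor parameter requires a matching add_argument call and a matching keyword in main, which is where the duplication comes from.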

This often leads to basic configurations being hard-coded and inaccessible for experimentation and reuse. Additionally, most of the configuration is duplicated in the signature and argument defaults, as well as docstrings and argument help messages.

To start using the LightningCLI, all you need to do is pass it your LightningModule class and, optionally, your LightningDataModule class.

The LightningCLI exposes arguments directly from your code classes or functions and generates help messages from their docstrings while performing type checking on instantiation! This means that the command-line interface adapts to your code instead of the other way around. The support for configuration no longer leaks into your research code.
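The idea of deriving a command-line interface from a class signature can be sketched in a few lines of standard-library Python. This is an illustration only, not LightningCLI’s implementation (which builds on the jsonargparse library), and LitClassifier, parser_from_signature, and instantiate are hypothetical names.

```python
import argparse
import inspect


class LitClassifier:
    """Hypothetical stand-in for a model class."""

    def __init__(self, learning_rate: float = 1e-3, hidden_dim: int = 128):
        self.learning_rate = learning_rate
        self.hidden_dim = hidden_dim


def parser_from_signature(cls) -> argparse.ArgumentParser:
    """Create one --flag per constructor parameter, typed by its annotation."""
    # The class docstring is reused as the help description.
    parser = argparse.ArgumentParser(description=inspect.getdoc(cls))
    for name, param in inspect.signature(cls).parameters.items():
        has_default = param.default is not inspect.Parameter.empty
        has_type = param.annotation is not inspect.Parameter.empty
        parser.add_argument(
            f"--{name}",
            # The annotation doubles as the string-to-value converter.
            type=param.annotation if has_type else str,
            default=param.default if has_default else None,
            required=not has_default,
        )
    return parser


def instantiate(cls, argv):
    """Parse argv against the class signature and build the object."""
    args = parser_from_signature(cls).parse_args(argv)
    return cls(**vars(args))
```

Because the parser is derived from the signature, adding a parameter to LitClassifier exposes a new flag with no further changes, and a value that cannot be converted to the annotated type is rejected at parse time.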

Your code becomes the source of truth and your configuration is always up to date. This provides a standardized way to configure experiments using a single file, and the full configuration is automatically saved after each run. This greatly simplifies the reproducibility of experiments, which is critical for machine learning research.

Support for Fit, Validate, Test, Predict, and Tune

Before 1.5, the LightningCLI only supported fitting, but we’ve added support for all other Trainer entry points! You can choose which one to run by specifying it as a subcommand:
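The subcommand pattern itself can be sketched with the standard library. This only illustrates subcommand dispatch and is not how LightningCLI is implemented; DummyTrainer, SUBCOMMANDS, and run are hypothetical names, and main.py is an assumed script name.

```python
import argparse

# Entry points named in this section.
SUBCOMMANDS = ("fit", "validate", "test", "predict", "tune")


class DummyTrainer:
    """Hypothetical stand-in exposing one method per entry point."""

    def _run(self, stage: str) -> str:
        return f"running {stage}"

    def fit(self):
        return self._run("fit")

    def validate(self):
        return self._run("validate")

    def test(self):
        return self._run("test")

    def predict(self):
        return self._run("predict")

    def tune(self):
        return self._run("tune")


def run(argv):
    parser = argparse.ArgumentParser(prog="main.py")
    subparsers = parser.add_subparsers(dest="subcommand", required=True)
    for name in SUBCOMMANDS:
        subparsers.add_parser(name, help=f"Run the trainer's {name} method")
    args = parser.parse_args(argv)
    # The chosen subcommand selects which trainer method is called.
    return getattr(DummyTrainer(), args.subcommand)()
```

With this sketch, run(["fit"]) dispatches to DummyTrainer.fit, much as a LightningCLI script invoked with the fit subcommand calls the Trainer’s fit method.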

After carefully listening to our users’ feedback, we implemented a new notation to easily instantiate objects directly from the command line. This dramatically improves the command line experience as you can customize almost any aspect of your training by referencing only class names.

For example, the Trainer can be instantiated with two callbacks:
- EarlyStopping(patience=5) and
- LearningRateMonitor(logging_interval='epoch'),
both referenced directly by their class names using the shorthand notation.
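To make the mechanics concrete, here is a small sketch of how such shorthand arguments could be turned into a list of objects. The two dataclasses are stand-ins that only borrow the callbacks’ names, parse_callbacks is a hypothetical helper, and the --trainer.callbacks argument style is assumed from the Lightning 1.5 documentation. In practice the LightningCLI does this parsing for you.

```python
from dataclasses import dataclass, fields


@dataclass
class EarlyStopping:
    """Stand-in sharing the callback's name (not the real Lightning class)."""
    patience: int = 3


@dataclass
class LearningRateMonitor:
    """Stand-in sharing the callback's name (not the real Lightning class)."""
    logging_interval: str = "step"


KNOWN_CALLBACKS = {cls.__name__: cls for cls in (EarlyStopping, LearningRateMonitor)}


def parse_callbacks(argv, prefix="--trainer.callbacks"):
    specs = []  # (class, init kwargs) pairs, in the order given
    for arg in argv:
        key, _, value = arg.partition("=")
        if key == prefix:
            # A bare class name starts a new list entry.
            specs.append((KNOWN_CALLBACKS[value], {}))
        elif key.startswith(prefix + "."):
            # A dotted suffix sets an init argument of the latest entry.
            cls, kwargs = specs[-1]
            field_name = key[len(prefix) + 1:]
            field_types = {f.name: f.type for f in fields(cls)}
            kwargs[field_name] = field_types[field_name](value)
    return [cls(**kwargs) for cls, kwargs in specs]


callbacks = parse_callbacks([
    "--trainer.callbacks=EarlyStopping",
    "--trainer.callbacks.patience=5",
    "--trainer.callbacks=LearningRateMonitor",
    "--trainer.callbacks.logging_interval=epoch",
])
```

Repeating the same flag appends another entry, which is how several callbacks end up in a single list.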

Adding specific objects to an experiment has never been this easy!

This new notation can be used to instantiate multiple callbacks to form a list, as well as the model, optimizer, lr_scheduler, and data.

Swapping Optimizers and Learning Rate Schedulers

Optimizers and learning rate schedulers are also configurable. The most common case is a model with a single optimizer and optionally a single learning rate scheduler.

For this common use case, swapping the optimizer or learning rate scheduler only requires changing command-line arguments, e.g. --optimizer=Adam or --lr_scheduler=ExponentialLR.

All of PyTorch’s optimizers and learning rate schedulers (under torch.optim) are supported out of the box. This allows you to experiment quickly without having to add support for each optimizer class in your LightningModule.configure_optimizers() method. In fact, the method can often be left unimplemented, since in many cases it only adds boilerplate.

Here is what it used to look like; the LightningCLI makes it obsolete.
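A minimal sketch of the kind of boilerplate meant here, assuming PyTorch is installed. The Model class is a plain torch.nn.Module standing in for a LightningModule, and the option names (optimizer_name, scheduler_name) are hypothetical.

```python
import torch
from torch import nn


class Model(nn.Module):
    """Stand-in for a LightningModule; only configure_optimizers matters here."""

    def __init__(self, optimizer_name="adam", lr=1e-3, scheduler_name=None):
        super().__init__()
        self.layer = nn.Linear(4, 2)
        self.optimizer_name = optimizer_name
        self.lr = lr
        self.scheduler_name = scheduler_name

    def configure_optimizers(self):
        # One branch per supported optimizer: adding another one
        # means editing the model code again.
        if self.optimizer_name == "adam":
            optimizer = torch.optim.Adam(self.parameters(), lr=self.lr)
        elif self.optimizer_name == "sgd":
            optimizer = torch.optim.SGD(self.parameters(), lr=self.lr)
        else:
            raise ValueError(f"Unsupported optimizer: {self.optimizer_name}")

        if self.scheduler_name is None:
            return optimizer

        # The same branching is repeated for the schedulers.
        if self.scheduler_name == "exponential":
            scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
        elif self.scheduler_name == "step":
            scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)
        else:
            raise ValueError(f"Unsupported scheduler: {self.scheduler_name}")
        return [optimizer], [scheduler]
```

Each supported class costs a new branch plus a new string option, and none of it relates to the research itself.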

You will not have to write this boilerplate ever again after adopting the LightningCLI.

Instantiation-Only Mode

Perhaps you are interested in the parsing functionality of the LightningCLI but want to manage the Trainer calls yourself.

To this end, we’ve added a simple flag, run=True|False, to the LightningCLI. Setting run=False skips the call to the Trainer.

All the classes will be instantiated, but you keep full control over how and when you use them.

Registries

Lightning exposes several registries for you to store your Lightning components via a decorator mechanism. Here is how you could register your own components:
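The decorator mechanism can be sketched as follows. This is a generic illustration of a name-to-class registry, not Lightning’s own registry code; Registry, CALLBACKS, PrintingCallback, and instantiate are all hypothetical names.

```python
class Registry(dict):
    """Hypothetical minimal registry mapping class names to classes."""

    def register(self, cls):
        self[cls.__name__] = cls
        # Returning the class lets this method be used as a decorator.
        return cls


CALLBACKS = Registry()


@CALLBACKS.register
class PrintingCallback:
    """Example component stored in the registry under its class name."""

    def __init__(self, message: str = "done"):
        self.message = message


def instantiate(registry, name, **init_args):
    """Look a class up by name and build it with the given arguments."""
    return registry[name](**init_args)
```

Once a class is registered, a name received from the command line is enough to find and instantiate it, which is what makes the shorthand notation work for custom components.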

Registration is supported for callbacks, optimizers, learning rate schedulers, LightningModules, and LightningDataModules.

From the command line, registered classes can then be referenced by name using the shorthand notation described above.

This allows you to get running with your custom classes effortlessly and without added boilerplate. This is particularly interesting for library authors who want to provide their users with a range of models and data modules to choose from.

You can find more information about the LightningCLI in our docs!

Next Steps

The Lightning team is more committed than ever to providing the best possible experience to anyone doing optimization with PyTorch. Since the PyTorch Lightning API is already stable, breaking changes will be minimal.

If you’re interested in helping out with these efforts, find us on Slack!

Let us introduce you to Grid.ai, built by the creators of PyTorch Lightning. Our platform enables you to scale your model training without worrying about infrastructure, just as Lightning automates the training itself.

You can get started with Grid.ai for free with just a GitHub or Google Account.

