Introducing LightningCLI V2
The Lightning 1.5 release introduces CLI V2 with support for subcommands; shorthand notation; and registries for callbacks, optimizers, learning rate schedulers, LightningModules, and LightningDataModules.
PyTorch Lightning v1.5 marks a major leap in reliability to support the increasingly complex demands of the leading AI organizations and prestigious research labs that rely on Lightning to develop and deploy AI at scale.
As a result of our growth, PyTorch Lightning’s ambition has never been greater: to become the simplest, most flexible framework for taking any kind of deep learning research to production.
To this end, user Quality of Life is at the core of our work, and today we are happy to introduce the LightningCLI v2 as part of the Lightning v1.5 release.
You can find more information about the LightningCLI in our docs!
LightningCLI, No Boilerplate For Reproducible AI
Running non-trivial experiments requires configuring many different trainer and model arguments: learning rates, batch sizes, number of epochs, data paths, data splits, number of GPUs, and so on. Since most experiments are launched from the command line, all of these need to be exposed by the training script.
Implementing command-line tools with libraries such as Python’s standard argparse to manage hundreds of possible trainer, data, and model configurations is a huge source of boilerplate:
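Such a script might look like this sketch (the model and its arguments are purely illustrative):

```python
# a sketch of the usual argparse boilerplate; MyModel and its
# hyperparameters are hypothetical stand-ins for your own code
from argparse import ArgumentParser

import pytorch_lightning as pl
from my_project import MyModel  # hypothetical LightningModule


def main(args):
    model = MyModel(learning_rate=args.learning_rate, hidden_dim=args.hidden_dim)
    trainer = pl.Trainer(gpus=args.gpus, max_epochs=args.max_epochs)
    trainer.fit(model)


if __name__ == "__main__":
    parser = ArgumentParser()
    # every hyperparameter must be registered by hand, with its own default
    # and help message, duplicating what the code already declares
    parser.add_argument("--learning_rate", type=float, default=0.01)
    parser.add_argument("--hidden_dim", type=int, default=64)
    parser.add_argument("--gpus", type=int, default=None)
    parser.add_argument("--max_epochs", type=int, default=10)
    main(parser.parse_args())
```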
This often leads to basic configurations being hard-coded, inaccessible for experimentation and reuse. Additionally, most of the configuration is duplicated across function signatures, argument defaults, docstrings, and help messages.
Here is all you need to start using the LightningCLI:
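A minimal sketch, assuming MyModel and MyDataModule are the LightningModule and LightningDataModule from your own project:

```python
# main.py — my_project, MyModel, and MyDataModule are placeholders
from pytorch_lightning.utilities.cli import LightningCLI

from my_project import MyDataModule, MyModel

if __name__ == "__main__":
    # every Trainer, MyModel, and MyDataModule argument is now
    # exposed on the command line, with --help generated for free
    cli = LightningCLI(MyModel, MyDataModule)
```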
The LightningCLI exposes arguments directly from your classes and functions and generates help messages from their docstrings, all while performing type checking on instantiation! This means that the command-line interface adapts to your code instead of the other way around, and configuration support no longer leaks into your research code.
Your code becomes the source of truth and your configuration is always up to date. This provides a standardized way to configure experiments using a single file, and the full configuration is automatically saved after each run, greatly simplifying the reproducibility of experiments, which is critical for machine learning research.
Support for Fit, Validate, Test, Predict, and Tune
Before 1.5, the LightningCLI only supported fitting, but we’ve added support for all the other Trainer entry points! You can choose which one to run by specifying it as a subcommand:
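For instance, with the main.py script above:

```bash
python main.py fit       # runs Trainer.fit
python main.py validate  # runs Trainer.validate
python main.py test      # runs Trainer.test
python main.py predict   # runs Trainer.predict
python main.py tune      # runs Trainer.tune
```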
Shorthand Notation
After carefully listening to our users’ feedback, we implemented a new notation for instantiating objects directly from the command line. This dramatically improves the command-line experience, as you can customize almost any aspect of your training by referencing class names alone.
In the following snippet, the Trainer will be instantiated with two callbacks, EarlyStopping(patience=5) and LearningRateMonitor(logging_interval='epoch'), each referenced by its class name alone thanks to the shorthand notation. Adding specific objects to a run has never been this easy!
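A sketch of the invocation, with flag spellings following the v1.5 shorthand-notation docs:

```bash
python main.py fit \
    --trainer.callbacks=EarlyStopping \
    --trainer.callbacks.patience=5 \
    --trainer.callbacks=LearningRateMonitor \
    --trainer.callbacks.logging_interval=epoch
```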
This new notation can be used to instantiate multiple callbacks to form a list, as well as the model, optimizer, lr_scheduler, and data.
Optimizer and Learning Rate Scheduler Swapping
Optimizers and learning rate schedulers are also configurable. The most common case is a model with a single optimizer and optionally a single learning rate scheduler.
For this common use case, swapping the optimizer or the learning rate scheduler is as simple as:
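For example, selecting Adam and an exponential decay schedule from torch.optim:

```bash
python main.py fit \
    --optimizer=Adam \
    --optimizer.lr=0.01 \
    --lr_scheduler=ExponentialLR \
    --lr_scheduler.gamma=0.1
```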
All of PyTorch’s optimizers and learning rate schedulers (under torch.optim) are supported out of the box. This lets you experiment quickly without having to add support for each optimizer class in your LightningModule.configure_optimizers() method. In fact, in many cases configure_optimizers() can be left unimplemented, as it only adds boilerplate.
Here is what it used to look like; the LightningCLI makes this boilerplate obsolete:
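A sketch of the configure_optimizers() boilerplate this replaces (MyModel is a placeholder):

```python
import torch
import pytorch_lightning as pl


class MyModel(pl.LightningModule):
    # previously, every optimizer/scheduler choice had to be hard-coded
    # here or wired up through hand-written command-line arguments
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=0.01)
        scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.1)
        return [optimizer], [scheduler]
```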
Instantiation-Only Mode
Perhaps you are interested in the parsing functionality of the LightningCLI but want to manage the Trainer calls yourself. To this end, we’ve added a simple run=True|False argument, used as follows:
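A minimal sketch, again with a hypothetical MyModel:

```python
from pytorch_lightning.utilities.cli import LightningCLI

from my_project import MyModel  # hypothetical LightningModule

# run=False parses the arguments and instantiates the classes,
# but does not call any Trainer entry point for you
cli = LightningCLI(MyModel, run=False)
cli.trainer.fit(cli.model)  # you decide how and when to use the objects
```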
All the classes will be instantiated, but you keep full control over how and when to use them.
Registries
Lightning exposes several registries for you to store your Lightning components via a decorator mechanism. Here is how you could register your own components:
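For example, a sketch using the v1.5 registry decorators (CustomCallback and MyModel are placeholders for your own classes):

```python
import pytorch_lightning as pl
from pytorch_lightning.utilities.cli import CALLBACK_REGISTRY, MODEL_REGISTRY


@CALLBACK_REGISTRY
class CustomCallback(pl.Callback):
    ...


@MODEL_REGISTRY
class MyModel(pl.LightningModule):
    ...
```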
This is supported for Callback, optimizer, lr_scheduler, LightningModule, and LightningDataModule classes.
From the command line, you can use the shorthand notation described above:
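For example, with the hypothetical classes registered above:

```bash
python main.py fit --model=MyModel --trainer.callbacks=CustomCallback
```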
This lets you get up and running with your custom classes effortlessly and without added boilerplate. It is particularly interesting for library authors who want to provide their users with a range of models and data modules to choose from.
You can find more information about the LightningCLI in our docs!
Next Steps
The Lightning Team is more committed than ever to providing the best possible experience to anyone doing optimization with PyTorch, and with the PyTorch Lightning API already stable, breaking changes will be minimal.
If you’re interested in helping out with these efforts, find us on Slack!
Built by the PyTorch Lightning creators, Grid.ai is our platform for scaling your model training without worrying about infrastructure, just as Lightning automates the training itself.
You can get started with Grid.ai for free with just a GitHub or Google Account.