In our introduction to linear regression, we walked through various components including the data, the model, the loss function, and the optimization algorithm. Indeed, linear regression is one of the simplest machine learning models. Training it, however, uses many of the same components that other models in this book require. Therefore, before diving into the implementation details, it is worth designing some of the APIs that we will use throughout this book. Treating components in deep learning as objects, we can start by defining classes for these objects and their interactions. This object-oriented design for implementation will greatly streamline the presentation, and you may even want to use it in your own projects.
Inspired by open-source libraries such as PyTorch Lightning, at a high level we wish to have three classes: (i) Module contains models, losses, and optimization methods; (ii) DataModule provides data loaders for training and validation; (iii) both classes are combined using the Trainer class, which allows us to train models on a variety of hardware platforms. Most code in this book adapts Module and DataModule. We will touch upon the Trainer class only when we discuss GPUs, CPUs, parallel training, and optimization algorithms.
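As a rough orientation, the intended usage pattern looks like the sketch below. The concrete LinearRegression and SyntheticRegressionData classes are only introduced in later sections of the book; here they simply stand in for a Module subclass and a DataModule subclass, so treat the exact names and arguments as a preview rather than something to run now.

import torch
from d2l import torch as d2l

# Sketch only: a Module subclass (the model) and a DataModule subclass (the
# data) are handed to a Trainer, which runs the training loop.
model = d2l.LinearRegression(lr=0.03)
data = d2l.SyntheticRegressionData(w=torch.tensor([2, -3.4]), b=4.2)
trainer = d2l.Trainer(max_epochs=3)
trainer.fit(model, data)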
import time
import numpy as np
import torch
from torch import nn
from d2l import torch as d2l
import time
import numpy as np
from mxnet.gluon import nn
from d2l import mxnet as d2l
import time
from dataclasses import field
from typing import Any

import jax
import numpy as np
from flax import linen as nn
from flax.training import train_state
from jax import numpy as jnp
from d2l import jax as d2l
No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
import time
import numpy as np
import tensorflow as tf
from d2l import tensorflow as d2l
3.2.1. Utilities
We need a few utilities to simplify object-oriented programming in Jupyter notebooks. One of the challenges is that class definitions tend to be fairly long blocks of code. Notebook readability demands short code fragments, interspersed with explanations, a requirement incompatible with the style of programming common for Python libraries. The first utility function allows us to register functions as methods in a class after the class has been created. In fact, we can do so even after we have created instances of the class! It allows us to split the implementation of a class into multiple code blocks.
def add_to_class(Class):  #@save
    """Register functions as methods in created class."""
    def wrapper(obj):
        setattr(Class, obj.__name__, obj)
    return wrapper
Let us have a quick look at how to use it. We plan to implement a class A with a method do. Instead of having the code for both A and do in the same code block, we can first declare the class A and create an instance a.
class A:
    def __init__(self):
        self.b = 1

a = A()
Next we define the method do as we normally would, but not in class A's scope. Instead, we decorate this method by add_to_class with class A as its argument. In doing so, the method is able to access the member variables of A just as we would expect had it been defined as part of A's definition. Let us see what happens when we invoke it for the instance a.
@add_to_class(A)
def do(self):
    print('Class attribute "b" is', self.b)

a.do()
Class attribute "b" is 1
The second one is a utility class that saves all arguments in a class's __init__ method as class attributes. This allows us to extend constructor call signatures implicitly without additional code.
class HyperParameters:  #@save
    """The base class of hyperparameters."""
    def save_hyperparameters(self, ignore=[]):
        raise NotImplementedError
We defer its implementation to Section 23.7. To use it, we define our class to inherit from HyperParameters and call save_hyperparameters in the __init__ method.
# Call the fully implemented HyperParameters class saved in d2l
class B(d2l.HyperParameters):
    def __init__(self, a, b, c):
        self.save_hyperparameters(ignore=['c'])
        print('self.a =', self.a, 'self.b =', self.b)
        print('There is no self.c =', not hasattr(self, 'c'))

b = B(a=1, b=2, c=3)
self.a = 1 self.b = 2
There is no self.c = True
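Although the full implementation is deferred to Section 23.7, the underlying idea is straightforward: inspect the caller's frame and copy its local variables, i.e., the constructor arguments, onto the instance. The sketch below is our own illustration of that idea, not necessarily the exact code in the d2l library.

import inspect

class HyperParametersSketch:
    """A minimal sketch of the idea behind save_hyperparameters."""
    def save_hyperparameters(self, ignore=[]):
        # Grab the caller's frame (typically a subclass __init__) and read
        # its local variables, i.e., the constructor arguments
        frame = inspect.currentframe().f_back
        _, _, _, local_vars = inspect.getargvalues(frame)
        self.hparams = {k: v for k, v in local_vars.items()
                        if k not in set(ignore + ['self'])
                        and not k.startswith('_')}
        # Attach every remaining argument to the instance as an attribute
        for k, v in self.hparams.items():
            setattr(self, k, v)

class C(HyperParametersSketch):
    def __init__(self, a, b):
        self.save_hyperparameters()

print(C(a=1, b=2).a)  # 1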
最后一個實用程序允許我們在實驗進行時以交互方式繪制實驗進度。為了尊重更強大(和復雜)的TensorBoard,我們將其命名為ProgressBoard。實現推遲到 第 23.7 節。現在,讓我們簡單地看看它的實際效果。
The draw method plots a point (x, y) in the figure, with label specified in the legend. The optional every_n smooths the line by only showing 1/n points in the figure. Their values are averaged from the n neighboring points in the original figure.
class ProgressBoard(d2l.HyperParameters):  #@save
    """The board that plots data points in animation."""
    def __init__(self, xlabel=None, ylabel=None, xlim=None,
                 ylim=None, xscale='linear', yscale='linear',
                 ls=['-', '--', '-.', ':'], colors=['C0', 'C1', 'C2', 'C3'],
                 fig=None, axes=None, figsize=(3.5, 2.5), display=True):
        self.save_hyperparameters()

    def draw(self, x, y, label, every_n=1):
        raise NotImplementedError
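The drawing logic itself is also deferred to Section 23.7, but the every_n behavior described above can be sketched roughly as follows. This is our own illustration, not the d2l implementation: raw points are buffered per label and only their average is emitted once every_n of them have accumulated.

import collections

class SmoothingSketch:
    """A rough sketch of the every_n averaging used by draw."""
    def __init__(self):
        self.raw = collections.defaultdict(list)    # label -> buffered points
        self.lines = collections.defaultdict(list)  # label -> plotted points

    def draw(self, x, y, label, every_n=1):
        self.raw[label].append((x, y))
        if len(self.raw[label]) < every_n:
            return  # keep buffering until every_n raw points have arrived
        pts = self.raw[label]
        mean = lambda v: sum(v) / len(v)
        self.lines[label].append((mean([p[0] for p in pts]),
                                  mean([p[1] for p in pts])))
        self.raw[label].clear()
        # A full implementation would now redraw the figure (e.g., with
        # matplotlib) from self.lines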
In the following example, we draw sin and cos with different smoothness. If you run this code block, you will see the lines grow in animation.
board = d2l.ProgressBoard('x')
for x in np.arange(0, 10, 0.1):
    board.draw(x, np.sin(x), 'sin', every_n=2)
    board.draw(x, np.cos(x), 'cos', every_n=10)
3.2.2. Models
The Module class is the base class of all models we will implement. At the very least we need to define three methods. The __init__ method stores the learnable parameters, the training_step method accepts a data batch to return the loss value, and configure_optimizers returns the optimization method, or a list of them, that is used to update the learnable parameters. Optionally we can define validation_step to report the evaluation measures. Sometimes we put the code for computing the output into a separate forward method to make it more reusable.
class Module(nn.Module, d2l.HyperParameters):  #@save
    """The base class of models."""
    def __init__(self, plot_train_per_epoch=2, plot_valid_per_epoch=1):
        super().__init__()
        self.save_hyperparameters()
        self.board = ProgressBoard()

    def loss(self, y_hat, y):
        raise NotImplementedError

    def forward(self, X):
        assert hasattr(self, 'net'), 'Neural network is defined'
        return self.net(X)

    def plot(self, key, value, train):
        """Plot a point in animation."""
        assert hasattr(self, 'trainer'), 'Trainer is not inited'
        self.board.xlabel = 'epoch'
        if train:
            x = self.trainer.train_batch_idx / self.trainer.num_train_batches
            n = self.trainer.num_train_batches / self.plot_train_per_epoch
        else:
            x = self.trainer.epoch + 1
            n = self.trainer.num_val_batches / self.plot_valid_per_epoch
        self.board.draw(x, value.to(d2l.cpu()).detach().numpy(),
                        ('train_' if train else 'val_') + key,
                        every_n=int(n))

    def training_step(self, batch):
        l = self.loss(self(*batch[:-1]), batch[-1])
        self.plot('loss', l, train=True)
        return l

    def validation_step(self, batch):
        l = self.loss(self(*batch[:-1]), batch[-1])
        self.plot('loss', l, train=False)

    def configure_optimizers(self):
        raise NotImplementedError
You may notice that Module is a subclass of nn.Module, the base class of neural networks in PyTorch. It provides convenient features for handling neural networks. For example, if we define a forward method, such as forward(self, X), then for an instance a we can invoke this method by a(X). This works since it calls the forward method in the built-in __call__ method. You can find more details and examples about nn.Module in Section 6.1.
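As a quick illustration of that dispatch, consider the hypothetical toy subclass below (our own example, not from the book). Attaching a simple network as net is enough for calls like model(X) to be routed to forward.

class ToyModel(Module):
    """A throwaway model used only to illustrate the call dispatch."""
    def __init__(self):
        super().__init__()
        self.net = nn.LazyLinear(1)  # a single linear layer

model = ToyModel()
X = torch.randn(4, 3)
y = model(X)  # nn.Module.__call__ dispatches to our forward method
print(y.shape)  # torch.Size([4, 1])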
class Module(nn.Block, d2l.HyperParameters):  #@save
    """The base class of models."""
    def __init__(self, plot_train_per_epoch=2, plot_valid_per_epoch=1):
        super().__init__()
        self.save_hyperparameters()
        self.board = ProgressBoard()

    def loss(self, y_hat, y):
        raise NotImplementedError

    def forward(self, X):
        assert hasattr(self, 'net'), 'Neural network is defined'
        return self.net(X)

    def plot(self, key, value, train):
        """Plot a point in animation."""
        assert hasattr(self, 'trainer'), 'Trainer is not inited'
        self.board.xlabel = 'epoch'
        if train:
            x = self.trainer.train_batch_idx / self.trainer.num_train_batches
            n = self.trainer.num_train_batches / self.plot_train_per_epoch
        else:
            x = self.trainer.epoch + 1
            n = self.trainer.num_val_batches / self.plot_valid_per_epoch
        self.board.draw(x, value.asnumpy(),
                        ('train_' if train else 'val_') + key,
                        every_n=int(n))

    def training_step(self, batch):
        l = self.loss(self(*batch[:-1]), batch[-1])
        self.plot('loss', l, train=True)
        return l

    def validation_step(self, batch):
        l = self.loss(self(*batch[:-1]), batch[-1])
        self.plot('loss', l, train=False)

    def configure_optimizers(self):
        raise NotImplementedError
You may notice that Module is a subclass of nn.Block, the base class of neural networks in Gluon. It provides convenient features to handle neural networks. For example, if we define a forward method, such as forward(self, X), then for an instance a we can invoke this method by a(X). This works since it calls the forward method in the built-in __call__ method. You can find more details and examples about nn.Block in Section 6.1.
With the introduction of dataclasses in Python 3.7, classes decorated with @dataclass automatically add magic methods such as __init__ and __repr__. The member variables are defined using type annotations. All Flax modules are Python 3.7 dataclasses.
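For readers unfamiliar with dataclasses, the small generic example below (unrelated to d2l) shows what the decorator generates.

from dataclasses import dataclass

@dataclass
class Point:
    # Member variables are declared via type annotations; __init__ and
    # __repr__ are generated automatically by the decorator
    x: float = 0.0
    y: float = 0.0

p = Point(1.0, 2.0)
print(p)  # Point(x=1.0, y=2.0)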
class Module(nn.Module, d2l.HyperParameters):  #@save
    """The base class of models."""
    # No need for save_hyperparam when using Python dataclass
    plot_train_per_epoch: int = field(default=2, init=False)
    plot_valid_per_epoch: int = field(default=1, init=False)
    # Use default_factory to make sure new plots are generated on each run
    board: ProgressBoard = field(default_factory=lambda: ProgressBoard(),
                                 init=False)

    def loss(self, y_hat, y):
        raise NotImplementedError

    # JAX & Flax do not have a forward-method-like syntax. Flax uses setup
    # and built-in __call__ magic methods for forward pass. Adding here
    # for consistency
    def forward(self, X, *args, **kwargs):
        assert hasattr(self, 'net'), 'Neural network is defined'
        return self.net(X, *args, **kwargs)

    def __call__(self, X, *args, **kwargs):
        return self.forward(X, *args, **kwargs)

    def plot(self, key, value, train):
        """Plot a point in animation."""
        assert hasattr(self, 'trainer'), 'Trainer is not inited'
        self.board.xlabel = 'epoch'
        if train:
            x = self.trainer.train_batch_idx / self.trainer.num_train_batches
            n = self.trainer.num_train_batches / self.plot_train_per_epoch
        else:
            x = self.trainer.epoch + 1
            n = self.trainer.num_val_batches / self.plot_valid_per_epoch
        self.board.draw(x, jax.device_put(value, d2l.cpu()),
                        ('train_' if train else 'val_') + key,
                        every_n=int(n))

    def training_step(self, params, batch, state):
        l, grads = jax.value_and_grad(self.loss)(params, batch[:-1],
                                                 batch[-1], state)
        self.plot("loss", l, train=True)
        return l, grads

    def validation_step(self, params, batch, state):
        l = self.loss(params, batch[:-1], batch[-1], state)
        self.plot('loss', l, train=False)

    def apply_init(self, dummy_input, key):
        """To be defined later in :numref:`sec_lazy_init`"""
        raise NotImplementedError

    def configure_optimizers(self):
        raise NotImplementedError
You may notice that Module is a subclass of linen.Module, the base class of neural networks in Flax. It provides convenient features to handle neural networks. For example, it handles the model parameters, provides the nn.compact decorator to simplify code, invokes the __call__ method among other things. Here we also redirect __call__ to the forward method. We do this to make our code more similar to other framework implementations.
class Module(tf.keras.Model, d2l.HyperParameters):  #@save
    """The base class of models."""
    def __init__(self, plot_train_per_epoch=2, plot_valid_per_epoch=1):
        super().__init__()
        self.save_hyperparameters()
        self.board = ProgressBoard()
        self.training = None

    def loss(self, y_hat, y):
        raise NotImplementedError

    def forward(self, X):
        assert hasattr(self, 'net'), 'Neural network is defined'
        return self.net(X)

    def call(self, X, *args, **kwargs):
        if kwargs and "training" in kwargs:
            self.training = kwargs['training']
        return self.forward(X, *args)

    def plot(self, key, value, train):
        """Plot a point in animation."""
        assert hasattr(self, 'trainer'), 'Trainer is not inited'
        self.board.xlabel = 'epoch'
        if train:
            x = self.trainer.train_batch_idx / self.trainer.num_train_batches
            n = self.trainer.num_train_batches / self.plot_train_per_epoch
        else:
            x = self.trainer.epoch + 1
            n = self.trainer.num_val_batches / self.plot_valid_per_epoch
        self.board.draw(x, value.numpy(),
                        ('train_' if train else 'val_') + key,
                        every_n=int(n))

    def training_step(self, batch):
        l = self.loss(self(*batch[:-1]), batch[-1])
        self.plot('loss', l, train=True)
        return l

    def validation_step(self, batch):
        l = self.loss(self(*batch[:-1]), batch[-1])
        self.plot('loss', l, train=False)

    def configure_optimizers(self):
        raise NotImplementedError
You may notice that Module is a subclass of tf.keras.Model, the base class of neural networks in TensorFlow. It provides convenient features to handle neural networks. For example, it invokes the call method in the built-in __call__ method. Here we redirect call to the forward method, saving its arguments as a class attribute. We do this to make our code more similar to other framework implementations.
3.2.3. Data
The DataModule class is the base class for data. Quite frequently the __init__ method is used to prepare the data. This includes downloading and preprocessing if needed. The train_dataloader returns the data loader for the training dataset. A data loader is a (Python) generator that yields a data batch each time it is used. This batch is then fed into the training_step method of Module to compute the loss. There is an optional val_dataloader to return the validation dataset loader. It behaves in the same manner, except that it yields data batches for the validation_step method in Module.
class DataModule(d2l.HyperParameters):  #@save
    """The base class of data."""
    def __init__(self, root='../data', num_workers=4):
        self.save_hyperparameters()

    def get_dataloader(self, train):
        raise NotImplementedError

    def train_dataloader(self):
        return self.get_dataloader(train=True)

    def val_dataloader(self):
        return self.get_dataloader(train=False)
For JAX and TensorFlow, the DataModule omits the num_workers argument:

class DataModule(d2l.HyperParameters):  #@save
    """The base class of data."""
    def __init__(self, root='../data'):
        self.save_hyperparameters()

    def get_dataloader(self, train):
        raise NotImplementedError

    def train_dataloader(self):
        return self.get_dataloader(train=True)

    def val_dataloader(self):
        return self.get_dataloader(train=False)
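To see how a concrete subclass might plug into this interface, here is a hypothetical synthetic regression dataset written purely for illustration (the book defines its own data classes in the next section), built on the PyTorch flavor of d2l.DataModule:

import torch
from torch.utils.data import TensorDataset, DataLoader
from d2l import torch as d2l

class SyntheticData(d2l.DataModule):
    """A hypothetical DataModule yielding random regression batches."""
    def __init__(self, n=1000, batch_size=32):
        super().__init__()
        self.save_hyperparameters()
        self.X = torch.randn(n, 2)
        self.y = self.X @ torch.tensor([2.0, -3.4]) + 4.2

    def get_dataloader(self, train):
        # Wrap the tensors in a Dataset and return a fresh DataLoader;
        # shuffling only matters for the training split
        dataset = TensorDataset(self.X, self.y)
        return DataLoader(dataset, self.batch_size, shuffle=train)

data = SyntheticData()
X, y = next(iter(data.train_dataloader()))
print(X.shape, y.shape)  # torch.Size([32, 2]) torch.Size([32])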
3.2.4. Training
The Trainer class trains the learnable parameters in the Module class with data specified in DataModule. The key method is fit, which accepts two arguments: model, an instance of Module, and data, an instance of DataModule. It then iterates over the entire dataset max_epochs times to train the model. As before, we will defer the implementation of this method to later chapters.
class Trainer(d2l.HyperParameters):  #@save
    """The base class for training models with data."""
    def __init__(self, max_epochs, num_gpus=0, gradient_clip_val=0):
        self.save_hyperparameters()
        assert num_gpus == 0, 'No GPU support yet'

    def prepare_data(self, data):
        self.train_dataloader = data.train_dataloader()
        self.val_dataloader = data.val_dataloader()
        self.num_train_batches = len(self.train_dataloader)
        self.num_val_batches = (len(self.val_dataloader)
                                if self.val_dataloader is not None else 0)

    def prepare_model(self, model):
        model.trainer = self
        model.board.xlim = [0, self.max_epochs]
        self.model = model

    def fit(self, model, data):
        self.prepare_data(data)
        self.prepare_model(model)
        self.optim = model.configure_optimizers()
        self.epoch = 0
        self.train_batch_idx = 0
        self.val_batch_idx = 0
        for self.epoch in range(self.max_epochs):
            self.fit_epoch()

    def fit_epoch(self):
        raise NotImplementedError
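To connect this back to the utilities above, the missing fit_epoch can later be patched in via add_to_class. The sketch below shows a simplified PyTorch training loop written here for illustration; it is not the book's actual implementation, which appears in later chapters.

@add_to_class(Trainer)
def fit_epoch(self):
    """A simplified epoch loop (illustrative sketch only)."""
    self.model.train()  # enable training mode (e.g., dropout)
    for batch in self.train_dataloader:
        loss = self.model.training_step(batch)
        self.optim.zero_grad()
        loss.backward()
        self.optim.step()
        self.train_batch_idx += 1
    if self.val_dataloader is None:
        return
    self.model.eval()  # switch to evaluation mode
    for batch in self.val_dataloader:
        with torch.no_grad():
            self.model.validation_step(batch)
        self.val_batch_idx += 1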
The Trainer class trains the learnable parameters params with data specified in DataModule. The key method is fit, which accepts three arguments: model, an instance of Module, data, an instance of DataModule, and key, a JAX PRNGKeyArray. We make the key argument optional here to simplify the interface, but it is recommended to always pass and initialize the model parameters with a root key in JAX and Flax. It then iterates over the entire dataset max_epochs times to train the model. As before, we will defer the implementation of this method to later chapters.
class Trainer(d2l.HyperParameters):  #@save
    """The base class for training models with data."""
    def __init__(self, max_epochs, num_gpus=0, gradient_clip_val=0):
        self.save_hyperparameters()
        assert num_gpus == 0, 'No GPU support yet'

    def prepare_data(self, data):
        self.train_dataloader = data.train_dataloader()
        self.val_dataloader = data.val_dataloader()
        self.num_train_batches = len(self.train_dataloader)
        self.num_val_batches = (len(self.val_dataloader)
                                if self.val_dataloader is not None else 0)

    def prepare_model(self, model):
        model.trainer = self
        model.board.xlim = [0, self.max_epochs]
        self.model = model

    def fit(self, model, data, key=None):
        self.prepare_data(data)
        self.prepare_model(model)
        self.optim = model.configure_optimizers()

        if key is None:
            root_key = d2l.get_key()
        else:
            root_key = key
        params_key, dropout_key = jax.random.split(root_key)
        key = {'params': params_key, 'dropout': dropout_key}

        dummy_input = next(iter(self.train_dataloader))[:-1]
        variables = model.apply_init(dummy_input, key=key)
        params = variables['params']

        if 'batch_stats' in variables.keys():
            # Here batch_stats will be used later (e.g., for batch norm)
            batch_stats = variables['batch_stats']
        else:
            batch_stats = {}

        # Flax uses optax under the hood for a single state obj TrainState.
        # More will be discussed later in the dropout and batch
        # normalization section
        class TrainState(train_state.TrainState):
            batch_stats: Any
            dropout_rng: jax.random.PRNGKeyArray

        self.state = TrainState.create(apply_fn=model.apply,
                                       params=params,
                                       batch_stats=batch_stats,
                                       dropout_rng=dropout_key,
                                       tx=model.configure_optimizers())
        self.epoch = 0
        self.train_batch_idx = 0
        self.val_batch_idx = 0
        for self.epoch in range(self.max_epochs):
            self.fit_epoch()

    def fit_epoch(self):
        raise NotImplementedError
3.2.5. Summary
To highlight the object-oriented design for our future deep learning implementation, the above classes simply show how their objects store data and interact with each other. We will keep enriching the implementations of these classes, e.g., via @add_to_class, in the rest of the book. Moreover, these fully implemented classes are saved in the d2l library, a lightweight toolkit that makes structured modeling for deep learning easy. In particular, it facilitates reusing many components between projects without changing much at all. For instance, we can replace just the optimizer, just the model, just the dataset, and so on; this degree of modularity pays dividends throughout the book in terms of conciseness and simplicity (this is why we added it), and it can do the same for your own projects.
3.2.6. Exercises
1. Locate full implementations of the above classes that are saved in the d2l library. We strongly recommend that you look at the implementations in detail once you have gained some more familiarity with deep learning modeling.
2. Remove the save_hyperparameters statement in the B class. Can you still print self.a and self.b? Optional: if you have dived into the full implementation of the HyperParameters class, can you explain why?