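This article covers regression problems. In a regression problem, the goal is to predict a continuous-valued output, such as a price or a probability. Contrast this with a classification problem, where the goal is to predict a discrete label (for example, whether a picture contains an apple or an orange).

This notebook uses the classic Auto MPG dataset and builds a model to predict the fuel efficiency of automobiles from the late 1970s and early 1980s. To do this, we provide the model with descriptions of many automobiles from that period. Each description includes the following attributes: cylinders, displacement, horsepower, and weight.

This example uses the tf.keras API. For details, see the guide:

https://tensorflow.google.cn/guide/keras?hl=zh-CN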

# Use seaborn for pairplot
!pip install -q seaborn

from __future__ import absolute_import, division, print_function
import pathlib
import pandas as pd
import seaborn as sns
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
print(tf.__version__)

1.12.0

The Auto MPG dataset

The dataset is available from the UCI Machine Learning Repository (https://archive.ics.uci.edu/).

Get the data

First, download the dataset.

dataset_path = keras.utils.get_file("auto-mpg.data", "https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data")
dataset_path

Downloading data from https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data

32768/30286 [================================] - 0s 1us/step

'/root/.keras/datasets/auto-mpg.data'

Import it using pandas

column_names = ['MPG','Cylinders','Displacement','Horsepower','Weight',
                'Acceleration', 'Model Year', 'Origin']
raw_dataset = pd.read_csv(dataset_path, names=column_names,
                          na_values = "?", comment='\t',
                          sep=" ", skipinitialspace=True)
dataset = raw_dataset.copy()
dataset.tail()

Clean the data

The dataset contains a few unknown values.

dataset.isna().sum()

MPG             0
Cylinders       0
Displacement    0
Horsepower      6
Weight          0
Acceleration    0
Model Year      0
Origin          0
dtype: int64

To keep this introductory tutorial simple, drop those rows.

dataset = dataset.dropna()
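As a minimal sketch of what this cleaning step does, the following uses a small made-up table (the values are illustrative and not taken from the Auto MPG file) in which "?" marks an unknown value, as in the read_csv call above. The names toy, missing_per_column, and cleaned are hypothetical.

```python
import io
import pandas as pd

# Hypothetical three-row sample; "?" marks an unknown Horsepower value.
raw_text = "18.0 130.0\n25.0 ?\n30.0 95.0\n"
toy = pd.read_csv(io.StringIO(raw_text), names=["MPG", "Horsepower"],
                  na_values="?", sep=" ")

missing_per_column = toy.isna().sum()  # count unknown values per column
cleaned = toy.dropna()                 # drop every row that contains one
print(missing_per_column)
print(cleaned)
```

Dropping rows is the simplest option; it discards the other, valid values in those rows, which is acceptable here because only a handful of rows are affected.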

The "Origin" column in the table above is really categorical, not numeric. So convert it to a one-hot encoding:

origin = dataset.pop('Origin')

dataset['USA'] = (origin == 1)*1.0
dataset['Europe'] = (origin == 2)*1.0
dataset['Japan'] = (origin == 3)*1.0
dataset.tail()
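The same one-hot encoding can also be produced with pandas' get_dummies. This is a hedged alternative sketch on made-up origin codes, not the approach used in this notebook; the code-to-name mapping is taken from the assignments above.

```python
import pandas as pd

# Hypothetical origin codes; 1=USA, 2=Europe, 3=Japan as in the code above.
origin = pd.Series([1, 3, 2, 1])
names = origin.map({1: "USA", 2: "Europe", 3: "Japan"})

# get_dummies creates one indicator column per distinct value.
one_hot = pd.get_dummies(names, dtype=float)
print(one_hot)
```

Each row has exactly one column set to 1.0, so the model never treats the origin codes as ordered quantities.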

Split the data into train and test

Now split the data into a training set and a test set. We will use the test set in the final evaluation of the model.

train_dataset = dataset.sample(frac=0.8,random_state=0)
test_dataset = dataset.drop(train_dataset.index)
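To see why sample followed by drop yields two non-overlapping sets, here is a small sketch on a made-up ten-row table (the toy, train, and test names are hypothetical).

```python
import pandas as pd

# Ten made-up rows, indexed 0..9.
toy = pd.DataFrame({"MPG": [float(v) for v in range(10, 20)]})

train = toy.sample(frac=0.8, random_state=0)  # a random 80% of the rows
test = toy.drop(train.index)                  # whatever rows were not sampled

print(len(train), len(test))
print(set(train.index) & set(test.index))     # empty set: no shared rows
```

Fixing random_state makes the split reproducible, so repeated runs evaluate on the same held-out rows.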

Inspect the data

Take a quick look at the joint distribution of a few pairs of columns from the training set.

sns.pairplot(train_dataset[["MPG", "Cylinders", "Displacement", "Weight"]], diag_kind="kde")

Also look at the overall statistics:

train_stats = train_dataset.describe()
train_stats.pop("MPG")
train_stats = train_stats.transpose()
train_stats

Split features from labels

Separate the target value, or "label", from the features. This label is the value that you will train the model to predict.

train_labels = train_dataset.pop('MPG')
test_labels = test_dataset.pop('MPG')

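Normalize the data

Look again at the train_stats block above and note how different the ranges of the features are.

It is good practice to normalize features that use different scales and ranges. Although the model might converge without feature normalization, training becomes more difficult, and the resulting model depends on the choice of units used in the input.

Note: we intentionally compute these statistics from the training set only, and the same statistics are also used to normalize the test set for evaluation. This way the model has no information about the test set.

def norm(x):
    return (x - train_stats['mean']) / train_stats['std']
normed_train_data = norm(train_dataset)
normed_test_data = norm(test_dataset)

This normalized data is what we will use to train the model.

Caution: the statistics used here to normalize the inputs are as important as the model weights, because the same normalization must be applied to any other data fed to the model.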
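The effect of this normalization can be checked on a small made-up example. The sketch below assumes two hypothetical features and a hypothetical new_data row; it computes the statistics from the training rows only and reuses them for the new row.

```python
import pandas as pd

# Made-up training rows and one new row to be scored later.
train = pd.DataFrame({"Weight": [2000.0, 3000.0, 4000.0],
                      "Cylinders": [4.0, 6.0, 8.0]})
new_data = pd.DataFrame({"Weight": [3500.0], "Cylinders": [6.0]})

# Per-feature mean and standard deviation, from the training set only.
stats = train.describe().transpose()

def norm(x):
    # Column names of x are aligned with the index of stats.
    return (x - stats["mean"]) / stats["std"]

normed_train = norm(train)
normed_new = norm(new_data)
print(normed_train)
print(normed_new)
```

After normalization each training feature has mean 0 and standard deviation 1, while the new row is expressed in the same units without contributing to the statistics.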

The model

Build the model

Let's build our model. Here we use a Sequential model with two densely connected hidden layers and an output layer that returns a single continuous value. The model-building steps are wrapped in a function, build_model, because we will create a second model later.

def build_model():
    model = keras.Sequential([
        layers.Dense(64, activation=tf.nn.relu, input_shape=[len(train_dataset.keys())]),
        layers.Dense(64, activation=tf.nn.relu),
        layers.Dense(1)
    ])
    optimizer = tf.train.RMSPropOptimizer(0.001)
    model.compile(loss='mse',
                  optimizer=optimizer,
                  metrics=['mae', 'mse'])
    return model

model = build_model()

Inspect the model

Use the .summary method to print a simple description of the model.

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 64)                640
_________________________________________________________________
dense_1 (Dense)              (None, 64)                4160
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 65
=================================================================
Total params: 4,865
Trainable params: 4,865
Non-trainable params: 0
_________________________________________________________________
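The parameter counts in the summary can be verified by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The sketch below assumes 9 input features (six numeric columns plus the three one-hot origin columns); dense_params is a hypothetical helper, not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

n_features = 9  # assumed: 6 numeric columns + 3 one-hot origin columns

layer_params = [
    dense_params(n_features, 64),  # first hidden layer
    dense_params(64, 64),          # second hidden layer
    dense_params(64, 1),           # output layer
]
total = sum(layer_params)
print(layer_params, total)
```

The three values match the Param # column above, and their sum matches the reported total.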

Now try out the model. Take a batch of 10 examples from the training data and call model.predict on it.

example_batch = normed_train_data[:10]
example_result = model.predict(example_batch)
example_result

array([[ 0.08682194],
       [ 0.0385334 ],
       [ 0.11662665],
       [-0.22370592],
       [ 0.12390759],
       [ 0.1889237 ],
       [ 0.1349103 ],
       [ 0.41427213],
       [ 0.19710071],
       [ 0.01540279]], dtype=float32)

It seems to work, producing a result of the expected shape and type.

Train the model

The model is trained for 1000 epochs, and the training and validation metrics are recorded in the history object.

# Display training progress by printing a single dot for each completed epoch
class PrintDot(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs):
        if epoch % 100 == 0: print('')
        print('.', end='')
EPOCHS = 1000
history = model.fit(
    normed_train_data, train_labels,
    epochs=EPOCHS, validation_split = 0.2, verbose=0,
    callbacks=[PrintDot()])

................................................................................................................................................................................................................................................................................................................................................................................................................

Visualize the model's training progress using the statistics stored in the history object.

hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch
hist.tail()

import matplotlib.pyplot as plt
def plot_history(history):
    # Build the frame from the argument so each call plots its own run
    hist = pd.DataFrame(history.history)
    hist['epoch'] = history.epoch
    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('Mean Abs Error [MPG]')
    plt.plot(hist['epoch'], hist['mean_absolute_error'], label='Train Error')
    plt.plot(hist['epoch'], hist['val_mean_absolute_error'], label = 'Val Error')
    plt.legend()
    plt.ylim([0,5])
    plt.figure()
    plt.xlabel('Epoch')
    plt.ylabel('Mean Square Error [$MPG^2$]')
    plt.plot(hist['epoch'], hist['mean_squared_error'], label='Train Error')
    plt.plot(hist['epoch'], hist['val_mean_squared_error'], label = 'Val Error')
    plt.legend()
    plt.ylim([0,20])
plot_history(history)

The plot shows little improvement, or even degradation, in the validation error after a few hundred epochs. Let's update the model.fit call to stop training automatically when the validation score stops improving. We use a callback that tests a training condition at every epoch. If a set number of epochs elapses without improvement, training stops automatically.

You can learn more about this callback here:

https://tensorflow.google.cn/api_docs/python/tf/keras/callbacks/EarlyStopping?hl=zh-CN

model = build_model()
# The patience parameter is the number of epochs to check for improvement
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=50)
history = model.fit(normed_train_data, train_labels, epochs=EPOCHS,
                    validation_split = 0.2, verbose=0, callbacks=[early_stop, PrintDot()])
plot_history(history)
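The stopping rule described above can be sketched in plain Python. This is a simplified illustration of the patience idea on made-up validation losses, not the Keras implementation; early_stop_epoch is a hypothetical helper.

```python
def early_stop_epoch(val_losses, patience):
    """Return the index of the epoch at which training would stop,
    or None if the patience budget is never exhausted."""
    best = float("inf")
    wait = 0  # epochs elapsed since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

losses = [5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 3.4]  # made-up validation losses
stop_at = early_stop_epoch(losses, patience=3)
print(stop_at)
```

Here the best loss occurs at epoch index 2, and training stops three non-improving epochs later. A larger patience tolerates longer plateaus at the cost of more training time.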

The graph shows that on the validation set the average error is usually around +/- 2 MPG. Is this good? We leave that decision to you.

Let's see how the model performs on the test set, which we did not use when training the model:

loss, mae, mse = model.evaluate(normed_test_data, test_labels, verbose=0)
print("Testing set Mean Abs Error: {:5.2f} MPG".format(mae))

Testing set Mean Abs Error: 1.88 MPG
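For reference, the two metrics reported by the model can be computed by hand. The sketch below uses made-up true and predicted MPG values (not results from this model) and two hypothetical helper functions.

```python
def mean_absolute_error(y_true, y_pred):
    # Average magnitude of the errors, in the same units as the labels (MPG).
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    # Average squared error, in squared units (MPG^2); penalizes large misses more.
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

true_mpg = [20.0, 30.0, 25.0]  # made-up labels
pred_mpg = [22.0, 27.0, 25.0]  # made-up predictions

mae = mean_absolute_error(true_mpg, pred_mpg)
mse = mean_squared_error(true_mpg, pred_mpg)
print(mae, mse)
```

MAE is easier to interpret because it is expressed in MPG, which is why it is the figure printed above, while MSE is the quantity the model minimizes during training.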

Make predictions

Finally, predict MPG values using the data in the test set:

test_predictions = model.predict(normed_test_data).flatten()
plt.scatter(test_labels, test_predictions)
plt.xlabel('True Values [MPG]')
plt.ylabel('Predictions [MPG]')
plt.axis('equal')
plt.axis('square')
plt.xlim([0,plt.xlim()[1]])
plt.ylim([0,plt.ylim()[1]])
_ = plt.plot([-100, 100], [-100, 100])

error = test_predictions - test_labels
plt.hist(error, bins = 25)
plt.xlabel("Prediction Error [MPG]")
_ = plt.ylabel("Count")

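Conclusion

This notebook introduced a few techniques for handling regression problems:

Mean squared error (MSE) is a common loss function for regression problems (different from the losses used for classification).

Similarly, the evaluation metrics used for regression differ from those used for classification. A common regression metric is mean absolute error (MAE).

When input data features have values with different ranges, each feature should be scaled independently.

If there is not much training data, prefer a small network with few hidden layers to avoid overfitting.

Early stopping is a useful technique for preventing overfitting.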


Original title: TensorFlow Regression: Predicting Fuel Efficiency

Source: WeChat ID tensorflowers, WeChat official account Tensorflowers.
