parent 94bce3fbe9
commit e1da9dd954
@@ -0,0 +1,17 @@
---
name: Bug report
about: Please follow this template for bug-related issues; otherwise the issue will be closed directly
title: ''
labels: ''
assignees: ''

---

**Bug description**

Describe the bug you ran into, e.g. where the error occurs and the error message (important; a screenshot is fine)

**Version info**

pytorch:
torchvision:
torchtext:
...
@@ -0,0 +1,5 @@
FROM node:alpine
RUN npm i docsify-cli -g
COPY . /data
WORKDIR /data
CMD [ "docsify", "serve", "docs" ]
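A hedged usage sketch for this Dockerfile: docsify serves on port 3000 by default, and the image tag `dive-dl-docs` below is an arbitrary choice, not something defined in this commit.

```shell
# Build the image from the repository root (where the Dockerfile lives)
docker build -t dive-dl-docs .

# Run it; docsify's default serve port is 3000
docker run --rm -p 3000:3000 dive-dl-docs
```

After this, the rendered docs should be reachable at http://localhost:3000.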
@@ -0,0 +1,201 @@
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!) The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
@@ -0,0 +1,129 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 3.10 Concise Implementation of Multilayer Perceptrons"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.4.1\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "from torch import nn\n",
    "from torch.nn import init\n",
    "import numpy as np\n",
    "import sys\n",
    "sys.path.append(\"..\")\n",
    "import d2lzh_pytorch as d2l\n",
    "\n",
    "print(torch.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.10.1 Defining the Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "num_inputs, num_outputs, num_hiddens = 784, 10, 256\n",
    "\n",
    "net = nn.Sequential(\n",
    "    d2l.FlattenLayer(),\n",
    "    nn.Linear(num_inputs, num_hiddens),\n",
    "    nn.ReLU(),\n",
    "    nn.Linear(num_hiddens, num_outputs),\n",
    "    )\n",
    "\n",
    "for params in net.parameters():\n",
    "    init.normal_(params, mean=0, std=0.01)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.10.2 Reading the Data and Training the Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch 1, loss 0.0031, train acc 0.703, test acc 0.757\n",
      "epoch 2, loss 0.0019, train acc 0.824, test acc 0.822\n",
      "epoch 3, loss 0.0016, train acc 0.845, test acc 0.825\n",
      "epoch 4, loss 0.0015, train acc 0.855, test acc 0.811\n",
      "epoch 5, loss 0.0014, train acc 0.865, test acc 0.846\n"
     ]
    }
   ],
   "source": [
    "batch_size = 256\n",
    "train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)\n",
    "loss = torch.nn.CrossEntropyLoss()\n",
    "\n",
    "optimizer = torch.optim.SGD(net.parameters(), lr=0.5)\n",
    "\n",
    "num_epochs = 5\n",
    "d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, optimizer)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
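The notebook above builds a 784 → 256 → 10 MLP (flatten, affine, ReLU, affine) and relies on torch plus the book's `d2lzh_pytorch` helpers. As a framework-free shape check, here is a minimal NumPy sketch of the same forward pass; it is an illustration of the architecture, not the notebook's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)
num_inputs, num_hiddens, num_outputs = 784, 256, 10

# Small random weights and zero biases, mirroring init.normal_ with std=0.01
W1 = rng.normal(0, 0.01, (num_inputs, num_hiddens))
b1 = np.zeros(num_hiddens)
W2 = rng.normal(0, 0.01, (num_hiddens, num_outputs))
b2 = np.zeros(num_outputs)

def net(X):
    # FlattenLayer: (batch, 1, 28, 28) -> (batch, 784)
    X = X.reshape(X.shape[0], -1)
    H = np.maximum(X @ W1 + b1, 0)  # ReLU
    return H @ W2 + b2              # logits for the 10 classes

X = rng.random((2, 1, 28, 28))
print(net(X).shape)  # (2, 10)
```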
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,160 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 3.1 Linear Regression"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.4.1\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "from time import time\n",
    "\n",
    "print(torch.__version__)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "a = torch.ones(1000)\n",
    "b = torch.ones(1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add the two vectors by performing scalar addition on one pair of elements at a time:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.020173072814941406\n"
     ]
    }
   ],
   "source": [
    "start = time()\n",
    "c = torch.zeros(1000)\n",
    "for i in range(1000):\n",
    "    c[i] = a[i] + b[i]\n",
    "print(time() - start)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add the two vectors directly with vectorized addition:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "8.20159912109375e-05\n"
     ]
    }
   ],
   "source": [
    "start = time()\n",
    "d = a + b\n",
    "print(time() - start)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**The result is clear: the latter is far faster than the former. We should therefore use vectorized computation whenever possible to improve computational efficiency.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "An example of the broadcasting mechanism:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([11., 11., 11.])\n"
     ]
    }
   ],
   "source": [
    "a = torch.ones(3)\n",
    "b = 10\n",
    "print(a + b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
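The loop-versus-vectorized comparison in this notebook does not depend on torch; the same experiment can be reproduced with NumPy alone. A minimal sketch (the variable names mirror the notebook's):

```python
import numpy as np
from time import time

a = np.ones(1000)
b = np.ones(1000)

# Scalar addition in a Python loop, one element pair at a time
start = time()
c = np.zeros(1000)
for i in range(1000):
    c[i] = a[i] + b[i]
loop_time = time() - start

# One vectorized addition over the whole arrays
start = time()
d = a + b
vec_time = time() - start

print(loop_time, vec_time)  # the loop is typically orders of magnitude slower
```

Both paths produce the same result; only the cost differs, which is the notebook's point about preferring vectorized computation.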
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,205 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 3.7 Concise Implementation of Softmax Regression"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.4.1\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "from torch import nn\n",
    "from torch.nn import init\n",
    "import numpy as np\n",
    "import sys\n",
    "sys.path.append(\"..\")\n",
    "import d2lzh_pytorch as d2l\n",
    "\n",
    "print(torch.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.7.1 Getting and Reading the Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "batch_size = 256\n",
    "train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.7.2 Defining and Initializing the Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_inputs = 784\n",
    "num_outputs = 10\n",
    "\n",
    "# class LinearNet(nn.Module):\n",
    "#     def __init__(self, num_inputs, num_outputs):\n",
    "#         super(LinearNet, self).__init__()\n",
    "#         self.linear = nn.Linear(num_inputs, num_outputs)\n",
    "#     def forward(self, x): # x shape: (batch, 1, 28, 28)\n",
    "#         y = self.linear(x.view(x.shape[0], -1))\n",
    "#         return y\n",
    "\n",
    "# net = LinearNet(num_inputs, num_outputs)\n",
    "\n",
    "class FlattenLayer(nn.Module):\n",
    "    def __init__(self):\n",
    "        super(FlattenLayer, self).__init__()\n",
    "    def forward(self, x): # x shape: (batch, *, *, ...)\n",
    "        return x.view(x.shape[0], -1)\n",
    "\n",
    "from collections import OrderedDict\n",
    "net = nn.Sequential(\n",
    "    # FlattenLayer(),\n",
    "    # nn.Linear(num_inputs, num_outputs)\n",
    "    OrderedDict([\n",
    "        ('flatten', FlattenLayer()),\n",
    "        ('linear', nn.Linear(num_inputs, num_outputs))])\n",
    "    )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Parameter containing:\n",
       "tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], requires_grad=True)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "init.normal_(net.linear.weight, mean=0, std=0.01)\n",
    "init.constant_(net.linear.bias, val=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.7.3 Softmax and the Cross-Entropy Loss Function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "loss = nn.CrossEntropyLoss()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.7.4 Defining the Optimization Algorithm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "optimizer = torch.optim.SGD(net.parameters(), lr=0.1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.7.5 Training the Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch 1, loss 0.0031, train acc 0.748, test acc 0.785\n",
      "epoch 2, loss 0.0022, train acc 0.813, test acc 0.802\n",
      "epoch 3, loss 0.0021, train acc 0.824, test acc 0.808\n",
      "epoch 4, loss 0.0020, train acc 0.833, test acc 0.824\n",
      "epoch 5, loss 0.0019, train acc 0.837, test acc 0.806\n"
     ]
    }
   ],
   "source": [
    "num_epochs = 5\n",
    "d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None, None, optimizer)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
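The notebook above delegates softmax and cross-entropy to `nn.CrossEntropyLoss`, which fuses the two for numerical stability. To make the underlying computation concrete, here is a NumPy sketch of softmax plus mean cross-entropy; the max-subtraction trick is one standard way to keep `exp` from overflowing, not necessarily the exact implementation PyTorch uses.

```python
import numpy as np

def softmax(logits):
    # Subtract each row's max before exponentiating for numerical stability
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    # Mean negative log-likelihood of the true classes
    p = softmax(logits)
    return -np.log(p[np.arange(len(labels)), labels]).mean()

logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.0]])
labels = np.array([0, 1])
print(softmax(logits).sum(axis=1))  # each row sums to 1
```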
File diff suppressed because it is too large
@@ -0,0 +1,197 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 3.9 Implementation of Multilayer Perceptrons from Scratch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.4.1\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "import numpy as np\n",
    "import sys\n",
    "sys.path.append(\"..\")  # so we can import d2lzh_pytorch from the parent directory\n",
    "import d2lzh_pytorch as d2l\n",
    "\n",
    "print(torch.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.9.1 Getting and Reading the Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "batch_size = 256\n",
    "train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.9.2 Defining the Model Parameters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "num_inputs, num_outputs, num_hiddens = 784, 10, 256\n",
    "\n",
    "W1 = torch.tensor(np.random.normal(0, 0.01, (num_inputs, num_hiddens)), dtype=torch.float)\n",
    "b1 = torch.zeros(num_hiddens, dtype=torch.float)\n",
    "W2 = torch.tensor(np.random.normal(0, 0.01, (num_hiddens, num_outputs)), dtype=torch.float)\n",
    "b2 = torch.zeros(num_outputs, dtype=torch.float)\n",
    "\n",
    "params = [W1, b1, W2, b2]\n",
    "for param in params:\n",
    "    param.requires_grad_(requires_grad=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.9.3 Defining the Activation Function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def relu(X):\n",
    "    return torch.max(input=X, other=torch.tensor(0.0))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.9.4 Defining the Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def net(X):\n",
    "    X = X.view((-1, num_inputs))\n",
    "    H = relu(torch.matmul(X, W1) + b1)\n",
    "    return torch.matmul(H, W2) + b2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.9.5 Defining the Loss Function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "loss = torch.nn.CrossEntropyLoss()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3.9.6 Training the Model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "epoch 1, loss 0.0030, train acc 0.714, test acc 0.753\n",
      "epoch 2, loss 0.0019, train acc 0.821, test acc 0.777\n",
      "epoch 3, loss 0.0017, train acc 0.842, test acc 0.834\n",
      "epoch 4, loss 0.0015, train acc 0.857, test acc 0.839\n",
      "epoch 5, loss 0.0014, train acc 0.865, test acc 0.845\n"
     ]
    }
   ],
   "source": [
    "num_epochs, lr = 5, 100.0\n",
    "d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, params, lr)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
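The `relu` in this from-scratch notebook is defined via `torch.max(input=X, other=torch.tensor(0.0))`, i.e. an elementwise maximum against zero. The same operation has a direct NumPy analogue; a minimal sketch:

```python
import numpy as np

def relu(X):
    # Elementwise max(x, 0), mirroring torch.max(input=X, other=torch.tensor(0.0))
    return np.maximum(X, 0.0)

X = np.array([[-1.0, 2.0],
              [0.0, -3.5]])
print(relu(X))  # negatives are clamped to zero: [[0. 2.] [0. 0.]]
```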
File diff suppressed because it is too large
@ -0,0 +1,378 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# 4.2 模型参数的访问、初始化和共享"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"0.4.1\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import torch\n",
|
||||
"from torch import nn\n",
|
||||
"from torch.nn import init\n",
|
||||
"\n",
|
||||
"print(torch.__version__)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Sequential(\n",
|
||||
" (0): Linear(in_features=4, out_features=3, bias=True)\n",
|
||||
" (1): ReLU()\n",
|
||||
" (2): Linear(in_features=3, out_features=1, bias=True)\n",
|
||||
")\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1)) # pytorch已进行默认初始化\n",
|
||||
"\n",
|
||||
"print(net)\n",
|
||||
"X = torch.rand(2, 4)\n",
|
||||
"Y = net(X).sum()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 4.2.1 访问模型参数"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"<class 'generator'>\n",
|
||||
"0.weight torch.Size([3, 4])\n",
|
||||
"0.bias torch.Size([3])\n",
|
||||
"2.weight torch.Size([1, 3])\n",
|
||||
"2.bias torch.Size([1])\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(type(net.named_parameters()))\n",
|
||||
"for name, param in net.named_parameters():\n",
|
||||
" print(name, param.size())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"weight torch.Size([3, 4]) <class 'torch.nn.parameter.Parameter'>\n",
|
||||
"bias torch.Size([3]) <class 'torch.nn.parameter.Parameter'>\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"for name, param in net[0].named_parameters():\n",
|
||||
" print(name, param.size(), type(param))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"weight1\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"class MyModel(nn.Module):\n",
"    def __init__(self, **kwargs):\n",
"        super(MyModel, self).__init__(**kwargs)\n",
"        self.weight1 = nn.Parameter(torch.rand(20, 20))\n",
"        self.weight2 = torch.rand(20, 20)\n",
"    def forward(self, x):\n",
"        pass\n",
"\n",
"n = MyModel()\n",
"for name, param in n.named_parameters():\n",
"    print(name)"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"tensor([[ 0.2719, -0.0898, -0.2462, 0.0655],\n",
|
||||
" [-0.4669, -0.2703, 0.3230, 0.2067],\n",
|
||||
" [-0.2708, 0.1171, -0.0995, 0.3913]])\n",
|
||||
"None\n",
|
||||
"tensor([[-0.2281, -0.0653, -0.1646, -0.2569],\n",
|
||||
" [-0.1916, -0.0549, -0.1382, -0.2158],\n",
|
||||
" [ 0.0000, 0.0000, 0.0000, 0.0000]])\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"weight_0 = list(net[0].parameters())[0]\n",
|
||||
"print(weight_0.data)\n",
|
||||
"print(weight_0.grad)\n",
|
||||
"Y.backward()\n",
|
||||
"print(weight_0.grad)"
|
||||
]
|
||||
},
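As a standalone sanity check of the access pattern above (a fresh toy net, not the notebook's variables): `.grad` is `None` before the first backward call, and afterwards has the same shape as the weight itself.

```python
import torch
from torch import nn

net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 1))
w = list(net[0].parameters())[0]   # the first layer's weight
assert w.grad is None              # no backward pass has run yet

net(torch.rand(2, 4)).sum().backward()
print(w.grad.shape)                # torch.Size([3, 4]), same as the weight
```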
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 4.2.2 Initializing Model Parameters"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"0.weight tensor([[ 0.0030, 0.0094, 0.0070, -0.0010],\n",
|
||||
" [ 0.0001, 0.0039, 0.0105, -0.0126],\n",
|
||||
" [ 0.0105, -0.0135, -0.0047, -0.0006]])\n",
|
||||
"2.weight tensor([[-0.0074, 0.0051, 0.0066]])\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"for name, param in net.named_parameters():\n",
"    if 'weight' in name:\n",
"        init.normal_(param, mean=0, std=0.01)\n",
"        print(name, param.data)"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"0.bias tensor([0., 0., 0.])\n",
|
||||
"2.bias tensor([0.])\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"for name, param in net.named_parameters():\n",
"    if 'bias' in name:\n",
"        init.constant_(param, val=0)\n",
"        print(name, param.data)"
]
|
||||
},
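A compact standalone sketch of the same two `init` calls applied to a fresh layer (our own toy layer, not the notebook's `net`):

```python
import torch
from torch import nn
from torch.nn import init

layer = nn.Linear(4, 3)
init.normal_(layer.weight, mean=0, std=0.01)  # in-place Gaussian init
init.constant_(layer.bias, val=0)             # in-place constant init
print(layer.bias.data)  # tensor([0., 0., 0.])
```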
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 4.2.3 Custom Initialization Methods"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def init_weight_(tensor):\n",
"    with torch.no_grad():\n",
"        tensor.uniform_(-10, 10)\n",
"        tensor *= (tensor.abs() >= 5).float()"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"0.weight tensor([[ 7.0403, 0.0000, -9.4569, 7.0111],\n",
|
||||
" [-0.0000, -0.0000, 0.0000, 0.0000],\n",
|
||||
" [ 9.8063, -0.0000, 0.0000, -9.7993]])\n",
|
||||
"2.weight tensor([[-5.8198, 7.7558, -5.0293]])\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"for name, param in net.named_parameters():\n",
|
||||
" if 'weight' in name:\n",
|
||||
" init_weight_(param)\n",
|
||||
" print(name, param.data)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"0.bias tensor([1., 1., 1.])\n",
|
||||
"2.bias tensor([1.])\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"for name, param in net.named_parameters():\n",
|
||||
" if 'bias' in name:\n",
|
||||
" param.data += 1\n",
|
||||
" print(name, param.data)"
|
||||
]
|
||||
},
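To confirm what the custom initializer above produces, a standalone sketch: after `init_weight_`, every entry is either exactly zero or has magnitude at least 5.

```python
import torch
from torch import nn

def init_weight_(tensor):
    with torch.no_grad():                      # edit weights without tracking gradients
        tensor.uniform_(-10, 10)               # draw from U(-10, 10)
        tensor *= (tensor.abs() >= 5).float()  # zero out the small entries

layer = nn.Linear(4, 3)
init_weight_(layer.weight)
w = layer.weight.data
print(((w == 0) | (w.abs() >= 5)).all().item())  # True
```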
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 4.2.4 Sharing Model Parameters"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Sequential(\n",
|
||||
" (0): Linear(in_features=1, out_features=1, bias=False)\n",
|
||||
" (1): Linear(in_features=1, out_features=1, bias=False)\n",
|
||||
")\n",
|
||||
"0.weight tensor([[3.]])\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"linear = nn.Linear(1, 1, bias=False)\n",
"net = nn.Sequential(linear, linear)\n",
"print(net)\n",
"for name, param in net.named_parameters():\n",
"    init.constant_(param, val=3)\n",
"    print(name, param.data)"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 13,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"True\n",
|
||||
"True\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(id(net[0]) == id(net[1]))\n",
|
||||
"print(id(net[0].weight) == id(net[1].weight))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"tensor(9., grad_fn=<SumBackward0>)\n",
|
||||
"tensor([[6.]])\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"x = torch.ones(1, 1)\n",
|
||||
"y = net(x).sum()\n",
|
||||
"print(y)\n",
|
||||
"y.backward()\n",
|
||||
"print(net[0].weight.grad)"
|
||||
]
|
||||
},
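A follow-on sketch of a consequence the section implies: since both slots hold the same `Parameter`, its gradient also accumulates across repeated backward passes unless it is zeroed in between.

```python
import torch
from torch import nn

linear = nn.Linear(1, 1, bias=False)
net = nn.Sequential(linear, linear)   # the same module (and weight) twice
nn.init.constant_(linear.weight, val=3)

x = torch.ones(1, 1)
net(x).sum().backward()               # d(w^2 * x)/dw = 2*w*x = 6
net(x).sum().backward()               # no zero_grad: 6 accumulates to 12
print(linear.weight.grad)             # tensor([[12.]])
```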
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python [default]",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.6.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
||||
@ -0,0 +1,269 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# 4.4 Custom Layers\n",
"## 4.4.1 Custom Layers without Model Parameters"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"0.4.1\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import torch\n",
|
||||
"from torch import nn\n",
|
||||
"\n",
|
||||
"print(torch.__version__)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"class CenteredLayer(nn.Module):\n",
"    def __init__(self, **kwargs):\n",
"        super(CenteredLayer, self).__init__(**kwargs)\n",
"    def forward(self, x):\n",
"        return x - x.mean()"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"tensor([-2., -1., 0., 1., 2.])"
|
||||
]
|
||||
},
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"layer = CenteredLayer()\n",
|
||||
"layer(torch.tensor([1, 2, 3, 4, 5], dtype=torch.float))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"net = nn.Sequential(nn.Linear(8, 128), CenteredLayer())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"0.0"
|
||||
]
|
||||
},
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"y = net(torch.rand(4, 8))\n",
|
||||
"y.mean().item()"
|
||||
]
|
||||
},
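The near-zero mean observed above can be checked mechanically; a self-contained sketch repeating the layer so it runs on its own:

```python
import torch
from torch import nn

class CenteredLayer(nn.Module):
    # parameter-free custom layer: subtract the input's mean
    def forward(self, x):
        return x - x.mean()

net = nn.Sequential(nn.Linear(8, 128), CenteredLayer())
y = net(torch.rand(4, 8))
print(abs(y.mean().item()) < 1e-4)  # True, up to float32 rounding
```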
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 4.4.2 Custom Layers with Model Parameters"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"MyListDense(\n",
|
||||
" (params): ParameterList(\n",
|
||||
" (0): Parameter containing: [torch.FloatTensor of size 4x4]\n",
|
||||
" (1): Parameter containing: [torch.FloatTensor of size 4x4]\n",
|
||||
" (2): Parameter containing: [torch.FloatTensor of size 4x4]\n",
|
||||
" (3): Parameter containing: [torch.FloatTensor of size 4x1]\n",
|
||||
" )\n",
|
||||
")\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"class MyListDense(nn.Module):\n",
"    def __init__(self):\n",
"        super(MyListDense, self).__init__()\n",
"        self.params = nn.ParameterList([nn.Parameter(torch.randn(4, 4)) for i in range(3)])\n",
"        self.params.append(nn.Parameter(torch.randn(4, 1)))\n",
"\n",
"    def forward(self, x):\n",
"        for i in range(len(self.params)):\n",
"            x = torch.mm(x, self.params[i])\n",
"        return x\n",
"\n",
"net = MyListDense()\n",
"print(net)"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"MyDictDense(\n",
|
||||
" (params): ParameterDict(\n",
|
||||
" (linear1): Parameter containing: [torch.FloatTensor of size 4x4]\n",
|
||||
" (linear2): Parameter containing: [torch.FloatTensor of size 4x1]\n",
|
||||
" (linear3): Parameter containing: [torch.FloatTensor of size 4x2]\n",
|
||||
" )\n",
|
||||
")\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"class MyDictDense(nn.Module):\n",
"    def __init__(self):\n",
"        super(MyDictDense, self).__init__()\n",
"        self.params = nn.ParameterDict({\n",
"            'linear1': nn.Parameter(torch.randn(4, 4)),\n",
"            'linear2': nn.Parameter(torch.randn(4, 1))\n",
"        })\n",
"        self.params.update({'linear3': nn.Parameter(torch.randn(4, 2))})  # add a new entry\n",
"\n",
"    def forward(self, x, choice='linear1'):\n",
"        return torch.mm(x, self.params[choice])\n",
"\n",
"net = MyDictDense()\n",
"print(net)"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"tensor([[1.5082, 1.5574, 2.1651, 1.2409]], grad_fn=<MmBackward>)\n",
|
||||
"tensor([[-0.8783]], grad_fn=<MmBackward>)\n",
|
||||
"tensor([[ 2.2193, -1.6539]], grad_fn=<MmBackward>)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"x = torch.ones(1, 4)\n",
|
||||
"print(net(x, 'linear1'))\n",
|
||||
"print(net(x, 'linear2'))\n",
|
||||
"print(net(x, 'linear3'))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Sequential(\n",
|
||||
" (0): MyDictDense(\n",
|
||||
" (params): ParameterDict(\n",
|
||||
" (linear1): Parameter containing: [torch.FloatTensor of size 4x4]\n",
|
||||
" (linear2): Parameter containing: [torch.FloatTensor of size 4x1]\n",
|
||||
" (linear3): Parameter containing: [torch.FloatTensor of size 4x2]\n",
|
||||
" )\n",
|
||||
" )\n",
|
||||
" (1): MyListDense(\n",
|
||||
" (params): ParameterList(\n",
|
||||
" (0): Parameter containing: [torch.FloatTensor of size 4x4]\n",
|
||||
" (1): Parameter containing: [torch.FloatTensor of size 4x4]\n",
|
||||
" (2): Parameter containing: [torch.FloatTensor of size 4x4]\n",
|
||||
" (3): Parameter containing: [torch.FloatTensor of size 4x1]\n",
|
||||
" )\n",
|
||||
" )\n",
|
||||
")\n",
|
||||
"tensor([[-101.2394]], grad_fn=<MmBackward>)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"net = nn.Sequential(\n",
|
||||
" MyDictDense(),\n",
|
||||
" MyListDense(),\n",
|
||||
")\n",
|
||||
"print(net)\n",
|
||||
"print(net(x))"
|
||||
]
|
||||
},
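One point worth making explicit about `ParameterDict` (a standalone sketch): its entries are registered parameters, so they appear with the container's prefix in `named_parameters()` and are therefore visible to optimizers.

```python
import torch
from torch import nn

class MyDictDense(nn.Module):
    def __init__(self):
        super(MyDictDense, self).__init__()
        self.params = nn.ParameterDict({
            'linear1': nn.Parameter(torch.randn(4, 4)),
            'linear2': nn.Parameter(torch.randn(4, 1)),
        })

    def forward(self, x, choice='linear1'):
        return torch.mm(x, self.params[choice])

net = MyDictDense()
names = sorted(name for name, _ in net.named_parameters())
print(names)  # ['params.linear1', 'params.linear2']
```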
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python [default]",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.6.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
||||
@ -0,0 +1,254 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# 4.5 Reading and Saving"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"0.4.1\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import torch\n",
|
||||
"from torch import nn\n",
|
||||
"\n",
|
||||
"print(torch.__version__)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 4.5.1 Reading and Writing `Tensor`s"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"x = torch.ones(3)\n",
|
||||
"torch.save(x, 'x.pt')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"tensor([1., 1., 1.])"
|
||||
]
|
||||
},
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"x2 = torch.load('x.pt')\n",
|
||||
"x2"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[tensor([1., 1., 1.]), tensor([0., 0., 0., 0.])]"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"y = torch.zeros(4)\n",
|
||||
"torch.save([x, y], 'xy.pt')\n",
|
||||
"xy_list = torch.load('xy.pt')\n",
|
||||
"xy_list"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"{'x': tensor([1., 1., 1.]), 'y': tensor([0., 0., 0., 0.])}"
|
||||
]
|
||||
},
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"torch.save({'x': x, 'y': y}, 'xy_dict.pt')\n",
|
||||
"xy = torch.load('xy_dict.pt')\n",
|
||||
"xy"
|
||||
]
|
||||
},
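The same round trip works through a temporary file; a standalone sketch (using `tempfile` rather than the notebook's fixed paths):

```python
import os
import tempfile

import torch

x = torch.ones(3)
y = torch.zeros(4)
path = os.path.join(tempfile.mkdtemp(), 'xy_dict.pt')
torch.save({'x': x, 'y': y}, path)   # serialize a dict of tensors
loaded = torch.load(path)            # deserialize it back
print(torch.equal(loaded['x'], x), torch.equal(loaded['y'], y))  # True True
```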
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 4.5.2 Reading and Writing Models\n",
"### 4.5.2.1 `state_dict`"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"OrderedDict([('hidden.weight', tensor([[ 0.1836, -0.1812, -0.1681],\n",
|
||||
" [ 0.0406, 0.3061, 0.4599]])),\n",
|
||||
" ('hidden.bias', tensor([-0.3384, 0.1910])),\n",
|
||||
" ('output.weight', tensor([[0.0380, 0.4919]])),\n",
|
||||
" ('output.bias', tensor([0.1451]))])"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"class MLP(nn.Module):\n",
"    def __init__(self):\n",
"        super(MLP, self).__init__()\n",
"        self.hidden = nn.Linear(3, 2)\n",
"        self.act = nn.ReLU()\n",
"        self.output = nn.Linear(2, 1)\n",
"\n",
"    def forward(self, x):\n",
"        a = self.act(self.hidden(x))\n",
"        return self.output(a)\n",
"\n",
"net = MLP()\n",
"net.state_dict()"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"{'param_groups': [{'dampening': 0,\n",
|
||||
" 'lr': 0.001,\n",
|
||||
" 'momentum': 0.9,\n",
|
||||
" 'nesterov': False,\n",
|
||||
" 'params': [4624483024, 4624484608, 4624484680, 4624484752],\n",
|
||||
" 'weight_decay': 0}],\n",
|
||||
" 'state': {}}"
|
||||
]
|
||||
},
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)\n",
|
||||
"optimizer.state_dict()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### 4.5.2.2 Saving and Loading a Model"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"tensor([[1],\n",
|
||||
" [1]], dtype=torch.uint8)"
|
||||
]
|
||||
},
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"X = torch.randn(2, 3)\n",
"Y = net(X)\n",
"\n",
"PATH = \"./net.pt\"\n",
"torch.save(net.state_dict(), PATH)\n",
"\n",
"net2 = MLP()\n",
"net2.load_state_dict(torch.load(PATH))\n",
"Y2 = net2(X)\n",
"Y2 == Y"
]
|
||||
},
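A common extension of the pattern above (our own convention, not from the text) saves the model and optimizer `state_dict`s together in one checkpoint, so training can resume exactly where it stopped:

```python
import os
import tempfile

import torch
from torch import nn

net = nn.Linear(3, 1)
opt = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
path = os.path.join(tempfile.mkdtemp(), 'checkpoint.pt')
torch.save({'model': net.state_dict(), 'optim': opt.state_dict()}, path)

net2 = nn.Linear(3, 1)
opt2 = torch.optim.SGD(net2.parameters(), lr=0.001, momentum=0.9)
ckpt = torch.load(path)
net2.load_state_dict(ckpt['model'])  # copies the saved values into net2's tensors
opt2.load_state_dict(ckpt['optim'])

X = torch.randn(2, 3)
print(torch.equal(net(X), net2(X)))  # True: identical weights give identical output
```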
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python [default]",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.6.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
||||
@ -0,0 +1,482 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# 4.6 GPU Computing"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:15.123349Z",
|
||||
"start_time": "2019-03-17T08:12:14.979997Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Sun Mar 17 16:12:15 2019 \r\n",
|
||||
"+-----------------------------------------------------------------------------+\r\n",
|
||||
"| NVIDIA-SMI 390.48 Driver Version: 390.48 |\r\n",
|
||||
"|-------------------------------+----------------------+----------------------+\r\n",
|
||||
"| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |\r\n",
|
||||
"| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |\r\n",
|
||||
"|===============================+======================+======================|\r\n",
|
||||
"| 0 GeForce GTX 1050 Off | 00000000:01:00.0 Off | N/A |\r\n",
|
||||
"| 20% 40C P5 N/A / 75W | 1213MiB / 2000MiB | 23% Default |\r\n",
|
||||
"+-------------------------------+----------------------+----------------------+\r\n",
|
||||
" \r\n",
|
||||
"+-----------------------------------------------------------------------------+\r\n",
|
||||
"| Processes: GPU Memory |\r\n",
|
||||
"| GPU PID Type Process name Usage |\r\n",
|
||||
"|=============================================================================|\r\n",
|
||||
"| 0 1235 G /usr/lib/xorg/Xorg 434MiB |\r\n",
|
||||
"| 0 2095 G compiz 171MiB |\r\n",
|
||||
"| 0 2660 G /opt/teamviewer/tv_bin/TeamViewer 5MiB |\r\n",
|
||||
"| 0 4166 G /proc/self/exe 397MiB |\r\n",
|
||||
"| 0 13274 C /home/tss/anaconda3/bin/python 191MiB |\r\n",
|
||||
"+-----------------------------------------------------------------------------+\r\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"!nvidia-smi  # works for Linux/macOS users"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:15.512222Z",
|
||||
"start_time": "2019-03-17T08:12:15.124792Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"0.4.1\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import torch\n",
|
||||
"from torch import nn\n",
|
||||
"\n",
|
||||
"print(torch.__version__)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 4.6.1 Compute Devices"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:15.539276Z",
|
||||
"start_time": "2019-03-17T08:12:15.513205Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"True"
|
||||
]
|
||||
},
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"torch.cuda.is_available()  # is CUDA available"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:15.543795Z",
|
||||
"start_time": "2019-03-17T08:12:15.540338Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"1"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"torch.cuda.device_count()  # number of GPUs"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:15.551451Z",
|
||||
"start_time": "2019-03-17T08:12:15.544964Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"0"
|
||||
]
|
||||
},
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"torch.cuda.current_device()  # index of the current device, starting from 0"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:15.555020Z",
|
||||
"start_time": "2019-03-17T08:12:15.552387Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'GeForce GTX 1050'"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"torch.cuda.get_device_name(0)  # name of the GPU"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 4.6.2 GPU Computation with `Tensor`s"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:15.562186Z",
|
||||
"start_time": "2019-03-17T08:12:15.556621Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"tensor([1, 2, 3])"
|
||||
]
|
||||
},
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"x = torch.tensor([1, 2, 3])\n",
|
||||
"x"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:17.441336Z",
|
||||
"start_time": "2019-03-17T08:12:15.563813Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"tensor([1, 2, 3], device='cuda:0')"
|
||||
]
|
||||
},
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"x = x.cuda(0)\n",
|
||||
"x"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:17.449383Z",
|
||||
"start_time": "2019-03-17T08:12:17.445193Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"device(type='cuda', index=0)"
|
||||
]
|
||||
},
|
||||
"execution_count": 9,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"x.device"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:17.454548Z",
|
||||
"start_time": "2019-03-17T08:12:17.450268Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"tensor([1, 2, 3], device='cuda:0')"
|
||||
]
|
||||
},
|
||||
"execution_count": 10,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
|
||||
"\n",
|
||||
"x = torch.tensor([1, 2, 3], device=device)\n",
|
||||
"# or\n",
|
||||
"x = torch.tensor([1, 2, 3]).to(device)\n",
|
||||
"x"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:17.467441Z",
|
||||
"start_time": "2019-03-17T08:12:17.455495Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"tensor([1, 4, 9], device='cuda:0')"
|
||||
]
|
||||
},
|
||||
"execution_count": 11,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"y = x**2\n",
|
||||
"y"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:17.470297Z",
|
||||
"start_time": "2019-03-17T08:12:17.468866Z"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# z = y + x.cpu()  # would raise an error: y lives on cuda:0 but x.cpu() lives on the CPU"
]
|
||||
},
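The commented line above fails because it mixes devices; a standalone sketch of the device-agnostic pattern that avoids this (it also runs on a CPU-only machine):

```python
import torch

# pick whatever device is available; the same code then runs anywhere
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.tensor([1, 2, 3]).to(device)
y = x ** 2             # stays on x's device
z = y.cpu() + x.cpu()  # move both to the CPU before mixing with CPU tensors
print(z)               # tensor([ 2,  6, 12])
```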
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 4.6.3 GPU Computation with Models"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 13,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:17.474763Z",
|
||||
"start_time": "2019-03-17T08:12:17.471348Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"device(type='cpu')"
|
||||
]
|
||||
},
|
||||
"execution_count": 13,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"net = nn.Linear(3, 1)\n",
|
||||
"list(net.parameters())[0].device"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:17.478553Z",
|
||||
"start_time": "2019-03-17T08:12:17.475677Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"device(type='cuda', index=0)"
|
||||
]
|
||||
},
|
||||
"execution_count": 14,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"net.cuda()\n",
|
||||
"list(net.parameters())[0].device"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2019-03-17T08:12:17.957448Z",
|
||||
"start_time": "2019-03-17T08:12:17.479843Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"tensor([[-0.5574],\n",
|
||||
" [-0.3792]], device='cuda:0', grad_fn=<ThAddmmBackward>)"
|
||||
]
|
||||
},
|
||||
"execution_count": 15,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"x = torch.rand(2,3).cuda()\n",
|
||||
"net(x)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python [default]",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.6.4"
|
||||
},
|
||||
"varInspector": {
|
||||
"cols": {
|
||||
"lenName": 16,
|
||||
"lenType": 16,
|
||||
"lenVar": 40
|
||||
},
|
||||
"kernels_config": {
|
||||
"python": {
|
||||
"delete_cmd_postfix": "",
|
||||
"delete_cmd_prefix": "del ",
|
||||
"library": "var_list.py",
|
||||
"varRefreshCmd": "print(var_dic_list())"
|
||||
},
|
||||
"r": {
|
||||
"delete_cmd_postfix": ") ",
|
||||
"delete_cmd_prefix": "rm(",
|
||||
"library": "var_list.r",
|
||||
"varRefreshCmd": "cat(var_dic_list()) "
|
||||
}
|
||||
},
|
||||
"types_to_exclude": [
|
||||
"module",
|
||||
"function",
|
||||
"builtin_function_or_method",
|
||||
"instance",
|
||||
"_Feature"
|
||||
],
|
||||
"window_display": false
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
||||
@ -0,0 +1,266 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# 5.1 Two-Dimensional Convolutional Layers\n",
"## 5.1.1 Two-Dimensional Cross-Correlation"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import torch \n",
|
||||
"from torch import nn\n",
|
||||
"\n",
|
||||
"print(torch.__version__)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def corr2d(X, K):  # this function is saved in the d2lzh_pytorch package for later use\n",
"    h, w = K.shape\n",
"    X, K = X.float(), K.float()\n",
"    Y = torch.zeros((X.shape[0] - h + 1, X.shape[1] - w + 1))\n",
"    for i in range(Y.shape[0]):\n",
"        for j in range(Y.shape[1]):\n",
"            Y[i, j] = (X[i: i + h, j: j + w] * K).sum()\n",
"    return Y"
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"tensor([[19., 25.],\n",
|
||||
" [37., 43.]])"
|
||||
]
|
||||
},
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"X = torch.tensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]])\n",
|
||||
"K = torch.tensor([[0, 1], [2, 3]])\n",
|
||||
"corr2d(X, K)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 5.1.2 二维卷积层"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"class Conv2D(nn.Module):\n",
|
||||
" def __init__(self, kernel_size):\n",
|
||||
" super(Conv2D, self).__init__()\n",
|
||||
" self.weight = nn.Parameter(torch.randn(kernel_size))\n",
|
||||
" self.bias = nn.Parameter(torch.randn(1))\n",
|
||||
"\n",
|
||||
" def forward(self, x):\n",
|
||||
" return corr2d(x, self.weight) + self.bias"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 5.1.3 图像中物体边缘检测"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"tensor([[1., 1., 0., 0., 0., 0., 1., 1.],\n",
|
||||
" [1., 1., 0., 0., 0., 0., 1., 1.],\n",
|
||||
" [1., 1., 0., 0., 0., 0., 1., 1.],\n",
|
||||
" [1., 1., 0., 0., 0., 0., 1., 1.],\n",
|
||||
" [1., 1., 0., 0., 0., 0., 1., 1.],\n",
|
||||
" [1., 1., 0., 0., 0., 0., 1., 1.]])"
|
||||
]
|
||||
},
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"X = torch.ones(6, 8)\n",
|
||||
"X[:, 2:6] = 0\n",
|
||||
"X"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"K = torch.tensor([[1, -1]])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"tensor([[ 0., 1., 0., 0., 0., -1., 0.],\n",
|
||||
" [ 0., 1., 0., 0., 0., -1., 0.],\n",
|
||||
" [ 0., 1., 0., 0., 0., -1., 0.],\n",
|
||||
" [ 0., 1., 0., 0., 0., -1., 0.],\n",
|
||||
" [ 0., 1., 0., 0., 0., -1., 0.],\n",
|
||||
" [ 0., 1., 0., 0., 0., -1., 0.]])"
|
||||
]
|
||||
},
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"Y = corr2d(X, K)\n",
|
||||
"Y"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 5.1.4 通过数据学习核数组"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Step 5, loss 1.844\n",
|
||||
"Step 10, loss 0.206\n",
|
||||
"Step 15, loss 0.023\n",
|
||||
"Step 20, loss 0.003\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# 构造一个核数组形状是(1, 2)的二维卷积层\n",
|
||||
"conv2d = Conv2D(kernel_size=(1, 2))\n",
|
||||
"\n",
|
||||
"step = 20\n",
|
||||
"lr = 0.01\n",
|
||||
"for i in range(step):\n",
|
||||
" Y_hat = conv2d(X)\n",
|
||||
" l = ((Y_hat - Y) ** 2).sum()\n",
|
||||
" l.backward()\n",
|
||||
" \n",
|
||||
" # 梯度下降\n",
|
||||
" conv2d.weight.data -= lr * conv2d.weight.grad\n",
|
||||
" conv2d.bias.data -= lr * conv2d.bias.grad\n",
|
||||
" \n",
|
||||
" # 梯度清0\n",
|
||||
" conv2d.weight.grad.fill_(0)\n",
|
||||
" conv2d.bias.grad.fill_(0)\n",
|
||||
" if (i + 1) % 5 == 0:\n",
|
||||
" print('Step %d, loss %.3f' % (i + 1, l.item()))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"weight: tensor([[ 0.9948, -1.0092]])\n",
|
||||
"bias: tensor([0.0080])\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(\"weight: \", conv2d.weight.data)\n",
|
||||
"print(\"bias: \", conv2d.bias.data)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python [default]",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.6.13"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
||||
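As a quick sanity check on the hand-written cross-correlation routine in the notebook above, `corr2d` can be compared against `torch.nn.functional.conv2d` (a minimal sketch; note that PyTorch's `conv2d` actually computes cross-correlation, not flipped convolution, so the two should agree directly):

```python
import torch
import torch.nn.functional as F

def corr2d(X, K):
    # Naive 2D cross-correlation, mirroring the notebook implementation
    h, w = K.shape
    Y = torch.zeros(X.shape[0] - h + 1, X.shape[1] - w + 1)
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

X = torch.arange(9, dtype=torch.float).view(3, 3)
K = torch.tensor([[0., 1.], [2., 3.]])
manual = corr2d(X, K)
# F.conv2d expects (batch, channel, H, W) inputs
builtin = F.conv2d(X.view(1, 1, 3, 3), K.view(1, 1, 2, 2)).squeeze()
print(manual)                            # tensor([[19., 25.], [37., 43.]])
print(torch.allclose(manual, builtin))   # True
```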
@ -0,0 +1,290 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 5.4 Pooling Layers"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.4.1\n"
]
}
],
"source": [
"import torch\n",
"from torch import nn\n",
"\n",
"print(torch.__version__)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5.4.1 Two-Dimensional Max Pooling and Average Pooling"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def pool2d(X, pool_size, mode='max'):\n",
"    X = X.float()\n",
"    p_h, p_w = pool_size\n",
"    Y = torch.zeros(X.shape[0] - p_h + 1, X.shape[1] - p_w + 1)\n",
"    for i in range(Y.shape[0]):\n",
"        for j in range(Y.shape[1]):\n",
"            if mode == 'max':\n",
"                Y[i, j] = X[i: i + p_h, j: j + p_w].max()\n",
"            elif mode == 'avg':\n",
"                Y[i, j] = X[i: i + p_h, j: j + p_w].mean()\n",
"    return Y"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[4., 5.],\n",
"        [7., 8.]])"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X = torch.tensor([[0, 1, 2], [3, 4, 5], [6, 7, 8]])\n",
"pool2d(X, (2, 2))"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[2., 3.],\n",
"        [5., 6.]])"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pool2d(X, (2, 2), 'avg')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5.4.2 Padding and Stride"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[[[ 0., 1., 2., 3.],\n",
"          [ 4., 5., 6., 7.],\n",
"          [ 8., 9., 10., 11.],\n",
"          [12., 13., 14., 15.]]]])"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X = torch.arange(16, dtype=torch.float).view((1, 1, 4, 4))\n",
"X"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[[[10.]]]])"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pool2d = nn.MaxPool2d(3)\n",
"pool2d(X)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[[[ 5., 7.],\n",
"          [13., 15.]]]])"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pool2d = nn.MaxPool2d(3, padding=1, stride=2)\n",
"pool2d(X)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[[[ 1., 3.],\n",
"          [ 9., 11.],\n",
"          [13., 15.]]]])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pool2d = nn.MaxPool2d((2, 4), padding=(1, 2), stride=(2, 3))\n",
"pool2d(X)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5.4.3 Multiple Channels"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[[[ 0., 1., 2., 3.],\n",
"          [ 4., 5., 6., 7.],\n",
"          [ 8., 9., 10., 11.],\n",
"          [12., 13., 14., 15.]],\n",
"\n",
"         [[ 1., 2., 3., 4.],\n",
"          [ 5., 6., 7., 8.],\n",
"          [ 9., 10., 11., 12.],\n",
"          [13., 14., 15., 16.]]]])"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X = torch.cat((X, X + 1), dim=1)\n",
"X"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[[[ 5., 7.],\n",
"          [13., 15.]],\n",
"\n",
"         [[ 6., 8.],\n",
"          [14., 16.]]]])"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pool2d = nn.MaxPool2d(3, padding=1, stride=2)\n",
"pool2d(X)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
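The naive `pool2d` in the notebook above can likewise be checked against `nn.MaxPool2d` (a minimal sketch; note `nn.MaxPool2d` defaults its stride to the kernel size, so `stride=1` must be passed explicitly to match the sliding-window loop):

```python
import torch
from torch import nn

def pool2d(X, pool_size, mode='max'):
    # Naive 2D pooling, mirroring the notebook implementation
    p_h, p_w = pool_size
    Y = torch.zeros(X.shape[0] - p_h + 1, X.shape[1] - p_w + 1)
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            window = X[i:i + p_h, j:j + p_w]
            Y[i, j] = window.max() if mode == 'max' else window.mean()
    return Y

X = torch.arange(9, dtype=torch.float).view(3, 3)
out = pool2d(X, (2, 2))
# nn.MaxPool2d defaults stride to the kernel size; stride=1 matches the loop above
builtin = nn.MaxPool2d(2, stride=1)(X.view(1, 1, 3, 3)).squeeze()
print(out)                            # tensor([[4., 5.], [7., 8.]])
print(torch.allclose(out, builtin))   # True
```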
@ -0,0 +1,106 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 6.2 Recurrent Neural Networks"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.4.1\n"
]
}
],
"source": [
"import torch\n",
"\n",
"print(torch.__version__)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[ 5.2633, -3.2288, 0.6037, -1.3321],\n",
"        [ 9.4012, -6.7830, 1.0630, -0.1809],\n",
"        [ 7.0355, -2.2361, 0.7469, -3.4667]])"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X, W_xh = torch.randn(3, 1), torch.randn(1, 4)\n",
"H, W_hh = torch.randn(3, 4), torch.randn(4, 4)\n",
"torch.matmul(X, W_xh) + torch.matmul(H, W_hh)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[ 5.2633, -3.2288, 0.6037, -1.3321],\n",
"        [ 9.4012, -6.7830, 1.0630, -0.1809],\n",
"        [ 7.0355, -2.2361, 0.7469, -3.4667]])"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"torch.matmul(torch.cat((X, H), dim=1), torch.cat((W_xh, W_hh), dim=0))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
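The two cells in the notebook above demonstrate that the RNN hidden-state update X·W_xh + H·W_hh can be computed as a single matrix product over concatenated inputs and stacked weights. A seeded, self-contained sketch of the same identity:

```python
import torch

torch.manual_seed(0)
# X: inputs (batch=3, input_dim=1); H: previous hidden state (batch=3, hidden=4)
X, W_xh = torch.randn(3, 1), torch.randn(1, 4)
H, W_hh = torch.randn(3, 4), torch.randn(4, 4)

# Separate products, as in the update H_t = X_t W_xh + H_{t-1} W_hh
separate = torch.matmul(X, W_xh) + torch.matmul(H, W_hh)

# Equivalent single product: concatenate inputs along columns,
# stack the weight matrices along rows
combined = torch.matmul(torch.cat((X, H), dim=1),
                        torch.cat((W_xh, W_hh), dim=0))
print(torch.allclose(separate, combined))  # True
```

This is why an RNN cell is often implemented with one fused weight matrix instead of two separate ones.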
@ -0,0 +1,283 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 6.3 Language Model Dataset (Jay Chou Album Lyrics)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.4.1\n",
"cpu\n"
]
}
],
"source": [
"import torch\n",
"import random\n",
"import zipfile\n",
"\n",
"device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
"print(torch.__version__)\n",
"print(device)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6.3.1 Reading the Dataset"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'想要有直升机\\n想要和你飞到宇宙去\\n想要和你融化在一起\\n融化在宇宙里\\n我每天每天每'"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with zipfile.ZipFile('../../data/jaychou_lyrics.txt.zip') as zin:\n",
"    with zin.open('jaychou_lyrics.txt') as f:\n",
"        corpus_chars = f.read().decode('utf-8')\n",
"corpus_chars[:40]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"corpus_chars = corpus_chars.replace('\\n', ' ').replace('\\r', ' ')\n",
"corpus_chars = corpus_chars[0:10000]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6.3.2 Building a Character Index"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1027"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"idx_to_char = list(set(corpus_chars))\n",
"char_to_idx = dict([(char, i) for i, char in enumerate(idx_to_char)])\n",
"vocab_size = len(char_to_idx)\n",
"vocab_size"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"chars: 想要有直升机 想要和你飞到宇宙去 想要和\n",
"indices: [981, 858, 519, 53, 577, 1005, 299, 981, 858, 856, 550, 956, 672, 948, 1003, 334, 299, 981, 858, 856]\n"
]
}
],
"source": [
"corpus_indices = [char_to_idx[char] for char in corpus_chars]\n",
"sample = corpus_indices[:20]\n",
"print('chars:', ''.join([idx_to_char[idx] for idx in sample]))\n",
"print('indices:', sample)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6.3.3 Sampling Sequential Data\n",
"### 6.3.3.1 Random Sampling"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# This function is saved in the d2lzh_pytorch package for later use\n",
"def data_iter_random(corpus_indices, batch_size, num_steps, device=None):\n",
"    # Subtract 1 because the label index y is the corresponding input index x plus 1\n",
"    num_examples = (len(corpus_indices) - 1) // num_steps\n",
"    epoch_size = num_examples // batch_size\n",
"    example_indices = list(range(num_examples))\n",
"    random.shuffle(example_indices)\n",
"\n",
"    # Return a sequence of length num_steps starting from pos\n",
"    def _data(pos):\n",
"        return corpus_indices[pos: pos + num_steps]\n",
"    if device is None:\n",
"        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
"    \n",
"    for i in range(epoch_size):\n",
"        # Read batch_size random examples each time\n",
"        i = i * batch_size\n",
"        batch_indices = example_indices[i: i + batch_size]\n",
"        X = [_data(j * num_steps) for j in batch_indices]\n",
"        Y = [_data(j * num_steps + 1) for j in batch_indices]\n",
"        yield torch.tensor(X, dtype=torch.float32, device=device), torch.tensor(Y, dtype=torch.float32, device=device)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"X: tensor([[18., 19., 20., 21., 22., 23.],\n",
"        [12., 13., 14., 15., 16., 17.]]) \n",
"Y: tensor([[19., 20., 21., 22., 23., 24.],\n",
"        [13., 14., 15., 16., 17., 18.]]) \n",
"\n",
"X: tensor([[ 0., 1., 2., 3., 4., 5.],\n",
"        [ 6., 7., 8., 9., 10., 11.]]) \n",
"Y: tensor([[ 1., 2., 3., 4., 5., 6.],\n",
"        [ 7., 8., 9., 10., 11., 12.]]) \n",
"\n"
]
}
],
"source": [
"my_seq = list(range(30))\n",
"for X, Y in data_iter_random(my_seq, batch_size=2, num_steps=6):\n",
"    print('X: ', X, '\\nY:', Y, '\\n')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 6.3.3.2 Consecutive Sampling"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# This function is saved in the d2lzh_pytorch package for later use\n",
"def data_iter_consecutive(corpus_indices, batch_size, num_steps, device=None):\n",
"    if device is None:\n",
"        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
"    corpus_indices = torch.tensor(corpus_indices, dtype=torch.float32, device=device)\n",
"    data_len = len(corpus_indices)\n",
"    batch_len = data_len // batch_size\n",
"    indices = corpus_indices[0: batch_size*batch_len].view(batch_size, batch_len)\n",
"    epoch_size = (batch_len - 1) // num_steps\n",
"    for i in range(epoch_size):\n",
"        i = i * num_steps\n",
"        X = indices[:, i: i + num_steps]\n",
"        Y = indices[:, i + 1: i + num_steps + 1]\n",
"        yield X, Y"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"X: tensor([[ 0., 1., 2., 3., 4., 5.],\n",
"        [15., 16., 17., 18., 19., 20.]]) \n",
"Y: tensor([[ 1., 2., 3., 4., 5., 6.],\n",
"        [16., 17., 18., 19., 20., 21.]]) \n",
"\n",
"X: tensor([[ 6., 7., 8., 9., 10., 11.],\n",
"        [21., 22., 23., 24., 25., 26.]]) \n",
"Y: tensor([[ 7., 8., 9., 10., 11., 12.],\n",
"        [22., 23., 24., 25., 26., 27.]]) \n",
"\n"
]
}
],
"source": [
"for X, Y in data_iter_consecutive(my_seq, batch_size=2, num_steps=6):\n",
"    print('X: ', X, '\\nY:', Y, '\\n')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
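The key invariant of the random-sampling iterator in the notebook above is that each label sequence Y is its input sequence X shifted left by one time step. A CPU-only sketch (integer tensors instead of float, no device argument) that checks this property on the same toy corpus:

```python
import random
import torch

def data_iter_random(corpus_indices, batch_size, num_steps):
    # Random minibatch sampling, as in the notebook above (CPU-only sketch)
    num_examples = (len(corpus_indices) - 1) // num_steps
    example_indices = list(range(num_examples))
    random.shuffle(example_indices)

    def _data(pos):
        # Sequence of length num_steps starting from pos
        return corpus_indices[pos: pos + num_steps]

    for i in range(0, num_examples // batch_size * batch_size, batch_size):
        batch = example_indices[i: i + batch_size]
        X = torch.tensor([_data(j * num_steps) for j in batch])
        Y = torch.tensor([_data(j * num_steps + 1) for j in batch])
        yield X, Y

my_seq = list(range(30))
for X, Y in data_iter_random(my_seq, batch_size=2, num_steps=6):
    # For the corpus 0..29, each label is exactly its input plus one
    assert torch.equal(Y, X + 1)
```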
@ -0,0 +1,292 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 6.5 Concise Implementation of Recurrent Neural Networks"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1.0.0 cuda\n"
]
}
],
"source": [
"import time\n",
"import math\n",
"import numpy as np\n",
"import torch\n",
"from torch import nn, optim\n",
"import torch.nn.functional as F\n",
"\n",
"import sys\n",
"sys.path.append(\"..\")\n",
"import d2lzh_pytorch as d2l\n",
"device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
"\n",
"(corpus_indices, char_to_idx, idx_to_char, vocab_size) = d2l.load_data_jay_lyrics()\n",
"\n",
"print(torch.__version__, device)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6.5.1 Defining the Model"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"num_hiddens = 256\n",
"# rnn_layer = nn.LSTM(input_size=vocab_size, hidden_size=num_hiddens) # tested\n",
"rnn_layer = nn.RNN(input_size=vocab_size, hidden_size=num_hiddens)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"torch.Size([35, 2, 256]) 1 torch.Size([2, 256])\n"
]
}
],
"source": [
"num_steps = 35\n",
"batch_size = 2\n",
"state = None\n",
"X = torch.rand(num_steps, batch_size, vocab_size)\n",
"Y, state_new = rnn_layer(X, state)\n",
"print(Y.shape, len(state_new), state_new[0].shape)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# This class is saved in the d2lzh_pytorch package for later use\n",
"class RNNModel(nn.Module):\n",
"    def __init__(self, rnn_layer, vocab_size):\n",
"        super(RNNModel, self).__init__()\n",
"        self.rnn = rnn_layer\n",
"        self.hidden_size = rnn_layer.hidden_size * (2 if rnn_layer.bidirectional else 1)\n",
"        self.vocab_size = vocab_size\n",
"        self.dense = nn.Linear(self.hidden_size, vocab_size)\n",
"        self.state = None\n",
"\n",
"    def forward(self, inputs, state): # inputs: (batch, seq_len)\n",
"        # Get the one-hot vector representation\n",
"        X = d2l.to_onehot(inputs, vocab_size) # X is a list\n",
"        Y, self.state = self.rnn(torch.stack(X), state)\n",
"        # The fully connected layer first reshapes Y to (num_steps * batch_size, num_hiddens);\n",
"        # its output has shape (num_steps * batch_size, vocab_size)\n",
"        output = self.dense(Y.view(-1, Y.shape[-1]))\n",
"        return output, self.state"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6.5.2 Training the Model"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# This function is saved in the d2lzh_pytorch package for later use\n",
"def predict_rnn_pytorch(prefix, num_chars, model, vocab_size, device, idx_to_char,\n",
"                        char_to_idx):\n",
"    state = None\n",
"    output = [char_to_idx[prefix[0]]] # output records the prefix plus the generated characters\n",
"    for t in range(num_chars + len(prefix) - 1):\n",
"        X = torch.tensor([output[-1]], device=device).view(1, 1)\n",
"        if state is not None:\n",
"            if isinstance(state, tuple): # LSTM, state:(h, c)\n",
"                state = (state[0].to(device), state[1].to(device))\n",
"            else:\n",
"                state = state.to(device)\n",
"        \n",
"        (Y, state) = model(X, state) # The forward pass does not need the model parameters passed in\n",
"        if t < len(prefix) - 1:\n",
"            output.append(char_to_idx[prefix[t + 1]])\n",
"        else:\n",
"            output.append(int(Y.argmax(dim=1).item()))\n",
"    return ''.join([idx_to_char[i] for i in output])"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'分开戏想暖迎凉想征凉征征'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model = RNNModel(rnn_layer, vocab_size).to(device)\n",
"predict_rnn_pytorch('分开', 10, model, vocab_size, device, idx_to_char, char_to_idx)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# This function is saved in the d2lzh_pytorch package for later use\n",
"def train_and_predict_rnn_pytorch(model, num_hiddens, vocab_size, device,\n",
"                                  corpus_indices, idx_to_char, char_to_idx,\n",
"                                  num_epochs, num_steps, lr, clipping_theta,\n",
"                                  batch_size, pred_period, pred_len, prefixes):\n",
"    loss = nn.CrossEntropyLoss()\n",
"    optimizer = torch.optim.Adam(model.parameters(), lr=lr)\n",
"    model.to(device)\n",
"    state = None\n",
"    for epoch in range(num_epochs):\n",
"        l_sum, n, start = 0.0, 0, time.time()\n",
"        data_iter = d2l.data_iter_consecutive(corpus_indices, batch_size, num_steps, device) # consecutive sampling\n",
"        for X, Y in data_iter:\n",
"            if state is not None:\n",
"                # Detach the hidden state from the computation graph so that\n",
"                # gradient computation depends only on the mini-batch read in this\n",
"                # iteration (prevents gradient computation from growing too expensive)\n",
"                if isinstance(state, tuple): # LSTM, state:(h, c)\n",
"                    state = (state[0].detach(), state[1].detach())\n",
"                else:\n",
"                    state = state.detach()\n",
"    \n",
"            (output, state) = model(X, state) # output: shape (num_steps * batch_size, vocab_size)\n",
"            \n",
"            # Y has shape (batch_size, num_steps); transpose and flatten it into a\n",
"            # vector of length batch_size * num_steps so it lines up with the output rows\n",
"            y = torch.transpose(Y, 0, 1).contiguous().view(-1)\n",
"            l = loss(output, y.long())\n",
"            \n",
"            optimizer.zero_grad()\n",
"            l.backward()\n",
"            # Gradient clipping\n",
"            d2l.grad_clipping(model.parameters(), clipping_theta, device)\n",
"            optimizer.step()\n",
"            l_sum += l.item() * y.shape[0]\n",
"            n += y.shape[0]\n",
"        \n",
"        try:\n",
"            perplexity = math.exp(l_sum / n)\n",
"        except OverflowError:\n",
"            perplexity = float('inf')\n",
"        if (epoch + 1) % pred_period == 0:\n",
"            print('epoch %d, perplexity %f, time %.2f sec' % (\n",
"                epoch + 1, perplexity, time.time() - start))\n",
"            for prefix in prefixes:\n",
"                print(' -', predict_rnn_pytorch(\n",
"                    prefix, pred_len, model, vocab_size, device, idx_to_char,\n",
"                    char_to_idx))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"epoch 50, perplexity 10.658418, time 0.05 sec\n",
" - 分开始我妈 想要你 我不多 让我心到的 我妈妈 我不能再想 我不多再想 我不要再想 我不多再想 我不要\n",
" - 不分开 我想要你不你 我 你不要 让我心到的 我妈人 可爱女人 坏坏的让我疯狂的可爱女人 坏坏的让我疯狂的\n",
"epoch 100, perplexity 1.308539, time 0.05 sec\n",
" - 分开不会痛 不要 你在黑色幽默 开始了美丽全脸的梦滴 闪烁成回忆 伤人的美丽 你的完美主义 太彻底 让我\n",
" - 不分开不是我不要再想你 我不能这样牵着你的手不放开 爱可不可以简简单单没有伤害 你 靠着我的肩膀 你 在我\n",
"epoch 150, perplexity 1.070370, time 0.05 sec\n",
" - 分开不能去河南嵩山 学少林跟武当 快使用双截棍 哼哼哈兮 快使用双截棍 哼哼哈兮 习武之人切记 仁者无敌\n",
" - 不分开 在我会想通 是谁开没有全有开始 他心今天 一切人看 我 一口令秋软语的姑娘缓缓走过外滩 消失的 旧\n",
"epoch 200, perplexity 1.034663, time 0.05 sec\n",
" - 分开不能去吗周杰伦 才离 没要你在一场悲剧 我的完美主义 太彻底 分手的话像语言暴力 我已无能为力再提起\n",
" - 不分开 让我面到你 爱情来的太快就像龙卷风 离不开暴风圈来不及逃 我不能再想 我不能再想 我不 我不 我不\n",
"epoch 250, perplexity 1.021437, time 0.05 sec\n",
" - 分开 我我外的家边 你知道这 我爱不看的太 我想一个又重来不以 迷已文一只剩下回忆 让我叫带你 你你的\n",
" - 不分开 我我想想和 是你听没不 我不能不想 不知不觉 你已经离开我 不知不觉 我跟了这节奏 后知后觉 \n"
]
}
],
"source": [
"num_epochs, batch_size, lr, clipping_theta = 250, 32, 1e-3, 1e-2 # note the learning rate setting here\n",
"pred_period, pred_len, prefixes = 50, 50, ['分开', '不分开']\n",
"train_and_predict_rnn_pytorch(model, num_hiddens, vocab_size, device,\n",
"                              corpus_indices, idx_to_char, char_to_idx,\n",
"                              num_epochs, num_steps, lr, clipping_theta,\n",
"                              batch_size, pred_period, pred_len, prefixes)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
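The shape contract of the `RNNModel` wrapper in the notebook above ((batch, seq_len) integer inputs in, (seq_len * batch, vocab_size) logits out) can be checked without the book's `d2lzh_pytorch` helpers. This sketch substitutes `torch.nn.functional.one_hot` for `d2l.to_onehot` (assumed equivalent up to the list-vs-tensor representation):

```python
import torch
from torch import nn
import torch.nn.functional as F

class RNNModel(nn.Module):
    # Minimal sketch of the notebook's RNNModel, without the d2l dependency
    def __init__(self, rnn_layer, vocab_size):
        super(RNNModel, self).__init__()
        self.rnn = rnn_layer
        self.vocab_size = vocab_size
        self.dense = nn.Linear(rnn_layer.hidden_size, vocab_size)

    def forward(self, inputs, state):  # inputs: (batch, seq_len)
        # Transpose to (seq_len, batch), then one-hot to (seq_len, batch, vocab_size)
        X = F.one_hot(inputs.long().T, num_classes=self.vocab_size).float()
        Y, state = self.rnn(X, state)  # Y: (seq_len, batch, num_hiddens)
        # Flatten time and batch, then project to vocabulary logits
        output = self.dense(Y.view(-1, Y.shape[-1]))
        return output, state

vocab_size, num_hiddens = 1027, 256
model = RNNModel(nn.RNN(input_size=vocab_size, hidden_size=num_hiddens), vocab_size)
inputs = torch.zeros(2, 35, dtype=torch.long)  # (batch=2, seq_len=35)
output, state = model(inputs, None)
print(output.shape)  # (seq_len * batch, vocab_size) = torch.Size([70, 1027])
```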
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because it is too large
File diff suppressed because it is too large
@ -0,0 +1,122 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 8.1 Hybrid Imperative and Symbolic Programming"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "10"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def add(a, b):\n",
    "    return a + b\n",
    "\n",
    "def fancy_func(a, b, c, d):\n",
    "    e = add(a, b)\n",
    "    f = add(c, d)\n",
    "    g = add(e, f)\n",
    "    return g\n",
    "\n",
    "fancy_func(1, 2, 3, 4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "def add(a, b):\n",
      "    return a + b\n",
      "\n",
      "def fancy_func(a, b, c, d):\n",
      "    e = add(a, b)\n",
      "    f = add(c, d)\n",
      "    g = add(e, f)\n",
      "    return g\n",
      "\n",
      "print(fancy_func(1, 2, 3, 4))\n",
      "\n",
      "10\n"
     ]
    }
   ],
   "source": [
    "def add_str():\n",
    "    return '''\n",
    "def add(a, b):\n",
    "    return a + b\n",
    "'''\n",
    "\n",
    "def fancy_func_str():\n",
    "    return '''\n",
    "def fancy_func(a, b, c, d):\n",
    "    e = add(a, b)\n",
    "    f = add(c, d)\n",
    "    g = add(e, f)\n",
    "    return g\n",
    "'''\n",
    "\n",
    "def evoke_str():\n",
    "    return add_str() + fancy_func_str() + '''\n",
    "print(fancy_func(1, 2, 3, 4))\n",
    "'''\n",
    "\n",
    "prog = evoke_str()\n",
    "print(prog)\n",
    "y = compile(prog, '', 'exec')\n",
    "exec(y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
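The 8.1 notebook above builds a program as a string, compiles it once, and then executes it, which is the essence of the symbolic style. A minimal standalone sketch of that flow (plain Python; the helper names `add_src`/`caller_src` are hypothetical, not from the notebook):

```python
# Symbolic-style flow: assemble source text, compile it, then run it.
def add_src():
    # source for the function to be generated
    return "def add(a, b):\n    return a + b\n"

def caller_src():
    # source that calls the generated function
    return "result = add(1, 2)\n"

prog = add_src() + caller_src()
code = compile(prog, "<generated>", "exec")  # one-time compilation step
ns = {}
exec(code, ns)                               # run the compiled program
print(ns["result"])
```

The payoff of this style, as the chapter argues, is that the whole program is visible to the compiler before any of it runs, which enables optimization.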
@ -0,0 +1,192 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 8.3 Automatic Parallel Computation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-10T16:16:41.669018Z",
     "start_time": "2019-05-10T16:16:36.457355Z"
    }
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "import time\n",
    "\n",
    "assert torch.cuda.device_count() >= 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-10T16:17:29.013953Z",
     "start_time": "2019-05-10T16:16:41.673871Z"
    }
   },
   "outputs": [],
   "source": [
    "x_gpu1 = torch.rand(size=(100, 100), device='cuda:0')\n",
    "x_gpu2 = torch.rand(size=(100, 100), device='cuda:2')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-10T16:17:29.021652Z",
     "start_time": "2019-05-10T16:17:29.017222Z"
    }
   },
   "outputs": [],
   "source": [
    "class Benchmark():  # this class is saved in the d2lzh_pytorch package for later use\n",
    "    def __init__(self, prefix=None):\n",
    "        self.prefix = prefix + ' ' if prefix else ''\n",
    "\n",
    "    def __enter__(self):\n",
    "        self.start = time.time()\n",
    "\n",
    "    def __exit__(self, *args):\n",
    "        print('%stime: %.4f sec' % (self.prefix, time.time() - self.start))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-10T16:17:29.069210Z",
     "start_time": "2019-05-10T16:17:29.023602Z"
    }
   },
   "outputs": [],
   "source": [
    "def run(x):\n",
    "    for _ in range(20000):\n",
    "        y = torch.mm(x, x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-10T16:17:29.767144Z",
     "start_time": "2019-05-10T16:17:29.071262Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Run on GPU1. time: 0.2989 sec\n",
      "Then run on GPU2. time: 0.3518 sec\n"
     ]
    }
   ],
   "source": [
    "with Benchmark('Run on GPU1.'):\n",
    "    run(x_gpu1)\n",
    "    torch.cuda.synchronize()\n",
    "\n",
    "with Benchmark('Then run on GPU2.'):\n",
    "    run(x_gpu2)\n",
    "    torch.cuda.synchronize()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-10T16:17:30.282318Z",
     "start_time": "2019-05-10T16:17:29.770313Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Run on both GPU1 and GPU2 in parallel. time: 0.5076 sec\n"
     ]
    }
   ],
   "source": [
    "with Benchmark('Run on both GPU1 and GPU2 in parallel.'):\n",
    "    run(x_gpu1)\n",
    "    run(x_gpu2)\n",
    "    torch.cuda.synchronize()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:py36]",
   "language": "python",
   "name": "conda-env-py36-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
@ -0,0 +1,247 @@
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-15T16:12:27.380643Z",
     "start_time": "2019-05-15T16:12:25.699672Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Thu May 16 00:12:26 2019       \n",
      "+-----------------------------------------------------------------------------+\n",
      "| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |\n",
      "|-------------------------------+----------------------+----------------------+\n",
      "| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n",
      "| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |\n",
      "|===============================+======================+======================|\n",
      "|   0  TITAN X (Pascal)    Off  | 00000000:02:00.0 Off |                  N/A |\n",
      "| 46%   75C    P2    87W / 250W |  10995MiB / 12196MiB |      0%      Default |\n",
      "+-------------------------------+----------------------+----------------------+\n",
      "|   1  TITAN X (Pascal)    Off  | 00000000:04:00.0 Off |                  N/A |\n",
      "| 54%   83C    P2    93W / 250W |  11671MiB / 12196MiB |     64%      Default |\n",
      "+-------------------------------+----------------------+----------------------+\n",
      "|   2  TITAN X (Pascal)    Off  | 00000000:83:00.0 Off |                  N/A |\n",
      "| 62%   83C    P2   193W / 250W |  12096MiB / 12196MiB |     92%      Default |\n",
      "+-------------------------------+----------------------+----------------------+\n",
      "|   3  TITAN X (Pascal)    Off  | 00000000:84:00.0 Off |                  N/A |\n",
      "| 51%   82C    P2   166W / 250W |   8144MiB / 12196MiB |     58%      Default |\n",
      "+-------------------------------+----------------------+----------------------+\n",
      "                                                                               \n",
      "+-----------------------------------------------------------------------------+\n",
      "| Processes:                                                       GPU Memory |\n",
      "|  GPU       PID   Type   Process name                             Usage      |\n",
      "|=============================================================================|\n",
      "|    0     44683      C   python                                      3289MiB |\n",
      "|    0    155760      C   python                                      4345MiB |\n",
      "|    0    158310      C   python                                      2297MiB |\n",
      "|    0    172338      C   /home/yzs/anaconda3/bin/python              1031MiB |\n",
      "|    1    139985      C   python                                     11653MiB |\n",
      "|    2     38630      C   python                                      5547MiB |\n",
      "|    2     43127      C   python                                      5791MiB |\n",
      "|    2    156710      C   python3                                      725MiB |\n",
      "|    3     14444      C   python3                                     1891MiB |\n",
      "|    3     43407      C   python                                      5841MiB |\n",
      "|    3     88478      C   /home/tangss/.conda/envs/py36/bin/python     379MiB |\n",
      "+-----------------------------------------------------------------------------+\n"
     ]
    }
   ],
   "source": [
    "!nvidia-smi"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-15T16:12:29.958567Z",
     "start_time": "2019-05-15T16:12:27.383299Z"
    }
   },
   "outputs": [],
   "source": [
    "import torch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-15T16:12:47.137875Z",
     "start_time": "2019-05-15T16:12:29.962468Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Linear(in_features=10, out_features=1, bias=True)"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "net = torch.nn.Linear(10, 1).cuda()\n",
    "net"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-15T16:12:47.143709Z",
     "start_time": "2019-05-15T16:12:47.139895Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DataParallel(\n",
       "  (module): Linear(in_features=10, out_features=1, bias=True)\n",
       ")"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "net = torch.nn.DataParallel(net)\n",
    "net"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-15T16:12:47.206714Z",
     "start_time": "2019-05-15T16:12:47.145069Z"
    }
   },
   "outputs": [],
   "source": [
    "torch.save(net.state_dict(), \"./8.4_model.pt\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-15T16:12:47.260076Z",
     "start_time": "2019-05-15T16:12:47.208314Z"
    }
   },
   "outputs": [],
   "source": [
    "new_net = torch.nn.Linear(10, 1)\n",
    "# new_net.load_state_dict(torch.load(\"./8.4_model.pt\"))  # loading fails"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-15T16:12:47.317397Z",
     "start_time": "2019-05-15T16:12:47.262131Z"
    }
   },
   "outputs": [],
   "source": [
    "torch.save(net.module.state_dict(), \"./8.4_model.pt\")\n",
    "new_net.load_state_dict(torch.load(\"./8.4_model.pt\"))  # loading succeeds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2019-05-15T16:12:47.370299Z",
     "start_time": "2019-05-15T16:12:47.319323Z"
    }
   },
   "outputs": [],
   "source": [
    "torch.save(net.state_dict(), \"./8.4_model.pt\")\n",
    "new_net = torch.nn.Linear(10, 1)\n",
    "new_net = torch.nn.DataParallel(new_net)\n",
    "new_net.load_state_dict(torch.load(\"./8.4_model.pt\"))  # loading succeeds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
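The commented-out load in the notebook above fails because `DataParallel` stores every parameter under a `module.` key prefix; saving `net.module.state_dict()` (or wrapping the new model in `DataParallel` too) sidesteps that. A framework-free sketch of the underlying key fix, using a plain dict with hypothetical keys in place of a real state dict:

```python
def strip_module_prefix(state_dict):
    # Drop the leading 'module.' that DataParallel prepends to each key,
    # leaving other keys untouched.
    return {k[len("module."):] if k.startswith("module.") else k: v
            for k, v in state_dict.items()}

# hypothetical checkpoint keys, standing in for a DataParallel state_dict
ckpt = {"module.weight": [1.0], "module.bias": [0.5]}
plain = strip_module_prefix(ckpt)
print(sorted(plain))  # keys now match the bare module's parameter names
```

This is the same rename that loading `net.module.state_dict()` achieves implicitly; either approach works, but saving the unwrapped module keeps checkpoints loadable on single-GPU machines.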
@ -0,0 +1,353 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 10.6 Finding Synonyms and Analogies\n",
    "## 10.6.1 Using Pretrained Word Vectors"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.0.0\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "dict_keys(['charngram.100d', 'fasttext.en.300d', 'fasttext.simple.300d', 'glove.42B.300d', 'glove.840B.300d', 'glove.twitter.27B.25d', 'glove.twitter.27B.50d', 'glove.twitter.27B.100d', 'glove.twitter.27B.200d', 'glove.6B.50d', 'glove.6B.100d', 'glove.6B.200d', 'glove.6B.300d'])"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import torch\n",
    "import torchtext.vocab as vocab\n",
    "\n",
    "print(torch.__version__)\n",
    "vocab.pretrained_aliases.keys()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['glove.42B.300d',\n",
       " 'glove.840B.300d',\n",
       " 'glove.twitter.27B.25d',\n",
       " 'glove.twitter.27B.50d',\n",
       " 'glove.twitter.27B.100d',\n",
       " 'glove.twitter.27B.200d',\n",
       " 'glove.6B.50d',\n",
       " 'glove.6B.100d',\n",
       " 'glove.6B.200d',\n",
       " 'glove.6B.300d']"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "[key for key in vocab.pretrained_aliases.keys()\n",
    " if \"glove\" in key]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "cache_dir = \"/Users/tangshusen/Datasets/glove\"\n",
    "# glove = vocab.pretrained_aliases[\"glove.6B.50d\"](cache=cache_dir)\n",
    "glove = vocab.GloVe(name='6B', dim=50, cache=cache_dir)  # equivalent to the line above"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Contains 400000 words in total.\n"
     ]
    }
   ],
   "source": [
    "print(\"Contains %d words in total.\" % len(glove.stoi))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3366, 'beautiful')"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "glove.stoi['beautiful'], glove.itos[3366]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 10.6.2 Applying Pretrained Word Vectors\n",
    "### 10.6.2.1 Finding Synonyms"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def knn(W, x, k):\n",
    "    # the added 1e-9 is for numerical stability\n",
    "    cos = torch.matmul(W, x.view((-1,))) / (\n",
    "        (torch.sum(W * W, dim=1) + 1e-9).sqrt() * torch.sum(x * x).sqrt())\n",
    "    _, topk = torch.topk(cos, k=k)\n",
    "    topk = topk.cpu().numpy()\n",
    "    return topk, [cos[i].item() for i in topk]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def get_similar_tokens(query_token, k, embed):\n",
    "    topk, cos = knn(embed.vectors,\n",
    "                    embed.vectors[embed.stoi[query_token]], k + 1)\n",
    "    for i, c in zip(topk[1:], cos[1:]):  # skip the input word\n",
    "        print('cosine sim=%.3f: %s' % (c, (embed.itos[i])))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "cosine sim=0.856: chips\n",
      "cosine sim=0.749: intel\n",
      "cosine sim=0.749: electronics\n"
     ]
    }
   ],
   "source": [
    "get_similar_tokens('chip', 3, glove)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "cosine sim=0.839: babies\n",
      "cosine sim=0.800: boy\n",
      "cosine sim=0.792: girl\n"
     ]
    }
   ],
   "source": [
    "get_similar_tokens('baby', 3, glove)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "cosine sim=0.921: lovely\n",
      "cosine sim=0.893: gorgeous\n",
      "cosine sim=0.830: wonderful\n"
     ]
    }
   ],
   "source": [
    "get_similar_tokens('beautiful', 3, glove)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 10.6.2.2 Finding Analogies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def get_analogy(token_a, token_b, token_c, embed):\n",
    "    vecs = [embed.vectors[embed.stoi[t]]\n",
    "            for t in [token_a, token_b, token_c]]\n",
    "    x = vecs[1] - vecs[0] + vecs[2]\n",
    "    topk, cos = knn(embed.vectors, x, 1)\n",
    "    return embed.itos[topk[0]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'daughter'"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_analogy('man', 'woman', 'son', glove)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'japan'"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_analogy('beijing', 'china', 'tokyo', glove)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'biggest'"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_analogy('bad', 'worst', 'big', glove)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'went'"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "get_analogy('do', 'did', 'go', glove)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
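The notebook's `knn` ranks rows of an embedding matrix by cosine similarity to a query vector using torch. The same ranking can be sketched framework-free, which makes the arithmetic easier to see (plain Python lists; the toy vectors below are hypothetical, not GloVe rows):

```python
import math

def cosine(u, v):
    # cosine similarity; the small epsilon on u mirrors the notebook's 1e-9
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u) + 1e-9)
                  * math.sqrt(sum(b * b for b in v)))

def knn(vectors, query, k):
    # indices of the k vectors most similar to the query, best first
    order = sorted(range(len(vectors)),
                   key=lambda i: cosine(vectors[i], query), reverse=True)
    return order[:k]

vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(knn(vecs, [1.0, 0.0], 2))  # the two rows closest to the query
```

Synonym search then just skips index 0 of the result (the query word itself), exactly as `get_similar_tokens` does with `topk[1:]`.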
@ -0,0 +1,2 @@
from .utils import *

@ -0,0 +1,170 @@
import torch
import torchvision
from torch.utils.data import DataLoader
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt

n_epochs = 3
batch_size_train = 64
batch_size_test = 1000
learning_rate = 0.01
momentum = 0.5
log_interval = 10
random_seed = 1

torch.manual_seed(random_seed)


train_loader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST('./data/', train=True, download=True,
                               transform=torchvision.transforms.Compose([
                                   torchvision.transforms.ToTensor(),
                                   torchvision.transforms.Normalize(
                                       (0.1307,), (0.3081,))
                               ])),
    batch_size=batch_size_train, shuffle=True)
test_loader = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST('./data/', train=False, download=True,
                               transform=torchvision.transforms.Compose([
                                   torchvision.transforms.ToTensor(),
                                   torchvision.transforms.Normalize(
                                       (0.1307,), (0.3081,))
                               ])),
    batch_size=batch_size_test, shuffle=True)

examples = enumerate(test_loader)
batch_idx, (example_data, example_targets) = next(examples)
print(example_targets)
print(example_data.shape)


fig = plt.figure()
for i in range(6):
    plt.subplot(2, 3, i + 1)
    plt.tight_layout()
    plt.imshow(example_data[i][0], cmap='gray', interpolation='none')
    plt.title("Ground Truth: {}".format(example_targets[i]))
    plt.xticks([])
    plt.yticks([])
plt.show()


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)  # dim made explicit to avoid the deprecation warning


def train(epoch):
    network.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        output = network(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % log_interval == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
            train_losses.append(loss.item())
            train_counter.append(
                (batch_idx * 64) + ((epoch - 1) * len(train_loader.dataset)))
            torch.save(network.state_dict(), './model.pth')
            torch.save(optimizer.state_dict(), './optimizer.pth')

# train(1)

def test():
    network.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            output = network(data)
            # reduction='sum' replaces the deprecated size_average=False
            test_loss += F.nll_loss(output, target, reduction='sum').item()
            pred = output.data.max(1, keepdim=True)[1]
            correct += pred.eq(target.data.view_as(pred)).sum()
    test_loss /= len(test_loader.dataset)
    test_losses.append(test_loss)
    print('\nTest set: Avg. loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))


network = Net()
optimizer = optim.SGD(network.parameters(), lr=learning_rate,
                      momentum=momentum)

train_losses = []
train_counter = []
test_losses = []
test_counter = [i * len(train_loader.dataset) for i in range(n_epochs + 1)]


for epoch in range(1, n_epochs + 1):
    train(epoch)
    test()


fig = plt.figure()
plt.plot(train_counter, train_losses, color='blue')
plt.scatter(test_counter, test_losses, color='red')
plt.legend(['Train Loss', 'Test Loss'], loc='upper right')
plt.xlabel('number of training examples seen')
plt.ylabel('negative log likelihood loss')
plt.show()


examples = enumerate(test_loader)
batch_idx, (example_data, example_targets) = next(examples)
with torch.no_grad():
    output = network(example_data)
fig = plt.figure()
for i in range(6):
    plt.subplot(2, 3, i + 1)
    plt.tight_layout()
    plt.imshow(example_data[i][0], cmap='gray', interpolation='none')
    plt.title("Prediction: {}".format(
        output.data.max(1, keepdim=True)[1][i].item()))
    plt.xticks([])
    plt.yticks([])
plt.show()


continued_network = Net()
# use the continued model's own parameters (was network.parameters())
continued_optimizer = optim.SGD(continued_network.parameters(), lr=learning_rate,
                                momentum=momentum)

network_state_dict = torch.load('model.pth')
continued_network.load_state_dict(network_state_dict)
optimizer_state_dict = torch.load('optimizer.pth')
continued_optimizer.load_state_dict(optimizer_state_dict)

for i in range(4, 9):
    test_counter.append(i * len(train_loader.dataset))
    train(i)
    test()

fig = plt.figure()
plt.plot(train_counter, train_losses, color='blue')
plt.scatter(test_counter, test_losses, color='red')
plt.legend(['Train Loss', 'Test Loss'], loc='upper right')
plt.xlabel('number of training examples seen')
plt.ylabel('negative log likelihood loss')
plt.show()
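In the MNIST script above, each `train_counter` entry records how many training examples had been processed when a loss value was logged, so the loss curve's x-axis is "examples seen" rather than batch index. A standalone sketch of that bookkeeping (pure Python; the batch size and dataset length below are the script's values, hard-coded for illustration):

```python
def examples_seen(batch_idx, epoch, batch_size=64, dataset_len=60000):
    # examples consumed so far: completed batches in this epoch
    # plus every example from all earlier epochs
    return batch_idx * batch_size + (epoch - 1) * dataset_len

print(examples_seen(10, 1))  # ten batches into the first epoch
print(examples_seen(0, 3))   # the very start of epoch 3
```

Using this scale lets the per-batch train losses and the per-epoch test losses share one x-axis in the plots.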
@ -0,0 +1,20 @@
elle est vieille .	she is old .
elle est tranquille .	she is quiet .
elle a tort .	she is wrong .
elle est canadienne .	she is canadian .
elle est japonaise .	she is japanese .
ils sont russes .	they are russian .
ils se disputent .	they are arguing .
ils regardent .	they are watching .
ils sont acteurs .	they are actors .
elles sont crevees .	they are exhausted .
il est mon genre !	he is my type !
il a des ennuis .	he is in trouble .
c est mon frere .	he is my brother .
c est mon oncle .	he is my uncle .
il a environ mon age .	he is about my age .
elles sont toutes deux bonnes .	they are both good .
elle est bonne nageuse .	she is a good swimmer .
c est une personne adorable .	he is a lovable person .
il fait du velo .	he is riding a bicycle .
ils sont de grands amis .	they are great friends .
@ -0,0 +1,523 @@
MSSubClass: Identifies the type of dwelling involved in the sale.

20 1-STORY 1946 & NEWER ALL STYLES
30 1-STORY 1945 & OLDER
40 1-STORY W/FINISHED ATTIC ALL AGES
45 1-1/2 STORY - UNFINISHED ALL AGES
50 1-1/2 STORY FINISHED ALL AGES
60 2-STORY 1946 & NEWER
70 2-STORY 1945 & OLDER
75 2-1/2 STORY ALL AGES
80 SPLIT OR MULTI-LEVEL
85 SPLIT FOYER
90 DUPLEX - ALL STYLES AND AGES
120 1-STORY PUD (Planned Unit Development) - 1946 & NEWER
150 1-1/2 STORY PUD - ALL AGES
160 2-STORY PUD - 1946 & NEWER
180 PUD - MULTILEVEL - INCL SPLIT LEV/FOYER
190 2 FAMILY CONVERSION - ALL STYLES AND AGES

MSZoning: Identifies the general zoning classification of the sale.

A Agriculture
C Commercial
FV Floating Village Residential
I Industrial
RH Residential High Density
RL Residential Low Density
RP Residential Low Density Park
RM Residential Medium Density

LotFrontage: Linear feet of street connected to property

LotArea: Lot size in square feet

Street: Type of road access to property

Grvl Gravel
Pave Paved

Alley: Type of alley access to property

Grvl Gravel
Pave Paved
NA No alley access

LotShape: General shape of property

Reg Regular
IR1 Slightly irregular
IR2 Moderately Irregular
IR3 Irregular

LandContour: Flatness of the property

Lvl Near Flat/Level
Bnk Banked - Quick and significant rise from street grade to building
HLS Hillside - Significant slope from side to side
Low Depression

Utilities: Type of utilities available

AllPub All public Utilities (E,G,W,& S)
NoSewr Electricity, Gas, and Water (Septic Tank)
NoSeWa Electricity and Gas Only
ELO Electricity only

LotConfig: Lot configuration

Inside Inside lot
Corner Corner lot
CulDSac Cul-de-sac
FR2 Frontage on 2 sides of property
FR3 Frontage on 3 sides of property

LandSlope: Slope of property

Gtl Gentle slope
Mod Moderate Slope
Sev Severe Slope

Neighborhood: Physical locations within Ames city limits

Blmngtn Bloomington Heights
Blueste Bluestem
BrDale Briardale
BrkSide Brookside
ClearCr Clear Creek
CollgCr College Creek
Crawfor Crawford
Edwards Edwards
Gilbert Gilbert
IDOTRR Iowa DOT and Rail Road
MeadowV Meadow Village
Mitchel Mitchell
Names North Ames
NoRidge Northridge
NPkVill Northpark Villa
NridgHt Northridge Heights
NWAmes Northwest Ames
OldTown Old Town
SWISU South & West of Iowa State University
Sawyer Sawyer
SawyerW Sawyer West
Somerst Somerset
StoneBr Stone Brook
Timber Timberland
Veenker Veenker

Condition1: Proximity to various conditions

Artery Adjacent to arterial street
Feedr Adjacent to feeder street
Norm Normal
RRNn Within 200' of North-South Railroad
RRAn Adjacent to North-South Railroad
PosN Near positive off-site feature--park, greenbelt, etc.
PosA Adjacent to positive off-site feature
RRNe Within 200' of East-West Railroad
RRAe Adjacent to East-West Railroad

Condition2: Proximity to various conditions (if more than one is present)

Artery Adjacent to arterial street
Feedr Adjacent to feeder street
Norm Normal
RRNn Within 200' of North-South Railroad
RRAn Adjacent to North-South Railroad
PosN Near positive off-site feature--park, greenbelt, etc.
PosA Adjacent to positive off-site feature
RRNe Within 200' of East-West Railroad
RRAe Adjacent to East-West Railroad

BldgType: Type of dwelling

1Fam Single-family Detached
2FmCon Two-family Conversion; originally built as one-family dwelling
Duplx Duplex
TwnhsE Townhouse End Unit
TwnhsI Townhouse Inside Unit

HouseStyle: Style of dwelling

1Story One story
1.5Fin One and one-half story: 2nd level finished
1.5Unf One and one-half story: 2nd level unfinished
2Story Two story
2.5Fin Two and one-half story: 2nd level finished
2.5Unf Two and one-half story: 2nd level unfinished
SFoyer Split Foyer
SLvl Split Level

OverallQual: Rates the overall material and finish of the house

10 Very Excellent
9 Excellent
8 Very Good
7 Good
6 Above Average
5 Average
4 Below Average
3 Fair
2 Poor
1 Very Poor

OverallCond: Rates the overall condition of the house

10 Very Excellent
9 Excellent
8 Very Good
7 Good
6 Above Average
5 Average
4 Below Average
3 Fair
2 Poor
1 Very Poor

YearBuilt: Original construction date

YearRemodAdd: Remodel date (same as construction date if no remodeling or additions)

RoofStyle: Type of roof

Flat Flat
Gable Gable
Gambrel Gambrel (Barn)
Hip Hip
Mansard Mansard
Shed Shed

RoofMatl: Roof material

ClyTile Clay or Tile
CompShg Standard (Composite) Shingle
Membran Membrane
Metal Metal
Roll Roll
Tar&Grv Gravel & Tar
WdShake Wood Shakes
WdShngl Wood Shingles

Exterior1st: Exterior covering on house

AsbShng Asbestos Shingles
AsphShn Asphalt Shingles
BrkComm Brick Common
BrkFace Brick Face
CBlock Cinder Block
CemntBd Cement Board
HdBoard Hard Board
ImStucc Imitation Stucco
MetalSd Metal Siding
Other Other
Plywood Plywood
PreCast PreCast
Stone Stone
Stucco Stucco
VinylSd Vinyl Siding
Wd Sdng Wood Siding
WdShing Wood Shingles

Exterior2nd: Exterior covering on house (if more than one material)

AsbShng Asbestos Shingles
AsphShn Asphalt Shingles
BrkComm Brick Common
BrkFace Brick Face
CBlock Cinder Block
CemntBd Cement Board
HdBoard Hard Board
ImStucc Imitation Stucco
MetalSd Metal Siding
Other Other
Plywood Plywood
PreCast PreCast
Stone Stone
Stucco Stucco
VinylSd Vinyl Siding
Wd Sdng Wood Siding
WdShing Wood Shingles

MasVnrType: Masonry veneer type

BrkCmn Brick Common
BrkFace Brick Face
CBlock Cinder Block
None None
Stone Stone

MasVnrArea: Masonry veneer area in square feet

ExterQual: Evaluates the quality of the material on the exterior

Ex Excellent
Gd Good
TA Average/Typical
Fa Fair
Po Poor

ExterCond: Evaluates the present condition of the material on the exterior

Ex Excellent
Gd Good
TA Average/Typical
Fa Fair
Po Poor

Foundation: Type of foundation

BrkTil Brick & Tile
CBlock Cinder Block
PConc Poured Concrete
Slab Slab
Stone Stone
Wood Wood

BsmtQual: Evaluates the height of the basement

Ex Excellent (100+ inches)
Gd Good (90-99 inches)
TA Typical (80-89 inches)
Fa Fair (70-79 inches)
Po Poor (<70 inches)
NA No Basement

BsmtCond: Evaluates the general condition of the basement

Ex Excellent
Gd Good
TA Typical - slight dampness allowed
Fa Fair - dampness or some cracking or settling
Po Poor - Severe cracking, settling, or wetness
NA No Basement

BsmtExposure: Refers to walkout or garden level walls

Gd Good Exposure
Av Average Exposure (split levels or foyers typically score average or above)
Mn Minimum Exposure
No No Exposure
NA No Basement

BsmtFinType1: Rating of basement finished area

GLQ Good Living Quarters
ALQ Average Living Quarters
BLQ Below Average Living Quarters
Rec Average Rec Room
LwQ Low Quality
Unf Unfinished
NA No Basement

BsmtFinSF1: Type 1 finished square feet

BsmtFinType2: Rating of basement finished area (if multiple types)

GLQ Good Living Quarters
ALQ Average Living Quarters
BLQ Below Average Living Quarters
Rec Average Rec Room
LwQ Low Quality
Unf Unfinished
NA No Basement

BsmtFinSF2: Type 2 finished square feet

BsmtUnfSF: Unfinished square feet of basement area

TotalBsmtSF: Total square feet of basement area

Heating: Type of heating

Floor Floor Furnace
GasA Gas forced warm air furnace
GasW Gas hot water or steam heat
Grav Gravity furnace
OthW Hot water or steam heat other than gas
Wall Wall furnace

HeatingQC: Heating quality and condition

Ex Excellent
Gd Good
TA Average/Typical
Fa Fair
Po Poor

CentralAir: Central air conditioning

N No
Y Yes

Electrical: Electrical system

SBrkr Standard Circuit Breakers & Romex
FuseA Fuse Box over 60 AMP and all Romex wiring (Average)
FuseF 60 AMP Fuse Box and mostly Romex wiring (Fair)
FuseP 60 AMP Fuse Box and mostly knob & tube wiring (poor)
Mix Mixed

1stFlrSF: First Floor square feet

2ndFlrSF: Second floor square feet

LowQualFinSF: Low quality finished square feet (all floors)

GrLivArea: Above grade (ground) living area square feet

BsmtFullBath: Basement full bathrooms

BsmtHalfBath: Basement half bathrooms

FullBath: Full bathrooms above grade

HalfBath: Half baths above grade

Bedroom: Bedrooms above grade (does NOT include basement bedrooms)

Kitchen: Kitchens above grade

KitchenQual: Kitchen quality

Ex Excellent
Gd Good
TA Typical/Average
Fa Fair
Po Poor

TotRmsAbvGrd: Total rooms above grade (does not include bathrooms)

Functional: Home functionality (Assume typical unless deductions are warranted)

Typ Typical Functionality
Min1 Minor Deductions 1
Min2 Minor Deductions 2
Mod Moderate Deductions
Maj1 Major Deductions 1
Maj2 Major Deductions 2
Sev Severely Damaged
Sal Salvage only

Fireplaces: Number of fireplaces

FireplaceQu: Fireplace quality

Ex Excellent - Exceptional Masonry Fireplace
Gd Good - Masonry Fireplace in main level
TA Average - Prefabricated Fireplace in main living area or Masonry Fireplace in basement
Fa Fair - Prefabricated Fireplace in basement
Po Poor - Ben Franklin Stove
NA No Fireplace

GarageType: Garage location

2Types More than one type of garage
Attchd Attached to home
Basment Basement Garage
BuiltIn Built-In (Garage part of house - typically has room above garage)
CarPort Car Port
Detchd Detached from home
NA No Garage

GarageYrBlt: Year garage was built

GarageFinish: Interior finish of the garage

Fin Finished
RFn Rough Finished
Unf Unfinished
NA No Garage

GarageCars: Size of garage in car capacity

GarageArea: Size of garage in square feet

GarageQual: Garage quality

Ex Excellent
Gd Good
TA Typical/Average
Fa Fair
Po Poor
NA No Garage

GarageCond: Garage condition

Ex Excellent
Gd Good
TA Typical/Average
Fa Fair
Po Poor
NA No Garage

PavedDrive: Paved driveway

Y Paved
P Partial Pavement
N Dirt/Gravel

WoodDeckSF: Wood deck area in square feet

OpenPorchSF: Open porch area in square feet

EnclosedPorch: Enclosed porch area in square feet

3SsnPorch: Three season porch area in square feet

ScreenPorch: Screen porch area in square feet

PoolArea: Pool area in square feet

PoolQC: Pool quality

Ex Excellent
Gd Good
TA Average/Typical
Fa Fair
NA No Pool

Fence: Fence quality

GdPrv Good Privacy
MnPrv Minimum Privacy
GdWo Good Wood
MnWw Minimum Wood/Wire
NA No Fence

MiscFeature: Miscellaneous feature not covered in other categories

Elev Elevator
Gar2 2nd Garage (if not described in garage section)
Othr Other
Shed Shed (over 100 SF)
TenC Tennis Court
NA None

MiscVal: $Value of miscellaneous feature

MoSold: Month Sold (MM)

YrSold: Year Sold (YYYY)

SaleType: Type of sale

WD Warranty Deed - Conventional
CWD Warranty Deed - Cash
VWD Warranty Deed - VA Loan
New Home just constructed and sold
COD Court Officer Deed/Estate
Con Contract 15% Down payment regular terms
ConLw Contract Low Down payment and low interest
ConLI Contract Low Interest
ConLD Contract Low Down
Oth Other

SaleCondition: Condition of sale

Normal Normal Sale
Abnorml Abnormal Sale - trade, foreclosure, short sale
AdjLand Adjoining Land Purchase
Alloca Allocation - two linked properties with separate deeds, typically condo with a garage unit
Family Sale between family members
Partial Home was not completed when last assessed (associated with New Homes)
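Many of the fields in the description above (ExterQual, ExterCond, HeatingQC, KitchenQual, and the garage/basement quality fields) share the same ordinal Ex/Gd/TA/Fa/Po scale. A minimal sketch of encoding that scale numerically for modeling; the `QUALITY_SCALE` name and the particular integer values are illustrative assumptions, not part of the data description:

```python
# Hypothetical ordinal encoding for the shared Ex/Gd/TA/Fa/Po quality scale.
# The specific integers are an assumption; only the ordering is given by the data.
QUALITY_SCALE = {"Ex": 5, "Gd": 4, "TA": 3, "Fa": 2, "Po": 1, "NA": 0}

def encode_quality(values):
    """Map quality codes to integers; an unknown code raises KeyError."""
    return [QUALITY_SCALE[v] for v in values]

print(encode_quality(["Gd", "TA", "NA"]))  # → [4, 3, 0]
```

Treating these codes as ordered integers, rather than one-hot categories, preserves the ranking that the scale encodes.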
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@ -0,0 +1,10 @@
Data description:

Penn Treebank Corpus
- should be free for research purposes
- the same processing of data as used in many LM papers, including "Empirical Evaluation and Combination of Advanced Language Modeling Techniques"
- ptb.train.txt: train set
- ptb.valid.txt: development set (should be used just for tuning hyper-parameters, but not for training)
- ptb.test.txt: test set for reporting perplexity

- ptb.char.*: the same data, just rewritten as sequences of characters, with spaces rewritten as '_' - useful for training character based models, as is shown in example 9
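A minimal sketch of loading the word-level PTB splits described above and building a vocabulary; the file path and the `min_freq` cutoff are assumptions for illustration (the standard PTB files are whitespace-tokenized):

```python
from collections import Counter

def load_ptb(path):
    """Read a whitespace-tokenized PTB split into a flat list of tokens."""
    with open(path, encoding="utf-8") as f:
        return f.read().split()

def build_vocab(tokens, min_freq=1):
    """Assign integer ids by descending frequency, dropping rare tokens."""
    counts = Counter(tokens)
    frequent = [tok for tok, c in counts.most_common() if c >= min_freq]
    return {tok: i for i, tok in enumerate(frequent)}

# Hypothetical usage: tokens = load_ptb("ptb.train.txt"); vocab = build_vocab(tokens)
```

The validation split should feed only hyper-parameter tuning, as the README notes; perplexity is reported on the test split alone.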
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
After Width: | Height: | Size: 565 KiB |
Some files were not shown because too many files have changed in this diff