MXNet

MXNet Basic Guide

  1. Enter the mxnet environment: source activate gluon # note: on Windows the "source" prefix is not needed
     Exit the environment: source deactivate
  2. For the GPU build, to pin computation to a specific card after entering the environment, run CUDA_VISIBLE_DEVICES=2 python; data can then only be allocated on that one GPU
  3. Try loading the whole dataset into memory; for irregular datasets that numpy cannot handle, Python's built-in lists work
  4. CUDA_VISIBLE_DEVICES=2 jupyter notebook, then import mxnet inside Jupyter; GPU training will likewise run on that single card only
  5. Upgrade with pip install mxnet --upgrade; add the --pre flag for nightly builds. pip search mxnet shows many other mxnet variants, e.g. mxnet-cu75, mxnet-cu80, mxnet-cu90
  6. To keep MXNet from grabbing too much GPU memory, set the reserved percentage: export MXNET_GPU_MEM_POOL_RESERVE=5 (a Python-side sketch follows this list)
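The same environment variables can also be set from Python, as long as this happens before mxnet is imported; a minimal sketch (the values 2 and 5 are just the examples above):

import os
# must be set before "import mxnet" so the engine picks the values up
os.environ['CUDA_VISIBLE_DEVICES'] = '2'        # expose only GPU 2
os.environ['MXNET_GPU_MEM_POOL_RESERVE'] = '5'  # reserve 5% of GPU memory
import mxnet as mx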

How MXNet handles training mode and test mode

Gluon: if the network call is wrapped in with autograd.record(), Gluon runs in training mode; otherwise it runs in test mode. See the forum threads for some confirmation.
MXNet's Module API runs a loaded model in training mode by default; for test mode you must pass it explicitly: mod.forward(Batch([x]), is_train=False). The newer loading function (v1.2.1 and above), mx.gluon.nn.SymbolBlock.imports, runs in test mode by default.
C++: test mode by default.
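A quick way to see the two Gluon modes is a layer that behaves differently in each, e.g. Dropout; a minimal sketch:

import mxnet as mx
from mxnet import autograd, gluon

drop = gluon.nn.Dropout(0.5)  # active in training mode, identity in test mode
x = mx.nd.ones((1, 6))

with autograd.record():       # training mode: roughly half the entries are zeroed
    print(drop(x))
print(drop(x))                # test mode: x passes through unchanged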

Model export and import

Before version 1.3.0, I exported Hybrid models with export, and used save_params for models such as LSTMs that cannot be hybridized.
But save_params has two drawbacks: the result cannot be loaded from C++, and every Python import requires redefining the network structure, which is tedious.

I only learned today that v1.2.1 introduced a handy loading function, mx.gluon.nn.SymbolBlock.imports. After exporting a model with export (since 1.3.0, rnn, lstm, etc. also export cleanly), you get a params file and a json file, storing the weights and the network structure respectively, and you can predict right away.

The example below produces identical results either way, but the new way is simpler and more convenient.
DNN input: (batch, 535); output: (batch, 43)
LSTM input: (duration, batch, 535); output: (duration, batch, 43)

import mxnet as mx
from mxnet import gluon
from collections import namedtuple

data = mx.random.uniform(shape=(3, 535))
##### New way #####
# import LSTM model
net = gluon.nn.SymbolBlock.imports('LSTM-symbol.json', ['data'], param_file='LSTM-0000.params', ctx=mx.cpu())
print(net(mx.nd.zeros(shape=(3, 1, 535))))  # the duration (here 3) can be any number
# import DNN model
net = gluon.nn.SymbolBlock.imports('DNN-symbol.json', ['data'], param_file='DNN-0000.params', ctx=mx.cpu())
print(net(data))  # the batch size (here 3) can be any number
##### Old way #####
# import DNN model
sym = mx.symbol.load('DNN-symbol.json')
mod = mx.mod.Module(symbol=sym)
mod.bind(data_shapes=[('data', (1, 535))])
mod.load_params('DNN-0000.params')
Batch = namedtuple('Batch', ['data'])
mod.forward(Batch([data]), is_train=False)
print(mod.get_outputs()[0])

Data operations

expand_dims and flatten

If data has shape (3,4) and you want (3,1,4), you can reshape, but .expand_dims(axis=1) is cleaner; to go back, .flatten() suffices, since flatten collapses an input of shape (d1,d2,d3,…) to (d1, d2*d3*…).
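A small demonstration of the round trip:

from mxnet import nd

data = nd.zeros((3, 4))
expanded = data.expand_dims(axis=1)  # (3, 4) -> (3, 1, 4)
print(expanded.shape)                # (3, 1, 4)
print(expanded.flatten().shape)      # (3, 4): all trailing dims are merged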

nd.concatenate (deprecated; use nd.concat instead)

print(img_list[0].shape)  # (1L, 3L, 64L, 64L): every image has this shape
print(len(img_list))      # 13233
nd.concatenate(img_list).shape  # (13233L, 3L, 64L, 64L)

train_data = mx.io.NDArrayIter(data=nd.concatenate(img_list),
                               batch_size=64)
train_data.reset()
for batch in train_data:
    print(batch)
    break
# prints DataBatch: data shapes: [(64L, 3L, 64L, 64L)] label shapes: []
# i.e. each batch holds 64 images

nd.concatenate([history, temp], axis=1), or equivalently nd.concat(history, temp, dim=1), corresponds to F.concat(history, temp, dim=1) inside hybrid_forward.
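A tiny check of the correspondence (shapes chosen arbitrarily):

from mxnet import nd

history = nd.ones((2, 3))
temp = nd.zeros((2, 2))
print(nd.concat(history, temp, dim=1).shape)  # (2, 5): joined along axis 1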

Computing L2Loss

import mxnet as mx
from mxnet import gluon
import numpy as np
loss1 = gluon.loss.L2Loss(batch_axis=1)
a = mx.nd.random.uniform(0, 10, shape=(3, 2, 4))
b = mx.nd.random.uniform(0, 10, shape=(3, 2, 4))
print(loss1(a, b))
print(np.mean(np.square((a[:, 0, :] - b[:, 0, :]).asnumpy())) / 2)
print(np.mean(np.square((a[:, 1, :] - b[:, 1, :]).asnumpy())) / 2)

Networks

RNN

layer = mx.gluon.rnn.RNN(100, 3)
# only the per-time-step output size (100) and the number of hidden layers (3) are fixed;
# the number of time steps is not known yet
layer.initialize()
input = mx.nd.random_uniform(shape=(6, 8, 10))
# TNC layout by default, which makes it easy to slice data across the batch
# i.e. 6 time steps, input size 10 per step, batch_size 8
# per sample: 6*10 -> 6*100
# by default zeros are used as begin state
output = layer(input)
print(output.shape)

Embedding

net.weight.data().asnumpy()
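For context, a minimal sketch of where that line fits (vocabulary and embedding sizes are arbitrary):

import mxnet as mx
from mxnet import gluon, nd

net = gluon.nn.Embedding(input_dim=10, output_dim=4)  # 10-word vocab, 4-dim vectors
net.initialize()
ids = nd.array([1, 3, 3])
print(net(ids).shape)                     # (3, 4): one row per id
print(net.weight.data().asnumpy().shape)  # (10, 4): the full embedding matrix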

mask-RNN

The important SequenceMask function

The second argument gives the number of real (unpadded) time steps for each sample in the mini-batch; here [2, 1] marks two steps of the first sample and one step of the second as real

x = mx.nd.array([[[  1.,   2.,   3.],
                  [  4.,   5.,   6.]],
                 [[  7.,   8.,   9.],
                  [ 10.,  11.,  12.]],
                 [[ 13.,  14.,  15.],
                  [ 16.,  17.,  18.]]])
# x.shape = (3L, 2L, 3L)
res = mx.nd.SequenceMask(x, mx.nd.array([2, 1]), use_sequence_length=True)
print(res)
# the first sample keeps two time steps, the second sample keeps one
# result:
# [[[  1.   2.   3.]
#   [  4.   5.   6.]]
#  [[  7.   8.   9.]
#   [  0.   0.   0.]]
#  [[  0.   0.   0.]
#   [  0.   0.   0.]]]
# so res[:, 0, :] is the masked result for the first sample:
# [[ 1.  2.  3.]
#  [ 7.  8.  9.]
#  [ 0.  0.  0.]]
# and res[:, 1, :] is the masked result for the second sample:
# [[ 4.  5.  6.]
#  [ 0.  0.  0.]
#  [ 0.  0.  0.]]

First, the loss with a mask

# -*- coding: utf-8 -*-
import mxnet as mx

a = mx.random.normal(0, 1, shape=(50, 128, 43))
b = mx.random.normal(0, 1, shape=(50, 128, 43))
mask = mx.nd.array([10] * 128)  # with [50]*128 the two losses below would agree

loss = mx.gluon.loss.L2Loss(batch_axis=1)

def L2LossMask(a, b, mask):
    # like gluon.loss.L2Loss(batch_axis=1), but computed only over the masked (real) time steps
    maskloss = []
    maska = mx.nd.SequenceMask(a, mask, use_sequence_length=True)
    maskb = mx.nd.SequenceMask(b, mask, use_sequence_length=True)
    for i in range(a.shape[1]):
        index = int(mask[i].asscalar())
        maskloss.append(mx.nd.sum((maska[:index, i, :] - maskb[:index, i, :]) ** 2) / (2 * index * a.shape[2]))
    return mx.nd.concat(*maskloss, dim=0)

print(L2LossMask(a, b, mask))  # right
print(loss(a, b))              # wrong

Network visualization

sym.list_outputs()
lists the names of a model's output ports

sym.list_arguments()
lists the names of a model's input ports as well as its weight and bias names

sym.tojson()
prints the network structure

mod.get_outputs()
lists the outputs of a forward pass
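A small sketch tying these together on a hand-built symbol (layer names are illustrative):

import mxnet as mx

data = mx.sym.Variable('data')
sym = mx.sym.FullyConnected(data, num_hidden=8, name='fc1')
sym = mx.sym.Activation(sym, act_type='relu', name='relu1')

print(sym.list_outputs())    # ['relu1_output']
print(sym.list_arguments())  # ['data', 'fc1_weight', 'fc1_bias']
print(sym.tojson())          # the structure as JSON text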

Displaying the network structure: viz.plot_network
Display it directly with mx.viz.plot_network(symbol=sym)

An exported json can also be visualized in Mathematica:
graph = Import["ExampleData/mxnet_example2.json", {"MXNet", "NodeGraphPlot"}]
graph = Import["ExampleData/mxnet_example2.json", {"MXNet", "NodeGraph"}]

Advanced MXNet usage

Using MXNet models from C++

Steps

  1. Train the MXNet model in Python
  2. Load the model in Python and run prediction
  3. Load the model in C++ (verify on a small example that the two interfaces give identical results)
  4. Use the model in the C++ project

Configuring the C++ platform

  1. Under C++ / General, add the working directory to "Additional Include Directories" so c_predict_api.h can be located. If the #include already succeeds, this is optional
  2. Under Linker / Input, add libmxnet.lib to "Additional Dependencies"
  3. Switch the active solution platform to x64
     1. Copy libmxnet.dll, libmxnet.lib and c_predict_api.h into the working directory
     2. Add #include <c_predict_api.h> to the cpp file

C++ usage notes

  • Works for single input / single output; prediction mode is the default
  • Works for multiple inputs / multiple outputs; prediction mode is the default
  • Works for multiple inputs / multiple outputs while reading back only one of the output ports
  • To support a mini-batch in prediction, just change the batch_size in input_shape_data and feed the mini-batch into the network flattened. When sizing the input and output vectors, make each of them batch_size times the length of a single sample

Training with an HDF5-backed iterator

import mxnet as mx
from mxnet import nd, gluon, autograd
from mxnet.gluon import nn
import h5py

net = nn.Sequential()
with net.name_scope():
    net.add(nn.Dense(32, in_units=2, activation="tanh"))
    net.add(nn.Dense(1))
net.initialize()

# load data from file; the file must stay open while NDArrayIter reads from it
with h5py.File('test_data_SE.h5', 'r') as h5file:
    X_h5 = h5file["Input"]
    y_h5 = h5file["Output"]
    num_examples = X_h5.shape[0]

    batch_size = 512
    epochs = 10
    dataiter = mx.io.NDArrayIter(X_h5, y_h5, batch_size=batch_size)
    square_loss = gluon.loss.L2Loss()
    trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 0.3})

    for epoch in range(epochs):
        total_loss = 0
        dataiter.reset()
        for iBatch, batch in enumerate(dataiter):
            with autograd.record():
                output = net(batch.data[0])
                loss = square_loss(output, batch.label[0])
            loss.backward()
            trainer.step(batch_size)
            total_loss += nd.sum(loss).asscalar()
        print("Epoch %d, average loss: %f" % (epoch, total_loss / num_examples))

print(net(nd.array([[-1, -0.9]]))[0].asnumpy())

A single-input, single-output Seq2Seq model

Python code (HybridBlock version)

import mxnet as mx
from mxnet.gluon import nn
print("mxnet version: " + mx.__version__)

mx.random.seed(1234)  # get the same result every time

def get_net():
    # construct an MLP
    net = nn.HybridSequential()
    with net.name_scope():
        net.add(nn.Dense(5, activation="relu"))
        net.add(nn.Dense(2))
    # initialize the parameters
    net.collect_params().initialize()
    return net

# forward
x = mx.nd.array([[0.1, 0.2, 0.3]])
net = get_net()
net.hybridize()
print('=== net(x) ==={}'.format(net(x)))

net.export('model')

############## Re-importing the net ##############
from collections import namedtuple
sym = mx.symbol.load('model-symbol.json')
mod = mx.mod.Module(symbol=sym)
mod.bind(data_shapes=[('data', (1, 3))])
mod.load_params('model-0000.params')
Batch = namedtuple('Batch', ['data'])
data = mx.nd.array([[0.1, 0.2, 0.3]])
mod.forward(Batch([data]), is_train=False)
print(mod.get_outputs())

C++ code: load the model and predict

#include <stdio.h>

// Path for c_predict_api
#include <mxnet/c_predict_api.h>

#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <assert.h>

// Read file to buffer
class BufferFile {
public:
    std::string file_path_;
    int length_;
    char* buffer_;

    explicit BufferFile(std::string file_path)
        : file_path_(file_path) {

        std::ifstream ifs(file_path.c_str(), std::ios::in | std::ios::binary);
        if (!ifs) {
            std::cerr << "Can't open the file. Please check " << file_path << ". \n";
            length_ = 0;
            buffer_ = NULL;
            return;
        }

        ifs.seekg(0, std::ios::end);
        length_ = ifs.tellg();
        ifs.seekg(0, std::ios::beg);
        std::cout << file_path.c_str() << " ... " << length_ << " bytes\n";

        buffer_ = new char[sizeof(char) * length_];
        ifs.read(buffer_, length_);
        ifs.close();
    }

    int GetLength() {
        return length_;
    }
    char* GetBuffer() {
        return buffer_;
    }

    ~BufferFile() {
        if (buffer_) {
            delete[] buffer_;
            buffer_ = NULL;
        }
    }
};

void PrintOutputResult(const std::vector<float>& data) {
    for (int i = 0; i < static_cast<int>(data.size()); i++) {
        printf("%.8f\n", data[i]);
    }
    printf("\n");
}

int main(int argc, char* argv[]) {

    // Model paths; adjust them to your setup
    std::string json_file = "./simple prediction model/model-symbol.json";
    std::string param_file = "./simple prediction model/model-0000.params";

    BufferFile json_data(json_file);
    BufferFile param_data(param_file);

    // Parameters
    int dev_type = 1;             // 1: cpu, 2: gpu
    int dev_id = 1;               // arbitrary
    mx_uint num_input_nodes = 1;  // 1 for feedforward
    const char* input_key[1] = { "data" };
    const char** input_keys = input_key;

    // input dims
    int data_len = 3;

    const mx_uint input_shape_indptr[2] = { 0, 2 };
    const mx_uint input_shape_data[2] = { 1, static_cast<mx_uint>(data_len) };
    PredictorHandle pred_hnd = 0;

    if (json_data.GetLength() == 0 || param_data.GetLength() == 0)
        return -1;

    // Create Predictor
    assert(0 == MXPredCreate((const char*)json_data.GetBuffer(),
                             (const char*)param_data.GetBuffer(),
                             static_cast<size_t>(param_data.GetLength()),
                             dev_type,
                             dev_id,
                             num_input_nodes,
                             input_keys,
                             input_shape_indptr,
                             input_shape_data,
                             &pred_hnd));
    assert(pred_hnd);

    std::vector<mx_float> vector_data = std::vector<mx_float>(data_len);
    mx_float* p = vector_data.data();
    p[0] = .1;
    p[1] = .2;
    p[2] = .3;

    MXPredSetInput(pred_hnd, "data", vector_data.data(), data_len);

    // Do Predict Forward
    MXPredForward(pred_hnd);

    mx_uint output_index = 0;

    mx_uint* shape = 0;
    // shape holds the dimensions of the output
    mx_uint shape_len;

    // Get Output Result
    MXPredGetOutputShape(pred_hnd, output_index, &shape, &shape_len);

    size_t size = 1;
    for (mx_uint i = 0; i < shape_len; ++i) size *= shape[i];

    std::vector<float> data(size);

    assert(0 == MXPredGetOutput(pred_hnd, output_index, &(data[0]), size));

    // Release Predictor
    MXPredFree(pred_hnd);

    // Print Output Data
    PrintOutputResult(data);
    return 0;
}

A simple multi-input, multi-output network

Python code (plain Block version)

from mxnet import nd
from mxnet.gluon import nn

class HybridNet(nn.Block):
    def __init__(self, **kwargs):
        super(HybridNet, self).__init__(**kwargs)
        with self.name_scope():
            self.dense0 = nn.Dense(3)
            self.dense1 = nn.Dense(3)
            self.dense2 = nn.Dense(6)

    def forward(self, x, y):
        result1 = nd.relu(self.dense0(x)) + nd.relu(self.dense1(y))
        result2 = nd.relu(self.dense2(result1))
        return [result1, result2]

net = HybridNet()
net.initialize()
x = nd.random.normal(shape=(4, 3))
y = nd.random.normal(shape=(4, 5))
res = net(x, y)
print("output1:", res[0])
print("output2:", res[1])

Python code (HybridBlock version)

import mxnet as mx
from mxnet import nd
from mxnet.gluon import nn

class HybridNet(nn.HybridBlock):
    def __init__(self, **kwargs):
        super(HybridNet, self).__init__(**kwargs)
        with self.name_scope():
            self.dense0 = nn.Dense(3)
            self.dense1 = nn.Dense(3)
            self.dense2 = nn.Dense(6)

    def hybrid_forward(self, F, x, y):
        result1 = F.relu(self.dense0(x)) + F.relu(self.dense1(y))
        result2 = F.relu(self.dense2(result1))
        return [result1, result2]

net = HybridNet()
net.initialize()
net.hybridize()
x = nd.random.normal(shape=(4, 3))
y = nd.random.normal(shape=(4, 5))
res = net(x, y)
print("output1:", res[0])
print("output2:", res[1])
net.export('model')

print("############## Re-importing the net ##############")
from collections import namedtuple
sym = mx.symbol.load('model-symbol.json')
mod = mx.mod.Module(symbol=sym, data_names=['data0', 'data1'])
mod.bind(data_shapes=[('data0', (1, 3)), ('data1', (1, 5))])
mod.load_params('model-0000.params')
Batch = namedtuple('Batch', ['data'])
mod.forward(Batch(data=[x, y]))
print(mod.get_outputs())

C++ code: load the model and predict

#include <mxnet/c_predict_api.h>
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <assert.h>

// Read file to buffer
class BufferFile {
public:
    std::string file_path_;
    int length_;
    char* buffer_;

    explicit BufferFile(std::string file_path)
        : file_path_(file_path) {

        std::ifstream ifs(file_path.c_str(), std::ios::in | std::ios::binary);
        if (!ifs) {
            std::cerr << "Can't open the file. Please check " << file_path << ". \n";
            length_ = 0;
            buffer_ = NULL;
            return;
        }

        ifs.seekg(0, std::ios::end);
        length_ = ifs.tellg();
        ifs.seekg(0, std::ios::beg);
        std::cout << file_path.c_str() << " ... " << length_ << " bytes\n";

        buffer_ = new char[sizeof(char) * length_];
        ifs.read(buffer_, length_);
        ifs.close();
    }

    int GetLength() {
        return length_;
    }
    char* GetBuffer() {
        return buffer_;
    }

    ~BufferFile() {
        if (buffer_) {
            delete[] buffer_;
            buffer_ = NULL;
        }
    }
};

void PrintOutputResult(const std::vector<float>& data) {
    for (int i = 0; i < static_cast<int>(data.size()); i++) {
        printf("%.8f\n", data[i]);
    }
    printf("\n");
}

int main(int argc, char* argv[]) {

    // Model paths; adjust them to your setup
    std::string json_file = "./model-symbol.json";
    std::string param_file = "./model-0000.params";

    BufferFile json_data(json_file);
    BufferFile param_data(param_file);

    // Parameters
    int dev_type = 1;  // 1: cpu, 2: gpu
    int dev_id = 1;    // arbitrary
    mx_uint num_input_nodes = 2;
    mx_uint num_output_nodes = 2;
    const char* input_key[2] = { "data0", "data1" };
    const char** input_keys = input_key;
    // the output node names may need to be adjusted for your model
    const char* output_key[2] = { "hybridnet0__plus0", "hybridnet0_relu2" };
    const char** output_keys = output_key;

    // input dims
    int data0_len = 3;
    int data1_len = 5;
    const mx_uint input_shape_indptr[3] = { 0, 2, 4 };
    const mx_uint input_shape_data[4] = { 1, static_cast<mx_uint>(data0_len), 1, static_cast<mx_uint>(data1_len) };
    PredictorHandle pred_hnd = 0;

    if (json_data.GetLength() == 0 || param_data.GetLength() == 0)
        return -1;

    // Create Predictor with selected output nodes
    assert(0 == MXPredCreatePartialOut(
        (const char*)json_data.GetBuffer(),
        (const char*)param_data.GetBuffer(),
        static_cast<size_t>(param_data.GetLength()),
        dev_type,
        dev_id,
        num_input_nodes,
        input_keys,
        input_shape_indptr,
        input_shape_data,
        num_output_nodes,
        output_keys,
        &pred_hnd));
    assert(pred_hnd);  // ERROR HERE (check the output node names above if this fires)

    std::vector<mx_float> vector_data0 = std::vector<mx_float>(data0_len);
    mx_float* p0 = vector_data0.data();
    p0[0] = 1; p0[1] = 2; p0[2] = 5;
    MXPredSetInput(pred_hnd, "data0", vector_data0.data(), data0_len);

    std::vector<mx_float> vector_data1 = std::vector<mx_float>(data1_len);
    mx_float* p1 = vector_data1.data();
    p1[0] = 5; p1[1] = 3; p1[2] = 1; p1[3] = 4; p1[4] = 5;
    MXPredSetInput(pred_hnd, "data1", vector_data1.data(), data1_len);

    // Do Predict Forward
    MXPredForward(pred_hnd);

    mx_uint output0_index = 0;
    mx_uint* shape0 = 0;
    // shape0 holds the dimensions of output0
    mx_uint shape0_len;
    // Get Output Result
    MXPredGetOutputShape(pred_hnd, output0_index, &shape0, &shape0_len);
    size_t size0 = 1;
    for (mx_uint i = 0; i < shape0_len; ++i) size0 *= shape0[i];

    mx_uint output1_index = 1;
    mx_uint* shape1 = 0;
    // shape1 holds the dimensions of output1
    mx_uint shape1_len;
    // Get Output Result
    MXPredGetOutputShape(pred_hnd, output1_index, &shape1, &shape1_len);
    size_t size1 = 1;
    for (mx_uint i = 0; i < shape1_len; ++i) size1 *= shape1[i];

    std::vector<float> data0(size0);
    assert(0 == MXPredGetOutput(pred_hnd, output0_index, &(data0[0]), size0));

    std::vector<float> data1(size1);
    assert(0 == MXPredGetOutput(pred_hnd, output1_index, &(data1[0]), size1));

    // Print Output Data
    printf("output0:\n");
    PrintOutputResult(data0);
    printf("output1:\n");
    PrintOutputResult(data1);

    // Release Predictor
    MXPredFree(pred_hnd);

    return 0;
}

Training templates

Single input, single output

import mxnet as mx
from mxnet.gluon import nn
from mxnet import nd, gluon, autograd, gpu
import h5py
import os

ctx = gpu()
net = nn.HybridSequential()
with net.name_scope():
    net.add(nn.Dense(128, activation="relu"))
    net.add(nn.Dropout(0.1))
    net.add(nn.Dense(128, activation="relu"))
    net.add(nn.Dropout(0.1))
    net.add(nn.Dense(128, activation="relu"))
    net.add(nn.Dropout(0.1))
    net.add(nn.Dense(32))
net.initialize(ctx=ctx)
net.hybridize()

val_file = h5py.File('../data/TargetModel/validation_normalization_Target.h5', 'r')
X_val_h5 = nd.array(val_file["Input"][:]).as_in_context(ctx)
y_val_h5 = nd.array(val_file["Output"][:]).as_in_context(ctx)
val_file.close()

# load data from file; keep the file open while NDArrayIter reads from it
with h5py.File('../data/TargetModel/training_normalization_Target.h5', 'r') as h5file:
    X_h5 = h5file["Input"]
    y_h5 = h5file["Output"]

    num_examples = X_h5.shape[0]
    min_val_loss = float("inf")

    epochs = 100
    batch_size = 128
    dataiter = mx.io.NDArrayIter(X_h5, y_h5, batch_size=batch_size)
    square_loss = gluon.loss.L2Loss()
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})

    for epoch in range(epochs):
        total_loss = 0
        dataiter.reset()
        for iBatch, batch in enumerate(dataiter):
            with autograd.record():
                output = net(batch.data[0].as_in_context(ctx))
                loss = square_loss(output, batch.label[0].as_in_context(ctx))
            loss.backward()
            trainer.step(batch_size)
            total_loss += nd.sum(loss).asscalar()
            if iBatch % 100 == 0:
                print("Epoch %d, Batch: %d/%d, average loss: %f" % (epoch, iBatch, num_examples / batch_size, nd.mean(loss).asscalar()))
        print("Epoch %d finished, average loss of training set: %f" % (epoch, total_loss / num_examples))
        val_loss = nd.mean(square_loss(net(X_val_h5), y_val_h5)).asscalar()
        print("\n-----loss of validation set: %f-----\n" % val_loss)
        if val_loss < min_val_loss:
            min_val_loss = val_loss
            net.export('TargetModel')
            print("---validation set got a smaller loss---\n---------------Save net----------------\n")

Loading test

import mxnet as mx
from mxnet.gluon import nn
from collections import namedtuple
sym = mx.symbol.load('TargetModel-symbol.json')
mod = mx.mod.Module(symbol=sym)
mod.bind(data_shapes=[('data', (1, 1 + 523))])
mod.load_params('TargetModel-0000.params')
Batch = namedtuple('Batch', ['data'])
data = mx.nd.array([list(range(1 + 523))])
mod.forward(Batch([data]), is_train=False)
print(mod.get_outputs())

Multiple inputs, multiple outputs

import mxnet as mx
from mxnet.gluon import nn
from mxnet import nd, gluon, autograd, gpu
import h5py
import os

ctx = gpu()

class JoinModel(nn.HybridBlock):
    def __init__(self, **kwargs):
        super(JoinModel, self).__init__(**kwargs)
        self.encodeNet = nn.HybridSequential()
        self.decodeNet = nn.HybridSequential()
        self.fc = nn.HybridSequential()
        with self.name_scope():
            self.encodeNet.add(nn.Dense(128, activation="relu"))
            self.encodeNet.add(nn.Dense(128))
            self.decodeNet.add(nn.Dense(128, activation="relu"))
            self.decodeNet.add(nn.Dense(32))
            self.fc.add(nn.Dense(256, activation="relu"))
            self.fc.add(nn.Dropout(0.1))
            self.fc.add(nn.Dense(256, activation="relu"))
            self.fc.add(nn.Dropout(0.1))
            self.fc.add(nn.Dense(256, activation="relu"))
            self.fc.add(nn.Dropout(0.1))
            self.fc.add(nn.Dense(32))

    def hybrid_forward(self, F, text, history):
        temp = self.encodeNet(text)
        result1 = self.decodeNet(temp)
        result2 = self.fc(F.concat(history, temp, dim=1))
        return [result1, result2]

net = JoinModel()
net.initialize(ctx=ctx)
net.hybridize()

val_file = h5py.File('../data/JoinModel/validation_normalization_Join.h5', 'r')
text_val_h5 = nd.array(val_file["Input1"][:]).as_in_context(ctx)
history_val_h5 = nd.array(val_file["Input2"][:]).as_in_context(ctx)
UnitVec_val_h5 = nd.array(val_file["Output1"][:]).as_in_context(ctx)
val_file.close()

# load data from file; keep the file open while NDArrayIter reads from it
with h5py.File('../data/JoinModel/training_normalization_Join.h5', 'r') as h5file:
    text_h5 = h5file["Input1"]
    history_h5 = h5file["Input2"]
    UnitVec_h5 = h5file["Output1"]

    num_examples = text_h5.shape[0]
    min_val_loss = float("inf")

    epochs = 100
    batch_size = 128
    dataiter = mx.io.NDArrayIter([text_h5, history_h5], UnitVec_h5, batch_size=batch_size)
    square_loss = gluon.loss.L2Loss()
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})

    for epoch in range(epochs):
        total_loss = 0
        dataiter.reset()
        for iBatch, batch in enumerate(dataiter):
            with autograd.record():
                output = net(batch.data[0].as_in_context(ctx), batch.data[1].as_in_context(ctx))
                loss1 = square_loss(output[0], batch.label[0].as_in_context(ctx))
                loss2 = square_loss(output[1], batch.label[0].as_in_context(ctx))
                loss = loss1 + loss2
            loss.backward()
            trainer.step(batch_size)
            total_loss += nd.sum(loss).asscalar()
            if iBatch % 100 == 0:
                print("Epoch %d, Batch: %d/%d, average loss: %f" % (epoch, iBatch, num_examples / batch_size, nd.mean(loss).asscalar()))
        print("Epoch %d finished, average loss of training set: %f" % (epoch, total_loss / num_examples))
        res = net(text_val_h5, history_val_h5)
        val_loss = nd.mean(square_loss(res[0], UnitVec_val_h5) + square_loss(res[1], UnitVec_val_h5)).asscalar()
        print("\n-----loss of validation set: %f-----\n" % val_loss)
        if val_loss < min_val_loss:
            min_val_loss = val_loss
            net.export('JoinModel')
            print("---validation set got a smaller loss---\n---------------Save net----------------\n")

Loading test

import mxnet as mx
from mxnet import nd
import numpy as np
############## Re-importing the net ##############
print("############## Re-importing the net ##############")
from collections import namedtuple
sym = mx.symbol.load('JoinModel-symbol.json')
mod = mx.mod.Module(symbol=sym, data_names=['data0', 'data1'])
mod.bind(data_shapes=[('data0', (1, 524)), ('data1', (1, 128))])
mod.load_params('JoinModel-0000.params')
Batch = namedtuple('Batch', ['data'])
x = nd.random.normal(shape=(1, 524))
y = nd.random.normal(shape=(1, 128))
mod.forward(Batch(data=[x, y]), is_train=False)
print(mod.get_outputs())
print(sym.list_outputs())

Reading the MXNet source

io.py

Located at E:\Anaconda\envs\gluon\Lib\site-packages\mxnet
Read it to see how to define a custom iterator.
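A minimal custom iterator, sketched from the DataIter interface in io.py (the class name, shapes, and data here are made up for illustration):

import mxnet as mx
import numpy as np

class RandomIter(mx.io.DataIter):
    """Yields num_batches batches of random data, following the DataIter interface."""
    def __init__(self, num_batches, batch_size, ndim):
        super(RandomIter, self).__init__()
        self.num_batches = num_batches
        self.batch_size = batch_size
        self.ndim = ndim
        self.cur_batch = 0
        # shapes advertised to consumers such as Module.bind
        self.provide_data = [('data', (batch_size, ndim))]
        self.provide_label = [('label', (batch_size,))]

    def reset(self):
        self.cur_batch = 0

    def next(self):
        if self.cur_batch >= self.num_batches:
            raise StopIteration
        self.cur_batch += 1
        data = [mx.nd.array(np.random.uniform(size=(self.batch_size, self.ndim)))]
        label = [mx.nd.zeros((self.batch_size,))]
        return mx.io.DataBatch(data=data, label=label)

it = RandomIter(num_batches=3, batch_size=4, ndim=2)
for batch in it:
    print(batch.data[0].shape)  # (4, 2), three times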