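Automatic Classical Chinese Poetry Generation with ModelArts
Poetry holds a uniquely exalted place in Chinese cultural tradition. It has broadened the human inner world and brought people a boundless sense of beauty. This article shows how to use a one-stop AI development platform to automatically generate your own acrostic poem.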

Environment Setup
ModelArts: https://www.huaweicloud.com/product/modelarts.html
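ModelArts is a one-stop AI development platform for developers. For machine learning and deep learning it provides large-scale data preprocessing and semi-automated labeling, large-scale distributed training, automated model generation, and on-demand model deployment across device, edge, and cloud. It helps users quickly build and deploy models and manage the full lifecycle of an AI workflow.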
OBS: https://www.huaweicloud.com/product/obs.html
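Object Storage Service (OBS) provides massive, secure, highly reliable, low-cost storage for data of any type and size. It suits scenarios such as enterprise backup and archiving, video on demand, and video surveillance.
Model and Data Preparation
The files have already been uploaded to a shared OBS bucket and can be read directly from code in the notebook.
Source repository: https://github.com/jinfagang/tensorflow_poems
Hands-on Steps
First, create a development environment in ModelArts: under "Development Environment", select Notebook.
Create a notebook, open JupyterLab, select the TensorFlow environment, and get started.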
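The code in this walkthrough relies on TensorFlow 1.x APIs such as tf.placeholder, tf.Session, and tf.app.flags, so it may be worth confirming which version the selected kernel provides before going further. The following is a minimal sketch; the helper name uses_tf1_api is hypothetical and not part of the original project.

```python
def uses_tf1_api(version):
    """Return True when a TensorFlow version string has major version 1."""
    return version.split('.')[0] == '1'


try:
    import tensorflow as tf
    # Report the installed version and whether it is a 1.x release.
    print(tf.__version__, uses_tf1_api(tf.__version__))
except ImportError:
    print('TensorFlow is not installed in this kernel')
```

If the check reports a 2.x release, the training and prediction code would most likely need a different kernel or the tf.compat.v1 compatibility layer.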
Using the interface provided by Huawei Cloud, copy the files from the public bucket poems to the local work path:
import moxing as mox
import os

obspath = 'obs://poems/poems/'  # source folder in the public OBS bucket
localpath = os.path.join(os.environ['HOME'], 'work/test/')  # local destination folder
mox.file.copy_parallel(obspath, localpath)  # bulk copy from obs://poems
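After the copy finishes, it can help to confirm that the expected files actually arrived. The sketch below uses only the standard library. The helper names list_files and missing_files are hypothetical, and the relative path data/poems.txt is an assumption inferred from the ./data/poems.txt path used later in this article; the real layout of the bucket may differ.

```python
import os


def list_files(root):
    """Return the sorted paths of all files under root, relative to root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)


def missing_files(root, expected):
    """Return the entries of expected that do not exist under root."""
    present = set(list_files(root))
    return [p for p in expected if os.path.normpath(p) not in present]


# Assumption: the files were copied to ~/work/test as in the step above.
local_root = os.path.join(os.path.expanduser('~'), 'work', 'test')
print(missing_files(local_root, ['data/poems.txt']))
```

An empty list means every expected file was found; any path that is printed is still missing locally.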
Install the pinned version of numpy:
!pip install --upgrade pip
!pip install numpy==1.16.0
Import the packages and define the training function:
import os
import tensorflow as tf
from poems.model import rnn_model
from poems.poems import process_poems, generate_batch

tf.app.flags.DEFINE_integer('batch_size', 64, 'batch size.')
tf.app.flags.DEFINE_float('learning_rate', 0.01, 'learning rate.')
tf.app.flags.DEFINE_string('model_dir', os.path.abspath('./model'), 'model save path.')
tf.app.flags.DEFINE_string('file_path', os.path.abspath('./data/poems.txt'), 'file name of poems.')
tf.app.flags.DEFINE_string('model_prefix', 'poems', 'model save prefix.')
tf.app.flags.DEFINE_integer('epochs', 50, 'train how many epochs.')
FLAGS = tf.app.flags.FLAGS
# Notebook workaround: absorbs the -f argument that Jupyter passes to the kernel.
tf.app.flags.DEFINE_string('f', '', 'kernel')


def run_training():
    if not os.path.exists(FLAGS.model_dir):
        os.makedirs(FLAGS.model_dir)

    # Load the corpus and build the input/target batches.
    poems_vector, word_to_int, vocabularies = process_poems(FLAGS.file_path)
    batches_inputs, batches_outputs = generate_batch(FLAGS.batch_size, poems_vector, word_to_int)

    input_data = tf.placeholder(tf.int32, [FLAGS.batch_size, None])
    output_targets = tf.placeholder(tf.int32, [FLAGS.batch_size, None])

    end_points = rnn_model(model='lstm', input_data=input_data, output_data=output_targets,
                           vocab_size=len(vocabularies), rnn_size=128, num_layers=2,
                           batch_size=64, learning_rate=FLAGS.learning_rate)

    saver = tf.train.Saver(tf.global_variables())
    init_op = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
    with tf.Session() as sess:
        sess.run(init_op)

        start_epoch = 0
        # Resume from the existing epoch-42 checkpoint.
        checkpoint = './model/poems-42'
        if checkpoint:
            saver.restore(sess, checkpoint)
            print("## restore from the checkpoint {0}".format(checkpoint))
            # The epoch number is the suffix after the last '-'.
            start_epoch += int(checkpoint.split('-')[-1])
        print('## start training...')
        try:
            n_chunk = len(poems_vector) // FLAGS.batch_size
            for epoch in range(start_epoch, FLAGS.epochs):
                n = 0
                for batch in range(n_chunk):
                    loss, _, _ = sess.run([
                        end_points['total_loss'],
                        end_points['last_state'],
                        end_points['train_op']
                    ], feed_dict={input_data: batches_inputs[n], output_targets: batches_outputs[n]})
                    n += 1
                    # Log the loss every 50 batches.
                    if batch % 50 == 0:
                        print('Epoch: %d, batch: %d, training loss: %.6f' % (epoch, batch, loss))
                # Save a checkpoint every 6 epochs.
                if epoch % 6 == 0:
                    saver.save(sess, os.path.join(FLAGS.model_dir, FLAGS.model_prefix), global_step=epoch)
        except KeyboardInterrupt:
            print('## Interrupt manually, try saving checkpoint for now...')
            saver.save(sess, os.path.join(FLAGS.model_dir, FLAGS.model_prefix), global_step=epoch)
            print('## Last epoch was saved, next time will start from epoch {}.'.format(epoch))
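Start training (you can also skip training and use checkpoint 42 directly for prediction; in that case, make sure the checkpoint path restored in the prediction code points to ./model/poems-42):
def main():
    run_training()


if __name__ == '__main__':
    main()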
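The resume logic in the training function is easy to misread, so here is a small standalone sketch of its arithmetic. It shows how the starting epoch is parsed from the checkpoint name and which epochs trigger a save under the "every 6 epochs" rule. The helper names epoch_from_checkpoint and saved_epochs are hypothetical; only the numbers 42, 50, and 6 come from the code above.

```python
def epoch_from_checkpoint(path):
    """Parse the epoch number from a checkpoint path such as './model/poems-42'."""
    return int(path.split('-')[-1])


def saved_epochs(start_epoch, total_epochs, save_every=6):
    """List the epochs at which the training loop would write a checkpoint."""
    return [e for e in range(start_epoch, total_epochs) if e % save_every == 0]


start = epoch_from_checkpoint('./model/poems-42')
print(start)                    # 42
print(saved_epochs(start, 50))  # [42, 48]
```

Under these settings, training covers only epochs 42 through 49 and writes checkpoints at epochs 42 and 48, which is consistent with the prediction code restoring ./model/poems-48.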
Import the prediction-related packages and load the checkpoint:
import numpy as np

start_token = 'B'
end_token = 'E'
model_dir = './model/'
corpus_file = './data/poems.txt'
lr = 0.0002


def to_word(predict, vocabs):
    # Normalize the prediction into a probability distribution and sample from it.
    predict = predict[0]
    predict /= np.sum(predict)
    sample = np.random.choice(np.arange(len(predict)), p=predict)
    # Use >= so that an index equal to len(vocabs) cannot raise an IndexError.
    if sample >= len(vocabs):
        return vocabs[-1]
    else:
        return vocabs[sample]


def gen_poem(begin_word):
    tf.reset_default_graph()
    batch_size = 1
    print('## loading corpus from %s' % corpus_file)
    poems_vector, word_int_map, vocabularies = process_poems(corpus_file)

    input_data = tf.placeholder(tf.int32, [batch_size, None])

    end_points = rnn_model(model='lstm', input_data=input_data, output_data=None,
                           vocab_size=len(vocabularies), rnn_size=128, num_layers=2,
                           batch_size=64, learning_rate=lr)

    saver = tf.train.Saver(tf.global_variables())
    init_op = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
    with tf.Session() as sess:
        sess.run(init_op)
        # Restore the checkpoint saved at epoch 48.
        saver.restore(sess, "./model/poems-48")

        # Feed the start token to obtain the initial state.
        x = np.array([list(map(word_int_map.get, start_token))])
        [predict, last_state] = sess.run([end_points['prediction'], end_points['last_state']],
                                         feed_dict={input_data: x})
        # Use the caller's first character if given, otherwise sample one.
        word = begin_word or to_word(predict, vocabularies)
        poem_ = ''

        i = 0
        while word != end_token:
            poem_ += word
            i += 1
            # Stop after 25 characters.
            if i > 24:
                break
            x = np.array([[word_int_map[word]]])
            [predict, last_state] = sess.run([end_points['prediction'], end_points['last_state']],
                                             feed_dict={input_data: x, end_points['initial_state']: last_state})
            word = to_word(predict, vocabularies)

        return poem_


def pretty_print_poem(poem_):
    # Print each sentence longer than 10 characters on its own line.
    poem_sentences = poem_.split('。')
    for s in poem_sentences:
        if s != '' and len(s) > 10:
            print(s + '。')
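The sampling step in to_word is the part of generation that decides each next character, so a standalone illustration may help. The sketch below reproduces the same idea without TensorFlow: normalize a score vector, draw an index at random according to those probabilities, and fall back to the last vocabulary entry when the index is out of range. The function name sample_word, the three-character vocabulary, and the probability values are made up for illustration, and the extra fourth slot assumes that the model's prediction vector can be longer than the vocabulary.

```python
import numpy as np


def sample_word(probs, vocabs, rng):
    """Sample one vocabulary entry according to probs (normalized first)."""
    probs = np.asarray(probs, dtype=np.float64)
    probs = probs / probs.sum()
    idx = rng.choice(len(probs), p=probs)
    # Indices beyond the vocabulary fall back to the last entry.
    if idx >= len(vocabs):
        return vocabs[-1]
    return vocabs[idx]


vocabs = ['春', '風', '月']          # toy vocabulary (assumption)
rng = np.random.RandomState(0)      # fixed seed for repeatable draws
probs = [0.1, 0.2, 0.3, 0.4]        # one more slot than the vocabulary
print([sample_word(probs, vocabs, rng) for _ in range(5)])
```

Because the next character is sampled rather than taken as the argmax, calling the generator twice with the same first character can produce different poems.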
Call the model to generate a poem:
poem = gen_poem('人')
pretty_print_poem(poem_=poem)
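To see what the printing step does without running the model, the following sketch applies the same split-and-filter rule to a fixed string. The function split_poem is a hypothetical variant of pretty_print_poem that returns the lines instead of printing them, and the sample text is a well-known poem used only as test input, not model output.

```python
def split_poem(poem, min_len=10):
    """Split on the full stop and keep sentences longer than min_len characters."""
    return [s + '。' for s in poem.split('。') if len(s) > min_len]


sample = '床前明月光，疑是地上霜。舉頭望明月，低頭思故鄉。春風。'
for line in split_poem(sample):
    print(line)
# 床前明月光，疑是地上霜。
# 舉頭望明月，低頭思故鄉。
```

A five-character couplet with its comma is 11 characters long and therefore passes the filter, while short fragments such as the trailing two-character sentence are dropped.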
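That wraps up this implementation for now. Generating acrostic poems from multiple head characters has not been explored yet; suggestions are welcome in the comments.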
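One possible starting point for the multi-character case is to generate one line per head character and stack the results. The sketch below is not the author's method. The function acrostic and the stub generator are hypothetical, and the stub returns placeholder text so that the example runs without a trained model.

```python
def acrostic(heads, gen_line, max_tries=5):
    """Build one line per head character using a line generator callback."""
    lines = []
    for ch in heads:
        line = ''
        for _ in range(max_tries):
            candidate = gen_line(ch)
            # Accept only lines that really start with the head character.
            if candidate.startswith(ch):
                line = candidate
                break
        # Fall back to the bare character if no valid line was produced.
        lines.append(line or ch)
    return lines


def stub_gen_line(ch):
    """Placeholder generator: stands in for a model-backed line generator."""
    return ch + '□□□□。'


print('\n'.join(acrostic('春夏秋冬', stub_gen_line)))
```

With the model from this article, gen_line could plausibly wrap gen_poem and keep only the first sentence of each result. Since gen_poem rebuilds the graph and restores the checkpoint on every call, that approach would be slow, and the lines would be generated independently of one another.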
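If you are interested, you are also welcome to join the MDG China University of Mining and Technology chapter (QQ group: 781169338) and help build the ModelArts ecosystem!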
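Copyright notice: This article was submitted by a user, and copyright belongs to the original author. This site does not own the copyright and assumes no related legal liability. If you find content on this site that appears to be plagiarized or inaccurate, please contact us at jiasou666@gmail.com. Once verified, the infringing content will be removed within 24 hours.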