A simple workflow for deep learning

As a follow-up to my Primer On Universal Function Approximation with Deep Learning, I’ve created a project on GitHub that provides a working example of building, training, and evaluating a neural network. It includes helper functions I wrote in Lua to simplify creating the data and to apply some functional programming techniques.
  The basic workflow for the example is this (a minimal Torch sketch of the architecture and training setup follows the list):

  • Create/acquire a training set;
  • Analyze the data for traits, distributions, noise, etc.;
  • Design a deep learning architecture, including the layers and activation functions (also make sure you understand the type of problem you are trying to solve);
  • Choose hyperparameters, such as the cost function, optimizer, and learning rate;
  • Train the model;
  • Evaluate in-sample and out-of-sample performance.
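  To make the list concrete, here is a minimal, hypothetical Torch sketch of the architecture and hyperparameter steps. It only illustrates the shape of the code; it is not the project’s ex_fun_approx.lua.

```lua
require 'nn'

-- Design the architecture: layers and activation functions
-- (here a tiny network for one-dimensional function approximation).
local model = nn.Sequential()
model:add(nn.Linear(1, 20))   -- 1 input feature, 20 hidden units
model:add(nn.ReLU())          -- activation
model:add(nn.Linear(20, 1))   -- 1 output

-- Choose hyperparameters: cost function, optimizer, learning rate.
local criterion = nn.MSECriterion()
local trainer = nn.StochasticGradient(model, criterion)
trainer.learningRate = 0.01
trainer.maxIteration = 25

-- Training and evaluation follow once a dataset is built
-- (the expected dataset format is described in the Torch/Lua section).
```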
  My personal preference is to limit the use of a deep learning framework to building and training models. To construct the datasets and analyze performance, it’s easier to use R (YMMV of course). What’s nice about this approach is that if you primarily work in Python or R, you can continue to use the tools you’re most familiar with. It also means it’s easy to swap out one deep learning framework for another without having to start over. These frameworks are also a bit of a bear to set up (I’m looking at you, TensorFlow), particularly if you want to leverage GPUs. It’s also convenient to use a Docker image for this purpose: it isolates the effects of a specialized configuration and makes it repeatable if you want to work on *gasp* a second computer.
  Which Deep Learning Framework?

  Having some experience with TensorFlow, Theano, and Torch, I find Torch to have the friendliest high-level semantics. Theano and TensorFlow are much more low-level, which is not as well suited to practitioners or applied researchers and makes it a little harder to get started. The trade-off with Torch is that you have to learn Lua, which is a simple scripting language but also has some awkward paradigms (I’ve never been a fan of the prototype object model).
  On the other hand, Theano and TensorFlow are built on Python, so most people will be familiar with the language. However, my time would be better spent if I didn’t have to write my own mini-batch algorithm. As an alternative, Keras provides a semantically rich, high-level interface that works with both Theano and TensorFlow. I will be adding a corresponding function approximation example to the deep_learning_ex project, which will make it easier to compare compute performance, as well as how close the optimizers are to each other.
     
  [Figure: Deep learning can be painfully slow]
  As for TensorFlow, unless you plan on working for Google, I wouldn’t recommend using it. The fact that it requires Google’s proprietary Bazel build system means it’s DOA for me. When I want to work with deep learning, I really don’t have the patience to wait for a 1.1 GB download of just the build system. I mean, I only have Time Warner Cable for crissakes. Others have reported that even pre-built models, like SyntaxNet, are slow, so unless you have the compute power and storage capacity of Google’s data centers along with the bandwidth of Google Fiber, you’re better off watching YouTube.
  Learning Deep Learning Frameworks

  Torch/Lua

   Learning Torch can be split into two tasks: learning Lua, and then understanding the Torch framework, specifically the nn package. Most people will find that learning Lua will take the majority of the time, as nn is nicely organized and easy to use.
  If you are already comfortable with programming languages, then this 15-minute tutorial is good. Alternatively, this other 15-minute tutorial is a bit terser but rather comprehensive. These will cover the basics. Beyond that, you need to understand how to work with data, which is less well covered. The simplecsv module can simplify I/O.
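  If you would rather not pull in a module, reading a simple CSV with plain Lua is straightforward. Here is a minimal sketch; it assumes unquoted fields and a hypothetical file name.

```lua
-- Minimal CSV reader using only the Lua standard library.
-- Assumes unquoted fields; numeric fields are converted to numbers.
local function read_csv(path)
  local rows = {}
  for line in io.lines(path) do
    local fields = {}
    for value in string.gmatch(line, "[^,]+") do
      table.insert(fields, tonumber(value) or value)
    end
    table.insert(rows, fields)
  end
  return rows
end

local rows = read_csv('train.csv')  -- 'train.csv' is a hypothetical file
```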
  The actual data format that the optimizer needs is a table object with an attached size method. Each element of this table is itself a table with two elements: an input and the corresponding output. So this can be considered a row-major matrix representation of the data. To use the provided StochasticGradient optimizer, the data must be constructed this way, as shown in ex_fun_approx.lua. It is up to you to reserve some data for testing.
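  For illustration, a minimal sketch of that structure might look like the following; it uses a toy dataset approximating f(x) = x^2, not the project’s actual data.

```lua
require 'nn'

-- Dataset in the shape nn.StochasticGradient expects: an indexable table
-- of {input, output} pairs plus an attached size() method.
local n = 100
local dataset = {}
for i = 1, n do
  local x = torch.rand(1)               -- one input feature in [0, 1)
  local y = torch.Tensor{x[1] * x[1]}   -- toy target: f(x) = x^2
  dataset[i] = {x, y}
end
function dataset:size() return n end

-- Train a small model like the one sketched earlier on this dataset.
local model = nn.Sequential()
model:add(nn.Linear(1, 20))
model:add(nn.ReLU())
model:add(nn.Linear(20, 1))
local trainer = nn.StochasticGradient(model, nn.MSECriterion())
trainer.learningRate = 0.01
trainer:train(dataset)
```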
  From a practical perspective, you don’t need to know much about Torch itself. It’s probably more efficient to familiarize yourself with the nn package first; I spend most of my time in its documentation. At some later point, it might be worthwhile to learn how Torch itself works, in which case the GitHub repo is flush with documentation and examples. I haven’t needed to look elsewhere.
  Keras/Theano

  If Theano is like Torch, then Keras is like the nn package. Unless you need to descend into the bits, it’s probably best to stay high-level. Unlike nn, there are alternatives to Keras for Theano, which I won’t cover. Like Torch, Keras comes pre-installed in the Docker image provided in the deep_learning_ex repository. The best way to get started is to read the Keras documentation, which includes a working example of a simple neural network.
  As with nn, the trick is understanding the framework’s interface, particularly what expectations it has for the data. Keras essentially expects a 4-tuple of (training inputs, test inputs, training outputs, test outputs). Its built-in datasets all return data organized like this (actually as two (input, output) pairs, one for training and one for testing).
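  For example, assuming the standard keras.datasets loaders, the built-in MNIST dataset comes back as two (input, output) pairs:

```python
from keras.datasets import mnist

# Built-in datasets return two (input, output) pairs:
# one pair for training and one for testing.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape)  # e.g. (60000, 28, 28) (60000,)
```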
  Conclusion

  Deep learning doesn’t need to be hard to learn. By following the prescribed workflow, using the provided Docker image, and streamlining your learning of deep learning frameworks to the essentials, you can get up to speed quickly.
  Have any resources you’d like to share? Add them in the comments!