Re-Share: vtreat Data Preparation Documentation and Video

I would like to re-share the vtreat (R version, Python version) data preparation documentation for machine learning tasks.

vtreat is a system for preparing messy real-world data for predictive modeling tasks (classification, regression, and so on). In particular, it is very good at re-coding high-cardinality string-valued (or categorical) variables for later use.
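To give a feel for what "re-coding a high-cardinality categorical variable" can mean, here is a minimal sketch of one common idea: replacing each level with a smoothed estimate of how much it shifts the outcome. This is an illustration only and not vtreat's implementation; the function names, the `smoothing` parameter, and the toy data are all assumptions made up for this example. vtreat itself does considerably more, such as cross-validated coding to limit overfitting.

```python
from collections import defaultdict


def fit_impact_code(levels, outcomes, smoothing=1.0):
    """Learn one number per level: smoothed mean outcome minus the grand mean.

    Hypothetical helper for illustration; not part of vtreat.
    """
    grand_mean = sum(outcomes) / len(outcomes)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level, y in zip(levels, outcomes):
        sums[level] += y
        counts[level] += 1
    codes = {}
    for level in sums:
        # Shrink rare levels toward the grand mean so a single row
        # cannot produce an extreme code.
        smoothed = (sums[level] + smoothing * grand_mean) / (counts[level] + smoothing)
        codes[level] = smoothed - grand_mean
    return codes


def apply_impact_code(codes, levels):
    """Replace each level with its learned code; unseen levels map to 0.0."""
    return [codes.get(level, 0.0) for level in levels]


# Toy data: a string-valued variable and a numeric outcome.
zip_codes = ["94110", "94110", "10001", "10001", "60601"]
prices = [10.0, 12.0, 4.0, 6.0, 8.0]

codes = fit_impact_code(zip_codes, prices)
encoded = apply_impact_code(codes, ["94110", "60601", "99999"])
print(encoded)
```

The string column becomes a single numeric column that a downstream model can use directly, and levels never seen during fitting fall back to a neutral value instead of raising an error.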

A nice introductory video lecture on vtreat can be found here, and the latest copy of the lecture slides here. Or, you can check out chapter 8 “Advanced data preparation” of Zumel, Mount, Practical Data Science with R, 2nd Edition, Manning 2019, which covers the use of vtreat.

The vtreat documentation is organized by task (regression, classification, multinomial classification, and unsupervised), language (R or Python), and interface style (design/prepare or fit/prepare). In particular, the R code now supports both interface styles, allowing users to choose what works best with their coding style: either design/prepare, which is very fluid when combined with wrapr::unpack notation, or fit/prepare, which uses mutable state to organize steps.
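The difference between the two interface styles can be sketched with a toy example. This does not use the vtreat API: `design`, `prepare`, and the `Treatment` class are hypothetical names invented here, and the "treatment" is just filling missing values with the training mean. The point is only the contrast between passing a plan around explicitly and keeping it as state on an object.

```python
def design(values):
    """Functional style: inspect training data and return a plan.

    Hypothetical helper; the plan here is just the mean of observed values.
    """
    observed = [v for v in values if v is not None]
    return {"fill": sum(observed) / len(observed)}


def prepare(plan, values):
    """Apply a previously designed plan to any data."""
    return [plan["fill"] if v is None else v for v in values]


class Treatment:
    """Stateful style: fit() stores the plan on the object, prepare() reads it."""

    def __init__(self):
        self.plan = None

    def fit(self, values):
        self.plan = design(values)
        return self  # returning self allows call chaining

    def prepare(self, values):
        if self.plan is None:
            raise RuntimeError("call fit() before prepare()")
        return prepare(self.plan, values)


train = [1.0, None, 3.0]
new = [None, 5.0]

# Style 1: the plan is an explicit value handed from step to step.
functional_result = prepare(design(train), new)

# Style 2: the object remembers the plan between calls.
stateful_result = Treatment().fit(train).prepare(new)

print(functional_result, stateful_result)
```

Both styles produce the same prepared data; which one reads better depends mostly on whether the rest of the code is written as a pipeline of values or as a sequence of method calls on objects.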
