Grep is Losing Its Grip

Posted by 把持不住 on 2016-12-2 15:35:18
Grep is the ubiquitous command-line tool for finding lines in files that match a pattern. Invented by computer science luminary Ken Thompson in November 1974, it was originally developed for the Unix operating system, but is available today, in some form or another, on almost all systems. Grep has been the de facto standard for programmers everywhere to find stuff in files. However, as time and technology have advanced, the sheer size and number of files have grown at a rapid rate. A good example is the source code for the Linux kernel, which at version 1.0 in 1994 consisted of 170,000 lines of code, and as of version 4.8 is over 22M lines of code.
   


As you can well imagine, grep on most systems is fairly dated when you take modern multicore processors into account. Grep uses a single thread to do its work, and performance clearly suffers on large file sets, even on powerful modern systems.
Now, you can break up a grep in a variety of ways to inject some potential concurrency into your search. I say “potential” because many factors are involved, including your search tree, hardware, IO bottlenecks, etc. Here is one way:
  % find . -type f -print0 | xargs -0 -P [number_of_processes] grep [pattern]
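As a rough, self-contained sketch of the pipeline above: the throwaway directory, the file names, the `TODO` pattern, and the process count of 4 are all made up for illustration, and `-P` assumes an xargs that supports it (GNU and BSD xargs do).

```shell
#!/bin/sh
# Build a small throwaway tree so the example is self-contained.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/docs"
printf 'int main(void) { return 0; }\n' > "$demo/src/main.c"
printf 'TODO: write the manual\n'       > "$demo/docs/manual.txt"
printf 'TODO: add tests\n'              > "$demo/src/notes with spaces.txt"

# -print0 / -0 : NUL-separated names, so "notes with spaces.txt" survives.
# -P 4         : run up to four grep processes at once (arbitrary choice).
# -n 1         : one file per grep, only so this tiny tree is actually split
#                across processes; real trees want larger batches.
# -H           : always print the file name next to the match.
# sort         : parallel greps finish in no fixed order, so sort the result.
matches=$(find "$demo" -type f -print0 | xargs -0 -P 4 -n 1 grep -H 'TODO' | sort)
printf '%s\n' "$matches"

rm -rf "$demo"
```

Whether this beats a plain `grep -r` depends on the factors mentioned above; on a small tree the cost of starting extra processes can easily outweigh the gain.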
This gets the job done, but surely we can be more productive than this in our everyday development. Why don’t we have a tool that is fast, capable of concurrency, and tailored to the needs of developers? In fact, we do! In the Perl world we have Andy Lester’s excellent ack! It runs anywhere Perl does (including ActivePerl, of course!). A great benefit for developers is that it ignores your core dumps, binaries, backup files, and code repository files. It also has the advantage of using Perl's regular expressions, which have always been top-notch among languages. Typical usage:
  % ack [pattern]
But I suggest you try the following:
  % ack --thppt
Ack, why didn’t you specify the directory to search from? Well, in order to be maximally productive, it defaults to a recursive search from the current working directory! This is perfect when searching your code trees. However, ack has a performance weakness: it still isn’t concurrent out of the box (though it could be run in parallel in a similar way to the grep example above).
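To make concrete what that filtering saves you, here is a rough approximation of it with plain grep. This is not how ack works internally; it assumes a grep that supports `--exclude` and `--exclude-dir` (GNU grep does), and the file names and exclusion list are invented for the example.

```shell
#!/bin/sh
# Throwaway tree: one real source file, one editor backup, one repo file.
demo=$(mktemp -d)
mkdir -p "$demo/.git" "$demo/lib"
printf 'sub search { return 1 }\n' > "$demo/lib/Search.pm"
printf 'sub search { return 0 }\n' > "$demo/lib/Search.pm~"      # editor backup
printf 'sub search (old object)\n' > "$demo/.git/packed-object"  # repository data

# -r            : recurse from the current directory, as ack does by default
# -n            : show line numbers
# -I            : skip binary files
# --exclude-dir : skip version-control directories
# --exclude     : skip backup files
hits=$(cd "$demo" && grep -rnI \
    --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn \
    --exclude='*~' --exclude='*.bak' \
    'sub search' .)
printf '%s\n' "$hits"

rm -rf "$demo"
```

Only the match in `lib/Search.pm` is reported; the backup copy and the file under `.git` are skipped. Typing all of that by hand on every search is exactly the chore that `ack [pattern]` removes.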
This brings us to Geoff Greer’s feature-rich, ack-inspired tool, The Silver Searcher (great name!). This fantastic tool claims to be an order of magnitude faster than ack, thanks to its implementation in C and its use of pthreads for concurrency, among many other improvements. Now, instead of “ack”, we can just type the clever “ag”, the periodic table symbol for silver:
  % ag [pattern]
It also ignores files you normally don't want to search in your codebase. It honors .gitignore and .hgignore files, and you can even specify your own .ignore file.
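A small sketch of what such a .ignore file might look like. The directory layout and the two patterns are hypothetical, the patterns are assumed to follow .gitignore-style syntax, and the search itself only runs if `ag` happens to be installed.

```shell
#!/bin/sh
# Throwaway tree with application code, vendored code, and a log file.
demo=$(mktemp -d)
mkdir -p "$demo/vendor" "$demo/src"
printf 'needle in app code\n'      > "$demo/src/app.js"
printf 'needle in vendored code\n' > "$demo/vendor/lib.js"
printf 'needle in a log\n'         > "$demo/debug.log"

# Hypothetical ignore rules: skip vendored code and log files.
cat > "$demo/.ignore" <<'EOF'
vendor/
*.log
EOF

if command -v ag >/dev/null 2>&1; then
    # An explicit path and stdin from /dev/null make ag search the
    # directory even when this script is itself run with piped input.
    (cd "$demo" && ag needle . </dev/null)
else
    echo "ag is not installed; sample .ignore written to $demo/.ignore"
fi
# Remove "$demo" when you are done experimenting.
```

With these rules in place, the intent is that only the match in `src/app.js` is reported.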
Lastly, we’ve been doing a lot of work with the Go language here at ActiveState. We are working towards our upcoming ActiveGo™ distribution, and I would be remiss if I didn’t mention another fantastic grep-like tool, built in Go, whose performance is comparable to ag's.
I discovered it when Dave Cheney gave a shoutout to Monochromegane’s The Platinum Searcher on episode 16 of the stellar Go Time podcast. It is written in Go and makes fantastic use of Go’s built-in concurrency support to obtain its solid performance. Like ack and ag, it also ignores code repo files you don’t want to search through. Imitation is clearly the sincerest form of flattery. Installation is a snap: just grab the binary you need for macOS, Windows, or Linux, put it in your path, and you’ve got another great grep-like tool. Basic usage is the same as The Silver Searcher, but with platinum’s chemical symbol instead:
  % pt [pattern]
If you’re still working with your system-installed grep, I highly recommend you move to a higher-performance, developer-oriented tool such as the Silver or Platinum Searcher!
Having opened the article by mentioning Ken Thompson and his invention of grep, we’ve come full circle: Ken Thompson is one of the co-creators of Go, a mere 35 years later! Happy searching!


