Go Replace - simple and fast search and replace tool for command line (solovyov.net)
70 points by piranha on May 22, 2013 | 63 comments



    find ~/my/docs/ -type f -name '*.txt' -exec sed -i.bak 's/inheritance/composition/g' {} +
This finds every file ending in `.txt` and runs `sed 's/inheritance/composition/g'` on them. Sed modifies the files in place and saves a backup of each with a .bak extension (-i.bak), and it is invoked the minimum number of times necessary (-exec +).

My best advice to someone using the command line is to learn `find` + `xargs` or, even better, `find` + `parallel`.


Compare with "gr what-is-it -r here-you-go". I do not have to remember all that argument hell.


Probably slower than the Go version, but more flexible since you have more control over which files are processed.

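    # bash
    # find/replace: reads file names on stdin, one per line
    function fr {
        local pattern=$1
        local replacement=$2
        local -a program

        if [[ -z "$replacement" ]]
        then
            # search only; -H prints the file name next to each match
            program=(grep -H -- "$pattern")
        else
            # replace in place; assumes GNU sed and no '/' in either argument
            program=(sed -i "s/$pattern/$replacement/g")
        fi

        # IFS= and -r keep whitespace and backslashes in file names intact
        while IFS= read -r line
        do
            "${program[@]}" "$line"
        done
    }

    # usage (-type f skips directories, which grep and sed cannot process)
    find . -type f | fr find-this
    find . -type f | fr replace-this with-that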


Well, one can write a one-line shell alias (or script) instead of reinventing the wheel by rewriting the whole thing anew in Go. (Which is certainly good as a learning experience for Go, but the end result is easier to achieve with the shell.)


When you start adding coloring, nice output, .hg/.gitignore support, it becomes just a big mess. Been there, done that.


coloring is already supported by grep, .hg/.gitignore support is just a few lines


Just do it then.


A C wrapper around `sed -e "s/what-is-it/here-you-go/"` would be a little more efficient, and a lot simpler. Reinventing the wheel is bad for X+Y+Z reasons, where X, Y and Z are the bugs, options, and features in the last wheel.


Okay, so what I learnt from this:

1. You've imported droundy/goopt and wsxiaoys/terminal really easily. The first provides option parsing, and the second provides helpers for emitting ANSI escape codes.

2. Practically everything is overloaded. You can use == where C needs strcmp(), and + to concatenate strings.

3. path/filepath gives you some nice goodies for path manipulation. But you have to deal with the fallout of its naive assumptions, such as whether or not to follow symlinks (why isn't this a flag in filepath.Walk?). This has resulted in ugliness in your walkFunc callback.

4. You have been able to abstract quite a bit without using Java-like factories, although your inconsistent error passing has me somewhat confused.

5. Since it's so strongly typed and there are no raw pointers, it's not going to be hard to debug at all.

Overall, I'd say Go is an effective language. But not beautiful, or novel in the slightest. It resembles Algol-68 (yes, a 1968 language) quite strongly, and I don't think it will drive compiler technology. Their syntax is specifically engineered to be parseable quickly (and hence super-fast AOT compilation): what the final program looks like is secondary concern. I'm not sure I see the appeal.

[edit: clarified that the note on compiler technology was an opinion]


> your inconsistent error passing has me somewhat confused.

That was my first somewhat big program in Go, and coming from Python made error handling a bit of a hassle. I still believe exceptions to be much better than this error-returning. :\ Never learned to do it nicely. :\

> Overall, I'd say Go is an effective language. But not beautiful, or novel in the slightest.

Yes, these are more or less my feelings nowadays. It's good for certain cases and if I ever need something fast, I'll use Go and not C.

> I'm not sure I see the appeal.

There is some. It's still much simpler than C (GC + stricter compiler), plus it has goroutines (not much in this app, though), plus interfaces are really wonderful (you don't need to declare that you implement an interface, just implement it and you're done).


On C. C has been around for _much_ longer than Go, and a lot of people understand C inside out. The C ecosystem is fantastic: there are extremely good implementations of virtually everything that you can use to build a production-grade application.

On Go. It'll take time for a programmer to fully understand how to use a new language, and it'll be a long time before great implementations appear. If the language isn't that much of an improvement, why will an existing C programmer take the effort to learn it? Aren't the returns diminishing? Computers will only get faster, llvm will only improve (tooling + compile time + link time): in that respect, doesn't Go seem a bit short-sighted?


And there are a whole lot of people who don't know C inside out and don't want to dive into it.

One needs to explicitly manage memory, deal with raw pointers, ugly header files and all that stuff. No interfaces, too.


In the respect of existing C programmers, maybe. Having never mastered C myself because of its complexity, in a few months of part-time effort, starting from getting a book on Go, I built a sophisticated website. In the respect of non-C programmers, Go is extremely useful.


C isn't any more difficult to understand, IMO. Obviously Go has the advantage of hindsight and an opportunity to go back on the trade-offs that were made for compiler efficiency before C was standardized. However, at its core I think C is much simpler than Go and theoretically should be easier to understand.

I find GC to be a useful tool but have you ever looked into the implementation of one? Or have you run into a situation where it was necessary to reason about the performance characteristics of a GC in relation to some algorithm or tight main loop?

I don't find C's lack of GC any more complex than GC. In fact I find it much simpler. It requires more discipline on the part of the programmer (and perhaps that is where the appeal of a GC comes from). Perhaps it's because I started programming by addressing memory and building up a mental model of how it works, and have only encountered GC as a useful addition to my programming toolbox (instead of an always-present assumption).

As well, co-routine trampolining (which is what I assume the playful pun "goroutines" refers to... please correct me if I am wrong) isn't a very "simple" thing to understand either. At least they are not any simpler than threads, so I don't understand how they make learning Go any easier for non-C programmers (except perhaps through exposure to co-routines in a language that already has them, like Python).

I guess what I am saying is that you could have just as easily mastered as much C in those few months as you did Go.


I believe the goroutines are cooperative, yielded either explicitly or by calling various library functions. So you don't need quite the same sort of locking as you do with preemptively-multitasked threads, and there's also a much larger set of atomic operations. This should make things much easier...


I think some people's intellects are more suited to learning and mastering C, which is great for the development of major products. For general business applications development and websites I think C may be overkill compared to Go. In a test of 2K concurrent users hitting my website on a cheap box, each served a unique page, all users are served fast, so I'm glad that Go is handling GC instead of me. I've done multi-threaded code in C# and find goroutines much simpler.

I have goroutines referencing objects created in other goroutines. It seems that to do my own memory management without it becoming too unwieldy, I'd have to write something like the generic GC code Go already has built in. It was nice to not have to think about that at all and still get great performance.


I would argue that C is probably the easiest language to understand, if you know the basics of hardware. C is essentially portable assembly. The C standard is tiny, and the behaviors are very clearly defined. The challenge is not in understanding the language, but rather in honing the engineering discipline required to use it to write real-world programs.

It lacks a garbage collector because there was no concept of a garbage collector when the language was created. A C program does not provide enough information for a good garbage collector: the best we can do is a conservative garbage collector, such as Boehm's. The garbage collector adds to the running cost of the program in exchange for the promise of automatic memory management. While the GC might run concurrently, any sort of relocation (for compacting) will require it to pause the entire program: these GC pauses can be fatal if your program is the Linux kernel or some high-frequency trading application. Even otherwise, it is important to remember that every GC introduces trade-offs; the "freedom" from manual memory management is only worth it if your GC is good.

When I'm not writing C, I like writing Ruby. I like that it's a beautiful evolving language packed with features. And for the little applications I'm writing, I don't care even if it takes a second longer. I like that I'm not being verbose about anything, and not manually managing memory. Writing something like Jekyll in C would be an absolute pain in the arse, and totally not worth it. Obviously, the lesson is: use the right tool for the right job.

The pthreads API can certainly be very intimidating, and it requires a lot of practice to master threading in C. I like that Go has taken the concept of coroutines from Lisp (Scheme's call/cc) and turned it into something called goroutines in imperative land. Although it's nothing novel, I won't deny that it's a nice abstraction to work with while doing multi-threaded programming.

If we were to redo C from scratch today, I'd definitely bake in more safety features. Probably design a closely related garbage-collected dialect. For better or worse, that's an entirely theoretical scenario: we have Go today, but I'm not sure where exactly it fits in:

1. It obviously can't replace C in linux, git, zlib, ssh, openssl, libcurl, nginx, varnish, or anything as core.

2. Since it doesn't have generics or any higher OO features, it can't displace C++ in chrome or llvm.

3. When python, ruby, javascript are around, why will anyone want to use a strongly typed language for writing web applications? Okay, maybe some intensive web services like search.

Tools? The most popular Go repositories on GitHub are dotcloud/docker (software deployment tool), burke/zeus (Rails preloader) and ha/doozerd (a very specific kind of datastore).

Might be useful in Android development, but everyone's stuck in a Dalvik swamp there.


> Writing something like Jekyll in C would be an absolute pain in the arse, and totally not worth it.

Well, here you go: https://github.com/piranha/gostatic

Wasn't a pain in the ass, wasn't hard, and works so much faster than Jekyll (or my previous engine, cyrax, which was in Python) that it's even funny.


> When python, ruby, javascript are around, why will anyone want to use a strongly typed language for writing web applications?

Because Go is multiples faster at runtime! After using Go I lost interest in learning Python, which seems only slightly easier to code in.


>Overall, I'd say Go is an effective language. But not beautiful, or novel in the slightest

You are right on target. Go was designed to be boring [1][2]. A programming language should not be about tricks and traps. It's a tool. Tools are not built for beauty. They are built for functionality.

[1] http://talks.golang.org/2012/splash.article
[2] http://aeronotix.pl/blog/go-is-boring

In [1] (section 2, Introduction) even the inventors agree that Go can be boring to some people. Guess what: it's boring by design.


> and proudly declares that it will not use any modern advances to compiler technology

Where do they say anything of the kind?


Your last paragraph contains some spurious statements:

"proudly declares that it will not use any modern advances to compiler technology"

Where? I've read most of the available documentation, blog-posts etc. and have yet to come across a statement even vaguely resembling this.

"what the final program looks like is secondary concern"

Have you read about "go fmt"? The entire idea of it is to format the code to a standard. You don't need to use it, but it's recommended and, frankly, makes every other language without a similar feature seem like "what the final program looks like is secondary concern".
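The same formatter is also available as a library, which gives a quick way to see what it does to a piece of code. A minimal sketch: `format.Source` from the standard `go/format` package is the real call doing the work, while `gofmtString` is a hypothetical wrapper name.

```go
package main

import (
	"fmt"
	"go/format"
)

// gofmtString formats Go source text in the standard style, or returns an
// error if the text does not parse as Go.
// (Hypothetical wrapper around format.Source.)
func gofmtString(src string) (string, error) {
	out, err := format.Source([]byte(src))
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	messy := "package main\nfunc  main( ){\nprintln(  \"hi\" )\n}\n"
	tidy, err := gofmtString(messy)
	if err != nil {
		fmt.Println("not valid Go:", err)
		return
	}
	fmt.Print(tidy)
}
```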

You're welcome to your opinion, but I think you'd best do some more research before making such statements without references.


> proudly declares that it will not use any modern advances to compiler technology

My apologies: this is an opinion, rather than an explicit declaration from the project. Their compiler "gc" starts with a traditional bison-generated parser, and they chose not to use llvm. Although one can argue that llvm doesn't provide enough information to implement a great jit, all aot compilers are moving towards using it. By having a common compiler infrastructure, compilers can focus on what is important. They currently have a simple parallel mark-and-sweep garbage collector, and there is some scope for development in the context of concurrent programs (wasn't it their goal to make it easy to write concurrent programs with goroutines?). Nothing spectacular though: the garbage collector in a mature JVM or V8 (which has to deal with a weakly typed language!) is far ahead; that's where the garbage collection research happens.

> what the final program looks like is secondary concern

The grammar has been specifically engineered so that the language can be analyzed and parsed without a symbol table. "go fmt" is just syntactical sugar: you cannot change the underlying parser. I'm not bikeshedding about which syntax is "better", but pointing out the motivation behind the current syntax. For instance, it's very different from ruby, which was designed ground-up to resemble human language as closely as possible (hence Poignant Ruby etc).

[edit: gc does not use a gcc backend; gccgo does]


> Although one can argue that llvm doesn't provide enough information to implement a great jit, all aot compilers are moving towards using it.

Actually, all serious Common Lisp implementations, for example, do incremental AOT compilation these days, and none of them is ever going to "move towards using LLVM". In the broader landscape of programming language implementations, as far as IR generality is concerned, LLVM is actually not significantly better than Java bytecode, and for many languages, trying to use it would probably generate more problems than actually using it would solve.


The gc compiler does not use a gcc backend. You're confusing it with gccgo. The llgo project is working on an llvm-based Go compiler.


Right, gccgo. However, that does seem to be the dominant implementation: llgo is being developed as an education exercise (excerpt from project page). From the Go FAQ:

> We also considered using LLVM for gc but we felt it was too large and slow to meet our performance goals.


The 'dominant' implementation is gc, not gccgo. IIRC, gc is based off the Plan 9 compilers which Pike et al. wrote. gccgo is an alternative implementation which (generally) trails gc.


Most go programs that don't need to be linked with existing C libraries are using the gc compiler.

Currently there are a lot of really good C libraries that haven't been ported to go, so anyone who wants to link to those is using the gccgo compiler, but it will probably be less used as the native 3rd party libraries catch up, or the C libraries are ported (this is happening at a furious rate)


No need to apologise to me :-) As I said, you're entitled to your opinion.

You'll probably be interested in some of what was talked about in the "Go Fireside Chat" (http://youtu.be/p9VUCp98ay4) in regards to re-designing the compiler, alternate compilers, LLVM etc. It sounds like they're aware of exactly the issues you're concerned about.


I have to break out the manual every time I use sed. Or awk. If this works reliably without any edge cases, it'll be fab!


It does for me. :) I partially did it because I was too tired of looking up sed's regexp syntax and was afraid I'm going to fuck up everything.


Nice! svn support would be useful (ignore .svn directories) and the -r option is easy to remember, but perhaps not so much in line with the already messy standard Unix options (-r or -R for recursive operation).

Note that specifically for Go source code, there is "go fix -r" for syntax/context-aware code replacement (see "godoc fix").


In the case of go fix, `-r` does something absolutely unrelated.

To be honest, I was just lazy inventing a command line key and it's recursive by design, so it happened this way. Plus, as you say, it's easy to remember because of mnemonics.

I haven't touched svn repos in years, but yeah, that could be added. I know they exist, it's just that I haven't felt the need to support them.


Btw, it ignores .svn and a bit more common stuff if it doesn't find itself in a git or hg repo.


Exactly. gr can be for find/sed like ack is for grep.


I really like Facebook's codemod.py for stuff like that: https://github.com/facebook/codemod

It has something like automatic and manual modes, so you can check (and fix) every diff during the replacement


Well you see... It's in Python, which breaks the deal for me (I despise slow startup). But interactive mode sounds interesting, that's something I'll consider implementing.


By slow startup do you mean 0.01s of wall time? This sounds more like a prejudice than a decision with a strong technical basis.


By slow startup I mean that 0.1s is the fastest that a small Python program using optparse, os, sys and re will start up on my system (i7 2.0, SSD). A search and replace tool will be slower. I consistently get 0.02s-0.3s results from 'gr' on my small-to-medium repositories (up to ~80k sloc). 'ack' usually takes more than a second.

Also, it was written when I had an HDD (not an SSD) and the codebase was on an encrypted image. It made an enormous difference to me.


btw, here is ag, which also looks at .xignore files: https://github.com/ggreer/the_silver_searcher


Yes, I've mentioned it in my post. :) It's nice, but it does not perform replaces. I could add .xignore support though.


I love this little gem. I can do in seconds what takes my colleagues an hour (they generally don't know about macros, extending their editors, regexes, etc). It showed me that solving text problems outside your text editor ends up benefiting more people.

It's a little bit different from Go Replace, giving the option to drop into your editor if the current replacement can't be applied directly (this is the great feature).


For x86 binaries:

    go get github.com/piranha/goreplace

    go install github.com/piranha/goreplace

is another way to get it if you have Go installed and GOPATH set up.

I personally aliased it to 'gor' in my zshrc as the git plugin already occupied the 'gr' (short for git remote).


Ha-ha, I've got an issue asking to add some notes about the conflict with the oh-my-zsh alias. :) I hastily suggested `gp`, but it's already taken by the same plugin, so I changed it to `gor` there, thanks for the idea.


Is tab completion not enough?


It often is. But somehow people end up with 'git remote' aliased to 'gr', and I ask the same question. :) I use gr very often, so not having to tab-complete is a plus.


This isn't working for me. I get:

    C:\Users\swah\docs>gr TEST
    panic: Given path should be anchored at /

    goroutine 1 [running]:
    main.NewIgnorer(0xc08005bcf0, 0x29, 0x0, 0x0, 0x140110, ...)
            /Users/piranha/dev/go/src/goreplace/ignore.go:32 +0xa4
    main.main()
            /Users/piranha/dev/go/src/goreplace/goreplace.go:62 +0x1b7

    goroutine 2 [runnable]:

[edit] New version is working!


Argh! I see, I think I can solve this. Give me a bit of time please. :)


Should be fixed, can you download new version please?


Working! Do you use colors? They don't work the same in DOS, I believe. I'm seeing some escape sequences.

    ←[0;32;49mRS\terms.tex
    ←[0m←[1;39;49m←[0;33;49m25:←[0m   ←[0;39;43mOEM←[0m  &                                      \\ \hline


It's been a while but those look like ANSI escapes so I think if you add something like DEVICEHIGH=C:\DOS\ANSI.SYS to your CONFIG.SYS then you should find it starts to interpret them correctly. I think ANSI.SYS uses up less than 30K.


Eh, it's not very easy to fix that, unfortunately. I opened an issue and will try to work on this a bit, but in the meanwhile... Maybe I can strip colors when outputting things on Windows.

Edit: argh, outputting colors on windows means making system calls. Which means cgo and good-bye cross-compiling. I'll probably just strip all colors...


Hmm, yeah, actually codemod.py had the same issue and I changed it to use Python's colorama library.

https://pypi.python.org/pypi/colorama


Oh, the library I use has no idea about colors in DOS. :( I'll have to do some research on how to fix this... :\


Wow, it's super fast AND the output looks great. Good job, piranha! Copied to /usr/bin, I will use this a lot.


If you're the only user on your box, you can just create ~/bin and some distributions will automatically add it to your personal PATH. That way you don't lose track of what you've copied into /usr/bin, leaving it to be maintained by packages.


you can also very easily add ~/bin to PATH in e.g. your .bashrc


Thanks!


what's the rationale behind 'gr foo -r bar' instead of simply 'gr foo bar'?


It just happened that way; plus, if you forget to put quotes around an argument you could easily end up with a lot of problems. :)


Are 32bit binaries coming? For the rest, great job!


Here we go, links are in the readme. They work for me on 64-bit OS X and 64-bit Linux, and `file` reports they are indeed 32-bit, but I haven't tested them on a real 32-bit OS (I don't have any ATM).


Hah, haven't thought about them (I use osx too much it seems). They will appear as soon as I figure out how to add them to Makefile without making it too ugly. :)



