
    The default wrapping in most tools disrupts the visual structure of the code,
    making it more difficult to understand.
It's 2013. Let's fix the tools.

You'd think we would have gotten past the point of having to manually figure out "where should I break these lines for the best readability." Code is meant to be consumed by machines, and a machine should be capable of parsing the stuff, figuring out visual structure, and wrapping dynamically to account for window width and readable line-lengths in a way that preserves the visual structure.




Side-by-side diff tools (such as meld) work worse with longer lines. Find me an alternative to them, and then we can talk.

Code is simply easier to read by humans when it isn't stretched out obtusely in one direction. We should be making code easier to be read by humans. Machines don't matter.


I'm sorry, but I disagree. As it stands, it is more important to display code as it appears in the file. One day, everyone may have their own style for every language, with every tool supporting it. That's not today.


I agree. I'm saying this is a bug. The fact that we need to care about this is a bug.

If I'm working with code on my phone, I'm going to have different preferences than on my tablet, or my laptop, or doing side-by-side comparisons, or my 30" display, or a projector.

Probably the best current examples of what can be done with this are docco and its friends. iPython notebook is another example of what we can do when we decide that hardware improvements in the past couple of decades may be taken advantage of to improve production and consumption of code, rather than declaring that no progress shall be made after 1976.


The hardware may have improved, but the meatware has not. We don't use narrow, fixed-width columns because of technical limitations, but because they are easier for human beings to read.


I'm disappointed to see this is still the top comment, for a few reasons.

1) Lines longer than 80 are an indicator that your code is getting too verbose, at least for Python.

2) You should be taking the time to figure out how to maximize readability of your code. And breaking the lines is damn near instantaneous when compared to the time taken to pick a good variable name or decide on overall design structure.

3) Not everyone will have the same editor. Your code should stand on its own and be readable in its own right.


> 1) Lines longer than 80 are an indicator that your code is getting too verbose, at least for Python.

I completely disagree, partly because I prefer to err on the side of verboseness, and partly because 80 is just ridiculously short. Look at the examples in PEP 8:

    def __init__(self, width, height,
                 color='black', emphasis=None, highlight=0):
What about that code is "too verbose"? It's an extremely simple constructor with 6 arguments (5 explicit).

I understand that readability is subjective, but I find it extremely hard to believe that anyone honestly finds that more readable on two lines than one.


>but I find it extremely hard to believe that anyone honestly finds that more readable on two lines than one.

FWIW, I do. Especially since the new line is a clear point of separation between required and optional arguments.

But I also find HN extremely hard to read because of the ridiculously long lines, and often find myself jumping to the wrong line when I reach end of the screen.


I hadn't noticed the line break serving as the separation between required and optional arguments. That's a good point, and I now don't see an issue with that particular line break in that context. But I still think it's silly to always break at 80 characters in similar examples.


Exactly. 80 columns with python is incredibly frustrating to hold to. Python is a verbose language at the best of times, when you add in 4 or even 8 column indent it's just silly. 20% of your line is probably taken up by leading spaces or one or more 'self.'s. The introduction of keyword args only compounded the issue.


I think it's worth the time to spend a little while with an IDE that PEP-8 checks your code in real-time. It really forces you to think about better formatting and breaking up complex statements. In your example, if my declaration is running that long, I put the first line break after the open-parenthesis. Then all parameters end up in a nice block, and I've found that to be very easy on the eyes when going back to look at forgotten code. Generally speaking, I've found that adhering to the 80-char limit has greatly improved my code formatting and readability. 100 is okay with me too, though.
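A small sketch of the layout described above, with the first break placed right after the opening parenthesis so that the parameters form one block. The `Rectangle` class name is only an illustrative assumption; the parameter list is borrowed from the PEP 8 example quoted earlier in the thread.

```python
class Rectangle:
    """Hypothetical class, used only to show the layout."""

    def __init__(
            self, width, height,
            color='black', emphasis=None, highlight=0):
        # All parameters sit in one indented block, and no line
        # comes close to the 80-column limit.
        self.width = width
        self.height = height
        self.color = color
        self.emphasis = emphasis
        self.highlight = highlight


box = Rectangle(3, 4)
print(box.width, box.height, box.color)
```

The extra indentation on the parameter lines is one way to keep them visually distinct from the method body; other teams may prefer a different hanging indent.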


"Code is meant to be consumed by machines"

Well, partly. Code is actually a piece of writing intended for two very different audiences - one human, one machine. The machine doesn't actually care how long your lines are - this is an optimization for humans.

And I hear you about fixing the tools. Wouldn't it be easy if everyone used the same editor? But alas, this isn't a problem with the tools. It's an inherent limitation of text. That "figuring out visual structure" bit only works if your editor deeply understands your language. And different editors understand languages to different degrees and in different ways.

They say parsing is a solved problem, but I dare you to parse Ruby or Python correctly.


While Ruby and Perl cannot be parsed, Python can be parsed correctly.


Technically correct. Practically there is a tiny subset of Perl that the parser cannot decide correctly, which is used by virtually nobody. Almost all of the code on CPAN can be parsed just fine by this module: https://metacpan.org/module/PPI


I don't understand what you mean by parsing if not what the interpreter does. Can you give an example?


Perl (and I assume Ruby from the comment) requires the running context of the interpreter to make decisions about how to interpret some code constructs. Because you need the running state, it is not parseable.


Parsing as in determining what constructs different lines represent, without running them. Imagine e.g. a documentation generator that needs to figure out what all your methods and parameters are, but not actually execute them (obviously there are many other use cases).
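For Python, the documentation-generator case can be sketched with the standard-library `ast` module, which builds a syntax tree without executing anything. The `SOURCE` snippet and the `list_functions` helper are made up for illustration; this is a minimal sketch, not how any particular documentation tool works.

```python
import ast

# Hypothetical source to inspect. It is parsed, never executed.
SOURCE = '''
class Greeter:
    def greet(self, name, punctuation="!"):
        return "Hello, " + name + punctuation

def main(argv):
    raise SystemExit("never executed by the parser")
'''


def list_functions(source):
    """Return sorted (function name, [parameter names]) pairs without running the code."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Only plain positional parameters are listed in this sketch.
            params = [arg.arg for arg in node.args.args]
            found.append((node.name, params))
    return sorted(found)


functions = list_functions(SOURCE)
for name, params in functions:
    print(name, params)
```

Because `main` is only parsed, its `raise SystemExit` never fires, which is the property a static tool needs.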

A friend at an old job had a piece of perl code where one of the lines would be a comment or not when you ran it, depending on user input. He would bring it out whenever someone suggested writing something in perl.


Uh, ruby can be parsed, it's just hard. It's waaaay saner than perl.


If you emulate the way the MRI parses it, sure.

For example:

  # Is this some_method with an empty block, or an empty hash?
  some_method {}

  # Is this a regex or a comment?
  #
  # It depends how many arguments "whatever" takes, and you
  # don't know that until you run the code that defines it.
  #
  # Interpretation is implementation-defined. It just so
  # happens that the MRI determines that the below ISN'T a
  # regex because it has a SPACE character just after the "/".
  whatever  / 25 ; # / ; raise SystemExit.new;


I upvoted you for this - I think it's a really interesting idea.

Why can't my editor tokenize my code and then show it in the format I want? Why do I, as the programmer, have to worry about whether one whitespace convention works for you versus me, when you could just come up with whatever scheme you like and view code that way?
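As a rough illustration of the idea, here is a sketch that parses Python source and re-renders it in one fixed layout, using `ast.parse` and `ast.unparse` (the latter needs Python 3.9 or newer). Note that this swaps tokenizing for full parsing, and that `ast.unparse` drops comments, so it is a toy and not a usable editor view. `AUTHOR_A`, `AUTHOR_B` and `canonical_view` are hypothetical names.

```python
import ast

# The same function, laid out differently by two hypothetical authors.
AUTHOR_A = "def area(width,height):\n        return width*height\n"
AUTHOR_B = "def area( width, height ):\n  return (width\n    * height)\n"


def canonical_view(source):
    """Parse the source and re-render it in one fixed layout (Python 3.9+)."""
    return ast.unparse(ast.parse(source))


view_a = canonical_view(AUTHOR_A)
view_b = canonical_view(AUTHOR_B)
print(view_a)
print(view_a == view_b)
```

Both inputs render identically because the layout is regenerated from the syntax tree rather than read from the file.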


>Why can't my editor tokenize my code and then show it in the format I want?

Because editors are written by humans. And as of now, we haven't solved a whole lot of problems with their use.

For example, you get an IDE with lots of features, but a subpar editor, crummy UI and GC pauses (three of the five most popular are written in Java).

Or you get something like VIM, with a great editor, but subpar compatibility with the rest of your system, not very good understanding of the code (refactoring, tokenizing, etc.), '80s GUI capabilities, and an ad-hoc collection of plugins to fix basic pain points (like file navigation).

Or you get Emacs, with millions of configurable options, a subpar extension language, scripts in various stages of greatness and rot, '80s GUI capabilities, etc.

In general, we lack tools that run the whole gamut: great editor, fluent shortcuts (either Emacs or Vim style), 2013 GUI capabilities, refactoring and intimate knowledge of program syntax (to the level of understanding the AST, no BS regex used for syntax highlighting), embedded REPLs and terminals, etc.

Something like a Lisp Machine + Smalltalk UI + Light Table + Visual Studio + Vim/Emacs combo.


"something like VIM, with a great editor, but subpar compatibility with the rest of your system"

Excuse me, but vim has much better compatibility/interoperability with the rest of my system than all the IDEs written in Java.


Between ^Z, :<range>!, :grep, make, and other tools, vim has great compatibility with my system, thanks :)


Does it have an in-editor terminal?

An in-editor debugger?

Does it offer a REPL?

Does it work with your build system and your SCM system without some ho-hum third party plugins?


In-editor terminal? ^Z or :shell (Asking for an in-editor terminal makes about as much sense as my whinging that Eclipse doesn't have an in-terminal editor)

In-editor debugger? I admit this is less than ideal.

A REPL? Languages suited to REPLs have REPLs, and a lot of them support SLIME, allowing me to interact with the REPL from vim. For everything else there's tmux.

Build system? :make and errorformat

SCM integration? ^Z. Or fugitive I guess. (a third party plugin, yes. But if you think it's ho-hum, you probably haven't used it).


Visual Studio with a VIM emulator plugin is pretty nice...


Well, there's stuff like word wrap and GNU indent and a host of pretty-printers, but I think you're looking for something more resource-intensive and integrated.

The question then becomes, why configure a transformer when you can just format it right the first time?


Because, aside from Python and its One True Way to format, different people have different ideas of what "formatting it right" means.


Funny, someone was just talking about 79 characters per line in regards to using soft tabs for editor display consistency, the other day.

http://www.reddit.com/r/java/comments/1j7iv4/would_it_not_be...

These are useful for static code analysis and finding congruence with typesetting conventions:

https://pypi.python.org/pypi/flake8

https://pypi.python.org/pypi/condent

https://pypi.python.org/pypi/pep8ify


> Code is meant to be consumed by machines

I hope you're joking...


Poor word choice on my part. The intention was that machines don't need to jump through any hoops to be able to figure out what code means - it's a language designed such that it can be parsed mechanically without ambiguity. Compare this to a machine interpreting English.

I was not intending to downplay the significance of code being intended for human consumption.


But how many characters are in a typical line of English?


Traditionally,

  Anything from 45 to 75 characters is widely-regarded as a satisfactory
  length of line for a single-column page set in a serifed text face in
  a text size. The 66-character line (counting both letters and spaces)
  is widely regarded as ideal.
— Robert Bringhurst, The Elements of Typographic Style, 3rd Edition, §2.1.2


It depends on your typography.


More than 80.


No, thanks. Reasonably short lines mean I can get a good idea of what's going on by skimming in one direction: downwards. If there are lines that run off towards the eastern horizon, even if they fit on my screen, I still have to follow along them to make sure nothing of interest is being hidden.

Plus, just getting the code to do the thing I want is enough work; let's not also start worrying about how to write code in such a way that most tools will understand how to wrap it. That sounds like bringing all the joys of CSS to my plain text. Because the tools are _not_ going to be perfect. Have you ever seen Perl code with punctuation in a comment at the end of a line, because some idiot syntax highlighter got confused about where a regex ended?


Completely agreed.

Newer versions of emacs do nice code wrapping (dynamically, to the width of the buffer, without modifying the buffer text). Xcode does nice code wrapping. I've stopped manually wrapping my code, and it's great.
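To make "wrapping without modifying the buffer text" concrete, here is a deliberately naive sketch using the standard-library `textwrap` module. It is not how emacs or Xcode do it; `soft_wrap` and the sample line are assumptions for illustration, and the wrapper knows nothing about syntax (it would, for instance, break inside a string literal).

```python
import textwrap


def soft_wrap(line, width):
    """Return the display rows for one physical line; the line itself is untouched."""
    stripped = line.lstrip(" ")
    indent = " " * (len(line) - len(stripped))
    rows = textwrap.wrap(
        stripped,
        width=width,
        initial_indent=indent,
        # Continuation rows hang four columns deeper than the original indent.
        subsequent_indent=indent + "    ",
        break_long_words=False,
        break_on_hyphens=False,
    )
    return rows or [line]


physical_line = "    result = compute_total(first_argument, second_argument, third_argument)"
rows = soft_wrap(physical_line, 40)
for row in rows:
    print(row)
```

The rows exist only for display; the string that would be saved to disk is still the single long line.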


What do you do when you view a diff, and what you see there doesn't match your editor?

Edit: I know emacs can probably view diffs. That's not the point, unless you expect every window to be the same width, and every tool to wrap everything exactly the same.


We should fix the diff tools too while we are at it. Why are we diffing by lines and not doing a semantic diff of the code? If the compiler doesn't care about it, why should the diff?
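For a single language, a crude version of "diff what the compiler sees" can be sketched by comparing syntax trees instead of lines. The snippet below uses Python's `ast.dump`, which by default omits line and column information. The `same_meaning` helper and the sample snippets are hypothetical, and this only answers "same tree or not"; it does not produce a readable diff.

```python
import ast

OLD = "total = price * quantity + shipping\n"
REWRAPPED = "total = (price * quantity\n         + shipping)\n"
CHANGED = "total = price * quantity - shipping\n"


def same_meaning(a, b):
    """True when two snippets parse to the same syntax tree."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


print(same_meaning(OLD, REWRAPPED))  # True: only the wrapping differs
print(same_meaning(OLD, CHANGED))    # False: the operator changed
```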


You essentially want to replace Unix pipelines with an API then. Nobody else has managed to do this successfully, completely replacing text.


Isn’t this the heart of the problem, though?

Raw text is a limited medium, and working at the level of plain text analysis rather than semantic analysis does limit how much our tools can achieve. The UNIX philosophy of having many small, text-based tools and chaining them together represents a common platform, but it’s also a least common denominator platform.

As long as we allow that glass ceiling to remain, the best our tools will ever do is push it incrementally higher, one tiny step at a time. If we want to make big leaps, we’re going to have to sacrifice some of that generality so we can use more powerful but specific tools. Unfortunately, that means that any new programming language wanting to take on the established standards needs not only a compiler/interpreter, but also the rest of the ecosystem: a comprehensive tool chain, ample library coverage, documentation and training resources, and so on. It is also, almost inevitably, going to need some standardised way of bridging the gap to today’s established languages for interoperability and backward compatibility purposes.

This is why I think the view that programming languages should be designed optimally for humans is short-sighted, and I suspect most of the big success stories in the coming years will be languages that were designed with clean semantics and easy parsing in mind. Those languages will better support building that surrounding ecosystem, and a good language with a good ecosystem is more practically useful than a slightly better language with a limited ecosystem.


> working at the level of plain text analysis rather than semantic analysis does limit how much our tools can achieve.

We can add semantic analysis. I have no problem with that. But removing or harming the ability to do text analysis might not stop us from using more sophisticated tools, but it does stop us from putting tools together quickly in a way not previously predicted. This is why Unix is so powerful. If you haven't read The Art of Unix Programming, I urge you to do so. It puts the argument and examples forward far more convincingly than I can.

Would moving to semantic tools harm the ability to use our existing text-based tools? I'd say so. A simple tool like diff works better, for example, if you take your big list of Python imports and put each one on a separate line, and keep them sorted. Patching works smoother this way too, reducing the likelihood of the need for manual conflict resolution. If we eliminate doing this kind of arrangement by hand, and instead start relying on a semantic editor, we'll lose this ability. I have yet to see a tool that does semantic diffs, patches and merges better than diff, patch and git do.
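The import example can be reproduced with the standard-library `difflib`, which produces unified diffs much like the diff tool does. The `changed_lines` helper and the module names are chosen purely for illustration.

```python
import difflib


def changed_lines(old, new):
    """Return only the added/removed lines of a unified diff."""
    diff = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
    return [
        line for line in diff
        # Keep "+"/"-" lines, skipping the "+++"/"---" file headers.
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Adding `json` to imports packed onto one line rewrites the whole line...
packed_old = "import os, re, sys"
packed_new = "import json, os, re, sys"

# ...while one sorted import per line produces a single added line.
split_old = "import os\nimport re\nimport sys"
split_new = "import json\nimport os\nimport re\nimport sys"

print(changed_lines(packed_old, packed_new))
print(changed_lines(split_old, split_new))
```

With the packed layout, two people adding different imports both touch the same line, which is where merge conflicts come from; with one import per line their changes usually land on different lines.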


> But removing or harming the ability to do text analysis [...] does stop us from putting tools together quickly in a way not previously predicted.

Well, OK, but we’ve been using these text-based tools for a year or two now, and I don’t see many radical advances taking place in how we use or combine them. Are you sure you’re not chasing an illusion here?

The freeform text-based tools represent a great deal of flexibility, to be sure, but they were conceived at a time when flexible text manipulation was about as much as one could hope for. Today, we can do more.

> I have yet to see a tool that does semantic diffs, patches and merges better than diff, patch and git do.

As long as everything is limited to manipulation of freeform text files, perhaps you never will. That doesn’t mean better tools aren’t possible; it just means they aren’t possible within the constraints you’re choosing to impose.


> Well, OK, but we’ve been using these text-based tools for a year or two now, and I don’t see many radical advances taking place in how we use or combine them. Are you sure you’re not chasing an illusion here?

I'm not claiming recent advances. I'm claiming existing power that has been around for decades, which we would lose if we compromised the text tooling available today.

Have you read TAOUP? Do you understand the extent of the power that existing text tooling gives us today? Are you experienced in the advanced use of the existing tools, so you are able to make comparisons about their power?

> As long as everything is limited to manipulation of freeform text files, perhaps you never will. That doesn’t mean better tools aren’t possible; it just means they aren’t possible within the constraints you’re choosing to impose.

This is backwards. We move forward when people show how it can be done. Please show us how we can improve diffs, patches and merges by moving to semantic data structures over text, without compromising any existing capabilities. Even just illustrating specifics of how these tools might work, rather than implementing them, will do something for your argument. The onus is on you.


> I'm claiming existing power that has been around for decades, which we would lose if we compromised the text tooling available today.

Why would we lose it? The power of those tools isn’t in a particular executable, it’s in the algorithms they embody. For example, it is useful to compare two text streams reasonably efficiently and identify differences. How those differences are then presented obviously matters, but if you’ve got the algorithms and the ideas underlying them, producing a new tool to apply those ideas in a different context is the easy part.
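One way to picture "the same algorithm in a different context" is to run a sequence-matching algorithm over tokens instead of lines. The sketch below feeds Python tokens to `difflib.SequenceMatcher`; `code_tokens`, `token_changes` and the sample snippets are hypothetical, and this is only an assumption about what such a tool might look like, not an existing one.

```python
import difflib
import io
import tokenize


def code_tokens(source):
    """Token strings of Python source, skipping non-logical line breaks."""
    readline = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(readline)
            if tok.type != tokenize.NL]


def token_changes(old, new):
    """Non-equal opcodes from SequenceMatcher, run on tokens rather than lines."""
    a, b = code_tokens(old), code_tokens(new)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


# A rename that also forced the call to be rewrapped over three lines.
OLD = "user = lookup(id, cache)\n"
NEW = "user = lookup(\n    customer_id,\n    cache)\n"
print(token_changes(OLD, NEW))
```

A line-based diff would report every line of the call as changed here, while the token-level comparison reports only the renamed argument.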

The only significant difference I see is that if you made a major change, for example adopting a more structured storage model or using some sort of action/history analysis to better capture a programmer’s intent, then you would have more data to use in your algorithms, and you might therefore be able to present more interesting results.

> Have you read TAOUP? Do you understand the extent of the power that existing text tooling gives us today? Are you experienced in the advanced use of the existing tools, so you are able to make comparisons about their power?

Yes, though I find your emphasis on that one book a little surprising. For one thing, the UNIX philosophy was established for decades before Raymond wrote that particular work. For another, I seem to recall that he gives examples of both text and binary formats being useful in the book. I don’t think his point was that text formats were good and non-text ones bad; I think he was arguing that things like adaptability and composability were good and that flexible and standardised formats helped to achieve those things.

> We move forward when people show how it can be done.

Right, so why aren’t the programming language community picking up on decades of research and industrial progress with databases and HCI? Programming languages and the related tools are, fundamentally, just a user interface to design and control a complex, highly structured set of data.

> Please show us how we can improve diffs, patches and merges by moving to semantic data structures over text, without compromising any existing capabilities.

You’re begging the question, by starting from the position that having an equivalent to today’s text-based diffs, patches and merge tools is desirable. I don’t think that is necessarily true.

As a programmer, I want to be able to specify how my software should work, and I want to be able to explore and modify that specification effectively, and I need to be able to do these things in collaboration with others. My claim is that to do those things much better than we do them today, we may need to move to a different representation than freeform text and then build new tools that are designed to solve our problems in terms of that new representation.

The problem is that there is so much momentum behind text-based formats today that we are effectively stuck around a local maximum. No one individual could possibly meet your challenge today, and I’m sure you were well aware of that when you made it. That doesn’t mean the community as a whole couldn’t do it, but it would need some serious collaboration, which realistically means one of the heavyweight organisations with the resources to bootstrap a whole new software development ecosystem would need to throw its weight behind such a project. Unfortunately, most if not all such organisations are commercial in nature, and the commercial incentives don’t align with moving in that direction.


Scaling.

The same diff tool works for text files, latex and other markup files, for any computer language, and so on. I can write downstream tools that take the output of the diff and do further processing, without worrying about what the output might look like for brainfuck. And how long would we have to wait for Torvalds to add a C++ or Java extension to git? :)


Not all languages are compiled - sometimes line breaks are meaningful, and it would be good if diff tools work across all languages.


Good question. When I do a diff from within emacs, using ediff, it "just works" -- diffs are shown side-by-side w/ the same wrapping applied. (Ediff is also good at showing you what changed within the line.)

If I'm diffing in an external tool (like gitx, or just on the command line), then I just see (possibly) long lines. So far, it hasn't been an issue for me. Obviously, diff, as a line-oriented protocol, breaks down as lines get really long -- but it's still just code, so it's not like my lines are ever insanely-long. :)

I also use a mode that highlights my current (physical) line in the file, so I still tend to have a very good sense of what the physical line in the file is, despite the wrapped visual display. E.g., http://imgur.com/a/wAXHJ

One of the oft-touted benefits of wrapping lines at (say) 80 chars is it makes it easy to do side-by-side viewing of files -- using dynamic wrapping gives you this same benefit, but even more so, since you can heads-up different files at whatever width your current display happens to have. (Or however many files you want to have side-by-side.)

Also, there's a nice side-benefit to diffs, which is you don't get the noise that comes from a change that forces a manual rewrapping -- e.g., maybe I decide my variable "id" was too generic, so I change it to "frobnackId", but then this pushes some function call over the 80-character limit. If I'm manually rewrapping, I get a weird diff of multiple lines changing over the rewrap.

(On the other hand, a definite down-side is if you do find yourself using an editor that can't wrap nicely, code with long lines can be quite annoying.)


Vim shows diffs with wrapped lines just fine as far as I can tell. I have it set up to show both versions side by side, with changed lines highlighted and the particular change on each line also highlighted. I can wrap one or both, depending on how wide I make either.

Can you explain what problem you think might occur? It's not clear to me what problems you anticipate.


Humans are set up to see differences visually when they see patterns. If the way that lines wrap depends on the width of your window, then it becomes hard to match up the output visually from windows of different widths.

An example: I type "git diff" into one window, and look for some particular change area in my editor in a different window. If the wrap and alignments depend on the widths of my windows, then they won't match unless the widths of my windows also match. And if I have to make them match, then the original ideal of "make your window however wide you want and it'll just look right" is defeated.


Can't say I experience this problem; the highlighting of relevant lines takes care of it. Human eyes are terrible at tracking across long lines of text without jumping up or down a line (this is why splitting text into columns is common in large books). On the other hand, jumping between regions of color is easy.

Furthermore, I don't think this is true: "And if I have to make them match, then the original ideal of "make your window however wide you want and it'll just look right" is defeated." Being able to resize both, with the constraint that they must both be kept in sync (which to be clear, I don't consider necessary), is better than the alternative: no effective resizing at all.

You should probably look into integrating git-diff with your editor though (use git-difftool).


Especially in languages where whitespace matters, I think it's the wrapping of the lines itself that is problematic.


The editor wrapping the line does not actually change the whitespace though. It is a presentation change only.


>What do you do when you view a diff, and what you see there doesn't match your editor?

Mentally account for one more line or two?


Alrighty. Where will you be implementing it first?


And how will you get every editor available on board?


Does it matter?

Progress sometimes requires taking a couple of steps backward if we've reached a local maximum. An unwillingness to compromise backward compatibility - a perennial commitment towards designing for the lowest common denominator - limits the progress that can be made.


> An unwillingness to compromise backward compatibility

Are you referring to the unwillingness to bump up the maximum line length? I don't think it's because of backward compatibility. I limit just about every line of code I write in any language to 80 columns, and my horizontal screen real estate is 5760 pixels.


That is in stark contrast to SICP: "programs must be written for people to read, and only incidentally for machines to execute"

Whom should I listen to... Hmmm, tough choice...

I mean, I've just seen Gerry Sussman in a recent video saying he doesn't know how to compute, so you must be a winner.



