Lisp-like html as a replacement to bbcode/markup/textile (94.249.190.129)
133 points by JimmyRuska on Aug 27, 2012 | 79 comments



This would never replace bbcode.

The best thing to replace that is markdown.

People don't really grok tags very well at all.

BBCode only works because of WYSIWYG editors that do the work for the user. That and the wide acceptance meaning that sites like Flickr produce BBCode embed statements.

Markdown works well for people just typing in plain text but still getting something formatted without trying to tag things.

If you have a WYSIWYG editor then who cares whether this is used... and as the embed stuff isn't compatible why pick this over BBCode?

If you don't have a WYSIWYG editor, and assuming you're in a web page and a textfield, then markdown works superbly well (ignoring the ugly image syntax).

I struggle to see, given the title and mention of BBCode, how the linked article overcomes any of the issues of the existing methods or offers any benefit.

Not to say that there isn't a place for this... but BBCode isn't it.


Err, replacement is maybe too strong a word. I should have said alternative. I don't have any such ambitions, but I plan to use this instead of bbcode on all my personal sites from now on. There's a lot of room for convenience in the form of transformations like I did with the tables. I could use this for my blog editor and add a {coffeescript ...} and {dart ...} tag for example, or automatically pass closure on js code. For users, it's really powerful because it can let them use style tags safely on elements.


> BBCode only works because [...]

Any board out there is full of posts with broken quotes and markup. I don't think BBCode works. It's widespread but it doesn't mean it works.

I otherwise agree with your points.


his underlying point, and he as much as said it, is that bbcode IS successful because it WAS successful.

when it was born, it was convenient because it looked and acted like html while limiting the user's ability to fuck things up. at the same time it didn't go confusing the machine that was going to process it. it was clearly never designed for anything but bridging that gap... some 15+ years ago.

Flickr produces bbcode embeds, but imgur produces like 8 different embeds. existing adoption is not the sole mark of success for anything digital. facebook anyone?

the wysiwyg editors make it easier to generate "human usable code that limits fucking up". It matters because users can still fuck things up but for some reason they need to be able to read it.

THAT is why this type of thing should be used (on the developer level) as a replacement. It's easy to read, it's easy to grasp, it's consistent and it's no more difficult to parse (for erlang or haskell or php) at the server level than bbcode.


I think that this is an interesting experiment, and reminds me of Clojure's hiccup[1] library.

In hiccup, you model html to be rendered as nested vectors and maps, such that

    <span class="foo">bar</span>
becomes

    (html [:span {:class "foo"} "bar"])
Where this is a real win is when you realize that it's just data:

    (html [:ol
            (for [x (range 1 4)]
              [:li x])])
Lisps are well suited to model html.

[1] https://github.com/weavejester/hiccup
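
The "it's just data" point carries over to languages without Lisp syntax. Here is a minimal Python sketch of the same idea, not Hiccup's actual implementation: the `render` function and the list-plus-dict layout are assumptions made for illustration, and void elements like `<br>` are not handled.

```python
from html import escape


def render(node):
    """Render a hiccup-style nested list as an HTML string (illustrative only)."""
    # Anything that is not a list/tuple is a text node; escape it.
    if not isinstance(node, (list, tuple)):
        return escape(str(node))
    tag, *rest = node
    # An optional dict right after the tag name holds the attributes.
    attrs = {}
    if rest and isinstance(rest[0], dict):
        attrs, *rest = rest
    attr_text = "".join(
        f' {name}="{escape(str(value), quote=True)}"' for name, value in attrs.items()
    )
    body = "".join(render(child) for child in rest)
    return f"<{tag}{attr_text}>{body}</{tag}>"


print(render(["span", {"class": "foo"}, "bar"]))
# <span class="foo">bar</span>

# Because the template is plain data, an ordinary comprehension builds it.
print(render(["ol", *(["li", x] for x in range(1, 4))]))
# <ol><li>1</li><li>2</li><li>3</li></ol>
```

The second call mirrors the `for` example above: the list items are generated by normal language features rather than by a template loop construct.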


Ruby has had these types of libraries for years. The original is probably Markaby, but today there are alternatives that are both faster (https://github.com/camping/mab/) and have more features (http://erector.rubyforge.org/).

It's an interesting concept, but I find that (1) it's hard to work on it together with designers and (2) for larger templates I don't find it that elegant.


The theme between these efforts seems to me: Instead of a separate language in a separate world, our views are normal objects/data (in all their glory and power) that happen to know how to print themselves as html.


I'm certainly no cultist of MVC, but I would tend to agree with them that tightly coupling presentation details to the basic model is usually unwise.


You're not supposed to put these into your models. You should create a different class/method that accepts models and returns HTML:

    class TweetWidget
      def initialize(tweet)
        @tweet = tweet
      end
      
      def content; … end
    end


Oleg Kiselyov has done some nice work where he defines SXML, a representation of XML in Scheme, and parsers/pretty-printers between XML and Scheme. This allows you to use hygienic macros (hygiene becomes important in the absence of namespaces). See http://www.okmij.org/ftp/Scheme/xml.html#SXML-spec

Lua's table syntax is well-suited to this style of embedding html/xml syntax in a programming language, e.g., if you define the appropriate functions a and i to represent anchors and italics, then you can define an anchor element:

    a {href=url, name="next",
       "Go to the",
       i "next",
       "element"}


Most languages will have literals that can be used to easily represent HTML data:

  [a => {href=>url, name=>'next'}, 'Go to the', [i => 'next'], 'element'];    # perl5

  [a => {:href(url), :name<next>}, 'Go to the', [:i<next>], 'element'];       # perl6

  [:a => {:href=>url, :name=>'next'}, 'Go to the', [:i => 'next'], 'element'] # ruby
  
Your Lua example is very similar to the HTML::AsSubs CPAN module (https://metacpan.org/module/HTML::AsSubs). Here is your example using this module:

    use HTML::AsSubs;

    sub url { 'http://www.anythingfornow.com' }

    my $h = html(
        a( {href=>url, name=>'next'},
           'Go to the',
           i('next'),
           'element',
        ),
    );

    say $h->as_HTML;


   a { ... }
and

   i "next"
are not table elements, they are function applications.

All of your examples use more magic characters than the Lua example I gave.


The HTML::AsSubs example uses functions and works exactly (see [1]) the same way your Lua code does.

The first set of examples are just to show that you can model HTML data within common Hash/Array literals available in most languages.

re: magic characters - The perl6 example does use some extra shortcut niceties :) However the perl5 example will actually work unchanged in perl6. The Ruby example is only slightly different from perl5 because it uses a symbol sigil (Perl, like Lua, allows barewords on the LHS of a table/hash key assignment).

[1] - The only difference is the Lua a() function receives all its content via a single Table. Whereas HTML::AsSubs a() is a variadic function and uses the Hash literal {} for assigning HTML attributes.


Which is, of course, just CGI.pm's html functions again.


The table and lists parsers are exactly that. Syntax transformations after I have a list made of dom nodes and binary elements. I can make a DSL parser attached to any tag I make. I just don't know what else would be useful.


Looks a lot like TeX, which is a good thing. In fact this might even be better. I'm tired so I might be missing an obvious weakness but to take the analogy a bit further here are some comparisons between this suggestion and a more TeX-like version.

The author's suggestion:

  {b {i This is in italics and bold.}}
  {Henny+Penny We can use google fonts anywhere if we just import them first with the google-font code}
  {macro foobar {u {b %s}}}
TeX-like (yes I'm making up the keywords):

  {\b {\i This is in italics and bold.}}
  {\font {Henny+Penny} We can use google fonts anywhere if we just import them first with the google-font code}
  {\macro {foobar} {\u {\b #1}}}
I would definitely prefer the first alternative over the TeX-like. The analogy also suggests though that instead of HTML "<br />" you could have a TeX-like atom "\br" instead of "{br}"; saves only one character, but easy to see inside a block of text.


Main problem with author's suggestion is that he mixes everything together, so he'll get into the markup version of DLL-hell.

E.g. How do I use relative links? Is {subdir Hello world} a relative link, a font-name, or a new and yet unsupported tag?

Html handles this: <a href='subdir'> versus <font name='subdir'> versus <subdir>...

Oh, and why support font names and colors directly in tags in 2012? He should support class names instead!

Why is "fontname from URL" hardcoded for Google fonts? Why not a generic syntax that handles whatever site you might want to use.

Why support simple macros without any support for formatting numbers and currency? Your server-side language should support this, so why send it to the browser?

Image (pic) elements are missing height/width, so we're back to the relayout flashes that the NCSA Mosaic browser had whenever it loaded an image.

Exercise for the reader: let your editor remove one } at random. Figure out for yourself where it's missing by just reading the source.


You can use the _style attribute with any element. For example {b_style="font-size:20;font-family:Courier New" content}. I intended it to work with other html attributes, so using multiple attributes would look like this {b_attr1='asdf'_attr2="asdf" content}. It would be almost just like html except for the quirky syntax. I only allow style for now because it's intended to be safe and I didn't want to allow things like onclick or anything javascript. Allowing a class attribute is also easy, but for now I didn't want to because the site I will be adding it to soon could use a previously defined class that's width 600 or something. Right now style is well controlled if you try to make things too wide or use something like display:none.


Why don't browsers support TeX anyway? Maybe it could even be embedded in a HTML page, like SVG.

   <!doctype html>
   <tex>
   Hello, World
   \bye
   </tex>


There is actually an XML version of TeX: http://getfo.org/texml/


I always thought it was an obvious choice given the people who started this (academics).


Nice experiment.

There's an issue with curly braces { }. When typing in Russian (and in other Cyrillic languages) you need to switch keyboard layout before and after each curly brace. It quickly becomes tiresome to do: type-in-russian, switch-layout, type-{, switch-layout, type-in-russian etc.

This problem also exists in Markdown: square brackets [ ] and curly braces { } both live on the same keys as Russian 'Х' and 'Ъ'. Typing Russian Markdown with links is tiresome. Curiously, the only easy-to-type characters in the Cyrillic layout are parens ( ).

Update. I wrote about JCUKEN layout above: http://en.wikipedia.org/wiki/JCUKEN. But on some keyboards the layout is extended. E.g. Mac keyboard allows you to enter square brackets [ ] with `~ key in default Russian layout [1, 2]

[1]: http://store.storeimages.cdn-apple.com/2544/as-images.apple....

[2]: http://store.apple.com/ie-business/product/MC184RS/B/apple-w...


Out of curiosity, do Russian developers permanently use an English layout, or just map these keys elsewhere?


Switching layout is not an issue when typing code. You just type in English because keywords and API are obviously in English. The problem exists only for markup languages where you type prose.

Still (in addition to English letters) these are hard to type in Russian layout: { } ' @ # $ ` (and more - depends on a keyboard and operating system).

Update. I've just noticed square brackets [ ] on my (Cyrillic) Mac keyboard. They are on a key with tilde ~ and backtick `. Before this very comment I didn't know I could type square brackets in Cyrillic layout... This should make Markdown much easier!


Actually on Mac with a combination of the alt/option key and the letter you should be able to write {} in Cyrillic without switching layout. Search a little, and open the "keyboard viewer" tool.


Pretty much all programming languages out there were designed with QWERTY layout in mind. You see, who in his right mind would make brackets more easily reachable than parentheses?

Hence, many many developers use a US layout instead of their native one, even if their native one is somewhat similar to the US one. This is true for pretty much all European layouts and certainly for more foreign ones.

It's a mess, basically.


I'm Bulgarian, not Russian. But developers here all use standard US-English layouts. In fact, everybody I know just uses a US-English layout by default, you can't really do anything except write Cyrillic text with a Cyrillic layout.


You just reinvented the RTF spec. It quickly gets out of hand, trust me.


I can say html is better than RTF, and I like this syntax more than html. It accepts _style attributes on any tag so I do not encourage command stacking.


If you look at the old Scribe, it used a markup like in this example:

    @hemlock runs in the editor process and interacts with other Lisp processes
    called @i[eval servers].  A user's Lisp program normally runs in an eval
    server process.  The separation between editor and eval server has several
    advantages:
    @begin[itemize]
    The editor is protected from any bad things which may happen while debugging a
    Lisp program.

    Editing may occur while running a Lisp program.

    The eval server may be on a different machine, removing the load from the
    editing machine.

    Multiple eval servers allow the use of several distinct Lisp environments.
    @end[itemize]
To make things italic, you would write @i[some text here]. Longer commands would be like in the itemize example. Each paragraph would be an item in this simple version.


Xarn came up with a very similar implementation of an s-expression-based markup language a few years ago:

http://cairnarvon.rotahall.org/2010/05/25/towards-a-better-b...

It's fun to play around with for a minute, but unless you really like /prog/, you're rarely going to want that kind of flexibility in basic text formatting. Still, a sexpcode parser would make for a decent board gimmick, so I'm kind of surprised that no one (that I can remember, anyways) used it for that.


Minor downside: You have to ensure that the contents of a {code ..} block have all their open-brackets matched with close-brackets. This can be quite tricky with larger code examples. The bbcode [/code] equivalent is slightly less problematic.

Other than that it looks like a good alternative.
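
The brace-matching concern for {code ..} blocks can be made concrete with a small checker. This is only a sketch: `unmatched_braces` is a hypothetical helper, and treating `\{` and `\}` as escaped literals is an assumption based on the escaping rule mentioned elsewhere in this thread.

```python
def unmatched_braces(text):
    """Return offsets of braces that have no partner, skipping escaped braces."""
    open_stack = []    # offsets of '{' still waiting for a '}'
    stray_closes = []  # offsets of '}' seen while nothing was open
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in "{}":
            i += 2  # escaped brace counts as plain text
            continue
        if ch == "{":
            open_stack.append(i)
        elif ch == "}":
            if open_stack:
                open_stack.pop()
            else:
                stray_closes.append(i)
        i += 1
    return sorted(open_stack + stray_closes)


print(unmatched_braces("{code if (x) { y(); }}"))  # []
print(unmatched_braces("{code if (x) { y(); }"))   # [0]
```

Note what the second call reports: the outermost `{` at offset 0, not the place where the `}` was actually dropped. A checker can tell you that something is unbalanced, but not which closing brace the writer meant to type, which is why pasted code is more fragile here than with an explicit [/code] terminator.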


In my opinion all tags should have an optional short/long form i.e.: {code ..} and {code}..{/code}.


you mean like CURL? http://en.wikipedia.org/wiki/Curl_(programming_language) - used it years ago and it did not make my life easier.


but curl is a programming language itself. This is intended to be safe and primarily for markup. I did change from parens to curly brackets because of curl. It made much more sense because brackets are much less used in everyday chat, on non programmer sites. I made this after I made an html parser, so it's parsed in very much the same way. Curl is very very different.


Years trying to move the style to the css and be semantic in the html.

I like it, but it comes a little bit late. 8 years ago it probably would have been nicer.


Cgi.tcl (http://expect.sourceforge.net/cgi.tcl/) has been around since the mid-nineties. Also, htmlgen (http://tclxml.sourceforge.net/xmlgen/htmlgen.html).


I've found HAML (http://haml.info/) to be my ideal reduced HTML syntax, strange not to see it mentioned here.


HAML is fantastic overall, but there are two ways in which I don't think it's addressing the same problem as something like this:

* Whitespace-sensitive means I don't want to use it in a text-field in a browser -- in an editor I can have guide lines to show me the indentation levels and what lines up with what, and keyboard shortcuts for indenting/unindenting blocks quickly.

* Nesting and inline stuff just doesn't work well since you'd end up with stuff like a %b on one line and then an %i on the next line unless you just went plain-HTML for the line. HAML is much more suited for page structure, and not so great for inline content.

Something like the following line is really cool with a quasi-TeX feel: "Using multiple tags: {b {i This is italics and bold.}}"


That looks nice, but four things:

divs. How to write something like <div class="text" id="lore">Test lore ipsum</div>? Does this have Chicken Scheme-style named parameters?

Inline HTML, especially useful for macros. {html <i>i</i>} should become <i>i</i>

A unification of <script>, <img>, <video>, <audio> and <iframe>. It's not that easy to translate to html, but I feel there should be one mechanism, since they all just embed another medium. The semantics would make more sense IMO.

The macro expression takes an argument (the macro name). Can my macro do this as well? And why doesn't the colour use it?


I set it to parse attributes with this syntax {b_class="text"_id="foo" content}. It only supports the style attribute at the moment, because I was worried about people using stuff like onclick or inheriting a class with a huge width. I structured it like this mostly because it was easy to parse. I also support tags like p, span, cite, section, article, aside, sub, sup, hr and pre. I don't know what you mean in the last part about the macros.
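
For anyone wondering how a head like that could be split apart, here is a rough Python sketch. It is not the Erlang code (which isn't public): `split_head`, the regular expressions, and the `ALLOWED_ATTRS` whitelist are all assumptions chosen to mirror the "style only" rule described above.

```python
import re

# Assumption: only "style" is let through, everything else is dropped.
ALLOWED_ATTRS = {"style"}

TAG = re.compile(r"[A-Za-z][A-Za-z0-9]*")
ATTR = re.compile(r"""_([A-Za-z][A-Za-z0-9-]*)=(?:"([^"]*)"|'([^']*)')""")


def split_head(element):
    """Split 'b_style="..." content' into (tag, allowed attrs, content)."""
    m = TAG.match(element)
    if m is None:
        raise ValueError("element must start with a tag name")
    tag, pos, attrs = m.group(), m.end(), {}
    # Consume _name="value" or _name='value' pairs directly after the tag.
    while (a := ATTR.match(element, pos)) is not None:
        name = a.group(1).lower()
        value = a.group(2) if a.group(2) is not None else a.group(3)
        if name in ALLOWED_ATTRS:  # onclick, class, id, ... are silently dropped
            attrs[name] = value
        pos = a.end()
    return tag, attrs, element[pos:].lstrip()


print(split_head('b_style="font-size:20;font-family:Courier New" content'))
# ('b', {'style': 'font-size:20;font-family:Courier New'}, 'content')
print(split_head('b_class="text"_id="foo" content'))
# ('b', {}, 'content')
```

A whitelist keeps the unsafe cases (onclick and friends) out by default. The sketch does not look inside the style value, so limits on width or display:none would need a separate pass.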


I believe Clojure has something similar to what the OP is talking about thanks to Hiccup's macros: https://github.com/weavejester/hiccup


Also there's Hiccup's older Common Lisp relative CL-WHO - http://weitz.de/cl-who/

Other languages also have similar solutions - http://stackoverflow.com/questions/671572/cl-who-like-html-t...


Hmm, an awful lot of <br>s in the generated code. Never mind that I'm not a big fan of explicit <p>'s, either, as s-expression like formatting isn't that helpful once your actual content pushes the starting tag beyond the screen. SGML is actually not a bad idea if your markup content is pretty low (< 5%).

It's still a bit line-noisy compared against markdown and, well, I've yet to see a forum using bbcode properly.


A new BR for every enter I press. The goal was to make the text in the textarea readable. Maybe that's why markdown uses 2 line breaks for every <br/> but I'm not sure I want to do that.


I still don't see a reason for manual linebreaks here, which would mostly be useful for things like poems etc, where you explicitly need to mandate a certain structure. For everything else paragraphs seem a better solution, especially compared to pervasive use of <br /><br /> to simulate post-paragraph spacing.


For developers that's right, but when normal people are chatting and hit enter they expect a new line. I can't assume they mean the new text to be a paragraph unless they specify it. I suppose I could add escaping for newlines but I'm not going to auto <p> either.


If you're aiming for the bbcode sector, sure. Bloggers etc. (i.e. the textile/markdown crowd) would usually expect more well-formed results.

Not sure whether that's a good demographic to aim at. Basically everything is better than bbcode, yet it's quite firmly entrenched after all these years (with a bit of tag-limited HTML and WYSIWYG as alternatives).


Once again see Erik Naggum's "Enamel"


That's the first thing that came to mind as I read the article. More info:

http://genius.cat-v.org/erik-naggum/xml-sgml-nml-lisp


Is there actually a public Lisp library to do that? Erik is no longer among us and took his code with him; Tim Bradshaw had his version called TML/DTML that AFAIK was never released.


Something similar from the Tcl world: xmlgen / htmlgen http://wiki.tcl.tk/5976


This is almost identical to the syntax used by OCamldoc, a tool for generating API documentation from OCaml sources:

http://caml.inria.fr/pub/docs/manual-ocaml-4.00/manual029.ht...


I'm a little bit late to the party and haven't had time to read all the comments yet, but I like the idea and the fact that it's written in Erlang.

So I just wanted to ask: is it intended to be open-source? I couldn't find the link to the source on the site and as mentioned I didn't read through comments yet.

EDIT: I skimmed all the comments but still haven't found any mention about source code. Also, I remembered YAJET[1] as something at least equally interesting.

[1]: Home page is here: http://www.yajet.net/ and docs here: http://www.yajet.net/yajet/doc/yajet.html


I'm not quite done with it yet. I may post source after I've used it for awhile and worked out any kinks. Honestly doing it in Erlang was not a great idea. I had to do a lot of micro optimization to make it run at reasonable speeds. I hope to rewrite it in D one day and maybe make a NIF plugin.


> Honestly doing it in Erlang was not a great idea.

I was suspecting this and it's one of the main reasons I'd like to see the source :-) Please don't hesitate to post it when you think it's ready!


This loses markdown's biggest draw: a normal, readable document outside the interpreter.


Could be fun:

    {defmacro foo [bar] {li {i {bar}}}}


The missing instruction: a literal '{' is written '\{'.


No domain name! I love it! How often do we see this on HN? Fantastic.

As for the idea in the title, isn't this more or less what Viaweb did?

Create a DSL for generating HTML elements and structure that can itself be embedded in an HTML page. Then let users POST commands along with their data to your custom interpreter (that you built from Lisp/Scheme).


No.

How does it replace markup?! It's still writing html but (more or less) using {} instead of <>. That's just silly, imo.

Markup (and restructured text) are so much more concise because you don't need to know what's <i> or <h3> or whatever.

Also, the page itself uses markup in HTML (not CSS)... wow.. back in 1995 again, are we?


Line by line parsers are extremely limited and markdown makes use of everyday characters we use in text for their formatting. The page itself uses style tags at the top so not sure what you mean. This is just showing off my progress so far of something I plan to use. Not an announcement telling you this is the future.


Ah well, not saying you shouldn't use that. Just saying i wouldn't ;)


you forgot to say that you weren't intending to use it to replace your use of html, but for replacing your use of bbcode-likes.

it's the bees knees as a bbcode replacement.


The difference isn't just replacing {} with <>. Notice that the tag name is only specified once. The end tag is only an end } not a </tag>. This is more satisfying to programmers because the SGML style allows mixups like <a><b>text</a></b>.
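
To illustrate why a single closing } rules out that kind of mixup, here is a toy recursive-descent converter in Python. It is a sketch under assumptions, not the real parser: `to_html` is a made-up name, the tag is taken to be the first token after {, `\{` and `\}` are treated as escaped literals, and attributes and void elements are ignored.

```python
from html import escape


def to_html(src):
    """Convert {tag content} markup into HTML (toy version)."""
    pos = 0

    def parse_until_close(top_level):
        nonlocal pos
        out = []
        while pos < len(src):
            ch = src[pos]
            if ch == "\\" and pos + 1 < len(src) and src[pos + 1] in "{}":
                out.append(src[pos + 1])  # escaped literal brace
                pos += 2
            elif ch == "{":
                pos += 1
                out.append(parse_element())
            elif ch == "}":
                if top_level:
                    raise ValueError(f"unexpected }} at offset {pos}")
                pos += 1
                return "".join(out)
            else:
                out.append(escape(ch))
                pos += 1
        if not top_level:
            raise ValueError("missing } before end of input")
        return "".join(out)

    def parse_element():
        nonlocal pos
        start = pos
        while pos < len(src) and src[pos] not in " \n{}":
            pos += 1
        tag = src[start:pos]
        if not tag.isalnum():
            raise ValueError(f"bad tag name {tag!r}")
        if pos < len(src) and src[pos] in " \n":
            pos += 1  # skip the single separator after the tag name
        body = parse_until_close(top_level=False)
        # The closing tag is generated from the opening one, so the two
        # can never disagree or overlap.
        return f"<{tag}>{body}</{tag}>"

    return parse_until_close(top_level=True)


print(to_html("{b {i This is in italics and bold.}}"))
# <b><i>This is in italics and bold.</i></b>
```

Every } closes the innermost open element, so the output is always properly nested; the only possible errors are a missing or an extra brace, both of which raise `ValueError` here.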


I'm not sure if expecting the programmer to know all HTML tags but not how they are used is the right way to go :P

Either you say "ok, i want to get rid of HTML and make it really easy to build a website" or not. This approach is neither, in my opinion.

In my opinion, that's a problem of most of those "fancy" markup languages. Why should i bother using this (or haml for that matter), if it only adds another layer of complexity and yet another "language" to learn (for me and far more important for every other person that may join a project in the future)? It may look prettier but it introduces overhead and potential other problems. Is it worth it? For me, it's not.


Like all other html "improvements" this just trades one set of obscure conventions for another. A bit more consistent perhaps, and the macros are cool. How is this better than any other templating library though?

Is the source available?


The difference is normal users are able to use this more easily when used as a markup language. People use parentheses all the time as a delimiter for their ideas (they do this sometimes). So telling them to {i do this sometimes}, isn't much of a stretch when they want to change tones or context of what they say. If you know html, this is pretty much the same thing but with {tag content} instead of {tag}content{/tag}. Not a lot of memorization involved and if I allow any tag then it's the same set of html tags plus the syntax transformations I've added. There is no source. It's not quite finished, and even so, the source is in erlang. Not the most common of languages.


Thanks for sharing this. Looks very cool!

I would prefer the element name in front of the brackets. Like this:

h3{We can use google fonts easily}

This separates the element name more from the content.


I don't get how people like Lisp syntax so much; to me it looks horrible. I would prefer markdown as a replacement for BBCode.


Line by line parsers are much much more limited. This can support syntax transformations and html attributes. This is more of an html-alternative in terms of capability. Markdown links are ugly and you need double space instead of single space for every space you input. I could do something like render things nested in asterisks as italics, but I'd rather have normal characters not turned into formatting. Lisp might look ugly in comparison to python, but not html and bbcode.


What's wrong with Markdown links [1]? I'd have a hard time coming up with a better syntax than this:

    [Google][]
      [google]: http://www.google.com/
    [Google][1]
      [1]: http://www.google.com/
    [Google](http://www.google.com)
I'm not sure what you mean about double-spacing either. Line breaks (<br />) are made by adding two spaces to the end of a line. If you don't like that, try GitHub-flavored Markdown [2].

[1]: http://daringfireball.net/projects/markdown/syntax#links

[2]: http://github.github.com/github-flavored-markdown/


I was only aware of the last one. This has the capability to be used as a replacement to html, not just a limited subset. It can also do syntax transformations like the table, which also accepts other syntax transforms and tags. Markdown can't take html attributes either.


I used to think the same thing about Lisp syntax. I wondered how anyone could like it at all. For the last several years I've done a lot of coding in Scheme, and I have to admit, I've grown to really like it. I think the primary reason people dislike Lisp/Scheme-style syntax is that it is so very different than what they are used to. For years we've been programming in Algol-like syntaxes (C, C++, Java, etc) -- and when you're used to that, Lisp-like languages look so foreign that the natural response is often "ew! that looks awful!".

I know it's probably been said here multiple times before, but you really do get used to the parentheses, and eventually I think you'll even appreciate them (if you decide to really give Lisp or Scheme a chance).


Looks interesting. I would have preferred the use of () in place of {} though.


That would require a lot more escaping when you aren't writing markup (as using parentheses in regular writing is so common).


It's pretty easy to switch the delimiters. I plan to add it to a roleplay site first because they use parens much more than curly brackets. You can see the old version here, but it's a line by line parser http://rp.eliteskills.com/editing.php . They're using it every day.


Yep, my thoughts exactly.

If you call it lispy it's gotta use parens.

The curlies immediately suggest some broken json or perl hash.

I can see the lispy spirit in applying the car to the cdr, and I like it!

One first bug report (or feature removal request):

&nbsp; makes it to the output, but is replaced with the U+00A0 character in the input textarea.

{p like &nbsp; so}

What's the concern with a real lisp syntax?

I for one like it!


Great idea, and so very obvious in hindsight.



