I have invented a language called Croml, in which all source code compiles to the same program. This program reads in a file, sorts the lines, and writes it back out to another file. An empty source file accomplishes this task.
I did actually use LINQPad. My comment was in jest :P
However, there's no Main in LINQPad which takes string[] args. Also File.WriteAllLines takes an array as its second parameter, while OrderBy returns an IEnumerable.
Absolutely yes, up to the point you're working with text. When objects (either implicit or explicit) start to interact with each other, shell scripting falls short quickly.
'Text' covers a very large domain. Once you get any data into an ordered text form, Unix command-line text processing utilities have literally zero competition when it comes to succinctness, power and how quickly you can create solutions.
The utility of the shell goes down not because problems are represented as 'Text', but because shell languages lack features like exception handling, proper error checking and many other things, which makes it difficult to write large programs in them.
As a next extension, you can learn Perl.
I assure you, after that you will not need anything ever.
I read his post to mean he is talking about the 'Text' domain of things, not everything. He claims you won't want anything beyond Perl if/when you stick to handling text.
I like perl because of how it looks similar to php, and because it looks similar to php, the structure is somewhat like C. I am not sure how good Python is at text manipulation, but I'll bet it is similar to perl and can do, if not all the things perl can, most things perl can.
Languages I know are: C, Objective-C, Perl, PHP, Javascript, and bash.
Languages I played in: Python, Ruby, C++, C#, lua, go.
I want to learn C++ as it has very good cross platform stuff. But don't know when to start. I want to learn python so I can help with mailpile, but don't know when to start. I want to learn go as I want to write my own chat protocol, but don't know when to start.
I am not biased with languages, I just don't know other languages and can't give a pros/cons comparing between them.
This is my favorite python implementation in this thread. I daresay it's beautiful.
Including whitespace it's still half as long as the "11 line" core of the C++ script.
There's an unnecessary 'r'. Combined with the line rhythm established by opening the files in consecutive lines you ensure readers who would have been confused by the 'w' will quickly understand.
And the last line reads like programmer English; by which I mean English with a SVO word order. Not intuitive to the average person but quite parsable by anybody who's used a modern imperative language.
I can't even nitpick your choice of inp instead of input; the rhythm you set up and the contrast with 'out' mean it's quite obvious what you mean.
In my opinion using reduce, map and zip is not a good idea in this case. What are they needed for? I don't even think your approach is more functional than the examples above.
I mean, this one line should be equally functional and .. it's shorter and even more understandable:
$ time sort < t > t-sh
real 0m0.016s
user 0m0.008s
sys 0m0.008s
$ time py -c "import sys; open(sys.argv[1],'w').writelines(sorted(open(sys.argv[2]).readlines()))" t-py t
real 0m0.088s
user 0m0.056s
sys 0m0.012s
BTW also a different result (I curled this page for the data).
It is faster mainly because of the startup time of the python interpreter.
You can improve performance by passing the "-S" flag to the python interpreter.
I get inconsistent results with time, I think maybe because of context switching as stated here [1].
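One rough way to separate interpreter startup from the sorting itself is to time an interpreter that runs an empty program, with and without -S. This is only a sketch: the helper names time_startup and report and the repeat count are made up for illustration, and the absolute numbers will differ from machine to machine. Taking the best of several runs is one way to smooth out the noise from scheduling.

```python
import subprocess
import sys
import time


def time_startup(extra_flags=(), repeats=5):
    """Return the best wall-clock time (seconds) to start Python and exit.

    extra_flags: interpreter flags such as ("-S",), which skips importing
    the site module at startup.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # "pass" does nothing, so the elapsed time is almost all startup cost.
        subprocess.run([sys.executable, *extra_flags, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


def report():
    """Print startup time with and without -S (illustrative helper)."""
    print("default:", time_startup())
    print("with -S:", time_startup(("-S",)))
```

Calling report() prints both timings; subtracting the startup figure from a full run gives a fairer idea of how long the sort itself takes.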
So what? The Unix Shell library consists of all the executables that are on the system - for me this is the beauty. There is not much abstraction between the Unix Shell and the system under it. And the shell is very tolerant of other programming languages as well - as long as they can represent data in text form and through streams or files ;)
Yes, I know that this is the correct way of doing it in bash. I posted this because someone might test the speed of the two scripts and conclude "bash is faster" while they actually measured the speed of "sort" probably.
Microbenchmarks are a distraction. Macrobenchmarks matter.
The real problem with bash is the mess you get when you start needing whitespace or arrays or error handling or non tabular objects or a computation not already implemented as a system program.
Well, if you want a REPL for C++, there is always ROOT. Though, for the love of all that is holy in this world, I don't know why they chose to have a REPL for C++. Or why they chose the profoundly ungoogleable name of "ROOT".
Indeed, what kills languages like C or C++11 in the scripting domain is the need for declarations and other kinds of boilerplate. That said, with C++11 there are means to write a library requiring fewer declarations, but it's still a cultural problem (besides the hard work that it would require).
It's not really boilerplate in the classical sense, he's just talking about headers and lines that have only a brace on them. I think it's a fair statement.
Plus, he could have reduced lines even further to get down to about 6, like
I use C++ every day but I take objection to the following claim
"powerful string manipulation functions"
boost::algorithm::split (and the rest of algorithm, frankly) is unintuitive to use. Regex requires looking up the syntax every time. No encoding/unicode support. Literally 100's of popular string libraries, all incompatible, and that's not even counting all the homebrew. char* everywhere so lots of copying to work with string objects (which is what all of the algorithms work on). Dozens of different and slightly different ways of converting other types and object to/from strings. Poor string formatting functionality, and the solutions that exist are verbose and cumbersome (strstream, boost::format, ...) .
I still love the combination of low-level power and high-level abstraction that C++ provides, but string handling is one of the most problematic areas, in my experience (which is of course colored by the type of work I do, but still).
Yup. C++ isn't a terrible language for scripting, it's mostly the fact that it lacks a single sleek standard library like C# or Python or whatnot offer for normal shell-scripting tasks.
The problem with that is that any such "sleek standard library" doesn't translate the existing (massive) body of C++ code. Using Qt is already much better than the standard library though.
Completely agreed - in the areas I work in, char*-style strings are still commonplace, and that does make you appreciate boost or std::string a great deal more. But it does not hold a candle to the ease of string manipulation in a lot of scripting languages.
Strings are a known problem in C++. Wish the standard committee could end this string nonsense once and for all. Until then I'll just stick to std::string and char *.
I think this is just helping perpetuate an artificial meaningless classification of languages.
This entire "scripting language" vs. ? ("general purpose language"? "systems programming language"?) is really not well defined in the first place.
What makes some language a scripting language? Is Python a scripting language or a general purpose language? Scripting for a specific platform? "General" scripting language?
I think most people think of "scripting languages" as languages used for automating small tasks that aren't suitable (the languages, that is) for large applications. By that definition any general purpose language is automatically a "scripting" language but not the other way around.
And what does having a REPL vs. not or an interpreter vs. a compiler have to do with any of this?
> And what does having a REPL vs. not or an interpreter vs. a compiler have to do with any of this?
A lot. As a class, languages that can be executed by sending program text to stdin without polluting the execution environment are suitable for a whole class of programming techniques that languages that can't aren't.
For example, heredoc-ing to inline one language's code into another, generating code at runtime, or executing on another machine over ssh are much trickier propositions in Java or C++ than they are in Awk or Python.
REPL availability matters because it's common to use one or more REPLs as a primary UI to a machine. Lots of people use Bash or another *sh, some people use Python or a Lisp, but I've yet to hear of anyone using a C++ REPL as their shell even if such a thing may exist.
Theoretically, these are properties of the implementation and not the language, but language features tend to be so coupled to that implementation decision that it doesn't matter. (go with its 'go run' is a maybe-exception)
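To make the "program text over stdin" point concrete, here is a minimal sketch. The function name run_source_via_stdin is invented for this example; the only real mechanism it relies on is that the Python interpreter reads its program from standard input when given "-" as the script name.

```python
import subprocess
import sys


def run_source_via_stdin(source, interpreter=(sys.executable, "-")):
    """Feed program text to an interpreter on stdin and return its stdout.

    The trailing "-" tells Python to read the program from standard input,
    so no source file is written anywhere.
    """
    result = subprocess.run(
        list(interpreter),
        input=source,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Swapping the interpreter tuple for something like ("ssh", "somehost", "python3", "-") is the remote-execution variant of the same idea; "somehost" is a placeholder, and that form assumes a Python install on the other end.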
We also can't have any meaningful discussion using your definition. What does "polluting the execution environment" mean? Are we restricting the discussion to platforms/languages where "stdin" has a meaning?
By its nature, a scripting language's job is to "pollute its environment", i.e. to perform some modification of the state of the system it is scripting. A scripting language is most certainly not a "filter", something that takes some input via a pipe and produces some output.
Stdin isn't important, but the ability to treat the whole environment as a process that takes source code as an input is.
Mutating the environment isn't necessary to be a useful program. The techniques of "move the code to the data, not the data to the code" and "share by communicating, don't communicate by sharing" depend on this.
The actual implementation behaviors of C and Java prevent me from treating a remote system as an abstract, environment-free machine, at least without doing a ton of tooling. The almost unavoidable, unwanted side-effects on the local filesystem due to executing source code is what I'm referring to as pollution.
We're not going to get into a functional vs. imperative discussion I hope ;-)
So if Python makes a .pyc file that's not pollution but gcc making an a.out is? How about we create a RAM disk, compile into that, and dispose of it after we're done?
A compiler is a process that takes source code as input. You simply need to draw your circle a little larger.
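For what it's worth, Python has a compile step too; it just doesn't have to leave anything behind. A minimal sketch (run_in_memory is a name made up here) that compiles source text to a code object and runs it without touching the filesystem:

```python
def run_in_memory(source, name="<script>"):
    """Compile source text to a code object and execute it in a fresh namespace.

    Nothing is written to disk: the bytecode exists only as a Python object.
    Returns the namespace so the caller can inspect what the script defined.
    """
    code = compile(source, name, "exec")  # the "hidden" compile step
    namespace = {}
    exec(code, namespace)
    return namespace
```

For example, run_in_memory("x = 6 * 7")["x"] evaluates to 42. Whether that counts as "no compile step" or just "a compile step with a larger circle drawn around it" is exactly the question being argued here.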
The reality is that the lines are blurry, definitely more blurry today than they were in 1998 (that paper that was referred to). They are blurrier because computers are faster and have more storage, compilation is now more of a continuum with JIT, and there are many languages that straddle multiple categories.
I've used C for "scripting", e.g. a "quick and dirty" parse some files and spit out some results and I use Python for "production" style very large applications. C++ has become more expressive and safer but I'm not sure what we get by saying it's a "scripting language".
http://bellard.org/tcc supports executing C programs "directly". By piping things through clang, for example (a number of C++ compilers can still translate to C), it can be made to execute C++ programs directly in memory, "just like" python or perl or ...
A spectacular use of this ability is in using tcc as a linux bootloader. Instead of loading vmlinuz, it loads the C source for the kernel, compiles it, and boots the result. It doesn't even need an operating system (try that with python).
> I've yet to hear of anyone using a C++ REPL as their shell even if such a thing may exist.
Actually, it seems such a thing does exist, and there are quite some people using it (like, CERN) -- see a recent link from proggit for a C REPL called "CINT" (and its top comments for a C++ REPL dubbed "Cling") at:
It isn't well defined? I always thought a scripting language was evaluated/compiled at runtime (therefore slower, often allowing dynamically generated code to be executed, no low-level memory management like pointers), versus traditional languages which are compiled (either to assembly, or some intermediate representation, but dynamically generated code in the original language is necessarily out of the picture).
Obviously the name "scripting" comes from the fact that such languages are intended for "scripting", i.e. automating interaction with "objects" (application scripting, HTML scripting, shell scripting, etc.), and that high-level features and on-the-fly execution are more important than performance.
I've never met anyone who thought Python wasn't a scripting language -- I mean, you run the Python interpreter on the script file. And I've never met anyone who would call a compiled language (C, Java, etc.) a scripting language. The distinction is pretty clear to me; maybe other people can think of counterexamples?
Do we need to compile to machine languages? What about a VM? What about the Python VM is different than the Java VM or the .NET VM? Is C# therefore a scripting language or a ______ language. (fill in the blank)
ActionScript? What about JIT compilers? JavaScript?
At any rate, I think you're trying to say scripting language == interpreted language, but there are probably interpreted languages (let's say Prolog) that you wouldn't call scripting languages, and there are compiled languages that can be used for "scripting". I agree that typically scripting languages are interpreted. But I would call Python a general purpose language and not a scripting language.
Well there is a distinction between the language and the language implementation. Being scripted or compiled is a feature of the implementation but a lot of people think that it's a feature of the language. It is kind of a pedantic point to make but there are for example CINT (http://en.wikipedia.org/wiki/CINT) which is an interpreter for C/C++ or BeanShell (http://en.wikipedia.org/wiki/BeanShell) which is an interpreter for Java.
Being executed through an interpreter, compiled to native code or something in-between (intermediary code) are several possible implementations of a given language.
In the 80s, the same language, BASIC, could be either interpreted at runtime or compiled to intermediary code. And there was even a C language interpreter.
A surprisingly difficult question that I've wrestled with a lot (I did a PhD in "compilers and scripting languages" and people can be quite picky about semantics!). Here's how I think about it:
A scripting language is a language L with respect to an environment E where a sequence of operations that can be performed manually in E (a script) can also be performed by invoking a program in language L.
A stronger definition can involve multiple environments E for the same language.
A general purpose language L with respect to an environment E is a language in which every possible application that can run in E can be performed by invoking a program written in language L.
Then again we can strengthen this definition by including multiple platforms or environments.
Most people I know use "scripting language" in a derogatory way, "it's only a scripting language, you can't use it for real applications" and often with respect to perfectly good general purpose languages (which is why this approach isn't always constructive).
They usually share a set of features in terms of typing, extensibility, ability to be used more as glue language than real applications, interactivity
A well-known paper that discusses those capabilities is John Ousterhout's "Scripting: Higher Level Programming for the 21st Century" (IEEE Computer, 1998).
The four points on how "scripting languages differ from classical compiled languages" would mean even Java qualifies as a scripting language, and I've never seen anyone credibly claiming that.
Wikipedia is more clear on this: http://en.wikipedia.org/wiki/Scripting_language "it is uncommon to use Java as a scripting language due to the lengthy syntax and restrictive rules about which classes exist in which files" -- I'd say this applies to C++11 as well.
F# has static types, but it feels pretty light for doing scripting. It's got a REPL and a script mode to execute things without an explicit compile phase. Perhaps "doesn't require heavy type annotations" is a better criterion than no static typing.
Agree with you, without knowing F#. I guess F# is related to Haskell and in Haskell you have static types, but you need them only where the compiler/interpreter is not able to infer them.
F# has some inheritance from OCaml and ML. Much older than Haskell. And type inference can be done in many languages. There's no reason why, for instance, Java and C# require type annotations all over the place. They could add in type inference everywhere, although it'd probably mean an overhaul of the compiler, and it wouldn't work in every case (overloading).
> There's no reason why, for instance, Java and C# require type annotations all over the place.
Wouldn't their type systems be a big reason why? ML and Haskell have type systems very different from Java specifically because they went for systems that were inferable. F# has an iffy "just assume it is int" step to make it work with the C# type system. F#'s inference algorithm doesn't work as well, and it impacts how the language gets used. For example people tend to overuse the pipe operator because it helps the inference engine get the right type without annotations.
F#'s "assume it is int" is only for a few operators, such as +, as a convenience. It has nothing to do with interop with C# at all.
F# inference is left-to-right, which is one reason to use the |> operator, yes. F# had additional type inference, for instance, accessing members on a binding would infer object types, but they removed that. Haskell has a more complete type inference system.
I'm not seeing anything in C# that prohibits inference of types for fields, methods return types, or parameters. The type system C# has is essentially a subset of F#.
I tend to say that sed could be considered as a domain-specific scripting language usable to edit text. awk is a more generic language - to my shame I used it only for parsing and translating text - even without a REPL you can 'test' quite a lot of constructs by providing the expressions to awk as an argument.
I concede that a REPL is not a must for a scripting language, but it will definitely make it more enjoyable ;)
Would be great if somebody could post Python/Ruby/.. code here that achieves the same, just to compare. Even with all the C++11 additions it never felt like scripting to me, and I use it practically every day.
Small nitpick to the author: please define variables where you're going to use them, not C-style all at the beginning of the function. It makes code easier to understand. It doesn't make the reader wonder 'hey what's this variable going to be used for' and then have to crawl through all the code underneath it. It creates code that's easier to refactor. Also, but more arguably for such a small sample: http://stackoverflow.com/questions/1452721/why-is-using-name...
On your first point; I was matching the behaviour of the C++ program in the blog post, which allows any number of command line arguments but only uses the first two.
On your second point, I actually thought about writing the whole thing as
Agreed, it does and you made the right choice. But in reality once you know Haskell - bind is a very common operation and reading this version is very natural.
Here's a Perl example. This isn't optimally compact, but this is more or less how I would write a script like this if I had to put it into production at $work (maybe with an extra check to make sure that exactly two command line arguments were provided).
use strict;
use warnings;
open my $read,"<",$ARGV[0] or die $!;
my @lines = <$read>;
close $read;
open my $write,">",$ARGV[1] or die $!;
print {$write} $_ foreach sort @lines;
close $write;
Here's the shortest I've gotten after looking at some of the comments:
IO.write$*.pop,ARGF.lines.sort.join
And yes, it does work even without the space between `write` and `$*`.
Also after testing, I realized the ('\n') is not required for join. When you call 'lines', it still has the '\n' character in the string, and when you join, it defaults to join without a delimiter, so it's putting them back together with the newline still there.
The thing is in ruby you are already in main method so there's no need to declare main function as an entry point.
The main reasons the C++ version of the code is longer are historical and performance-related!
C++11 could have been designed around a smaller standard library, but that could break old code!
Although obviously, "Little code != Better code".
What you want to achieve is actually more important.
Btw, C++ as a scripting language? At first you might think so, but truly that's a big lie :)
> Would be great if somebody could post Python/Ruby/.. code here that achieves the same, just to compare.
It's actually really hard to completely rewrite that program in most scripting languages: they tend not to have the same concept of undefined behaviour.
'sys.argv[2]' in python (with sys.argv = ['thing.py']) is fully defined (raises an IndexError). 'argv[2]' in C++ (with argv = (char*[2]) { "./thing", 0 }) is undefined.
If your scripting language has an FFI library (ctypes in python, say) then you could probably do something equivalent.
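The Python half of that comparison is easy to check. A minimal sketch, where second_argument is just an illustrative helper name:

```python
def second_argument(argv):
    """Return argv[2], or a message describing the well-defined failure."""
    try:
        return argv[2]
    except IndexError as exc:
        # Python guarantees this exception for an out-of-range list index.
        return "IndexError: {}".format(exc)
```

With argv = ['thing.py'] this returns the string "IndexError: list index out of range" every time, on every implementation, which is the point: there is nothing comparable you can rely on for the C++ version.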
Defined behavior is a subset of undefined behavior. Specifically, "raises an IndexError" is a perfectly legal thing for a C++ implementation to do in this case. Approximately zero real world implementations do this, but nothing says they can't.
Good point about the UB. Together with shin_lao's comment about the error/exception handling, it basically sums up why the author's claim only makes some sense, but not a lot.
Not bad, however with the ''.join(data) you're effectively doubling the amount of memory needed for large files, because it will build the entire output string in memory (and you've already got the list in memory). Better to use writelines(). You can also use sorted() to iterate over and sort the input lines automatically:
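Something along these lines, as a sketch rather than a drop-in replacement (sort_file and its parameter names are chosen here for illustration):

```python
def sort_file(in_path, out_path):
    """Write the lines of in_path to out_path in sorted order."""
    with open(in_path) as inp, open(out_path, "w") as out:
        # sorted() consumes the file iterator and returns one list of lines;
        # writelines() writes them one by one, without building a joined string.
        out.writelines(sorted(inp))
```

The list of lines still has to fit in memory, since sorting needs all of them, but the second full-size copy that ''.join() would create is gone.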
From the other comment, it is completely equivalent, and I will admit that the pipe is due purely to my ignorance. I usually use `sort` for sorting the output of other commands, and so I forget that it can open a file on its own.
Don't worry about the cat, it does have a purpose: portability.
cat in.txt | sort > out.txt
works in powershell (cat is an alias for Get-Content and sort is an alias for sort-object; without the sort it wouldn't work).
ifile, line and data have to be defined before the loop.
ofile could be defined closer to usage, but I think it's much clearer where it is, grouping in and out defs together making it obvious how argv is used for each.
> every single line of code is clear, understandable and expressive
It's been a while since I looked at C++ but that statement doesn't apply to me. For example:
for(const auto &i : data) {
    ofile << i << std::endl;
}
I have no idea what that's doing or why it makes any sense. If I knew C++ better maybe that wouldn't be the case but, for example, the Ruby example someone else provided is obvious to me (and I don't work with Ruby).
It feels there's a lot of telling the machine how you want to do something going on in here (instead of what you want it to do).
I'd be really interested in hearing how this sample differs from a previous C++ implementation.
I think people sometimes feel compelled to combine auto and the range based for loop when they shouldn't. In this example the range based for loop adds clarity but the use of auto doesn't. I would have written something like:
Actually, the only thing this tells you about what you are getting out of the vector is that it can be implicitly converted to a string.
If data is a vector of, for example, pointers to const char, on each iteration this code will unnecessarily copy each item into a temporary string before printing it. Using auto would avoid this step, regardless of data's type.
Note that the typical non-range-based for loop also omits the vector item type... Is this that much less clear?:
for (std::size_t i = 0; i < data.size(); ++i) {
    ofile << data[i] << std::endl;
}
I think that some amount of knowledge was assumed. You would have understood it if you had known Java. For-each loops in Java have a very similar syntax.
You probably know something that is similar to Ruby, and that's why you understand it without ever learning it. I bet that you wouldn't understand it if you didn't know any programming language at all.
Maybe it's true that C++ has become less verbose and more easily flexible than it once was through some of the additions of the C++11 standard. I'd argue that a scripting language is not defined by these criteria though, a scripting language is not compiled down to a binary by definition - that's what defines it.
The real differentiator is that there is no user-visible compile step. Scripting languages can be compiled to binary, through a JIT or even AOT, but when this happens, it's hidden from the user. Consider Python and Java: both are compiled to binary formats, but Python hides this while Java does not. It is common to call Python a scripting language, but it has been a long time since anyone said that of Java (and even back when they did, it was intended as an insult more than as a technical classification).
C++ is adopting some of the properties of scripting languages, but I'm not aware of any implementations that have removed the explicit compile step. I don't think it's really accurate to call C++ scripting as long as that step is still required, and I haven't seen very much interest in taking it out.
A little bit of command line foo would fix that. If you really wanted to go gung-ho you could use Linux's binfmt_misc to do it directly to .cpp files for you.
And before you complain it's not the same thing, that's basically all a scripting language does. The compilation step is hidden inside the command wrapper.
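To show what "the compilation step is hidden inside the command wrapper" can look like, here is a rough sketch of such a wrapper. It is not how binfmt_misc or any particular tool works; the function names, the compiler name "c++" and the -std=c++11 flag are assumptions picked for the example.

```python
import os
import subprocess
import tempfile


def build_command(source_path, binary_path, compiler="c++"):
    """Return the compiler invocation as a list (compiler name is an assumption)."""
    return [compiler, "-std=c++11", "-o", binary_path, source_path]


def run_cpp_script(source_path, args=()):
    """Compile source_path into a throwaway directory, run it, return exit code."""
    with tempfile.TemporaryDirectory() as workdir:
        binary_path = os.path.join(workdir, "script")
        subprocess.run(build_command(source_path, binary_path), check=True)
        # The binary disappears with workdir once the program has finished.
        return subprocess.run([binary_path, *args]).returncode
```

A call like run_cpp_script("sort_lines.cpp", ["in.txt", "out.txt"]) (hypothetical file names) then behaves much like running a script: the user never sees the binary. It recompiles on every run, which a real wrapper would likely avoid with a cache.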
Also of note: does more than just C/C++: FORTRAN, Java, Pascal, even assembly; and includes a "realcsh" and "realksh" for that C and kernel REPL you've been craving. Just "apt-get install binfmtc". I test out quick little things in C++ with this all the time.
Would it, though? C and C++ introduce a lot of complexity in the compilation stage that is hard to hide without leaking much.
You have to import the headers, using include guards if your "script" spans multiple files. Any external headers have to be in the header search path, libraries have to be explicitly linked and also be in the linker path, you have to have a makefile or similar in order to manage the compilation complexity.
If you are using, say, python, all you have to do is add the shebang, "import" the desired packages and you are good to go. One has to install the eggs, packages, or whatever the name is beforehand, but after that you can just use them.
Eh, it's all semantics; nearly any language can be compiled or interpreted, and while it is verbose, it's not insurmountable to make scriptable C++; I've got some templates I tweaked with long enough to make them pretty straightforward to cut and paste and use with binfmtc (see my other post) to quickly test things (I may have to post those templates . . . ). I also have a template for Python that does similar things, because in all honesty, while it might slow down learning a bit, at least I'm learning the correct way to do things by always having warnings cranked to the max.
Agreed that C++11 is better than C++Original. But no. It's still 19 non-blank lines, all of which you have to think about and maintain. Compare this to an idiomatic version written in a "real" scripting language like Python.
import fileinput, sys
for line in sorted(fileinput.input()):
    sys.stdout.write(line)
Advantages over C++:
* Fewer import/using lines (5 lines in the C++ version).
* No variable declarations (4 lines in the C++ version).
* Automatic iteration over file or file-like objects is very nice. No need to build a list via getline() and the terribly-named "push_back()" function.
* No "data.begin(), data.end()" parameters to the sort function -- sane defaults, people.
* "for line in file" is so much easier to read than "for(const auto &i : data)".
* Return 0 is implicit. Explicit is better than implicit, I know, but this is a very sane default. If there's an exception, Python won't return 0.
* Better automatic error handling. What does the C++ version print if a file doesn't exist or if there's a read error?
* Thanks to Python's standard library (fileinput module), it automatically handles stdin, multiple input files, etc.
I eliminated the loops and did the processing with STL algorithms and iterators. It is ugly as sin if you try to process lines like the original code. Most scripting languages do rather well with lines but C++ does not.
Hah, I was just about to post more or less exactly the same code. :)
It's funny, had the author tweaked the problem just slightly to try to make C++ look good by saying that the program should output sorted words instead of lines, you could have deleted the entire Line nonsense. Had the problem been to output sorted, unique words, you could have made the rather elegant:
#include <algorithm>
#include <fstream>
#include <iterator>
#include <set>
#include <string>
using namespace std;

int main(int argc, char* argv[])
{
    using input = istream_iterator<string>;
    using output = ostream_iterator<string>;
    ifstream inputfile(argv[1]);
    ofstream outputfile(argv[2]);
    // A set keeps its elements sorted and unique, so reading into it does both.
    set<string> words{input(inputfile), input()};
    copy(words.begin(), words.end(), output(outputfile, "\n"));
}
But then again, it's not exactly in C++'s favor as a scripting language that copying words is easy, while lines is hard.
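For comparison, the same "sorted, unique words" variant in a scripting language. This is a sketch with an invented function name, and it splits on any whitespace, which roughly matches what reading words through an input stream does:

```python
def sorted_unique_words(text):
    """Return the distinct whitespace-separated words of text, sorted."""
    return sorted(set(text.split()))
```

Switching back to lines is a one-word change here (splitlines instead of split), which is the asymmetry being pointed out.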
Yeah, it is not a scripting language for processing lines. The line nonsense does have a benefit, though: it could be adapted to parse out records and, by overloading the less-than operator, to sort on a specific column. But then you would probably use AWK instead.
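Sorting on a specific column is also short in a scripting language. A minimal sketch, assuming whitespace-separated fields unless a separator is given; sort_by_column and its parameters are names made up for the example:

```python
def sort_by_column(lines, column, sep=None):
    """Sort text records by one whitespace-separated (or sep-separated) field.

    column is zero-based; lines missing that field sort first.
    The comparison is lexical, so "10" sorts before "9".
    """
    def key(line):
        fields = line.split(sep)
        return fields[column] if column < len(fields) else ""
    return sorted(lines, key=key)
```

The key function plays the role that the overloaded less-than operator would play in the C++ version.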
The thing which (to me) makes a scripting language /better/ for actual scripting, is that it's run from sourcecode.
In unix terms, if the file starts with '#!'
Why? Because I can write a 4 line BASH script which I plonk into /usr/local/bin and it just works. When I want to check why something happened on the filesystem, I can open the script inplace, and step through it in the REPL.
No faffing around with trying to figure out where the source code is, no compile/link/whatever...
Of course, while developing something a little more serious in a scripting language, I use a linter+unittests in pretty much the same way as a compiler... but that's beside the point.
Haskell is usually much more concise than C++, but I don't consider it a scripting language. (OK, I could use an interpreted haskell, I suppose... Just as you could use a BBC micro ASM interpreter in a VM... but whatever. It's just silly)
All the posted python/ruby/etc... above has the same problem. An OOM condition will terminate them all, which is generally the desired behavior (and thus the reason that process termination is the default handler for OOM conditions in all environments).
What error specifically are you looking to see handled that wouldn't be exactly equivalent in your scripting language of choice?
That isn't true at all. You know enough memory management to avoid memory issues in your example. C++11 didn't save you from needing to understand it -- you understand it so avoided needing it.
> Compile time with -O3 is roughly the same as Python VM startup and has to be done only once.
It still requires two separate steps. Also, compile gets slower as the program gets larger, which isn't nearly as true for Python.
Thinking it over, memory management isn't really a good argument here. Even if your program leaks, if we're talking about a job script? Who cares, the OS will clean it up when the process closes. And either way, stack allocation and references will do just fine for small jobs, you shouldn't need to be doing lots of pointer stuff for a little script.
There are many sane subsets of C++ that save us from worrying about memory. If you just keep everything on the stack or in an auto_ptr, you should do fine.
This is not true. auto_ptr (in C++11 unique_ptr) and stack discipline are not memory safe. There is no sane subset of C++ that is memory-safe. I can provide (and have provided—see my posting history) dozens of examples.
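A minimal sketch of the kind of case meant here, assuming the usual reference-invalidation scenario. This particular example is an illustration only, not one taken from the posting history mentioned above, and growth_moves_storage is a hypothetical name. The code never touches anything dangling; it only records that the vector's storage moved, which is exactly what would leave an earlier reference pointing at freed memory.

```cpp
#include <cassert>  // only for the check in the note below
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: reports whether growing the vector moved its storage.
// Addresses are captured as integers, so no invalid pointer is ever used.
bool growth_moves_storage(std::vector<std::string>& lines) {
    const auto before = reinterpret_cast<std::uintptr_t>(lines.data());
    // A `std::string& first = lines[0];` taken here would refer to this storage.
    lines.reserve(lines.capacity() + 1);  // asking for more than capacity() reallocates
    const auto after = reinterpret_cast<std::uintptr_t>(lines.data());
    // If the storage moved, `first` would now dangle, even though the code
    // uses no new, no delete, and no raw owning pointer.
    return before != after;
}
```

For a vector v holding two strings, assert(growth_moves_storage(v)) should hold, and v still has both elements afterwards. The point is only that the hazard exists inside a "stack and smart pointers" subset; how often it bites in practice is the thing being argued about.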
That's missing the point though. Broadly speaking, when people talk about languages with automatic storage management being "safer" they are not talking about a correctness proof of their memory handling. In fact some languages, Perl among them, fail to be safe from leaks in all cases, yet no one flips out about it.
The point is practical: is the language as typically used subject to routine "accidental" memory leaks? That's surely true for C, and remains true for most C++ idioms used up until the last few years or so.
It's not true of the kind of RAII style being talked about in the linked article. In that style it's routine to write large projects that literally never call operator delete, and need to resort to an operator new only in rare circumstances (often for compatibility with older APIs).
Modern C++ when used at this level[1] really does have the same kind of casual robustness against leaks and free-memory issues that you expect to see from garbage collected environments. And it's not even hard.
[1] Which is not to say that all contemporary C++ can be written in this model. Obviously if you're doing syscall-level code you'll need to be touching memory (and probably the heap) directly. But that's sort of the point.
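For anyone who hasn't seen the style described above, here is a rough sketch of what "never call operator delete" looks like. It assumes C++14 for std::make_unique; Job and parse_jobs are hypothetical names made up for this illustration, not anything from the linked article.

```cpp
#include <cassert>  // only for the check in the note below
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type, used only for illustration.
struct Job {
    std::string name;
    std::vector<std::string> args;
};

// Builds heap-allocated jobs without an explicit new or delete anywhere.
std::vector<std::unique_ptr<Job>> parse_jobs(const std::string& text) {
    std::vector<std::unique_ptr<Job>> jobs;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        auto job = std::make_unique<Job>();   // owned by `job` from the start
        if (!(words >> job->name)) continue;  // blank line: `job` frees itself
        std::string arg;
        while (words >> arg) job->args.push_back(arg);
        jobs.push_back(std::move(job));       // ownership moves into the vector
    }
    return jobs;  // every Job is released when the returned vector is destroyed
}
```

With the input "sort a.txt b.txt", a blank line, then "ls", this should return two jobs, so assert(parse_jobs(input).size() == 2) holds. The early `continue` is the interesting part: there is no cleanup code on that path and nothing leaks.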
> The point is practical: is the language as typically used subject to routine "accidental" memory leaks? That's surely true for C, and remains true for most C++ idioms used up until the last few years or so.
If this were true, we would expect to see large C++ codebases without memory-related security vulnerabilities. But the security history of every large C++ codebase that I have seen or heard of says otherwise. I would love it to be true, but I don't think it's a tenable position that C++, even "modern" C++, is memory-safe in practice.
We can argue over whether the C++ deployed in practice is "real" modern C++, but I think that enters into no true Scotsman territory really quickly. The fact is that C++ is not memory-safe in theory and has not been shown to be memory-safe in practice. For example, I know of real security bugs in Firefox that were caused by issues that are not fixed by any "modern" C++ idioms.
> If this were true, we would expect to see large C++ codebases without memory-related security vulnerabilities.
OK, we're talking past each other. The linked article and my point were about C++'s suitability for achieving software quality in tasks that are traditionally done by "scripting" languages. Security analysis is an entirely different world, and I tend to agree that other languages have a head start there as far as memory safety.
But that said, "memory safety" is hardly a big contributor to the overall vulnerability list. C++ is much less used on web backends, and it's likewise true that almost no large web service codebase exists without non-memory-related security vulnerabilities. I don't know if there are any deployed Rust codebases of this size, but I'd expect them to have their share of whoppers too.
Wikipedia claims that a scripting language should be interpreted and I would argue that this means that you need a good interpreter program too. There is actually a language called Ch (http://en.wikipedia.org/wiki/Ch_(computer_programming)) that tries to achieve just that and it works pretty nicely but it's not full-blown C++.
I disagree with the author's central claim (where's the REPL?), but he makes an interesting point.
C++ has matured to the point where it has 95% of what a scripting language needs. It wouldn't be hard to write a thin wrapper that provided the final 5%, and it would come as a welcome convenience to programmers who are used to working with the C++ libraries.
Oh. Wait a moment. It's already been done. It's called Lua. Ho hum.
I wouldn't say it's gone into scripting language territory yet, but C++ is becoming a much simpler language for everyday programming. It's still a large complicated beast with sharp corners, but that only really occurs when you delve into library-writing territory. Everyday C++ code, when written in a modern style, is just cleaner, and C++14 is going to make it even better.
I think the OP is trying to blur the lines of "use the best tool for the job." C/C++ are powerful languages, but they usually require a build infrastructure of some kind (make, etc.).
If you need access to a native library that isn't exposed through any other tool, sure, then writing a C++ tool is an acceptable route.
I'd expect part of the definition of "scripting language" to include being interpreted. He addresses this as an extra:
> compile time with -O3 is roughly the same as Python VM startup and has to be done only once
It's a non-scripting language hassle to have to compile. Of course, you just need a little front-end to automatically compile if needed for you. IIRC Perl actually does this.
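The "little front-end" could start from a staleness check like the one below. This is a rough sketch assuming C++17's std::filesystem. needs_rebuild and compile_command are hypothetical names, and the "c++ -O3" command line is an assumption about the local toolchain, not something taken from the article.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: true when the cached binary is missing or older
// than the source file it was built from.
bool needs_rebuild(const fs::path& source, const fs::path& binary) {
    assert(fs::exists(source) && "the script source must exist");
    if (!fs::exists(binary)) return true;
    return fs::last_write_time(binary) < fs::last_write_time(source);
}

// Hypothetical helper: the command a wrapper might hand to std::system.
// The compiler name and flags are assumptions; adjust for your setup.
std::string compile_command(const fs::path& source, const fs::path& binary) {
    return "c++ -O3 -o \"" + binary.string() + "\" \"" + source.string() + "\"";
}
```

A wrapper would call needs_rebuild first, run compile_command only when it returns true, and then execute the cached binary. That way the compile cost is paid once per edit rather than once per run.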
But I like his emphasis on large standard libraries that enable compact scripts, especially for string processing, and on managed memory.
Why couldn't you do this trick with C (i.e. compile & run)? It's mainly libraries, though memory management isn't natural. Actually, I could believe that many scripting languages actually started like that, but shifted to their own syntax asap.
This trick can also be done with java, by keeping a server in the background to run it to cheat the VM startup tax (and auto-compiling as needed). Java verbosity is a problem, but you can write C-like code in Java. The biggest problem is the detail of Java libraries - they give you a lot of control, but a scripting language should give you less control, in return for quick functionality (like unix `sort`).
The fact that C++ is a compiled language automatically removes it from my "scripting toolbelt". Aside from that, I just don't find it to be anywhere near as expressive as shell or Python, which is hugely important when you want to understand a script you (or somebody else) wrote several years ago.
IMHO this is a terrible example: you should never write this, because it is just Unix sort. Furthermore, I think it is even overcomplicated. First, it should read from stdin and write to stdout; second, it should be really short, not what C++ people consider short.
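For reference, a stdin-to-stdout version might look roughly like this. It is only a sketch: sort_lines and sorted_text are hypothetical names, and the comparison is plain byte-wise std::string ordering, which will not match a locale-aware sort.

```cpp
#include <algorithm>
#include <cassert>  // only for the check in the note below
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Reads every line from `in`, sorts them, and writes them to `out`.
void sort_lines(std::istream& in, std::ostream& out) {
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line);) lines.push_back(line);
    std::sort(lines.begin(), lines.end());
    for (const auto& line : lines) out << line << '\n';
}

// Convenience wrapper for trying the filter on in-memory text.
std::string sorted_text(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    sort_lines(in, out);
    return out.str();
}
```

The whole program is then a main that calls sort_lines(std::cin, std::cout). As a quick check, assert(sorted_text("pear\napple\nfig\n") == "apple\nfig\npear\n") should hold. Whether that counts as "really short" is left to the reader.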
I haven't written much C++ in quite some time. Does it still suffer from slow compilation? There's a module system in the works that should help a lot.
Using template heavy code can cause really slow compilation. If you get really liberal with nice things from Boost, a simple looking file can take a couple minutes.
On the other hand, by modularizing the code down into libraries, and generally using incremental compilation, after an initial 'full build', minor builds during development are not too slow.
An example of my problem with C++: it is very unwise to write a function in a shared library that returns a standard library class, say vector<string>, to the program that calls it.
Can you imagine if your Python modules couldn't return objects from the standard library?
This is because the shared library and calling program might have been compiled against a different version of the standard library, and also because the 'flattened' (mangled) names used to refer to members of a class are not uniform between compilers / compiler versions. You can often get away with stuff on Linux, because all the software is compiled in the same environment, but build once run anywhere? No.
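One common workaround, sketched here as an assumption rather than anything proposed in this thread, is to keep vector<string> inside the library and export only C types through an extern "C" interface. Every name below (name_list, names_create and so on) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Internal to the library: free to use any standard library types.
static std::vector<std::string> list_names() {
    return {"alpha", "beta"};
}

// Opaque handle: callers only ever hold a pointer to it.
struct name_list {
    std::vector<std::string> items;
};

// Hypothetical exported interface. Only C types cross the boundary, and
// extern "C" keeps the function names unmangled.
extern "C" {

name_list* names_create() { return new name_list{list_names()}; }

std::size_t names_count(const name_list* list) {
    assert(list != nullptr);
    return list->items.size();
}

// Returns a pointer owned by the list, or nullptr when index is out of range.
const char* names_at(const name_list* list, std::size_t index) {
    assert(list != nullptr);
    return index < list->items.size() ? list->items[index].c_str() : nullptr;
}

// The library frees what the library allocated.
void names_destroy(name_list* list) { delete list; }

}  // extern "C"
```

The caller rebuilds its own vector<string> from names_count and names_at using its own standard library, then calls names_destroy. A real version would also need to stop exceptions from escaping these functions. It works, but it is exactly the kind of ceremony a Python module never needs.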
This is turning into a bit of a rant. I like C++, but it has so many imperfect, jagged edges, enough to surprise programmers after 10 or 20 years. There is still a lot left to fix.
I adhere to the traditional definition of a scripting language as being a language used for automation; i.e., to control a host application or several host applications.
Why would you ignore boilerplate? It counts as well! That's why anyone would skip C++ and use Python: because they don't want to write boilerplate code!
That actually sounds precisely like D's sweet spot - a C++-like language, but with decreased boilerplate and some high-level features that you'd expect from a "scripting language".
I'm not sure you get to claim a language is a scripting language and then ignore the boilerplate.
The equivalent Python is 4 lines, just one more than the number of steps you're performing.
C++ is good at many things, but quickly creating readable scripts and live-coding with a REPL are not among them.