Software development final exam answers: Algorithms and Data Structures

adrianhoward · on Oct 17, 2012

Heh. I tried answering these before reading the answers. My CS (Computing and AI to be precise) degree was more than 20 years ago now.

1) "Question: Is O(2^n) equal to O(3^n)? Why?" - correct... but only because I got a very similar question wrong back in 1990 and learned my lesson then.

2) "Question: What is the expected run-time of quicksort? What is the worst-case run-time" - Correct - but I had to work it out. Didn't know the answer off the top of my head. 20 years ago I would have know the answer off the top of my head.

3) "What is the difference between a binary tree [EDIT: that should be binary search tree] and a B-tree? When and why does this difference matter?" - had to google b-tree before I was 100% sure of the right answer... the previous couple of questions had primed me to be thinking in terms of algorithmic complexity rather than immediately jumping to high fan out as the reason.

4) First part got wrong - coz back in the day when we were taught heap sort we didn't break down the algorithm in this way with those names. I was taught sort algorithms at A level (for US folk - highschool 16-18 rather than university)... so it's more than 25 years back for me now). The two chunks we had were, as I recall, making heap and sorting heap, rather than making heap and selecting largest element. I could tell you that heap sort has a better worst case time than quicksort off the top of my head though.

5. Couldn't do without googling what 'bipartite' means. Didn't really do much, if any, graph theory in our degree.

The other thing that pops into my head is how little I've had to use the more "CS" bits of my degree in the last 20 odd years of my career. ;-)

tmoertel · on Oct 17, 2012

I'd like to take this opportunity to thank Colin for offering his four-part CS "final exam" and then offering to grade our submissions. It must have been a ton of work, and it is appreciated.

jacques_chester · on Oct 17, 2012

I wish he wasn't ... I'm trying to pay him for his time and he's giving it away for free to total strangers :)

_hgt1 · on Oct 17, 2012

The average score for the first question was 2.8/5, while the second question (which has an answer that's easier to google) scored an average of 4.8/5.

The thing I take away from this exam is that working with Colin must be like dealing with Rimmer from the Red Dwarf episode "Holoship".

http://www.youtube.com/watch?v=bHA7qkiG6DI

seiji · on Oct 17, 2012

I've worked at places who wouldn't hire you if you could answer these questions. They had specific "we don't hire anybody who would get a PhD" policies because they didn't want you "in your head" all day.

Those "companies" failed.

shrughes · on Oct 17, 2012

These aren't PhD-level questions.

eru · on Oct 17, 2012

Though the difference between PhD and earlier degrees seems to be more one of diligence and sustained hard work than necessarily smarts.

seiji · on Oct 17, 2012

Correct. They are gateway questions.

jiggy2011 · on Oct 17, 2012

That seems a very odd strategy, it's basically saying "we don't want to hire anybody who is smart".

I can understand wanting to hire people who know how to solve problems over people who can regurgitate the algorithms textbook on cue. OTOH most of the questions here really aren't PhD level at all, they are typical of what I would expect on a sophomore level undergrad CS exam.

btilly · on Oct 17, 2012

For the record, I took this section and scored 22/25. I took the other 3 parts and expect to score much worse on at least 2 of them.

On question 1 I really think that Colin is wrong and my answer was right. As http://en.wikipedia.org/wiki/Big_O_notation#Equals_sign says, with citations, equality in big-O notation is usually used in an asymmetric way. Sure, Colin's preferred notation would be theoretically better, but the common usage supported by authorities up to and including Knuth is that, for instance, O(n) = O(n^2) but not O(n^2) = O(n). Using the definitions that lead to that, I demonstrated that O(2^n) = O(3^n) but that the reverse is not true.

On question 3, I knew full well about the disk win, but failed to say it. :-( Oh well...

Anyways, as others have said, thank you Colin for being willing to put this together and do the work.

cperciva · on Oct 17, 2012

equality in big-O notation is usually used in an asymmetric way

The equals sign is used in an asymmetric way, to represent something which is not equality. There was a reason I asked "is O(2^n) equal to O(3^n)?" rather than "is O(2^n) = O(3^n)?".

But I gave you 4/5 on that question anyway, since you obviously had the right idea.

btilly · on Oct 17, 2012

To me equality is whatever an = sign represents. shrug

But this reminds me of something amusing. I will never forget how hard it was for me to understand the distinction between isomorphic and naturally isomorphic. You see I thought of "isomorphic" as meaning "the same as". But then how could these two vector spaces be more the same than those two vector spaces? After all they're the same, right?

Evbn · on Oct 17, 2012

In C++, = is the operation that invokes the copy constructor. Is that equality? In C++ , == is equality, insofar as equality has meaning in a mutable language.

joshhart · on Oct 17, 2012

I also scored 22/25. I also agree that I will not be able to replicate that score, except for maybe on the last test. I find it interesting that I generally got lower scores on questions I didn't even have to think about at all, because I didn't bother to put in sufficient amounts of justification to my answer.

I'm not sure what I think about the test as a metric for software engineers OR of the results we'll get. Colin is a Putnam fellow - probably not the guy you want writing a math test :) For the results - first, people on HN aren't a representative sample. Second, people had the opportunity to look at the exam before deciding to take it. My hunch is that people who would not have done well may have decided it wasn't worth their time to hear about it. Not sure what we can make of this. Perhaps if Colin was disappointed, he's not as disappointed as he could have been. And also if he was disappointed, it may not actually matter.

Regardless, if I built my own startup and were hiring, I do think I'd want people who have demonstrated that they're fast, dedicated, and could probably get above 50% of the marks on this exam.

gjm11 · on Oct 17, 2012

If you actually literally did that, then I think Colin should have given you full marks for Q1 since the only difference between what he wanted and what you did was terminological. (Personally, I would write O(2^n) = O(3^n) but would not say that O(2^n) is equal to O(3^n). Which, yes, is rather weird.)

Evbn · on Oct 17, 2012

I would expect Knuth of all people to discard that broken notation. He has he smarts and chutzpah to do better.

I always hard that notation.

ColinWright · on Oct 17, 2012

This also generated substantial discussion when it as submitted over a week ago. Opinions are sharply divided. There are those who are interested in the challenge, there are those who say it's a waste of time and that programmers don't need to know this kind of stuff. There are those who are confused by the whole thing and don't see the point.

You can read that discussion here: http://news.ycombinator.com/item?id=4626097

gjm11 · on Oct 17, 2012

This is not the same thing that was submitted over a week ago. That one was the questions; this one is the answers and some information about how people did.

ColinWright · on Oct 17, 2012

Oops, that's an "oh - bother" moment then.

Apologies to all - I screwed up.

jonsen · on Oct 17, 2012

It's a recurring phenomenon. I believe it demonstrates that software engineering is a much more varied field than is commonly accepted.

Evbn · on Oct 17, 2012

I think it demonstrates that software engineers can't resist petty disputes.

mseebach · on Oct 17, 2012

> I have seen many people use quicksort in situations where the worst case either occurs quite often or can be easily caused by an attacker to occur

Isn't the pivot point in standard quicksort randomized? If so, how can there exist cases where an input will often hit the worst case? Are there attacks that work without having to predict the PRNG?

Of course if you know something about your data, there are often faster ways of sorting it, but that doesn't seem to be the complaint rendered.

sesqu · on Oct 17, 2012

> Isn't the pivot point in standard quicksort randomized?

Quicksort isn't standardized. Randomization is a common approach, both then so are median of 5, middle and first datums.

randartie · on Oct 17, 2012

Question 1 is interesting. Are there any 'not obvious' examples of functions f and g such that O(f(n)) == O(g(n))?

nandemo · on Oct 17, 2012

Of course "not obvious" is subjective, but I offer this example: f = log n! and g = n log n.

mdkess · on Oct 17, 2012

Is this true (by the definition on the site)?

I mean, certainly O(log(n!)) <= O(n log n) since we can just say that log(n!) = log(n) + log(n-1) + ... + log(1) <= log(n) + log(n) + ... + log(n) = n * log(n), but is the bound tight? It's been too long.

EDIT: Answering my own question... yes. http://www.cs.sfu.ca/CourseCentral/307/jmanuch/lec/factorial...

Evbn · on Oct 17, 2012

Informally, the left side is he integral of log over 1 to n, which is equal to n log(n) - n plus bounded error, which asymptotically is n log(n).

shrughes · on Oct 17, 2012

Consider a program that solves the traveling salesman problem in the following fashion. It searches for an algorithm that solves the traveling salesman problem and proof that such an algorithm runs in polynomial time, and then uses said algorithm to find a solution for the input, simultaneously while trying to solve the traveling salesman problem with some non-polynomial time algorithm that takes O(g(n)) time.

Let f(n) be the worst case running time that program has on inputs of size n. Whether O(f(n)) = O(g(n)) is not obvious. Edit: Actually you asked for an example, and I'm not sure I gave one.

cperciva · on Oct 17, 2012

Depends what you consider to be obvious, I guess...

capisce · on Oct 17, 2012

Such as f(x)=2x and g(x)=3x perhaps?

sesqu · on Oct 17, 2012

Another common one might be f(x)=log₂ x and g(x)=logₑ 10x

Evbn · on Oct 17, 2012

Big O f log, where is the factors don't count and the subscripts don't matter.

swah · on Oct 17, 2012

IMO I think Colin is assuming the majority of developers work in lower level languages like he does, but implementing mergesort in Python makes no sense, and so many folks haven't done that since university..

orangethirty · on Oct 17, 2012

I have not agreed with past "exams" but this one is/was interesting and I enjoyed it. Maybe because it is more relevant to general software engineering dealings. Dunno.

pgsandstrom · on Oct 17, 2012

Only two mentions of the result being "disappointing", I somehow expected it to be higher. Still, thanks for providing those exams, they were very fun and refreshing :)

cperciva · on Oct 17, 2012

It probably helped that I was pretty cynical when I started this. ;-)