I've done a fair amount of web development in PHP & Python (as well as introductory C++ & Java). I had to look up "linked list" on Google just to get some understanding of what that is.
I'm going to play devil's advocate and say that while it is good to know these things, they're no longer essential. I graduated two years ago, and have been developing in Ruby since then, I haven't seen a linked list since College. Why?, because I chose to work with a newer language, with a higher level of abstraction. The programmer of the near future will have a different set of challenges to previously. In the same vein, the guys who graduated 10-15 years ago weren't learning Fortran. But 25 or 30 years ago, I'm guessing they were.
Here's a quick example of a case where knowledge of data structures and algorithmic complexity is useful, regardless of the language.
Let's say you've got two lists, named "A" and "B", which each have 1000 integers. You want to return a list of unique integers which are present in both lists (ie the intersection of two lists). These 'lists' could be arrays or linked lists.
The brute force method would be something like this:
Out = []
for a in A:
for b in B:
if a == b and !Out.contains(b)://O(n) list search
Out.append(b)
return Out
But this can be very slow, because you can end up in the territory of O(n^3) iterations across A, B, and Out (internally the language would generally be iterating across Out in order to complete that 'contains' call). In this specific example, that works out to be something along the lines of 1000x1000x1000 = 1000000000 iterations, vastly larger than the size of the original lists.
---
A better way is to sort one or both of the lists then do the comparisons along each of them (this syntax assumes the lists are specifically arrays, but it'd effectively be the same for linked lists):
Out = []
sort(A)
sort(B)
b = 0
for a in xrange(len(A)):
vala = A[a]
valb = B[b]
if vala == valb:
Out.append(vala)
++a
++b
elif vala < valb:
++a
else://vala > valb
++b
return Out
This is definitely better than the above example. Now we're performing two sorts, each O(n log n), then we're doing a linear iteration across both of those lists in parallel, each O(n). So now we end up with an overall complexity of O(n log n) as a result of those initial sorts. Let's estimate the total number of iterations to be around, I dunno, 10000(sort)+2000(iterate/compare) = 12000? The exact number of comparisons in a sort can vary by algorithm used and how things were ordered in the lists to begin with.
Not bad, definitely better than what we were doing before. But we might be able to go a little better...
---
Yet another way is to use some hash sets, which (generally) have O(1) insertions and retrievals, and only store unique values, such that if you insert the same value twice, the second insertion is effectively ignored. We can do something like this:
Out_set = set([])
A_set = set([a in A])
for b in B:
if A_set.contains(b)://O(1) set search
Out_set.append(b)
return list(Out_set)//optional, could return Out_set
Now we end up with an algorithm which is O(n), where we iterated over the items in A once to fill in A_set, then we iterated over the items in B once, and each "contains" call was an O(1) hash lookup inside of A_set. Then finally we iterated over Out_set once to create the output list. This final step is optional, we could also have just returned Out_set directly, but it doesn't effect algorithmic complexity in either case. Now we've got 2000-3000 iterations, depending on how whether we return a set or a list. And this additionally looks a bit simpler than the sort+iterate version I gave above.
---
So just by using our knowledge of data structures, we've turned ~1000000000 iterations into ~2000-3000 iterations. This is on the order of reducing the distance to the moon to around a kilometer. And that's just with two lists that are limited to 1000 items each.
And this sort of thing is a pretty common scenario in any language, no matter how 'abstracted' it is.
Thanks a bunch for the detailed examples. I've noticed this is a huge problem in my code: too many nested for loops. I've generally gotten better about using built in functions like set (in Python at least):
Out = list(set(A+B)) # right?
Or even DISTINCT in SQL (when dealing with webapps).
And now I never doubt the usefulness of fuzzing the DB with hundreds of thousands of random entries. That quicky code that took 40ms to run over several dozen entries turns into a minute or more for several thousand.... Ouch! ;-)
A linked list is a pretty elementary data structure. That being said, these questions seem like they're for a job writing client software on Windows some time before .NET existed.
Of course, way back in the dark ages when I wrote VB for client Windows apps I was all "lets keep this nice and simple, nice and simple.... arrrrrrghhh, why doesn't this POS language do what I want?! HULK SMASH", and then I'd break out the Win32 APIs and hit it with the big guns.
I feel a little sorry for the 'normal' VB programmers, because they'd be following my code and then BLAM it'd disappear into hyperspace, and re-emerge and something magical† would have happened.
It's worth noting that Python has just one linear structure - lists* (Ruby avoids lists and just has arrays). Now, they're actually much fancier under the hood, which is why you can get away with not having a bunch of choices, but it's the approach of "Oh hey, I want to store a bunch of stuff in a sequence of some sort - why should it matter how that's implemented?".
Now, I happen to agree with you that it's important to know how a basic linked list works. Beyond that, though, is far less important to some people than others.
* Okay, I lied. Halfway through writing this I remembered tuples (duh!), and I'm sure someone is going to point out several other things I've missed.
I've done a fair amount of web development in PHP & Python (as well as introductory C++ & Java). I had to look up "linked list" on Google just to get some understanding of what that is.