Hacker News new | past | comments | ask | show | jobs | submit login
Data Structures using C++ (usc.edu)
101 points by michaelrbock on July 29, 2013 | hide | past | favorite | 52 comments



My CS degree was 100% C based; Java shows up in a class about OOP, and our class on Programming Language theory exposes us to Lisp, Java, and Prolog. They used to be Java based and chose to switch back to a C based curriculum around 2007, as well as focusing more heavily on algorithms and data structures while pushing the more "software engineering" type courses in to elective and offering Software Engineering as a specialization on a CS degree.

Joel Spolsky wrote a great article[1] in 2005 that basically says using a massively high level language like Python or Java as the core of a CS curriculum is damaging because the languages aren't "hard" enough. As curmudgeonly as it might sound, I'd be inclined to agree. Even though I rarely write C code anymore (we do everything in Java in our lab), the underlying understanding of memory management that is acquired from having to learn to handle it manually has been invaluable to me as a programmer.

Additionally, C/C++ is the language of UNIX and still the lingua franca of high-performance. Most high-level languages worth their salt will have a foreign interface to dynamically load C libraries; we're in robotics and have to work heavily with ROS/Gazebo as participants in the DARPA Robotics Challenge, and even outside of this a lot of really great tools for mathematical optimization and high-performance numerical analysis are all written in C. I'm one of only two or three people in our lab with the requisite knowledge to be able to effectively hand-roll JNI wrappers, and I don't consider this to be a good thing at all.

It's pretty clear to me that C/C++ aren't going anywhere any time soon, and it's always good to see a CS curriculum that still focuses on C/C++.

1: http://www.joelonsoftware.com/articles/ThePerilsofJavaSchool...


>Joel Spolsky wrote a great article[1] in 2005 that basically says using a massively high level language like Python or Java as the core of a CS curriculum is damaging because the languages aren't "hard" enough. As curmudgeonly as it might sound, I'd be inclined to agree. Even though I rarely write C code anymore (we do everything in Java in our lab), the underlying understanding of memory management that is acquired from having to learn to handle it manually has been invaluable to me as a programmer.

Python and Java are great starting languages, especially for the first introductory course. You're really only learning the high level concepts like basic language syntax(Why are there so many semicolons???), program flow control(functions, if, for, while statements), working with computer memory structures, and basic boolean logic. At that point, an easy to understand language actually makes more sense, because the students are working more with the concepts than with the language.

After that, when you're teaching data structures, it's important to know how the memory is allocated and how that can effect your implementation of the structure. C/C++ are good languages because they force students to deal with memory allocation and pointers. They also don't have pre-rolled data structures like python or java, which is important because the goal is to learn how the structures work, not how to use the standard language libraries.


My opinion is that one can easily teach these "high level concepts" (loops, syntax, control flow, etc) using C.

Furthermore my (admittedly curmudgeonly) take on the C vs "high level languages" debate is that learning something like Python "first" because it's "easier" is actually sort of damaging, in that the student will learn these "high level" concepts without any clue about the underlying implementation. They will then have to learn a bunch of "bag of tricks" that will seem arbitrary to them, and a bunch of "gotchas" to avoid, again, that will seem arbitrary to them, because they have no concept of things like the cost of memory copying (passing-by-value giant arrays as arguments to functions, etc).

I actually think that someone who is incapable or even worse, unwilling to learn programming using C, has no business programming anything at all of any import, beyond the most toy-example tiny scripts here and there.


>I actually think that someone who is incapable or even worse, unwilling to learn programming using C, has no business programming anything at all of any import, beyond the most toy-example tiny scripts here and there.

I'm a C hacker, and I do agree with this 100%.

And keep in mind, when I say "introductory", I'm talking a 1 semester class at most. I'm not saying that it should necessarily be any more than that.

>Furthermore my (admittedly curmudgeonly) take on the C vs "high level languages" debate is that learning something like Python "first" because it's "easier" is actually sort of damaging, in that the student will learn these "high level" concepts without any clue about the underlying implementation. They will then have to learn a bunch of "bag of tricks" that will seem arbitrary to them, and a bunch of "gotchas" to avoid, again, that will seem arbitrary to them, because they have no concept of things like the cost of memory copying (passing-by-value giant arrays as arguments to functions, etc).

Honestly, starting out with C++ when I was in my early teens(I'm in my late 20's now), a lot of those "bag of tricks" stuff I use daily was pretty much magic to me for a long time. It wasn't until I started to learn about dynamic memory management and serious data structure concepts in high school and college that it began to make sense.

That's why I think that the higher level languages are great for learning the basics of program flow control and basic computer data handling. The students don't need to know how dynamic memory allocation works at that point, but they will soon.

The overall point is that a well rounded, theoretical CS education should be teaching that the languages we use are closer to standard tools than the end-all of our products. There will be times where a high-level language will suit your needs perfectly well, and times where a low level language will have the raw power that you need, with the caveat of potentially needing a lot more thought and work to get the same results.


"My opinion is that one can easily teach these "high level concepts" (loops, syntax, control flow, etc) using C."

Then one day you have a line of students going out the door at your office hours who all need help understanding why their program is segfaulting. Some tried to free a pointer to something allocated on the stack, others tried to free a pointer twice, and one or two managed to find some undefined behavior that even you did not know about.

It is not that C lacks high-level structures, it is that any non-trivial C program must deal with low-level issues.

"the student will learn these "high level" concepts without any clue about the underlying implementation"

That is what computer architecture and compilers courses are for. You learn more about the implementation of your programs by writing a compiler than you do by using C, and writing a compiler in a high-level language is almost always less painful than writing it in C. Somewhere in the compilers course students should also learn how a garbage collector works, which will help them understand why they are seeing the behavior they see.


I disagree : you can easily teach high level concepts like control flow, branching, loops, logical operations, if/then, without even using pointers. You can even do some pretty high level numerical algorithm stuff just using plain old stack variables.


This is the first exposure to programming & computation class, anything superfluous such as semicolons or the keywords just trips people up and gets in the way of learning. You have to tell people to ignore all of the extra magic words just to get started, even though they would really like to know what it all means.

Look at Java:

  public class HelloWorld { 
      public static void main(String[] args) {
          System.out.println("Hello, World");
      }
  }
C:

  #import <stdio.h>

  int main() {
    printf("hello");
    return 0;
  }
Python:

  print "Hello"


#import <stdio.h> -> #include <stdio.h>


In C, you can't even get user input without using pointers. Command-line argument? argv is a pointer. fgets/fscanf into a variable? You have to use a pointer. Yes, you can tell the students to treat those stars and ampersands as necessary magic that they don't need to understand right now, but you're still using pointers.


Reading inputs without using pointers is pretty complicated in C...


Like a lot of thirtysomethings I learned on a home microcomputer, and on those the order was: BASIC, then assembly when you found out that BASIC was too slow and flabby, then a compiled language such as C or Pascal when you found out that writing anything serious in ASM was like trying to build a ship in a bottle with lentils, toothpicks, and glue.

Similarly the three core languages of a CS curriculum should be Python, followed by assembly, followed by C. The assembly background helps prep you for the gnarlier C concepts like pointers, so it's important to get that in there first before dropping someone into C. C is a systems language whose intended audience is people familiar with assembly but in need of something a bit easier to manage.


> Like a lot of thirtysomethings I learned on a home microcomputer, and on those the order was: BASIC, then assembly when you found out that BASIC was too slow and flabby, then a compiled language such as C or Pascal when you found out that writing anything serious in ASM was like trying to build a ship in a bottle with lentils, toothpicks, and glue.

That's certainly one plausible order, though more of people I knew (about the same time, though I'm technically just out of the 30-something range) did BASIC followed by Pascal, followed by C, followed by Assembly. Rather than BASIC -> Assembly -> Pascal/C.

> The assembly background helps prep you for the gnarlier C concepts like pointers, so it's important to get that in there first before dropping someone into C.

I personally think its at least as easy to learn pointers in C (or even first someplace where they are more opaque and you don't engage directly in pointer arithmetic, like Go or, I think, Rust, or if you were me in the 1980s, Pascal -- and then C) and then learn Assembly than the other order.

> C is a systems language whose intended audience is people familiar with assembly but in need of something a bit easier to manage.

Sure, and when C was invented, that was probably a common route among people who were already programmers; but its a bridge that can be crossed in either the assembly-to-higher-level direction or vice-versa, and seems to me to work quite well either way.


Yes, with even a basic understanding of assembly all the pain of pointers disappears.

I taught myself assembly from the output of Borland C because I needed more speed (for a platform game). I was hand-optimizing compiler output and turned out to be an amazing way to learn.

> C is a systems language whose intended audience is people familiar with assembly but in need of something a bit easier to manage.

A larger part of that is portability, which is a lot easier to manage.


I studied Computer Science at USC and was a student in 2 of Professor Redekopp's classes. Suffice to say that he was a top notch educator and the favorite professor of most students in the Engineering school.

At USC, at least while I was there, 101 and 102 (the class from which these slides are linked) were taught in C++. The Algorithms class used prolog, and the majority of the remaining core CS classes were in Java. All in all, I'm very grateful for the diverse foundation we were provided, it has helped me immensely in my career as a developer.


Sadly I never took Redekopp as the AP CS course had me place out of CS101. CS102 was taught by Crowley when I took it, and four years later when I was TAing it.


Professor Redekopp was one of the best teachers I've ever had. I highly recommend any incoming CS students take him for any classes you can.


> Joel Spolsky wrote a great article[1] in 2005 that basically says using a massively high level language like Python or Java as the core of a CS curriculum is damaging because the languages aren't "hard" enough.

There's certainly a value in getting closer to the machine (as C does), OTOH, there's a value in not having a lot of boilerplate cruft between you and the algorithm, which makes something like Python valuable. Really, I think a CS program that exclusively uses any one language is probably robbing students of valuable insight that comes from the different perspectives that working through different languages (particularly, languages from different design paradigms) provides.


My CS class in HS was in C++ (Turbo C++ in DOS mode with the Borland graphics). I'm undecided as to whether it was the best approach. On one hand I appreciate learning it. On the other hand, it makes you think too much about the computer to appreciate the algorithm.


I learned algorithms from Segdewick's "Algorithms in C", and had a different experience: I really liked understanding the algorithms and data structures right down to the level of bytes in memory. It felt... less magical. In a good way.


Most C++ implementations of linked lists are bad, including std::list and apparently, this course's.

People misunderstand linked lists so badly, that they actually list the motivation for linked lists as fine-grained growth, as opposed to an array-based list.

The main benefits of linked lists over array-lists are:

* Can put data structures in lists even if they are already in some other data structure. For example, have some ordering between array elements.

* In intrusive style (can find the list links based on the item), can add after that position in O(1).

* In intrusive style with doubly-linked list, can also delete an item in that position in O(1).

Non-intrusive lists are incompatible with having the same O() while also allowing an item to be in multiple data structures simultaneously.

std::list indeed does not allow this, and is almost useless.

It's sad to see this taught incorrectly in courses, too, perpetuating the uselessness of std::list.


Your points 2 and 3 are satisfied by using an iterator to keep track of the position.


But then you have to pull those iterators into almost everything that deals with your objects, making the code hard to follow, and wasting memory. The problem also gets worse when your objects are in more than one list (which you can't literally do with std::list anyway) since you have to keep several iterators around.


[deleted]


1. You can maintain a linked list whose elements are already stored in another data structure. This allows you to impose a different order on the elements without having to copy them.

2. http://www.boost.org/doc/libs/1_45_0/doc/html/intrusive/intr...

Were there any other terms you had trouble with?


point one satisfied using a dynamic array of pointers.


With O(N) rather than O(1) costs for any list operations.


in my experience, most array operations are read, not change.


I work on a high performance system which needs to handle lots of concurrency of requests with various possible scenarios.

In order to efficiently time out requests, need to have the requests sit in a chronological list. But they also need to be quickly found by other orderings, so they also sit in hash tables.

When a request is complete, I need to efficiently delete it from all these structures.

Using ptr arrays rather than lists would be catastrophic for our performance.

There is virtually no advantage whatsoever to the array approach over the list approach in our case and in many others.


Understanding the C++ std containers is almost a complete lesson in data structures. std::set is typically a red black tree, std::unordered_set is a hash table, etc. Looking at how these containers are implemented and understanding how/when to use which really separates the men from the boys when it comes to scaling and efficiency.

You can probably learn the concepts from other languages, but it's been my experience that C++ devs typically understand these concepts more so than most.


Uhh that yellow

//

Actually, I'm glad that C++ has been my first programming language. Its syntax is pretty straightforward, and it's even better to work with when you've got extremely clear guidelines set up – at my university (UPC BarcelonaTech) we had a course PRO1, where you got seriously punished if you used spaces instead of tabs, or if you set 2 as your tabstop, or if you used goto, to name a few. Now I cannot say that I share all those guideline rules, but at least I understand what a good C++ code style is.


Similar:

Stanford - Abstractions and data structures C++

http://see.stanford.edu/see/lecturelist.aspx?coll=11f4f422-5...

Edit:

The only thing that takes a little research is tracking down the C++ libraries that coincide with the class.


Thanks for this. It is worth pointing out to others that these lectures include videos.


This one actually includes all the practical material. Thanks :)


I think I'm in the minority, but I'm happy to see C++ in a college course. I thought it was all Python and Java these days.


A friend of mine has a colleague that deliberately uses C++ in an introductory CS class to "thin the herd" of students he has to teach.


i hear stuff like this all the time. all it accomplishes is to make CS a field of very similar people that happen to have the required mental models for entry. it's ridiculous. other fields don't have this arrogant "you either get it or you don't" attitude. even math, the most opaque of opaque subjects, tries to dampen the learning curve to thinking like a mathematician.

most kids that weren't living their childhoods on a computer simply don't understand how to think about computers and programming. to deny them entry to your field because they can't figure out how pointers work is toxic - especially when introductory CS classes are huge and students don't get much one on one help.

programmers that think like programmers are a dime a dozen. programmers with more diverse mental models are the ones that innovate, even if it takes them longer to grok programming.


I agree with you. This method does drive away many talented young people from programming, and it does create CS culture issues several years later when all of the graduates think the same way.


CS125 at UIUC was taught in Scheme (using the SICP book no less) as the first course in CS and it did a pretty thorough job of separating the wheat from the chaff.


It is now taught in Java while the data structures class (225) is C++. The Computer Engineering curriculum is going the other way and now teaches computer architecture, assembly, and C (in that order) before sending students over to the CS department for data structures.

Former TAs tell me that CompE students find CS 225 to be much easier than Computer Science students.


I can imagine. Just for reference, I graduated in '98, so things have probably changed quite a bit since I was there. Every class after CS125 used C++ (except AI, which was in Lisp), but C++98 was still in committee and the STL wasn't really available (templates existed, though). So it really was more like "C with Classes" than modern C++.

I wonder if CS373 is still destroyer of students that it was in my day. The average score (before the curve) was around 40%.


Ah yes. Intro classes, where the point isn't to teach people but sort out the ones you don't have to teach.


That it will do.


I attend Texas A&M University, and maybe it' just because Stroustrup works here, but all of my programming classes are C++ based. I have to learn Haskell this upcoming semester, I'm slightly worried about that!


Just be careful that they're not teaching you Haskell as a way to 'ease you into' C++ template metaprogramming!


Michael Crowley's a great professor and a swell dude. His Operating Systems class is also a great crash course that forces you into C programming competence.


I don't know what the fuss is about with java and python.

It was really beneficial in school to work on large java projects where we were exposed to eclipse, svn, log4j, junit, jdbc, etc. It was unbelievably useful going into my job. On the other hand, we used C when we were studying something low-level like thread communication or operating systems. C/C++ absolutely has its place in the college cs curriculum as do java and python.


My school teaches an intro course in python and then switches to c++ for the data structures and beyond.

It's gotta be easier to just start with c++ and run with it.


Just for some context, USC starts intro to CS with C++ and then does Data Structures in C++, then switches to Java for more advanced OOP. Upper division classes are usually in Java or C++.


I'm taking C++ (almost through with the beginning semester) after basically being self-taught in python, js and php. It's interesting, and I can definitely appreciate how it teaches you fundamentals in a way higher languages don't necessarily. Sort of like learning to drive stick when all you've known is an automatic.


This seems incomplete without the labs and assignments included. The github link on the assignments page leads nowhere.


I was a teaching assistant for that class for 2 years, HNs got me feeling nostalgic.


I once tried to code an assignment involving TF-IDF on a corpus of docs. It ran blazingly fast compared to python. But was debugging segmentation faults until the last minute.


Learning C++ was the best. That course is just missing the Algorithm Analysis which mine had along with Data Structures, all C++ and no #includes allowed.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: