> It's not better than a generator, but I'm surprised nobody has mentioned the v...

philsnow · on July 19, 2023

> No need to read the lines all into memory first.

It looks like that code does read the whole file:

(with a foo.csv that is 350955 bytes long:)

  % python -V
  Python 3.11.4
  % python
  >>> f = open("foo.csv")
  >>> f.tell()
  0
  >>> header, *records = [row.strip().split(',') for row in f]
  >>> f.tell()
  350955

I thought the list comprehension used to bind header and records was what eagerly consumed the file, so I changed it to a generator expression:

  >>> f.close()
  >>> f = open("foo.csv")
  >>> header, *records = (row.strip().split(',') for row in f)
  >>> f.tell()
  350955

nope, I guess it's the destructuring bind (the starred *records target) that consumes everything?

  >>> f.close()
  >>> f = open("foo.csv")
  >>> headers, records = f.readline().strip().split(','), (row.strip().split(',') for row in f)
  >>> f.tell()
  125

not as neat, though. Is there a golf-ier way to do it?
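
One possible shorter spelling, as a sketch only: pull the header off the generator with next() and keep the generator itself as records. The io.StringIO stand-in and its sample rows are assumptions so the snippet runs on its own; with a real file you would use open("foo.csv") instead.

```python
import io

# Hypothetical stand-in for foo.csv so the sketch is self-contained.
f = io.StringIO("id,name\n1,ann\n2,bob\n")

# Build the generator once; nothing has been read from f yet.
rows = (row.strip().split(',') for row in f)

# next() pulls exactly one row; the rest stay unread inside the generator.
header, records = next(rows), rows

# Rows are only read as records is iterated.
first = next(records)
```

The trade-off is that records is a one-shot iterator here, not a list, so it can't be indexed or iterated twice.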

me-vs-cat · on July 20, 2023

The parent poster was pointing out that this requires two complete in-memory copies of the file:

    [... for row in open(filename).readlines()]

The list returned by readlines() is one copy, and the list built by the comprehension is another. However, that first copy can be avoided by iterating over the file object directly:

    [... for row in open(filename)]

The entire file must still be read to evaluate the list comprehension.
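
To make the eager-versus-lazy difference visible, here is a rough sketch that counts how many lines each form actually pulls. The counted() helper and the sample data are made up for illustration; they are not from the code being discussed.

```python
import io
from itertools import islice

def counted(lines, counter):
    """Hypothetical helper: yield lines unchanged, counting how many were pulled."""
    for line in lines:
        counter[0] += 1
        yield line

data = "a,b\n1,2\n3,4\n5,6\n"  # assumed sample data, 4 lines

# List comprehension: every line is read before the list exists.
eager_count = [0]
eager = [row.strip().split(',') for row in counted(io.StringIO(data), eager_count)]

# Generator expression: lines are read only as items are requested.
lazy_count = [0]
lazy = (row.strip().split(',') for row in counted(io.StringIO(data), lazy_count))
first_two = list(islice(lazy, 2))

# Starred unpacking builds a list, so it drains the generator as well.
star_count = [0]
header, *records = (row.strip().split(',') for row in counted(io.StringIO(data), star_count))

print(eager_count[0], lazy_count[0], star_count[0])  # prints: 4 2 4
```

So a generator expression only helps if whatever consumes it is lazy too; a starred target is not.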

Additionally, this doesn't do what you think it does:

    >>> header, *records = (row.strip().split(',') for row in f)

Compare to this, using a variable for clarity:

    >>> gen = (row.strip().split(',') for row in f)
    >>> header, *records = next(gen)