Evidence-based Software Engineering – book beta

Mekantis · on July 10, 2020

It makes me wonder. Is there enough research and sufficient good data to make evidence-based software engineering possible? It's not something that comes up much, and whenever I do see research, it usually feels disconnected from the realities of software development. That said, I'm going to read this and see if there's anything valuable to learn.

agentultra · on July 10, 2020

I don't think there is a lot of data from what I've seen. This book claims to pull together some 620 data sets for analysis. I have only glanced over a few sections that interest me and I don't see citations to some papers that I've found interesting.

It'll be interesting to see if the author can pull it together in an engaging way.

There is some effort in IEEE and ACM circles to pull together research like this as well. Worth trialing a membership to find out.

baxter001 · on July 10, 2020

This is a very heterogeneous text, pulling in the psychology of perception, economic theory, a description of how regressions work and static analysis of a whole host of programming languages, along with R snippets on how to perform the analysis.

Strikes me more like a manic episode than a useful text.

spekcular · on July 10, 2020

Yes. I read the entire thing once, about a year ago, when I was bored one night. I don't want to be overly critical of someone's book draft, so let me just say that:

A) I don't understand why an entire elementary statistics pseudo-textbook is bolted on at the end, forming the entire back half of the text, and

B) the title interested me because it promised concrete information that would improve my own software development, and while I found many things in the first half of the book, I didn't find this.

(To be more precise re: point A, I do understand, because the author offers his reasons. I just disagree strongly.)

Also, the margins are small to the point of being ridiculous, which made the work annoying to read.

jointpdf · on July 10, 2020

You could make your point a lot more constructively if you omitted the denigration of the author’s substantial effort and people who suffer genuine manic episodes (which tend not to be this pretty).

iratewizard · on July 10, 2020

It might come off as an attack on the author, but to me it's a succinct description. To the author, maybe it will be constructive criticism (if they are even reading). In general, people who set out to create something grand or complex fall short. Successful complex projects I've seen have all started off simple and grown complex slowly out of necessity.

7532yahoogmail · on July 10, 2020

Disagree. Somewhat strongly. The last thing we need is more nerds, in the pejorative sense, who can't see past their particular electrons in their favorite elements in the dirt. You gotta at some point lift your gaze. A major issue at offices for team- and division-level ROI and competitiveness is silos and the wrong assumption that everything is technical or should be. Anything to persuade the pendulum to the middle is a benefit. Anything that helps identify and compose parts into a better whole is good. Specialization and narrow focus are good but only when mediated and integrated. It takes both.

pieterk · on July 10, 2020

It hasn't had enough fruits and vegetables yet, baxter007.

doctor_eval · on July 10, 2020

It’s not a complete work yet.

FTA:

> The aim of only discussing a topic if public data is available, has been slightly bent in places (because I thought data would turn up, and it didn’t, or I wanted to connect two datasets, or I have not yet deleted what has been written).

> The outcome of these two aims is that the flow of discussion is very disjoint, even disconnected. Another reason might be that I have not yet figured out how to connect the material in a sensible way. I’m the first person to go through this exercise, so I have no idea where it’s going.

pieterk · on July 10, 2020

One can keep writing forever, but I think... If you can't capture it in 314 pages, you might be writing less of a universal book than intended.

65536 · on July 10, 2020

Writing a book is not just adding pages, it’s also revisiting and revising what you’ve already written.

So it’s entirely possible that at some point the author ends up capturing what they wanted to capture, and the total number of pages is then even lower than it is today.

akerro · on July 10, 2020

>If you can't capture it in 314 pages, you might be writing less of a universal book than intended.

I don't find this book to be universal; the topic is so wide that maybe 30 000 pages won't be enough. Some CS books are >1500 pages; look at the whole series "The Art of Computer Programming".

m0rc · on July 14, 2020

First of all, I think that the idea and the effort behind the book are fantastic and worth undertaking.

The issue for me is that the connection between the individual topics and their relevance for Software Engineering is not clear or present (I have no doubt that the author sees the relation as obvious).

I would have liked more elaboration of the relation of each topic (in a given section there can be several) with respect to SE, and its positive and negative implications for its practice.

In a similar vein, the following two sources are worth reading:

* Facts and Fallacies of Software Engineering. I'm typically surprised when some IT person tells me that he does not even know of its existence.

* IEEE Voice of Evidence articles. They are reviews of existing evidence on various topics.

cmehdy · on July 10, 2020

It seems like a daunting task to take on so much, so props for the work put into this. It does kind of read like a beta, with the exception of the bigger thread (which makes it feel a bit more like an alpha). If that's okay, I'd like to offer my thoughts on this.

What I'm reading from the chapters and the first few pages (dense in information!), is that there are initial assumptions and an over-arching structure that would benefit from being highlighted.

The undocumented assumptions are about both the audience and your own approach: Where are your target readers at, and who are they in broad terms? What prerequisites are necessary for this book to be manageable? Since it covers multiple fields, it would be possible to point to other sources (whether online or physical books or classes) that offer the fundamentals, up to perhaps even a parallel to what you're putting together.

The unannounced structure seems (as of right now) to be:

(1) Notes about stuff

(2) We're dealing with humans

(3) We're dealing with humans in a system

(4) Related and/or other relevant systems

(5) What do we do as an engineer in this mess?

(6-7-8) How does that practically unravel?

(9-10) Here's a crash course in mathematical stuff we'll need later

(11) Regression. A lot of it.

(12) Some stuff other than Regression to think about too

(13-14) Practical applications of the last 4 chapters we've been going through

(15) Let's introduce R now!

I think it would read better if there was some progression along those lines:

(1) Tells you what this book is at its core, then very briefly what's going to hit you, and which assumptions you will make and what prereqs your readers might need/rely on

(1-bis) Marked optional: Historical stuff, contextual stuff not directly related to the book's core but useful nonetheless. Perhaps the reader will want to take a look at that at their own leisure.

(2) Bring in the practical tool you have a preference for (R), since it's both useful for data manipulation/visualization and programming concepts. Nothing big, the learning can and will happen throughout the book.

Now you can leverage that tool to illustrate (and low-key practice) the concepts that you introduce about various fields. The next few chapters don't need to change too much but they can be made much more interactive by using R, jupyter notebooks, etc.

(3) Who do we build stuff for? (Humans)

(4) Humans don't live in a vacuum, so let's consider their system

(5) (Perhaps marked optional) Those systems coexist and interact with other systems. Why it's relevant to know about a few.

(6) What do we do as an engineer? (Build, design, creatively, etc)

(7) Basically your 6-7-8 with some fun R stuff

--- And I'd honestly end there, and separate the rest into its own book.

That book can be about introducing the maths, with more or less optional parts/prereqs, leveraging R, introducing the bits of data manipulation that are deemed useful, and so on.

thesz · on July 10, 2020

A quote from the book:

> Some languages support the creation of executable code at runtime, e.g., concatenating characters to build a sequence corresponding to an executable statement and then calling a function that interprets the string just as-if it appeared in a source file.

I took the time to search for any mention of higher-order functions and combinators (as in "parsing combinators"). There are none.

The book is way out of date right now, even before publishing.
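To make the contrast concrete, here is a minimal sketch in Python (my choice of language; the book's snippets are in R). It assumes nothing from the book beyond the quoted passage: the first function builds source text at runtime and evaluates it, while the other two get the same effect with a higher-order function and a simple combinator. All names (`make_adder_via_eval`, `make_adder`, `compose`) are hypothetical and exist only for illustration.

```python
# Runtime code creation: build source text as a string, then have the
# interpreter evaluate it as if it had appeared in a source file.
def make_adder_via_eval(n):
    source = "lambda x: x + " + str(n)  # concatenated at runtime
    return eval(source)


# Higher-order alternative: return a closure; no source text is involved.
def make_adder(n):
    def add(x):
        return x + n
    return add


# A simple combinator: build a new function out of two existing ones.
def compose(f, g):
    return lambda x: f(g(x))


add_two_eval = make_adder_via_eval(2)
add_two = make_adder(2)
add_four = compose(add_two, add_two)

print(add_two_eval(40), add_two(40), add_four(38))
```

All three calls yield the same number; the difference is that the closure and combinator versions never turn data into code, so there is no string for anyone to tamper with.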

disgruntledphd2 · on July 10, 2020

Note that he's requiring there to be published data/research before including something, so that may account for what you believe is missing.

thesz · on July 10, 2020

Should I do his work here then? Search for publications, etc.

kqr · on July 10, 2020

Publications with proper controlled experiments of adequate sample size. I'm fairly sure you'll find none.

thesz · on July 10, 2020

"Of adequate sample size," of course.

Let me quote the book again:

> A study by Iivonen analysed the defect detection performance of those involved in testing software at several companies. Table 2.8 shows the number of defects detected by six testers (all but the first column, show percentages), along with self-classification of seriousness, followed by the default status assigned by others.

Six testers. Six!

Here's a small study on the expressiveness of different programming languages: http://www.cs.stir.ac.uk/~kjt/techreps/pdf/TR141.pdf

Also contains six entries, if you include Haskell.

Expressiveness of a language is measured by the Halstead metric, which predicts the density of defects in a linear fashion: https://ieeexplore.ieee.org/document/8447959
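For anyone unfamiliar with the metric, a rough sketch of the classic Halstead definitions in Python. This is an assumption about which variant is meant; the linked papers may use different ones. Volume is program length times log2 of vocabulary, and Halstead's "delivered bugs" estimate is simply volume divided by 3000, which is where the linear relationship comes from. The function names and the counts in the example are made up for illustration.

```python
import math


def halstead_volume(n1, n2, N1, N2):
    """Classic Halstead volume.

    n1, n2: number of distinct operators and distinct operands
    N1, N2: total occurrences of operators and of operands
    """
    vocabulary = n1 + n2
    length = N1 + N2
    return length * math.log2(vocabulary)


def halstead_delivered_bugs(volume):
    """Halstead's delivered-bugs estimate: linear in volume."""
    return volume / 3000


# Hypothetical counts for a tiny function, not taken from any study.
volume = halstead_volume(n1=4, n2=4, N1=8, N2=8)
print(volume, halstead_delivered_bugs(volume))
```

A more expressive language needs fewer operators and operands for the same task, so its volume, and with it the estimate, goes down.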

The book cites neither Personal Software Process nor Team Software Process research. It also does not mention at all a very basic point about the cost of software development: the cost to fix a defect is proportional to the time between the defect's introduction and its discovery. I am sure there are plenty of publications with research results on this: https://www.researchgate.net/figure/Cost-of-Fixing-a-Defect-...

I have a strong feeling I am doing the job of the author here. It is very sad.

andrewcooke · on July 10, 2020

this is excellent, except that it doesn't read so well.

jonpurdy · on July 10, 2020

The topic is directly related to my interests as a PM but it goes way over my head quickly. Looking forward to seeing the iterations though!

pieterk · on July 10, 2020

just like good code, good text goes through several iterations, or beta versions.

Congratulations, Derek!