It is very interesting to see how such projects expose the weaknesses of certain Java idioms. The JavaBeans getters and setters mentioned here are an obsolete pattern; in most cases there's no good reason to keep them in the code. The Java ecosystem is probably the richest one in design patterns, some of which have acquired an "anti" prefix over the years (e.g. the self-contained singleton). At the same time, new languages explore and validate ideas and coding styles that might be worth borrowing and making mainstream. A porting exercise can be a source of new best practices, pushing us to rethink what is worth keeping and what is pointless overengineering.
I was struck by the lack of inheritance in Go, which to me is brilliant.
I've worked with C# for a decade, and I've yet to see a use of inheritance that wouldn't have been easier to maintain in the long run without it. We've limited our own usage to overriding methods in the standard library, but even then it's often used to implement things that are really just terrible practices, like adding search functions for AD extension fields or increasing the timeout in one of the older web clients.
Until 1.8, Java's composition-over-inheritance story was weak. 1.8 introduced default methods on interfaces, which finally allow shared functionality without inheritance. In some senses Java's patterns in this area have been reset, and many old code bases need to catch up (and probably never will).
Well, from where I sit, interfaces are another kind of inheritance - inheritance of API rather than implementation. But once you can have default methods, that to me looks exactly like inheritance from a base class, except that the base is called "interface" instead of "class".
It's a difference of inheriting data vs. functionality, so yes, the API is what's being inherited.
Looking at the history of inheritance in a few languages I know:
- C++: multiple inheritance leads to super confusing semantics, e.g. diamond inheritance, and initialization becomes a really complex thing to understand.
- Java: learning from the mistakes of C++, it decides multiple inheritance is bad and only allows inheriting data from one superclass, plus API (and, as of 1.8, functionality) from any number of interfaces.
- Rust: realizes inheritance of any data is bad, and only allows any number of traits (interfaces) to be implemented on types.
There's a difference, and ultimately it's that you're not carrying data via the inheritance chain, only functions. The benefit here being that it's much easier to reason about a type and what data it has that determines how it works in the broader system. This allows for very explicit inclusion of data from other types, as opposed to implicit inclusion with C++ and/or Java. In the end moving away from data inheritance leads to fewer mistakes and easier understanding of the code.
With Java 1.8 you can ask the same thing about Java and the answer would be the same.
I didn’t mean to suggest that it’s idiomatic in C++ to create strange inheritance graphs, anyone who’s done it once realizes it’s a bad pattern. The issue is more that the language -allows- it.
Composition over inheritance solves only certain language problems, where limitations of expressive power make the code less transparent or flexible, but it's not a solution for decomposition. The problems with properly used inheritance arise only in sufficiently old and evolved code, because a decomposition of one domain does not necessarily describe another (evolved) domain well. Programmers tend to think that every problem in the world should be addressed by better coding, so they try to change their coding practices. In fact, the solution here is managerial: throw away the old code and do the decomposition again, for the new circumstances and new requirements. OOD and inheritance work well, so let's not blame them for the use cases they cannot solve.
100% this. We're never going to build a perfect sand castle.
IMHO - There are many managerial shortcomings in software development that lead to abuse of legacy design. OOP by design allows you to abstract and ignore not only original implementation decisions but the reflections of those decisions in the architecture. Polymorphism makes sense when it's the classic ICat : IAnimal example, but so often it _becomes_ IHouseFly : IAnimal because all the contracts expect IAnimal and the deadline is.....tomorrow.
I personally don't have a good solution given the pragmatic counter-argument that the ROI on a system which is cheap to develop but is patched over 5 years may be equal to or cheaper than one that is expensive to develop and _still_ needs to be patched over its 5-year lifetime. Let's call this the used car problem. A new car is more reliable, but now that used cars are reliable _enough_, it's harder to convince people _not_ to gamble with a lower upfront cost.
I absolutely _love_ Rust, and the code I've written feels bulletproof. No idea how long it would take a team of my C# peers to be even 1/10th as productive in Rust/Go as they are in Visual Studio and C#.
Indeed! This has always mystified me. Why is it so easy to implement what is now considered an anti-pattern (inheritance) when it's so boilerplate-heavy and annoying to implement composition? Why does C# not have language support for delegating members? Why do I have to buy and use ReSharper just to generate all that boilerplate? It's a constant battle talking to "just get it done" developers about why inheritance is bad.
Inheritance is only considered an anti-pattern by some.
Since the 90's, any good CS book about OOP paradigms had discussions about is-a and has-a and how to make the best use of each, depending on the desired application architecture.
The thing is, such books usually aren't taught on bootcamps.
Inheritance isn't an anti-pattern in academia. I do part-time work as an examiner for CS students, and I see the car/animal examples everywhere in introductory courses.
When we have interns, they’ll sometimes build things with inheritance. So it’s certainly still a thing.
I've yet to see a real-world use of it where you wouldn't have been better off not using it, though. My real world is the relatively boring world of enterprise public sector software, however, and maybe I'm simply oblivious to where inheritance might be worthwhile.
That is the thing, the architect should have a proper understanding of is-a and has-a relations and apply them appropriately.
Initially, VBX only allowed for composition as well. COM introduced interface inheritance with delegation, for when one wanted to override a couple of methods but not the remaining ones.
And now UWP offers mechanisms to do implementation inheritance in COM, because everyone got tired of writing delegating code for is-a relations.
Inheritance and composition are both tools, it is up to each one to learn how to use them appropriately.
Traits are okay in principle but, as a Rust beginner, I find the huge number of trait implementations makes browsing the API docs overwhelming. For example, the list of implementations for String doesn't even fit on the screen at once. [1] It seems very repetitive, and I wish there were an easier way to get an overview of what functionality is available and find the right method to call.
It's not a great solution, but I've found it beneficial to lean heavily on the Rust docs search bar when looking for specific behavior. It's very fast, and fuzzy enough that it finds what I'm looking for most of the time.
I'm also a C# dev but with less experience. Could you elaborate? Does that mean you copy-paste shared code to all of the derived classes? I think I know why you say it's a maintenance nightmare: you start off with methods or properties that make sense in all of the derived classes, but over time the classes become less coherent and start to develop warts, and it would have been easier to modify the duplicated code.
I think you formulate it nicely. Sometimes the things it made sense to share stop making sense.
Sometimes that means rework; sometimes you end up with a parent method which is overridden in every child, possibly because each override was added over a long time period and no one bothered to look outside the child.
Inheritance makes sense with interfaces in C# but for the rest I think composition is just a better way of sharing.
Watch out for the hype that Go's authors and supporters put forward without properly backing it up; there's quite a bit of it.
There are other languages that have better support for composition, while still maintaining support for inheritance (e.g. Kotlin). Inheritance has its uses, as evident by the fact that Rust is considering adding support for inheritance (e.g. writing GUIs).
I’m not sure what you mean by “rust is considering adding support for inheritance”; as far as I know that’s not true. Did I forget something? Do you have a link to the proposal?
I came across it a while ago, I think it was mentioned by pcwalton or someone if I'm not mistaken, specifically citing writing GUI code as a rationale. That being said, Rust (and Java, Kotlin, Scala, C#, etc) have default interface method implementations which might alleviate some of the need for full blown inheritance, unlike golang's interfaces.
Yes indeed. Porting code is also a great way to learn a language, and you learn a lot about idioms and patterns on both sides. Go innovated quite a bit in terms of simplicity. However, they also did a few things that they are now trying to fix, like the lack of generics and exceptions. The lack of overloading is something I run into in other languages as well. E.g. in JavaScript you often end up with these type checks to sort of fake things. These often get added defensively, because the lack of typing just causes people to call things with the wrong types, and it is common for that kind of code to go unnoticed (I fixed a few such issues when introducing TypeScript recently).
A lot of stuff in Java is just because (by now) it has a lot of history and baggage. JavaBeans were kind of cool back in the late nineties but mostly it's a really annoying convention by now. The notion of accessors is not obsolete but unfortunately was not part of the language design when they created Java. Since Java has reflection, they came up with naming conventions that allow code to inspect object instances and do things with properties based on naming conventions. This has been key to a lot of the Java enterprise stuff that happened after this. And it also facilitated creating UI builders that came with IDEs like jbuilder, netbeans, visual cafe, etc. when having a UI builder was still a base requirement for an IDE. Eclipse sort of broke that tradition by not including one (initially).
These days UI builders are very uncommon, which makes JavaScript frontend work particularly tedious and repetitive. Because there are no conventions, no typing, and quite often no meaningful test coverage in JS, it is basically impossible to deal with that in a UI tool. So accessors matter and are not obsolete at all.
Most modern languages have more elegant ways to do essentially the same type of accessor mechanisms in the language. And also languages like Smalltalk had this (as well as UI building tools, refactoring, and a lot of other stuff that is still science fiction in the javascript world).
If you look at Kotlin, they fixed this while retaining the ability to expose code back to Java.
For example, Kotlin has properties with accessors that you can optionally override. Normally you just type val foo = "bar" and you have a property with an inferred type of String. The setters/getters are generated under the hood and used automatically when you assign or read the variable. If you want, you can customise the accessors or use something called delegated properties that e.g. turn a property getter into a function or use lazy initialization. Once compiled, a Java class that uses that code would see the normal setters and getters as if it were a normal Java class. Likewise, when accessing Java code from Kotlin you use Java properties as if they were normal Kotlin properties (i.e. without using setFoo(foo)/getFoo()). This makes Kotlin a really nice way of using legacy Java code.
I've been diving deep into Kotlin lately, and I am now convinced that Kotlin is the next evolution for Java codebases and the JVM. It fixes a lot of the quirks in Java, with great interop support, so we can still use all the JVM/Java knowledge, libraries, and performance in a more modern, more ergonomic, and safer language.
I agree. I wonder how well the converter in IntelliJ works. I'm using Kotlin a bit in my spare time in a Maven sub-module of a project, but used it from scratch there (because of the great interoperability with existing Java code). I'd love to switch to it in other modules as well (still Java code), but I wonder if it generates idiomatic Kotlin code out of the box; maybe not. That said, I'm a Kotlin beginner, but it seems everything from Effective Java is so much easier with Kotlin, and the defaults are simply better.
Yes, the converter gets you eighty percent of the way there. Usually it is a bit too conservative and e.g. biased towards making all types nullable, because it is hard to reason about nullability in Java. In the same way it attempts to turn any method starting with the prefix get into a property, which does not always make sense when it is clearly a bit of business logic instead of a property. Finally, it seems to slap a lot of redundant generic type hints on things, which don't always work. As soon as you remove them, Kotlin's type inference usually gets it. So I typically spend a minute cleaning up after converting a file.
After that you get to the idiomatic stuff: making properties read-only, getting rid of multiple constructors by introducing default parameters, getting rid of the builder pattern (mostly redundant in Kotlin), introducing data classes where that makes sense, using lateinit vars to make nullable vals non-nullable, etc. Technically, at that point you are improving things.
Lombok is syntactic sugar for bad design decisions: an operation like setName should not exist - either there's additional semantics deserving a method with a descriptive name, or the class is just a structure with writable state. I also agree with the comment about unnecessary magic.
huge? pretty trivial.. the point of a javabean is you don't need to understand anything. As for annotations, it's not like developers don't need to know a crapton of other annotations as well. It's just the way it is, so a few more aren't a big deal. If you need to debug java beans, you are doing something wrong.
You can also delombok if you ever get tired of it, and now you're back to manual.
I personally don't use Lombok, as I'm not offended by the verbosity, given that IDEs have done all the work since forever. But if it's something you are bothered by, well, there's that.
If an argument is that javabeans are the wrong pattern to use, that's a different argument, and unrelated to lombok.
The article is more about the process than the motivation for this port. What problem with Java/JVM caused the organization to commission this expensive porting exercise? And what benefits have they achieved after the port?
Bonus questions: do the other libraries stem from the Java root client too, or did they evolve separately (even out of house)? Why did you choose Java to port from? Are there interesting lessons in the commonalities/differences between the language implementations?
RavenDB is written in C#. At first it was Windows-only, but it is now a cross-platform database engine (Windows / Linux / macOS).
As a result, the original and most featureful client is for C# / .NET.
The Java client is a port of the C# client, done in-house by Hibernating Rhinos (the creators of RavenDB).
As far as I can tell, other clients (Python / Node.js and the Go client that I wrote) were contracted to outside people.
The company suggested starting from the Java code base. It makes sense because the C# client heavily uses LINQ, which is unique to C# (neither Java nor Go has LINQ-like capabilities).
I didn't dig much into non-Java clients so can't speak much to that.
Overall, I was surprised how closely I was able to make the Go code match the Java code.
Changing from exceptions to errors was pervasive but a simple, mechanical transformation.
Porting functions using generics was the biggest hurdle.
Porting functions that use overloading was easy but annoying.
That being said, a Java code base that heavily used virtual functions and deep inheritance hierarchies would be more challenging to port to Go. Lucky for me, this one didn't.
Author states: "I was contracted to port a large Java code base to Go". That makes me think his focus is on earning the money, rather than questioning the motives of his client :)
The fact that he tried to port a lot of the Java code directly to Go, even where Go had no suitable comparable feature. With generics, he mentions there are two different approaches in Go, but then only mentions the one he calls the least preferred, i.e. using interface{}.
But he never talks about the motivation of the original code, like why it was generic in Java. Sometimes generic code in other languages can be ported to Go as two or three concrete functions, because the code wasn't used as generically as the designer may have envisioned, or it just became a nail for their hammer.
Like was the inheritance really important, or could it be implemented another way in Go?
But I guess if you're not questioning your client's motives, you may not be fully questioning how something ought to work, and instead just ensure it works as it does right now.
It's possible they just needed a Go-library for their client code, and then the client code can be tweaked later on to be more idiomatic Go.
With "porting" I guess he doesn't mean replacing one codebase with another one, but creating another client in a different language. In fact, he states
The objective of the port was to keep it as close as possible to Java code base, as it needs to be kept in sync with Java changes in the future
so it doesn't look like they are ditching their Java client.
It's a client library, by the looks of it. So presumably they have a Go codebase building up and needed to be able to access RavenDB from there. Caveat: I haven't finished the article, and I'm not familiar with either the RavenDB or Go ecosystems.
It's interesting that the resulting Go code has 43k lines of code, while the Python client for Raven only has 6k lines.
I don't know whether they have equivalent feature sets, but I kind of wonder how it would have turned out if the Go port had been based on the Python version.
Python is a much more concise language. There are no brackets/parentheses to surround blocks; that's 10% to 20% fewer lines just for that. A truckload of data classes and static declarations don't need to exist in Python because of its very dynamic nature. Last but not least, Python idioms like list comprehensions and single-line if/else can replace a whole code block from another language.
Expect a python program to be half or a third of the size of java/c#/go.
As with all languages, the implementer matters. There's a remarkable amount of variance that I've encountered in the terseness of both Go and Java code.
For example, Go's err != nil pattern is often cited as being ugly, but good go code will often remove errors by design.
This pattern is known in numerical computing as NaN. Its drawback is that when the computation produces NaN, one has no idea what triggered it. But that can be mitigated if the program prints a stack trace on the first NaN. In the case of Go, that corresponds to logging the error when it happens.
Another thing is that in many cases there is no good sentinel value to return that naturally leads to an exit from loops, so you end up with complex logic to check for errors at the end of a function.
Go error handling is a little more verbose, although I find it more readable and consistent than checked exceptions in Java, which everyone seems to find a way to abuse or ignore (wrap into a catch-all).
Go codebases do tend to be a little shorter due to lack of getters/setters... and generics are not used that often in production codebases anyway, relative to all the rest of typical code (that’s procedural anyway).
what do generics have to do with making it longer?
If you want to argue 'usually' then you could consider all the build scripts and XML files, class boilerplate and exception code of Java to be 'usually longer'.
I’m sure that’s a good chunk of it. Go also lacks list comprehensions, so you have a for loop instead. Go has more boilerplate, but not more complexity.
I personally have ported about 20K LoC from a java application to go; the majority of the code was from an underlying library used to model the data structures being used. It was mostly a class-by-class, function-by-function process of porting the base data structure and then all of its implementations, and culminated with the port of the services which uses all of it.
Regarding the author's frustration with moving POJOs over (and their variable declarations), I used Sublime Text to select all variables based on a shared token, then cut them and moved them over word by word. You can then lowercase the first letter of every word, and then find-and-replace by type using a shared token. I found this method very quick and effective.
Huh? As a package maintainer for a lot of Go stuff, I've had to deal with tricky cyclic dependencies several times, especially in Google's own Go packages like golang.org/x/build and Google Cloud.
Go expects to have all dependencies available at any time, downloading them from Internet if necessary. Cyclic dependencies are not a problem in that case because all dependencies are built together.
Packaging for a distro requires building Go packages isolated from the network in a chroot, with all the dependencies previously built in the same manner. So it is an iterative process, building block by building block. If you have a cyclic dependency, you can't build iteratively, and you'll have to excise certain parts of the code to eliminate the cycle.
Thanks for sharing. I understood the contract was about replicating as much as possible the same structure and workflow as the original Java code, but if you had to develop the same client in Go from scratch, what would be the main differences in your approach, and how would the Go code benefit?
Some of these examples were obviously cherry-picked for one side or the other. Comparing against getters/setters, which are written once (generally automatically by the IDE) and never looked at again, while omitting the ubiquitous v, err := f(); if err != nil rote, seems a bit disingenuous.
It's just that it's nice to be able to track code coverage over time, which is what codecov.io provides.
Also, running all tests (to get full code coverage) takes 20 minutes and makes the fan on my laptop unhappy.
The way it works is that on every checkin the CI job runs all the tests with code coverage enabled and uploads the results to codecov.
Codecov can then plot coverage over time.
My gripe is with inaccurate accounting of non-code lines (like comments or struct definitions) by Codecov. Go's tool to visualize this counts them properly. I don't know if it's Codecov or maybe I'm not sending the data properly.
> Codecov is barely adequate. For Go, they count non-code lines (comments etc.) as not executed. It's impossible to get 100% code coverage as reported by the tool.
I have an open source Go project that has some comments in methods, but still achieves 100% coverage using Codecov.io — I'm not sure what I do differently to yourself? (Perhaps I'm not using any inline struct definitions?)
I don't know, maybe I'm not counting things right, but for example https://codecov.io/gh/ravendb/ravendb-go-client/src/master/d... shows less than 80% coverage, and there are only 2 lines not executed out of at least 18, which should be at most ~10% counted as not covered.
In the example you link, there are 18 coloured lines, 4 of which are not green: 4/18 = ~0.22 = ~22%. This tallies up as expected with the 77.78% coverage shown at the top of the file.
Codecov* doesn't count an 'if' statement as having full coverage unless one tests both outcomes: so the yellow lines here have been executed, but do not count towards your coverage score.
Granted, one could argue that that's not very generous! But on the other hand: those yellow lines have not been fully tested, despite being executed, so I can understand their decision.
In the linked code, just implement a couple of simple tests to test for the expected error conditions: it's easy (here at least) and ensures the code behaves as expected. (Obviously not all partial/no coverage lines will be so easy to hit with tests, it might not always be possible to easily get 100% coverage, but hey: start with the low hanging fruit!)
* I say Codecov here, but I highly suspect that they may simply be using Go's coverage reports under the hood?
> Today, if I was setting this up from scratch, I would stick with just AppVeyor, as it can now do both Linux and Windows testing, and the future of TravisCI is murky after it was acquired by a private equity firm that reportedly fired the original dev team.
I would rather go with CirrusCI which has windows, macos, linux and freebsd support, is much faster and is easier to work with.
AppVeyor is the slowest, and Travis has the worst features.
I’d say that’s not nearly enough time to write 50k loc from scratch! (Unless I’m horribly slow and no one is telling me.) Porting is a very different kind of work from feature development, since you already know the inputs and outputs of your system.
I like the Go language for all of its efficiency and easy concurrency, but part of me can't help but think that they were dying to make some new syntax just for the sake of it.
That's how you define a method in Go. The local to the left of the function name is the receiver (what `this` would be in Java).
Note that there are a few odd bits to Go methods: the receiver can be a value or a pointer, and if the receiver is a pointer it might be nil, because method calls on concrete types are statically dispatched. So:
package main

import "fmt"

type S struct{}

func (self *S) Foo() {
    fmt.Printf("%v\n", self) // prints "<nil>" here: Foo never dereferences self, so no panic
}

func main() {
    var s *S // nil pointer
    s.Foo()  // legal: dispatch is static and nil is simply passed as the receiver
}
In Go, the type follows the identifier. To use a single letter or short sequence of letters is idiomatic Go for short-lived identifiers or ‘quintessential’ identifiers.
q is of type Query, so in the function you can access the fields on q (the value you call the function on) via q.FieldName. If you use a pointer receiver (q *Query), as explained here (https://tour.golang.org/methods/4), you can directly manipulate the object rather than a copy, as in this example.
I wonder if it would have been easier (in terms of complexity, not hours spent coding) to rebuild the project in Go (in 100% idiomatic Go) and port it to Java afterward. Sounds weird, but since Go has such a strong focus on simplicity and Java probably implements most things Go uses, a port from Go to Java is probably a lot easier.
I'm not sure a tool could do 100% translation but I'm sure it could do a lot.
A surprisingly large amount of time was just moving the order of variable declaration from Java's "type name" to Go's "name type" and renaming, say, "String" to "string".
If a tool did that for me, it would save a ton of time.
Unfortunately, the upfront time investment to learn enough to write even the simplest translator would probably be greater than time saved on one project.
He knows it, that's why he mentions lack of null (nil) in Strings in Go as an issue when porting the code.
But depending on how the Java client uses null, "" can do just as well. It's not like you have many other options (except to add your own composite struct on top of String, or to use a guard value that's still a string).
I'm wondering if the goal of this porting project would've been achieved by Java's newly gained native/AOT compilation feature. Probably not, as a huge Java .so brings the JVM's infrastructure for gc etc., and too much overhead. Still, I hope we can get rid of the language-centric ecosystems we have in favour of established OS/POSIX ways for shared libs, or an evolution thereof.
I understand the goal, and my comment wasn't so much arguing in favour of one language vs another, but rather that developing the same thing for multiple languages over and over seems wasteful.
In those client things the amount of the code that can be moved into a cross-language shared library might be lower than expected. E.g. the wire protocol and socket handling might be an option - but it's often not that big.
The bigger part might be the transformation of the programming language's types into something useful for the client (e.g. through serialization), which has to be redone for each language anyway. And after that, the question comes up whether sharing the remaining things yields enough benefits to justify the hassle of having a dependency which is less portable, requires another build system, etc.
That's my general experience with those kinds of projects - I don't know enough about RavenDB in particular to tell if it's the same here.
Looking at the way the wind blows, it's rather towards more language-centric ecosystems than away from them.
That said, I'm still waiting for the language ecosystems to start standardizing how they interact with each other. That is a very hard nut to crack, but it would also be a quantum leap forward.
Yes. The JVM is Java's biggest downside, so why would you want to just move to a different language on the same overly complex (to put it mildly) runtime.
I'd argue it's one of Java's strengths: it's fast, and other than increasing the max heap size, tuning is only necessary for the hardcore FAANG-scale users.
What's complex about installing openjdk-<version>-jdk from your package manager? Your app normally bundles library dependencies, so no virtual environment fuckery like Python, Ruby, et al.
In order to understand the issues you might have with an application, you need to have a decent understanding of its runtime environment, from the hardware on up. The JVM adds a huge new runtime, and you need to understand how it works and its possible failure points in order to effectively develop software and diagnose issues. So, IMO, it goes against the basic precept of software engineering: taking a problem, breaking it down, and simplifying it in order to develop well-understood, maintainable systems. You don't simplify something by adding complexity.
> You don't simplify something by adding complexity.
Yes, you do, it's called abstraction.
For example, a C compiler adds complexity compared with an assembler, but that lets you significantly simplify your code.
And without abstraction, modern software development would not be possible.
Of course too much abstraction can be a problem (which is why some people still program in assembly), but that does not mean JVM will always be the wrong solution.
It's a matter of delegating responsibility. The stack is too deep and too complex for any one person to understand. I have a good understanding of the Java language and the JVM. At some point I have to trust the JVM engineers to get things right on all the platforms the JVM runs on.
I have a rudimentary understanding of the Linux and BSD kernels and how they can impact certain parts of my applications. But I have zero knowledge of Windows. But I don't need to, because the JVM engineers do, and things Just Work (TM).
I'm not smart enough to learn the intricacies of every platform, and thankfully I don't need to be.
I guess you have similar issues with POSIX and the abstractions offered by high level programming languages? I assume you write some CPU specific assembly and understand all the possible failure modes of that hardware?
It's a client library for a database server. They were contracted by RavenDB to build a Go client library, the existing Java client is not a bad starting point for that as the languages are fairly similar.