I've done some ultra-fast optimizations of algorithms in C# before. Stack allocation and unmanaged pointers were essential for this. It seems that Span<T> is making this approach, in a sense, more accessible to more developers.
This makes my optimization skills less valuable as more developers will know how to do this ;)
However, this will greatly enhance awareness of C# as a language that can be used for reasonably high-performance code. Most people don't seem to understand how fast C# can be with good optimization techniques. This will help increase adoption of C# in the developer community.
OTOH, I'd say this gives you another tool for your optimization toolbox, and one that you can market as "this thing I was doing before with unmanaged pointers can now be done in a safer way" (and call your previous clients to sell them the new-and-shiny-good-way-to-do-it).
Yeah, if Linq optimization work has taught me anything, `Span` will get lumped into "weird advanced things" by most developers and there will still be a role for developers that understand how to use it.
(Aside: I still can't believe the number of developers I meet that seem to think of basic Linq concepts like `ToLookup` as "weird advanced things", which leads directly to long rants about `ToList` and why I consider it harmful.)
Yes and no. You're right on the money about Span being an advanced construct. However, before Span, stack allocated memory was essentially C programming in the CLR, and most of the benefits of the CLR (cross architecture compatibility, for one) therefore go out the window, as you have dependencies on raw memory layout and stack space (determined by the CPU architecture and OS) and things like memcpy, which depend on binaries from the C runtime. If Span does not compromise these benefits, then the developer no longer has to think about those things.
There will be a gain in ease of use. Granted, average developers aren't going to care about this (nor will they need to). But you should expect to see an increase in performance of many libraries, as most open source projects generally won't use unsafe code in order to increase performance.
C# was mostly irrelevant to me for a long time while Microsoft's implementation was closed, but there are some neat things about it. They've done a lot of interesting stuff in the language since it first came out, including pragmatic sugar-y stuff like type inference (`var`), async/await, and recently some moves towards more functional-style pattern matching, though they're not totally there yet (https://www.kenneth-truyers.net/2016/01/20/new-features-in-c... discusses proposals, some of which didn't make C# 7). Interfaces and value types also seemed like important things to have early, and there's some other handy looking stuff like the SustainedLowLatency GC mode (defer full-heap compactions as long as it can).
Can be tempting to think of it as a Java clone because of its early history and the shared general category (OO-focused GC'd imperative statically typed language whose first major impl was bytecode/VM-based), but there's signs of more to it than that.
I feel as though, with C#, Microsoft's brilliance was to get a number of very clever people in a room together (Anders Hejlsberg, Eric Lippert, etc).
Somebody at Microsoft said to themselves, these guys are clearly brilliant, let's hire them all, task them with creating a new object oriented programming language, pay them a tonne of money, and see what happens.
The problem with C# for me is the fragmented ecosystem that's hard to navigate, coupled with dense and hard to find documentation that seems to be spread out over 14 different msdn domains and GitHub projects.
CoreFX? CoreCLR? No idea what these are or what they're for.
And don't get me started on the dumpster fire that is roslyn. I've tried no less than 10 times over the last few months to get "intellisense" working in emacs via roslyn and it has never worked.
C#'s a cool language that I want to like but it's just so much work to get started with.
I used to use omnisharp with atom and frankly it was just rubbish. It crashed all the time and required constant restarts, and (still) takes a massive amount of time (and memory) to start.
I went back to using golang, and didn’t regret it for a second. The tooling in go is excellent and you can use a simple editor for many things. That’s really compelling for people.
So... to be fair, if you were to idly pick up .net core as a linux user, and expect vim or emacs to do the job, you’ll find it a terrible experience.
...but, with the right tooling, it can be really great. vs, vscode or rider (not free) make c# a pleasure to work with.
The dotnet cli tool is still quite terrible (how do you install dotnet watch again? why does it take 25s to restart a simple web app? You think git has obscure inconsistent incomprehensible command line flags...? ha! remind me why ‘dotnet build’ is running like 50 shell commands from the .csproj file...but only in this configuration you didn’t specify explicitly? surprise!) ...but it's usable, and it's slowly improving.
I feel like a lot of C# enthusiasts shoot down complaints as people being rubbish, or trolling or ‘doing it wrong’; but the reality is c# on a platform other than windows is new, and still pretty raw.
Pointing people in the right direction (you do need a proper IDE) and acknowledging that doesn’t suit everyone is better than telling them they’re ‘doing it wrong’; that's just jerk behaviour and comes off as fanboiism.
Let me know when Go has something comparable to Windows Forms, WPF, UWP, Xamarin.Forms, Avalonia.
Oh, and when I have to use Go, I find it a chore to write generic code, as if I was back in 1992 using Borland C++ 2.0 for MS-DOS with pre-processor based code generation for BIDS, or still using Java 1.4 (EOL in 2008).
When was the last time you wrote a micro service that compiled to a single binary and interop'd seamlessly using grpc in C#?
I'm going to guess the answer is never, because it's basically impossible to do.
Do you care? I doubt it, but some people really do.
It's just different use cases for different things. Sure, go has its problems, but needing a large heavy IDE isn't one of them. That's really important to some people.
/shrug
> Oh, and when I have to use Go, I find a chore to write generic code...
Oh man, don't even start. If you don't like go, don't use it. It was just one example of tooling that's better than C#. Rust, node, clojure, heck, even python has a better story for 'pick up and start using' than C# does at the moment.
Do we really need to start digging down into how fundamentally terrible nuget is as a package manager, and how I personally find it like installing Master of Magic using 3 1/2 inch disks one by one when it screws up, or the feed screws up (like it did TODAY for about 4 hours)?
nah. No one cares.
Let's just leave it at: There are some valid concerns people have with the C# ecosystem, and using C# on non-windows platforms. It's a work in progress.
At least we now have a coherent direction for everything going forward~
Nuget is such a pile of rubbish it's not even worth arguing about; if that's the best package manager you've ever used, you should really go check https://github.com/fsprojects/Paket out.
(they also helpfully articulate why nuget isn't really very good)
The irony is that the folk from go-world have finally acknowledged the package management solution they have is really terrible, and they're building a lovely new one (https://github.com/golang/dep).
Maybe sometime in the future you'll be eating your words, when go has a lovely package manager.
There are many more names for things in C# than what you really need to be aware of as a C# developer. Want to write C# that targets Linux? Download the .NET Core SDK. Simple as. Everything else: .NET Standard, .NET Framework, are just names used by the .NET contributors to keep things straight in their head, but you honestly rarely deal with that stuff if you’re starting fresh with .NET Core.
If you’re trying to write C# in emacs you’re doing it wrong. C# has probably the best tooling in the world if you embrace it.
C# has a decade and a half of hyper-fast Microsoft-funded idea churn behind it, so I agree that it can be super confusing. It's very hard to distill it down to the "modern" stack when every Google hit gives you piles of results that are painfully out of date.
I love c#, but I've been using it since '05. I don't envy newcomers.
What is it about using C# w/ emacs that is wrong? Compared to, say, development of Java or python code on emacs? Although it doesn't have all the tooling present in, say, VS, it is still manageable, with refactoring, autocompletion and some more functionality working already.
I am the current maintainer of omnisharp-emacs and would love to find out what the most pressing problems are that people encounter using C# on emacs via omnisharp - https://github.com/OmniSharp/omnisharp-emacs
I've never been one of those guys who gets super into his editor, as I don't see it as being the driving force in productivity or efficiency. VS Code or VS Community are my go-to, because they have all the features that are possible. I'm not being pulled away to find a solution to a solved problem: it's there, and it's solved.
Ah, .NET debugging is not implemented in omnisharp-emacs as debugger service is not provided by the underlying omnisharp-roslyn project (the parent language server project).
So you are expected to download VS Code, use sdb or something else if you want to have step-by-step debugger...
It probably seems fragmented because it's easy to get things confused, and Microsoft isn't making life easy during their journey in open sourcing the whole stack (or at least significant portions of it).
There's the languages (C#, F#), there's the runtime (CLR - Common Language Runtime), and then various frameworks (including the base framework, and the ASP.NET framework). Then there's the difference between the original variants of these and the new 'Core' variants.
Plus there's various other related bits like the .NET Standard projects which are meant for library developers to target to make library cross-platform portability easier.
As for difficulty getting intellisense working via emacs - that's certainly been something demoed, but it's not really a well tested use-case at this time.
If you're coming at it from a non traditional (Windows) developer perspective - then you're looking at a massive ecosystem in flux and in migration from being a primarily single platform to multiplatform.
The fragmentation was really what allowed the new breakneck pace of development. I’m a c# developer full time since 1.0 and I have huge problems following even though I’m 100% on the “old” (desktop) frameworks.
I’m hoping that things will slow down and converge again in a while.
Sure.
Sharing mostly the same view of this (hi)story... but: did those guys leave? lost their muses? What happened besides maturing and "standards"-lockdown... outside of some smaller bubbles and marketing gigs the whole eco-system seems stagnant and mostly irrelevant to today's internet/web development... despite the so obvious interest of the company.
What happened? What did not happen? (besides mono and unity)
C# is superior to Java in every way I can think of. Just a couple of examples: unsigned types, simple value types without boxing, and a real generics implementation (java generics are truly awful). The reflection API in c# is also far more useful and easy to understand.
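To make the unsigned-types point concrete from the Java side: there is no `uint`, so the usual workaround since Java 8 is the static helpers on `Integer`, `Long` and `Byte` that reinterpret the same bits as unsigned. A rough sketch (the class and method names here are made up for illustration):

```java
final class UnsignedSketch {
    // Reads the 32 bits of a signed int as an unsigned value, widened to long.
    static long asUnsigned(int bits) {
        return Integer.toUnsignedLong(bits);
    }

    // Unsigned "less than": a plain < would treat 0xFFFFFFFF as -1.
    static boolean unsignedLess(int a, int b) {
        return Integer.compareUnsigned(a, b) < 0;
    }

    public static void main(String[] args) {
        int allOnes = 0xFFFFFFFF; // -1 when read as a signed int
        System.out.println(asUnsigned(allOnes));                // 4294967295
        System.out.println(unsignedLess(1, allOnes));           // true
        System.out.println(Integer.divideUnsigned(allOnes, 2)); // 2147483647
        System.out.println(Byte.toUnsignedInt((byte) 0xFF));    // 255
    }
}
```

It works, but the type system doesn't help you: nothing stops you from mixing a "really unsigned" int with ordinary signed arithmetic.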
As a language, yes, C# is probably better. As a C# user, let me offer Java a pithy defense, not as a contradiction but as additional information.
On the implementation side, Hotspot blows the CLR out of the water on performance. All things being equal that is, not handicapping the Java code, bottlenecks "for realism" and other benchmark-defeating tricks.
Java is the superior serverside solution. Other than Hotspot performance, the other big selling point is Java's vastly larger opensource library selection.
I like both. I like the fat OOP space. Maintenance reasons is a huge one, I don't want to get burned having to maintain some fly-by-night's code in 10 years from a Frankenstein's creation of random JS libraries. Both Java and C# are industrial-strength designs, with good tooling and a swath of devs.
Between the two, I'd still reach for C# myself, because at this point you can have an almost entirely C# codebase from the server to the client. It's good enough in the performance and library aspects that Java holds over its head. I'm expecting the C# package to be quite complete once it adds compiling to wasm to its list. But if I made a career switch, it would be from C# to Java, and nothing else. I admire Rust and (especially) Elixir but no jobs in my area for either of those.
Java is faster for high-level code. The moment you start using structs and stackalloc and whatnot, Hotspot cannot really keep up.
Even for high-level code, it depends on the kind. E.g. Java generic collections have to box value types; I'm not sure if Hotspot even tries to avoid it - I think it does these days - but this kind of analysis is necessarily limited in scope. In C#, value type generics are fully reified, even on JIT level - every combination of type parameters gets its own JIT-compiled implementation.
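For anyone who wants to see the Java half of that, here is a small sketch (names are hypothetical) showing that a `List<Integer>` and a `List<String>` share a single runtime class after erasure, and that what goes into the list is a boxed `java.lang.Integer` rather than a raw `int`:

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureSketch {
    // After erasure both lists have the same runtime class,
    // so one compiled method body serves every type argument.
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        return ints.getClass() == strings.getClass();
    }

    // The int is autoboxed by add(); the list stores an Integer reference.
    static Object firstElementOf(int value) {
        List<Integer> ints = new ArrayList<>();
        ints.add(value);
        List<?> untyped = ints;
        return untyped.get(0);
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());                        // true
        System.out.println(firstElementOf(42).getClass().getName());   // java.lang.Integer
    }
}
```

Whether the JIT later removes some of that boxing is a separate question; this only shows what the language and bytecode level look like.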
I didn't know that. That's beneficial for me to know and makes sense.
I recently made my career change to become a C# developer so I did a good indepth review on the overall landscape. Came down to Java and C# as my best bets, C# won because it fits a lot of my work history and I've heard multiple people that I trust say they just can't find .Net developers. To me that's a feature. I really like it and don't understand all the hatred and fashion-oriented programming trends going on. Honestly, I'm more about maintenance and all the best tooling to help me get complex tasks completed without issue. Of course, maintaining employment is important too, has to be jobs around. I spoke with a few companies, one is switching from a .Net stack to Node and I think that's insane. I'm not sure why you'd actually willingly accept a downgrade like that. Maybe if you're starting from scratch, sure do what you will, but overall I expect the "app" side to JS to dwindle away as wasm takes over. I intend to build most of my webpages leveraging as much of the native browser functionality as possible, leaving apps to Xamarin and wasm for native code solutions.
That is just temporary glitch, until Java gets value types.
The biggest problem is that no-one wants to do a Python3 on the pile of Java code written in the last 20 years, so of course they have been taking baby steps, which are starting to see the light now post-Java 9.
And they will come, because Java is feeling the pressure on Fintech from companies that want to move away from C++, but still feel some pain, ergo Pony.
Java also needs such features for Project Metropolis, the JVM can only be successfully rewritten in Java, if there are no performance regressions.
Also, IBM and Azul JVMs do have language extensions for value types.
Well, it's been a temporary glitch for, what, 13 years now?
Yes, Java will fix this eventually (although, doing this for generic collections would also require reified generics, no?). But, of course, C# will also advance further in that time.
The general problem, I think, is that the process that Java uses for language evolution is deliberately slower than C#, so C# can bring new features faster. There's a trade-off there in that there's also less breakage, and acquired Java skills take longer to become obsolete. Which probably has something to do with why Java is still very prominent in the enterprise, and in industries like banking.
Well, C# did not change hands in a company acquisition whose new owner lost sight of how to move the language forward.
Even C# moves slowly in practice: these lovely C# 7.2 features that I can already use on my private projects will take years to be allowed at my typical set of customers, the enterprise.
I just helped one transition to .NET 4.6.1.
Likewise another one just moved into Java 8 this year.
Worse is the Android situation, regardless of the features Java might get, even feature parity with C#, Google will cherry pick only the ones they care about.
The nice thing about many new C# features is that they are still supported on older runtimes. So you can ship stuff that runs against 4.5 or even 4.0, while using e.g. the new property syntax, or inline out, or even pattern matching (for your own classes). For something like tuples that needs library support, they now provide the requisite bits as NuGet packages - if I remember correctly, the one for tuples is also 4.0 up.
Yeah, but that is a side effect of the CLR original target, as a means to support any language, including C like ones.
Which is ironic, given that nowadays the JVM gets more languages, with compiler backends pretending to be Java-like, while .NET SDK 1.0 even had multiple examples of programming languages.
Anonymous inner classes are more powerful than c# anonymous types. I think c# wins out overall, but there are a few corners where Java has features c# is missing, coming from the fact that c# was a bit more pragmatic where Java was a bit more oop-purist in its origins.
But, they're equivalently powerful with respect to an API.
In Java, a method might expect one instance of a class, to which you can pass an instance of anonymous class, whereas in C# a method might expect a set of delegates, to which you can pass a set of anonymous delegates or lambda expressions.
However, thanks to C#'s type inference, it actually ends up being far less verbose, yet more flexible, because anonymous delegates and lambdas can close over local variables. So, strictly speaking, they're more powerful than anonymous inner classes.
On the other hand, as someone that greatly admires Smalltalk, I sometimes envy Java's slightly better purity.
What are the features that c# is missing?
As the other comment explained delegates are the equivalent of anonymous inner classes and are far better.
Checked exceptions are really a nuisance, I would call them a negative feature.
Java enums in c# are basically classes with readonly properties if you need to pass constructor arguments.
Streams in Java are a huge clusterfuck.
I can't think of anything that Java has that is better than c#.
Delegates are not the equivalent of anonymous inner classes. They're commonly used for similar design patterns (observer and other callbacks), yes, but they're not the same thing. To be more specific, an anonymous delegate / lambda can be thought of as an implementation of an interface with a single method. But with Java anonymous inner classes, you can implement multiple methods at once.
The reason Java enums are better is that they're actually constrained to the domain you specify. In C#, the range of values for any enum is the same as its underlying integer type - it's just that some of those values have names, while others don't. But it's always legal to cast (and few people know this, but 0 is always a valid enum value that you don't need to cast) - so any method you write that accepts an enum value has to validate it. In Java, since enum is basically a final class with a bunch of private singletons, you are guaranteed that the reference you get points to one of those.
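A quick sketch of what that guarantee looks like in practice (the `Level` enum and the helper names are invented for the example): the constants carry constructor arguments, a non-null `Level` is always one of the three declared values, and the only way in from outside data is a lookup that fails loudly.

```java
enum Level {
    LOW(1), MEDIUM(5), HIGH(10);

    private final int weight;

    Level(int weight) { this.weight = weight; }

    int weight() { return weight; }
}

final class EnumDomainSketch {
    // No validation needed: a non-null Level is one of the three constants.
    static String describe(Level level) {
        switch (level) {
            case LOW: return "low";
            case MEDIUM: return "medium";
            case HIGH: return "high";
        }
        throw new AssertionError("unreachable for a non-null Level");
    }

    // There is no cast from an arbitrary value; unknown names are rejected.
    static boolean isKnownName(String name) {
        try {
            Level.valueOf(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Level.HIGH));   // high
        System.out.println(Level.MEDIUM.weight());  // 5
        System.out.println(isKnownName("EXTREME")); // false
    }
}
```

The one hole is `null`, which is a property of Java references in general rather than of enums.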
That said, given that C# now has pattern matching, and it seems to be getting more powerful with every release, I'd be surprised if it didn't get something like case classes very soon.
Delegates and anonymous inner classes are equivalent, modulo API:
//Java
Bax.addHandler(new Foo()
{
    @Override
    public void Bar()
    {
        doSomething();
    }

    @Override
    public int Baz(X n)
    {
        return computeInt(n);
    }
});

//C#
Bax.AddHandler(() => DoSomething(), n => ComputeInt(n));
But, anonymous delegates and lambdas are at once syntactically nicer and more powerful, because of type inference and the fact that they can capture local variables.
Regarding Java enums, the actual equivalent feature in C# is class nesting. There is a tiny bit more ceremony involved in defining them, but they're more flexible than Java enums. For instance, you can decide whether or not to implement them as readonly (final) fields, or as static properties, depending on how much you care about cache friendliness.
public abstract partial class SomeEnum
{
    // Private constructor: only the nested classes below can derive from SomeEnum.
    private SomeEnum() {}
    public abstract int Biz();
    sealed class AFoo : SomeEnum { public override int Biz() => 2; }
    sealed class ABar : SomeEnum { public override int Biz() => 4; }
    sealed class ABaz : SomeEnum { public override int Biz() => 8; }
}

//Singleton implementation:
public abstract partial class SomeEnum
{
    public static readonly SomeEnum Foo = new AFoo();
    public static readonly SomeEnum Bar = new ABar();
    public static readonly SomeEnum Baz = new ABaz();
}

//Cache friendly implementation (each access creates a fresh instance):
public abstract partial class SomeEnum
{
    public static SomeEnum Foo => new AFoo();
    public static SomeEnum Bar => new ABar();
    public static SomeEnum Baz => new ABaz();
}
It's also worth noting that extension methods on C# enums give you most (all?) of the power of Java enums.
That is only equivalent within the boundaries of your API (i.e. when you use them as pure callbacks). But your delegates are two different objects, while in the Java example, it's a single object. This makes a difference if, for example, object identity matters.
For practical purposes, Java anonymous classes can be used in any scenario where you need to create a one-off object that derives from a class or implements one interface. A delegate can only be used in a scenario where the receiving variable or function wants a function type.
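To illustrate the "single object" point, here is a minimal sketch with a made-up `Handler` base class: the anonymous subclass overrides two methods that share one private field and one identity, which a pair of independent delegates doesn't give you for free.

```java
abstract class Handler {
    abstract void onStart();
    abstract int onValue(int n);
}

final class AnonymousClassSketch {
    // Returns ONE object: both overrides see the same `calls` counter.
    static Handler countingHandler() {
        return new Handler() {
            private int calls = 0;

            @Override
            void onStart() {
                calls++;
            }

            @Override
            int onValue(int n) {
                calls++;
                return n * calls;
            }
        };
    }

    public static void main(String[] args) {
        Handler h = countingHandler();
        h.onStart();                         // calls == 1
        System.out.println(h.onValue(10));   // 20, because calls is now 2
        System.out.println(countingHandler() == countingHandler()); // false: distinct objects
    }
}
```

In C# you would get the shared state by having both lambdas close over the same local, but the receiver still sees two delegate objects rather than one.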
I'm not arguing that delegates and lambdas are bad, mind you. For the common scenario involving callbacks, they're vastly superior. But they're not a complete replacement for Java inner classes.
But they aren't really anonymous then, since you have to write a wrapper class for every class or interface that you intend to implement inline.
By the way, I totally forgot about F# object expressions! But they're a good example of this feature. Better than in Java, in fact, because they let you implement multiple interfaces. Also, IIRC, they're true closures (whereas Java only lets you close over finals, not writable locals).
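On the "finals only" restriction, a small sketch of what it means in Java 8+ (class and method names are hypothetical): a lambda or anonymous class may capture a local only if that local is never reassigned, so a mutable counter has to live in some box that is itself captured through a final reference.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

final class CaptureSketch {
    // Fine: `doubled` is assigned once, so it is effectively final.
    static IntSupplier constant(int base) {
        int doubled = base * 2;
        return () -> doubled;
    }

    // int count = 0; return () -> ++count;   // rejected by javac: count is reassigned
    // Usual workaround: capture a reference to a mutable holder instead.
    static IntSupplier counter() {
        AtomicInteger count = new AtomicInteger();
        return count::incrementAndGet;
    }

    public static void main(String[] args) {
        System.out.println(constant(21).getAsInt()); // 42
        IntSupplier next = counter();
        System.out.println(next.getAsInt());         // 1
        System.out.println(next.getAsInt());         // 2
    }
}
```

`AtomicInteger` is just one convenient holder; a one-element array is the other common trick.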
Ideally, you'd have both those and lambdas in a language, like C# and Scala do. If I had to choose, though, I'd definitely pick lambdas - that is the 90% use case.
Brevity is the feature, not anonymity. What C# actually lacks is syntax sugar. But, a few years ago, I spent an hour or so implementing `Ad-hoc` classes for most of the interfaces and abstract classes that I thought I would ever need, and it's been sufficient over 90% of the time[1]. N.b. these classes could have been generated programmatically.
I'm not claiming that the technique is exactly equal to what Java gives you out of the box, but rather that C# can get within epsilon. In other words, for my purposes, the prefix AdHoc- might as well be a keyword (as in AdHocIEquality<T>, AdHocIEnumerable<T>†, AdHocDisposable, AdHocHttpHandler, etc...), because it's indistinguishable from syntax sugar.
On the other hand, the F# object expression really is more than just syntax sugar, because of the way it interacts with type inference (no need to upcast to satisfy the type checker), and (as you noted) that it can implement an arbitrary set of interfaces. But, it's not all carrots and apples: F# lambdas don't work well with protected members. Meanwhile, C# can close over just about anything (a ref local, such as Span<T>, being the obvious exception).
On your delegates example for C#, since the parameters of the lambda and methods match, and you aren't relying on any closure behaviour, you can use 'method group syntax' to make it even more succinct:
Bax.AddHandler(DoSomething, ComputeInt);
> An actual roadmap to rewrite remaining C++ from OpenJDK in Java
Very cool. I wasn't aware of that. Do you think this will help introduce more "low level" features in Java since it is paramount to at least keep the same level of performance, and definitely improve it in the future?
Note that I wasn't completely fair, the .NET team also has some plans to incrementally move the runtime into C#, but so far it has only been mentioned on an InfoQ interview, no official plans revealed.
"So, in my view, the primary trend is moving our existing C++ codebase to C#. It makes us so much more efficient and enables a broader set of .NET developers to reason about the base platform more easily and also contribute."
> What are the features that c# is missing? As the other comment explained delegates are the equivalent of anonymous inner classes and are far better.
Not entirely. .NET delegates don't support generic type parameters (aka first-class polymorphism). Methods in C# and Java do.
C# interfaces also don't support static methods, which are useful for factory patterns. C#'s factory patterns require a lot more code as a result.
C# also has a bunch of artificial limitations, like not being able to specify type constraints on Delegate, Enum, and a few other built-in types. It's just nonsense.
The rest of C# is overall better than Java though.
There is no delegate equivalent of passing around an instance of IFoo:
interface IFoo
{
void Bar<T>();
}
This is known as first-class polymorphism.
Classes with methods are strictly more powerful than delegates, but they shouldn't be. It's even worse than that actually, because you can't even create an open instance delegate to a generic interface or abstract class method (you'll get a runtime error, not even a static error).
A java package is a namespace right? In .NET you can't query a namespace, but you can query a module (very rarely used) or assembly. Surely Java has something to query a jar.
There's nothing built-in quite like that. However, in normal circumstances, you can do it on top of what's in the JDK: pick a class you know is in the jar, load it as a classpath resource URL, parse that URL to find the jar, then read the jar using the built-in jar support. There are other ways to do it, eg starting from the classpath and working down. There are any number of libraries for doing this, eg:
An assembly is closer to a JAR. It is a single unit which can contain multiple namespaces. However, in the Java runtime, the package is the top-level organizational unit. In .NET, it's the assembly. The fully qualified type name in .NET contains the assembly, namespace, and type name. In Java, it's just the package and type name.
I like the irony,
you can not say that C# is superior to Java in every way in a discussion about C# Span while Java has had ByteBuffer (the same API) since Java 1.4 (circa 2002 i think).
ByteBuffer provides universal access to on heap/off heap data.
JNI or Unsafe allow to create ByteBuffer from native pointers, created in C/C++ or in Java using malloc, memcopy them, etc.
This is used (and abused) by most web servers, DataStax or LMAX-Exchange have even created their whole business on that.
You also have CharBuffer, DoubleBuffer, FloatBuffer, etc.
There is no stack allocated ByteBuffer in Java (Java provides no stack access, this is religious) but in Java you can do relaxed data access on Buffer elements, volatile access, opaque access, CAS, etc.
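For readers who haven't used it, a minimal sketch of the "universal access" part (helper names are invented; only plain `java.nio` calls are used): the same code reads a heap buffer and an off-heap direct buffer, a typed `IntBuffer` view sits over the same bytes, and a slice is a window onto the original memory rather than a copy.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

final class BufferSketch {
    // Writes the ints 1..n and flips the buffer so it is ready for reading.
    static ByteBuffer fill(ByteBuffer buf, int n) {
        for (int i = 1; i <= n; i++) {
            buf.putInt(i);
        }
        buf.flip();
        return buf;
    }

    // Sums the remaining bytes through an int view; works for heap and direct buffers alike.
    static int sum(ByteBuffer buf) {
        IntBuffer ints = buf.asIntBuffer();
        int total = 0;
        while (ints.hasRemaining()) {
            total += ints.get();
        }
        return total;
    }

    // Returns a window over bytes [fromByte, toByte) that shares memory with buf.
    static ByteBuffer window(ByteBuffer buf, int fromByte, int toByte) {
        ByteBuffer dup = buf.duplicate();
        dup.limit(toByte);
        dup.position(fromByte);
        return dup.slice();
    }

    public static void main(String[] args) {
        ByteBuffer heap = fill(ByteBuffer.allocate(16), 4);
        ByteBuffer direct = fill(ByteBuffer.allocateDirect(16), 4);
        System.out.println(sum(heap));      // 10
        System.out.println(sum(direct));    // 10
        window(heap, 4, 12).putInt(0, 99);  // overwrites the second int in place
        System.out.println(heap.getInt(4)); // 99
    }
}
```

The volatile/opaque/CAS access modes mentioned above go through VarHandles in newer JDKs and are left out here to keep the sketch short.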
> handy looking stuff like the SustainedLowLatency GC mode
Don't forget "server mode" GC. I have a long-running, memory-intensive console app which was taking about 45 minutes to process a typical batch of data. With some profiling I discovered that after the 20 minute mark, most of the time was spent in GC. Some of the classic optimisations (classes to structs, new list to list.clear) helped get the processing down to 30 minutes. After some more Googling I discovered and tried server mode GC. Suddenly it took less than 13 minutes.
out of curiosity why do you know so much about a language that was "mostly irrelevant to you"? Were you forced to develop in it for work, or what happened? For being irrelevant, your comment suggests you've been tracking it super closely (you link a discussion of proposals) ... (quoting you) "since it first came out"!
I just read about and play with things I don't use for work or any real projects--recently-ish I posted about Dart/Flutter here, and I worked through some basic Kotlin exercises after Google announced it as a supported language for Android. ML languages fascinated me early on; I think OCaml was the first language I saw with both strong typing and good type inference to keep code from getting verbose. Once I started looking deeper the pattern matching seemed like a key feature. Stuff like WebKit's new Riptide GC for JS is interesting (concurrent and generational but not copying is a neat corner of the space I hadn't heard much about). Rust is making progress on low-cost memory- and thread-safety. Computers can be interesting!
HS/college was a lot of C(++). My work has been Perl (I'm old) and then Python/Django (with the usual sprinkling of other stuff: JS, bash, etc.). Go has been a surprisingly intense hobby over the past few years and it's probably the only lang I could bust something interesting out in soon without much of a learning curve. (I guess also some of the close cousins of JS out there, but not really a brand-new environment then.)
I did play with very early (pre-`var`) C# more than 10 years back (I'm old!), but that was before most of these improvements happened and before I understood the relevance of the things like value types that were in there from the start. It's interesting now mostly as another point in the space of programming languages/environments, e.g. different tradeoffs around GC, generics, etc. Guess it has somewhat more potential to be relevant to me as an open/cross-platform project, but it's hard to see myself making the big time investment to actually get up to building-stuff speed in it just for giggles. Similar feelings about Swift or Kotlin, FWIW. (Not that I'm totally ruling anything out, but I've got stuff to do!)
That specific discussion of proposals I just Googled up looking for the C# pattern matching stuff. I first heard about it via a different link posted on lobste.rs a while back, but it was faster to Google this than look for that.
C# is a more polished and "prettier" Java. It has done well with its incorporation of functional programming, generics, dynamic programming and syntactic sugar in the last few releases. Too bad it's pretty much confined to the windows ecosystem.
Our server library and framework software suite and Apps have been running cross-platform for several years and with the support of .NET Core 2.0 they now run fast and flawlessly on Linux which is a very popular deployment target for our Customers, in fact all our .NET Core Live Demos were developed on Windows and deployed to and running on Linux:
Each project can also be opened and developed on Linux or Mac with VS Code or Rider. Xamarin's solutions have been making C# a popular language for developing native high-performance iOS/Android Apps for several years, and the stigma of C# server apps being confined to Windows should be eradicated with the advent of .NET Core.
It's definitely not confined to the windows ecosystem. I develop C# on OSX via JetBrains' Project Rider and run the developed code on anything from linux to docker instances.
In some alternate reality, progress on the Java language doesn't get lost in the shuffle in the death of Sun after 1.6 was released. Those five years of relative stasis hurt, especially where C# was making such strides.
I took the AP CS test in high school with Java 6, and four years later, the jobs I was looking at after college were still Java 6. The ones where companies weren't still on Java 5, or even more archaic releases.
I’m not a C# programmer but can someone explain how it deals with memory validity? For example what if after the Span is created the underlying memory is freed or reallocated (e.g. moved to a different region in memory because it needs to grow)? What if I create some data on the stack and then return a Span<T> that points to it? The data on the stack would be implicitly deallocated as the function returns. Basically how does C# solve the iterator invalidation problem in C++?
The linked document kinda sorta explains it, by saying that Span is a by-ref type. Let me explain what this actually means.
In .NET (below C# level, in the VM itself), there are two fundamental types of pointers - managed (e.g. int&), and unmanaged (e.g. int*).
An unmanaged pointer is basically the same as a C pointer. It points to whatever you tell it to point, and it's your responsibility to ensure that it's still valid when you dereference it. So you can get dangling pointers to locals that went out of scope, for example. If you point it at some memory that is managed by GC, you also need to make sure that GC doesn't automatically move that memory, because the pointer will not be updated - the VM provides an opcode for "pinning" managed objects so that they don't move to facilitate this. Also, because these pointers are "dumb", you can do pointer arithmetic on them, cast them to/from integer types, etc. And they can be used in any position a valid type can.
A managed pointer, in contrast, is a pointer that is guaranteed to be memory-safe. For managed pointers that reference managed objects and their fields, this means that GC is aware of those pointers, and adjusts them as it moves the objects around in memory, just like it adjusts regular object references (so you don't need to pin anything; things "just work"). When you have a managed pointer to stack-allocated data, the VM basically makes it illegal to return such a pointer, or stash it away into a variable that can outlive the scope - this is enforced by the bytecode verifier, simply by prohibiting fields of managed pointer types in heap-allocated objects (including, recursively, in any structs). So the only legal operation that you can do with a managed-pointer-to-local is to pass it into a function call - since the stack frame of the calling function is guaranteed to be there for the duration of the call, that is memory-safe.
On the C# level, unmanaged pointers are just pointers (int*), and managed pointers are used to implement ref types, as in "void Foo(ref int x)". Until recently, function arguments were really the only place they were permitted, so they were only used to pass arguments by reference. Recently, they've also added the ability to declare ref locals and return by reference, subject to all the verification rules - e.g. if you return a ref, you cannot return a ref to a local, it must be a ref to a field, or a ref argument that you got from the caller.
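To make the ref local / ref return rules concrete, here is a minimal sketch, assuming a C# 7.0 or later compiler. The names `RefDemo`, `FindFirst` and `Larger` are made up for illustration and are not from any library.

```csharp
using System;

static class RefDemo
{
    // Returns a managed pointer to an array element. This is allowed because
    // the array lives on the GC heap, so the reference cannot dangle.
    public static ref int FindFirst(int[] values, int target)
    {
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i] == target)
                return ref values[i];
        }
        throw new InvalidOperationException("target not found");
    }

    // Returning a ref argument that came from the caller is also allowed.
    public static ref int Larger(ref int a, ref int b)
    {
        if (a >= b) return ref a;
        return ref b;
    }

    // This would NOT compile: a ref to a local cannot escape its stack frame.
    // public static ref int Broken() { int local = 42; return ref local; }

    public static void Main()
    {
        int[] data = { 1, 2, 3 };
        ref int slot = ref FindFirst(data, 2); // ref local aliasing data[1]
        slot = 20;
        Console.WriteLine(data[1]);            // prints 20

        int x = 5, y = 7;
        Larger(ref x, ref y) = 0;              // writes through the returned ref
        Console.WriteLine(y);                  // prints 0
    }
}
```

The commented-out `Broken` method is the case the verification rules exist to reject.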
The runtime additionally has some types, that effectively wrap a managed pointer and add some functionality to it. One existing example is TypedReference (https://docs.microsoft.com/en-us/dotnet/api/system.typedrefe...) - this is basically a type-erased managed pointer, plus runtime type of whatever it points to. You can then "downcast" it to a proper typed managed pointer, and because it knows the target type, it can verify that the cast is valid. There's also a type called ArgIterator, which is basically the .NET type-safe equivalent of C's va_list. These are used for various low-level stuff like C++/CLI vararg functions.
Now, these wrapper types, because they encapsulate a managed pointer, have all the same verification restrictions that managed pointers themselves have. And all these types, including managed pointers themselves, are collectively called by-ref types.
Now, Span<T> is basically just a new by-ref type, that combines a managed pointer with span length under the hood. As such, it's subject to all the same restrictions, which together are sufficient to make sure that it's not possible to use it in such a way that it points to invalid memory.
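As a rough illustration of those restrictions in practice, here is a sketch assuming C# 7.2 or later and a runtime or package that provides `System.Span<T>` (for example .NET Core 2.1+, or the System.Memory package). `SpanStackDemo` and its methods are hypothetical names.

```csharp
using System;

static class SpanStackDemo
{
    static int Sum(Span<int> values)
    {
        int total = 0;
        foreach (int v in values) total += v;
        return total;
    }

    public static int SumOfSquares(int n)
    {
        // Stack memory wrapped in a Span: no unsafe block is needed.
        Span<int> scratch = stackalloc int[n];
        for (int i = 0; i < scratch.Length; i++) scratch[i] = i * i;

        // Passing the span down the call stack is fine: this frame
        // outlives the call to Sum.
        return Sum(scratch);
    }

    // Neither of these compiles, which is what keeps the span from dangling:
    // static Span<int> Escape() { Span<int> s = stackalloc int[4]; return s; }
    // class Holder { Span<int> field; }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(4)); // 0 + 1 + 4 + 9 = 14
    }
}
```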
Thanks for taking the time to type this up. Having just written a class similar to Span in C++ I was curious as to how this would be accomplished in C#.
This looks great! In other languages, I frequently miss having a simple, widely adopted abstraction like Rust's &[T]; C++ iterators are kind of a pain by comparison, and often feel like an abstraction maybe one level too low/broad (as always, concepts could help that).
It's nice that they documented the interaction with the GC too.
The greatest benefit of `Span<T>` in C# is that it allows you to do things that once required an unsafe context in an efficient, (memory) safe, and type-safe manner.
I’m most excited for the “value types as references” features such as ref returns and “in” parameters. I often use arrays of structs for memory locality, and they often grow larger than the size where they can be passed in registers. In a 64-bit app with all double precision floats there isn’t much you can do with 16 bytes or less.
Being able to send 3 structs of e.g 32 bytes each (such as Vector4<double>) to a function without hidden copying will be great.
As for the future, I wish C# had const refs working like in C++ instead of readonly and a plethora of IReadOnly* collections! It could optimize away many things on the CLR level too. I guess it's not doable without breaking some things, but I think it is worth it.
It seems like that's a step in the right direction, although as I can see implementation and specification are not finalized so it is hard to say whether any compromises will have to be made when it is released.
What I really want is to have some basic code contracts (like pure methods) embedded in the language in as frictionless way as possible. It eliminates need for many simple unit tests, null checks, etc.
I'm not sure if this feature will be only for structs or also for classes - I hope both will work.
Although I switched to C# 7 years ago from C++, and I like it, I have to say that C++ syntax seems more readable to me - const T& seems more readable to me than ref readonly T.
Another thing that might be missing here is propagating immutability information down to JIT. AFAIK, codegen could optimize code better if immutability is guaranteed (at least that's part of the rationale of using const & in C++). There is PureAttribute in System.Diagnostics.Contracts, however CodeContracts seem to be mostly dead in terms of compiler and VS using information from them for static analysis.
I wish this ref readonly feature could basically render all IReadOnly* interfaces useless, so that collections would have proper "ref readonly" interfaces. I don't see that in proposal - I wonder if there is a way I could contribute with that remark? I think it makes a good reason that's missing from the rationale on that github page.
Lastly, I wish there was more discussion about how C++ or other languages with stronger checks do it. Pretty sure there's a huge opportunity to learn from others' mistakes :).
As a side note - this is what makes HN great - I can post a comment and get response from Ben Adams himself! One of the greatest .NET Core contributors who made it really fast! Thank you for that link :)
The `in` modifier on parameters, to specify that an argument is passed by reference but not modified by the called method.
The `ref readonly` modifier on method returns, to indicate that a method returns its value by reference but doesn't allow writes to that object.
The `readonly struct` declaration, to indicate that a struct is immutable and should be passed as an `in` parameter to its member methods.
The `ref struct` declaration, to indicate that a struct type accesses managed memory directly and must always be stack allocated.
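A small sketch of how the four features listed above might fit together, assuming C# 7.2 or later and an available `System.Span<T>`. `Vec4`, `Window` and `ReadonlyRefDemo` are made-up names for illustration; `Vec4` is a stand-in for a 32-byte struct of four doubles, not a framework type.

```csharp
using System;

// readonly struct: every field is immutable, so calling members through an
// `in` reference does not force the compiler to make a defensive copy.
public readonly struct Vec4
{
    public readonly double X, Y, Z, W;

    public Vec4(double x, double y, double z, double w)
    {
        X = x; Y = y; Z = z; W = w;
    }

    public double Dot(in Vec4 other) =>
        X * other.X + Y * other.Y + Z * other.Z + W * other.W;
}

// ref struct: may only live on the stack, which is what lets it hold a Span.
public ref struct Window
{
    private readonly Span<Vec4> _items;
    public Window(Span<Vec4> items) { _items = items; }
    public int Length => _items.Length;
}

public static class ReadonlyRefDemo
{
    private static readonly Vec4[] Table =
    {
        new Vec4(1, 0, 0, 0),
        new Vec4(0, 1, 0, 0),
    };

    // ref readonly return: hands out a reference without allowing writes.
    public static ref readonly Vec4 Get(int i) => ref Table[i];

    // in parameters: three 32-byte structs passed by reference, read-only.
    public static double Total(in Vec4 a, in Vec4 b, in Vec4 c) =>
        a.Dot(b) + a.Dot(c);

    public static void Main()
    {
        ref readonly Vec4 first = ref Get(0);
        // first = default; // would not compile: cannot assign through ref readonly
        Console.WriteLine(Total(first, Get(0), Get(1))); // prints 1
        Console.WriteLine(new Window(Table).Length);     // prints 2
    }
}
```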
ReadOnlySpan also gives immutability, though it will only work over contiguous memory rather than a more general collection type, for which you'd still need to use an IReadOnly* interface either as a param or a generic constraint.
Thank you for additional clarification, I can see that some of related issues like https://github.com/dotnet/csharplang/issues/38 are not yet closed too, so it may cause additional confusion about its status.
I think that a C++'s const& equivalent would be very useful in C#, especially to simplify making objects and value types immutable by default and also more performant. I created https://github.com/dotnet/csharplang/issues/1118 as a starter point for discussion, although I'm not sure if linking it to "ref readonly" that's available for value types is the right association to make in terms of more general C# roadmap and design.
If I understand correctly, your proposal applies to the first case. The two arguments could be reduced to one, which is currently not possible for the general case without generic or template structs, but that alone would in my view not be worth introducing fat data types to the standard. An implementation doesn't need the new type either, it can already bounds check dereferences of the pointer against the lvalue it was created from[1].
For the second case, I agree that would be nice but it can be worked around by using a struct.
> which is currently not possible for the general case
Not sure what you're thinking, since it does not require templates or generics, and works for the general case. I know it does, because I implemented it for D without generics or templates.
> it can be worked around by using a struct
Yes, but nobody does since it is clumsy, hence all the array bounds overflows in C. Even tiny bits of syntactic sugar, which is what this is, can have dramatic and transformative effects.
> [1]
That inescapably makes all pointers fat pointers, which is far more overhead than what I proposed. My proposal does not have any unavoidable overhead.
--
As for implementation experience, the D community finds they work extremely well. It's much more than an "it would be nice" feature.
> Not sure what you're thinking, since it does not require templates or generics, and works for the general case. I know it does, because I implemented it for D without generics or templates.
This was in favour of your proposal, about how it isn't currently possible to implement this construct generically as a user of the language, hence, it would need to be implemented in the language itself.
I think I do see the point you're trying to make. However:
> Even tiny bits of syntactic sugar, which is what this is, can have dramatic and transformative effects.
This is true, but its use would come mainly from bounds checking at compile time and runtime (otherwise it's just the struct) which is hardly even done for regular arrays.
I don't understand your comment, could you clarify a bit? Are you saying that Python's slice operator is a copy of D's? Or that there's some sort of contiguous memory guarantee with some sort of Sliceable in D? In Python?
Starting a new job and C# will be a part of it eventually. Reading through this is really confusing -- can anyone recommend me a good book that is relatively up to date? I am good with C and several other languages, so don't need anything that is too newbie-friendly.
Jon Skeet's C# in Depth is pretty solid. The fourth edition isn't done yet - and at the rate of change lately may be out of date by the time it is released, but it goes into some great detail on the guts of the language, and how it has changed over time.
YMMV, but I found the language spec a good way to learn the language (not the class library). Concepts are introduced in order, and each section is well-readable.
This will make writing IoT apps easier, as one often has to scan/process contiguous buffers. Using string was slow and onerous - this could be cool, especially when combined with .NET Core starting to make C# a contender in the device space and other low-level uses.
This is pretty cool. I've seen a surprising number of things hidden in C# that can affect performance. For example stack allocation. Anyway, I'm looking forward to the tools, frameworks, and abstractions that come from these features.
Not if you compile with -checked+ - then it throws. Although the right way to do this is to use a checked-block:
checked { sum += bytes[i]; }
It's one obscure C# feature that I rarely see used, but which I find indispensable, and really wish more languages adopted it. Integer overflow vulnerabilities have become more prominent in the past few years, so perhaps there will be some uptake. Interestingly, C# had this feature since the very first release back in 2001.
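For anyone who hasn't seen it, a minimal sketch of the difference. The class and method names are made up, and `sum` is assumed to be an `int` here since the original snippet doesn't show its type.

```csharp
using System;

static class CheckedDemo
{
    // unchecked: overflow silently wraps around, even if the project is
    // compiled with -checked+.
    public static int WrappingSum(int start, byte[] bytes)
    {
        int sum = start;
        for (int i = 0; i < bytes.Length; i++)
        {
            unchecked { sum += bytes[i]; }
        }
        return sum;
    }

    // checked: overflow throws OverflowException, regardless of compiler flags.
    public static int CheckedSum(int start, byte[] bytes)
    {
        int sum = start;
        for (int i = 0; i < bytes.Length; i++)
        {
            checked { sum += bytes[i]; }
        }
        return sum;
    }

    public static void Main()
    {
        byte[] bytes = { 1 };
        Console.WriteLine(WrappingSum(int.MaxValue, bytes) == int.MinValue); // True

        try
        {
            CheckedSum(int.MaxValue, bytes);
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow detected");
        }
    }
}
```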
Glad to see that they'll help devs with the pinned memory issue (for buffers used with IO Completion Ports). But had to look up what "RIO Sockets" were - turns out the API helps you pre-register your buffers. Something you had to "just know" to do before.
C# 7.0 works for the most part in .NET 4.5 or even 4.0
Here it seems there is a dependency on the 4.7 runtime, is this the case? Where can I see a matrix of what C# versions work on which (full/desktop) runtimes?
Span definitely requires VM updates (Mono has been picking up commits to support it) so that feature at least is not going to work on 4.5/4.0 without a polyfill.
I read the blogposts, I have 10 years of C# under my belt (with very scarce unsafe usage). Why is this important for me? What problems does this feature solve?
The fact that it can point to unmanaged memory is a benefit for me. Just last week I’ve been porting some performance critical code to C++/CLI and this will allow me to use a span of objects instead of having to pin arrays. One other application is that it is now possible to implement a memory pool using a contiguous piece of memory and only returning slices of it; previously you would have to have multiple instances of a fixed size array in the pool.
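The memory pool idea could look roughly like this. It is a toy sketch rather than a real pool: `SlabPool` and `Block` are hypothetical names, there is no rent/return bookkeeping, and it assumes `System.Span<T>` is available.

```csharp
using System;

// One contiguous allocation, handed out as fixed-size windows.
sealed class SlabPool
{
    private readonly byte[] _slab;
    private readonly int _blockSize;

    public SlabPool(int blockSize, int blockCount)
    {
        _blockSize = blockSize;
        _slab = new byte[blockSize * blockCount];
    }

    // Returns a span covering only block `index`. The caller cannot reach
    // neighbouring blocks through it.
    public Span<byte> Block(int index) =>
        new Span<byte>(_slab, index * _blockSize, _blockSize);
}

static class SlabPoolDemo
{
    public static void Main()
    {
        var pool = new SlabPool(4, 3);

        Span<byte> second = pool.Block(1);
        second.Fill(0xFF);

        Console.WriteLine(pool.Block(0)[0]); // prints 0, block 0 is untouched
        Console.WriteLine(pool.Block(1)[3]); // prints 255
        // second[4] would throw IndexOutOfRangeException instead of
        // spilling into block 2.
    }
}
```

Because a span can only live on the stack, a pool that needs to store its handed-out slices would have to keep something else (such as offsets) and create the span on demand, as `Block` does here.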
I don't sling C#, but the best I can summarize the doc's claims, it looks like a standard for referencing memory in previously unsupported ways, e.g. for representing a substring of a longer string, or a section of a buffer pool. If you do performance-critical code, you can avoid copying some data or alternatively defining your own ad-hoc slice types. If you don't, optimized libraries that you call will get another tool to minimize data-copying.
It will have some limitations, like that it can only be used on the stack. One reason is that the efficient representation on new CLRs will use pointers into the middle of objects and they argue that supporting those on the heap will slow down the GC too much. (Go's approach to non-copying slices/strings was just to support pointers into the middle of allocations, and I guess accept the costs at GC time.) Other reasons relate to concurrency, etc.
I can't do much better than that not knowing more about C# and the CLR than I do!
Lots of high performance goodness. Implement a function taking a Span as a parameter and it can be called with an array, stackalloc'd memory, or native memory, with a single implementation and in a type-safe way.
However, an item I think is often overlooked is the safety:
ArraySegment, or the triple "array, offset, length", only provides a suggestion to the called function: it still has full access to the array and can happily read and write outside the offset->length portion of the array. Span only allows access to the window, so the called function can't operate outside its permitted bounds.
Further to this, ReadOnlySpan finally gives read-only array elements, so rather than having to make a defensive copy and pass out a newly allocated array, you can pass out (or in) a ReadOnlySpan.
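A quick sketch of both points, assuming C# 7.2 or later and an available `System.Span<T>`. `WindowDemo` and `Sum` are illustrative names. Native memory is left out because creating a span over it needs an unsafe block.

```csharp
using System;

static class WindowDemo
{
    // One implementation, usable with any contiguous backing store.
    public static int Sum(ReadOnlySpan<int> values)
    {
        int total = 0;
        for (int i = 0; i < values.Length; i++) total += values[i];
        // values[0] = 0;          // would not compile: elements are read-only
        // values[values.Length]   // would throw: the callee only sees its window
        return total;
    }

    public static void Main()
    {
        int[] array = { 1, 2, 3, 4, 5 };

        Span<int> onStack = stackalloc int[2];
        onStack[0] = 10;
        onStack[1] = 20;

        Console.WriteLine(Sum(array));              // whole array: 15
        Console.WriteLine(Sum(onStack));            // stack memory: 30
        Console.WriteLine(Sum(array.AsSpan(1, 3))); // window {2, 3, 4}: 9
    }
}
```

With the "array, offset, length" triple, the third call would have handed the callee all five elements and trusted it to stay inside the range.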
C# is being used more and more for high performance applications. This feature will allow developers working in gamedev/finance/server networking to get even closer to the metal without sacrificing safety.
If you wanted to replace a bunch of characters in a string, e.g. all 'a' to 'b', you can operate on the span and not have to allocate a new string after each replacement.
I wrote an incredibly similar class for protected memory accesses for C++ back in 2008. Led to beautiful code, and in debug mode it could verify accesses. Subsets, copies, type conversions etc. were beautiful. Also combined it with a shared pointer class as well. Probably should open source those...
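Something along these lines, as a sketch. Strings themselves stay immutable, so this assumes the characters are first copied into a scratch `Span<char>`, edited in place, and turned back into a string once at the end. `ReplaceDemo`, `ReplaceAll` and the 256-character stack threshold are all made up for illustration.

```csharp
using System;

static class ReplaceDemo
{
    static void ReplaceInPlace(Span<char> buffer, char from, char to)
    {
        for (int i = 0; i < buffer.Length; i++)
        {
            if (buffer[i] == from) buffer[i] = to;
        }
    }

    // Applies two replacements but allocates only one result string.
    public static string ReplaceAll(string input, char from1, char to1, char from2, char to2)
    {
        // Small inputs use stack memory, larger ones fall back to a heap array.
        Span<char> buffer = input.Length <= 256
            ? stackalloc char[input.Length]
            : new char[input.Length];

        input.AsSpan().CopyTo(buffer);
        ReplaceInPlace(buffer, from1, to1);
        ReplaceInPlace(buffer, from2, to2);

        // Span<char>.ToString() builds a string from the characters.
        return buffer.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceAll("banana", 'a', 'b', 'n', 'm')); // prints bbmbmb
    }
}
```

Chaining `string.Replace` twice would allocate an intermediate string for each step. Here the intermediate state lives only in the scratch buffer.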
I’m not sure if I’m missing something, but isn’t this basically just the same as Java 7’s Buffer, ByteBuffer, etc?
These, too, support the same types of backing memory (backed by native memory, stack-allocated memory, or a Java array), same access pattern, etc, and are also used by many third-party libraries now for such stuff (including many networking libraries, and graphics libraries such as LWJGL3)
> There is no way to use stack allocated memory in Java.
Not directly, but most JIT compilers do it.
It is tricky and fragile, but one can make use of the JIT logs and try to re-write the code so that it takes the right decision regarding escape analysis.
As for an actual support at language level, we need to wait for the outcome of projects Valhalla and Panama.
Not all improvements have to be innovative. The most mundane things can vastly improve the pleasure of using a language. Considering that array slicing originated in Fortran, Buffer is hardly innovative - yet, as you've pointed out, it has been a huge benefit for Java developers.