Hacker News
ESE: The server-grade, embedded DB ships with Windows (github.com/microsoft)
93 points by mnkypete on Jan 30, 2021 | 37 comments



IIRC this storage engine used to have a 2GB limit, but it did not properly check this limit and would corrupt itself if you tried to store too much. This caused problems for Exchange, Outlook, the ironically named Visual Source Safe, and probably others.


This engine was never storage for Outlook, nor I _think_ Visual Source Safe (but can't be sure of that). The JET Blue engine often gets confused with the JET Red engine, but the two are completely different implementations of the JET API spec. See the notes at the bottom of this page: https://docs.microsoft.com/en-us/windows/win32/extensible-st...

I say "we" because I am one of the developers who has worked on the engine for the last 15 years (around half of its existence).

For as long as I have worked on the engine, it has had a limit of slightly less than 2^31 database pages which, since the page size is variable, has historically worked out to 4 to 64 TB.
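As a rough check of those figures, here is the arithmetic as a small C++ sketch. It rounds the ceiling up to exactly 2^31 pages, and it assumes the page size ranges from 2 KB to 32 KB, which is what the 4 to 64 TB range implies. `MaxDatabaseBytes` and the constants are names made up for illustration, not part of the ESE API.

```cpp
#include <cstdint>

// Illustrative only: treat the page-count ceiling as exactly 2^31 pages.
// The real limit is described as slightly below that.
constexpr std::uint64_t kMaxPagesApprox = 1ULL << 31;
constexpr std::uint64_t kTiB = 1ULL << 40;

// Approximate upper bound on database size for a given page size in bytes.
constexpr std::uint64_t MaxDatabaseBytes(std::uint64_t cbPage) {
    return kMaxPagesApprox * cbPage;
}

// Assumed endpoints of the page-size range: 2 KiB and 32 KiB.
static_assert(MaxDatabaseBytes(2 * 1024) == 4 * kTiB, "2 KiB pages give about 4 TiB");
static_assert(MaxDatabaseBytes(32 * 1024) == 64 * kTiB, "32 KiB pages give about 64 TiB");
```

The checks run at compile time, so the file compiling at all confirms that the two ends of the range match the figures quoted.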

Even before the 2nd rewrite in the 90s, I believe it had a 16 GB limit, so I really think the 2 GB is JET Red.

Brett Shirley [MSFT]


Yes, Exchange 4.0 databases had a 16 GB limit and it was recognized then that this was insufficient. The next release increased that limit dramatically.


This explains ONE of the causes of corruption of PST files back in the day.


PST never used this storage engine. It used NDB.



This Windows-only embedded engine has long been part of Microsoft Exchange and Active Directory servers. The API was public but not well documented. Berkeley DB was a quality cross-platform alternative, but I suspect that some permutation of RocksDB now dominates this use case.

It should be noted that local SQL Server using fast IPC communication is quite good and rumors of Microsoft Exchange replacing ESE with SQL Server have popped up many times over the years.


I thought this was the Jet of MS Access but they call that JET Red.

From the ESE wiki page:

> For JET Red storage engine of Microsoft Access, see Microsoft Jet Database Engine.

> Extensible Storage Engine (ESE), also known as JET Blue, is an ISAM (indexed sequential access method) data storage technology from Microsoft. ESE is the core of Microsoft Exchange Server, Active Directory, and Windows Search. It's also used by a number of Windows components including Windows Update client and Help and Support Center. Its purpose is to allow applications to store and retrieve data via indexed and sequential access.

> ESE provides transacted data update and retrieval. A crash recovery mechanism is provided so that data consistency is maintained even in the event of a system crash. Transactions in ESE are highly concurrent making ESE suitable for server applications. ESE caches data intelligently to ensure high performance access to data. In addition, ESE is lightweight making it suitable for auxiliary applications.

> The ESE Runtime (ESENT.DLL) has shipped in every Windows release since Windows 2000 [...]

I was wondering how that might fare against the SQL Server edition meant to be bundled with apps. Besides being non-SQL, it seems it could be lighter while keeping transactions and concurrency.


Oh, so this is source code for "JET blue" under MIT license. For now without comments, tests and build scripts - but that's coming.

Looks a bit like Microsoft's c++ toolkit that serves a similar use as lmdb, I guess? (although jet seems to do more than "just" db).

https://symas.com/lmdb/


Looks like it's slightly more abstract than LMDB. In LMDB you're given sub-databases on which to implement your own index; this sounds like it handles some of that boilerplate as part of the framework. The mention of support for schemaless in the README suggests it might also define row structures and stuff like that.
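To illustrate the "implement your own index" part, here's a rough C++ sketch in which two standard-library maps stand in for two key-value sub-databases. This is not the LMDB or ESE API; `TinyStore` and its members are invented for illustration. The point is only that with a bare key-value store the application has to keep the secondary index in sync on every write, which is the kind of boilerplate a higher-level engine could take over.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical store: each map plays the role of one sub-database.
struct TinyStore {
    std::map<int, std::string> byId;         // primary: id -> name
    std::multimap<std::string, int> byName;  // hand-maintained secondary: name -> id

    // Every write has to touch both maps, or the index goes stale.
    void Put(int id, const std::string& name) {
        Erase(id);  // drop any old row and its index entry first
        byId[id] = name;
        byName.emplace(name, id);
        assert(byId.size() == byName.size());  // one index entry per row
    }

    void Erase(int id) {
        auto itRow = byId.find(id);
        if (itRow == byId.end()) return;
        // Remove the matching (name, id) pair from the secondary index.
        auto range = byName.equal_range(itRow->second);
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second == id) {
                byName.erase(it);
                break;
            }
        }
        byId.erase(itRow);
    }

    // Lookup through the secondary index: how many rows carry this name?
    std::size_t CountByName(const std::string& name) const {
        return byName.count(name);
    }
};
```

Forgetting the `Erase` call inside `Put` is the classic bug here: the old name would stay in `byName` and point at a row that no longer has it.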


TIL the software responsible for almost exactly 10% of the utterly inexplicable Application warnings in Event Viewer on my Windows 10 PC is considered by someone to be "server-grade."


Justin, that event you are talking about, I assume is ESENT event 642? If so, that was literally - MY fault. Not the ESE team's fault, but _MY_ fault. Feel free to reply here with your flame. ;-) I can handle it. Really. :)

In the server cloud business all the event streams (and we have several) are rolled up into optics web reports and analyzed and sifted for potential emergent issues and health metrics ... in this case, an event useful to our cloud server business leaked into the events of our Windows client business. This is (in my biased opinion;) less about being "server grade", and more about not being "properly embeddable" ... as generally embedded components should be quiet.

This is part of the challenge of making a DB engine that services two very divergent scenarios. This is the smallest of those challenges. How much memory we can allocate being the largest regular conflict ... Office 365 would not even notice if ESE allocated an extra 1 MB per process, but that would boost our Working Set numbers by 40% (from the last numbers I remember seeing) for each Windows Client Service using us.

Anyways, I have fixed the code to not log that event on Windows Client. It should clear up once that patch gets out.

Cheers, Brett Shirley [MSFT]

P.S. - Oh and I am sorry if it caused you any stress. I know it sounds concerning, but it is truly innocuous. You can ignore it, and hopefully soon you won't see it.


There are some of those! Along with 330, 105, 301, 326, 641, 302, 102, 103, 300, and I just assume a few others, all coming from ESENT.

I guess that's a question: what would it look like on a Windows 10 system if it had a truly broken ESENT installation? As far as I know ESENT has never directly been a problem. Its log output is just part of the noise that now makes using the Event Viewer a complete waste of time unless one knows precisely when a problem was logged.

> Feel free to reply here with your flame. ;-) I can handle it. Really. :)

Believe me, if I thought the state of the Event Viewer and its logs were the fault of any one person, I'd be able to muster some sort of impassioned, vituperative response. I'm not angry, son, I'm just so disappointed in you. Ahem.


LOL! "Disappointed", Nice, Justin. :)

Those other ones are Information events however, not Warning events. ESE (as part of being "embeddable") has a couple of logging level controls (JET_paramEventLoggingLevel, and JET_paramNoInformationEvent). Each component is responsible for setting its event logging level, the ESE default is ... let me call it, "fully diagnosable mode" (for us, not you ;).

Anyways, all events start like this: "svchost (48984,R,98) TILEREPOSITORYS-1-5-18 .." ... the service exe name at beginning, and the "instance" name there at the end. If you get me a few of the noisier components, I will try to find the Windows component owners and suggest to them that they set the logging level higher.
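For anyone who wants to pull those two fields out of exported event text and see which components are noisy, here's a minimal C++ sketch. It is based only on the sample prefix above. I'm assuming the instance name is the first whitespace-delimited token after the closing parenthesis, and that a trailing colon, if present, is a separator rather than part of the name. `EventSource` and `ParseEventPrefix` are made-up names, not anything from ESE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type: the two fields described in the comment above.
struct EventSource {
    std::string exe;       // service exe name at the start, e.g. "svchost"
    std::string instance;  // instance name after the parenthesised ids
    bool ok;               // false when the text does not match the assumed shape
};

// Parses text shaped like "exe (ids) INSTANCE rest...".
EventSource ParseEventPrefix(const std::string& text) {
    EventSource src{"", "", false};
    const std::size_t iOpen = text.find(" (");
    if (iOpen == std::string::npos) return src;
    const std::size_t iClose = text.find(") ", iOpen);
    if (iClose == std::string::npos) return src;

    const std::size_t iInst = iClose + 2;  // first character of the instance name
    std::size_t iEnd = text.find(' ', iInst);
    if (iEnd == std::string::npos) iEnd = text.size();
    if (iEnd > iInst && text[iEnd - 1] == ':') --iEnd;  // assumed separator
    assert(iEnd >= iInst);

    src.exe = text.substr(0, iOpen);
    src.instance = text.substr(iInst, iEnd - iInst);
    src.ok = !src.exe.empty() && !src.instance.empty();
    return src;
}
```

Feeding each event's text through this and counting by `instance` would give a simple list of the noisiest components.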

Thank You, Brett Shirley [MSFT]


I know this as ESENT (since this is apparently the name of the .dll). Good to know it's still around!

I saw this used as the "we really need to plug this performance hole" caching solution on a web server.

Without knowing too much about how well it did, I have to assume it does especially well in combination with IIS?

Would be interesting to find out how tightly coupled these benefits, if any, are.


I believe the original RavenDB was based on ESE (ESENT).


>Microsoft Exchange is one of the best database servers you'll ever use. You can make offline changes on multiple devices, and it will handle things automatically. It just works.

  -- Me : 2012   Source: http://mikewarot.blogspot.com/2012/01/
I stand by that assessment. To me the online/offline capabilities of the Exchange/Outlook pair were Microsoft's best product.


I love perusing different MS open-source codebases, just to see all the different conventions, styles, and practices used by different teams across the company.

This repo in particular sure shows its age, because just clicking around, this code is totally nonsensical to me, haha. I couldn't even make a guess as to what it's doing.

https://github.com/microsoft/Extensible-Storage-Engine/blob/...


This is good feedback for the project, thank you. I have lived with the naming convention so long that it is second nature ... I didn't even think of documenting it ... I will write up something on the naming convention of ESE on the GitHub project's wiki soon!

Brett Shirley [MSFT] (yes, the Brett Shirley contributor to the project ;-)


As well as stripping comments, did they also strip and replace variable names? This code reminds me of trying to grok autogenerated Java GUI code in CS101.


Nah, it just looks like microsoft's usual hungarian notation https://docs.microsoft.com/en-us/windows/win32/stg/coding-st...


Yes, Hungarian was fresh in 1988 so naturally heavy use in this code base.


I haven't seen Hungarian notation in so long. It's so painful to look at. And yet back in the day I too was convinced it was a good idea.


Confession: I still LOVE it. If I had a green field project today, I would still use it. Brett Shirley [MSFT] ESE Developer.


Yeah it wasn't until after I moved from C++ to other languages that I changed my mind on this (ESE appears to be primarily C++).

When you can 100% rely on your IDE to give you the metadata about a variable using color and/or hover pop-ups, storing that same metadata in the variable name just feels heavy and unnecessary. With C++ IDEs, you will never get 100% accuracy, so you might as well give in and go with Hungarian.


    BOOL      fVisible;
    INT       cbKey = 0;
    BOOKMARK  bmSearch;
    ULONG     cbReq;

I don't think so.
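For readers who haven't met the convention, here's a small C++ sketch of how those prefixes read in practice. `f` for a flag, `cb` for a count of bytes and `pb` for a pointer to bytes are common Hungarian prefixes; reading `bm` as "bookmark" is my guess from the type name. The function itself is invented for illustration and is not taken from the ESE source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Invented example, not ESE code. Prefixes used:
//   f  = flag (boolean)      cb  = count of bytes
//   pb = pointer to bytes    pcb = pointer to a count of bytes
// Returns true when the key fits in the caller's buffer and was copied.
bool FCopyKey(const unsigned char* pbKey, std::size_t cbKey,
              unsigned char* pbBuffer, std::size_t cbBuffer,
              std::size_t* pcbReq) {
    assert(pbKey != nullptr && pbBuffer != nullptr && pcbReq != nullptr);
    *pcbReq = cbKey;                       // bytes required, in the spirit of cbReq
    const bool fFits = cbKey <= cbBuffer;  // the 'f' marks a yes/no value
    if (fFits) {
        std::memcpy(pbBuffer, pbKey, cbKey);  // both sizes are byte counts
    }
    return fFits;
}
```

The argument for the style is that a comparison like `cbKey <= cbBuffer` is visibly bytes against bytes without needing an IDE to look up either type.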



Cool. I remember picking up issues of Visual Basic Programmer's Journal as a kid and seeing mentions of JET with no clue what it was.

Neat that it is still around. A shining example of code being eternal!


This code base is indeed old. It began life in March of 1989 at Microsoft and was originally written in C.


It is curious they even released that in this state where all the comments are stripped. I would not really consider it free software in this current state (at least this is probably not GPL compatible, because this is not the "source code" by the GPL definition -- this is not the "preferred form of the work for making modifications to it")

But they plan to re-release with "cleaned up" comments. It will be way more interesting then, and with no ambiguity about its free software compatibility status.

Edit: to be clear I'm not really complaining, it is more that I'm eager to see the real thing, and also wanted to remind people what "source code" is by the definition of an important licence for free software. Here this is specified as "MIT License", which is often considered compatible with GNU GPL v2 or v3: but be wary that it is doubtful this is compatible with the GPL in this state, you'll have to wait a little if you want to do an integration in this direction.


Many times the comments are the reason code isn't open sourced. Companies want to audit them to ensure there's nothing damaging before releasing. The same goes for variable names.

This is a whole lot better than nothing.


"... in order to stay on the safe side with the very first release of the source code, we have temporarily removed all comments"

This sounds weird, wonder what they're worried about here: Personal Information? Profanity? Or just airing their dirty washing in public.


We're using ESE NT in our product and I'd like to say it's a great DB. Yes, it has some limitations and quirks, but overall it's a great DB. For us the killer feature is efficient handling of binary BLOB columns.


So they are open sourcing Jet.


Specifically Blue, not the Red variant used in Access.


Ah, I thought they were the same actually.


No, never the same but they shared a relational query engine called QJET.



