
Random UUIDs are super useful when you have distributed creation of UUIDs, because you avoid conflicts with very high probability and don't rely on your DB to generate them for you, and they also leak no information about when or where the UUID was created.

Postgres is happier with sequence IDs, but keeping Postgres happy isn't the only design goal. It does well enough with random IDs for all practical purposes if you need the randomness.
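To make the "no DB round trip" point concrete, here's a minimal Go sketch; a v4 UUID is nothing but 16 random bytes with the version and variant bits set, so any node can mint one locally:

    package main

    import (
        "crypto/rand"
        "fmt"
    )

    // newUUIDv4 builds an RFC 4122 version-4 UUID from 16 random bytes.
    func newUUIDv4() (string, error) {
        var b [16]byte
        if _, err := rand.Read(b[:]); err != nil {
            return "", err
        }
        b[6] = (b[6] & 0x0f) | 0x40 // version 4
        b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
        return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
    }

    func main() {
        id, err := newUUIDv4()
        if err != nil {
            panic(err)
        }
        fmt.Println(id)
    }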


> Postgres is happier with sequence IDs, but keeping Postgres happy isn't the only design goal.

It literally is the one thing in the entire stack that must always be happy. Every stateful service likely depends on it. Sad DBs mean higher latency for everyone, and grumpy DBREs getting paged.


Postgres is usually completely happy enough with UUIDv4. Overall architecture (such as allowing distributed ID generation, if relevant) is more important than squeezing out that last bit of performance, especially for the majority of web applications that don't work with 10 million+ rows.


If your app isn’t working with billions of rows, you really don’t need to be worrying about distributed anything. Even then, I’d be suspicious.

I don’t think people grasp how far a single RDBMS server can take you. Hundreds of thousands of queries per second are well within reach of a well-configured MySQL or Postgres instance on modern hardware. This also has the terrific benefit of making reasoning about state and transactions much, much simpler.

Re: last bit of performance, it’s more than that. If you’re using Aurora, where you pay for every disk op, using UUIDv4 as a PK in Postgres will approximately 7x your IOPS for SELECTs using them, and will massively increase them for writes (I can’t quantify that in general; it depends on the rest of the table and your workload split). That’s not free. On RDS, where you pay for disk performance upfront, you’re cutting into your available performance.

About the only place it effectively doesn’t matter except at insane scale is on native NVMe drives. If you saturate IOPS for one of those without first saturating the NIC, I would love to see your schema and queries.


Scale isn’t the only reason to have distributed systems. You could very well have a tiny but distributed system.


Sometimes distribution is not for performance but for tenant isolation, whether for regulatory or general isolation purposes. I work in such an industry.


Fair point. You can still use monotonic IDs with these, either by interleaving chunks of the ID space across DBs, or with a central server that allocates them; the latter approach is how Slack handles it, for example.
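A hypothetical sketch of the central-allocator flavor in Go (names and block size are invented for illustration; a real allocator would persist its counter durably):

    package main

    import (
        "fmt"
        "sync"
    )

    // blockAllocator hands out contiguous ID ranges so that nodes can
    // mint monotonic IDs locally and only contact the allocator when
    // their current block runs out.
    type blockAllocator struct {
        mu        sync.Mutex
        next      uint64
        blockSize uint64
    }

    // nextBlock reserves the half-open range [start, end) for one caller.
    func (a *blockAllocator) nextBlock() (start, end uint64) {
        a.mu.Lock()
        defer a.mu.Unlock()
        start = a.next
        a.next += a.blockSize
        return start, a.next
    }

    func main() {
        alloc := &blockAllocator{next: 1, blockSize: 1000}
        s1, e1 := alloc.nextBlock() // node A gets 1..1000
        s2, e2 := alloc.nextBlock() // node B gets 1001..2000
        fmt.Println(s1, e1, s2, e2)
    }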


DBRE? I guess DBA is too old-fashioned for the cool kids?


Listen, I didn't make the title up, I just grabbed onto it from the SRE world because I love databases.

There are some pragmatic differences I've found, though - generally, DBAs are less focused on things like IaC (though I know at least one who does), SLIs/SLOs, CI/CD, and the other things often associated with SRE. So DBRE is SRE + DBA, or a DB-focused SRE, if you'd rather.


> Random UUIDs are super useful when you have distributed creation of UUIDs, because you avoid conflicts with very high probability and don't rely on your DB to generate them for you

See Snowflake IDs for a scheme that gives you the distributed-generation benefit of random UUIDs but is strictly increasing. It's essentially UUIDv7, but it fits in your bigint column. No entropy required.
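A minimal Go sketch of the idea; the epoch and bit widths below follow the classic Twitter layout, but they're just parameters:

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    const (
        epochMillis = int64(1288834974657) // Twitter's custom epoch
        nodeBits    = 10
        seqBits     = 12
        maxSeq      = (1 << seqBits) - 1
    )

    // generator mints strictly increasing 63-bit IDs per node:
    // 41 bits of milliseconds | 10 bits of node ID | 12 bits of sequence.
    // (Sketch only: a real one also handles the clock moving backwards.)
    type generator struct {
        mu     sync.Mutex
        nodeID int64
        lastMs int64
        seq    int64
    }

    func (g *generator) next() int64 {
        g.mu.Lock()
        defer g.mu.Unlock()
        now := time.Now().UnixMilli() - epochMillis
        if now == g.lastMs {
            g.seq = (g.seq + 1) & maxSeq
            if g.seq == 0 { // sequence exhausted: spin to the next millisecond
                for now <= g.lastMs {
                    now = time.Now().UnixMilli() - epochMillis
                }
            }
        } else {
            g.seq = 0
        }
        g.lastMs = now
        return now<<(nodeBits+seqBits) | g.nodeID<<seqBits | g.seq
    }

    func main() {
        g := &generator{nodeID: 7}
        fmt.Println(g.next(), g.next())
    }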


The whole point of UUIDs is distributed creation. There's nothing about random ones (UUIDv4) that makes them better for this purpose.


I'm a systems nerd, and I found working with it quite challenging but rewarding. It's been many years, but I still remember a number of the challenges. SPEs didn't have shared memory access to RAM, so data transfer was your problem to solve as a developer, and each SPE had 256k of local RAM. These things were very fast for the day, so they'd crunch through the data very quickly. We double-buffered the RAM, using about 100k for data while simultaneously using the other 100k as a read buffer for the DMA engine.

That was the trickiest part: getting the data in and out of the thing. You had 6 SPEs available to you (2 were reserved by the OS), and keeping them all filled was a challenge, because it required nearly optimal usage of the DMA engine. Memory access was slow, something over 1000 cycles from issuing the DMA until data started coming in.
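Not SPU code, obviously, but the shape of that ping-pong pattern looks roughly like this in Go, with an async copy standing in for the DMA engine:

    package main

    import "fmt"

    // fetch stands in for kicking off a DMA read: it asynchronously
    // fills buf and signals on done when the data has landed.
    func fetch(src []int, buf []int, done chan<- struct{}) {
        go func() {
            copy(buf, src)
            done <- struct{}{}
        }()
    }

    func main() {
        data := make([]int, 8) // pretend main-memory source, read in chunks
        for i := range data {
            data[i] = i
        }
        const chunk = 2
        bufs := [2][]int{make([]int, chunk), make([]int, chunk)}
        done := make(chan struct{}, 1)

        // Prime buffer 0, then process one buffer while the other fills.
        fetch(data[0:chunk], bufs[0], done)
        for off, cur := 0, 0; off < len(data); off, cur = off+chunk, 1-cur {
            <-done // wait for the in-flight transfer into bufs[cur]
            if next := off + chunk; next < len(data) {
                fetch(data[next:next+chunk], bufs[1-cur], done)
            }
            for _, v := range bufs[cur] { // "crunch" the resident buffer
                fmt.Println(v)
            }
        }
    }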

Back then, C++ was all the rage and people did their various C++ patterns, but because instruction memory was so limited, we just hand-wrote some code to run on the SPUs that didn't match the rest of the engine, so it ended up gluing together two dissimilar codebases.

I miss the cleverness required back then, but I don't miss the complexity. Things are so much simpler now that game consoles are basically PCs with PC-style dev tools. Also, as much as I complain about the PS3, at least it wasn't the PS2.


Yep, all valid. When I started on it, we had to do everything ourselves. But by the time I did serious dev on it, our engine team had already built vector/matrix libraries that worked on both the PPU and SPU, and had a dispatcher that took care of all the double buffering for me.


Indeed, anyone who mastered the parallelism of the PS3 bettered themselves and found that the knowledge gained applied to the future of all multi-core architectures. Our PC builds greatly benefited from the architecture changes forced on us by the PS3.


JWTs are perfectly fine if you don't care about session revocation, and their simplicity is an asset. They're easy to work with, and lots of library code is available in pretty much any language. The validation mistakes of the past have at this point been rectified.

Not needing a DB connection to verify means you don't need to plumb DB credentials or identity-based auth into your service - simple.

Being able to decode a token to see its contents really aids debugging; you don't need to look in the DB - simple.

If you have a lot of individual services that share the same auth system, you can manage logins into multiple apps and APIs really easily.

That article seems to dislike JWTs, but they're just a tool. You can use them in a simple way that's good enough for you, or you can overengineer a JWT-based authentication mechanism, in which case they're terrible. Whether or not to use them doesn't really depend on their nature, but rather on your approach.
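To illustrate how small the core mechanism is, here's a hedged Go sketch of HS256 signing and verification using only the standard library; a real implementation also needs exp/aud checks, key rotation, and so on:

    package main

    import (
        "crypto/hmac"
        "crypto/sha256"
        "encoding/base64"
        "errors"
        "fmt"
        "strings"
    )

    // signHS256 produces a compact JWT: base64url(header).base64url(claims).base64url(sig).
    func signHS256(claimsJSON string, key []byte) string {
        enc := base64.RawURLEncoding
        header := enc.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`))
        body := enc.EncodeToString([]byte(claimsJSON))
        mac := hmac.New(sha256.New, key)
        mac.Write([]byte(header + "." + body))
        return header + "." + body + "." + enc.EncodeToString(mac.Sum(nil))
    }

    // verifyHS256 checks the signature and returns the raw claims JSON.
    // It pins the algorithm to HS256; trusting the token's own "alg"
    // header is one of the classic mistakes.
    func verifyHS256(token string, key []byte) ([]byte, error) {
        parts := strings.Split(token, ".")
        if len(parts) != 3 {
            return nil, errors.New("not a compact JWT")
        }
        mac := hmac.New(sha256.New, key)
        mac.Write([]byte(parts[0] + "." + parts[1]))
        sig, err := base64.RawURLEncoding.DecodeString(parts[2])
        if err != nil {
            return nil, err
        }
        if !hmac.Equal(sig, mac.Sum(nil)) {
            return nil, errors.New("bad signature")
        }
        return base64.RawURLEncoding.DecodeString(parts[1])
    }

    func main() {
        key := []byte("demo-secret") // placeholder; real keys come from config or a KMS
        token := signHS256(`{"sub":"42","exp":1700000000}`, key)
        claims, err := verifyHS256(token, key)
        fmt.Println(string(claims), err)
    }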


You are confusing simplicity (it's easy to understand and straightforward to implement safely) with convenience (I have zero understanding of how it works and couldn't implement it securely if my life depended on it, but someone already wrote a library and I'm just going to pretend all risk is taken care of when I use it).


Am I confusing them?

It's not difficult to implement JWTs; the concept is simple. However, with authentication code the devil is in the details, and that's true for any approach, whether it's JWTs, opaque API tokens, or whatever. There are many, many ways to make a mistake that allows a bypass. Simple concepts can have complex implementations. A JWT is simply a bit of JSON that's been signed by someone you trust, and there are many ways to get that wrong!

Convenience, when it comes to auth, is also usually the best path, and you need to be careful to use well-known and well-tested libraries.


Very cool work!

I had to solve a similar problem years ago, during the transition from fixed function to shaders, when shaders weren't as fast or powerful as they are today. We started out with an ubershader approximating the DX9/OpenGL 1.2 fixed-function pipeline, but that was too slow.

People in those days thought of rendering state as being stored in a tree, like the transform hierarchy, and you ended up with unpredictable state at the leaf nodes, sometimes leading to a very high number of permutations of possible states. At the time, I decomposed all possible pipeline state into atomic pieces, e.g., one light, the fog function, a texenv, etc. These were all annotated with inputs and outputs, and based on the state graph traversal, we'd generate a minimal shader for each particular material automatically, while giving old tools the semblance of being able to compose fixed-function states. As in your case, doing this on demand resulted in stuttering, but a single game only has so many possible states; from what I've seen, it's on the order of a few hundred to a few thousand. Once all shaders are generated, you can cache the generated shaders and compile them all at startup time.
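A toy Go version of that memoization step, with the pipeline state collapsed into an invented bitmask key (everything here is made up for illustration):

    package main

    import "fmt"

    // stateKey is a compact signature of the fixed-function state that
    // affects codegen: light count, fog mode, texenv combiner, and so on.
    // The encoding is invented for this sketch.
    type stateKey uint64

    // shaderCache memoizes generated programs so each distinct state
    // permutation is generated (and compiled) exactly once.
    type shaderCache struct {
        programs map[stateKey]string
    }

    func (c *shaderCache) get(key stateKey, generate func(stateKey) string) string {
        if p, ok := c.programs[key]; ok {
            return p
        }
        p := generate(key)
        c.programs[key] = p
        return p
    }

    func main() {
        cache := &shaderCache{programs: map[stateKey]string{}}
        gen := func(k stateKey) string {
            // Real code would stitch together the annotated fragments
            // selected by the key; we just fake a source string.
            return fmt.Sprintf("// shader for state %#x", uint64(k))
        }
        fmt.Println(cache.get(0x21, gen)) // generated
        fmt.Println(cache.get(0x21, gen)) // served from cache
    }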

I wonder if something like this would work for emulating a GameCube. You can definitely compute a signature for a game executable, and as you encounter new shaders, you can associate them with the game. Over time, you'll discover all the possible states, and if they're cached, you can compile all the cached shaders at startup.

Anyhow, fun stuff. I used to love work like this. Ages ago I implemented 3dfx's Glide API on top of DX to play Voodoo games on my Nvidia cards, and contributed some code to an N64 emulator named UltraHLE.


> contributed some code to an N64 emulator named UltraHLE

That's a blast from the past. I distinctly remember reading up on UltraHLE way back when and then trying it out, and for the first time being able to play Ocarina of Time on my mid-range PC with almost no issues. That was magical.


Ask your neighbors with solar who they used and if they liked working with them, and also get estimates from lots of local contractors. You will see all kinds of system proposals when you do this. Go with the contractor you like dealing with who has a good system proposal. A good sign for me was when someone was willing to make changes; e.g., using microinverters vs. optimizers for my complex roof geometry.

You can't figure out who's good online; interview the local companies. Locals will also get you through the permitting and code-compliance process.


Also, don't overlook the "services" category of CL (e.g. https://sfbay.craigslist.org/search/bbb?query=solar%20instal...). It still carries the same "call a rando" risk as hiring any contractor, but it can be easier than filtering out the higher-budget Yelp businesses. Sometimes they list in the for-sale category too, as CL is wont to do.


There are companies attempting to recycle them into new batteries, such as Redwood Materials, but from what I know, recycled lithium is more expensive than fresh lithium today.

The problem with used EV batteries is that they've started to degrade, and they degrade in chaotic ways, so you can't offer a predictable product made from old cells. Some cells may have internal shorts, others may have lost some electrolyte to evaporation, or the electrodes may have degraded. Right now, lithium recovery from used cells is quite primitive. I've tried to reuse old batteries for storage myself, and the unpredictable wear made me give up.

Also, EV batteries, which are optimized for energy density, may not be the best choice for home storage, where you want the ability to deep-cycle to buffer power usage, as the NYT article describes. The NMC cells common in EVs don't like to sit above a 90% state of charge (the cutoff is somewhat arbitrary, but going above 90% results in fast degradation), and they don't like to go below 20%, so you have a useful range of about 70% of the capacity. You can over-provision by that missing 30%, or you can use lithium-iron-phosphate cells, which are less energy dense but much more tolerant of deep cycling.

I set my home up like this a long time ago. I use 100% of my solar and export nothing to the CA grid, thanks to the batteries. It wasn't cost-effective given the cost of storage when I set this up, but it's really neat to someone of my nerdy predisposition. My original goal was to have solar-based backup power, because I lose power quite a lot despite living in Silicon Valley, and it's worked great for that too.


It's a nice change for little experimental programs, but production servers need lots of functionality that third-party routers offer, like request middleware, better error handling, etc. It's tedious to build these on top of the native router, so convenience will steer people to excellent packages like Gin, Echo, Fiber, Gorilla, and Chi.


Honestly, there is a lot of praise for the middleware in these projects, but I recently found out that most of them are unable to parse the Accept and Accept-Encoding headers properly, that is, according to the RFC, with weights.

This means that the general perception that "these projects are production-quality but the stdlib isn't" is misleading. If I have to choose between a web framework or library that implements feature X incorrectly and one that doesn't have X at all, so that I have to write it myself, I will without a doubt choose the latter.
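For reference, a weighted Accept-Encoding header looks like "gzip;q=0.8, br", and the parsing involved is roughly this (a simplified Go sketch, not a fully RFC-compliant parser; it ignores "*" and parameter quoting):

    package main

    import (
        "fmt"
        "sort"
        "strconv"
        "strings"
    )

    type coding struct {
        name string
        q    float64
    }

    // parseAcceptEncoding splits a header like "gzip;q=0.8, br, identity;q=0"
    // into codings sorted by descending weight. Unparseable q values fall
    // back to the default of 1.0.
    func parseAcceptEncoding(header string) []coding {
        var out []coding
        for _, part := range strings.Split(header, ",") {
            fields := strings.Split(strings.TrimSpace(part), ";")
            c := coding{name: strings.ToLower(strings.TrimSpace(fields[0])), q: 1.0}
            for _, p := range fields[1:] {
                kv := strings.SplitN(strings.TrimSpace(p), "=", 2)
                if len(kv) == 2 && strings.EqualFold(kv[0], "q") {
                    if q, err := strconv.ParseFloat(kv[1], 64); err == nil {
                        c.q = q
                    }
                }
            }
            if c.name != "" {
                out = append(out, c)
            }
        }
        sort.SliceStable(out, func(i, j int) bool { return out[i].q > out[j].q })
        return out
    }

    func main() {
        fmt.Println(parseAcceptEncoding("gzip;q=0.8, br, identity;q=0"))
        // [{br 1} {gzip 0.8} {identity 0}]
    }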


Please post issue numbers, thanks!


Excuse me, what issue numbers?


Big fan of chi. It is simple and just works. It also matches the http.Handler interface, which makes testing and everything else so easy (like using httptest.NewServer: just pass it the chi.Mux).


The panics are really annoying. Sometimes you generate routes dynamically from some data, and it would be nice for this to be an error, so you can handle it yourself and decide to skip a route or let the user know.

With the panic, I have to write some spaghetti code with a recover in a goroutine.
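For what it's worth, the recover doesn't need a goroutine; a small helper around registration turns the panic into an error (sketched against the stdlib http.ServeMux, but the same shape works for a chi mux):

    package main

    import (
        "fmt"
        "net/http"
    )

    // tryHandle converts a registration panic (duplicate or malformed
    // pattern) into an error the caller can inspect and skip.
    func tryHandle(mux *http.ServeMux, pattern string, h http.Handler) (err error) {
        defer func() {
            if r := recover(); r != nil {
                err = fmt.Errorf("registering %q: %v", pattern, r)
            }
        }()
        mux.Handle(pattern, h)
        return nil
    }

    func main() {
        mux := http.NewServeMux()
        h := http.NotFoundHandler()
        fmt.Println(tryHandle(mux, "/a", h)) // <nil>
        fmt.Println(tryHandle(mux, "/a", h)) // error: duplicate pattern
    }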


What kinds of routes would you generate dynamically that couldn't be implemented as wildcards in the match pattern? Genuine question.


> or let the user know

Many are misunderstanding when the panic happens. It does not happen when the user requests the path; it happens when the path is registered. The user will never arrive at that path to be notified. You will be notified that you have a logic error at application startup. It can be caught by the simplest of tests before you deploy your application.
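Something as small as this catches it (newRouter is a hypothetical stand-in for wherever your routes get wired up; the {id} pattern assumes Go 1.22+):

    package main

    import (
        "net/http"
        "testing"
    )

    // newRouter stands in for wherever your application registers routes.
    func newRouter() *http.ServeMux {
        mux := http.NewServeMux()
        mux.HandleFunc("/users/{id}", func(w http.ResponseWriter, r *http.Request) {})
        return mux
    }

    // TestRoutesRegister fails (via the panic) if any pattern is
    // malformed or conflicts with another registration.
    func TestRoutesRegister(t *testing.T) {
        _ = newRouter()
    }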


Mmmm, code with recover is just valid code. Calling it spaghetti seems unjustified.


If you have the means, get a computer science degree at a reasonable school, and don't listen to people telling you it's too late, or that ML is your meal ticket. We have two kinds of jobs here in the valley: the glamorous and competitive, and the really challenging and necessary. The latter are more immune to hype cycles and economic downturns.

If you do something along the lines of networking/security or cloud infrastructure, and learn something that confuses other people (say, how to use OAuth2 properly), you will be able to work on the infra side in almost any company. Once you get your foot in the door, you can learn the latest hottest fad wherever it is that you are, on the job. Infrastructure is the computer industry's version of cleaning out the stables for the horses, but it's also necessary everywhere, and once you prove yourself and show you are capable of learning on the job, you can work on other stuff. If you start to enjoy infra work, you can work anywhere. Don't get stuck doing any kind of process, though; don't be the compliance guy, for example.

The biggest shortage in Silicon Valley is that of capable minds and hands. It's on you to make yourself marketable, but once that's done, there's tons of opportunity. I've been working here since 1996. I'm a grey-haired old-timer now, and I've seen this industry from big companies, startups, boring companies, and fad companies. Get your foot in the door; the first job won't be glorious, but as you demonstrate skill, pay and rank will follow. Don't go to any of the companies staffed by lots of startup bros, because a wrinkle or some greying hair is a disadvantage there, but there are plenty of other places to start.


This, but do your CS degree online, and do an MS rather than an undergrad. Despite what you might think, you need essentially no prior coding experience to enroll in and complete most MS programs (though you may be conditionally admitted and need remedial courses). And any school will do for an MS; there is no such thing as a prestigious master's degree. Anyone hunting academic prestige beyond undergrad just gets a doctorate.


Not sure what a CS degree is supposed to help with if this guy wants to pursue the "scrappy SV tech startup" world.

He shouldn't be wasting time learning from professors who have never done anything real and have lived in an academic insulated bubble for decades.

Instead, he should be building and shipping product after product until something sticks, spending just as much time on marketing as on development, and putting a lot of time and energy into hiring and training a team to scale up to the next stage.


The degree itself is unimportant, but there are important skills that a CS program teaches, which you can teach yourself, but the degree makes getting interviews a lot easier. Like I said, I've been doing this for a very long time and have worked with many hiring processes. A person without either experience or a recognized degree has a very difficult time getting their foot in the door. I may have misread and thought that the OP was interested in engineering.

For PMs or management, the skill set, and how you show you have it, will be different.


Do you think there would be value in a master's for someone with a few years of experience, but no CS undergrad, and a career break?


I've been running pgBouncer in large production systems for years (~10k connections to pgBouncer per DB, and 200-500 active connections to Postgres). We have so many connections because microservices breed like rabbits in spring once developers make the first one, but I could rant about that in a different post.

We use transaction-level pooling. Practically, this means we occasionally see problems when some per-connection state "leaks" from one client to another: someone issues a SQL statement that affects global connection state, and it affects the query of a subsequent client inheriting that state. It's annoying to track down, but given an understanding of the behavior, developers generally know how to limit their queries at this point. Some queries aren't appropriate for going through pgBouncer, like cursor-based queries, so we connect directly to the DB in the rare cases where that's needed.

Why so many connections? Say you make a Go-based service which launches one goroutine per request, and your API handlers talk to the DB. The way sql.DB connection pooling works in Go is that it grows its own pool to be large enough to satisfy the working parallelism, and it doesn't release connections for a while. Similar things happen in Java, Scala, etc., and with dozens of services replicated across multiple failure domains, you get a lot of connections.
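For reference, that growth is tunable on the Go side; capping the pool per process is what keeps the multiplication across replicas in check (the DSN and the numbers here are illustrative):

    package main

    import (
        "database/sql"
        "time"

        _ "github.com/lib/pq" // or any other Postgres driver
    )

    func main() {
        db, err := sql.Open("postgres", "postgres://app@pgbouncer:6432/app?sslmode=disable")
        if err != nil {
            panic(err)
        }
        defer db.Close()
        // Without these, sql.DB grows to match peak request parallelism
        // and holds on to idle connections.
        db.SetMaxOpenConns(20)                  // hard cap per process
        db.SetMaxIdleConns(10)                  // keep a smaller warm set
        db.SetConnMaxIdleTime(30 * time.Second) // release idle connections promptly
        db.SetConnMaxLifetime(30 * time.Minute) // recycle for failover/rebalancing
    }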

It's a great tool. It allows you to provision smaller databases and save money, at the cost of some complexity.


> microservices breed like rabbits in spring once developers make the first one

Microservices talking to the same DB... that's not microservices, that's a disaster. You basically combine the negatives of the microservice world with the negatives of the monolith: tight coupling.


They can have their own databases but still be on the same Postgres instance (aka a cluster, in Postgres parlance).


Databases are there to share data, provide transactional guarantees, and even do locking. Your data often must be tightly coupled like this; most databases are designed with this in mind and provide benefits when you work that way. It doesn't mean your apps need to be tightly coupled, and there are still plenty of benefits in deployment and operations to be had with microservices. Silo the data when it makes sense, but if you force the issue you end up with a different problem: trying to reimplement the benefits of a database in the app layer, or with a fault-tolerant, guaranteed-delivery messaging system (itself a database under the hood).

