• edinbruh@feddit.it · 71 points · 18 days ago

    That’s like… its purpose. Compilers always have a frontend and a backend. Even when the compiler is made entirely from scratch (like Java’s or Go’s), it is split between frontend and backend; that’s just how they are made.

    So it makes sense to invest in just a few highly advanced backends (LLVM, GCC, MSVC) and then just build frontends for those. Most projects choose LLVM because, unlike the others, it was purpose-built to be a common ground, but it’s not a rule. For example, there is an in-development Rust frontend for GCC.

    • Kazumara@discuss.tchncs.de · 21 points · 18 days ago

      that’s just how they are made.

      Can confirm, even the little training compiler we made at Uni for a subset of Java (Javali) had a backend and frontend.

      I can’t imagine trying to spit out machine code while parsing the input without an intermediary AST stage. It was complicated enough with the proper split.
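
      That split is easy to sketch in miniature. Here’s a hypothetical toy version in Rust (the `Expr` tree and the stack-machine mnemonics are invented for illustration): the frontend’s job ends at the AST, and the backend only ever walks that tree.

      ```rust
      // Toy frontend/backend split, not any real compiler.
      // Frontend output: an AST for arithmetic expressions.
      enum Expr {
          Num(i64),
          Add(Box<Expr>, Box<Expr>),
          Mul(Box<Expr>, Box<Expr>),
      }

      // Backend: walk the AST and emit code for an imaginary stack machine.
      fn emit(e: &Expr, out: &mut Vec<String>) {
          match e {
              Expr::Num(n) => out.push(format!("PUSH {n}")),
              Expr::Add(a, b) => {
                  emit(a, out);
                  emit(b, out);
                  out.push("ADD".into());
              }
              Expr::Mul(a, b) => {
                  emit(a, out);
                  emit(b, out);
                  out.push("MUL".into());
              }
          }
      }

      fn main() {
          // (1 + 2) * 3, as if the frontend had parsed it from source text.
          let ast = Expr::Mul(
              Box::new(Expr::Add(Box::new(Expr::Num(1)), Box::new(Expr::Num(2)))),
              Box::new(Expr::Num(3)),
          );
          let mut code = Vec::new();
          emit(&ast, &mut code);
          println!("{}", code.join("\n"));
      }
      ```

      Swapping the backend (say, emitting register-machine code instead) wouldn’t touch the `Expr` type or the parser, which is the whole point of the split.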

      • LeFantome@programming.dev · 10 points · 17 days ago

        I have built single-pass compilers that do everything in one shot without an AST. You are not going to get great error messages or optimization, though.

        • Kazumara@discuss.tchncs.de · 3 points · 16 days ago

          Oh! Okay, that’s interesting to me! What was the input language? I imagine it might be a little more doable if it’s closer to hardware?

          I don’t remember that well, but I think the object oriented stuff with dynamic dispatch was hard to deal with.

  • LeFantome@programming.dev · 48 points · 18 days ago

    GCC is adding cool new languages too!

    They just recently added COBOL and Modula-2. Algol 68 is coming in GCC 16.

      • LeFantome@programming.dev · 17 points · 17 days ago

        I guess I should have put a /s but I thought it was pretty obvious. The 68 in Algol 68 is 1968. COBOL is from 1959. Modula-2 is from 1977.

        My point exactly was that all the hot new languages are built with LLVM while the “new” language options on GCC are languages from the 50’s, 60’s, and 70’s.

        I am not even exaggerating. That is just what the projects look like right now.

        • lad@programming.dev · 4 points · 16 days ago

          I would guess those languages are added for preservation and compatibility reasons, and it’s also an important thing

          • LeFantome@programming.dev · 2 points · 14 days ago

            I think some are getting used actually, particularly COBOL. I think Modula-2 still gets used in some embedded contexts. But these languages are not exactly pushing the state-of-the-art.

            Algol 68 is interesting. It is for sure just for academic and academic enthusiast purposes. Historical and educational value only as you say.

        • parlaptie@feddit.org · 1 point · 16 days ago

          I had my suspicions that that’s what you were going for, I just thought I’d make it obvious.

    • dejected_warp_core@lemmy.world · 1 point · 12 days ago

      Honestly, now that I can see the “business productivity” through-line from COBOL, to BASIC, and most recently, Python, I should probably just learn COBOL.

    • Log in | Sign up@lemmy.world · 23 points · 18 days ago

      New kid on the block Roc has it right: it splits application code from “platform”/framework code, precompiles and optimises the platform, then uses its fast surgical linker to sew the app code to the platform code.

      Platforms are things like a CLI program or a web server, that kind of thing. A platform provides an interface of domain-specific I/O primitives and handles all I/O and memory management, and it also specifies what functions the app code must supply to complete the program.

      It’s pretty cool: they’re getting efficiency in the territory of systems programming languages like C and Rust, but with none of the footguns of manual memory management, no garbage collection pauses, and yet also no evil-stepparent-style borrow checker to be beaten by. They pay a lot of attention to preventing cache misses and branch prediction failures, which is how they get away with reference counting and still being fast.

      A note of caution: I might sound like I know about it, but I know almost nothing.

      • CanadaPlus@lemmy.sdf.org · 10 points · 18 days ago

        That sounds pretty great. My impression is that relatively little code actually runs that often.

        but with none of the footguns of manual memory management, no garbage collection pauses, but yet also no evil stepparent style borrow checker to be beaten by.

        That part sounds implausible, though. What kind of memory management are they doing?

        • Log in | Sign up@lemmy.world · 7 points · 18 days ago (edited)

          Reference counting.

          They pay a lot of attention to preventing cache misses and branch prediction failures, which is how they get away with reference counting and still being fast.

          • CanadaPlus@lemmy.sdf.org · 11 points · 18 days ago

            Oh, you just mean it’s a kind of garbage collection that’s lighter on pauses. Sorry, I’ve had the “my pre-Rust pet language already does what Rust does” conversation on here too many times.

            • BatmanAoD@programming.dev · 8 points · 18 days ago

              To be fair, the drop/dealloc “pause” is very different from what people usually mean when they say “garbage collection pause”, i.e. stop-the-world (…or at least a slice of the world).

                • BatmanAoD@programming.dev · 4 points · 17 days ago

                  That’s fair; Python, Swift, and most Lisps all use or have previously used reference-counting. But the quoted sentence isn’t wrong, since it said no “garbage collection pauses” rather than “garbage collection.”

            • Ethan@programming.dev · 3 points · 17 days ago

              Garbage collection means analyzing the heap and figuring out what can be collected. Reference counting requires the code to increment or decrement a counter, and frees memory when the counter hits zero. They’re fundamentally different approaches. Also, reference counting isn’t necessarily automatic; Objective-C had manual reference counting from day one.
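
              For what it’s worth, the increment/decrement mechanics are directly observable with Rust’s `Rc` (a minimal sketch, not specific to Roc or Objective-C):

              ```rust
              use std::rc::Rc;

              fn main() {
                  let a = Rc::new(String::from("shared")); // count = 1
                  assert_eq!(Rc::strong_count(&a), 1);
                  {
                      let b = Rc::clone(&a); // increment: count = 2
                      assert_eq!(Rc::strong_count(&b), 2);
                  } // `b` goes out of scope: decrement back to 1
                  assert_eq!(Rc::strong_count(&a), 1);
                  // when `a` drops, the count hits 0 and the String is freed immediately
              }
              ```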

              • BatmanAoD@programming.dev · 6 points · 17 days ago

                “Garbage collection” is ambiguous, actually; reference counting is traditionally considered a kind of “garbage collection”. The type you’re thinking of is called “tracing garbage collection,” but the term “garbage collection” is often used to specifically mean “tracing garbage collection.”

              • CanadaPlus@lemmy.sdf.org · 3 points · 17 days ago

                It’s still mentioned as one of the main approaches to garbage collection in the garbage collection Wikipedia article.

                • Ethan@programming.dev · 1 point · 16 days ago

                  Ok, I concede the point: “garbage collection” technically includes reference counting. However, the practical point remains: reference counting doesn’t come with the same performance penalties as ‘normal’ garbage collection. It has essentially the same performance characteristics as manual memory management, because that’s essentially what it’s doing.

            • Log in | Sign up@lemmy.world · 2 points · 17 days ago (edited)

              It’s a post-Rust language.

              By your definition, any automatic memory management is garbage collection, including Rust’s!

              Did you think Rust doesn’t free up memory for you? That would be the biggest memory leak in history! No! Rust does reference counting, it just makes sure that that number is always one! What did you think the borrow checker was for?

              In Roc, because the platform is in charge of memory management, it can optimise: a web server can allocate an arena for each client, a game loop can calculate what it needs in advance, etc.

              But like I say, they do a lot of work on avoiding cache misses and branch mispredictions, which are their own source of “stop the world while I page in from main memory” or “stop the pipeline while I build a new one”. If it were doing traditional garbage collection, that would be an utterly pointless micro-optimisation.

              Rust isn’t a religion. Don’t treat it like one.

              When it was very new, a bunch of C programmers shat on its ideas and said C was the only real systems programming language, but Rust, which was pretty much linear ML dressed up in C-style syntax, went from hyper-weird functional programming language to trusted systems programming language. Why? Because it does memory management soooo much better than C and is just about as fast. Guess what Roc is doing? Memory management soooo much better than C, soooo much less fiddly and hard to get right than the borrow checker, and just about as fast.

              Plenty of beginners program in Rust by just throwing clone at every error the borrow checker sends them, or even unsafe! Bye-bye, advantages of Rust, because it was hard to please. Roc calculates from your code whether it needs to clone (e.g. once for a reference to an unmodified value, each time for an initial value for the pointers in a new data structure), and, like Rust, frees memory when it’s not being used.

              Rust does manual cloning. Roc does calculated cloning. Rust wins over C for memory safety by calculating when to free rather than relying on manual free, totally eliminating a whole class of bugs. Roc could win over Rust by calculating when to clone, eliminating a whole class of unnecessary allocation and deallocation. Don’t be so sure that no one could do better than Rust. And the devXP in Rust is really poor.

              • CanadaPlus@lemmy.sdf.org · 5 points · 17 days ago (edited)

                Did you think rust doesn’t free up memory for you? That would be the biggest memory leak in history! No! Rust does reference counting, it just makes sure that that number is always one! What did you think the borrow checker was for?

                There is no runtime garbage collection in Rust. Given a legal program, the compiler can determine at compile time where free-type instructions are needed, and inserts them. From there on it works like C, but with no memory leaks or errors, because machines are good at being exactly correct. If you want to say that’s just a reference-counting algorithm so simple it’s not there, sure, I guess you can do that.
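
                That compile-time placement of frees can be made visible with a `Drop` impl (toy example; the print stands in for the free the compiler schedules):

                ```rust
                struct Noisy(&'static str);

                impl Drop for Noisy {
                    fn drop(&mut self) {
                        // stands in for the compiler-inserted deallocation
                        println!("freeing {}", self.0);
                    }
                }

                fn main() {
                    let _a = Noisy("a");
                    {
                        let _b = Noisy("b");
                        println!("inner scope ends");
                    } // drop of `_b` was inserted here at compile time
                    println!("outer scope ends");
                } // drop of `_a` inserted here
                ```

                The output interleaves deterministically (“inner scope ends”, “freeing b”, “outer scope ends”, “freeing a”); no collector runs behind the program’s back.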

                Roc has runtime overhead to do garbage collection, it says so right on their own page. It might be a post-Rust language but this feels like the same conversation I’ve had about D and… I can’t even remember now. Maybe Roc is a cool, innovative language. It’s new to me. But, it doesn’t sound like it’s doing anything fundamentally new on that specific part.

                Edit: Reading your follow up to the other person, it sounds like it has both a Rust-style compile time algorithm of some sort, and then (reference count-based) garbage collection at run time for parts of the program that would just be illegal in Rust.

                • Log in | Sign up@lemmy.world · 1 point · 17 days ago

                  Roc has runtime overhead to do garbage collection, it says so right on their own page.

                  I was sceptical of your assertion, because the language authors made a design decision not to do garbage collection. So I did a Google search for “garbage” on roc-lang.org to try and find evidence of your claim. It doesn’t say it does garbage collection. It does say “overhead”, but you’re talking about it like it’s a big slow thing that takes up time and causes thread pauses, when it’s a small thing like array bounds checking. You do believe in array bounds checking, don’t you?

                  So no, that’s not what it says, and you’re using the phrase “garbage collection” to mean a much wider class of things than is merited. Garbage collection involves searching the heap for data which has fallen out of scope and freeing that memory up. It’s slow and it necessitates pausing the main thread, causing unpredictably long delays. Roc does not do this.

                  Here’s what the website actually says on the topic.

                  https://www.roc-lang.org/fast

                  Roc is a memory-safe language with automatic memory management. Automatic memory management has some unavoidable runtime overhead, and memory safety based on static analysis rules out certain performance optimizations—which is why unsafe Rust can outperform safe Rust. This gives Roc a lower performance ceiling than languages which support memory unsafety and manual memory management, such as C, C++, Zig, and Rust.

                  Just in case you missed it, that was unsafe Rust that lacks the overheads. If you’re advocating using unsafe to gain a tiny performance benefit, you may as well be writing C, or Zig, which at least has some tools to cope with all that stuff.

                  https://www.roc-lang.org/fast

                  When benchmarking compiled Roc programs, the goal is to have them normally outperform the fastest mainstream garbage-collected languages (for example, Go, C#, Java, and JavaScript)

                  Just in case you missed it, roc is not in the list of garbage collected languages.

                  https://www.roc-lang.org/platforms

                  The bigger benefit is tailoring memory management itself based on the domain. For example, nea is a work-in-progress Web server which performs arena allocation on each request handler. In Roc terms, this means the host’s implementation of malloc can allocate into the current handler’s arena, and free can be a no-op. Instead, the arena can be reset when the response has been sent.

                  In this design, heap allocations in a Web server running on nea are about as cheap as stack allocations, and deallocations are essentially free. This is much better for the server’s throughput, latency, and predictability than (for example) having to pay for periodic garbage collection!

                  Summary: roc doesn’t have the performance disadvantages of garbage collected languages because it’s not a garbage collected language.
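
                  The arena idea quoted above is tiny to sketch, too. A hypothetical bump allocator (nothing like nea’s real code): allocation appends to a buffer, per-object free is a no-op, and one reset reclaims the whole request’s memory.

                  ```rust
                  // Hypothetical bump arena, purely illustrative.
                  struct Arena {
                      buf: Vec<u8>,
                  }

                  impl Arena {
                      fn new() -> Self {
                          Arena { buf: Vec::with_capacity(4096) }
                      }

                      // "Allocate" by appending; the returned offset identifies the allocation.
                      fn alloc(&mut self, bytes: &[u8]) -> usize {
                          let offset = self.buf.len();
                          self.buf.extend_from_slice(bytes);
                          offset
                      }

                      // One cheap reset after the response is sent, instead of per-object frees.
                      fn reset(&mut self) {
                          self.buf.clear();
                      }
                  }

                  fn main() {
                      let mut arena = Arena::new();
                      let hello = arena.alloc(b"hello");
                      let world = arena.alloc(b"world");
                      println!("used {} bytes (offsets {hello}, {world})", arena.buf.len());
                      arena.reset();
                      println!("after reset: {} bytes", arena.buf.len());
                  }
                  ```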

          • Frezik · 4 points · 17 days ago

            I wish more languages used ref counting. Yes, it has problems with memory cycles, but it’s also predictable and fast. Works really well with immutable data.
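
            The cycle problem in one sketch: two strong references pointing at each other never reach count zero. Rust’s `Rc`/`Weak` pairing shows both the trap and the usual workaround (a generic illustration, not tied to any particular runtime):

            ```rust
            use std::cell::RefCell;
            use std::rc::{Rc, Weak};

            struct Node {
                // Weak breaks the parent/child cycle: it doesn't bump the strong count.
                parent: RefCell<Weak<Node>>,
                children: RefCell<Vec<Rc<Node>>>,
            }

            fn main() {
                let parent = Rc::new(Node {
                    parent: RefCell::new(Weak::new()),
                    children: RefCell::new(Vec::new()),
                });
                let child = Rc::new(Node {
                    parent: RefCell::new(Rc::downgrade(&parent)),
                    children: RefCell::new(Vec::new()),
                });
                parent.children.borrow_mut().push(Rc::clone(&child));

                // parent: 1 strong ref (ours); child: 2 (ours + the parent's list).
                assert_eq!(Rc::strong_count(&parent), 1);
                assert_eq!(Rc::strong_count(&child), 2);
                assert!(child.parent.borrow().upgrade().is_some()); // Weak still reaches parent
                // With a strong Rc in `parent` instead of Weak, neither node could
                // ever be freed: the classic reference-counting cycle leak.
            }
            ```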

            • Log in | Sign up@lemmy.world · 3 points · 17 days ago

              Roc uses immutable data by default. It performs opportunistic in-place mutation when the reference count will stay at 1 (e.g. when the code would satisfy the borrow checker without cloning or copying if it were Rust - static code analysis).
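
              Rust’s `Rc::make_mut` does a small-scale version of that trick, for comparison: mutate in place while the count is 1, clone only when the value is genuinely shared (illustrative only; Roc’s compile-time analysis is its own mechanism):

              ```rust
              use std::rc::Rc;

              fn main() {
                  let mut data = Rc::new(vec![1, 2, 3]);

                  // Count is 1: make_mut mutates in place, no copy made.
                  Rc::make_mut(&mut data).push(4);
                  assert_eq!(*data, vec![1, 2, 3, 4]);

                  // Share it: count is 2, so make_mut clones first (copy-on-write).
                  let snapshot = Rc::clone(&data);
                  Rc::make_mut(&mut data).push(5);
                  assert_eq!(*snapshot, vec![1, 2, 3, 4]); // the shared view is untouched
                  assert_eq!(*data, vec![1, 2, 3, 4, 5]);
              }
              ```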

              • Frezik · 2 points · 17 days ago

                Thanks, this looks really interesting. I’ve thought for a while that Rust’s borrow checker wouldn’t be such a pain in the ass if the APIs were developed with immutable data in mind. It’s not something you can easily slap on, because the whole ecosystem fights against it. Looks like Roc is taking that idea and running with it.

                • Log in | Sign up@lemmy.world · 2 points · 17 days ago

                  I think that Roc and Rust are both aiming for fast memory safety, but Rust is aiming to be best at mutable data and Roc best at immutable data.

                  I heard of someone trying to do exactly that - immutable functional programming in Rust - but they gave up for the same reason you said: the whole ecosystem is working on the opposite assumption.

                  As far as I’m aware, most of the Roc platforms are currently written in Rust or Zig. Application-specific code is written in Roc, calling the interface/IO/effectful functions/API that the platform exposes, and the platform calls into the Roc code via the required interface.

                  I do think it’s really interesting, and once they have a desktop gui app platform (which must compile for windows for me to be able to use it for work), I’ll be giving it a good go. I think it’s one of the most interesting new languages to arrive.

    • Lena@gregtech.euOP · 14 points · 18 days ago

      Yeah, I think Go’s compiler is so fast partially because it doesn’t use LLVM