
I don't know Ada / SPARK, and I've been trying to figure this out. Based on the hallucinations I got from ChatGPT, it seems Ada itself is nowhere near as powerful as Rust in safety, while Ada with SPARK disallows some things I was considering to be quite basic, such as shared aliasing of data.

For example, it seems it's not possible to get a sub-string slice reference into an original unbounded string. In Rust, a &str -> &str signature is trivial.
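
To make the Rust side of that comparison concrete, here is a minimal sketch. The names `first_word` and `is_subslice_of` are made up for illustration and are not from any library. The returned `&str` is a view into the caller's buffer, and the borrow checker ties its lifetime to the input.

```rust
/// Returns the first whitespace-delimited word of `s` as a borrowed sub-slice.
/// No bytes are copied: a non-empty result points into the same buffer as `s`.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

/// Illustrative helper: true when `part` lies entirely inside `whole`'s buffer.
fn is_subslice_of(part: &str, whole: &str) -> bool {
    let whole_start = whole.as_ptr() as usize;
    let part_start = part.as_ptr() as usize;
    part_start >= whole_start && part_start + part.len() <= whole_start + whole.len()
}
```

Calling `first_word(&text)` on an owned `String` yields a slice for which `is_subslice_of` holds, so the word is referenced in place rather than duplicated.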

So it seems Ada still relies on discipline, while SPARK does not have the zero-cost abstractions that C++ and Rust have.

If that's true (is it?), then I'd definitely choose C++/Rust over Ada any time, since performance is very important to me.



Both Ada and SPARK have zero-cost abstractions; they're designed to run on embedded platforms.

SPARK is a different use case from Rust: it's a full prover, and the goal is formal verification, typically in contexts where human life is at stake (say, you're writing software for an artificial heart, to take an extreme example). This comes at the cost of being less flexible, but they've been slowly evolving SPARK so that it can handle increasingly complex cases.


Less ChatGPT and more language reference manuals. ChatGPT isn't an all-knowing ancient oracle, even though Microsoft's marketing sells it as such.

Ada has as many zero-cost abstractions as C++ and Rust have, and one of the purposes of the Ravenscar profile is precisely to specify what to turn off for bare-metal and real-time OS deployments.


Seriously, if these people use ChatGPT to seek new knowledge, we are doomed. They are utterly clueless.


To be fair to ChatGPT, trying to find good documentation for Ada +/- Spark hasn't been quite as smooth sailing as trying to find something for C++.

> Ada has as much zero cost abstractions as C++ and Rust have.

Couldn't find anything about it (see above), but does Ada come with any monomorphization tricks?


What does that have to do with zero-cost abstractions as originally described by Bjarne Stroustrup?

The way generics are implemented in Ada compilers is implementation-specific.


How exactly are you going to implement generics in a way that's:

> What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.

without monomorphizing the generics?
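
For readers unfamiliar with the term, a rough sketch of the two strategies being contrasted, written in Rust. The function names are hypothetical, and this says nothing about how any particular Ada compiler implements generics.

```rust
use std::fmt::Display;

/// Monomorphized: the compiler emits a separate copy of this function for each
/// concrete `T` it is called with, so the formatting call is resolved at compile time.
fn describe_static<T: Display>(value: T) -> String {
    format!("<{}>", value)
}

/// Type-erased ("boxed"): one compiled body for all types; the value lives
/// behind a heap pointer and formatting goes through a vtable at run time.
fn describe_boxed(value: Box<dyn Display>) -> String {
    format!("<{}>", value)
}
```

Both produce the same output, e.g. `describe_static(42)` and `describe_boxed(Box::new(42))` each give `<42>`. The difference is in code size, allocation, and indirect calls, which is the cost being argued about.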


By letting the compilers decide the best way to implement them.

I also find this point of view on Ada funny, given the poor examples WG21 has added to the C++ standard library, which will never be fixed due to the never-ending ABI drama.


Look, I'm not against Ada. It's not my favorite language, but it's definitely a huge step in the right direction.

With that said, how does:

> By letting the compilers decide the best way to implement them.

get you to zero-cost generics? What if the implementor just says "let them eat memory" and boxes the generic?


In the same way that std::regex gives you zero-cost abstractions for handling regular expressions in C++.


I don't think a single point there is true.

Ada has had shared aliases since 1995. It's had zero-cost abstractions since before then.

Slicing memory from a string is in the intro manual, for example.

    my_var(2 .. 6)

Ada doesn't rely on you to be disciplined. [0] Memory safety comes with SPARK. It's a theorem prover.

[0] https://blog.adacore.com/memory-safety-in-ada-and-spark-thro...


But doesn't this copy the entirety of that slice? That's not what I meant, I was referring to a shared reference, akin to &str in Rust or std::string_view in C++.


Doing it normally should create a copy of the value, as far as I can tell. Unless you use it to create a renamed variable or by-reference parameter, the same way you can create references in C++.

I think the closest thing to a &str in Ada would be an "access String" or "access constant String", which you would get either from an allocated "String" or from a declared "aliased String". You'd create a subslice with "string(x..y)'Access". Though I'm not sure whether that actually works without explicitly declaring an array of "aliased Character", the Manual is dense (as with C++, it's nearly meaningless unless you already know what it's supposed to mean) and the tutorials generally avoid talking about access objects.


Thank you! I assume 'Access types are not memory-safe, right? Is there a SPARK equivalent which still does not copy (i.e. it only references) that is memory safe?

To be more specific, how would one implement something like a "Get_First_Word" or "Trim_Whitespace" without copying?
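
For reference, this is the kind of thing I mean, sketched in Rust (the standard library's `str::trim` already does this; the hand-written `trim_whitespace` below is only there to show that nothing but the start offset and length change):

```rust
/// Trims leading and trailing ASCII whitespace by narrowing the view.
/// Only the start offset and the length change; the bytes stay where they are.
fn trim_whitespace(s: &str) -> &str {
    let bytes = s.as_bytes();
    let mut first = 0;
    let mut last = bytes.len(); // one past the final kept byte
    while first < last && bytes[first].is_ascii_whitespace() {
        first += 1;
    }
    while last > first && bytes[last - 1].is_ascii_whitespace() {
        last -= 1;
    }
    // An ASCII byte always starts a character in UTF-8, so both indices
    // fall on character boundaries and the slice cannot panic.
    &s[first..last]
}
```

The result borrows from the input, so the compiler rejects any use of it after the original string is dropped or mutated.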


I think it's supposed to be safe in Ada by default? As far as I can tell, basic allocated objects cannot be deallocated without an "unchecked" operation, and access objects created from "aliased" declarations are subject to scoping rules [0]. (If you want to know the full details, go figure out how "accessibility levels" are supposed to work [1].) It should preclude functions from returning access objects without a "prefix'Unchecked_Access" operation. I'm not sure how SPARK's borrowing system is tied into all of this.

[0] https://www.adaic.org/resources/add_content/docs/craft/html/...

[1] http://www.ada-auth.org/standards/22rm/html/RM-3-10-2.html


That's for String, not the referenced Unbounded String.


It's a slice, with basically the same implementation as std::string_view. [0]

If you want a copy, with Unbounded String, you'll need to call To_String on the slice.

[0] https://sites.radford.edu/~nokie/classes/320/std_lib_html/ad...


From the documentation you linked, it seems slicing creates a brand new String, which is more like std::string than std::string_view. In other words, it allocates and copies all of the string's characters (although it might allocate on the stack and copy those bytes there).

Also, the Unbounded_String owns its copy of the data, as opposed to referencing other data. The difference with String seems to be just that it can grow. It's still more like std::string than std::string_view.

Note that both std::string and std::string_view are essentially just a pointer and a length (std::string also has a capacity but let's ignore that). The difference is that trying to duplicate a std::string will end up duplicating (deep-copying) the data behind the pointer as well, whereas duplicating a std::string_view will not.
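
The same distinction exists between Rust's `String` and `&str`, which is easier for me to sketch. The helper names below are made up for illustration:

```rust
/// Duplicates an owned string: allocates a new buffer and copies the bytes.
fn duplicate_owned(s: &String) -> String {
    s.clone()
}

/// Duplicates a view: copies only the (pointer, length) pair, not the bytes.
fn duplicate_view(s: &str) -> &str {
    s
}
```

After `duplicate_owned`, the copy's `as_ptr()` differs from the original's. After `duplicate_view`, the pointer is unchanged, and a `&str` itself is just two machine words.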

Could you help me understand/interpret that link the same way you do, in case I'm missing something?


Allocating a new String requires... "new String". You issue the "new" command, or the source does.

But what Unbounded does is... "U.Reference (U.First .. U.Last)". It returns a reference. It's not duplicating, because that would defeat the point of its entire existence. It's the buffer, containing one or more string objects, and you're just slicing a reference out of it - because that's the point.

If you want a String, you need to allocate one.

    function Slice
      (Source : in Unbounded_String;
       Low    : in Positive;
       High   : in Natural)
       return   String;

For this - there is no `out` marked. What you're grabbing is part of Unbounded, and the compiler won't let you deallocate it, because it's owned, as a reference, by the Unbounded String.

For example, here's the actual GNAT source of the function [1]:

    function Slice
      (Source : Unbounded_String;
       Low    : Positive;
       High   : Natural) return String
    is
    begin
       --  Note: test of High > Length is in accordance with AI95-00128

       if Low - 1 > Source.Last or else High > Source.Last then
          raise Index_Error;
       else
          return Source.Reference (Low .. High);
       end if;
    end Slice;

A pre-existing reference is returned. There is no allocation that happens whatsoever.

[1] https://github.com/gcc-mirror/gcc/blob/master/gcc/ada/libgna...


I tried to get some sort of proof based on Godbolt, to see if it generates any memcpy's, but I couldn't manage to do that after quite a few tries. :(

It's really difficult to understand this given how little I know about Ada, so the best I can do is to keep throwing questions at ChatGPT. And I keep getting results that go against what you said.

I've also tried a couple direct review examples from o4-mini-high, one without the documentation link [1] and one with it [2].

It matches what I've managed to learn as well. I know how LLMs work and that they hallucinate a lot, so I can't tell who here is wrong, since you seem to be really experienced, and I barely know anything... what are your thoughts?

Oh, and I really appreciate you walking me through this! Like, a lot a lot! Thank you very much!

[1] https://chatgpt.com/share/684341e3-8e20-8012-b8d0-9847742af9...

[2] https://chatgpt.com/share/68434538-fe24-8012-bdb7-b07db34371...


Never, ever, throw any non-mainstream language at an LLM. You will get absolutely nothing but bullshit back. They do not comprehend, and so they cannot move away from generalisation to actually speak about the language. There is not a large enough public corpus to train the model.

If you're throwing something to a con-artist, don't be surprised if everything you get back does not line up with reality.

EDIT:

Here's a simple example on godbolt: https://godbolt.org/z/hcxhvGnTc

As you can see, it just returns a pointer. No copying happens. There's a different length marked along with the pointer, but that's it.


I see it does what you're saying on the "else" branch, but that just returns the previous string unmodified, which makes sense. The more important one is the "if" branch though.

Looking at your Godbolt example's assembly, even if I add -O3, I see it does a call to "ada__strings__unbounded__unbounded_slice", but until I know the contents of that I can't say whether the pointer it returns is derived from the same allocation as the original one, or from a new allocation that the string was copied to.

You're using the Unbounded_Slice function [1], which calls To_Unbounded_String [2], which calls `new String` [3], which, as you mentioned in a previous comment, will allocate, right?

The kind of operation I'm looking for is something like a "Trim_Whitespace" function that re-uses the old allocation without copying all the data, even when there is whitespace to trim.

[1] https://github.com/gcc-mirror/gcc/blob/master/gcc/ada/libgna...

[2] https://github.com/gcc-mirror/gcc/blob/master/gcc/ada/libgna...

[3] https://github.com/gcc-mirror/gcc/blob/master/gcc/ada/libgna...


It returns a new pointer, with the same buffer, as I said already. "There's a different length marked along with the pointer, but that's it."

You need a slice, which has a different length. That is how you do it, without a new allocation.

It's effectively:

    #include <stdlib.h>

    struct String {
      size_t length;
      char *buffer;
    };

    struct String *Unbounded_Slice(struct String *original, size_t Low, size_t High) {
      struct String *slice = malloc(sizeof(struct String));

      // Bounds checking would go here...

      // Ada indices are 1-based and the range Low .. High is inclusive
      slice->buffer = original->buffer + (Low - 1);
      slice->length = High - Low + 1;
      return slice;
    }

(The exact definition of a string is implementation-defined. But that's the concept.)

Ada enforces safe ranges, which means you need to carry the length of the slice somehow. It does not use C's 0-terminated strings. So slicing does not work the same way as strtok or other self-modifying systems - the length isn't guessed, it's known.

But if you change one character in the buffer of the slice, it'll be changed in the original Unbounded_String too.

For trimming whitespace, you're right that Unbounded's standard Trim may reallocate. It carries multiple buffers, and when you Trim sometimes it will just hand it back, other times it'll reallocate. [0] Mostly for performance tradeoff. Keeping the original can make iteration slower, as it holds multiple buffers.

So, to implement our own - with one caveat. Slice can't handle 0-length, because range safety is enforced. So in the case of a wholly whitespace string, we'll be doing a whole new allocation.

    -- This line is just for pasting into godbolt
    pragma Source_File_Name (NTrim, Body_File_Name => "example.adb");

    with Ada.Strings.Unbounded;
    with Ada.Strings.Maps;
    use Ada.Strings.Unbounded;

    function NTrim(Source : Unbounded_String) return Unbounded_String is
        Len : constant Natural := Length(Source);
        First, Last : Natural;
        Whitespace : constant Ada.Strings.Maps.Character_Set := Ada.Strings.Maps.To_Set(" " & ASCII.HT & ASCII.LF & ASCII.CR);
    begin
        if Len = 0 then
           return Source;
        end if;

        -- Skip leading whitespace
        First := 1;
        while First <= Len and then Ada.Strings.Maps.Is_In(Element(Source, First), Whitespace) loop
           First := First + 1;
        end loop;

        -- Skip trailing whitespace
        Last := Len;
        while Last >= First and then Ada.Strings.Maps.Is_In(Element(Source, Last), Whitespace) loop
           Last := Last - 1;
        end loop;

        -- Nothing but whitespace: return a fresh empty string
        if First > Last then
           return To_Unbounded_String("");
        end if;

        declare
           Trimmed_Length : constant Natural := Last - First + 1;
        begin
           -- Note: as written, a result of three or more characters
           -- is cut down to its first three characters
           if Trimmed_Length >= 3 then
              return Unbounded_Slice(Source, First, First + 2);
           else
              return Unbounded_Slice(Source, First, Last);
           end if;
        end;
    end NTrim;

The resulting compilation [1] has a few things. Our whitespace map gets allocated and deallocated most of the time. A map is harder to treat as a constant, and the compiler doesn't always optimise that nicely. Most of the code is bounds checking. No off-by-one allowed here. Where First is greater than Last, you get a new full allocation.

[0] https://github.com/gcc-mirror/gcc/blob/master/gcc/ada/libgna...

[1] https://godbolt.org/z/x8Erhqn5n


A few days ago, I had ChatGPT compare Rust and Ada. It tended to penalize Ada for its runtime checks and access values (aka pointers). However, ChatGPT didn't account for the fact that many of Ada's runtime checks would need to be manually implemented by developers in other languages. An Ada compiler can often optimize these checks away, knowing where they're genuinely needed and where they can be removed. This often explains why speed comparisons between C and Ada code can be misleading, as they rarely factor in the extra manual effort required to make C code equivalently robust with the necessary safety checks.

Regarding access values, I listed out some of Ada's various restrictions. Its scope rules prevent referencing objects at deeper levels, objects must be explicitly marked aliased to create an access value to them, and there's far less need for access values (for instance, no pointers are needed to pass parameters by reference). Additionally, Ada offers the ability to dynamically size some objects and reclaim their memory without explicit memory allocation.

After I highlighted these details, ChatGPT admitted it had unfairly evaluated Ada, concluding it's a very safe and robust language, albeit using different techniques than Rust.


FFS, ChatGPT doesn't even have a clue what it is talking about.


People often have no clue either, which explains why LLMs are so successful.



