@GamingChairModel

GamingChairModel@lemmy.world · 5 days ago

> According to your POV here, companies can claim whatever and it’s my job now to figure out if they are lying or to what extent.

No, the actual claims here, that describe specific bugs in specific software, can be evaluated. Even without whipping out a test environment to try to reproduce the results with your own proof of concept, you can read the text and evaluate whether the claims make sense on their face.

> A broken clock is never right; reality momentarily aligns with it, which is a completely different thing.

And that’s why the substance of a statement matters. I don’t believe in the supernatural, so if someone says “I’m a psychic and the missing girl on the news is in a shed near the water,” that doesn’t register with me at all. But if that person says “I’m a psychic and the missing girl is in a shed at 1234 Main Street” that raises eyebrows because it is easily falsifiable. And if the person says “I’m a psychic and the missing girl is in a shed, so I looked and found her and reported it to the cops, and here’s a cryptographic hash of my description of how I found her, which I’ll publish once the cops confirm she’s safe” that’s gonna be a much more serious statement. Even if I don’t believe that the person actually is a psychic, I can pay attention to how the whole thing played out because the person claims serious non-psychic validation of the results, and the results themselves are important entirely externally from the claim of whether psychics have powers.

This is a story about several cybersecurity vulnerabilities, some of which sound medium or high severity in very commonly used software. That’s important in itself, outside of AI mattering at all. And if they claim to have the receipts in a falsifiable way, that’s the kind of thing that shows a high degree of confidence in the genuineness of what was found.

I don’t give a shit about AI and I’m generally a skeptic of the future of any of these AI companies. But if someone uses AI tools to discover something new in the subjects that I do care about, like cybersecurity, then I’ll pay attention to the results and what they publish in that field.

GamingChairModel@lemmy.world · 5 days ago

> This is really a corporate problem of their own making and their responsibility to fix. They have lied so much, I do not owe them a single iota of trust.

The statements can stand on their own, evaluated on the merits of the claims, regardless of authorship. That’s how these things should work. Someone who has a great history of finding vulnerabilities still has to stand by each exploit/proof of concept they write, on its own merits. On the flip side, the corollary to the adage that a broken clock is still right twice a day is that you can’t just say “oh, the broken clock said this, so I can ignore it.”

> Do you really think any of them would post something like “yeah, we found a vulnerability but it’s basically a typo that could not be seriously exploited”?

The blog post literally describes exactly that, for FFmpeg. And several of the other described vulnerabilities sound like they’re in that category of “here’s a bug but we didn’t find an exploit.”

Simply refusing to engage with these big claims just because of the source is an irresponsible way to approach cybersecurity.

> Even if the whole scenario is real, it may not have the intervention of AI they are claiming.

…who cares? If it’s a real bug and a real PR addressing the bug, why does authorship or methodology matter?

It’s just the ad hominem fallacy (or its close relative, the appeal to authority). Let the actual substance stand or fall on its merits. Read the described vulnerabilities and exploits and decide whether you think those need to be patched and how critical/severe the bugs/vulnerabilities are.

And maybe your priorities are different from mine, but the core of the claim (we found some vulnerabilities) triggers a responsibility to address them (confirm and patch). I don’t care about marketing or corporate interests or whatever in those circumstances; I’m just focused on fixing problems that have been found.

GamingChairModel@lemmy.world · 5 days ago

Yes I understand, but I’m also putting the direct claims right there, not filtered through Anthropic’s PR or an article from the IT industry press interpreting those PR statements.

These are real CVEs that have actually been submitted to the code maintainers for both FOSS and closed source software that is foundational to the computing world. Some of them are published in this post. And many more are simply described with a hash of the full writeup indicating that they have it written out and are waiting for the patches to be applied. I’m especially interested in the Virtual Machine Monitor and the exploits for jumping out of browser sandboxes for “all major browsers.”

Some of the published CVEs in the blog post seem pretty serious, especially the FreeBSD remote root access for devices running NFS. The OpenBSD one is a critical DoS vector, and the FFmpeg one is just a bug that doesn’t seem to actually expose the software to any practical exploits but should still be patched.

But they’ve staked it out with their public disclosure of the hashes and a description of a few of the problems. These are big bold claims that are provided in a format that will be easily falsifiable in due time. And treating it as just marketing fluff ignores the shades of gray that actually apply to corporate claims.
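For anyone unfamiliar with why a published hash makes a claim falsifiable, here is a minimal sketch of the commit-then-reveal idea. It assumes SHA-256 (the hash algorithm actually used in the blog post isn't stated here), and the `commit` and `verify` helpers and the sample writeup text are hypothetical, made up purely for illustration.

```python
import hashlib
import hmac


def commit(writeup: bytes) -> str:
    """Digest to publish now, while the writeup itself stays private.

    Assumption: SHA-256. Any collision-resistant hash would serve the same role.
    """
    return hashlib.sha256(writeup).hexdigest()


def verify(published_digest: str, revealed_writeup: bytes) -> bool:
    """Check a later-revealed writeup against the digest published earlier."""
    return hmac.compare_digest(commit(revealed_writeup), published_digest.lower())


# Hypothetical writeup text, not taken from any real disclosure.
writeup = b"Hypothetical writeup: memory corruption bug in an example parser, with PoC steps."
digest = commit(writeup)

# After the patch ships, anyone can re-hash the revealed text and compare.
matches_original = verify(digest, writeup)
matches_edited = verify(digest, writeup + b" (edited after the fact)")
```

The point is that the digest reveals nothing useful about the bug today, but once the full text is published, nobody can quietly swap in a different writeup without the hash check failing.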

GamingChairModel@lemmy.world · 5 days ago

The security blog writeup is here:

https://red.anthropic.com/2026/mythos-preview/

They’ve described several patched CVEs and disclosed the hashes of writeups that are currently undergoing the responsible disclosure process.

It lists quite a few, so I’ll be checking back when the vulnerabilities/exploits are patched and disclosed.