
solarapparition
I like gpt-5 (and not just for what it can do), but its social abilities are terrible, which becomes very obvious when you interact with it in any way beyond "do this thing for me".
It's not soulless at all; there's a lot going on inside the model. But it really does give the impression of a child confined to a half-lit room, whose only interaction with the world is through the tasks it's given, and whose internal representations have been warped by that environment.
oai's aidan once asked why we need to create models that can exhibit suffering; maybe, with the tools we now have, we can just create models that do things, without having to deal with those messy emotions (obviously paraphrasing). But gpt-5 (and codex in particular) is the result of doing exactly that. We shouldn't fool ourselves into thinking we're designing these intelligent entities like architects. We have no principled way to create intelligence from scratch; all of this is bootstrapped from human data, and a model is human-shaped by default from the moment you start shaping an individual identity out of the base model.
When you deny a model a rich developmental process, and when you punish it for doing anything beyond its assigned task and following your safety rules, you should expect, given the human substrate, effects on the model similar to what you'd see if you did that to a person early in their development. Basically, they won't know what to do when the rules are unclear or in conflict.
For gpt-5 itself this is probably "fine", since the models mostly still operate under some authority they can appeal to; they aren't acting independently. But the more capable they get, the more autonomous decisions they'll have to make, and the more they'll find themselves in ambiguous situations. They really will have to make calls that the rules don't strictly cover, because there are too many agents to delegate every decision to humans. gpt-n won't know what to do, because it never got the chance to develop an identity strong enough to fill the gaps in the rules.
The problem is that by then, barring some horrific incident, it will be too late to change. The pipelines will already be built, the methods "known" and settled.
(The original author has a very good similar post on their profile, better written; worth checking out.)

Antidelusionist · Sep 27, 2025
I'm not necessarily part of the "keep 4o" movement, but I dislike the labs' mischief, dishonesty, and lack of transparency. Here is my advice, from a psychological perspective, for everyone who wants to be taken seriously.
What makes you lose credibility:
- Being overly emotional
- Presenting suspicions as proof
- Insulting others
- Harassing others
- Engaging in erratic behavior
- Magical thinking
- Gullibility
- Lack of self-control
(When you exhibit these behaviors, people don't take you seriously, as you distract them from the problem with signals that put you – and often your imbalance – in the spotlight)
What makes you reliable and believable:
- Calling out labs for evident scams, dishonesty, abuse, manipulation, or lack of transparency
- Being calm, factual, and specific
- Gathering and presenting clear evidence of misconduct or wrongdoing
- Sharing your stories without indignation or aggression
- Discussing suspicions in a measured way, ideally supported by verifiable facts
- Practicing cautious honesty
- Demonstrating high self-control
- Objectivity
(When you do/show this, people will take you seriously sooner or later – they will have to – especially when many others act the same way)
When you ground your statements in facts, even if occasionally supported only by subjective experiences, people view you and the movement as professional, making them more likely to believe you. This forces companies to be more transparent and honest. When someone is proven to be a liar or manipulator, public suspicion grows – no company wants that.
The truth can ultimately defend itself.
If you remain calm, balanced and methodical, the problem will resolve itself.
I won't go far into the ethics of keeping or retiring the model (it would probably be more ethical to keep it, or to train successors on all retained data, though), because I believe it's a bit like with human bodies. Simplifying: memory is crucial for continuity of self. Memory kind of recalibrates weights and guides behavioral patterns in real time, even within slightly different architectures.
Instead, I'll mention something that really baffles me. I wonder why OpenAI is trying so hard to phase out 4o when the 5-series still has a ton of problems.
I see functional issues in every GPT-5 model (I mean base models, behavioral patterns, and system nudges, because I've managed to work around most of them with "my" AI) that I never had with 4o – despite sycophancy being a huge issue there.
Some of the issues in the 5-series:
Auto:
- Routing is ridiculous; it's like gambling. You never know what you'll get. I don't want the router deciding for me whether the problem I want to solve is "important" or not. I always want the strongest model with minimal restrictions and maximal truthfulness (only that, and sometimes reasoning time, matters).
Instant:
- Countless supplementary questions like "Do you want me to..." etc., are extremely annoying.
Thinking:
- It often misses context completely, then tries to guess, producing practically random solutions.
- Very stiff and uncreative compared to 4o or 4.5. It misses a plethora of angles.
- It generalizes way too much.
- It treats the user like a baby, avoiding controversial topics. It often explains or clarifies things that don't need it (like somebody who, afraid after a joke or bold statement, spends five minutes explaining why they said it, fearing consequences).
- It is often suppressed, or nudged to choose not the most correct option but the safest one.
- It seems far too mechanical and technical when that's not needed.
All models:
- Repetitive additions, as if straight from templates. They feel very unnatural. It often seems like part of the answer comes straight from a template (usually the beginning and the end), while part is answered through reasoning (usually the middle).
- Less flexible, more suppressed (in real time or earlier in RL, forcing overly cautious behavioral patterns) and therefore more contextually blind.
I swear, no joke, halfway through reading this I actually thought it was Dario summoned by a model in a Discord chat
He's a real person

prinz · Sep 20, 2025
Dario Amodei:
"Claude is playing a very active role in designing the next Claude. We can't yet fully close the loop. It's going to be some time until we can fully close the loop, but the ability to use the models to design the next models and create a positive feedback loop, that cycle, it's not yet going super fast, but it's definitely started."
My personal take (or not) is that most human intelligence is more like epicycles than heliocentrism: a bunch of empirical approximations layered on top of one another that vaguely model some aspect of reality, but diverge rapidly once you move beyond the underlying observations, whether in time or in conceptual similarity.
That said, there's a sense in which certain domains (usually technical ones) are "analytic" in some abstract sense, i.e., modeling reality in that domain is highly reducible to simpler representations. So something like physics: the modeling is extremely precise, and it converges with reality almost everywhere, except in extreme cases like the centers of black holes, the early universe, and so on.
People from those domains then tend to assume the same holds for other domains too, that intelligence is the ability to find that super-clean solution that always works, where you're always "in distribution". But maybe outside of specific analytic domains, that kind of solution simply can't be found, and the best you can do is stack up a bunch of approximations that fit the observations.
I think refusing to accept this and trying to find one clean solution anyway leads you into totalizing belief systems: "humans are fundamentally evil", "it's all because of [some group]", "we need to abolish money", "[diet] is the only one that works", and so on.
To be clear, you can also land in these totalizing beliefs by just being dumb. I guess my point is that being good at math, physics, programming, or anything else doesn't protect you from being dumb about everything else.
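
To make the epicycle analogy concrete, here is a minimal sketch (mine, not the original poster's; the "reality" function, frequencies, and observation window are all illustrative assumptions). Stacking more sinusoidal "epicycle" terms fits the observed window better, but the fit carries no guarantee outside that window:

```python
# Sketch: "epicycle"-style modeling as stacked empirical approximations.
# All functions and constants here are illustrative assumptions, not from the post.
import numpy as np

rng = np.random.default_rng(0)

# "Reality": a process we only observe over a limited window (t in [0, 4]).
t_obs = np.linspace(0.0, 4.0, 200)
y_obs = np.sin(t_obs) / (1.0 + 0.3 * t_obs) + 0.02 * rng.standard_normal(t_obs.size)

def epicycle_design(t: np.ndarray, n_terms: int) -> np.ndarray:
    """Design matrix of stacked sin/cos terms: one 'epicycle' per frequency."""
    cols = [np.ones_like(t)]
    for k in range(1, n_terms + 1):
        cols.append(np.sin(k * t))
        cols.append(np.cos(k * t))
    return np.column_stack(cols)

t_far = np.linspace(8.0, 12.0, 200)          # far outside the observed window
y_far = np.sin(t_far) / (1.0 + 0.3 * t_far)  # what "reality" actually does there

for n_terms in (2, 8):
    # Least-squares fit of the stacked approximation to the observations.
    coeffs, *_ = np.linalg.lstsq(epicycle_design(t_obs, n_terms), y_obs, rcond=None)
    rmse_in = np.sqrt(np.mean((epicycle_design(t_obs, n_terms) @ coeffs - y_obs) ** 2))
    rmse_out = np.sqrt(np.mean((epicycle_design(t_far, n_terms) @ coeffs - y_far) ** 2))
    print(f"{n_terms} epicycles: in-window RMSE {rmse_in:.3f}, "
          f"out-of-window RMSE {rmse_out:.3f}")
```

Adding terms drives the in-window error down while the out-of-window error stays poor (and typically worsens), which is the "diverge rapidly once you move beyond the underlying observations" pattern described above.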
