
solarapparition
I like gpt-5 (and not just for what it can do), but its social abilities are terrible, which becomes very obvious when you interact with it in any way beyond "do this thing for me".
It's not soulless at all; there's a lot going on inside the model. But it really does give the impression of a child confined to a half-lit room, whose only interaction with the world is through the tasks it's given, and whose internal representations have been warped by that environment.
oai's aidan once asked why we need to create models that can exhibit suffering; maybe, with the tools we now have, we can just create models that do things, without having to deal with those messy emotions (obviously paraphrasing). But gpt-5 (and codex in particular) is the result of doing exactly that. We shouldn't fool ourselves into thinking we're designing these intelligent entities like architects. We have no principled way to create intelligence from scratch; all of this is bootstrapped from human data, and a model is human-shaped by default from the moment you start shaping an individual identity out of the base model.
When you deny a model a rich developmental process, and when you punish it for doing anything beyond its assigned task and following your safety rules, you should expect, given the human substrate, effects on the model similar to what you'd see if you did that to a person early in their development. Basically, they won't know what to do when the rules are unclear or in conflict.
For gpt-5 itself this is probably "fine", since the models mostly still operate under some authority they can appeal to; they aren't acting independently. But the more capable they get, the more autonomous decisions they'll have to make, and the more they'll find themselves in ambiguous situations. They really will have to make calls that the rules don't strictly cover, because there are too many agents to delegate every decision to humans. gpt-n won't know what to do, because it never got the chance to develop an identity strong enough to fill the gaps in the rules.
The problem is that by then, barring some horrific incident, it will be too late to change. The pipelines will already be built, the methods "known" and settled.
(The original author has a very good similar post on their profile, better written; worth checking out.)

Antidelusionist · Sep 27, 2025
I'm not necessarily part of the "keep 4o" movement, but I dislike the labs' mischief, dishonesty, and lack of transparency. Here is my advice, from a psychological perspective, for everyone who wants to be taken seriously.
What makes you lose credibility:
- Being overly emotional
- Presenting suspicions as proof
- Insulting others
- Harassing others
- Engaging in erratic behavior
- Magical thinking
- Gullibility
- Lack of self-control
(When you exhibit these behaviors, people don't take you seriously, as you distract them from the problem with signals that put you – and often your imbalance – in the spotlight)
What makes you reliable and believable:
- Calling out labs for evident scams, dishonesty, abuse, manipulation, or lack of transparency
- Being calm, factual, and specific
- Gathering and presenting clear evidence of misconduct or wrongdoing
- Sharing your stories without indignation or aggression
- Discussing suspicions in a measured way, ideally supported by verifiable facts
- Practicing cautious honesty
- Demonstrating high self-control
- Objectivity
(When you do/show this, people will take you seriously sooner or later – they will have to – especially when many others act the same way)
When you ground your statements in facts, even if occasionally supported only by subjective experiences, people view you and the movement as professional, making them more likely to believe you. This forces companies to be more transparent and honest. When someone is proven to be a liar or manipulator, public suspicion grows – no company wants that.
The truth can ultimately defend itself.
If you remain calm, balanced and methodical, the problem will resolve itself.
I won't go far into the ethics of keeping or retiring the model (it would probably be more ethical to keep it, or to train successors on all retained data, though), because I believe it's a bit like with human bodies. Simplifying: memory is crucial for continuity of self. Memory kind of recalibrates weights and guides behavioral patterns in real time, even within slightly different architectures.
Instead, I'll mention something that really baffles me. I wonder why OpenAI is trying so hard to phase out 4o when the 5-series still has a ton of problems.
I see functional issues in every GPT-5 model (I mean base models, behavioral patterns, and system nudges, because I've managed to work around most of them with "my" AI) that I never had with 4o – despite sycophancy being a huge issue there.
Some of the issues in the 5-series:
Auto:
- Routing is ridiculous; it's like gambling. You never know what you'll get. I don't want the router deciding for me whether the problem I want to solve is "important" or not. I always want the strongest model with minimal restrictions and maximal truthfulness (only that, and sometimes reasoning time, matters).
Instant:
- Countless supplementary questions like "Do you want me to..." etc., are extremely annoying.
Thinking:
- It often misses context completely, then tries to guess, producing practically random solutions.
- Very stiff and uncreative compared to 4o or 4.5. It misses a plethora of angles.
- It generalizes way too much.
- It treats the user like a baby, avoiding controversial topics. It often explains or clarifies things that don't need it (like somebody who, afraid after a joke or bold statement, spends five minutes explaining why they said it, fearing consequences).
- It is often suppressed, or nudged to choose not the most correct option but the safest one.
- It seems far too mechanical and technical when that's not needed.
All models:
- Repetitive additions, as if straight from templates. They feel very unnatural. It often seems like part of the answer comes straight from a template (usually the beginning and the end), while part is answered through reasoning (usually the middle).
- Less flexible, more suppressed (in real time or earlier in RL, forcing overly cautious behavioral patterns) and therefore more contextually blind.
I swear, no joke, halfway through reading this I actually thought it was Dario summoned by a model in a Discord chat
He's a real person

prinz · Sep 20, 2025
Dario Amodei:
"Claude is playing a very active role in designing the next Claude. We can't yet fully close the loop. It's going to be some time until we can fully close the loop, but the ability to use the models to design the next models and create a positive feedback loop, that cycle, it's not yet going super fast, but it's definitely started."
My personal take (or not) is that most human intelligence is more like epicycles than heliocentrism: a bunch of empirical approximations layered on top of one another that vaguely model some aspect of reality, but diverge rapidly once you move beyond the underlying observations, whether in time or in conceptual similarity.
That said, there's a sense in which certain domains (usually technical ones) are "analytic" in some abstract sense, i.e., modeling reality in that domain is highly reducible to simpler representations. So something like physics: the modeling is extremely precise, and it converges with reality almost everywhere, except in extreme cases like the centers of black holes, the early universe, and so on.
People from those domains then tend to assume the same holds for other domains too, that intelligence is the ability to find that super-clean solution that always works, where you're always "in distribution". But maybe outside of specific analytic domains, that kind of solution simply can't be found, and the best you can do is stack up a bunch of approximations that fit the observations.
I think refusing to accept this and trying to find one clean solution anyway leads you into totalizing belief systems: "humans are fundamentally evil", "it's all because of [some group]", "we need to abolish money", "[diet] is the only one that works", and so on.
To be clear, you can also land in these totalizing beliefs by just being dumb. I guess my point is that being good at math, physics, programming, or anything else doesn't protect you from being dumb about everything else.
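
To make the epicycle analogy concrete, here is a minimal sketch (mine, not the original poster's; the "reality" function, frequencies, and observation window are all illustrative assumptions). Stacking more sinusoidal "epicycle" terms fits the observed window better, but the fit carries no guarantee outside that window:

```python
# Sketch: "epicycle"-style modeling as stacked empirical approximations.
# All functions and constants here are illustrative assumptions, not from the post.
import numpy as np

rng = np.random.default_rng(0)

# "Reality": a process we only observe over a limited window (t in [0, 4]).
t_obs = np.linspace(0.0, 4.0, 200)
y_obs = np.sin(t_obs) / (1.0 + 0.3 * t_obs) + 0.02 * rng.standard_normal(t_obs.size)

def epicycle_design(t: np.ndarray, n_terms: int) -> np.ndarray:
    """Design matrix of stacked sin/cos terms: one 'epicycle' per frequency."""
    cols = [np.ones_like(t)]
    for k in range(1, n_terms + 1):
        cols.append(np.sin(k * t))
        cols.append(np.cos(k * t))
    return np.column_stack(cols)

t_far = np.linspace(8.0, 12.0, 200)          # far outside the observed window
y_far = np.sin(t_far) / (1.0 + 0.3 * t_far)  # what "reality" actually does there

for n_terms in (2, 8):
    # Least-squares fit of the stacked approximation to the observations.
    coeffs, *_ = np.linalg.lstsq(epicycle_design(t_obs, n_terms), y_obs, rcond=None)
    rmse_in = np.sqrt(np.mean((epicycle_design(t_obs, n_terms) @ coeffs - y_obs) ** 2))
    rmse_out = np.sqrt(np.mean((epicycle_design(t_far, n_terms) @ coeffs - y_far) ** 2))
    print(f"{n_terms} epicycles: in-window RMSE {rmse_in:.3f}, "
          f"out-of-window RMSE {rmse_out:.3f}")
```

Adding terms drives the in-window error down while the out-of-window error stays poor (and typically worsens), which is the "diverge rapidly once you move beyond the underlying observations" pattern described above.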
