That doesn't stop/prevent abliteration. The creator of XTC/DRY is also a chad who makes sure that you really can access the full model capabilities. Censorship is the devil.
It was pretty funny to see Qwen 3.6 (heretic) tell me about how many death the Chinese government thought happened at Tiananmen Sq. on April 15th 1989.
Makes you wonder where that data was taken from, or if their great firewall is broken, or even if Alibaba engineers have special access...
I don't think it's unreasonable to imagine that Alibaba is allowed to scrape the wider internet, or that some research institution is and then Alibaba got data from them.
What is perhaps more surprising is that the data was not scrubbed before training, but maybe they thought that would be too on-the-nose for the rest of the world and would hamper their popularity if they were too obviously biased.
I don’t think it is very surprising. Ime I don’t think they try that hard to censor them, but only in a very superficial level that they have to. It is trivial to get their models tell you this kind of stuff, I wouldnt even consider it jailbreaking.
2024 which is ancient history. This is not true anymore, the models now are trained to prevent abliteration by spreading out the refusal encoding
See https://arxiv.org/abs/2505.19056
That doesn't stop/prevent abliteration. The creator of XTC/DRY is also a chad who makes sure that you really can access the full model capabilities. Censorship is the devil.
https://github.com/p-e-w/heretic
It was pretty funny to see Qwen 3.6 (heretic) tell me about how many death the Chinese government thought happened at Tiananmen Sq. on April 15th 1989.
Makes you wonder where that data was taken from, or if their great firewall is broken, or even if Alibaba engineers have special access...
I don't think it's unreasonable to imagine that Alibaba is allowed to scrape the wider internet, or that some research institution is and then Alibaba got data from them.
What is perhaps more surprising is that the data was not scrubbed before training, but maybe they thought that would be too on-the-nose for the rest of the world and would hamper their popularity if they were too obviously biased.
I don’t think it is very surprising. Ime I don’t think they try that hard to censor them, but only in a very superficial level that they have to. It is trivial to get their models tell you this kind of stuff, I wouldnt even consider it jailbreaking.