UseBB Support Forum

Official support for the UseBB 1 forum system

Where to change the charset?

Author Post
Member
Registered: Aug 2007
Posts: 19
Location: Poland
I want to change the forum's charset (in the meta tag) from iso-8859-1 to iso-8859-2 or UTF-8.

The problem is that I don't know which file I have to edit.
Developer
Registered: Apr 2004
Posts: 2310
Location: Belgium
The character set is part of the language (translation). Thus it is set in the lang_x.php file.

Note that multibyte encodings (Unicode, UTF) are not officially supported and not always guaranteed to work.
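
To illustrate where that value ends up, here is a minimal sketch. The key name `character_encoding` is the one quoted later in this thread; the layout of the language file and both helper functions are assumptions made for the example, not UseBB's actual code.

```php
<?php
// Sketch only: a language file defines the encoding for its translation.
// The key 'character_encoding' comes from this thread; everything else
// here is assumed for illustration.
$lang = array();
$lang['character_encoding'] = 'iso-8859-2';

// Hypothetical helper: value for the HTTP Content-Type header.
function content_type_header($lang)
{
    return 'Content-Type: text/html; charset=' . $lang['character_encoding'];
}

// Hypothetical helper: the matching <meta> tag for the page head.
function content_type_meta($lang)
{
    $charset = htmlspecialchars($lang['character_encoding'], ENT_QUOTES);
    return '<meta http-equiv="content-type" content="text/html; charset='
        . $charset . '" />';
}

// In a real page the header would be sent before any output:
// header(content_type_header($lang));
echo content_type_meta($lang), "\n";
```

Deriving both the header and the meta tag from the same setting keeps them from contradicting each other.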
Member
Registered: Aug 2007
Posts: 19
Location: Poland
Is iso-8859-2 guaranteed to work?
Developer
Registered: Apr 2004
Posts: 2310
Location: Belgium
As this is a single-byte encoding: yes.
Member
Registered: Jun 2005
Posts: 31
Location: Krefeld
Using UTF-8 is really not that complicated, and if all parts of the software "speak" it, you have solved a lot of charset problems. As far as I can see, UseBB is not fully UTF-8 enabled, so here is a quick guide:

First you need to set a meta tag; if one is already there, change it to this:
<meta http-equiv="content-type" content="text/html;charset=UTF-8" />

The next step is to extend all form-tags:
<form accept-charset="utf-8" action="[...]" method="post">

GET forms need this addition too. Sending a proper HTTP header also helps:
<?php header('Content-Type: text/html; charset=UTF-8'); ?>

Finally, to keep all mail programs happy, extend your email headers:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

All three lines are required; otherwise products like MS Outlook (Express) might not display the message properly.
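
As a rough sketch, the three headers can be assembled into the string that PHP's mail() takes as its fourth argument. The function name and the example address are made up for illustration.

```php
<?php
// Hypothetical helper: join the three headers from the post with CRLF,
// the line separator that mail headers use.
function utf8_mail_headers()
{
    $headers = array(
        'MIME-Version: 1.0',
        'Content-Type: text/plain; charset=UTF-8',
        'Content-Transfer-Encoding: 8bit',
    );
    return implode("\r\n", $headers);
}

// Usage (placeholder address; mail() is deliberately not called here):
// mail('user@example.com', $subject, $body, utf8_mail_headers());
```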

Switching all database tables to UTF-8 (in MySQL, for example, the utf8_general_ci collation) is also advisable.
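
Assuming a MySQL database, the table switch could be sketched like this. The function and the table names are hypothetical, and the script only builds the statements; it does not run them.

```php
<?php
// Hypothetical helper: build one ALTER TABLE statement per table name.
// CONVERT TO CHARACTER SET also rewrites the column data from the table's
// currently declared character set, so take a backup first.
function build_convert_statements(array $tables)
{
    $statements = array();
    foreach ($tables as $table) {
        // Backtick-quote the identifier; double any embedded backtick.
        $quoted = '`' . str_replace('`', '``', $table) . '`';
        $statements[] = 'ALTER TABLE ' . $quoted
            . ' CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci';
    }
    return $statements;
}

// Made-up table names, for illustration only.
foreach (build_convert_statements(array('forum_posts', 'forum_topics')) as $sql) {
    echo $sql, ";\n";
}
```

Note that this does not touch text that was stored as HTML entities; those stay entities until they are decoded separately.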

Finally, developers and translators should set UTF-8 in their IDEs/translation programs to avoid strange things.

By the way, using entities also solved a lot of trouble in my project.

Hope this helps you a little.
Roland
« Last edit by Quix0r on Sat Aug 14, 2010 3:17 pm. »
Member
Registered: Jun 2005
Posts: 31
Location: Krefeld
Okay, the first and third steps are already done. For the second step, extending the form tag, do this:
- Search for all files containing <form action
- Replace it with:
<form accept-charset="'.$lang['character_encoding'].'" action

The email step is already done too. :)
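
The search-and-replace step can be sketched as a small function working on the contents of one file. The function name is made up; a real run would loop over the files with file_get_contents() and file_put_contents() and keep backups.

```php
<?php
// Hypothetical helper: insert the accept-charset attribute into every
// "<form action" found in a piece of PHP source. The replacement text is
// the one from the post; it relies on the markup sitting inside a
// single-quoted PHP string, so the concatenation is evaluated at run time.
function add_accept_charset($source)
{
    $search  = '<form action';
    $replace = '<form accept-charset="\'.$lang[\'character_encoding\'].\'" action';
    return str_replace($search, $replace, $source);
}

$before = '<form action="post.php" method="post">';
echo add_accept_charset($before), "\n";
```

Because a patched tag starts with `<form accept-charset`, it no longer matches the search string, so running the replacement twice does no harm.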

@Dietrich: Please include this fix in your next release; it will really solve trouble with badly configured browsers.
« Last edit by Quix0r on Sat Aug 14, 2010 3:40 pm. »
Developer
Registered: Apr 2004
Posts: 2310
Location: Belgium
Well, the position I took some years ago was not to officially support UTF-8. I know you can make those changes, but the real problem is that PHP itself does not handle multibyte (Unicode) strings well. Just try to use substr() on a multibyte string. PHP's string functions treat everything as single-byte.
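
A small demonstration of the substr() problem, using a Polish word as sample data. The word is written as raw UTF-8 bytes so the result does not depend on how the file itself is saved.

```php
<?php
// "żółw" (Polish for "turtle") as raw UTF-8 bytes:
// ż = C5 BC, ó = C3 B3, ł = C5 82, w = 77.
$word = "\xC5\xBC\xC3\xB3\xC5\x82w";

$bytes = strlen($word);        // counts bytes, not characters
$first = substr($word, 0, 1);  // cuts the two-byte "ż" in half

echo $bytes, "\n";             // 7, although the word has 4 characters
echo bin2hex($first), "\n";    // c5, an incomplete UTF-8 sequence

// With the optional mbstring extension the same operations work per
// character, as long as the encoding is passed explicitly.
if (function_exists('mb_substr')) {
    echo mb_strlen($word, 'UTF-8'), "\n";                 // 4
    echo bin2hex(mb_substr($word, 0, 1, 'UTF-8')), "\n";  // c5bc
}
```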

You can install mbstring, but it needs to be configured to replace the old implementations. You could also develop your own functions (as Drupal and others have done).
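
For illustration, custom functions of that kind could look roughly like this. It is a sketch that relies on PCRE's UTF-8 mode instead of mbstring; the function names are made up and this is not how Drupal or UseBB implement it.

```php
<?php
// Split a UTF-8 string into an array of characters. The "u" modifier makes
// PCRE treat the subject as UTF-8, so the empty pattern matches between
// characters rather than between bytes.
function utf8_chars($string)
{
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    return $chars === false ? array() : $chars; // false: invalid UTF-8 input
}

// Character count instead of byte count.
function utf8_strlen($string)
{
    return count(utf8_chars($string));
}

// Character-based substring; $length = null means "up to the end".
function utf8_substr($string, $start, $length = null)
{
    return implode('', array_slice(utf8_chars($string), $start, $length));
}

$word = "\xC5\xBC\xC3\xB3\xC5\x82w"; // "żółw" as raw UTF-8 bytes
echo utf8_strlen($word), "\n";       // 4
```

Splitting the whole string on every call is slow for long texts, which is part of why this counts as real work rather than a quick fix.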

I could put in custom string functions and use them instead (which is already some work), but to fully solve this major problem UseBB should also stop supporting the other encodings. Right now, when the page's encoding differs from the one the user types text in, the server receives the text as HTML entities. (I don't think adding accept-charset to forms will help. When it is not present, the browser should use the page's encoding according to the W3C documents, and Russian, for example, simply is not covered by the default ISO-8859-1.) So to avoid entities, the default and only encoding should always be usable for the user's language, whatever it is, and the only encoding that meets that requirement is UTF-8.

If I were to switch to UTF-8, all the databases would also have to be upgraded (all the posts converted, entities finally replaced by their true characters, etc.). This is simply a huge amount of work that I have postponed to the merge system of UseBB 2 (which will have a module to merge/convert UseBB 1 databases).
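
One part of such a conversion, replacing stored numeric entities by their true characters, could be sketched as follows. The function name is hypothetical, and the sketch assumes the surrounding text is already UTF-8 or plain ASCII.

```php
<?php
// Hypothetical helper: turn decimal numeric entities such as &#1044; into
// the UTF-8 bytes of the character. Named entities like &amp; are left
// alone. Caveat: an entity like &#60; becomes a literal "<", so the result
// still has to be escaped on output.
function decode_numeric_entities($text)
{
    return preg_replace_callback(
        '/&#[0-9]+;/',
        function ($match) {
            return html_entity_decode($match[0], ENT_QUOTES, 'UTF-8');
        },
        $text
    );
}

// "Да" (Russian for "yes") as a browser would submit it to an
// ISO-8859-1 page.
$stored = '&#1044;&#1072;';
echo bin2hex(decode_numeric_entities($stored)), "\n"; // d094d0b0
```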

The thing is that it "works" for now; the only gripes are the tedious implementation and checking for entities, and the fact that an entity takes several bytes to store and thus takes up more database space.
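
To put a number on the storage point, here is a quick comparison for a single Cyrillic letter, chosen only as sample data.

```php
<?php
// The letter "Д" (U+0414), once as the numeric entity a browser submits
// and once as raw UTF-8 bytes.
$as_entity = '&#1044;';
$as_utf8   = "\xD0\x94";

$entity_bytes = strlen($as_entity); // 7 bytes
$utf8_bytes   = strlen($as_utf8);   // 2 bytes

echo $entity_bytes, " vs ", $utf8_bytes, "\n";
```

So a post written entirely in Cyrillic takes roughly three to four times more space as entities than it would as UTF-8.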
« Last edit by Dietrich on Sat Aug 14, 2010 5:46 pm. »
