Able to encode utf-16 or something in builder view?

Hello,

I am a new user of Psychopy and conducting some experiments.
I used Korean characters in the text and in the data file, they were all broken.
I think that’s because of utf-8, default encoding in psychopy.
Although it doesn’t have a problem on psychopy view, it happens on excel file.
Can I adjust the encoding from utf-8 to utf-16 or others in Builder view?
Or should I adjust in coder view?
Please give me some advice. Thank you.

Best,
Nayeon

Hi Nayeon,

I think UTF8 should be able to handle Korean properly so not sure that pursuing a change to UTF16 will necessarily be fruitful here.

• What happens if you use a plain text .csv file instead of an Excel file, with its encoding set to UTF-8?

• Are you able to post a small example experiment plus conditions file here that demonstrates the problem?

Thank you, Michael

When I opened the data file (.csv), all the Korean characters were broken, both in a plain text editor and in Excel.
I have attached screenshots of both.

I also tried saving the data file as a new file from the text editor, encoded as ANSI or Unicode, but it didn’t work.
In PsychoPy it looks OK, though.

Before you do anything else make sure you are working with copies of your data, because saving in the wrong format could put you in a worse situation.

UTF-8 definitely can handle Korean characters, and I wouldn’t assume anything about what encoding excel is using to open the file as it can sometimes guess wrong.

I would imagine the issue is simply telling Excel to use a certain encoding when you open the file. In this <a href="http://discourse.psychopy.org/t/encoding-data-in-excel/1059/4?u=daniel.riggs1">previous post</a>, I pointed to a Microsoft help page and repeated instructions for how to do this. Hopefully it’s as simple as that, because beyond this the source of this problem could prove difficult to track down.

Keep in mind that if you aren’t opening the file in the correct encoding, you can’t successfully convert it; converting will just produce a different garbled mess.
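One way to test this without touching the original: Excel generally treats a CSV that begins with a UTF-8 byte-order mark (BOM) as UTF-8 when the file is opened directly. Here is a minimal Python sketch, assuming the file really is UTF-8. The function names are made up for illustration and are not part of PsychoPy.

```python
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"


def with_utf8_bom(raw: bytes) -> bytes:
    """Return raw UTF-8 bytes with exactly one byte-order mark in front."""
    # Strict decoding raises UnicodeDecodeError if the data is not valid UTF-8,
    # so a file in some other encoding is never silently rewritten.
    raw.decode("utf-8")
    if raw.startswith(UTF8_BOM):
        return raw
    return UTF8_BOM + raw


def save_copy_for_excel(src: str, dst: str) -> None:
    """Write a BOM-prefixed copy of src to dst, leaving src untouched."""
    Path(dst).write_bytes(with_utf8_bom(Path(src).read_bytes()))
```

Calling something like `save_copy_for_excel("original.csv", "original_bom.csv")` (hypothetical file names) writes a new file and never modifies the source. If it raises `UnicodeDecodeError`, the file is not UTF-8 in the first place, which would itself be useful to know.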

I would also recommend you upload the file (not one you’ve tried to convert, the original), and also something (a screenshot maybe) showing what it’s supposed to look like.


I finally worked around the problem by copying the contents from PsychoPy and pasting them into Excel. ph06_post_2016_10_18_2051.csv (4.2 KB)

I still can’t solve the problem.
Instead, I open the data files in the PsychoPy Coder view and copy the intact contents into a new Excel sheet.
It works, though it is a bit tiresome.
I have attached one of my data files and hope it can be opened with unbroken characters :slight_smile:
Thank you.

Best,
Nayeon

Firstly, it’s good that you can copy paste from the builder and not lose anything!

And please confirm, this file you gave us is an original file that you had not tried to convert, right? You have not opened this and then saved it again?

If this is an original file, you’re right, it’s not opening correctly in any common format (I even tried a bunch of Korean ones).

I glanced at the source code in data.py and it should be saving everything as utf-8.

So @jon is there something I’m missing here with the builder interface that could give a clue about where this went wrong?

But again, sorry to repeat, please make sure this is an original file generated by the program, not one that has been edited or saved after opening in excel, otherwise we’ll be wasting our time.

Take care!

This is definitely an original file generated by the program :slight_smile:
thanks!

I think psychopy should be saving this as utf-8 and the fact that it correctly decodes the file from that suggests it is doing so.

My guess is that the issue is on the receiving end; that excel is not correctly detecting the character encoding, or is using a font that doesn’t include the korean characters?

I don’t know much more than that I’m afraid.

I had the same thought, and opened the file with several other applications (a text editor, libreoffice) specifying the encoding to utf-8 and others, and it wouldn’t open correctly.
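For anyone who wants to repeat that check outside of a text editor, here is a rough Python sketch that reports which encodings can decode the raw bytes without error. The candidate list is my own guess at plausible encodings for Korean text, and `decodable_as` is just an illustrative name.

```python
def decodable_as(raw: bytes, candidates=("utf-8", "utf-16", "cp949", "euc-kr")):
    """Return the candidate encodings that decode raw without any error."""
    ok = []
    for name in candidates:
        try:
            raw.decode(name)  # strict mode: any invalid byte sequence raises
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok
```

A clean decode is necessary but not sufficient: some encodings accept almost any byte sequence, so the decoded text still has to be checked by eye. An empty result would suggest the bytes were already damaged before the file was opened.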

I only insisted on asking because I would worry this is symptomatic of some bug. I hoped you might have an idea of somewhere else this could go wrong (i.e. does the gui always use utf-8 as well or does it use the system locale). But if we’re not getting anywhere with it and Nayeon has found a workaround, and it’s not affecting other users, I suppose we should leave it.

When I open it in a text editor, it detects the encoding as ‘Western (Mac OS Roman)’, so it is not surprising that it is broken.

But when I open a .csv file that I know has been created by PsychoPy on my computer, the encoding is UTF-8. So I’m still not convinced that we’re seeing a file that hasn’t been through some sort of cut-and-paste or export, rather than a .csv file directly saved by PsychoPy.


I ran into the same problem.
It really is a matter of character encoding.
If you convert the file’s character encoding to ANSI, you can solve this problem.

  1. Right-click the scrambled file and choose “Open with” → “Choose default program”.

  2. In the window that opens, select Notepad.

  3. In Notepad, click “File” → “Save As”.

  4. In the Save As window, click the [Encoding] drop-down arrow at the bottom.

  5. Select [ANSI] and click [Save].
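The same conversion can be sketched in Python. In Notepad, “ANSI” means the system’s legacy code page; I am assuming a Korean Windows setup here, where that is code page 949 (`cp949`), so change the `codepage` argument for other locales. `utf8_to_codepage` is an illustrative name, not an existing tool.

```python
def utf8_to_codepage(raw: bytes, codepage: str = "cp949") -> bytes:
    """Re-encode UTF-8 bytes in a legacy Windows code page."""
    # "utf-8-sig" decodes strictly and drops a leading BOM if one is present.
    text = raw.decode("utf-8-sig")
    # encode() is strict too: a character missing from the code page raises
    # UnicodeEncodeError instead of being silently replaced.
    return text.encode(codepage)
```

Read the original with `open(path, "rb")` and write the result to a new file so the original data stays intact. This only helps when every character in the file exists in the target code page and Excel is running under the matching locale.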