Tree of Savior Forum

Problems with Google Translate

After looking through a pull request of translations collected from people contributing on ToS Base, I can’t help but feel that a lot of it is Google translated. That only makes things worse, because the original meaning gets jumbled up even more. Looking through the recently submitted file, there are so many errors, misinterpretations, and stretched meanings in the text submitted by ToS Base. No offense to the contributors, but after proofreading many lines and adding comments, I am beginning to realize that some of it is so jumbled that trying to proofread and edit it is meaningless, since a lot of the original translation isn’t correct.

Here is the pull request I was referring to:

As you can see, I started going through some of it before realizing that the text is jumbled and then retranslated, when it could carry completely the wrong meaning. I don’t know Korean, so I am not entirely sure, but when I sent some lines through Google Translate, the results matched word for word. Once again, no offense to the contributors, and thanks for helping, but translating with Google Translate and then editing the output won’t create accurate results.

So I guess my question is: should time be spent running lines through Google Translate, or would hand translating be better? I suspect Google translating might be worse, because once we are in-game with the context given, the translators will have to piece the sentence together from the original Korean if the context is wrong. It might be more time-consuming to find a broken line of English in-game or on GitHub and hand translate it than it would be to hand translate in the first place.

I believe they somewhat recently machine translated (Google) everything to speed up the process, but I might be wrong.

It seems to mostly be a placeholder until more translations can be done and players can help fix issues with the open text client when the game’s in CBT.

Yeah, but the issue is that when you find a line in-game that is inaccurate, grammatically incorrect, or doesn’t flow properly, you will try to fix it by piecing together what you think is right based on the context and the translation. But if the translation is wrong, then correcting the broken English only perpetuates the wrong context/meaning. It could get completely lost in translation unless you know Korean and go to the original files to compare the translations one by one, which is more time-consuming than just translating once and having people fix the small errors.

I know they aren’t supposed to be the most accurate translations, and maybe it’s me being a little too anal, since hand translating can take a very long time. But I want to know what the community/IMC thinks of this. I don’t mind very rudimentary translation as long as others don’t.

I guess the problem is the lack of native Korean speakers working on the translation project.

imcGames even suggests on the GitHub page to use machine translations:

Don’t know Korean but still want to contribute?

If you don’t understand Korean but still want to contribute to translating TOS, you may use machine translators like google or bing translator and edit the translations. It may not give you accurate translations every single time but it usually does work and gives you results that are pretty useful with a bit of a touch up.

Tip : When using machine translator ‘Korean -> Japanese -> English’ will get you much better results than ‘Korean -> English’.

So I guess most of the translations that were done right from the beginning were based on Google and Bing, and only a few translators actually translate things themselves. I agree that this could lead to bad translations, but I fear it has already happened, and I’m not sure there are better options. I remember that when this project started, imcGames even filled all empty lines with auto-translated lines, but removed them some time later when translators asked them to.

I try my best to filter out the edits that seem too far off and focus on the ones that fix errors, but it’s often hard to decide, and I don’t want to throw away translations that seem fine even though I don’t know whether they are 100% based on the Korean meaning.
I guess IMC has to decide what kind of translations they want. I’ll do my best, but I see my Web Translator more as some kind of proxy.

I don’t know Korean, so I just try to fix up the grammar of the translations where possible and make them read nicely. Maybe certain members could be marked as official Korean translators if they speak it fluently? That would mean we can exclude the translations that are certified as correct.

I had noticed the issue on GitHub regarding the machine translations, and it’s been itching at me whether this is the best route or not. There are thousands of lines of text, so it is quite troublesome to do it all by hand. This is also a very rare occasion where the fans are helping translate the game, so what should and should not work is very open to interpretation. I do agree with you: having to throw away translations is not good, and it’s very hard to decide. It will become a bit easier for a lot of lines once we are in-game, but at the same time a lot of text and context will lose its intended meaning because the translation was wrong to begin with. There don’t seem to be many better options at the moment than to try and mass translate a lot of things. My hope is that we won’t lose a lot of the core context/meaning in the text, otherwise the game could get confusing, especially story-wise. It may be too late already. No translation will ever be perfect; I just fear that many lines will be very far off the mark. I think your Web Translator is great! The concept is great, especially since GitHub isn’t too beginner friendly. The tool is pretty effective and you are doing a great job. So thank you.

I’d love to know what IMC thinks. I personally don’t mind as long as the core meaning stays intact.

Yes, a great deal of the lines are translated via google translate. And yes, a lot of them are incredibly broken. However, it’s fairly easy to understand what’s going on by working through passages, as separate units. There’s a story that’s being told, and you have to keep that in mind. You can figure out where a quest ends, and a new quest begins, and then run through the lines to get the gist of the story. Additionally, you can tell when certain story elements are referencing other areas. I can tell you right now that, by editing the job quests, the Ranger Trainer has a negative history with another female archer tree trainer (I forget which one, but it’s specific). They can’t stand to look at each other. It’s easy to gather context clues for yourself if you put your mind to it.

This post turned out to be more of a guide to editing the translations, but whatever. The thing is, lines are done one by one. Google Translate itself loses a lot of meaning when passages get long. Thankfully we’re not dealing with pages of text being translated; if that were the case, I’d agree with you that there is a problem. But honestly, no, I don’t see an issue: this speeds up the process greatly in the current context, and saves IMC from worrying about funding a larger translation team. On the flip side, if we waited around for the handful of Korean-speaking international fans to help us out, we’d be waiting for ages (or IMC would have to hire a team).

tl;dr Not too much meaning is lost, and it’s not that hard to fix / flush out the broken bits. This is the best case scenario.

Thank you for the input. We would be waiting for a very long time unless we or IMC employed a staff of people to hand translate the game. I agree with you that a lot of it can be figured out. Not knowing Korean, I was a bit worried that Google Translate is losing a lot of the intended meaning. But as you said, that happens mostly when passages get long, so it isn’t that bad. Sorting it out isn’t too much of an issue, I definitely agree. Google Translate is not wildly inaccurate, but it has its flaws. For now I will keep editing lines as I see them, and hopefully when the game comes out with OTC it will be a lot easier to fix up lines even if the original meaning is jumbled. As long as it gets the point across and does the bulk of the work, then I agree, Google translating lines isn’t as bad as I thought. A lot of it is up to interpretation, which I have no qualms with whatsoever. Thank you for the input, everyone. :smile:

Just wanted to point out that some of the ‘translations’ from TOSBase are very OLD translations. The user emailboxu has actually edited these ‘over-complicated’ sentences and made them shorter and more accurate to the Korean meaning (the red-highlighted lines). You can see these files under “commits”.

This pull is actually… overwriting his edits with our old translations. This needs to be fixed.

Google translate was used months ago, when sentences didn’t make ANY sense at all. At the moment, most lines are well written and only need simple grammar/typo fixes. :blush:

If I’m not completely wrong, every translation in that pull request was submitted on either July 1st or July 2nd via the Web Translator. On July 1st, right before the files were renamed, I did the last merge between GitHub and the Web Translator. However, it can always happen that translations are done simultaneously via GitHub and the Web Translator. It’s hard to completely avoid these conflicts. I try to make sure that I don’t overwrite files that have already been changed, but it also depends on the order in which imcGames adds the pull requests to the main branch, etc.
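
The overlap described here is essentially a three-way merge problem. As a rough illustration only, here is a minimal Python sketch, assuming each side can be read as a mapping from line number to translated text. The function name `merge_translations` and the sample strings are hypothetical, and this is not how the Web Translator actually works.

```python
def merge_translations(base, github, web):
    """Three-way merge of translation lines keyed by line number.

    base   -- lines as of the last sync between the two sources
    github -- current lines on GitHub (hypothetical input)
    web    -- current lines in the web tool (hypothetical input)
    Returns (merged, conflicts), where conflicts lists the line numbers
    that were edited differently on both sides since the last sync.
    """
    merged, conflicts = {}, []
    for line_id in sorted(set(base) | set(github) | set(web)):
        original = base.get(line_id)
        # A line missing on one side is treated as unchanged on that side.
        g = github.get(line_id, original)
        w = web.get(line_id, original)
        if g == w:
            merged[line_id] = g          # both sides agree
        elif g == original:
            merged[line_id] = w          # only the web tool changed it
        elif w == original:
            merged[line_id] = g          # only GitHub changed it
        else:
            merged[line_id] = g          # placeholder choice; needs review
            conflicts.append(line_id)
    return merged, conflicts


# Hypothetical sample data, not real lines from the project.
base = {1: "Hello", 2: "Old text", 3: "Same"}
github = {1: "Hello there", 2: "GitHub edit", 3: "Same"}
web = {1: "Hello", 2: "Web edit", 3: "Same"}

merged, conflicts = merge_translations(base, github, web)
print(merged)
print(conflicts)
```

Keeping the GitHub text for a conflicting line is an arbitrary choice in this sketch. The point is that such lines get flagged for a human to look at instead of one edit silently overwriting the other.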

Ahhhhhh I see. I think it’s because some lines in the TOSBase translator aren’t updated instantly. That’s why I was confused when I saw older translations.

E.g. one line I recognize is line 5089, “some of the contents are covered in wet blood…”.
This line was edited 3 days ago on GitHub to “Part of it are covered in blood…”.
TOSBase still says “some of the contents…” on the web translator but it was edited recently.

Well, yes and no ^^

This line is actually different because of an edit that was done via the Web Translator. But yeah, it can take about 1-2 days before an edit shows up in the Web Translator, because imcGames first needs to accept the pull request and then I need to import the newest files.

Well, it seems that Google translator usually works for simple lines but it’s actually been a while since it was last used. We once tested filling all untranslated lines with machine translations hoping that it would help the translators but it didn’t, so we had to delete back most of it based on the feedback we received. There may be some machine translations left if not much meaning is lost but there also may be some machine translated lines left unintentionally that we weren’t able to delete. :sweat:

For the question on whether time should be spent running lines through google translate, I think most machine translations that are usable have already been used. So at this point, hand translating directly would be better and more efficient, rather than running through google translate beforehand.

Thank you very much for the response. I have just started proof-reading, as I have been waiting for most Korean lines to be translated, and I haven’t come across a lot of lines that are hard to distinguish or even look like they were machine translated. Unfortunately I do not know Korean to help hand translate each line otherwise I would. I agree hand translating is much better, wish I could help. It’s good to see that even the machine translation hasn’t lost the meaning for a lot of lines. I am less concerned now about losing meaning through machine translation so thank you.

I will continue proof-reading lines and editing them, if I find any lines that are left over machine translations or any meaning I can’t seem to distinguish - I will note it in my pull requests.

I was lurking on the forums as always when I saw this topic and immediately made an account for one purpose: to sincerely apologize. As someone who also wishes to help ToS launch soon, I am one of those who have used the Web Translator to contribute. However, I now know that my help might have caused inconvenience instead. While proofreading the English portion, I felt that context was extremely important in the process. It was hard to proofread certain entries where sentences could be read in multiple ways, so I used several machine translators and dictionaries to translate the sentence as a reference of some sort. Admittedly, I only thoroughly checked difficult or strange phrases. I try to stay as close as possible to the previous entry, but I have no way of ensuring my revisions are accurate. Also, I have a habit of overcomplicating sentences for aesthetic appeal, so I’m sorry for that as well. All in all, I’m regretful and will be taking much closer care now. Any tips or advice would be helpful.

Don’t stress it, thank you for helping! A lot of the context is not lost in Google Translate, as both IMC and Sourpuss have mentioned. A lot of lines have already been translated using machine translation and haven’t lost that much meaning unless they were long lines. At this stage it is better to use hand translations, as it will cause fewer headaches down the line. A lot of lines still need proofreading and editing. Proofreading and editing them to the best of your ability, even with a machine translation, is fine! As long as the context of the original translation (machine translated or not) is not changed that much, it will be fine. Once we are in-game it will be much easier to translate given more context. So thank you for helping out, and don’t kick yourself in the butt for trying to help :smile:

We have no way of ensuring accuracy unless lines are hand translated accurately, and even then we don’t have a lot of context. Continue to try and help! I will say that overcomplicating definitions and lines of text for aesthetic appeal is not needed. The idea is to get the text readable and convey the meaning without adding too much flair. The fewer words used to convey the context, the better.

Alright, I’ll try to keep things concise and as accurate as possible. Thanks!

I’m a bit surprised that Google Translate is even being seriously considered. Korean-Japanese machine translation might work fairly well, but Korean-English and Japanese-English machine translation have a notoriously bad reputation, unless it’s a two- or three-word sentence and you’re lucky.

I decided to review about 100 lines of dialogue on the project, thinking it’d be quick, but rather than just editing them, I ended up retranslating most of them and reintroducing all the nuances. It’s a mess. It’s almost as if I was simply doing it from scratch, since the project’s not very organised.

I think people in many countries, including all of Latin and North America, don’t speak more than one language, so people are going to use translators.
That’s why these texts must have a “revision”.
DUH !
No offense, but this topic is kinda useless.
From what I saw, there are translation teams ranging from 1-20 people.
If something gets Google translated in some specific language, then it’s unfortunate for that particular language.
I know for sure that Spanish/Portuguese are going to get pretty damn good translations, for they have massive dedicated communities.