Sisulizer at Embarcadero Technology Partner Spotlight

Today we had the great pleasure of presenting Sisulizer in two live webcast sessions at the Embarcadero Technology Partner Spotlight.

It was a pleasure for us to finally talk to David I again and answer your questions. We hope you enjoyed it. And if you missed it, don’t worry: Embarcadero will publish it on YouTube, too.

We will post the link when the question and answer section is online. In the meantime, you may want to visit Embarcadero for links to the other sessions with other popular tool developers like Raize, DevExpress, and TMS, to name a few.

It would be nice if you commented on it, shared it, or gave us a thumbs up. We are curious whether you like it and whether you want more videos about software localization with Sisulizer.

Squares instead of foreign language characters on Windows 8

Whenever you see squares in Windows 8 dialogs, file names, or menus instead of, for example, Japanese characters as in 火.txt, the corresponding language pack is not installed.

The solution is fortunately easy.

Make sure you are logged in with administrator rights.

First open the Windows control panel. If you don’t know how to find it, take a look here.

Next, click on Languages. You should see something like this:

[Screenshot: Add a language, Control Panel, Windows 8]

Click Add Language and select the desired language. Here we go for Chinese (Simplified).

[Screenshot: Languages, Add language, Control Panel, Windows 8]

Chinese (Simplified) has regional variants as well. We pick one.

[Screenshot: Add languages, regional variants for Chinese, Control Panel, Windows 8]

Now we are back, and see our changed language preferences. We can see our changes in the list, but the new language pack is not installed yet.

[Screenshot: Change your language preferences, Control Panel, Windows 8]

So we double-click it, or click Options, to open the Control Panel – Language Options dialog. Here we choose Download and install language pack.

[Screenshot: Download and install language pack, Control Panel, Windows 8]

Here is the installation process. Depending on the selected language and the languages already installed, the packs can be quite big, e.g. 80 MB or 147 MB.

[Screenshots: Download and install updates (language packs); downloading Chinese (Simplified); the language updates are being installed; installation complete]

You may need to restart Windows.

 

Now you should have the fonts installed, and should not see any squares.

Hope this helps… Just let me know. Currently it looks as if the installation of additional language packs does not work under all circumstances. I have to dig deeper into it and will come back with another article soon.

For older Windows versions like Windows 7 or XP please check Janusz’ article here.

Renate

What the heck is a Windows language pack?

A language pack consists of the different pieces of information you need to work in a language.

The display language of Windows, for example, contains resources with all text elements of Windows dialogs, menus, output, error messages, and tool tips.

Another important thing you need is the keyboard layout for the desired language. Depending on the language, the keys on a keyboard have different characters printed on them. Some languages have, for example, umlaut characters like ä, ö, and ü in German, Turkish, and Finnish, or accented characters like é, à, î, and ô (and more) in French, and there is a whole bunch of other characters in other Northern European languages. Naturally, this language feature is not limited to European languages.

But even if two languages or regions use identical characters, the characters may be arranged differently on the keyboard. Numbers, brackets, punctuation marks, and currency symbols in particular may be on different keys.

Don’t take my word for it. Give it a try. Use a PC in an internet café or a hotel lobby on your next trip to another country. Typing something as simple as an email can be very challenging. 😉

A Windows language pack also includes sets of regional settings for clock, currency, number formats, and more.

Depending on the language, it may also contain fonts or special character entry tools, like Microsoft IME for Japanese or Microsoft Pinyin SimpleFast for Chinese.

More things to read

  • How to install a language pack in Windows 8?

 

Hope this helps. Just let me know, and leave a comment.

Renate

Vote for your favorite Delphi localization tool

Please consider voting for Sisulizer at delphitools.info/2012/02/15/what-do-you-use-for-string-localization-in-delphi/ as your favorite localization tool.
Currently Sisulizer is doing well, but GNU gettext/dxgettext is still leading.

In case you do not already know: Sisulizer supports localization of PO/POT files, too. And I know that quite a few translators use Sisulizer to translate, even when the developer just uses GNU gettext. Why? Translators love the comfortable and intuitive user interface, and all the great filters, search/replace features, integrated translation memory, and the validation functionality. The validation features in particular help translators and developers to create great and safe localized versions of your software. Please read more in the following articles about localizing PO/POT files created by GNU gettext routines with Sisulizer:

Resourcing hard coded strings in Delphi

Software localization tools like our Sisulizer localize binaries, like executables (.exe) or library files (.dll). That has many advantages, like

  • You do not need to give third-parties like translators access to your source code
  • Your source code itself will not be changed
  • Fewer files need to be exchanged with your translators

to name a few.
This works fine when your strings are stored in string tables of Windows resources. Luckily this is very easy in Delphi. Just use a resourcestring block instead of hard-coding the string.

Don’t use hard coded strings like

procedure TForm1.FormCreate(Sender: TObject);
begin
  Label2.Caption := 'Click the above button to process data';
end;

Use Delphi resourcestring blocks like

procedure TForm1.FormCreate(Sender: TObject);
resourcestring
  SClickButton = 'Click the above button to process data';
begin
  Label2.Caption := SClickButton;
end;

Use unique and descriptive names for resource strings. That gives you and your translator a much better idea of the context of the string, and a much better chance of a fast and good translation.

What if you have hard coded strings in 100k+ lines of Delphi code

What do you want to hear? Bad luck, internationalization tools are very expensive, and chances are good that you have a lot of tedious work in front of you? I fear I have given too many people answers more or less like these. Shame on me.
First, you are not alone. Even better, there is a great tool around that will help you through that job, and at a very competitive price.

Resourcing with CodeExplorer

CodeExplorer provides a whole set of ways to convert your hard coded strings to resources.

  • Converting a hard coded string at the cursor.
  • Converting all hard coded strings of an entire file, or even multiple files with sophisticated wizards
  • Converting hard coded strings by using the command line tool

It also comes with various options for scanning and converting. For example, it helps you to identify strings that do not need to be localized at all, or that need special attention, like SQL statements. It also provides a warning mechanism for units.
CodeExplorer is made by Modelmaker Tools, a Dutch company and fellow Embarcadero CodeGear Technology Partner. Prices start at EUR 99. Read more about resource string conversion with CodeExplorer on their web site.

Delphi Developer Days

Are you developing in Delphi and want to take a look beyond software localization? Delphi Developer Days 2010 is a five-city tour in the United States and UK/Europe in May and June 2010. We are a proud sponsor of this interesting event from top Delphi experts Marco Cantù and Cary Jensen. Hurry to get one of the last seats in London or Frankfurt.
London, UK: 26-27 May, 2010
Frankfurt, Germany: 31 May-1 June, 2010
This one-of-a-kind event includes both joint sessions, presented by Marco and Cary together, as well as simultaneous tracks, where Marco and Cary break out into separate rooms to present individual topics. Whether you are using the latest version of Delphi, Delphi Prism, or are developing with an older Delphi version, you will come away with loads of information that will improve your development and make you more productive. Complete descriptions and a schedule are at: www.DelphiDeveloperDays.com/descriptions.html
Their web site is worth a visit in any case because of its nice list of Delphi resources like user groups.

Online Help Localization

Localizing Windows software has an easy phase and a hard phase. Usually, the user interface elements, such as the menu and dialogs, are easy. In most cases, the amount of translatable text is minimal. This leads many companies to the conclusion: “OK, let’s go for other language versions of our software.” If the process involves one language, you might not even consider using a localization tool.

The hard job

Does your software have a large online Help file? Guess what is the hard part? Translating the initial language version can be expensive; however, this part of the process is not difficult. The difficult part comes when you improve your software and online Help. Finding the differences between the original Help file and the updated Help can be hard, especially as you want to keep your translation costs low and the translation process efficient.

No tools

Why is that so? When we began localizing software from English to German, back in 1995, no tools existed for online Help localization. Computer-aided translation tools focused on documents for printed manuals; software localization tools ignored online Help. And no online Help authoring tool had real localization support.

Localizing with HAT

You can use the same process we did: we translated online Help in the same online Help authoring tool that was used to create the software’s original Help, by copying the original project. We did that easily because we used a tool we know well, ForeHelp. But with every new version, we had difficulties in determining what needed changing. Fortunately, the Help authoring tool was extended with a function that listed the changed and new topics. Unfortunately, we found that determining what changed within a Help topic was still very tedious work.

Localizing with HTML Editor

Later on, with the introduction of HTML-based online Help, HTML editors like Dreamweaver or FrontPage became very popular for creating the translated versions. But even with these tools, determining changes is difficult, because changing the layout of an HTML Help page usually changes the file date, too. Therefore, you might think that the page needs updating when it really doesn’t. You might have to read an entire page, sentence by sentence, to determine this. Even a file diff tool does not help much because it usually does not see the difference between HTML tags and real content.

HATs with localization support

A few years ago, the first online Help authoring tools offered localization support. Often, this kind of feature exports XML files. Unfortunately, these tools are either topic based or process such small chunks of text that it is hard to determine the context. The context is usually in a separate file. None of these Help authoring tools (HATs) offer built-in translation memory support.

Localization Tools with HTML Help support

The newest generation of localization tools, like Sisulizer, finally integrates online Help localization into the software localization process. Whenever you implement changes in the software or the HTML Help, just rescan, and your translator sees exactly the strings that must be translated. And because the text is segmented into linguistic sentences instead of being split at HTML tags, the context is always clear. At long last, you have access to a tool that allows all parts of your application to share the same translation memory.

You may want to give this tool a try. Click the following link to get a 30-day full evaluation version: Sisulizer Localization Tool Evaluation. During the Setup process, please choose Sisulizer Enterprise.

— Renate Reinartz

Software developers and localization tools

In the past, I have worked with many software companies, and they all have one problem in common: software developers like home-brewed solutions, regardless of the development language they use.

The reason is this: Software developers are all brilliant at their job! And there is nothing on earth they cannot code in a few hours, especially if the code concerns exporting and importing strings. Don’t get me wrong: I have fallen into this coding trap many times.

A developer may think that having all strings in one format can do the job. However, the context for the strings can be lost in one file format. He/she may produce a text file, Microsoft Office® document, database, or XML file(s). If the programmer plans in advance, he/she might even export only changed or new strings; or, even better, add this kind of information to the translatable text. But have you ever considered working with translation memory? Usually, money is not the issue: even successful software companies resist spending money on a localization tool.

The funny thing is that software developers love to automate their work and hate to code the same thing twice. That is exactly the point where you can get them.

Life is like that: if you haven’t done it, you can hardly imagine doing it. To be honest, if you never have localized software on your own, especially for more than one release, you hardly know anything about the complexity of the process.

No translator enjoys translating the same strings again and again. And translating text out of context is just guessing. Trust me: you should not hire a professional software translator for that kind of work.

Markus’s article A beginner’s guide to Windows software localization provides more detailed information about these issues.

— Renate Reinartz

Skip the typical software localization beginner’s traps

Localized software opens new markets and creates more revenue. This concept sounds great; however, what does it mean for you, as a software developer? How can you write software that you can easily localize? And what does localization mean for your day-to-day work?

You might consider the following: What is different in other cultures? The differences include many areas:

Address, character sets, code pages, currency, date, language, list separators, measurements, numbers, paper sizes, phone numbers, sort order, taxes, time

Languages

Your first consideration is most likely the language. At the very least, you must think about how to translate all strings in the application user interface. Usually, these are strings for menu entries, dialog boxes, message boxes, the status bar, and error messages.

If you want to send all strings to a translator, you must consistently separate strings from source code. Start this process now. Don’t wait until you are under pressure to meet a deadline. All major development languages for Windows support resource files. You have no excuse! You can easily read strings from resources, such as in classic Visual Basic with LoadResString or LoadStr in a VCL application.

Solution: Store all strings in Windows standard resource files; or, if you develop with Microsoft .Net, use ResX.

Character Sets

Using a different language often means you must consider another character set. Especially if English is your first language, you might think that you need only 128 characters. However, many languages use special characters:

  • French accents and the cedilla, as in à, é, î, and ç,
  • Spanish punctuation, like the inverted question mark ¿,
  • Umlauts and related letters, like ä, ö, ü and ß in German, or ä and ö in Finnish,
  • Other special letters, like æ and å used in Denmark, Norway or Sweden.

The list is endless. So how would you feel if you couldn’t use characters from your own native alphabet? What if your name is Henry, and you couldn’t write your name, because a Russian software developer would not support the letter H as his/her language does not use that letter? Would you write enry instead? Or would you directly uninstall the application?

Solution: You should use Unicode string handling in your application whenever possible. This allows you to support all languages and character sets. The Unicode Windows API can help you accomplish this. If your development environment supports only ANSI character sets, you should ensure that you do not restrict your input to the first 128 characters.

Code Pages

Unfortunately, only a few development systems support Unicode in their visual components, for example Visual C++ and .NET languages like C# and Visual Basic .NET. Many others, such as VCL Delphi before Delphi 2009 and classic Visual Basic, do not have built-in Unicode support for the GUI. In these cases, your program will start with the code page the user set for non-Unicode applications in the system. You cannot influence this setting from within your application. If you exchange data with your translator or user, make sure that you know how to handle code pages (read more).

Solution: Use a localization tool that handles all languages and streamlines the data exchange process. Search for a localization tool that supports double-byte and right-to-left languages, so that you don’t have problems later.

Numbers

A number is a number is a number, you might suppose. Wrong. Numbers are formatted. Two things differ between countries: the decimal separator and the thousand separator. Some countries, like the USA, use a comma to separate thousands and a point as the decimal separator. Therefore, one thousand and two cents is written 1,000.02. Other languages, such as German, use the comma and the point the opposite way, so the same value is displayed as 1.000,02. In Switzerland, it is 1’000,02 because the Swiss use the apostrophe as the thousand separator.

Solution: Do not store numbers internally, in databases, or in files as formatted strings. Always use a numeric variable type like Long or Float. When you display numbers, format them with the right system setting for the thousand and decimal separators. The Windows API provides functions to get the appropriate values. In .Net, check the culture name space. When you allow user input, make sure that the user knows which format is required.
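A minimal Delphi sketch of this advice, assuming Delphi XE or later (where TFormatSettings.Create accepts a locale name) and a Windows target. The helper name DisplayNumber is made up for this illustration; the point is only that the value stays numeric and the separators are applied at display time.

```pascal
program NumberFormatSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: formats a numeric value for display using the
// separators of the given locale.
function DisplayNumber(Value: Double; const LocaleName: string): string;
var
  Settings: TFormatSettings;
begin
  Settings := TFormatSettings.Create(LocaleName);
  // In the format string ',' and '.' are only placeholders; the actual
  // characters come from Settings.ThousandSeparator and
  // Settings.DecimalSeparator.
  Result := FormatFloat('#,##0.00', Value, Settings);
end;

var
  Amount: Double; // kept as a number, never as formatted text
begin
  Amount := 1000.02;
  Writeln(FormatFloat('#,##0.00', Amount)); // current user's regional settings
  Writeln(DisplayNumber(Amount, 'en-US'));
  Writeln(DisplayNumber(Amount, 'de-DE'));
end.
```

The first line of output depends on the regional settings of the machine; the other two use the separators Windows reports for the named locales.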

Currencies

The currency used in a country affects your application, too. Most currencies have their own currency symbol. Examples are € for the euro in Europe, £ for the British pound or the (now obsolete) Italian lira, ¥ for the Japanese yen or Chinese yuan, and $ for the dollar used in Australia, Canada, Jamaica, New Zealand, the USA, and many other countries. The currency symbol is defined in the character set used in the country. The symbol is also defined in the regional settings of the Windows control panel.

Because the symbol does not fully specify the currency, as the previous examples show, you should use the international three-character currency codes defined in ISO 4217, like USD for the US dollar, EUR for the euro, and so on. If your application handles more than one currency, you should save the currency code, too. You should be careful when you define a currency field and exchange data with a spreadsheet or database application, like Excel or Access. These applications use the system setting. The error can be very large if, for example, an amount in Japanese yen (JPY) is read as US dollars (USD) without currency conversion.

In addition, you should be aware that the currency code might be placed in front of, or behind, the currency value.

Solution: Be sure to check the system settings for the default currency and symbol placement. The Windows API provides functions to get the appropriate values. In .Net, check the culture name space. Be prepared and use the international currency codes. When you allow user input, make sure that the user knows which format is required.
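One way to follow this advice is to never let an amount travel without its currency code. The Delphi sketch below is only an illustration: the TMoney record and the functions MakeMoney, MoneyToDisplay, and AddMoney are hypothetical names, not part of the RTL.

```pascal
program MoneySketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical record: the amount always travels with its ISO 4217 code.
  TMoney = record
    Amount: Currency;
    CurrencyCode: string; // for example 'USD', 'EUR', 'JPY'
  end;

function MakeMoney(Amount: Currency; const CurrencyCode: string): TMoney;
begin
  Result.Amount := Amount;
  Result.CurrencyCode := UpperCase(CurrencyCode);
end;

// Shows the amount with the user's separators and the unambiguous code.
// Simplification: always two decimals, although some currencies use none.
function MoneyToDisplay(const Value: TMoney): string;
begin
  Result := FormatCurr('#,##0.00', Value.Amount) + ' ' + Value.CurrencyCode;
end;

// Amounts in different currencies must not be added without conversion.
function AddMoney(const A, B: TMoney): TMoney;
begin
  if A.CurrencyCode <> B.CurrencyCode then
    raise Exception.CreateFmt('Cannot add %s to %s without conversion',
      [B.CurrencyCode, A.CurrencyCode]);
  Result := MakeMoney(A.Amount + B.Amount, A.CurrencyCode);
end;

var
  Price, Shipping: TMoney;
begin
  Price := MakeMoney(1000.02, 'usd');
  Shipping := MakeMoney(4.98, 'USD');
  Writeln(MoneyToDisplay(AddMoney(Price, Shipping)));
  try
    AddMoney(Price, MakeMoney(500, 'JPY'));
  except
    on E: Exception do
      Writeln(E.Message); // mixing USD and JPY is refused
  end;
end.
```

Whether the code is shown before or after the value is a display decision; the sketch simply appends it.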

Dates

You should never hard-code date values. The date order differs between countries. In the short date format, the USA uses mm/dd/yyyy, where m is the month, d is the day, and y is the year. Germany uses dd.mm.yyyy. If you do not take care of this, for example in Visual Basic, a date string like 12/9/2006 can be interpreted as 9th December or 12th September. If you use medium or long date formats, the day and month names must also be translated. If you use format routines, you should ensure that your development system supports date formats in the way that you require. If you need to calculate with dates, store them in a format that is system independent, like the ISO 8601 format yyyy-mm-ddThh:mm:ss; you can also convert the dates to a system-independent date number format, such as a date serial. This makes sorting dates easy.

Solution: Be sure to store dates internally and in files without using a format. Use the data type for your programming language. If you allow user input, collect the day, month and year in separate fields, and internally build a date data type from these fields. When you display dates, format them with the right system settings. The Windows API provides functions to get the appropriate values. In .Net, check the culture name space. When you allow user input, make sure that the user knows which format is required.
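As a rough Delphi illustration of these points, assuming Delphi XE or later for TFormatSettings.Create with a locale name: the date is built from separate day, month, and year values, kept as a TDateTime, written in ISO 8601 form for storage, and formatted per locale only for display. ToIso8601 is a helper written for this sketch.

```pascal
program DateSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Builds a locale-independent ISO 8601 string for storage and data exchange.
function ToIso8601(Value: TDateTime): string;
begin
  // 'nn' means minutes. The quoted parts are literals, so neither the date
  // separator nor the time separator of the current locale gets in.
  Result := FormatDateTime('yyyy"-"mm"-"dd"T"hh":"nn":"ss', Value);
end;

var
  Day, Month, Year: Word;
  Stored: TDateTime;
begin
  // Values as they might arrive from three separate input fields.
  Day := 9;
  Month := 12;
  Year := 2006;

  // EncodeDate takes year, month, and day explicitly, so 12/9 versus 9/12
  // cannot be confused.
  Stored := EncodeDate(Year, Month, Day);

  Writeln(ToIso8601(Stored));  // 2006-12-09T00:00:00
  Writeln(DateToStr(Stored));  // short date format of the current user
  Writeln(DateToStr(Stored, TFormatSettings.Create('en-US')));
  Writeln(DateToStr(Stored, TFormatSettings.Create('de-DE')));
end.
```

Because ISO 8601 strings start with the year, sorting them as plain text also sorts them chronologically.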

List Separators

Who needs list separators? You do, trust me. You should consider list separators whenever you handle a string array in multi-column list boxes, in memory, or in comma-separated values (.csv) files. CSV files are only comma separated for languages that use a comma as the list separator. However, many languages use a comma as the decimal separator in numbers; that is why they use a semicolon (;) to separate the items of a list.

Solution: Get the list separator setting of the user system. The Windows API provides functions to get the appropriate values. In .Net, check the culture name space.
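In Delphi, the list separator is exposed as the ListSeparator field of TFormatSettings. The sketch below assumes Delphi XE or later; JoinFields is a hypothetical helper, and it deliberately skips quoting of fields that contain the separator themselves.

```pascal
program ListSeparatorSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: joins already formatted fields with the list
// separator of the given settings. Fields that contain the separator
// would additionally need quoting, which is left out here.
function JoinFields(const Fields: array of string;
  const Settings: TFormatSettings): string;
var
  I: Integer;
begin
  Result := '';
  for I := Low(Fields) to High(Fields) do
  begin
    if I > Low(Fields) then
      Result := Result + Settings.ListSeparator;
    Result := Result + Fields[I];
  end;
end;

var
  Settings: TFormatSettings;
begin
  // German uses the comma as decimal separator, so Windows reports a
  // different list separator for this locale than for en-US.
  Settings := TFormatSettings.Create('de-DE');
  Writeln(JoinFields(['Widget', FormatFloat('0.00', 2.5, Settings),
    FormatFloat('0.00', 1000.02, Settings)], Settings));

  Settings := TFormatSettings.Create('en-US');
  Writeln(JoinFields(['Widget', FormatFloat('0.00', 2.5, Settings),
    FormatFloat('0.00', 1000.02, Settings)], Settings));
end.
```

For files that are exchanged between machines, it is usually safer to pick one separator and document it than to rely on each user's setting.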

Measurements

You should never hard-code local measurements, like inches and miles. Whenever possible, you should use metric units, such as centimeters and kilometers. You can take the same approach for weight: instead of pounds, use kilograms. In addition, the liter is more widely used than the pint or gallon. Measurement systems differ between countries. Moreover, ISO favors the metric system now. Even kilobyte is no longer 1024 bytes in ISO; a kilobyte is 1000 bytes, because average people are accustomed to counting in metric units. I am personally humbled and embarrassed that I missed that for seven years. On a more serious note, in some European Union (EU) countries, placing ads with non-metric measurements is against the law.

Solution: Don’t hard-code measurements and be prepared to select and convert them.
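A small Delphi sketch of "select and convert", assuming a Windows target. It stores distances in kilometers and asks Windows for the user's measurement system through GetLocaleInfo with LOCALE_IMEASURE. The function names are invented for this example, and the unit labels are hard-coded only to keep it short; in a real application they would be resource strings.

```pascal
program MeasurementSketch;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

const
  KilometersPerMile = 1.609344;

// Asks Windows whether the user prefers metric units.
// LOCALE_IMEASURE yields '0' for metric and '1' for U.S. units.
function UserPrefersMetric: Boolean;
var
  Buffer: array[0..1] of Char;
begin
  Result := True; // fall back to metric if the query fails
  if GetLocaleInfo(LOCALE_USER_DEFAULT, LOCALE_IMEASURE, Buffer,
    Length(Buffer)) > 0 then
    Result := Buffer[0] = '0';
end;

// Distances are stored in kilometers and converted only for display.
function DistanceToDisplay(Kilometers: Double; Metric: Boolean): string;
begin
  if Metric then
    Result := FormatFloat('0.0', Kilometers) + ' km'
  else
    Result := FormatFloat('0.0', Kilometers / KilometersPerMile) + ' mi';
end;

begin
  Writeln(DistanceToDisplay(42.195, UserPrefersMetric));
end.
```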

Paper Formats

If you print many documents, you might have wondered how many odd formats your printer driver can handle. Not surprisingly, paper sheets come in many more sizes than just the standard letter and A4 paper.

Solution: If you must format the printout, check the paper format. You can get this information from the Windows API or directly from the printer object or class in your development language. Do not expect only one of these sizes because user-defined types might be used. For example, professional output devices might have sizes like A4+ for borderless A4 output.

Phone Numbers

Usually, an international phone number has three parts after the leading plus sign: country calling code, area code, and local phone number. A country calling code consists of one to three digits; for example, 1 for the USA and Canada, 32 for Belgium, 420 for the Czech Republic, and 86 for China. However, many countries, such as Denmark, do not have an area code. The number of area code digits also differs. Sometimes it is a defined number, like three in the USA; however, in Germany, the area code can have three to five digits after the leading zero. The leading zero of a German area code is dropped when calling from abroad. This contrasts with Italy, where you must dial the leading zero in international calls. The number of digits in the local number also differs. In Germany, local numbers can contain three to seven digits, sometimes even eight for numbers behind a PBX. In some countries, like the USA, phone numbers can also contain an extension at the end, separated by a hash (#), which is used only by the PBX of the phone holder. The only consistent aspect of a telephone format is that an international phone number can’t be more than 15 digits.

Solution: To be safe, internally save international phone numbers. Don’t accept input that is only in your local phone format. You should always accept international numbers. For example, don’t limit the area code to three digits or require seven digits for local numbers.
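The following Delphi sketch shows one possible way to accept international numbers without assuming a local layout. NormalizeInternationalPhone is a hypothetical function; it only checks the leading plus sign and the 15-digit limit mentioned above, and it does not handle extensions.

```pascal
program PhoneSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: reduces input such as '+49 30 1234567' to
// '+49301234567'. Returns '' if the input is not a plausible
// international number.
function NormalizeInternationalPhone(const Input: string): string;
const
  MaxDigits = 15; // upper limit for international numbers
var
  I: Integer;
  Digits, Trimmed: string;
begin
  Result := '';
  Trimmed := Trim(Input);
  if (Trimmed = '') or (Trimmed[1] <> '+') then
    Exit; // require the international form with a leading plus sign
  Digits := '';
  for I := 2 to Length(Trimmed) do
    case Trimmed[I] of
      '0'..'9': Digits := Digits + Trimmed[I];
      ' ', '-', '(', ')', '.', '/': ; // common visual separators, ignored
    else
      Exit; // anything else, such as letters or a second '+', is rejected
    end;
  // No assumption about the length of area code or local number.
  if (Length(Digits) >= 1) and (Length(Digits) <= MaxDigits) then
    Result := '+' + Digits;
end;

begin
  Writeln(NormalizeInternationalPhone('+49 30 1234567'));   // +49301234567
  Writeln(NormalizeInternationalPhone('+39 06 1234 5678')); // +390612345678
  Writeln(NormalizeInternationalPhone('030 1234567') = ''); // TRUE, local format
end.
```

The Italian example keeps its leading zero because the function never interprets the digits; it only stores what the user entered in international form.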

Sort Order

As described in the section about character sets, different countries or languages have different lists of characters, or, in other words, different alphabets. Thus, languages can have additional characters like umlauts or accents, such as in German, French, Danish, Swedish, Norwegian, Finnish, Turkish, and so on. Some languages don’t support some of our favorite characters, like h in Russian, x in Greek, and many more. All these languages sort their characters differently. If you do alphabetical sorts in your application, you should at least think about supporting the sort order of the localized language. Amazingly, even large or popular applications have not always supported the sort order in their localized versions. Whether you need to support different sort orders depends on how important alphabetized data is to your users. For example, if you have an application that handles addresses, contact information, or other large amounts of data, your users will definitely miss this feature.

Solution: In .Net, you can check the culture name space for the sort order. In other development systems, you must check your string sorting routines; and, for your own implementation, you may have to collect the data on the web first.
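For Delphi, a quick way to see the difference is to sort the same list once by character codes and once with the locale-aware comparison of the RTL. This is only a sketch; the two compare functions are written for this example, and the result of the second sort depends on the locale of the machine it runs on.

```pascal
program SortSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

// Ordinal comparison: compares character codes and ignores the language.
function CompareOrdinal(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := CompareStr(List[Index1], List[Index2]);
end;

// Locale-aware comparison: follows the sorting rules of the user's locale.
function CompareLocale(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := AnsiCompareStr(List[Index1], List[Index2]);
end;

var
  Names: TStringList;
begin
  Names := TStringList.Create;
  try
    Names.Add('Zebra');
    // A-umlaut followed by 'pfel', written with a character code so the
    // example does not depend on the encoding of the source file.
    Names.Add(#$00C4'pfel');
    Names.Add('Apfel');

    // Ordinal order: Apfel, Zebra, and the word with the umlaut last,
    // because its character code is higher than that of 'Z'.
    Names.CustomSort(CompareOrdinal);
    Writeln(Names.CommaText);

    // Locale order: where the umlaut word lands depends on the language.
    Names.CustomSort(CompareLocale);
    Writeln(Names.CommaText);
  finally
    Names.Free;
  end;
end.
```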

States

If you design an online contact form, you should never force your user to enter a state, or, even worse, select one of the 50 US states. Not all users live in the USA and are used to providing the state they live in. Some users might live in countries that use other systems, like departments in France or counties in Great Britain.

Tax System

If you produce an accounting application, keep in mind that tax systems differ between countries. In the EU, for example, there is a value-added tax (VAT), but no local sales tax.

Time

If you use time, you must consider the twelve-hour and twenty-four-hour models used around the world. Twelve-hour systems, as used in the United Kingdom or the USA, use am and pm to define whether the time is before or after noon. You must ensure that time zones are reflected in your application. Which time coding do you need? Local times, like Eastern Standard Time (EST) in New York, Mountain Standard Time (MST) in Colorado, or Pacific Standard Time (PST) in California, USA? Greenwich Mean Time (GMT) is international time and is the basis for the world time clock. GMT is the preferred time if you exchange data with other countries. This time system is based on the local time in Greenwich, England (GMT+0). All time differences are given as GMT+x or GMT-x. For example, France, Germany, and the Netherlands are at GMT+1, PST is GMT-8, EST is GMT-5, and Japan is GMT+9. Differences may also occur between summer and winter. Many countries observe daylight saving time in summer, although Japan does not. There is another time system, Zulu or UTC. If you need to code time, such as in e-mail or Internet formats, you can check the related RFCs to see how to store time.

Solution: Make sure that you store time internally, and in files, always in the same time zone. When you display a time, use the correct system settings. The Windows API provides functions to get the appropriate values. In .Net, check the culture name space. Use GMT time coding instead of local formats such as MST and PST. These time formats are not common outside the USA, and a US user probably has no clue as to what CET (Central European Time) means.
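In Delphi this could look roughly like the sketch below, which assumes a version that ships TTimeZone in DateUtils (Delphi XE or later). It uses UTC, the Zulu time mentioned above, as the single stored time zone and converts to local time only for display. UtcToIso8601 is a helper made up for this example.

```pascal
program TimeSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, DateUtils;

// Formats a UTC timestamp as ISO 8601 with the 'Z' (Zulu/UTC) suffix.
function UtcToIso8601(UtcValue: TDateTime): string;
begin
  // Quoted parts are literals, so locale separators cannot leak in.
  Result := FormatDateTime('yyyy"-"mm"-"dd"T"hh":"nn":"ss"Z"', UtcValue);
end;

var
  LocalNow, UtcNow: TDateTime;
begin
  LocalNow := Now;

  // Convert once at the boundary: store and exchange UTC ...
  UtcNow := TTimeZone.Local.ToUniversalTime(LocalNow);
  Writeln('Stored:    ', UtcToIso8601(UtcNow));

  // ... and convert back to the user's zone and format only for display.
  Writeln('Displayed: ', DateTimeToStr(TTimeZone.Local.ToLocalTime(UtcNow)));
end.
```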

Other traps

Depending on your application, you should check legal issues, cultural differences, standards, and more. These issues may result in logical changes to your source code. For example, accounting rules differ between the USA and Germany and popular payment methods can vary.

Cultural differences can get you into serious trouble or kick your product out of the market completely. I’m sure that you don’t want to offend your users. You should be careful with certain colors, body parts, or animals, because different cultures see these things differently. There is a rumor about Borland from the very early days. In Germany, Borland used a pig in the Turbo Pascal ads. These ads were very popular: a pig brings luck and can represent speed in Germany. The rumor is that the Borland founder and former CEO, Philippe Kahn, stopped the German ads. As a native Frenchman, he did not want a pig associated with his products.

Conclusion

Don’t be shocked about the number of differences in our many global cultures. Most of the differences are easy to address in your application code. Other software companies have tackled these problems, so you don’t need to reinvent the wheel. Many fine tools are available. In particular, the Windows API provides all of the tools you need. Moreover, if you are already using .NET, you are a top priority with Microsoft, because Microsoft knows they must appeal to the global programmer and provide you with the right tools—at the beginning of your software development process, when it matters most.

When you write flexible code, localization is a snap. However, if you have hard-coded some of the items mentioned in this article, you have much work to do. Take the chance to implement this article’s recommendations in the beginning of your project. Plan a bit more in advance. This will save you time and money.

Most developers who have hard-coded strings are afraid of converting their application to string resources. Yes, if you have a large application with many strings, this process can cost you a few extra days. However, you will save more time and money when you start the localization process. All your hard work will pay off, as you can process every language and every update easily and efficiently. Remember to ensure that your resource files are Unicode-aware, so that you are prepared for the future.

— Renate Reinartz