Globalizing applications, in any locale

I was watching a car review on YouTube the other day, and the host made the point that the voice recognition system only understood names as they're pronounced, according to the locale of the infotainment software (U.S. English in this case). Moreover, he said the user should think about how this process works in order for it to be more usable.

Now, that's a great idea, and it kinda-sorta works, but it's not very user-friendly. Users shouldn't be expected to understand how a system functions internally for it to be usable. Many people know how to drive, yet they may not understand exactly how a car works.

These are the kinds of issues we encounter when developing globalized applications. How do we bridge the gap between what the user understands, and what the computerized system understands?

I think it comes down to assumptions. In other words, the computerized system assumes something, while the user either isn't aware of that assumption, or doesn't provide enough input (or is never prompted for input) for the system to do anything except make an assumption.

In the case of the voice recognition system, the system assumes the user's contact names are pronounced based on the locale of the system. This sounds like a reasonable assumption, and for the most part, it works. When thinking about it in the context of an increasingly globalized society, though, it's becoming a less and less valid assumption. If the system is only single-locale-aware, there's no other choice for it to make, just given a list of names.

While giving the user the ability to choose a different default locale for the system probably helps quite a bit, it doesn't solve the problem of the inevitable exceptions to the default.

If we think about the data available, is there any way we could help the system better understand the exceptions?

  • We could add a property to each name that specifies the locale in which to interpret it. This would solve the problem of complete misinterpretation as shown in the car review. It doesn't, however, solve the problem of idiosyncrasies within a locale (i.e. "Smith" and "Smythe" could be pronounced the same, and in more than just one way), and it would make the software more complex (although it could potentially save on costs, because the same bits could be deployed to all locales, just with different default settings).
  • Another approach would be to allow the user to input a phonetic version of the name, in order for the system to understand the pronunciation based only on the single default locale. This would solve the problem of exceptions, and the issue of same-pronunciations-different-spelling as mentioned in the previous point. While the phonetic data is assumed to be of the same locale as the locale of the names themselves, this is probably an acceptable drawback for most applications.

With globalization, other factors come into play, too:

  • The characters accepted by the system. For example, if the system's default locale is Chinese, is it going to allow the entry of names using English characters, and how would the system understand a phonetic spelling of that name?
  • What if there are differences between languages where some soundings of syllables may not exist in both?
  • How would efficient searching of the same data be accomplished for multiple locales?
  • How much extra effort (i.e., cost and time) is required to add a reasonably globalized solution for the specific application?

As you can probably tell, these are not necessarily problems that are easily and effortlessly solved, and I'm not going to claim for a second that either of the two approaches I mentioned will solve all application issues related to globalization. Since every application is different, though, they may offer a good solution, or at least a reasonable compromise -- either way, an improvement in usability, which was the original goal.

As database designers, we must be aware of how globalization affects which pieces of data we store, how they are arranged/sorted for efficient searching, and how users will be interacting with the data. It's also important to think about designing database structures such that they can easily be enhanced to accommodate a globalized application, even if the application will not support that kind of scenario in Version 1. It's coming; be prepared for it.