Internationalization With React-Intl

Everyone in the global marketplace should understand your website.

There’s a lot to keep in mind when it comes to internationalization of a project, but most companies don’t want to think about supporting multiple languages right from the start. In fact, most companies shouldn’t localize right from the start. It’s faster to develop when not thinking about translations.

This was the case for our website, calm.com, in which we had built out 30+ pages — each with dozens of blocks of text — without thinking about languages other than English. Yet we were able to go from 1 language to 2+ languages in a matter of 30 days.

This is a short guide into the ins and outs of react-intl, with pain points and solutions that we found along the way.


React-Intl

This has been the biggest savior to our process of internationalization 🙏. Yahoo’s React library wraps any text with little effort and can be added to any application. For those who are not used to internationalization, the idea is to keep all of your original strings in one single file, mapped to IDs. Then in your code you inject a string by looking up its ID in that JSON object.

When doing this from the start, there’s honestly no need for a framework like react-intl at all. You would simply have something like this:

{
  homepage: {
    title: 'Welcome to the homepage',
    subtitle: 'We have lots of cool things',
    body: 'Come check out the page tomorrow to see more',
  },
  loginForm: {
    newUser: 'Create an account',
    currentUser: 'Log in to your account',
  }
}
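As a rough sketch of that no-framework approach, a lookup could be as small as the following. The `t` helper and its fallback behaviour are my own assumptions for illustration, not part of any library:

```javascript
// Hypothetical strings file, same shape as the object above.
const strings = {
  homepage: {
    title: 'Welcome to the homepage',
    subtitle: 'We have lots of cool things',
  },
  loginForm: {
    newUser: 'Create an account',
  },
};

// Hypothetical helper: resolve a dotted path such as 'homepage.title'.
// Falls back to the path itself so a missing string is easy to spot on the page.
function t(path, dictionary = strings) {
  const value = path
    .split('.')
    .reduce(
      (node, key) => (node !== null && typeof node === 'object' ? node[key] : undefined),
      dictionary,
    );
  return typeof value === 'string' ? value : path;
}

console.log(t('homepage.title')); // Welcome to the homepage
console.log(t('homepage.missing')); // homepage.missing
```

Swapping languages is then just a matter of passing a different dictionary with the same shape.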

But for the rest of the companies that didn’t want to implement localized content from the beginning, we need to generate this file somehow. And react-intl has two major features to help with this: <FormattedMessage /> and defineMessages.

I won’t go into too much detail on how react-intl works, but a great article to follow (and one that I read through twice before starting) is this one: Internationalization in React. It documents the process of getting started with react-intl. I recommend reading this if you want to have a good grasp on how it works under the hood.

The first helpful feature, <FormattedMessage />, takes three main attributes: id (the key), defaultMessage (the value), and values (variables to inject using FormatJS). So a typical FormattedMessage component would look something like this:

<div>
  <FormattedMessage
    id="homepage.greeting"
    defaultMessage="Welcome, {firstName}"
    values={{
      firstName: user.firstName
    }}
  />
</div>

react-intl then injects the firstName variable at render time, and the string Welcome, {firstName} can be extracted under the id homepage.greeting into a file of your choosing. You give react-intl that file through the outer <IntlProvider> that wraps your React components, which is also how you provide different languages. If the file is missing an id, the component falls back to its defaultMessage attribute.
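To make the lookup-then-fallback behaviour concrete, here is a deliberately simplified model of it in plain JavaScript. This is not how react-intl is implemented (it uses FormatJS and full ICU message syntax); the `resolveMessage` function and the `de` object are assumptions made up for this sketch, and it only understands plain {name} placeholders:

```javascript
// Simplified model only: NOT react-intl's implementation.
// Handles plain {name} placeholders, not plural/select/number formats.
function resolveMessage({ id, defaultMessage, values = {} }, messages = {}) {
  // Prefer the translated string for this id, otherwise use defaultMessage.
  const template = Object.prototype.hasOwnProperty.call(messages, id)
    ? messages[id]
    : defaultMessage;
  // Replace each {name} with its value, leaving unknown names untouched.
  return template.replace(/\{(\w+)\}/g, (placeholder, name) => (
    Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : placeholder
  ));
}

const descriptor = {
  id: 'homepage.greeting',
  defaultMessage: 'Welcome, {firstName}',
  values: { firstName: 'Ada' },
};

// Hypothetical German locale file, keyed by id.
const de = { 'homepage.greeting': 'Willkommen, {firstName}' };

console.log(resolveMessage(descriptor, de)); // Willkommen, Ada
console.log(resolveMessage(descriptor, {})); // Welcome, Ada
```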

The second helpful feature, defineMessages, can be used to pass in raw strings. We used this quite a bit in our application in conjunction with [injectIntl](https://github.com/yahoo/react-intl/wiki/API#injectintl) in order to format the message, although <FormattedMessage/> is always the preferred approach, since defineMessages only declares static messages and can’t bind variables to the string itself. An example usage of this would be:

import { defineMessages, injectIntl } from 'react-intl';

const messages = defineMessages({
  header: {
    id: "homepage.header",
    defaultMessage: "Welcome to the homepage",
  }
});

...

render() {
  const {formatMessage} = this.props.intl;
  return (
    <HeaderComponent
      title={formatMessage(messages.header)}
    />
  )
}

...

export default injectIntl(Component)

For more info on react-intl, the Yahoo documentation can be found here.


Problems

So all of this seems pretty straightforward: you just wrap your strings in some sort of react-intl component and call it good, right? Unfortunately it’s not that simple — especially when you have a codebase with around 17k words that need translation. So here is a list of some problems that I ran into with react-intl configurations, and how we chose to solve them.

React-Intl Is Terrible At Detecting Formatting Errors

In the month or more that I spent using react-intl, I lost count of how many times I pushed code that should have worked, only to realize I had misspelled some react-intl attribute. Simple things like spelling defaultMessage as defautlMessage. And when working with thousands of different strings, I would occasionally miss that one of them wasn’t displayed correctly in the browser. A string that should read Welcome but is instead displayed as homepage.greeting can easily be overlooked if it’s in the middle of a massive block of text.

This is combined with the fact that occasionally I would miss html tags with raw text (especially on pages with a lot of text). And when viewing this in the browser, there’s no way to tell if the string is in the data.json file until translations have actually come through.

This is why we came up with a simple solution: enforce proper styling of react-intl components.

We created and open-sourced a simple library on top of eslint to help with this.


The eslint plugin solves the following problems:

  • Enforce any html tag with raw text to have the <FormattedMessage/> component
  • Numbers are ignored (they’re the same in every language) if the raw text contains only numbers
  • Trailing whitespace is not allowed (this can be disabled) since some languages use spacing differently (or not at all). This includes trailing whitespace on the defaultMessage attributes themselves
  • In defaultMessage attributes within components, {variable} declarations must be declared in the values attribute. Too often I would accidentally misspell a variable name from defaultMessage so it would never be injected into the string by FormatJS.
  • Components must have both defaultMessage and id attributes set
  • defaultMessage and id attributes cannot be empty for <FormattedMessage/> components

It also has a few customized features for our use case:

  • <a> tags are ignored since the majority of our uses had actual URLs as the inner html (i.e. <a>calm.com/subscribe</a>) which should not be translated
  • The trailing-whitespace rule can be turned off: we don’t allow trailing whitespace in our strings, but we figured other people might want to have it
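To give a feel for the {variable} rule in the list above, here is a standalone sketch of that kind of check. It is not the plugin’s actual code: the function names are made up for this example, it works on plain strings and objects rather than on an AST, and it ignores ICU quoting with apostrophes.

```javascript
// Collect the argument names used at the top level of an ICU-style message.
// Nested braces (e.g. the branches of a plural) are skipped on purpose.
function placeholderNames(message) {
  const names = new Set();
  let depth = 0;
  for (let i = 0; i < message.length; i += 1) {
    if (message[i] === '{') {
      if (depth === 0) {
        // The argument name runs up to the first comma, brace or whitespace.
        const match = /^\s*([^\s,{}]+)/.exec(message.slice(i + 1));
        if (match) names.add(match[1]);
      }
      depth += 1;
    } else if (message[i] === '}' && depth > 0) {
      depth -= 1;
    }
  }
  return names;
}

// Return the placeholders in defaultMessage that have no entry in values.
function missingValues(defaultMessage, values = {}) {
  return [...placeholderNames(defaultMessage)].filter(
    (name) => !Object.prototype.hasOwnProperty.call(values, name),
  );
}

// The misspelled key (firstname vs firstName) is reported as missing.
console.log(missingValues('Welcome, {firstName}', { firstname: 'Ada' })); // [ 'firstName' ]
```

A lint rule built on this idea would report every name returned by missingValues as an error on the component.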

defineMessages Can’t Take In Variables

This had to have been one of the most annoying things, but it makes sense why it can’t take any variables. defineMessages is meant to be called at module scope, outside of any component and before anything renders, so values that only exist at render time (props, state) aren’t available yet. Values can only be supplied later, when the message is actually formatted; formatMessage does accept a values object as its second argument, which covers some of these cases.

But when you have an older codebase with strings being passed in various formats, you end up finding places where neither defineMessages nor <FormattedMessage/> work. The simple solution? Well there wasn’t one here.

What we ended up doing is having to refactor each of these instances to somehow work with <FormattedMessage/> (our preferred format). This was mostly for cases where we were passing in strings to other components, and those strings contained variables in them. Most of the changes involved passing in both the string and the variable to the component and using <FormattedMessage/> in the child.

The only time this was really a problem was when we needed to pass in strings containing variables to external apis or components. For instance, something like <FacebookButton label="label with {variable}"/> where we don’t control the code for FacebookButton. We had a very hacky solution to fix this: invisibly render the string to the DOM, pull out the text, and then pass it to the component.

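import React from 'react';
import ReactDOM from 'react-dom';
import { IntlProvider } from 'react-intl';

// Renders a react-intl component into a hidden DOM node and reads back its markup.
// Assumes globalLanguage and globalMessages are available globally (e.g. on window);
// without the messages prop only the defaultMessage would ever be rendered.
export const getLocalizedString = (localizationComponent) => {
  const rootElement = document.getElementById('localization-root');
  ReactDOM.render(
    <IntlProvider locale={globalLanguage} messages={globalMessages}>
      {localizationComponent}
    </IntlProvider>,
    rootElement,
  );
  return rootElement.firstChild.innerHTML;
};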

We are able to pass in the <FormattedMessage/> component to this function and grab the string. This requires having a #localization-root element on the outer level of the page.

return getLocalizedString(
  <FormattedMessage
    id="paymentForm.paymentLabel"
    defaultMessage="{quantity, number}x Gift {quantity, plural, one {Certificate} other {Certificates}}"
    values={{
      quantity: this.state.quantity,
    }}
  />,
);

IDs Must Be Unique

When generating the file of IDs and default messages from the react-intl data, you have to do some work to make sure the formatting of the file is correct. It’s not a lot of work, but it would be nice if react-intl did it for you.

The following code shows how we ended up formatting the messages. Unfortunately, react-intl needs keys to be single strings, and you can’t map to a nested, more JSON-like structure. So our keys ended up looking like homepage.header.greeting instead of homepage: { header: { greeting: ... } }. But this works out fine in terms of being able to search for keys easily.

import * as fs from 'fs';
import { sync as globSync } from 'glob';
import { sync as mkdirpSync } from 'mkdirp';

const filePattern = './build/messages/**/*.json';
const outputLanguageDataDir = './locales/';

// Merge every extracted descriptor file into one { id: defaultMessage } map
const defaultMessages = globSync(filePattern)
  .map((filename) => fs.readFileSync(filename, 'utf8'))
  .map((file) => JSON.parse(file))
  .reduce((collection, descriptors) => {
    descriptors.forEach(({ id, defaultMessage }) => {
      // An id may be reused only if it maps to the exact same message
      if (collection.hasOwnProperty(id)) {
        if (collection[id] !== defaultMessage) {
          throw new Error(`Duplicate message for id: ${id}`);
        }
      }
      collection[id] = defaultMessage;
    });
    return collection;
  }, {});

const fileBody = JSON.stringify(defaultMessages, null, 2).concat('\n\n');
const fileName = `${outputLanguageDataDir}data.json`;
mkdirpSync(outputLanguageDataDir);
fs.writeFileSync(fileName, fileBody);

The script grabs each of the keys/values and checks that every string has a unique key, or that the value is the same if the key is not unique. This allowed us to create generic keys that were reusable throughout the application and that only need to be translated once. If you are using a service like Smartling (which has been great for us so far), they will map the translation values automatically if they detect similarities, so this is actually redundant. But hey, redundancy is a good thing, right?
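If you want to try the uniqueness check without touching the file system, the same idea can be run in memory. The `collectMessages` name and the sample descriptors below are made up for this sketch:

```javascript
// In-memory version of the merge step: each inner array stands for the
// descriptors extracted from one source file.
function collectMessages(descriptorLists) {
  return descriptorLists.reduce((collection, descriptors) => {
    descriptors.forEach(({ id, defaultMessage }) => {
      const seen = Object.prototype.hasOwnProperty.call(collection, id);
      // Reusing an id is fine only when the message text is identical.
      if (seen && collection[id] !== defaultMessage) {
        throw new Error(`Duplicate message for id: ${id}`);
      }
      collection[id] = defaultMessage;
    });
    return collection;
  }, {});
}

// Two files share the generic key common.submit with the same text: allowed.
const merged = collectMessages([
  [{ id: 'common.submit', defaultMessage: 'Submit' }],
  [
    { id: 'common.submit', defaultMessage: 'Submit' },
    { id: 'homepage.header', defaultMessage: 'Welcome to the homepage' },
  ],
]);
console.log(Object.keys(merged).length); // 2

// The same id with different text is rejected.
try {
  collectMessages([
    [{ id: 'common.submit', defaultMessage: 'Submit' }],
    [{ id: 'common.submit', defaultMessage: 'Send' }],
  ]);
} catch (error) {
  console.log(error.message); // Duplicate message for id: common.submit
}
```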


Javascript Objects Are Weird

But this still wasn’t perfect. We quickly realized that on each commit to GitHub, the file was marked as changed. The keys of a JavaScript object come out in the order they were inserted, so the output order can shift whenever the messages are collected in a different order, which is not ideal for managing actual diffed content.

Our solution was to make sure the strings are ordered as follows:

import _ from 'lodash';

const sortedMessages = _(defaultMessages)
  .toPairs()
  .sortBy(0)
  .fromPairs()
  .value();
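For anyone not using lodash, roughly the same result can be had with built-ins. This is only a sketch with a made-up `sortKeys` name; it assumes a runtime with Object.fromEntries and ids that are ordinary dotted strings:

```javascript
// Rebuild the object with its keys in ascending order so that
// JSON.stringify produces the same text regardless of insertion order.
function sortKeys(messages) {
  return Object.fromEntries(
    Object.entries(messages).sort(([a], [b]) => {
      if (a < b) return -1;
      if (a > b) return 1;
      return 0;
    }),
  );
}

const first = { 'homepage.title': 'Welcome', 'common.submit': 'Submit' };
const second = { 'common.submit': 'Submit', 'homepage.title': 'Welcome' };

// Same content, different insertion order, identical serialized output.
console.log(JSON.stringify(sortKeys(first)) === JSON.stringify(sortKeys(second))); // true
```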

So the final output of our key/value mappings for internationalization is as follows:

import * as fs from 'fs';
import { sync as globSync } from 'glob';
import { sync as mkdirpSync } from 'mkdirp';
import _ from 'lodash';

const filePattern = './build/messages/**/*.json';
const outputLanguageDataDir = './locales/';

const defaultMessages = globSync(filePattern)
  .map((filename) => fs.readFileSync(filename, 'utf8'))
  .map((file) => JSON.parse(file))
  .reduce((collection, descriptors) => {
    descriptors.forEach(({ id, defaultMessage }) => {
      if (collection.hasOwnProperty(id)) {
        if (collection[id] !== defaultMessage) {
          throw new Error(`Duplicate message for id: ${id}`);
        }
      }
      collection[id] = defaultMessage;
    });
    return collection;
  }, {});

// sort them to make the build:langs idempotent if nothing changed
const sortedMessages = _(defaultMessages)
  .toPairs()
  .sortBy(0)
  .fromPairs()
  .value();

mkdirpSync(outputLanguageDataDir);
fs.writeFileSync(`${outputLanguageDataDir}data.json`, JSON.stringify(sortedMessages, null, 2).concat('\n\n'));

Force Content To Always Update

Another issue is common user error. We could be generating this file only when we make changes, or maybe we could forget to generate the file at all. So we have a simple solution: force every push to build locales.

Put simply, I added the following script in front of our deploy script, which generates our locales file:

./node_modules/.bin/babel-node app/utils/translator.js

That way our data.json file will always be up to date.

This isn’t perfect, however. It doesn’t account for the fact that two users on two different branches can have two different data.json files for translation. One solution is to build out an API call that will merge any two locale files, whether they come from branches or from master. But with this you can be left with stale data that doesn’t actually need translation.

Although this is only a problem for new strings that won’t have a translation yet, the simplest solution is to flag translations as stale on the backend when pushing the new file. So if X pushes come through and some id has not appeared in any of those X files, the API finally removes the key.
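The stale-key idea could be tracked with something as small as a counter per id. The following is a hypothetical sketch of that bookkeeping, not code we actually run; the `recordPush` name, the shape of the state object, and the `maxMisses` parameter (the X from above) are all assumptions:

```javascript
// `missed` maps each known id to the number of consecutive pushes
// in which that id did not appear.
function recordPush(missed, pushedIds, maxMisses) {
  const pushed = new Set(pushedIds);
  const next = {};
  // Ids present in this push are (re)set to zero misses.
  pushed.forEach((id) => {
    next[id] = 0;
  });
  // Ids absent from this push accumulate a miss and are dropped at the limit.
  Object.keys(missed).forEach((id) => {
    if (pushed.has(id)) return;
    const count = missed[id] + 1;
    if (count < maxMisses) next[id] = count;
  });
  return next;
}

let state = {};
state = recordPush(state, ['homepage.title', 'old.banner'], 2);
state = recordPush(state, ['homepage.title'], 2); // old.banner missed once, kept
state = recordPush(state, ['homepage.title'], 2); // old.banner missed twice, removed
console.log(Object.keys(state)); // [ 'homepage.title' ]
```

An id that reappears before reaching the limit starts again from zero, so a string that only lives on one branch is not dropped just because another branch pushed in between.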

This allows for the locales to constantly be uploaded to a translation service before the code is actually ready to be deployed to production. So in an ideal scenario, translations can get started for every branch, and branches can get deployed to production sooner.


Dates Need Localization, Too

moment.js to the rescue! Although react-intl has built-in formatters for handling dates and plurals, I found the dates aspect to be kind of clunky. We were already using moment for all of our date formatting across the application (as most people probably are), so I simply set the locale on moment in the global scope.

moment.locale(language);

Server Side Rendering

Our application uses React Server Side Rendering. For some this is probably not applicable and can be skipped.

The issue here was getting the locales on both the server and the client, enforcing that the text on the pages matched. Since the server sends down HTML that was pre-rendered with ReactDOMServer.renderToString(component), we need to somehow make the same strings available to the client as well. Given that there generally isn’t any secure data in localized strings, we simply throw them on the window for the client to access. The following element is what’s passed into renderToString.

addLocaleData([...en, ...de, ...es]);
const element = (
  React.createElement(IntlProvider, { locale: language, messages: messages },
    React.createElement(component, propsWithLanguage),
  )
);

We then have express serving up the page with a set of params that get injected globally to the window. So in the standard (req, res, next) => ... format, we have

params.globalLanguage = language;
params.globalMessages = messages;
...
res.render(pageToRender, params);

After making sure this is available on the frontend, our client-side render can look something like this:

const render = (reactHTML) => {
  addLocaleData([...en, ...de, ...es]);
  moment.locale(globalLanguage);
  const fullComponent = (
    <IntlProvider locale={globalLanguage} messages={globalMessages} >
      {reactHTML}
    </IntlProvider>
  );
  ReactDOM.render(fullComponent, document.getElementById('react-root'));
};

export default render;

Unavoidable Issues

Intl IDs through React-Intl must be strings

As stated above, react-intl expects ID attributes to be a string. It can’t automatically map things to an object. This means if you want to organize your json object to look more like a tree structure, you can’t…

It does add the benefit that IDs are easily searchable (i.e. homepage.header.title is easier to find in a single search), but it still doesn’t quite function like a json object. It instead is just a standard hashmap. But hey, maybe some people will prefer this.
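If you really do want to author your strings as a tree, one possible workaround is to flatten the tree into dotted ids before handing it over as messages. We didn’t do this ourselves; the `flattenMessages` helper below is just a sketch of the idea:

```javascript
// Turn { homepage: { header: { title: '...' } } } into
// { 'homepage.header.title': '...' } so every id is a single string.
function flattenMessages(tree, prefix = '') {
  return Object.keys(tree).reduce((flat, key) => {
    const value = tree[key];
    const id = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object') {
      // Recurse into nested groups, carrying the dotted prefix along.
      Object.assign(flat, flattenMessages(value, id));
    } else {
      flat[id] = value;
    }
    return flat;
  }, {});
}

const nested = {
  homepage: {
    header: { title: 'Welcome to the homepage' },
    body: 'Come check out the page tomorrow to see more',
  },
};

console.log(Object.keys(flattenMessages(nested)));
// [ 'homepage.header.title', 'homepage.body' ]
```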

Deploys Need To Be Coordinated

Whether you’re deploying your branch to a staging tier or to production, you will always run into the problem that strings need to be translated before they go live in other countries.

We debated on having a system that rolled out individual components to individual countries so that we can show new features only when translations were ready, but this would have been a huge overhaul to how the website works. It also would have meant that web could be out of parity with the mobile apps. The end solution was to just be careful.

When we push code to dev, it automatically pushes the new strings to Smartling. These are then flagged for translation. We simply set up another checkpoint that things can’t go to production until the translations are finished. For our case, this was another lane in our Kanban board (we use ZenHub). This leaves room for error, but we figured a worst-case scenario is that small portions of the site could be in English instead of the native language.
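Our checkpoint is a manual one, but the same gate could in principle be automated by comparing the source file against each translated file before a production deploy. The sketch below is purely hypothetical: the `untranslatedIds` function and the sample objects are invented for illustration and say nothing about how Smartling delivers its files.

```javascript
// List the ids from the source file that have no usable translation yet.
function untranslatedIds(source, translated) {
  return Object.keys(source).filter((id) => {
    const value = translated[id];
    return typeof value !== 'string' || value.trim() === '';
  });
}

// Hypothetical contents of data.json and of a German locale file.
const sourceMessages = {
  'homepage.greeting': 'Welcome, {firstName}',
  'homepage.header': 'Welcome to the homepage',
};
const germanMessages = {
  'homepage.greeting': 'Willkommen, {firstName}',
};

const missing = untranslatedIds(sourceMessages, germanMessages);
console.log(missing); // [ 'homepage.header' ]
```

A deploy script could refuse to continue while this list is non-empty, or just print it as a warning if showing a bit of English is acceptable.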

Gruntwork

A lot of the translations were just brute-force. For most apps this will probably be unavoidable. We did have a couple of cases where I was able to write quick scripts to auto-format the file to have <FormattedMessage/> components, but this wasn’t extensible to the entire application. It was only worth writing scripts when we had heavily repeated components and formatting.


Resources

React-Intl

React-Intl Documentation

Article that’s great to follow for setting up React-Intl: Internationalization in React

Calm’s React-Intl Linter/Formatter

Moment.js Internationalization