proposal: spec: export uncased identifiers like 日本語

In Go (currently v1.1.1), an identifier is exported only if it starts with a character in
Unicode class "Lu" (uppercase letter).

This rule works fine for Western languages, but fails for CJK languages. All CJK
characters are letters, but none of them are uppercase. Therefore, these are not exported:

    var 成本 int = 5        // Chinese ideograph
    func ぶつける() { ... } // Japanese Hiragana (they are indeed letters)

It is very strange to have to write, say, Z成本 or Jぶつける as identifiers.
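
The rule described above can be checked with a few lines of Go. This is a minimal sketch, not the compiler's own code: `isExportedGo1` is a hypothetical helper name, and it simply tests whether the first rune is in category Lu via `unicode.IsUpper` (the standard library's `go/ast.IsExported` reports the same property).

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// isExportedGo1 is a hypothetical helper that mirrors the Go 1 rule:
// a name is exported only if its first character is an uppercase
// letter (Unicode category Lu).
func isExportedGo1(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return unicode.IsUpper(r)
}

func main() {
	names := []string{"Cost", "cost", "成本", "ぶつける", "Z成本", "Jぶつける"}
	for _, name := range names {
		fmt.Printf("%s: exported=%v\n", name, isExportedGo1(name))
	}
}
```

Only Cost, Z成本 and Jぶつける are reported as exported; 成本 and ぶつける start with uncased letters, so they are package-local.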

I don't know how best to control visibility. But I think it would at least be
preferable to treat CJK characters as *uppercase* letters, if we have no other choice
(more keywords, etc.).

6 thoughts on “proposal: spec: export uncased identifiers like 日本語”

  1. Comment 13:

    A solution that's been kicking around for a while:
    For Go 2 (can't do it before then): Change the definition to "lower case letters and _
    are package-local; all else is exported". Then with non-cased languages, such as
    Japanese, we can write 日本語 for an exported name and _日本語 for a local name.
    This rule has no effect, relative to the Go 1 rule, with cased languages. They behave
    exactly the same.
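
For illustration, the rule suggested in this comment can be compared with the Go 1 rule side by side. This is only a sketch under assumptions: `exportedProposed` is a hypothetical function that encodes "lower case letters and _ are package-local; all else is exported", and it is not part of any Go release.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// exportedGo1 mirrors the current rule: the first character must be
// an uppercase letter (Unicode category Lu).
func exportedGo1(name string) bool {
	r, _ := utf8.DecodeRuneInString(name)
	return unicode.IsUpper(r)
}

// exportedProposed is a hypothetical encoding of the suggested rule:
// names starting with a lowercase letter or '_' are package-local,
// everything else is exported.
func exportedProposed(name string) bool {
	if name == "" {
		return false
	}
	r, _ := utf8.DecodeRuneInString(name)
	return r != '_' && !unicode.IsLower(r)
}

func main() {
	for _, name := range []string{"Name", "name", "_name", "日本語", "_日本語"} {
		fmt.Printf("%s: go1=%v proposed=%v\n",
			name, exportedGo1(name), exportedProposed(name))
	}
}
```

For the ASCII names both functions agree. They differ only on 日本語, which the sketched rule would export, while _日本語 stays package-local under both.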
  2. Many say that non-English words are rarely used in practical coding, even among people whose native language is not English. That’s mainly right. However, please let me describe a special “rare” case, for your information.

    I work in the online game industry in China. As in other industries, our code is mainly written by programmers, but the product as a whole contains resources besides code. An important portion of online game resources consists of numerical and string values, which play a vital role in the gameplay experience. One product can contain thousands of such values, and they are provided, along with their STRUCTURE, by game designers, not programmers. The programmers must follow the provided structure to use the values. Once the data is loaded from a file or database, the code can access a given value either statically (Avatar.HP) or dynamically (Avatar["HP"]).

    For performance and static checking, the static way is often preferred, and here comes the problem: the type names and field names, as part of the data structure, are created by the game designers, who are often not systematically trained in or accustomed to programmers’ conventions. They simply compose the values in a spreadsheet editor or similar tools and define type names and field names through sheet names and table headers. Naturally, they prefer native words for clarity, especially for invented concepts that are difficult to translate, which are not rare at all in games. Their work is then converted by scripts into a form the program can load, but the identifiers they defined must be preserved. That is, the static approach requires code generation, and nobody wants to involve manual translation in this step or to keep a translation dictionary up to date with revisions of the designers’ work. By now you will understand what I am getting at.

    Because identifiers whose initial character is in Unicode category “Lo” are not treated as exported, the Go language makes this workflow impossible. It forces us either to sacrifice performance and type safety by using the dynamic approach, to force the designers to use English, which they are not accustomed to, or to lose the clear logic encapsulation provided by the package system. There’s no such obstacle in other programming languages.
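
One way to see the obstacle described in this comment is with a reflection-based loader. The following is a sketch under assumptions: the `Avatar` type, its field names, and `loadAvatar` are hypothetical, and `encoding/json` stands in for whatever loader a real pipeline would use. Because 生命值 starts with an uncased letter, the field is unexported, so `encoding/json` skips it, and code in other packages could not refer to it either.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Avatar stands in for a generated type whose field names come from
// spreadsheet headers (hypothetical example).
type Avatar struct {
	HP  int // starts with an uppercase letter: exported
	生命值 int // starts with an uncased letter: unexported in Go 1
}

// loadAvatar decodes an Avatar from JSON. encoding/json only fills
// exported fields, so the 生命值 key is silently ignored.
func loadAvatar(data []byte) (Avatar, error) {
	var a Avatar
	err := json.Unmarshal(data, &a)
	return a, err
}

func main() {
	a, err := loadAvatar([]byte(`{"HP": 100, "生命值": 100}`))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(a.HP, a.生命值) // prints "100 0"
}
```

A generator could work around this by prefixing each name (for example Z生命值), which is exactly the kind of identifier the original post calls strange.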

  3. @lych77 Thank you very much for your thoughtful and helpful message. We appreciate getting a more authoritative contribution to this discussion.

    Unfortunately the Go 1 guarantee prevents us from changing this rule now, but if there ever is a Go 2, there could be a change as I described above:

    “Change the definition to ‘lower case letters and _
    are package-local; all else is exported’. Then with non-cased languages, such as
    Japanese, we can write 日本語 for an exported name and _日本語 for a local name.
    This rule has no effect, relative to the Go 1 rule, with cased languages. They behave
    exactly the same.”

    This is a fairly minor change to the implementation but could have a major effect for Chinese programmers. Please let us know what you think about this idea.

  4. Hey, I am the organizer of GopherChina. I created the biggest Gopher community in China, gocn.io. @mpvl reached out to me today and mentioned this issue, so I ran a poll in our Gopher WeChat groups.

    Title: Do you want to use Chinese names for variables or functions?

    1. No
    2. Yes, and public
    3. Yes, but private

    Here is the result:

    1. 94.7%
    2. 3.6%
    3. 1.8%

    I hope this poll helps you make a decision. It has only been running for 2 hours, but the results are already very clear.


  5. In short, a proposal for how to proceed here:

    Let’s leave uncased identifiers unexported and find non-breaking ways to address the “Z成本 or Jぶつける are strange identifiers” problem.

    Please thumbs up/thumbs down/respond to that specific steering suggestion, but let’s defer discussion of details of specific alternatives for the moment. Thanks.
